
Wikipedia talk:Writing articles with large language models

From Wikipedia, the free encyclopedia
This is the talk page for discussing improvements to the Writing articles with large language models page.
Archives: 1
Frequently asked questions
Q1: What is the purpose of this guideline?
A1: To establish a ground rule against using AI tools to create articles from scratch.
Q2: This guideline covers so little! What's the point?
A2: The point is to have something. Instead of trying to get consensus for the perfect guideline on AI, which doesn't exist, we have in practice been pursuing a piecemeal approach: restrictions on AI images, AI-generated comments, etc. This is the next step. Eventually, we might merge them all into a single guideline on AI use.
Q3: Why doesn't this guideline explain or justify itself?
A3: Guidelines aren't information pages. We have plenty of information already about why using LLMs is usually a bad idea at WP:LLM.
Q4: Why is this guideline only restricted to new articles?
A4: This guideline, originally a proposal, was deliberately kept simple and narrow so that consensus could be reached and it could become a guideline, with the intent to flesh it out in later discussions.

Why is this guideline only restricted to "new articles"? Shouldn't this apply to all articles? (and talk pages and so on...)


Under my own reading of this rule, it seems like it only applies to new articles, and that pre-existing articles are somehow allowed to have AI-generated text inserted into them.GarethBaloney (talk)13:46, 24 November 2025 (UTC)[reply]

I think because it's a badly written sentence and was erroneously promoted to Guideline.qcne(talk)13:48, 24 November 2025 (UTC)[reply]
Well if people are saying it's a badly written guideline then we should make a new discussion on changing it!GarethBaloney (talk)14:05, 24 November 2025 (UTC)[reply]
Yes! Let's have all our guidelines be padded out with twelve-thousand word essays defending and justifying them and providing supplementary information such that no-one will ever read them and newbies have no freaking idea what it's actually telling them.Cremastra (talk ·contribs)02:05, 25 November 2025 (UTC)[reply]
The guideline and RFC were probably written minimalistically to increase its chances of passing an RFC, with the intent to flesh it out in follow up discussions. –Novem Linguae(talk)21:26, 24 November 2025 (UTC)[reply]
This one.Cremastra (talk ·contribs)01:08, 25 November 2025 (UTC)[reply]

Further amendment proposal #1: Festucalex

This proposal is part of an WP:RFCBEFORE with the goal of reaching a comprehensive guideline for LLM use on Wikipedia.

Well, habemus guideline. Now, how is it going to be enforced, given the fact that the guideline is donut-shaped? We might as well address the "from scratch" loophole and preempt the thousands of man-hours that are going to be wasted debating it with LLM users. How should we define "from scratch"? In an ideal situation, the guideline would be this:

Current wording: Large language models (or LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch.

Proposed wording: Large language models should not be used to edit Wikipedia.

This will close the loophole. Any improvements are welcome.Festucalextalk14:05, 24 November 2025 (UTC)[reply]

Strong support. The usage of LLMs to directly edit, add to, or create articles should not be accepted in any way. The high likelihood of fabrication and the inherently poor quality of LLM sourcing make them ill-suited for use on Wikipedia, and genuine human writing and research should be the standard.Stickymatch02:55, 25 November 2025 (UTC)[reply]
LLM sourcing can be 100% controlled (editor selects sources, uploads them, and explicitly prohibits using anything else). So the poor choice of sources is a human factor, evident in many human-written articles here.Викидим (talk)05:49, 25 November 2025 (UTC)[reply]
I am not sure if we can amend the proposal after all these !votes have been made, but could you make an exclusion for grammar checkers?Mikeycdiamond (talk)15:52, 26 November 2025 (UTC)[reply]
These are not !votes. This is anWP:RFCBEFORE discussion.voorts (talk/contributions)16:33, 26 November 2025 (UTC)[reply]

P.S. I just finished writing an essay against one of the proposed "accepted uses" for LLMs on Wikipedia. I welcome your feedback on the essay's talk page: User:Festucalex/Don't use LLMs as search engines.Festucalextalk16:54, 24 November 2025 (UTC)[reply]

I support this wholeheartedly.GarethBaloney (talk)14:07, 24 November 2025 (UTC)[reply]
Support. "From scratch" is way too generous.TheBritinator (talk)14:25, 24 November 2025 (UTC)[reply]
Any policy or guideline that says "ban all uses of LLMs" is bound to get significant opposition.SuperPianoMan9167 (talk)14:31, 24 November 2025 (UTC)[reply]
And all policies and guidelines have abuilt-in loophole anyway.SuperPianoMan9167 (talk)14:36, 24 November 2025 (UTC)[reply]
The fact that WP:IAR exists doesn't mean that we ought to actively introduce crippling loopholes into guidelines. Imagine if we banned vandalism only on new articles, or only on articles that begin with the letter P.Festucalextalk14:57, 24 November 2025 (UTC)[reply]
If you look at the RfC you can see a significant number of users who disagree with the assertion that "all LLM use is bad", which is why I have doubts that a proposal to ban LLMs entirely will ever pass.SuperPianoMan9167 (talk)15:00, 24 November 2025 (UTC)[reply]
It's WP:NOTVOTE and it should never be. As I said before, anyone who wants to open up uses for LLMs on Wikipedia should explain precisely, minutely, down to the atomic level how and why LLMs can be used on Wikipedia and how these uses are legitimate and minimally disruptive as opposed to all other uses. The case against LLMs has been made practically thousands of times, while the pro-LLM case consists of nothing more than handwaving towards vague say-so assertions and AI company marketing buzzwords.Festucalextalk15:09, 24 November 2025 (UTC)[reply]
WikiProject AI Tools was formed to coordinate legitimate uses of LLMs.SuperPianoMan9167 (talk)22:31, 24 November 2025 (UTC)[reply]
Also, the rules are principles. The general idea of this guideline is that using LLMs to generate new articles is bad. It is not and should not be a blanket ban on LLMs. LLMs are tools. Like all tools, they have valid use cases but can be misused. Yes, their outputs may be inherently unreliable, but it is incorrect to say they have no use cases.SuperPianoMan9167 (talk)22:39, 24 November 2025 (UTC)[reply]
Support, but with the caveat that I think it's too broad for what this policy has already been approved for. This edit implies any use of LLMs is unacceptable, even if it's not LLM-generated content being included in an article. Given that there's still arguably a carveout for using LLMs to assist with idea generation etc., my counterproposal, if people find it more appealing, can be found at #Further amendment proposal #3: Athanelar.Athanelar (talk)14:43, 24 November 2025 (UTC)[reply]
I think we ought to actively discourage other non-submission uses, even if we can't detect them. At least we'd be making it clear that the community disapproves. This only will stop the honest ones, but hey, that's something.Festucalextalk14:55, 24 November 2025 (UTC)[reply]
I agree, that's why my initial statement is support, I just wanted to present a counterproposal in case the majority would prefer something that doesn't widen the scope so much.Athanelar (talk)14:59, 24 November 2025 (UTC)[reply]
Can you put the counterproposal in a different section to avoid confusion?NicheSports (talk)15:10, 24 November 2025 (UTC)[reply]
Done.Athanelar (talk)15:19, 24 November 2025 (UTC)[reply]
Support. We should probably add clarifying language to this (I have some ready I can propose), but definitely agree and think the community is ready to support a complete LLM banNicheSports (talk) 15:09, 24 November 2025 (UTC) Now that I understand what is meant by this proposal, I don't support it. I would support a ban on using LLMs to generate article content (per Kowal2701)NicheSports (talk)00:11, 25 November 2025 (UTC)[reply]
Similar to my comment below, this completely changes the purpose of this guideline (expanding its scope from new articles to all edits) and would require a new RfC.Toadspike[Talk]15:48, 24 November 2025 (UTC)[reply]
Definitely – I interpreted this as workshopping something that will be brought to another RFC. Is that fine to do here or should we move it toWP:AIC?NicheSports (talk)15:50, 24 November 2025 (UTC)[reply]
Yes, what we're doing here is the WP:RFCBEFORE that the original proposal never got. There are already 3 wordings on the table: mine, qcne's, and Athanelar's, and I hope this eventually crystallizes (after more refining) into a community-wide RFC. As the closing note pointed out, this issue requires a lot more work and discussion, and a lot of people agreed to Cremastra's proposal because they wanted anything to be instituted to stem the bleeding while the community deliberated on a wider policy.Festucalextalk16:14, 24 November 2025 (UTC)[reply]
Oppose. AI is a tool. For example, I routinely use AI to generate {{cite journal}} templates from loose text (like the references in other publications) or to check my grammar. This is IMHO no more dangerous than using the https://citer.toolforge.org/ for the same purpose (or Grammarly to check the grammar). We should encourage the disclosure, not start an un-enforceable Prohibition.Викидим (talk)21:29, 24 November 2025 (UTC)[reply]
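(For illustration only: the kind of output described above is a filled-in citation template built from a loose reference string. The reference and field values below are hypothetical examples, not taken from any actual prompt or article.)
Loose text: Smith, J. "Elephant cognition and memory", Journal of Zoology, 2019, vol. 307, pp. 1–12.
Generated wikitext: {{cite journal |last=Smith |first=J. |title=Elephant cognition and memory |journal=Journal of Zoology |year=2019 |volume=307 |pages=1–12}}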
@Викидим What are your thoughts on my proposal #2, below, which has a specific carve-out for limited LLM use?qcne(talk)21:31, 24 November 2025 (UTC)[reply]
Does creating the journal template count as generating text for articles?GarethBaloney (talk)21:44, 24 November 2025 (UTC)[reply]
The sources are certainly part of the text. According to views expressed in the discussion, AI can hallucinate the citation. For the avoidance of doubt, in my opinion – and experience – this is not the case with this use, but then there are many other safe uses of AI – like translation – and all of these IMHO shall be explicitly allowed (yes, I also happen to like m-dashes).Викидим (talk)22:10, 24 November 2025 (UTC)[reply]
This is IMHO no more dangerous than using [...] I strongly disagree that using the hallucination machine specifically designed to create natural-sounding but not-necessarily-accurate language output is 'no more dangerous' for these purposes than using tools specifically designed for the tasks at hand.Athanelar (talk)21:54, 24 November 2025 (UTC)[reply]
The AI is not made to manufacture lies any more than a keyboard is. The difference is in performance and intent of the user – these are the ones we might want to address. Blaming tools is IMHO a dead end; the Luddites, ostensibly also fighting for quality, quickly lost their battle.Викидим (talk)22:13, 24 November 2025 (UTC)[reply]
Are unscrupulous editors not more likely to use something like ChatGPT to try and sound professional even when they aren't? Besides, Grammarly is not the same as asking an LLM to generate a Wikipedia article, complete with possibly fake sources.GarethBaloney (talk)22:59, 24 November 2025 (UTC)[reply]
(1) try and sound professional even when they aren't: We are (almost) all amateurs here, so a tool that makes non-professionals sound better is not necessarily bad. (2) The proposal reads should not be used to edit Wikipedia, leaving no exceptions for grammar checking.Викидим (talk)23:23, 24 November 2025 (UTC)[reply]
Grammar checking can be done (and has been done for decades) using non-LLM artificial intelligence models and programs.Festucalextalk23:35, 24 November 2025 (UTC)[reply]
I was going to point this out, haha. There's been automatic grammar checking and spellcheck since what- Word 97? No LLM required.Stickymatch02:58, 25 November 2025 (UTC)[reply]
All modern translation and grammar checking tools use AI, as it produces superior results. Google for obvious reasons has been heavily invested in both for almost 20 years. According to my source, they at first were trying to go the non-AI way (studying and parsing individual grammars, etc.) only to discover that direct mapping between texts does a better job at a lower cost. Everyone else of any importance followed their approach many years ago. It was just not a generic AI that we know now, but an AI nonetheless. Some detail can be found, for example, on p. 19 of the 2008 thesis[1] (there should be better written sources, naturally, but the fact is very well known).Викидим (talk)06:03, 25 November 2025 (UTC)[reply]
Strong support: removes all ambiguity.Z ET AC21:34, 24 November 2025 (UTC)[reply]
Oppose, people often use stuff like Grammarly. The ban needs to be on generating content.Kowal2701 (talk)21:38, 24 November 2025 (UTC)[reply]
Grammarly is not an LLM.Festucalextalk23:34, 24 November 2025 (UTC)[reply]
It's powered by LLMs: In April 2023, Grammarly launched a product using generative AI built on the GPT-3 large language models. (from the article)SuperPianoMan9167 (talk)23:35, 24 November 2025 (UTC)[reply]
Generative AI tools like Grammarly are powered by a large language model, or LLM - from the Grammarly website[2]GreenLipstickLesbian💌🧸23:37, 24 November 2025 (UTC)[reply]
Then users can use a grammar checker other than Grammarly.Festucalextalk23:40, 24 November 2025 (UTC)[reply]
Wow.voorts (talk/contributions)23:45, 24 November 2025 (UTC)[reply]
I think what users on both sides of this ideological divide are running up against is a common problem whenever there is such a divide: both groups assume that members of the other group are operating on the same fundamental value system that they are, and that their arguments are built from that same value system.
I.e., the 'less restrictive' party here (voorts, qcne et al) is beginning from the core value that 'the reason LLMs are problematic is that their output is generally not compatible with Wikipedia's standards,' and the argument that stems from that is 'any LLM policy we make should be designed around bringing the result of LLM usage in line with Wikipedia's standards, whether that be directly LLM-generated text, or simply users utilising LLMs in their creative process.'
The 'more restrictive' party here (myself, Festucalex et al) is beginning from the core value that 'LLMs and their output are inherently undesirable and detrimental (for some of us to the internet as a whole, for others perhaps specifically only to Wikipedia)' and the argument that stems from that is 'any LLM policy we make should be designed around minimising the influence of LLMs on the content of Wikipedia.'
That's why Festucalex pivoted here and said people should use something other than Grammarly. We simply believe that it's imperative that we purge LLM output from Wikipedia, regardless of whether it's reviewed or policy compliant or anything else. It's also important to keep in mind that NEWLLM as it stands is a product of the latter ideology, not the former, and I think that's why it appears to be so flawed to people like qcne; because it's solving a completely different problem than the one they're trying to solve.Athanelar (talk)01:03, 25 November 2025 (UTC)[reply]
I understand your views. What I don't see is evidence.voorts (talk/contributions)01:11, 25 November 2025 (UTC)[reply]
Exactly. I made an identical point about this fundamental dividein the RfC. (I have discovered I am pivoting more towards the "less restrictive" side in my comments here.)SuperPianoMan9167 (talk)01:59, 25 November 2025 (UTC)[reply]
Yes, I think people understand the divide is between this idea of fundamentalism (the intrinsic nature of LLMs is that they are bad) and those who don't subscribe to it. But what many of us who oppose this fundamentalism think is that rather than being based on evidence (voorts), it's an article of faith.Katzrockso (talk)02:56, 25 November 2025 (UTC)[reply]
Not workable – if somebody comes up to me and says "Hey, you've made a mistake in Hanako (elephant)" or shows up on BLP saying "You have my birthdate wrong", then I don't care if they use an LLM to write their post, and I don't care if they use an LLM to translate it from their native language. I'm not even sure I care if they use the LLM to make the edit/explain themselves in the edit summary (but I'd rather they disclose it, for obvious reasons), assuming they do it right.
Ultimately, somebody who repeatedly introduces hoax material/fictitious references to articles should be blocked quickly, whether they're using AI or not. Somebody who repeatedly introduces spammy text should be blocked, whether they have a COI or not. Somebody who repeatedly introduces unsourced negative BLP information should be blocked, whether or not they're a vandal/have a COI. Somebody who repeatedly inserts copyright violations should be blocked, whether they're acting in good faith or not. The LLM is a red herring – once we've established that the content somebody writes is seriously flawed in a way that's not just accidental, we need to block the contributor. If they say "but it's not my fault, ChatGPT told me to" then unblocking admins can take that into consideration & we can tban that editor from using automated or semi-automated tools as an unblock condition.GreenLipstickLesbian💌🧸23:00, 24 November 2025 (UTC)[reply]
+1 This whole guideline is everyone just sticking their heads in the sand and hoping LLM usage will go away. We should be thinking about how LLMs can be used well, not outright banning their use.voorts (talk/contributions)23:08, 24 November 2025 (UTC)[reply]
It's also yet another example of why PAGmaking on the fly and without advanced deliberation is a terrible idea.voorts (talk/contributions)23:10, 24 November 2025 (UTC)[reply]
There are no legitimate uses for LLMs, just like there are no legitimate uses for chemical weapons. They're both technically a tool, and anyone can argue that sarin gas can technically be used against rodents, but is it really worth the risk of having it around the kitchen?Festucalextalk23:46, 24 November 2025 (UTC)[reply]
Are you seriously comparing LLMs to chemical weapons?voorts (talk/contributions)23:48, 24 November 2025 (UTC)[reply]
Yep.Festucalextalk23:49, 24 November 2025 (UTC)[reply]
65k bytes to get toGodwin's Law, nice!GreenLipstickLesbian💌🧸00:03, 25 November 2025 (UTC)[reply]
Festucalex please lol. Also, idk if this is written down anywhere, there's probably an essay, but the fastest way to nuke support for a plausible idea here is to start saying stuff like "X is like sarin gas"NicheSports (talk)00:06, 25 November 2025 (UTC)[reply]
I think the analogy I'm making is clear: it's a technology whose risks override any potential benefits, at least in this context. Forget sarin gas, let's say it's like apogo stick in a porcelain museum.Festucalextalk00:09, 25 November 2025 (UTC)[reply]
There are no legitimate uses for LLMs: What about this, and this, and this, and this, and this, and this, and this, and...
You get the point.SuperPianoMan9167 (talk)23:53, 24 November 2025 (UTC)[reply]
There are no legitimate uses of LLMs on Wikipedia. I have said it before and I will say it again. Even if it is impossible to stop all LLM usage, guidelines like this one can serve as a statement of principle.Yours, &c.RGloucester00:00, 25 November 2025 (UTC)[reply]
So everyone inWikiProject AI Tools is editing in bad faith?SuperPianoMan9167 (talk)00:02, 25 November 2025 (UTC)[reply]
They're using bad tools in good faith because we don't have a comprehensive guideline yet.Festucalextalk00:04, 25 November 2025 (UTC)[reply]
Why can't LLMs ever be legitimately used on Wikipedia?voorts (talk/contributions)00:06, 25 November 2025 (UTC)[reply]
What is the philosophical mission of Wikipedia? WP:ABOUT begins with the Jimbo quote: Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That's what we're doing.
LLMs don't produce human knowledge. They produce realistic-sounding human language, because that's what they're designed to do, it's all they've ever been designed to do, it's all they can ever be designed to do – it's literally in their fundamental structure. Not only that, but the output they produce is explicitly biased by their programming and their training data, which are both determined by a private company with no transparency or oversight.
Would you be content if the entirety of Wikipedia's article content were created and maintained by a single editor? Let's assume that single editor is flawless in their work; all of their work is rigorous and meets the standards set by the community (who are still active in a non-article capacity), it's perfectly sourced etc; it's just that it's all coming from a single individual.
What about 90%? 80%? 50%? What percentage of the encyclopedia could be written and managed by a single individual before it would compromise the collaborative nature of Wikipedia?
Thesis 1: The output of an LLM is, effectively, the work of a single individual. Obviously it's more complex than that, but LLM output all has the same tone because it's all the product of the same algorithms from the same privatised training data.
Thesis 2: Given the opportunity, LLM output will comprise an increasingly large percentage of Wikipedia, because it is far faster to copyedit, rewrite and create with LLMs than it is to do so manually. This will only increase the more advanced LLMs get, because their output will require less and less human oversight to comply with Wikipedia's standards.
The conclusion, then, is how much of Wikipedia's total content you're willing to accept being authored by what is essentially a single individual with inscrutable biases and motivations. There must be some cutoff in your mind; and our contention is that if you allow them to get their foot in the door, then the result is going to end up going beyond whatever percentage cutoff you've decided as acceptable.Athanelar (talk)02:48, 25 November 2025 (UTC)[reply]
"The output of an LLM is, effectively, the work of a single individual. Obviously it's more complex than that" is putting it lightly. Notwithstanding the fact that more than one LLM exists, editors who opposite anti-LLM fundamentalism here have consistently advocated for the necessity of human review and editing when evaluating LLM output.Katzrockso (talk)02:58, 25 November 2025 (UTC)[reply]
editors who oppose anti-LLM fundamentalism here have consistently advocated for the necessity of human review and editing when evaluating LLM output.
Well, okay, take my initial example again, then. Let's say John Wikipedia is still producing 50 or 80 or 100% or whatever of Wikipedia's output, but it's first being checked by somebody else to make sure it meets standards. Would it now be acceptable that John Wikipedia is the sole author of the majority (or a plurality or simply a large percentage) of Wikipedia's content, simply because his work has been double-checked?Athanelar (talk)03:09, 25 November 2025 (UTC)[reply]
Yes, if John Wikipedia's contributions all accurately represent the sources as evaluated by other editors and meet our content standards, why would that be a problem?Katzrockso (talk)03:43, 25 November 2025 (UTC)[reply]
Well, that's just one of those fundamental value differences we'll never overcome, then. I don't think John Wikipedia should be the primary author of content on Wikipedia because that would undermine the point of Wikipedia being a communal project, and for that same reason I don't think we should allow AI-generated content to steadily overtake Wikipedia either, whether or not it's been reviewed or verified or what have you.Athanelar (talk)03:47, 25 November 2025 (UTC)[reply]
This happens all the time at smaller Wikipedias. There just aren't enough people who speak some languages + can afford to spend hours/days/years editing + actually want to do this for fun to have "a communal project" the way that you're thinking of it.WhatamIdoing (talk)06:09, 27 November 2025 (UTC)[reply]
What about uses of LLMs thataren't generating new content (which is what most of the tools atWikiProject AI Tools are about)?SuperPianoMan9167 (talk)03:03, 25 November 2025 (UTC)[reply]
I don't have any issue with that, because it's functionally impossible to identify and police. That's why my proposal is worded differently to Festucalex's, because I think it's only sensible and possible to prohibit the inclusion of AI-generated text, not the use of AI in one's editing process at all.Athanelar (talk)03:11, 25 November 2025 (UTC)[reply]
I asked why they can'tever be used. I have several FAs and GAs, but I'm terrible at spelling. If, as seems to be the direction the world is heading, most browsers replaced their original spellcheckers with LLM-powered ones, are you suggesting I'd need to install an obscure browser created by anti-AI people to avoid running afoul of this proposed dogmatism?voorts (talk/contributions)13:38, 25 November 2025 (UTC)[reply]
No, my proposal is to ban adding AI-generated content to Wikipedia, not to ban people using AI as part of their human editing workflow, that would be unenforceable.Athanelar (talk)14:13, 25 November 2025 (UTC)[reply]
None of these are legitimate, and I hope that our new guideline puts an end to them before they become standard practice. No use designing and marketing kitchen canisters for sarin gas.Festucalextalk00:02, 25 November 2025 (UTC)[reply]
This reads more like a moral panic than a logically & evidentially supported proposalKatzrockso (talk)00:52, 25 November 2025 (UTC)[reply]
It's not a moral issue. LLMs undermine the whole foundation of this project. They were developed by companies that are in direct competition with Wikipedia. These companies have used our content with the aim of monetising it through LLM chatbots, and now plot to replace Wikipedia altogether, à la Grokipedia. Promoting LLM use will rot the project from within, and ultimately result in its collapse.Yours, &c.RGloucester06:12, 25 November 2025 (UTC)[reply]
Slippery slopeKatzrockso (talk)14:09, 25 November 2025 (UTC)[reply]
Yes, it is a 'slippery slope' argument; if anything, a better term is 'death by a thousand cuts'. It is a common misconception that a slippery slope argument is inherently a fallacy. I find it very interesting that some editors here prefer to place emphasis on the quality of the content produced, rather than on the actual mission of the project. Let us take this kind of argument to its logical conclusion. If some form of LLM were to advance, and were able to produce content of equivalent quality to the best Wikipedia editors, would we wind up the project, our mission complete? I'd like to hope that the answer would be no, because Wikipedia is meant to be a free encyclopaedia that any human can edit.
When one outsources some function to these 'tools', whether it be spellchecking or article writing, it will inevitably result in the decline of one's own copyediting and writing skills. As our editors lose the skills they have gained by working on this encyclopaedia over these past two decades, they will become more and more reliant on the LLMs. What happens then, when the corporations that own these LLMs decide to cease providing their 'tools' to the masses gratis? Editors, with their own skills weakened, will become helpless. Perhaps only those with the ability to pay to access LLMs will be able to produce content that meets new quality standards that have shifted to align with LLM output. Wikipedia's quality will decline as the pool of skilled editors dwindles, and our audience will shift toward alternatives, like the LLMs themselves. The whole mission of the project will be called into question, as Wikipedia loses its competitive advantage in the marketplace of knowledge.Yours, &c.RGloucester00:20, 26 November 2025 (UTC)[reply]
But we shouldn't sacrifice newcomers in the name of preserving the project by blocking them for using LLMs right after they join when they have no clue why or how LLMs are unreliable.SuperPianoMan9167 (talk)00:25, 26 November 2025 (UTC)[reply]
My hope for this guideline is that it will prevent that kind of blocking, since good faith newcomers who show up using LLMs will get reverted and linked to this page, instead of the previous situation where they get asked politely to stop, then when they don't, they eventually get dragged to ANI and TBanned from using LLMs, which is frustrating and much more difficult to understand than a simple page that says "Wikipedia doesn't accept LLM-generated articles because that's one of the things that makes Wikipedia different from Grokipedia". --LWGtalk00:57, 26 November 2025 (UTC)[reply]
Assuming we adopt this proposal, and assuming that good faith newcomers abide, there will still be editors who get asked politely to stop (i.e., they will be warned), then when they don't, they eventually [will] get dragged to ANI and blocked, not TBANNED (by my count, only 3 editors are topic banned from LLM use per Wikipedia:Editing restrictions). I've blocked/revoked TPA of many accounts for repeated LLM use and I can assure you that almost none of those editors knew or cared about what any of our guidelines said. In no universe would a no-LLM rule result in any change to the process of having to drag people to ANI to get them blocked.voorts (talk/contributions)01:11, 26 November 2025 (UTC)[reply]
^this.
To use a real example, every single time anybody makes a post, they agree not to copy paste content from other sites, and to attribute it if they copy from within Wikipedia, and there are sooooooooooooooooooooooo many copyright blocks given out every year. Most of these people unambiguously acted in good faith. And each and every one got dragged to a noticeboard, often multiple times, before they were blocked. I'm sorry, but this won't be any different - and Wikipedia naturally draws the type of people who like to ask "why", so we're still going to have to point them to WP:LLM, and they won't be swayed by a simple page saying "no, because I said so".GreenLipstickLesbian💌🧸08:05, 26 November 2025 (UTC)[reply]
SuperPianoMan, I agree with you, and I also agree with LWG. The problem until now was that Wikipedia has failed to clearly explain its stance on LLMs, blocking myriad editors without any obvious policy or guideline-based rationale. This ad hoc form of justice has gone on too long, and is unfair to newcomers, and is one reason why I supported the adoption of this guideline, despite its shortcomings. The community needs to clearly explain Wikipedia's purpose, and why LLMs are not suited for use on Wikipedia, to both new editors and our readership. Wikipedia should aim to promote the value of a project that is free, that anyone can edit, and that is made by independent men and women from right across the world. If anything, our position as a human encyclopaedia should be a merit in a competitive information marketplace.Yours, &c.RGloucester01:11, 26 November 2025 (UTC)[reply]
they eventually get dragged to ANI and TBanned from using LLMs, which is frustrating and much more difficult to understand than a simple page
Yes exactly. People were regularly being sanctioned for a rule that they could not have known about, because no such rule existed. Even if not a single newbie ends up reading this guideline, its existence is still beneficial, because it means we are no longer punishing people for breaking unwritten rules.Gnomingstuff (talk)09:50, 26 November 2025 (UTC)[reply]
I don't think it's ever been practice to sanction somebody for just AI use, though? It's always been fictitious references, violating mass create, copyright issues, WP:V failures, UPE/COI, NPOV violations, etc. I'm not saying no admin has ever blocked a user for only using LLMs (admins do act outside of policy, sometimes!), though I'd be interested to see any examples. Thanks,GreenLipstickLesbian💌🧸10:23, 26 November 2025 (UTC)[reply]
Usually it's more than just AI use if it ends up at ANI but I doubt the distinction is really getting through to people, and a lot of !votes to block, CBAN, TBAN, etc. are made with the rationale of "AI has no place on Wikipedia ever." Sometimes the bulk of the thing is that (example:Wikipedia:Administrators'_noticeboard/IncidentArchive1185#User:_BishalNepal323)
There's also the uw-ai to uw-ai4 series of templates, which implies a four-strikes rule; I don't use them but others do.Gnomingstuff (talk)10:51, 26 November 2025 (UTC)[reply]
In your example, Ivanvector blocked for disruptive editing, not solely for AI use.voorts (talk/contributions)14:08, 26 November 2025 (UTC)[reply]
What are we arguing about here? Obviously people are getting blocked for LLMmisuse, not LLM use. And I agree with Gnomingstuff and LWG etc. I believe in AGF and have dozens of examples of editors who have stopped using LLMs after I alert them to the difficulty of using them in compliance with content policies.NicheSports (talk)14:22, 26 November 2025 (UTC)[reply]
We're arguing about the assertion that we need a no AI rule because we've been blocking people solely for AI use without any attendant disruption. That is not true and therefore not a good reason to impose a no AI rule.voorts (talk/contributions)14:23, 26 November 2025 (UTC)[reply]
To be more clear, when I said "the bulk of the thing" I meant the tenor of the responses in an average ANI posting. Several regulars at ANI generally seem to be under the impression that we do not allow AI, so most !votes are going to have largely unchallenged comments like CIR block now. This LLM shit needs to be stopped by any means necessary. or LLM use should warrant an immediate block, only lifted when a user can demonstrate a clear understanding that they can't use LLMs in any situation. Or if someone gets hit with a uw-ai2, they are told Please refrain from making edits generated using a large language model (an "AI chatbot" or other application using such technology) to Wikipedia pages.Gnomingstuff (talk)00:39, 27 November 2025 (UTC)[reply]
People say a lot of incorrect things at ANI. We don't usually amend the PAGs to accommodate those people.voorts (talk/contributions)01:05, 27 November 2025 (UTC)[reply]
On the contrary, that's exactly what we do. PAGs are meant to reflect the actual practice of editors. The process of updating old PAGs or creating new ones to reflect changes in editorial practice is the foundation that has built all of our policies and guidelines.Yours, &c.RGloucester03:10, 27 November 2025 (UTC)[reply]
We're not tho. Nobody as far as I can tell has ever been blocked solely for using AI/LLMs. This is a red herring.voorts (talk/contributions)13:54, 26 November 2025 (UTC)[reply]
If tomorrow, an LLM came out that could produce an FA-quality article on a given topic in 2 minutes, would you still suggest that LLMs have no place on Wikipedia?
Histrionic comparisons about scenarios that won't happen go both ways.Katzrockso (talk)07:59, 27 November 2025 (UTC)[reply]
Yes, I would, because using such a technology to produce articles is contrary to the purpose and mission of Wikipedia. Wikipedia's defining principles are that it is free, that any human can edit it, and that its content is produced collaboratively by divers volunteers. Others and I have already explained why machine-produced content contravenes these principles. I care less whether an article is 'FA-quality', whatever that means, and more about how it was made.Yours, &c.RGloucester08:46, 27 November 2025 (UTC)[reply]
Wikipedia's defining principles are that it is free, that any human can edit it, and that its content is produced collaboratively by divers volunteers. Others and I have already explained why machine-produced content contravenes these principles. I am certainly not a fan of LLMs for generating content. However, I don't see how you can say that a human editor who chooses to use an LLM to generate some content, checks the content to make sure that it accurately reflects its sources and is otherwise PAG-compliant, and finally adds the sourced content to an article thereby contravenes these principles. Wikipedia is no less free, any human can still edit it, and divers volunteers are still able to collaboratively work on the article, even though that particular content happened to have been produced by a machine. Cheers,SunloungerFrog (talk)09:12, 27 November 2025 (UTC)[reply]
Yes, in this hypothetical thought experiment. We don't live in a thought experiment. LLM output is getting better in that it is less obviously bad, but the nature of this kind of text generation means it is not well suited, and may never be well suited, to producing verifiable nonfiction articles.Gnomingstuff (talk)14:18, 27 November 2025 (UTC)[reply]
Why? What's wrong with an LLM spellchecker other than that you don't like it?voorts (talk/contributions)13:47, 25 November 2025 (UTC)[reply]
+1 Even the autocorrect on my iPhone uses a transformer, which is the same kind of neural network as that which powers LLMs. The major difference is in size (they're called large language models for a reason).SuperPianoMan9167 (talk)14:19, 25 November 2025 (UTC)[reply]
  • Support. This guideline is a good start and I am glad it was approved but it should be expanded. LLMs are not an acceptable way to edit the wiki as they cause lots of issues like hallucinations. Changing to oppose as I just realised this goes beyond creating content and would include things like Grammarly.GothicGolem29(Talk)18:35, 28 November 2025 (UTC)[reply]
    @GothicGolem29: The Grammarly thing isn't necessarily included. As long as it doesn't generate its own output, it's not really a large language model, even if it claims to use one. The important thing here is that de novo output doesn't make it to the encyclopedia.Festucalextalk23:46, 3 December 2025 (UTC)[reply]
Strong Oppose You need a fine-toothed guideline as to what is okay and what is not okay. According to this guideline, honest users wouldn't be allowed to use LLMs to save time in a ton of ever-expanding Wikipedia:Maintenance issues, many of which can be handled with rephrasing of existing content.DemocraticLuntz (talk)16:47, 19 December 2025 (UTC)[reply]
Oppose – The most popular grammar checkers now have AI integrated into them. So, using a grammar checker on an article is editing with AI. AI is very useful for analyzing articles. If you apply any of its suggestions to an article, that's editing with AI. AI is also adept at finding AI hallucinations. Removing such text flagged by a chatbot is editing with AI. Word processors are increasingly including AI features built-in, and users will expect to be able to use the updated versions of their applications. There must be over a thousand use cases for applying AI to editing, directly or indirectly. Some work better than others, but it is up to the editor to ensure the quality of their edits, regardless of which tools they use.   —The Transhumanist  22:57, 6 February 2026 (UTC)[reply]

Further amendment proposal #2: qcne

This proposal is part of an WP:RFCBEFORE with the goal of reaching a comprehensive guideline for LLM use on Wikipedia.

Why the current version of the guideline is bad: A single sentence that clunkily prohibits all LLM use on new articles. How do we define that? Does "from scratch" cover the lead section only? the whole article? a stub? a list? Dunno! It doesn't bother to say! This is banning a method without actually defining where it begins or ends. Since no one can reliably tell if an LLM was used, enforcement would be impossible. LLM detection is unreliable, and we already have CSD G15 to handle unreviewed LLM slop.

I wrote this up a while ago and am now posting it for community consensus. I did just replace the Guideline with my version, but was sadly reverted.

Version 1

See version 3 posted below
My suggested, much more comprehensive Guideline.

== Purpose and scope ==

This guideline describes how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all content generated by an LLM regardless of model or vendor.

Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existing Wikipedia policies.

== Do not use an LLM to write an article from scratch ==

Large language models should not be used to generate new Wikipedia articles from scratch.

  • Do not paste raw or lightly edited LLM output into a new article, or into a draft intended to become an article.
  • Do not use LLMs to create the bulk of the prose of an article, even if you intend to fact-check it later.

Where an article, draft, or prose is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, nominated for deletion, or removed, especially where the content is unverifiable, fabricated, or otherwise seriously non-compliant.

== Why LLM-written encyclopaedia content is problematic ==

LLMs are language generation tools. They generate plausible-sounding text, not verified knowledge. This leads to several recurring problems that conflict with Wikipedia policies.

=== Unverifiable content, hallucinations, and original research ===

LLMs frequently:
  • State claims that are not supported by any published source.
  • Invent references, including plausible-looking but non-existent citations, books, articles, and URLs.
  • Combine material from different sources into new conclusions.

LLM-generated prose is not acceptable unless the editor has independently verified every statement and every citation against real sources.

=== Bias, BLP, and non-neutral point of view ===

LLMs reflect the biases of their training data and prompts. They may:
  • Omit important viewpoints or give undue weight to others.
  • Phrase contentious material in a way that appears neutral but is not.
  • Repeat, amplify, or invent serious unsourced allegations about living people.

Unsourced or poorly sourced contentious material about living persons must not be added and should be removed immediately, regardless of whether it was generated by an LLM.

=== Copyright and licensing ===

LLMs can reproduce or closely paraphrase copyrighted material from their training data, including books, paywalled journalism, and other non-free content. Their outputs may therefore:
  • Contain verbatim or near-verbatim copying that is not compatible with Wikipedia's licenses.
  • Be derivative works whose copyright status is unclear or incompatible with CC BY-SA and the GNU Free Documentation License.

Editors must not add material that infringes copyright, whether written by themselves, copied from a source, or generated by an LLM. Where suspected copyright violations occur, normal copyright enforcement and deletion processes apply.

== Limited acceptable uses ==

LLMs are not forbidden. However, their use should be limited, and editors must already be competent at the task in question without LLM assistance.

Appropriate, low-risk uses might include:

  • Copyediting and style: Suggesting wording, tightening prose, or correcting grammar in text that the editor understands and has written or verified from sources.
  • Outlines and brainstorming: Suggesting article structures, lists of subtopics, or questions to investigate, which the editor then researches using independent reliable sources.
  • Technical assistance: Helping format wikitext, tables, or templates where the editor can verify correctness.

In all such cases:

  • The editor must check every factual statement against suitable reliable sources.
  • The editor must ensure compliance with existing Wikipedia policies.

If the editor cannot confidently check and fix the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience.

Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in being blocked.

== Communicating with LLMs and on-wiki ==

Editors should not use LLMs to generate substantive on-wiki comments, arguments, or !votes in discussions.

Wikipedia's consensus-building process depends on editors expressing their own views, based on their understanding of the issue and of Wikipedia's policies and guidelines. Comments that do not reflect an editor's own reasoning are not helpful and may be considered disruptive. Obvious LLM-generated comments may be collapsed or removed in line with existing policies.

Using an LLM to help with phrasing or grammar of a comment is permissible, provided the ideas and reasoning are the editor's own and the editor reviews the text before posting.

== Disclosure and responsibility ==

Editors should disclose significant LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.

Regardless of disclosure:

  • Editors are wholly responsible for the content they add or change, including LLM-assisted text.
  • "The AI wrote it" is not a defence for violations of policy

== Handling existing LLM-generated content ==

Where content appears to be substantially or wholly LLM-generated and does not comply with Wikipedia policy, editors may:

  • Remove the problematic material outright, especially in biographies of living persons.
  • Replace it with a sourced, policy-compliant stub or summary.
  • Tag the page as LLM generated under Template:AI-generated (see the illustrative markup after this list).
  • Draftify, stubify, or nominate for deletion under the usual processes.
  • Mark the page for speedy deletion under criterion G15.
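(A minimal illustration of the tagging step above, assuming the usual maintenance-template convention; the date value is a placeholder: {{AI-generated|date=November 2025}} placed at the top of the affected article or section.)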

Happy for feedback, but we really do need to do something quickly to fix the current version of this new Guideline.qcne(talk)14:27, 24 November 2025 (UTC)[reply]

While I admire your tenacity and what seems to be a well-written proposal, you can't seriously have expected a whole very long guideline to be implemented with no discussion immediately following the closure of a RfC establishing a short one.Katzrockso (talk)14:33, 24 November 2025 (UTC)[reply]
Yeah, I genuinely thought sense would prevail and the RFC would be closed as unsuccessful.qcne(talk)14:36, 24 November 2025 (UTC)[reply]
I think the only real consensus in the RFC above was that it is time for the AI-maximalists to WP:DROPTHESTICK and cease stonewalling every AI discussion. I hope we can eventually implement something similar to what you have proposed. --LWGtalk17:14, 24 November 2025 (UTC)[reply]
Support fully. While I agree that a discussion is necessary, having a guideline this important be so short is unacceptable. The proposal looks like an improvement in every way.TheBritinator (talk)14:44, 24 November 2025 (UTC)[reply]
While I think this is generally a good expansion of the guideline, I would axe the 'limited acceptable uses' section. I worry that any explicit carveout that 'LLMs are fine if you use them right' is just going to encourage people to use them because they think they can use them right. It's better if the guideline sticks to saying what isn't acceptable (which I think should include any amount of identifiable AI-generated text in an article) and then we can let skilled and savvy people work around that on their own if they dare.Athanelar (talk)14:49, 24 November 2025 (UTC)[reply]
I would partially agree, perhaps reduce it to a single paragraph of:
LLMs are not forbidden. However, their use should be limited, and editors must already be competent at the task in question without LLM assistance. The editor must check every factual statement against suitable reliable sources and ensure compliance with existing Wikipedia policies. If the editor cannot confidently check and fix the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience. Editors are wholly responsible for the content they add or change, including LLM-assisted text. "The AI wrote it" is not a defence for violations of policy.qcne(talk)15:04, 24 November 2025 (UTC)[reply]
I say that "comprehensive" in this case means a shorter, not longer guideline. Theattack surface must be minimized. I say this because we're not facing a small ragtag team of vandals, we're facing the single largest thing in the world economy and trying to protect the encyclopedia against it.Festucalextalk15:12, 24 November 2025 (UTC)[reply]
But there's a point (and the current Guideline is it) where a short guideline is so short it is useless.qcne(talk)15:17, 24 November 2025 (UTC)[reply]
I think keeping it simple and straightforward is the best way to avoid introducing new loopholes and new headaches.Festucalextalk15:22, 24 November 2025 (UTC)[reply]
I don't agree that that applies to the current guideline at all. This is already pretty useful in that it allows NPPs and AFC reviewers to decline an article outright with a link to WP:NEWLLM instead of having to waste time checking for source-text discrepancies and otherwise determining if the article is 'unreviewed' enough to meet G15.
The problem isn't that this guideline doesn't do what it's designed for; it's that it isn't designed to do enough.Athanelar (talk)15:24, 24 November 2025 (UTC)[reply]
We should not be declining articles outright because of only suspected AI usage.qcne(talk)15:25, 24 November 2025 (UTC)[reply]
We would stillWP:AGF, obviously, just like we do with everything else.Festucalextalk15:30, 24 November 2025 (UTC)[reply]
AI usage can only ever be 'suspected' outside of some smoking guns like communication intended for the user and utm_source=chatgpt.com fingerprints, but there are enough signs that one can build up a very strong suspicion indeed. As festucalex says, AGF will always apply, but there's more than enough obvious, poor AI usage on Wikipedia for this guideline to be useful as it stands.Athanelar (talk)15:41, 24 November 2025 (UTC)[reply]
Hard disagree. That's the entire point: AI usage can never be fully proven (unless we're talking about G15 criteria). So the blanket "LLM is banned" sentence wording in this current Guideline is nowhere near enough to deal with endless edge cases, and effectively bans LLM tools at all levels without any nuance. It's crap.qcne(talk)15:48, 24 November 2025 (UTC)[reply]
Would you be more amenable to a clarification to the effect of: Content which consensus determines to be AI-generated is subject to removal, including entire articles if their content is determined to be primarily AI-generated?Athanelar (talk)15:51, 24 November 2025 (UTC)[reply]
> No, because AI content in itself isn’t inherently a problem: it’s unreviewed or unverified AI content that breaches policy. The issue isn’t whether something was generated by an LLM, but whether it meets our existing standards.
I just generated the above reply with ChatGPT 5.1 Auto. It is exactly how I feel, I just ran it through the text transformer to make a point. This Guideline would effectively censor that reply?qcne(talk)15:58, 24 November 2025 (UTC)[reply]
Whether AI content is inherently a problem is exactly what's being discussed. I don't agree that contributing AI-generated content to Wikipedia is fine as long as it doesn't breach policy, but I completely understand if you simply fundamentally disagree with that assertion.
My position is that even if I could hypothetically open up ChatGPT 7.0 and type 'Generate a Wikipedia article about foo which complies with all relevant policies and guidelines' and it would give me a completely acceptable output ready to be published to mainspace, I don't think I should be allowed to do that, because the end result would be that rather than Wikipedia being a passion project of human volunteers, it would be a glorified Grokipedia. The people responsible for Wikipedia's content would not be human volunteers who are here to build an encyclopedia but rather programmers at OpenAI.
There are biases and flaws inherent to large language models that are uncontrollable and antithetical to the philosophy of Wikipedia as a project. The rest of the internet is already being invaded and replaced en masse by machine-generated content, and it is absolutely vital that we prevent Wikipedia from going the same way. If we allow AI-generated content provided that it meets relevant standards, all we're doing is setting a timer for when the LLMs will be advanced enough to create Wikipedia-suitable output, at which point the entire website will be slowly overtaken by machine-generated output just like Google search results.
The argument that myself and others are making is that now is potentially our last chance to draw a line in the sand and declare that our active position is one of purging machine-generated output from Wikipedia in order to maintain the project's fundamental philosophy; free and open access to information.Athanelar (talk)16:08, 24 November 2025 (UTC)[reply]
I completely agree that unreviewed LLM content should not be allowed. The current Guideline effectively bans LLM generated prose, both reviewed and unreviewed. I have no problem with LLM content that has been reviewed and is policy-compliant.
This quickly becomes a Ship of Theseus situation with the current guideline. If I generate some prose in ChatGPT and review 50% of it and edit that 50% to comply with policies, is that banned? What about 75% of the content? What if I take out a thesaurus and change every word, but keep the layout as ChatGPT generated it?qcne(talk)16:11, 24 November 2025 (UTC)[reply]
The problem is, how do we prove 'review?' If an LLM hypothetically becomes capable of generating Wikipedia-suitable content on its own, and a person inserts that content while chopping out all the utm_source fingerprints and user-directed communication etc; assuming there are no source-text discrepancies and the output is otherwise entirely acceptable for Wikipedia, has that been 'reviewed'? Should we accept it? Should we accept an increasing percentage of the encyclopedia consisting of that sort of content in the future?
My contention is that reviewed or unreviewed, the presence of AI-generated content on Wikipedia is philosophically against what we ought to stand for. I know I'm speaking in hypotheticals at the minute, but it's something we need to consider - if LLM technology is going to get better in the future, it's better we put our foot down on it now rather than saying 'it's fine for now as long as you review it' and then in a couple of years we have to frantically try to address it when human review no longer becomes necessary for AI output to pass Wikipedia's standards.
As for your ship of Theseus problem - my response to that is exactly what I've already proposed; that we should forbid and remove all content that consensus determines to be AI generated.Athanelar (talk)17:39, 24 November 2025 (UTC)[reply]
+1 I'd axe the "limited acceptable uses" section also.
Additionally, while the "disclosure and responsibility" section sounds good, it will indicate to editors that LLM use is sanctioned in some form, and will encourage use in some cases. Better not to mention it at all.fifteen thousand two hundred twenty four (talk)15:42, 24 November 2025 (UTC)[reply]
I understand your point of view, but LLM usage is sanctioned. We have plenty of experienced editors who regularly contribute to new articles using LLMs, and check the output carefully and ensure it's in-line with policies. AI is another research tool at this point.qcne(talk)15:49, 24 November 2025 (UTC)[reply]
Currently LLM use is only sanctioned insofar as it's not forbidden by policy, but a disclosure requirement would introduce into policy an explicit allowance, thus encouraging use. I don't think any allowances should be incorporated into policy in any form.fifteen thousand two hundred twenty four (talk)16:01, 24 November 2025 (UTC)[reply]
I hate AI slop, and my AfC and CSD log is littered with AI slop reports. But it's madness to ban a tool which is now used by ~100 million people a day, which is double the number who visit Wikipedia each day.qcne(talk)16:04, 24 November 2025 (UTC)[reply]
You will notice with careful reading that neither of my comments involve entirely banning LLMs. They both concern not putting explicit allowances into policy which would encourage further use, which is the status quo.fifteen thousand two hundred twenty four (talk)16:10, 24 November 2025 (UTC)[reply]
This is generally a sensible rewrite, and I would support it in an RfC. This is, however, not an RfC. I'm interpreting this as an RFCBEFORE/workshopping discussion before an RfC is opened. @Qcne, is that correct?Toadspike[Talk]15:41, 24 November 2025 (UTC)[reply]
@Toadspike Initially I was going to open an RFC on WP:LLM to promote that to Guideline instead, but was told this has been tried repeatedly and failed. I wrote my version up as a condensed version of WP:LLM.
Guidelines are not set in stone, and I don't think we need yet another protracted RFC. My hope was to be bold and change the Guideline to something better and see if anyone reverts it. Someone did.
Maybe we can incrementally change and improve the existing Guideline using bits from my version, and feedback from other users.
Whatever happens though, this current version of the Guideline is awful.qcne(talk)15:46, 24 November 2025 (UTC)[reply]
I heartily support this as the ideal guideline toward which we should be working. --LWGtalk17:23, 24 November 2025 (UTC)[reply]
I also support this proposal, because as I and many others have pointed out, there is no reason to ban carefully reviewed, policy compliant content.Kovcszaln6 (talk)17:34, 24 November 2025 (UTC)[reply]
Comment - this echoes WP:BOTMULTIOP on Disclosure & Competence, which I think the current iteration of the guideline is missing. A shorter summary version linking to the full guideline could also be useful under Wikipedia:Bot policy#Other bot-related matters.Sariel Xilo (talk)18:59, 24 November 2025 (UTC)[reply]

Version 2

See version 3 posted below
Based on feedback, I have revised my proposed full Guideline as below. This tightens the scope of permitted LLM use, pares back the acceptable-use section, and adds some definitions. Full version at User:Qcne/LLMGuideline.
Version 2 of a suggested comprehensive Guideline.
== Purpose and scope ==
This guideline explains how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all models and all LLM-generated output.
In this guideline, a large language model (LLM) means any programme that can generate natural-language text (sentences and paragraphs) in response to prompts. This includes tools marketed as "AI chatbots" or "AI writing assistants", such as ChatGPT, Google Gemini, Microsoft Copilot, and similar services, whether used in a browser, an app, or built into other software. It does not cover spellcheckers, grammar checkers, or basic autocomplete, although editors remain responsible for any text they accept from such tools.
Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existingWikipedia policies.
== Do not use an LLM to add unreviewed content ==
Editors must not use an LLM to add unreviewed text to Wikipedia, whether creating a new article or editing an existing one. Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later.
For this guideline, unreviewed means output that the editor has not checked line by line for accuracy, sourcing, neutrality, and copyright problems against suitable reliable sources and against existing Wikipedia policies, at least as rigorously as if the editor had written the text themselves. You must verify every substantive claim against the cited sources. Reading the output to ensure it "sounds correct" is not sufficient.
In particular, editors should not:
  • Paste raw or lightly edited LLM output as a new article or as a draft intended to become an article.
  • Paste raw or lightly edited LLM output into existing articles as new or expanded prose.
  • Paste raw or lightly edited LLM output as new discussions or replies to existing discussions.
Where content is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, stubified, nominated for deletion, collapsed, or removed entirely, especially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies.
== Why LLM-written content is problematic ==
LLMs are language generation tools. They generate plausible-sounding text. This leads to recurring problems that conflict with Wikipedia policies.
=== Unverifiable content, hallucinations, and original research ===
LLMs frequently:
  • State claims that are not supported by any published source.
  • Invent references, including plausible-looking but non-existent citations, books, articles, and URLs.
  • Combine or synthesise material from different sources into new or false conclusions.
=== Bias, BLP, and non-neutral point of view ===
LLMs reflect the biases of their training data and prompts. They may:
  • Omit important viewpoints or give undue weight to others.
  • Phrase contentious material in a way that appears neutral but is not.
  • Repeat, amplify, or invent serious unsourced allegations about living people.
=== Copyright and licensing ===
LLMs can reproduce or closely paraphrase copyrighted material from their training data, including books, paywalled journalism, and other non-free content not compatible with Wikipedia's licences. Their outputs may:
  • Contain verbatim or near-verbatim content that infringes copyright.
  • Be derivative works whose copyright status is unclear or incompatible with Wikipedia.
== Limited use ==
Editors are strongly discouraged from using LLMs. LLMs, if used at all, should assist with narrow, well-understood tasks such as copyediting. New editors should not use LLMs when editing Wikipedia.
If an experienced editor nonetheless chooses to use an LLM, they must:
  • Not use it to generate the bulk of a new article or major expansion.
  • Check the output they intend to use against suitable reliable sources.
  • Ensure the output complies with existing Wikipedia policies.
  • Not treat the output as authoritative or as a substitute for their own judgement.
If the editor cannot confidently check and correct the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience. Editors should also be cautious about using LLMs to write comments in discussions on Wikipedia.
Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in the editor being blocked. Such blocks are intended to stop further disruptive use of LLMs on Wikipedia.
=== Disclosure and responsibility ===
Editors should disclose LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.
Regardless of disclosure:
  • Editors are wholly responsible for the content they add or change, including LLM-assisted text.
  • Disclosure does not make non-compliant content acceptable.
  • "The AI wrote it" is not a defence for violations of Wikipedia policy or guideline.
== Handling existing LLM-generated content ==
The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Editors should base their actions on specific problems with the content under existing policies.
Where content appears to be substantially or wholly LLM-generated without review and does not comply with Wikipedia policy, editors may:
  • Remove the problematic material outright, especially in biographies of living persons.
  • Replace it with sourced, policy-compliant content.
  • Tag the page as LLM-generated under Template:AI-generated.
  • Draftify, stubify, or nominate for deletion under the usual processes.
  • Mark the page for speedy deletion under criterion G15.
Again, happy to take feedback and I have also calmed down a little from this afternoon's shock at the original RFC being closed.qcne(talk)20:57, 24 November 2025 (UTC)[reply]
Thanks for doing this. The new version is a big improvement, and I would gladly support its promotion to Guideline. --LWGtalk21:28, 24 November 2025 (UTC)[reply]
Support this as well. I especially like the disclosure part here. I think proposal #1 is a bit more straightforward, but this works too, and is way more detailed. It's definitely a major improvement over the current guideline, where the FAQ specifically stated that the proposal was vague on purpose to gain consensus.Z ET AC21:36, 24 November 2025 (UTC)[reply]
This is great, but why not just say LLMs should not be used to generate content? This excepts copyediting etc. Maybe put that in the nutshell?Kowal2701 (talk)21:36, 24 November 2025 (UTC)[reply]
It is not that simple: is an AI translation from a foreign language OK?Викидим (talk)21:53, 24 November 2025 (UTC)[reply]
No. LLM translation is one of the uses we specifically discourage, because without the output being checked by someone fluent in both languages there's no way to verify that it's an accurate and reliable translation of the source text; and if you have someone fluent in both languages available, they might as well just translate it themselves to begin with, since it's going to be just as much effort to double-check the resulting output.Athanelar (talk)21:56, 24 November 2025 (UTC)[reply]
Using machine translation to generate a draft from a source that will then be carefully reviewed and edited by a translator is a pretty common practice in translation and should not be prohibited. It's generally not true that a fluent speaker can translate from scratch just as easily as they can revise a machine translation (if that were true, machine assistance wouldn't be widely used by basically all professional translators). The kinds of errors machines make in translation specifically are also much better understood than the kinds of errors LLMs make in generating content from prompts, so it's easier for a competent translator to catch and fix translation errors than for a competent writer to catch and fix the kinds of issues that LLM-generated prose tends to have. The critical key here is that machine-translated draft text must be gone over line by line by a fluent speaker of both languages who also has transfer competence, that is, awareness of the types of errors and misunderstandings that commonly come up in translation and how to mitigate them. --LWGtalk22:55, 24 November 2025 (UTC)[reply]
Most professional translators use machine translation (AI-based or otherwise). It's a significant time saver.WhatamIdoing (talk)06:15, 27 November 2025 (UTC)[reply]
I have professional translation experience. I have never used machine translation. In fact, I have had the misfortune of spending an inordinate amount of time cleaning up after garbled machine translations produced by wayward 'professional' translators who couldn't discern even the most evident nuance in either the source or the target language. Of course, the quality of machine translation varies greatly by language pair, but as I said during the RfC, machine translation wastes more time than it saves. A good translator will spend less time producing his own translation than trying to make sense of machine-mangled dross.Yours, &c.RGloucester08:55, 27 November 2025 (UTC)[reply]
I'm not a professional translator, but I've had the misfortune of cleaning up after human translators who miss, for example, the distinction between "free as in beer" and "free as in freedom" even after they were told to watch out for that mistake. But not every language, not every translator, not every tool is perfect. Most translators use machine translation, though one survey said that only about half run the whole document through MT and then edit the results. However, there are other ways to use MT, some of which are not very different from using a dictionary.WhatamIdoing (talk)02:18, 28 November 2025 (UTC)[reply]
That's already covered by WP:MACHINETRANSLATION.Sariel Xilo (talk)22:02, 24 November 2025 (UTC)[reply]
The MACHINETRANSLATION page talks about unedited machine translation. If the guideline prohibited unedited AI output, I would have no problem here.Викидим (talk)22:22, 24 November 2025 (UTC)[reply]
This is way better as a starting point. To stay on the sane side, editors need carve-outs that provide safe harbors ("within reasonable limits, this particular behavior is OK, and, for alleged misuse, the burden of proof is on the accuser"). Note that if the use of the tools were explicitly permitted for some tasks, I would argue the opposite: the editor using them should shoulder the burden of proof that their use was beneficial against "this is slop" accusations.Викидим (talk)21:47, 24 November 2025 (UTC)[reply]
I think this is 90% of the way there, but I still have some issues with it. Namely, this is essentially a proposal for a completely different, less restrictive guideline; it's more like promoting the WP:LLM essay to guideline rather than iterating on NEWLLM.
Editors should also be cautious about using LLMs to write comments in discussions on Wikipedia. This appears to undermine the well-regarded essay subsection WP:LLMCOMM, which states the firmer Editors should not use LLMs to write comments generatively. I think having a chatbot talk on your behalf is a pretty clear competence/communication-is-required issue which we should forbid, not merely caution against.
Do not use an LLM to add unreviewed content: This is essentially the bulk of your proposed guideline, but it seems to walk back what's been agreed upon in NEWLLM, which is that articles shouldn't be created wholesale with LLMs, regardless of whether they're reviewed or not. Now, whether that should apply to all content is a separate matter, of course, but your proposed guideline would essentially be a reversal of NEWLLM and would again permit wholly AI-generated articles, provided they were reviewed. If you're proposing this as an iteration of NEWLLM, it should reflect the agreed-upon prohibition of AI generating entire new articles, while stressing that unreviewed AI edits in general are not allowed. It seems like your new guideline wouldn't actually change anything from the consensus status quo prior to the promotion of NEWLLM.
I do think in general you're swimming against the cultural current here. My takeaway from the NEWLLM RFC is that people (myself included, so I'm certainly biased) want more restrictions on AI usage; we don't want to codify carve-outs and exceptions.
The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Ditto as above. This would be a direct reversal of NEWLLM, which explicitly says the opposite: that new articles generated 'from scratch' using an LLM are not allowed.
I would support this on the grounds that it's changed to at least explicitly prohibit the generation of new articles entirely or primarily with LLMs, because the RFC conclusion for NEWLLM has clearly determined that that restriction is approved by community consensus. In future I would propose a modification of it to forbid all AI-generated text, but that's my own agenda here anyway, of course. At a minimum, though, if this guideline is to serve as an expansion of what's been codified here at NEWLLM, it needs to explicitly forbid the creation of new articles with AI output.Athanelar (talk)21:48, 24 November 2025 (UTC)[reply]
Any blanket prohibition on LLM article creation or LLM use will never be absolute because all rules on Wikipedia have exceptions.SuperPianoMan9167 (talk)22:29, 24 November 2025 (UTC)[reply]
Relying on IAR daily is fraught with peril, IMHO. It is much better to codify safe AI use, if any.Викидим (talk)22:32, 24 November 2025 (UTC)[reply]
We already rely on IAR daily, for example WP:SNOW. It's the unconventional uses that are perilous. There's a trade-off between making a guideline/policy that perfectly addresses the problem and one that all users will easily understand; simplicity is valuable here.Kowal2701 (talk)22:41, 24 November 2025 (UTC)[reply]
Of course, but that doesn't mean the rule shouldn't exist in the first place. Vandalism is the perfect example. Everybody knows vandalism isn't allowed on Wikipedia. There's no objective standard for what 'vandalism' is, because it's something obviously impossible to define in a way that includes everything that a reasonable person would agree is vandalism and excludes everything that a reasonable person would agree isn't vandalism and respects the principle of AGF etc etc etc. Despite all those things, we can still say 'Vandalism isn't allowed on Wikipedia, vandalism will be removed, and vandals will be prevented from vandalising the wiki' because we're all sensible people capable of judging what is and isn't detrimental to Wikipedia and acting accordingly.
Saying 'creating articles wholesale from primarily AI-generated text is forbidden' is obviously going to have edge cases and exceptions, but the core of it is that if you create a new article, and reasonable consensus determines that the text of that article is primarily AI generated, the article is subject to removal.
"IAR exists" doesn't mean "we should never prohibit anything"Athanelar (talk)22:51, 24 November 2025 (UTC)[reply]
Newbies using LLMs are not intentionally trying to disrupt Wikipedia, and so you cannot compare them to vandals. Most of them have no idea how hostile the community is to LLMs and likely don't know why their outputs are inherently unreliable, probably because the hype around LLMs obscures their shortcomings in favor of promoting investment and public interest.SuperPianoMan9167 (talk)22:59, 24 November 2025 (UTC)[reply]
I didn't compare newbies using LLMs to vandals. I used vandalism as an illustrative example of why a prohibition being hard to strictly define and subject to exceptions doesn't mean the prohibition shouldn't exist.Athanelar (talk)23:01, 24 November 2025 (UTC)[reply]
That makes sense. Sorry. I've seen the comparison of LLM users to vandals elsewhere on this talk page and I wanted to address it.SuperPianoMan9167 (talk)23:04, 24 November 2025 (UTC)[reply]
This is thoughtfully written and I like the framework, but I cannot support a guideline that allows "reviewed LLM output" when the evidence is overwhelming that such review is almost always insufficient. I see zero argument that newer editors in particular should be allowed to use LLMs to generate article content under any circumstance. If we want to allow a carve-out for reviewed LLM-generated content, we should create an LLM-user user right with the same requirements as autopatrolled.NicheSports (talk)21:53, 24 November 2025 (UTC)[reply]
I think that an llm-user right makes perfect sense, and had proposed it all the way through the RfC.Викидим (talk)21:54, 24 November 2025 (UTC)[reply]
+1 to both Niche and AthanelarKowal2701 (talk)22:23, 24 November 2025 (UTC)[reply]
This is far too bloated with needless words and yet achieves very little–the opposite of what I wanted this guideline to achieve. This would be better given these changes:
== Purpose and scope ==
This guideline explains how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all models and all LLM-generated output. Get to the point; everyone knows what a guideline is.
In this guideline, a large language model (LLM) means any programme that can generate natural-language text (sentences and paragraphs) in response to prompts. This includes tools marketed as "AI chatbots" or "AI writing assistants", such as ChatGPT, Google Gemini, Microsoft Copilot, and similar services, whether used in a browser, an app, or built into other software. It does not cover spellcheckers, grammar checkers, or basic autocomplete, although editors remain responsible for any text they accept from such tools.
Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existing Wikipedia policies. This always applies; there's no need to restate our basic editing principles in every guideline.
== Do not use an LLM to add unreviewed content Anyone imbecilic enough to want to use an LLM to add substantial new content to an article can't be trusted to review it properly. We have seen time and time again that "review" is insufficient to counteract the inherent problems ==
Editors must not use an LLM to add unreviewed text to Wikipedia, whether creating a new article or editing an existing one. Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later.
For this guideline, unreviewed means output that the editor has not checked line by line for accuracy, sourcing, neutrality, and copyright problems against suitable reliable sources and against existing Wikipedia policies, at least as rigorously as if the editor had written the text themselves. You must verify every substantive claim against the cited sources. Reading the output to ensure it "sounds correct" is not sufficient.
In particular, editors should not:
  • Paste raw or lightly edited LLM output as a new article or as a draft intended to become an article.
  • Paste raw or lightly edited LLM output into existing articles as new or expanded prose.
  • Paste raw or lightly edited LLM output as new discussions or replies to existing discussions.
Where content is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, stubified, nominated for deletion, collapsed, or removed entirely, especially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies.
== Why LLM-written content is problematic ==
LLMs are language generation tools. They generate plausible-sounding text. This leads to recurring problems that conflict with Wikipedia policies. Guidelines are not information pages.


Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in the editor being blocked. Such blocks are intended to stop further disruptive use of LLMs on Wikipedia. This is all irrelevant – not to mention redundant to WP:LLM. Bloat like this is what turns useful guidelines into mazes of dry prose wherein it cannot be distinguished what is actual guidance and what is useless "background" information.


"The AI wrote it" is not a defence for violations of Wikipedia policy or guideline.No, they shouldn't use it, period.
== Handling existing LLM-generated content ==
The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Editors should base their actions on specific problems with the content under existing policies.
Where content appears to be substantially or wholly LLM-generated without review and does not comply with Wikipedia policy, editors may:
  • Remove the problematic material outright, especially in biographies of living persons.
  • Replace it with sourced, policy-compliant content.
  • Tag the page as LLM generated under Template:AI-generated.
  • Draftify, stubify, or nominate for deletion under the usual processes.
  • Mark the page for speedy deletion under criterion G15.
Don't shoehorn in useless side information that obscures the point of the guideline and bloats it into an information page, while also watering down the actual guideline to be next to useless. I think you've completely missed the point of what this guideline is trying to accomplish.Cremastra (talk ·contribs)02:03, 25 November 2025 (UTC)[reply]
@Cremastra, as a side note, could I introduce you to Template:Text diff? Use separate instances if you want to interpolate explanations.WhatamIdoing (talk)06:18, 27 November 2025 (UTC)[reply]
Some initial notes:
  • Many "grammar checkers" use AI now (Grammarly being the main example), and can go well beyond what most people would consider basic proofreading.
  • "Hallucinations" section should mention something about AI text claiming that real sources contain information or analysis that they actually don't. This isthe big issue with current chatbots and I think it gets overshadowed by "AI can make up fake sources."
  • Because of that, there should also be something saying "don't use AI to cite or summarize a source you didn't personally read."
  • Non-neutral point of view should add something about promotional/didactic/editorializing tone as this is a major issue, and also point out that AI can do this even when it claims it's making the text neutral.
  • Mention that AI can surface or cite unreliable sources, even when it claims they're reliable. (the "cites blog or free web host" tag gets a workout)
Gnomingstuff (talk)02:40, 25 November 2025 (UTC)[reply]
Yes, my experience with AIs is that they constantly hallucinate information, quotes, etc. in sources that appear nowhere in the text, often to the opposite implication of what the source states.Katzrockso (talk)02:53, 25 November 2025 (UTC)[reply]
There is a very easy way to prevent AI from hallucinating sources. This is a two-step process: (1) the human editor identifies the sources and uploads their text into the AI along with bibliographic information, and (2) in the prompt, the editor explicitly states to "use only provided sources". That's it, the problem of sources is gone - AI will not hallucinate sources unless it is allowed to. Coaxing AI into creating OR, by not providing sources or not explicitly prohibiting it from finding its own, guarantees the issues. To me, this is simply an aspect of the use of a complex tool, that can be solved by education (and blocking the ones that are unable to learn).Викидим (talk)05:45, 25 November 2025 (UTC)[reply]
This doesn't always work and I have an example, will share tomorrowNicheSports (talk)05:47, 25 November 2025 (UTC)[reply]
I use Google Gemini 2.5 and now 3.0 for work extensively, and never had an issue with it not following "use only provided sources" instruction. I would be very interested in seeing the counter-example, naturally.Викидим (talk)06:09, 25 November 2025 (UTC)[reply]
AI will not hallucinate sources unless it is allowed to. – Models are not so perfectly constrained; there is no fundamental blocker to a model hallucinating a new reference, or permuting a provided one. Hallucinations are fundamental and cannot be entirely obviated just by providing a "better prompt".fifteen thousand two hundred twenty four (talk)05:53, 25 November 2025 (UTC)[reply]
As I have said, my opinion is based on personal experience. One of the main uses of AI is to summarize long texts for human consumption, so all model makers are very careful not to ruin this experience; the instruction "just use what you were shown" is therefore well-tested, and lapses of AI judgement (in this respect) should have low probability.Викидим (talk)06:13, 25 November 2025 (UTC)[reply]
Personal experience does not override the fundamental fact that these are, at their core, predictive models, and they will predict wrong. This is unsolvable with current approaches.
Also my experience is that they are poor at summarizing content; a recent Wikipedia-specific example would be the simple summaries debacle[3][4], which you may have missed.fifteen thousand two hundred twenty four (talk)08:26, 25 November 2025 (UTC)[reply]
We are both entitled to our own opinions. Currently, mine is that the capability of modern AI models on many text-processing tasks rivals that of a skilled human professional. I am referring to relatively standard tasks for now, but the shift is palpable. As an example, less than an hour ago, I completed a project that previously would have required the services of a specialist. These types of projects used to take a week; I completed this one single-handedly in three stress-free days. What was most disturbing is that it actually felt like collaborating with a technical expert. I could guide it: 'Shouldn't the value of X be zero here?' It would agree, provide the mathematical proof, and correct the handling of the condition. For reference, just a year ago, a similar attempt resulted in pages of incoherent text that the AI stubbornly insisted were correct. Consequently, I believe the current discourse on using AI in Wikipedia is largely driven by editors observing the output of free models of yesteryear used without proper prompting. To be clear, the vision of AI surpassing humans unsettles me, too, but this is the inescapable future. Humans must learn to use these new tools; those who don't will simply lose out to those who do. (AI disclosure: I have asked Gemini 3.0 to improve this text; the result certainly sounds more robotic, but feels way more readable now. For the avoidance of doubt, I do not use, and have never used, AI to write or polish my comments in English; this was a one-off demonstration).Викидим (talk)11:29, 25 November 2025 (UTC)[reply]
Your comments were perfectly readable before...Gnomingstuff (talk)20:02, 25 November 2025 (UTC)[reply]
And if they hadn't been, he wouldn't be qualified to assess whether the LLM output was an improvement. --LWGtalk20:13, 25 November 2025 (UTC)[reply]
That's not really true. If you know your work is weak (e.g., dyslexia, English as a second or third language), then you can often recognize improvements when they're shown to you, even if you couldn't come up with the better option on your own.WhatamIdoing (talk)06:23, 27 November 2025 (UTC)[reply]
Not sure I agree. I can bang out some freshman 101-level Spanish, but if someone were to rewrite that into other Spanish text, I would have no idea whether it was an improvement, or for that matter whether it even meant the same thing. Or if I chose a random article on the Spanish Wikipedia - hitting random gave me the article for "calculator" - I don't know how good or accurate it is. I can guess that certain things sound good or questionable -- e.g., LLM (Grandes Modelos del Lenguaje) que crean y adaptan calculadoras, abriendo un nuevo paradigma -- but I don't actually know because I'm not a fluent Spanish speaker.Gnomingstuff (talk)06:00, 30 November 2025 (UTC)[reply]
That Spanish means "Large language models that create and adapt calculators, opening a new paradigm".SuperPianoMan9167 (talk)06:05, 30 November 2025 (UTC)[reply]
I know basically what it means. But the nuances of the language -- is this sentence structure idiomatic? is it normal to use an English acronym like this in Spanish? does paradigma have the same promotional connotations in Spanish as it does in English? does the present-participle thing apply to Spanish AI text? -- are beyond my level of understanding of Spanish. And that's just one sentence with closer cognates than usual, in a language that does share cognates with English.Gnomingstuff (talk)20:46, 30 November 2025 (UTC)[reply]
Got it.SuperPianoMan9167 (talk)20:47, 30 November 2025 (UTC)[reply]
While agreeing with you on details, I would still posit that identifying the traits is easier than understanding the same traits and much easier than reproducing the traits. After all, through evolution humans got really good at guessing the whole picture while seeing just parts of it (the ones who were bad at it became food for tigers hidden in the overgrowth of the jungle). I thus fully expect myself to be capable of appreciating the beauty of texts written in a foreign language without any ability to write equally good texts, in the same way I can appreciate classical music without any skills to create it.Викидим (talk)17:56, 1 December 2025 (UTC)[reply]
Late to the party; support, thinking of this as a sensible policy proposal.
I think the main question is:
A) Apparently it is now possible to continuously iterate and prompt-engineer an article over time, i.e. you can tell the AI that fact A was wrong, and the AI can regenerate the text without that fact. Does this count as line-by-line verification? I think not.
B) Maybe include a link at the end to WP:AINB? And also info on when to escalate if a user refuses to stop misusing LLMs? i.e. An editor caught repeatedly using LLMs to generate and insert text that does not comply with other Wikipedia policy can be reported to ANI. Do not report an editor merely on suspicion of LLM usage if they have otherwise followed Wikipedia policy.User:Bluethricecreamman(Talk·Contribs)06:06, 1 December 2025 (UTC)[reply]
  • Oppose. For one, I do not agree that the current guideline is bad; it is a good starting point. It can be improved, but in my view that improvement should be expanding it to cover generating content for an article, so this proposal does not go far enough.GothicGolem29(Talk)18:55, 28 November 2025 (UTC)[reply]
  • This is a sensible policy proposal. My only quibble is that under "limited use" it should read "Editors are strongly discouraged from using LLMs to add content." --Enos733 (talk)18:17, 2 December 2025 (UTC)[reply]
  • Weak support: I guess this is the one that's emerging as the preferred revision. I have some nitpicks still about the text and frankly about the content (my ideal policy is basically NicheSports' below), but we don't have time to keep bikeshedding this shit. Going three years with no AI policy has done substantial damage to the integrity of Wikipedia, and frankly we're probably near the precipice where that damage becomes irreversible.Gnomingstuff (talk)02:49, 3 December 2025 (UTC)[reply]
  • Strong oppose: As I stated above for v1, I oppose anything that codifies "acceptable" LLM use; indicating to editors that there are forms of model use that are specifically sanctioned will further encourage LLM editing, to the detriment of the project. Currently usage is only allowed insofar as it is not explicitly banned, and this is how it should remain at most.
For me, this means no explaining how editors may and may not use models, only how they may not, no Editors remain fully responsible, including LLM-assisted edits (superfluous anyways), no Limited Use, etc.
I also oppose any form of disclosure requirement; requiring disclosure would, again, be explicitly sanctioning LLM use. See prior relevant discussions at WP:Village pump (policy)/Archive 205#Checkbox for disclosure and WP:Village pump (idea lab)/Archive 47#Adding LLM edit tag where similar concerns were expressed.
I'm wary of bikeshedding, same as Gnomingstuff, but adding a guideline that would explicitly endorse LLM editing is far from a trivial concern and requires careful consideration.fifteen thousand two hundred twenty four (talk)04:22, 3 December 2025 (UTC)[reply]
+1 The effort spent 'bikeshedding' now will be nothing in comparison to trying to enact more LLM restrictions down the road if we set something too permissive in stone now.Athanelar (talk)05:42, 3 December 2025 (UTC)[reply]
And in the meantime we have no policy.Gnomingstuff (talk)06:07, 3 December 2025 (UTC)[reply]
Extending the "meantime" is much preferred if it means we don't end up with a policy encouraging LLM use. I'm not seeking perfection, theDo not use an LLM to add unreviewed content section of qcne's proposal would be sufficient on its own. Much of the rest reminds me of anomnibus bill, packaging a policy that the community has previously indicated support for (via NEWLLM and G15) with unnecessary extras that the community has not (some LLM use is OK actually).fifteen thousand two hundred twenty four (talk)06:29, 3 December 2025 (UTC)[reply]
Having no policy effectively is encouraging LLM use. The longer we remain in that state, the longer we are encouraging LLM use, and the longer it accumulates on Wikipedia.Gnomingstuff (talk)13:02, 3 December 2025 (UTC)[reply]
Well, we have NEWLLM as it stands, which is already I would argue more restrictive (and therefore preferable to my own ideology on this) than qcne's proposal.Athanelar (talk)13:51, 3 December 2025 (UTC)[reply]

Version 3


I still believe this Guideline is grossly short and needs to be expanded a little bit, but am also taking into account the feedback given.

Would my much shorter Version 3 guideline here be at all acceptable to the more hard-line anti-LLM editors? I have:

  • made it more concise.
  • removed the limited use carve-out, with the idea that experienced editors can be trusted to use LLMs, and this Guideline is more focused towards new editors.

Hidden ping to users who have participated.qcne(talk)22:37, 3 December 2025 (UTC)[reply]

I predict that hard-line anti-LLM editors will still want the word "unreviewed" removed from "do not add unreviewed LLM-generated content to new or existing articles".SuperPianoMan9167 (talk)22:40, 3 December 2025 (UTC)[reply]
Yes, potentially, but I would like to have some sort of compromise!qcne(talk)22:41, 3 December 2025 (UTC)[reply]
Agreed.SuperPianoMan9167 (talk)22:42, 3 December 2025 (UTC)[reply]
The compromise on the reviewed language is to only allow it for experienced editors with an llm-user right. A few editors have suggested this. There is a vast amount of evidence (AfC, NPP, 1346 (hist ·log), any WikiEd class, etc.) that inexperienced editors essentially never sufficiently review LLM-generated prose or citations.NicheSports (talk)23:31, 3 December 2025 (UTC)[reply]
I think that'd have to be a separate RfC, would supportKowal2701 (talk)23:32, 3 December 2025 (UTC)[reply]
Given my experience with CCIs of autopatrolled and NPR editors, and even the odd admin, would you be offended if I scream "NO!" really loudly at the idea of tying LLM use to a user right?
Sorry, but I've had too much trouble with older users being grandfathered into the autopatrolled system to be comfortable with the idea of giving somebody the right to say "Oh, but my use of ChatGPT is fine - I have autopatrolled!"GreenLipstickLesbian💌🧸23:48, 3 December 2025 (UTC)[reply]
Valid point. There's been at least one editor who had their autopatrolled right revoked for creating unreviewed LLM-generated articles.SuperPianoMan9167 (talk)23:53, 3 December 2025 (UTC)[reply]
Far from being offended, I actually laughed 😅 but I would still much, much rather deal with that problem than continue the fantasy that inexperienced editors should be allowed to use these tools with review that is never performed!NicheSports (talk)23:56, 3 December 2025 (UTC)[reply]
Disagree with adding an LLM-user right, but either way I think that is best workshopped elsewhere.fifteen thousand two hundred twenty four (talk)23:55, 3 December 2025 (UTC)[reply]
The issue with "unreviewed" is that it is at risk of beingwikilawyered, even a bad review would be kosher. Otherwise it's great. I worry that by having a nuanced approach, it'd struggle to communicate a clear message, especially since people dispositioned to use LLMs likely already have CIR issues that LLM-use is compensating for. I'd remove "unreviewed", andespecially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies can support people's IAR "not what the policy was intended for" (if they so want) in the fringe cases LLM-use is not practically problematic, subject to consensus.Kowal2701 (talk)23:08, 3 December 2025 (UTC)[reply]
"insufficiently reviewed" has more wiggle room while still allowing for the edge cases; once any problem is identified, it puts the responsibility on the person adding the content rather than other editors.GreenLipstickLesbian💌🧸23:43, 3 December 2025 (UTC)[reply]
That'd be good tooKowal2701 (talk)01:30, 4 December 2025 (UTC)[reply]
Honestly, "unreviewed" has been my main point of disagreement in every proposal that includes it -- thank you for articulating it. There are two fundamental problems:
First, if it's hard to know whether someone used AI, it's even harder to know how much they reviewed it.
Second, and more problematic: Properly "reviewing" LLM content means that every single word, fact, and claim needs to be verified against every single source. You essentially need to reconstruct the writing process in reverse, after the fact. But most good-faith editors who use AI seem to think "reviewing" means one of two things:
  • Quickly skimming it and going "yeah that looks OK."
  • Using AI to "review" the text.
This results, and will continue to result, in the following situation: Editor 1 finds some bad AI text. Editor 1 says that the AI text wasn't reviewed, and they aren't wrong. Editor 2 says that they did review the AI text, and they aren't lying. Meanwhile, the text remains bad.Gnomingstuff (talk)01:31, 5 December 2025 (UTC)[reply]
Enthusiastic support. I think this is the best we're going to get for a compromise option between the two LLM ideologies here.
You don't leave any room for 'acceptable' carve-outs, you've included the very direct "Editors should not use an LLM to add content to Wikipedia, whether creating a new article or editing an existing one." which, although it uses 'should' and not 'must,' serves to discourage LLM use in general, which is very desirable for me. You've preserved the spirit of NEWLLM by categorically saying "Do not" use an LLM to author an article or major expansion, you've codified LLMCOMM by saying "Do not" use LLMs for discussions.
My only suggested change would be to drop the "Why LLM content is problematic" section. We already have that covered at WP:LLM; there's no need to bloat this guideline by including it here. Other than that, I think this is exactly the kind of AI guideline we should have right now.Athanelar (talk)23:11, 3 December 2025 (UTC)[reply]
If we do that we should probably make WP:LLM an information page.SuperPianoMan9167 (talk)00:13, 4 December 2025 (UTC)[reply]
I think that's totally fine. We can link to it from qcne's proposal (and even promote it to supplement if necessary). It's better than adding unnecessary bloat to the guideline. The main target for this guideline, after all, is going to be people who are already using AI for something and need to be told to stop, who probably aren't going to be interested in the finer points of why LLM use is problematic. If they want to do the further reading, they can.Athanelar (talk)03:10, 4 December 2025 (UTC)[reply]
I did it.SuperPianoMan9167 (talk)03:16, 4 December 2025 (UTC)[reply]
Awesome, thank you @SuperPianoMan9167.qcne(talk)11:17, 4 December 2025 (UTC)[reply]
I was reverted. I did say people could do that when I made the change.SuperPianoMan9167 (talk)16:46, 4 December 2025 (UTC)[reply]
I appreciate your work here. I do think what you have makes sense and also is realistic in how editors work. As for "unreviewed," could a footnote work to explain what "reviewed" means? -Enos733 (talk)23:14, 3 December 2025 (UTC)[reply]
I'd like that. FTR I'd still support this regardless as it's a massive improvementKowal2701 (talk)23:22, 3 December 2025 (UTC)[reply]
  • Your ping missed me, but I really like the version 3 proposal. I agree with GreenLipstickLesbian that "insufficiently reviewed" would be better verbiage, but it's not a blocker. This would have my support as-is. Adding raw or lightly edited LLM output degrades the quality of the encyclopedia, and frequently wastes the time of other editors who must then clean up after it. This proposed guideline would explicitly prohibit such nonconstructive model use in a clear manner, and would serve as a useful tool for addressing and preventing instances of misuse.fifteen thousand two hundred twenty four (talk)00:11, 4 December 2025 (UTC)[reply]
Support I like it. Since that's not an argument, I also think this is finally a version Randy in Boise can understand and follow.~ Argenti Aertheri(Chat?)01:51, 4 December 2025 (UTC)[reply]
  • Serious concern: isn't this proposal contradictory? How can both of these statements be in the same guideline?
  1. Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later. (Emphasis my own)
  2. Editors should not... Paste raw or lightly edited LLM output into existing articles as new or expanded prose. #2 strongly implies it is fine to add reviewed LLM content. But this directly contradicts #1.NicheSports (talk)02:06, 4 December 2025 (UTC)[reply]
    These do not read as contradictory to me. Nowhere in #1 does it prohibit LLM use.
    even if you plan to edit the output later means editors cannot immediately add LLM output to the project with an excuse of "I'll fix it later"; they must fix it first before it can be added at all.fifteen thousand two hundred twenty four (talk)02:26, 4 December 2025 (UTC)[reply]
    I'm not sure about that interpretation... what about the first part of that sentence: Do not use an LLM as the primary author.... Still pretty contradictory. Either you can use an LLM to generate a bunch of text and then edit it, or you can't. This guideline, as written, plays both sidesNicheSports (talk)02:47, 4 December 2025 (UTC)[reply]
    I don't follow. #1 applies to edits which create new articles or are major expansions, situations where majority-LLM authorship would be especially undesirable, and so that is explicitly disallowed. #2 applies to editing in general, where raw or lightly-edited LLM content is disallowed. Maybe you could pose a hypothetical editing scenario where you believe a contradiction would occur, and that would help me understand your point better.fifteen thousand two hundred twenty four (talk)03:15, 4 December 2025 (UTC)[reply]
    Oh. With this interpretation, I would support! But if I don't understand this I guarantee you a lot of the non-native English speakers who are using LLMs would miss the distinction. Can we clarify the wording?NicheSports (talk)03:19, 4 December 2025 (UTC)[reply]
    It reads well to me, so I'm not sure what changes could be made, @Qcne may have some suggestions?fifteen thousand two hundred twenty four (talk)03:30, 4 December 2025 (UTC)[reply]
    I mean the header needs to be changed but it could just be changed to "Rules for using LLMs to assist with article content" or something neutral. We should specify that #1 above are rules for "major content additions" while #2 is rules for "minor content additions".NicheSports (talk)03:31, 4 December 2025 (UTC)[reply]
    I do prefer the current Do not use an LLM to add unreviewed content header; it communicates up-front what the most basic requirement is before providing more detail below.
    #1 does already specify that it concerns new articles or major expansions, and #2 already applies to all editing; narrowing its scope would introduce another point of argumentation (define "minor" vs "major"). The grammatical clarity could maybe be improved, but right now it's in good enough condition for adoption, and as said prior, I'm wary of bikeshedding.fifteen thousand two hundred twenty four (talk)03:51, 4 December 2025 (UTC)[reply]
    I also think we need to be wary of any headline like the suggested "Rules for including LLM content" for fear of implying permission. I do think the "do not" header is the best way to go about it, and the way it's currently written is fine for a compromise guideline which isn't aiming to be a total ban.Athanelar (talk)04:00, 4 December 2025 (UTC)[reply]
    The categories could just be "New articles or major expansions" and "General considerations". Could just be a bolded title before each section. That would be enough to make it clear (I support your interpretation but completely missed it when I first read it). I disagree with the "unreviewed content" header, because it does contradict the guideline's language for new articles and major edits, and is going to confuse the heck out of newer editors, but I guess I can live with it for now.NicheSports (talk)04:05, 4 December 2025 (UTC)[reply]
Third time is truly a charm. I really like this one.Викидим (talk)02:54, 4 December 2025 (UTC)[reply]
  1. Remove the entire "Why LLM-written content is problematic" section. As I've said before, guidelines aren't information pages. Remove unnecessary words.
  2. Change to: "Do not use an LLM to addunreviewed content"
  3. "Handling existing LLM-generated content" – good section. Thumbs up from me on this one.
Cremastra (talk ·contribs)03:06, 4 December 2025 (UTC)[reply]
If guidelines aren't information pages, then shouldn'tWP:LLM be tagged as an information page?SuperPianoMan9167 (talk)03:08, 4 December 2025 (UTC)[reply]
IMO, yes, because that's what it is – it provides useful information on why LLMs are problematic and factual tips to handle and identify them.Cremastra (talk ·contribs)03:10, 4 December 2025 (UTC)[reply]
 Done in Special:Diff/1325613952. WP:LLM is now an information page.SuperPianoMan9167 (talk)03:16, 4 December 2025 (UTC)[reply]
When/if qcne's guideline goes live, we must remember to add it to the information page template there as a page that is interpreted by it.Athanelar (talk)03:31, 4 December 2025 (UTC)[reply]
I got reverted.SuperPianoMan9167 (talk)16:44, 4 December 2025 (UTC)[reply]
Guidelines aren't information pages, true, but you do need to explain to people why the guideline exists; Wikipedia attracts far too many free-thinking, contrarian, and libertarian types who like asking "why?" and will resist a nameless figure telling them what to do unless they're provided a reason to do otherwise.GreenLipstickLesbian💌🧸03:10, 4 December 2025 (UTC)[reply]
Guidelines should absolutely link – prominently! – to pertinent information pages, and give a one or two-sentence explanation of why the guideline is necessary. But whole sections dedicated to justifying its existence mean that the important parts are covered by clouds of factual information rather than principled guidance, which is confusing for new editors, who need the guidelines most.Cremastra (talk ·contribs)03:12, 4 December 2025 (UTC)[reply]
Change to: "Do not use an LLM to addunreviewed content" – I don't think this is going to shape up to be that kind of full-ban proposal (unlike #1 and #3 on this page are). That said, the core text as-is would be straightforward improvement while also posing no impediment to adopting more restrictions in the future.WP:NEWLLM was a small step, this would be a larger one, I'd suggest not letting perfect be the enemy of better.fifteen thousand two hundred twenty four (talk)03:27, 4 December 2025 (UTC)[reply]
Thanks for all the comments. I have formally opened an RfC:User talk:Qcne/LLMGuideline#RfC: Replace text of Wikipedia:Writing articles with large language models.qcne(talk)11:28, 4 December 2025 (UTC)[reply]

Further amendment proposal #3: Athanelar

This proposal is part of anWP:RFCBEFORE with the goal of reaching a comprehensive guideline for LLM use on Wikipedia.

Throwing my hat in the ring: essentially the same as Festucalex's proposal, but with a slightly narrower scope that doesn't imply we're trying to police people using AI for idea generation or the like.

Large language models (or LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch.
+
Large language models (or LLMs) are not good at creating article content which is suitable for Wikipedia, and therefore should not be used to generate content to add to Wikipedia, whether for new articles or when editing existing ones.

Athanelar (talk)15:17, 24 November 2025 (UTC)[reply]

This completely changes the purpose of this guideline (expanding its scope from new articles to all edits) and would require a new RfC.Toadspike[Talk]15:48, 24 November 2025 (UTC)[reply]
That's sort of the intention, yes. I assume Festucalex is doing the same, and the intention is to gauge support before a formal RfC to expand the guideline.Athanelar (talk)15:52, 24 November 2025 (UTC)[reply]
@Qcne andAthanelar: May I have your permission to change the headers from their present titles to this:
  • Further amendment proposal #1: Festucalex
  • Further amendment proposal #2: qcne
  • Further amendment proposal #3: Athanelar
Just to make it clearer to other editors? I'll also change the section link that Athanelar put above.Festucalextalk16:18, 24 November 2025 (UTC)[reply]
Of course, thank you.qcne(talk)16:19, 24 November 2025 (UTC)[reply]
Go ahead, thanks.Athanelar (talk)16:31, 24 November 2025 (UTC)[reply]
Done, thank you both. I took the liberty of adding an explanatory hatnote.Festucalextalk16:34, 24 November 2025 (UTC)[reply]
This is all to see if people support a new guideline as opposed to a proper change.GarethBaloney (talk)16:37, 24 November 2025 (UTC)[reply]
I suggest dropping the Large language models (or LLMs) can be useful tools part. It's not necessary and will cause an awkward divide if taken to RfC, where editors who more broadly oppose LLM use would have to endorse that they are useful tools.fifteen thousand two hundred twenty four (talk)16:27, 24 November 2025 (UTC)[reply]
I've modified my wording somewhat. I agree that part is unnecessary.Athanelar (talk)16:35, 24 November 2025 (UTC)[reply]
As I've discussed previously, personally I would prefer any guidance not to refer to specific technology, as this changes and is not always evident to those using tools written by others, and to focus on purpose instead. Along the lines of my previous comment in the RfC, I suggest something like "Programs must not be used to generate text for inclusion in Wikipedia, where the text has content that goes beyond any human input used to trigger its creation." (Guidance for generated images is already covered by Wikipedia:Image use policy § AI-generated images.)isaacl (talk)18:22, 24 November 2025 (UTC)[reply]
How would Text generation software such as large language models (LLMs) should not [...] sound?Athanelar (talk)18:26, 24 November 2025 (UTC)[reply]
Personally, I prefer using a phrase such as "Programs must not be used to generate text" as I think it better reflects what many editors want: text written by a person, not a program. I think whether it's in a footnote or a clause, text generation should be defined, so using programs to help with copy-editing, or to fill in the blanks of a skeleton outline is still allowed. Also, I prefer "must" to "should".isaacl (talk)19:16, 24 November 2025 (UTC)[reply]
"Programs" is too nonspecific I think; a word processor is arguably a "program used to generate text" for example. We need to be somewhat specific about what sort of technology we're forbidding here.Athanelar (talk)19:30, 24 November 2025 (UTC)[reply]
Thus why I said the meaning of text generation should be defined, and as I suggested, the generated text should not have content that goes beyond any human input used to trigger its creation. Accordingly, word processors do not fall within the definition.isaacl (talk)23:44, 24 November 2025 (UTC)[reply]
Honestly, I like this as the lead for Qcne's proposal above. Specifying it's about both creating articles and editing existing ones is good clarityKowal2701 (talk)21:41, 24 November 2025 (UTC)[reply]
Oppose. I would argue that the current text is already too restrictive (yes, AI can be abused, but so can WP:AWB) and needs to be handled in another way altogether (like AWB is handled).Викидим (talk)22:04, 24 November 2025 (UTC)[reply]
This proposal is more restrictive than proposal #2, so it can't serve as a lead for it.isaacl (talk)23:50, 24 November 2025 (UTC)[reply]
Support. I'm still going to try making incremental changes to improve the current version, but this closes the biggest loophole (inserting content into existing articles) while eliminating "from scratch". You're going to need to tighten your definitions, though, or you'll get "but it's only one sentence and I reviewed it".~ Argenti Aertheri(Chat?)21:13, 26 November 2025 (UTC)[reply]
How would you know whether one sentence was AI-generated? Is it practical to prohibit an undetectable use? Unenforceable "laws" can lead to a general disregard for rules ("Oh, yes, driving that fast is illegal here, but everybody does it, and the police don't care" becomes "Nobody cares about speeding, and reckless driving is basically the same thing").WhatamIdoing (talk)06:28, 27 November 2025 (UTC)[reply]
Is it practical to prohibit an undetectable use? – Banning all use bans all use. All vandalism is prohibited, not just detectable vandalism, same for NPOV violations, promotion, undisclosed paid editing, sockpuppetry, etc. What can be detected will be, what can not will not. I do not understand your point.fifteen thousand two hundred twenty four (talk)06:43, 27 November 2025 (UTC)[reply]
Yes, banning bans all use. But if you can't tell whether the use happened, or prove that it didn't, then we might end up with drama instead of an LLM-free wiki.WhatamIdoing (talk)02:20, 28 November 2025 (UTC)[reply]
We can't prove COI or undisclosed paid editing either, we still don't allow them.~ Argenti Aertheri(Chat?)19:39, 28 November 2025 (UTC)[reply]
And we end up with drama about that regularly, when an editor issues an accusation, and the targeted editor denies it, and how do you prove who's correct?WhatamIdoing (talk)02:47, 2 December 2025 (UTC)[reply]
Since that's all par for the course for COI, I think you may have misunderstood my !vote. I'm sorry if it sounded like I was trying to say one reviewed sentence should (not) be allowed. I meant to say: this will come up if this goes to RfC, so address it before the RfC. Personally I think one reasonable-length sentence is my comfort level, if only because of how much GPTs like to ramble.~ Argenti Aertheri(Chat?)18:31, 2 December 2025 (UTC)[reply]
  • Oppose - the contexts of "generation" are nuanced, making the term itself vague. The distinction between composing and revising is blurry. The difference between an AI composing and an AI assisting an editor with composing is not clear. And then there is the issue of false positives, giving editors the worry that their edits may be mistaken for AI-generated text. Keep in mind that new generations are growing up communicating with AI, being taught by AI, and following the instructions of AI; it is only natural that they will pick up the writing styles of the AIs they interact with. Most importantly, though, we should not overlook the key aspects of using AI: it is a skill that improves with practice, and the tools themselves are rapidly improving.   —The Transhumanist  23:18, 6 February 2026 (UTC)[reply]

Expanding CSD G15 to align with this guideline

[edit]

Those participating in this discussion might also be interested inmy discussion about potentially expanding CSD G15 to apply to all AI-generated articles per this guideline.Athanelar (talk)16:53, 24 November 2025 (UTC)[reply]

Discussion withdrawn within six hours by the OP due to opposition.WhatamIdoing (talk)06:29, 27 November 2025 (UTC)[reply]

Not a proposal, just some stray ideas

[edit]

I didn't participate in the original RfC and I haven't fully read the new proposals and discussions here, but I'll table the rough notes I've been compiling atUser:ClaudineChionh/Guides/New editors and AI in case there are any useful ideas there. (There might be nothing useful there; I'm still slowly working my way through the discussions on this page.)ClaudineChionh(she/her ·talk ·email ·global)23:04, 24 November 2025 (UTC)[reply]

After reflecting on the common refrain in these discussions that "AI is just a tool, we should judge LLM text by the same standards we judge human text", I also finally put some of my thoughts on this matter into essay form (complete with clickbaity title!) atUser:LWG/10 Wikipedia Policies, Guidelines, and Expectations That Your ChatBot Use Probably Violates. There's also a little "spot the LLM" easter egg if anyone wants a small diversion. --LWGtalk03:03, 25 November 2025 (UTC)[reply]

Further amendment proposal #4: Mikeycdiamond

[edit]
This proposal is part of anWP:RFCBEFORE with the goal of reaching a comprehensive guideline for LLM use on Wikipedia.

During the initial discussion of this guideline, I noticed that people were complaining that others would use it as a blanket reason to attack stuff at XFD because it might be by an AI. My proposal would fix that problem. I also noticed some slight overlap between the third sentence of my proposal and Qcne's proposal, but I would appreciate input on whether I should delete it. If my proposal were to be enacted, I believe it should be its own paragraph.

"When nominating an AI article for deletion, don't just point at it and say, "That's AI!" Please point out the policies or guidelines that the AI-generated article violated.WP:HOAX andWP:NPOV are examples of policies and guidelines that AIs commonly violate."Mikeycdiamond (talk)00:55, 25 November 2025 (UTC)[reply]

Oppose. I would compare the situation toWP:BURDEN - deleting AI slop should be easy at the slightest suspicion, keeping it should require disclosures / proofs of veracity, etc. (like BURDEN does in the case of unsourced text). This proposal goes in the opposite direction: another editor should be able to tell me that "this article looks like AI slop. Explain to me how you created this text", in the same way they can point to BURDEN and tell me "show me your sources or this paragraph will be gone".Викидим (talk)01:17, 25 November 2025 (UTC)[reply]
@Викидим, I have "the slightest suspicion" that the new articles you created atAttribute (art) andChristoph Ehrlich used AI tools. Exactly how easy should it be for me to get your new articles deleted?WhatamIdoing (talk)06:35, 27 November 2025 (UTC)[reply]
The key word in my remark is "slop". I do not think that everything that AI produces is sloppy. Incidentally, I already provide full disclosures on the talk pages. I hope this would convince other editors of the veracity of the article content, so the hypothetical AfD would not happen. So, (1) I firmly believe that using AI should be allowed and (2) I acknowledge the need to restrict the cost of absorbing AI-generated text into the encyclopedia.
My personal preference would be to have a special "generative AI" flag that allows the editor to use generative AI. For some reason this idea is not popular. An alternative would be to shift the burden of proof of quality onto the users of generative AI. For an article showing the telltale signs of AI use, absence of published prompts, or prompts indicating that the AI was involved in the search for RS, can be grounds for deletion IMHO.Викидим (talk)06:58, 27 November 2025 (UTC)[reply]
I think some editors believe "AI slop" is redundant (i.e., all generative AI is automatically slop), so your articles would be at risk of AFD.
Other editors believe that "deleting slop should be easy", even if it's not AI-related.WhatamIdoing (talk)02:22, 28 November 2025 (UTC)[reply]
Regarding the quality of AI output: based on what I have witnessed firsthand, the modern AI models, when properly used, can provide correct software code of quite non-trivial size. I will happily admit that the uncertainties inherent in any human language make operations with it harder than with programming languages, but the fact that AI (as of late 2025) in principle can generate demonstrably correct text is undeniable. The same thing apparently happens when AI is asked to produce, say, a summary of facts relating to X from a few-hundred-page book that references back to the pages in the original book. Here, based on personal experience, I am yet to encounter major issues, too. Writing a Wikipedia article is very close to this latter job, so I see no reason why modern AI, properly prompted, should produce slop. Unlike in the former case, where the proof of correctness is definite, I can be wrong, and will happily acknowledge it if somebody provides me with an example of, say, Gemini 3.0 summarizing text on a "soft" topic wildly incorrectly after adequate prompts (which in this case are simple: "here is the file with text X, create a summary of what it says about Y for use in an English Wikipedia article").Викидим (talk)04:39, 28 November 2025 (UTC)[reply]
Even if you think that modern AI can produce good content, other editors appear to be dead-set against it.
Additionally, you are opposing a request for editors to say more than "That's AI" when trying to get something deleted. Surely you at least mean for them to say "That's AI slop"? Because if "modern AI, properly prompted" is a reason for deletion, then your AI-generated articles will disappear soon.WhatamIdoing (talk)02:50, 2 December 2025 (UTC)[reply]
I understand the internal contradiction in my posture. It stems from the fact that I look at AI from two angles: as an editor who actually likes to create articles using AI and feels good about the need to wash hands prior to cooking the text, and as anWP:NPP member who occasionally faces the slop.Викидим (talk)06:46, 2 December 2025 (UTC)[reply]
My experience has been the opposite -- AI-generated text in my experience tends to represent sources so poorly that when I spot check some obviously-modern-AI text, there is a >50% chance that it's going to be the same old slop just with a citation tacked on.
Recent and characteristic example:Talk:Burn (Papa Roach song), generated a few days ago most likely with ChatGPT (based on utm_source params in the editor's other contributions). I don't know what LLM or prompt was used, but it took me only ~10 minutes to find several instances of AI-generated claims that sources say things that they simply don't. This isn't an especially noteworthy example either, it got it wrong in the exact same ways it usually does.
And if the article were to go to AfD -- note, I am not saying that it should --that is actually relevant, because the AI text is presenting one source as multiple, and in one case inventing fictitiousWP:SIGCOV literally just from a song's inclusion in a tracklisting. This becomes obvious when you read the cited sources, but many at AfD don't.Gnomingstuff (talk)20:55, 2 December 2025 (UTC)[reply]
Oppose in its current form. Generally I think AI usage falls underWP:NOTCLEANUP -- a lot of AI-generated articles are about notable subjects, especially the ones where there's a language gap. But I do think there are legitimate reasons to bring AI usage up at AfD, because AI can misrepresent sources, and in particular often misrepresents them by making a huge deal out of a passing mention, making coverage seem significant when it actually isn't. I also think that for certain topics -- POV forks, BLPs, etc. -- AI generation is a legitimate reason to just delete the thing.Gnomingstuff (talk)01:23, 25 November 2025 (UTC)[reply]
Support. Explaining howWP:AfD is not cleanup is very important to clarifying the scope of this guidelineKatzrockso (talk)01:31, 25 November 2025 (UTC)[reply]
PromoteWP:LLM to guideline We cite it and treat it as if it were a guideline and not an essay. For Pete's sake, just promote it already! It has everything necessary for a comprehensive LLM usage guideline.SuperPianoMan9167 (talk)02:01, 25 November 2025 (UTC)[reply]
We've already gone through a month-long RFC to promote this to a guideline. Could you imagine how large the debate would be if we tried to promote that essay? It might be quicker to work on this guideline.Mikeycdiamond (talk)02:05, 25 November 2025 (UTC)[reply]
That essay is comprehensive and well-written. In my opinion, it would be quicker to just promote it to guideline instead. Besides, it already contains guidance in the spirit of this guideline in the form ofWP:LLMWRITE. It also containsWP:LLMDISCLOSE, which I think should bepolicy (and I am honestly baffled that it isn't).SuperPianoMan9167 (talk)02:09, 25 November 2025 (UTC)[reply]
No one is stopping you from making an RFC. I don't disagree with you, but I am not sure if it would pass.Mikeycdiamond (talk)02:12, 25 November 2025 (UTC)[reply]
I was looking through LLM's talk page archives; there was an RFC in 2023. The RFC showed large consensus against promoting it, but a lot has changed since then.Mikeycdiamond (talk)02:22, 25 November 2025 (UTC)[reply]
Oppose; misses the point of NEWLLM, which is specifically to forbid AI-generated articles simply because they are AI-generated, and not because of AI-related policy violation.Athanelar (talk)02:56, 25 November 2025 (UTC)[reply]
That's your interpretation of the guideline. Other editors will interpret it in different ways.SuperPianoMan9167 (talk)02:57, 25 November 2025 (UTC)[reply]
The text of the guideline is pretty clear on what it forbids. It says that LLMs are not good at generating articles, and should not be used to generate articles from scratch. We can argue all day about what 'from scratch' means (which is what these amendment proposals are meant to solve) but the fact that the guideline forbids AI writing in itself is not I think ambiguous in any sense; there is no room in the proposal to argue that it's saying AI-generated articles are only bad if they violate other policies.Athanelar (talk)03:06, 25 November 2025 (UTC)[reply]
If they don't violate other policies/guidelines, what is the point of deleting them? Isn't the sole reason for banning AIs that they violate our other policies/guidelines?Mikeycdiamond (talk)03:11, 25 November 2025 (UTC)[reply]
Because they violatethis guideline, which says you shouldn't generate articles using AI.Athanelar (talk)03:15, 25 November 2025 (UTC)[reply]
WP:IMPERFECT andWP:ATD-E are core Wikipedia policies that collectively suggestWP:SURMOUNTABLE problems that can be resolved with editing should not be deleted.Katzrockso (talk)03:45, 25 November 2025 (UTC)[reply]
In my eyes, a guideline which says "Articles should not be generated from scratch using an LLM" logically means the same thing as "An article generated from scratch using an LLM should not exist." It would be kind of odd to me to argue that this guideline doesn't support deletion; because what, you're saying that youshouldn't generate articles using AI, but if you happen to do so, then it's fine as long as it doesn't violate other policies/guidelines? That would mean that this guideline really does nothing at all.
And anyway, your argument also arguably applies to an AI-generated article which violates other policies/guidelines, too. I mean, those problems might also be surmountable, so what's the problem there? Should we disregard CSD G15 and say that unreviewed AI-generated articles are fine as long as the article subject is notable and the article is theoretically fixable with human intervention?
Basically, I think adding a paragraph to this guideline saying that you can't use it to support deletion would mean there's no point in this guideline existing at all, and you might as well just propose that the guideline be demoted again.Athanelar (talk)03:57, 25 November 2025 (UTC)[reply]
Say Mary Jane generates an LLM-written article that has some major, but surmountable, issues. For example, two of her citations are to fake links, but other sources are readily available to support the claims, three of the claims are improperly in wikivoice when they should be attributed, and there is a section of the article that is irrelevant/undue. Would you suggest this article be deleted in whole, despite being otherwise a notable topic, or should editors be allowed to remedy the problems generated by the LLM usage?Katzrockso (talk)04:04, 25 November 2025 (UTC)[reply]
I think in the given example it would essentially be the same amount of effort to TNT the article and start from scratch as to try to rework it from the flawed foundation; so yes, I'd say deletion would still be fine in that case.
Besides, what exactly would we be fighting to keep in the other case? It's not as if we'd be doing so out of a desire to respect Mary Jane's effort in creating the article. We'd be trying to hammer a square peg into a round hole for no reason other than 'well, the subject's notable and the article's here now, so...'Athanelar (talk)04:11, 25 November 2025 (UTC)[reply]
It's my (and other editors') belief that TNT is not a policy-based remedy (WP:TNTTNT), but one that violates fundamental Wikipedia PAG. In my given example, I don't see how "it would essentially be the same amount of effort to TNT the article and start from scratch as to try to rework it from the flawed foundation". The remedy in my scenario would be:
1) Replace the fake link citations to the readily available real sources that support the claim
2) Change the three sentences that are improperly in wikivoice to attributed claims
3) Remove the off-topic/irrelevant section
If you think that is more difficult than starting from scratch, I don't know what to express other than shock and disbelief.Katzrockso (talk)06:02, 25 November 2025 (UTC)[reply]
About TNT: Has it ever occurred to you that the actual admin delete button isn't necessary? You can follow the process you're thinking of (AFD, red link, start new article) or you could open the article, blank the contents, and replace it with the new article right there, without needing to spend time at AFD or anything else first.WhatamIdoing (talk)06:37, 27 November 2025 (UTC)[reply]
(also, the article you've given as your example here would already be suitable for deletion under CSD G15 whether or notWP:NEWLLM existed, so if you don't think that article would be suitable for deletion, you're also arguing we shouldn't have CSD G15)Athanelar (talk)04:13, 25 November 2025 (UTC)[reply]
Things are only as good as the parts that make them up. If it wasn't for HOAX or NPOV violations--among many others--this guideline wouldn't exist. We already have policies and guidelines for the subjects AIs violate; why shouldn't we use them? It is much clearer to point out the specific thing the text violates than to blindly say it is AI. I know AI text is relatively easy to spot now, but it will get progressively better at hiding from detection. What if people use anti-AI detection software? This guideline is meant to back up stronger claims using other policies/guidelines, not be the sole argument in an XFD.Mikeycdiamond (talk)03:09, 25 November 2025 (UTC)[reply]
The text of this guideline literally says 'LLMs should not be used to generate articles from scratch.' Your proposed amendmentto that guideline is to tell people that when deleting AI-generated articles, they cannot reference the guideline that specifically says 'Don't generate articles with AI' and must instead referenceother policies/guidelines that the article violates.
That would seem to defeat the whole point of passing a guideline that says 'Don't generate articles with AI,' wouldn't it?Athanelar (talk)03:14, 25 November 2025 (UTC)[reply]
Deletion policy wasn't really discussed all too much in the RfC or the nonexistent RFCBEFORE, so whether it defeats the purpose is not established. Many editors expressed positive attitudes towards the guideline because it provided somewhere to point to explain to people why their LLM contributions aren't beneficial.Katzrockso (talk)03:47, 25 November 2025 (UTC)[reply]
Oppose as defeating the purpose of having a guideline. We just passed a guideline saying "don't create articles with LLMs", this would effectively negate that by turning around and saying "actually, it's fine if it doesn't violate anything else". It doesn't work that way with any other guideline and for good reason: imagine nominating something for deletion due to serious COI issues and being told "nah, prove it violates NPOV". No, the burden of proof is on the editor with the conflict becausethey're already violating one guideline. This is one guideline, violating one guideline is enough.~ Argenti Aertheri(Chat?)21:27, 25 November 2025 (UTC)[reply]
I agree completely with the objections raised by Викидим and Gnomingstuff and Athanelar and Argenti Aertheri. AFD is about what an article is lacking (sourcing establishing notability), not about what bad content it has - just remove the bad content and AFD whatever is left if warranted. So there is no reason to treat NEWLLM differently from any other guideline there. --LWGtalk01:10, 26 November 2025 (UTC)[reply]
Oppose — This reminds me of when people tried to undercut the ban on AI slop images as soon as it passed. The guideline needs to made stronger, not weaker.pythoncoder (talk |contribs)15:39, 26 November 2025 (UTC)[reply]
Oppose per all above. A guideline is a guideline and a statement of principle, and should be used directly, not as through proxies. If there is overwhelming evidence an article is wholly AI-generated such that it falls afoul of this guideline, the article should be deleted at AfD.Cremastra (talk ·contribs)19:01, 26 November 2025 (UTC)[reply]
Oppose. Not topical in this guideline as this guideline is not about deletion in the first place.—Alalch E.23:52, 27 November 2025 (UTC)[reply]
Some people think it is, see#Expanding CSD G15 to align with this guideline.SuperPianoMan9167 (talk)00:04, 28 November 2025 (UTC)[reply]

community consensus on how to identify LLM-generated writing

[edit]

Not sure how I feel about this one.

On the one hand, there is some research suggesting that consensus helps: specifically, when multiple people familiar with signs of AI writing agree on whether a given piece of text is AI, they can achieve up to 99% accuracy. Individual editors were topping out at around 90% accuracy (which is still very good obviously).

On the other hand, "we have to treat an edit as human-generated until there's consensus otherwise" seems like a massive restriction that came out of nowhere -- it doesn't have consensus in the RfC and I'm not sure more than a handful of people even said anything close. Like, just think about how that would work in practice. Do we have to convene a whole AI Tribunal before reverting text that is very clearly AI-generated? Is individual informed judgment not enough?

This stuff is really not hard to identify.WP:AISIGNS exists, and is relatively up to date with existing research on common characteristics of LLM-generated text -- and specifically, things it does that text prior to ~2022 just... didn't do very often. This is also the case with Wikipedia text prior to mid-2022. I've been running similar if lax text crunching on Wikipedia articles before mid-2022, and the same tells have just skyrocketed. The problem is actually convincing people of this: that AI text consistently displays various patterns far more often than human text does (or for that matter, than LLM base models do), that people have actually studied those patterns, and that the individual edit they are looking at fits the pattern almost exactly. Is the page just not clear enough? Does it need additional citations?Gnomingstuff (talk)01:11, 25 November 2025 (UTC)[reply]

I think this caveat was added to the RfC only because the closer didn't believe there was enough consensus for the promotion to guideline, and adding the requirement for consensus to determine that an article is in fact AI generated helps to soothe those who think the guideline is over-restrictive.
I also think it's really a non-issue; since there's no support currently to expand CSD G15 to apply to all AI-generated articles, any article suspected of being AI-generated in violation of NEWLLM will have to go to AfD anyway, which automatically will end up determining consensus about whether the article is AI generated and should be deleted under NEWLLM.Athanelar (talk)03:02, 25 November 2025 (UTC)[reply]
"We have to treat an edit as human-generated until there's consensus otherwise" -- where did this come from?
As for your number crunching, I'm not sure if I understand the results, but if we are going to start taking phrases like "pivotal role in" and "significant contributions to" as evidence of LLM contributions, then I think this starts to pose problems.Katzrockso (talk)03:03, 25 November 2025 (UTC)[reply]
It's from the RFC closing note.Athanelar (talk)03:07, 25 November 2025 (UTC)[reply]
That sentence in the closing note is strange to me as well, and only makes sense in the context of an AFD or community sanctions on a problem user. In terms of reversion/restoration of individual suspected LLM edits, theWP:BURDEN is clearly on the user who added the content to explain and justify the addition, not on a reverting editor to explain and justify their reversion. In the context of LLM use, that means that if someone asks an editor "did you use an LLM to generate this content, and if so what did that process look like?" they should get a clear and accurate answer, and if they don't get a clear and accurate answer the content should be removed until they do. --LWGtalk03:22, 25 November 2025 (UTC)[reply]
I think ultimately it's just an effort by the closer to avoid 'taking a side' on what they perceived as a pretty tight consensus, and to preempt a controversy about the nature of the guideline; which of course is occurring anyway.Athanelar (talk)03:32, 25 November 2025 (UTC)[reply]
No, it's not any of those things. It's me knowing this argument was going to be made and pre-empting it. Where there's no rule or guideline, Wikipedia makes content decisions by consensus; so an edit isn't to be treated as AI-generated until either we've got consensus for a test that it's AI-generated or else we've analysed the edit and reached consensus that it's AI-generated.
I know this limits the applicability of the guideline but that's not because I'm unclear or unsure about the RFC outcome or worried about taking sides. It's because of how long-established Wikipedia custom and practice works.
A test of what actually identifies AI-generated writing should really be the next step, folks.—S Marshall T/C08:57, 25 November 2025 (UTC)[reply]
The issue is that requiring consensus before tagging content as problematic (instead of tagging the content and then followingWP:BRD) imposes an unnecessary restriction, even on current practices, which wasn't brought up in the discussion. This close would mean, for example, that we can't tag a page as{{AI-generated}} anymore without first requiring an explicit consensus. This isn't Wikipedia custom and practice for tagging and has never been.ChaoticEnby (talk ·contribs)09:10, 25 November 2025 (UTC)[reply]
The best solution to these problems is to reach consensus on a test. But obviously, tagging doesn't need consensus and never has. What's not allowed is to revert or delete content for being AI-generated unless there's consensus to do so. Just to be clear: all our normal rules apply. You can still revert for all the usual reasons. BRD still applies. ONUS still applies. You can still tag stuff you suspect might be problematic.—S Marshall T/C09:43, 25 November 2025 (UTC)[reply]
"But obviously, tagging doesn't need consensus and never has." This is certainly not obvious from your close, which says that "this means that we have to treat an edit as human-generated until there's consensus otherwise". A closure should only summarize the given discussion, not add new policies that need to rely on the word of the closer for later clarification, even if they would be a logical development from previous practice.ChaoticEnby (talk ·contribs)10:44, 25 November 2025 (UTC)[reply]
Summarize and clarify. A close should summarize the community's decision and clarify its relationship to existing policy and procedure. What we don't want looks like this: "I think this user is adding AI-generated content so I'm going to quick-fail all their AfC submissions and then follow them round reverting and prodding."—S Marshall T/C12:22, 25 November 2025 (UTC)[reply]
"What's not allowed is to revert or delete content for being AI-generated unless there's consensus to do so" --
I'm not aware of anything in policy stating this -- certainly not AI policy, because we don't have any. Based on the consensus of this RfC, and on the fact that people are already reverting and deleting content for being AI to relatively little outcry, I don't think there would be consensus for such a prohibition, and I think most people in the RfC would be surprised to learn they were !voting for one.Gnomingstuff (talk)10:16, 26 November 2025 (UTC)[reply]
As far as a test being the next step... I mean, I'm trying Jennifer. We haveWP:AISIGNS and are trying to make it as research-backed as possible. It is an evolving document, and I'm sure most contributors to it have their own list of personal tells they've noticed. (For example I trust @Pythoncoder's judgment implicitly on detecting AI but they see stuff I have no idea about. Apologies if you don't want the ping, I figured the outcome here is relevant to you.) But there are several problems:
Problem 1: Getting people to actually believe that these are signs of AI use. There seems to be no amount of evidence that is enough.
Problem 2: Getting people to interpret things correctly. This stuff gets very in-the-weeds, and AISIGNS leaves out a lot for that reason. For instance, one "personal tell" I have noticed is that Additionally starting a sentence, capitalized and followed by a comma, is a strong indicator of possible AI use, but the word additionally as an infix isn't necessarily a sign. Other tells I have are still kind of in the oven until I can hammer out a version with as few false positives as possible, with as little potential for confusion.
Problem 3: We are doomed to remain in the world of evidence, not proof. It is impossible to prove whether AI was used in an edit unless you are the editor who made it. Since we have had AI text incoming since 2023, many of those editors aren't around anymore. Other editors are not forthcoming with the information. Some dodge the question, some trickle-truth it, and a small handful of editors lie.Gnomingstuff (talk)10:34, 26 November 2025 (UTC)[reply]
This is exactly the shit I mean. When:
  • A word is identified inmultiple academic studies as very over-represented in LLM-generated text compared to human text
  • The most obvious phrase containing that word is roughly 1605% more common in one admittedly less rigorous sample of AI-generated edits compared to human-generated -- a substantial portion of which are human-generated articles tagged as promotional
...then yes, it would seem to be empirical evidence? No one can prove how a user produced an edit besides that user, but when patterns start showing up that happen to be similar patterns to ones cited in external sources as characteristic of AI use, that is telling.Gnomingstuff (talk)03:31, 25 November 2025 (UTC)[reply]
Empirical evidence of what? I have humanly generated both those phrases before (not on Wikipedia, I don't think, but elsewhere), are you going to suggest deleting my contributions on these types of grounds, because your model suggests that LLMs use these phrases at higher rates? Keep in mind that human language is changing as a result of LLMs ([5]), for better or worse.Katzrockso (talk)03:53, 25 November 2025 (UTC)[reply]
Empirical evidence that these words and phrases appear more frequently in the aggregate of AI-generated text -- in this case, on Wikipedia -- compared to the aggregate of human-generated text on Wikipedia. They also tend to occur together, and occur in the same ways, in the same places in sentences, the same forms, etc. So if an edit shows up with a whole bunch of this crammed into 500 words, that's a very strong indication that the text is probably AI. Not a perfect indication -- for instance,this version ofJulia's Kitchen Wisdom is way too early for AI but sounds just like it -- but a very strong one.
I am aware of the studies that human language is changing as a result of LLMs -- one study suggests that this particular set of words is really just a supercharge to increases in those words that were naturally happening already. That particular study is less convincing because it seems to think podcasts are never scripted or pre-written, which is... not true. But anecdotally I do see it happening. (It's a bit weird to hear this stuff out of human mouths in the wild, although that's probably just the frequency illusion given how much AI text I am seeing all day.) Not sure how much that affects Wikipedia, especially the last few years of AI stuff to deal with, given that the changes in human language feel like a lagging indicator.Gnomingstuff (talk)10:08, 26 November 2025 (UTC)[reply]
Incidentally GPTZero scans that revision of Julia's Kitchen Wisdom as 98% human, highlighting the pivotal role of illustrating the benefit of using multiple channels of evidence to assess content. --LWGtalk17:47, 26 November 2025 (UTC)[reply]
I have the opposite reaction to "Individual editors were topping out at around 90% accuracy (which is still very good obviously)": I look at that and say even the best of the best were making false accusations at least 10% of the time.
Imagine the uproar if someone wanted to work inWikipedia:Copyright problems, but they made false accusations of copyvios 10% of the time. We would not be talking about how good they are.
If anything, this information has convinced me that unilateral declarations of improper LLM use should be discouraged. Maybe tags such asTemplate:AI-generated should be re-written to suggest something like "This article needs to be checked for suspected AI use".WhatamIdoing (talk)07:01, 27 November 2025 (UTC)[reply]
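To put the concern about false accusations in concrete terms, here is a back-of-envelope sketch in Python, purely illustrative. The prevalence, sensitivity, and false-positive figures are assumptions chosen for illustration, not numbers from the study: if only 10% of reviewed articles are actually LLM-generated and a reviewer correctly flags 90% of those while wrongly flagging 10% of human-written ones, then about half of all accusations would be false.

# Illustrative base-rate arithmetic; all three numbers below are assumptions, not study results.
prevalence = 0.10            # assumed share of reviewed articles that are actually LLM-generated
sensitivity = 0.90           # assumed share of LLM-generated articles the reviewer flags
false_positive_rate = 0.10   # assumed share of human-written articles the reviewer wrongly flags

true_flags = prevalence * sensitivity                  # 0.09
false_flags = (1 - prevalence) * false_positive_rate   # 0.09
ppv = true_flags / (true_flags + false_flags)

print(f"Share of accusations that are correct: {ppv:.0%}")  # 50% under these assumptions

The only point of the sketch is that headline accuracy and the rate of false accusations are different quantities, and the latter depends heavily on how much genuinely AI-generated text is in the pool being reviewed.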
The template already says the articlemay contain them. There is a separate parameter,certain=y, that is added for cases where the AI use is unambiguous.Gnomingstuff (talk)04:21, 28 November 2025 (UTC)[reply]
There does not need to be a community consensus on how to identify LLM-generated writing. It's a technical question. Different editors will apply different methods. Disputes will be resolved in the normal way. —Alalch E.23:49, 27 November 2025 (UTC)[reply]
Tell that to the closing admin who specifically said in the RfC close "In particular we need community consensus on (a) How to identify LLM-generated writing [...]"Athanelar (talk)00:13, 28 November 2025 (UTC)[reply]
That statement is true because most signs of AI writing, except for the limited criteria of G15, are largely subjective.SuperPianoMan9167 (talk)00:17, 28 November 2025 (UTC)[reply]
A closer does not need to be an admin and the closer wasn't in this case.GothicGolem29(Talk)18:48, 28 November 2025 (UTC)[reply]

Further amendment proposal #5: Argenti Aertheri

[edit]
This proposal is part of anWP:RFCBEFORE with the goal of reaching a comprehensive guideline for LLM use on Wikipedia.
Large language models (or LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch.
+
Artificial intelligence, including GPTs and large language models (or LLMs), is not good at creating entirely new Wikipedia articles, and should not be used to generate new Wikipedia articles from scratch.

We barely got the thing passed, so I propose we make small, incremental, changes. Changing LLMs to all AI seems as good a place to start as any other, and probably less controversial than some.~ Argenti Aertheri(Chat?)03:53, 25 November 2025 (UTC)[reply]

Oppose. One of the primary criticisms the first amendment proposals were trying to address was the prominent criticism during the RfC that the term 'from scratch' has no agreed-upon definition and thus the scope of which articles this guideline applies to isn't clearly defined; your proposal doesn't address that, and in the process introduces a whole host of new ambiguity as to what tools are and aren't allowed, and in what capacity one might be allowed to use them.Athanelar (talk)04:00, 25 November 2025 (UTC)[reply]
There's a definition atwikt:from scratch. Merriam-Webster offers a similar definition.
There were 37 uses of "from scratch" in the RFC; most of them were entirely favorable. There were 117 editors in the discussion; I see four who complained about the "from scratch" wording, and some of them (example) would still be valid no matter what words were used.WhatamIdoing (talk)07:10, 27 November 2025 (UTC)[reply]
GPT is a type of LLM, not something that can be contrasted with it. What other forms of "artificial intelligence" (a dubious + nebulous concept) are creating Wikipedia articles other than LLMs?Katzrockso (talk)04:00, 25 November 2025 (UTC)[reply]
The point isn't to address all the problems in the guideline that passed, just one: what technologies does this include. I know AI is a nebulous concept, that's actually why I chose it, so thatWP:Randy from Boise can tell in seconds if his use of his software is included. Porn is a nebulous concept too, but we allknow it when we see it.~ Argenti Aertheri(Chat?)04:20, 25 November 2025 (UTC)[reply]
What is not covered by the existing guidelines that your change would include?Katzrockso (talk)05:56, 25 November 2025 (UTC)[reply]
1) Remove the unnecessary "can be useful tools", it's not relevant here.
2) Replace the technical term "LLM" with a more readily accessible definition that clarifies that we want human intelligence, not artificial intelligence, regardless of the exact technology being used. Ergo explicitly stating GPTs despite their being a subset of LLMs: people know what a GPT is and whether they're using one.~ Argenti Aertheri(Chat?)06:33, 25 November 2025 (UTC)[reply]
The "can be useful tools" part was just implemented as a part of the RfC on the two-sentence guideline, removing half of the approved text from the RfC is not a good start.
"clarifies that we want human intelligence, not artificial intelligence" makes no sense, is less clear than the current version and if anything muddies the scope and applicability of this guideline.Katzrockso (talk)09:34, 25 November 2025 (UTC)[reply]
Would you find it acceptable to change the current wording from "LLMs" to "LLMs, including GPTs" if no other changes were made?~ Argenti Aertheri(Chat?)19:02, 25 November 2025 (UTC)[reply]
I would find it acceptable/unobjectionable, I just think it's superfluousKatzrockso (talk)00:14, 26 November 2025 (UTC)[reply]
It's redundant if you know that GPTs are LLMs, but not if you're just Randy from Boise asking ChatGPT about the Peloponnesian War. Randy would likely have an easier time understanding the guideline with that explicitly spelled out.~ Argenti Aertheri(Chat?)01:35, 26 November 2025 (UTC)[reply]
Maybe a footnote like the one inWP:G15 would work, which saysThe technology behind AI chatbots such as ChatGPT and Google Gemini.SuperPianoMan9167 (talk)02:07, 26 November 2025 (UTC)[reply]
Works for me, hopefully it works for Randy too. Should I reword this proposal orWP:BRD?~ Argenti Aertheri(Chat?)07:22, 26 November 2025 (UTC)[reply]
I went ahead andadded the footnote.SuperPianoMan9167 (talk)22:47, 26 November 2025 (UTC)[reply]
This is much clearer/explanatory than the term "GPTs" or "artificial intelligence". Support this changeKatzrockso (talk)07:47, 27 November 2025 (UTC)[reply]
I think that @Katzrockso and @Argenti Aertheri make a good point, and it's one that could be solved by making a list. Imagine something that says "This bans article creation with AI-based tools such as ChatGPT, Gemini, and that paragraph at the top of Google search results. This does not ban the use of AI-using tools such as Grammarly, the AI grammar tools inside Google Docs, or spellcheck tools."
These lists don't need to be in this guideline, but it might help if they were long. It should be possible to get a list of the notable AI tools inTemplate:Artificial intelligence navbox.WhatamIdoing (talk)07:17, 27 November 2025 (UTC)[reply]
So this begs the question: why is Grammarly spell check allowed but not ChatGPT spellchecking? I'm not saying that people should plop "Write me a Wikipedia article" into an LLM and paste that into Wikipedia, but these LLMs have other use cases too. What use cases people want to prohibit/permit really needs to be laid out more explicitly for this to be workable.Katzrockso (talk)07:46, 27 November 2025 (UTC)[reply]
Here (as someone who admittedly has not used Grammarly since their adoption of LLM tech) it would (potentially) be that Grammarly uses a narrow and specific LLM model that has additional guardrails preventing it from acting in the generative manner that ChatGPT does. Or at least that would have been the smart way of rolling out LLM tech for Grammarly; as I said, I've not used it, so I don't know where they have implemented rails. --Cdjp1 (talk)16:54, 27 November 2025 (UTC)[reply]
In my experience reading Grammarly-edited text, it doesn't always use those guardrails well. It also tends to push a lot of more expansive AI features on people.Gnomingstuff (talk)17:07, 29 November 2025 (UTC)[reply]
In re "this begs the question why is Grammarly spell check allowed but not ChatGPT spellchecking?": Yes, well, that is a question, isn't it? And I think it's a question that editors won't be able to answer if they don't realize that ChatGPT can do spellchecking.
https://arxiv.org/html/2501.15654v2 (which someone linked above) gave 300 articles to a bunch of humans, and asked them to decide whether each article was AI-generated or human-written. They learned that individuals who don't use LLMs missed 43% of the LLM-written articles and falsely flagged 52% of the human-written articles as LLM-generated. This is in the range of a coin-flip; it is almost random chance.
I'm reminded of this because those non-users (e.g., me) are also going to be unaware of the various features or tools in the LLMs. A list might inform people of what's available, and therefore let us use a bit more common sense when we say "This tool is acceptable for checking your spelling, but that tool is prohibited."WhatamIdoing (talk)02:30, 28 November 2025 (UTC)[reply]
It's spellcheck: no one cares how you figure out how to spell a word as long as you knew which word you were trying to spell. I'd be wary of Grammarly unless they put in guardrails as Cdjp1 suggests, though, and if they have guardrails then that's what needs to be specified: which built-in guardrails make it OK?~ Argenti Aertheri(Chat?)04:50, 28 November 2025 (UTC)[reply]
Nobody should care how you figure out how to spell a word, but it sounds like some editors aren't operating with that level of nuance.WhatamIdoing (talk)02:52, 2 December 2025 (UTC)[reply]
LLMs can't do spellchecking in the sense we are used to. They can do something that can be similar in output, but the underlying process used won't be the same, due to the fundamental way llms work. In terms of tools, any llm-use will have this underlying generative framework because everything is converted into mathematics and then reconverted in some way. As Cdjp1 and Gnomingstuff note, refining any llm-use is about building the right guardrails, but these don't change the way the underlying program works. The complication with Grammarly is that it has its original software and new llm-based tools, and I'm not sure how much control or even knowledge the user has. Same possibly with Microsoft these days.CMD (talk)07:24, 2 December 2025 (UTC)[reply]
In a couple of years, will the average person realistically have a way to use ordinary word processing software (e.g., MS Word or Google Docs) without an LLM being used somewhere in the background? I don't know. Maybe it just looks inevitable because of where we are in theGartner hype cycle right now, but the inadvertent use of LLMs feels like it will only get bigger over time.WhatamIdoing (talk)19:59, 4 December 2025 (UTC)[reply]

Since copying over the footnote seems pretty non-controversial, version 2:

Large language models (or LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch.
+
Large language models (or LLMs) are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch.

While true, it's not relevant and only makes this mess messier. If it's a guideline about content creation then it doesn't really matter how well LLMs can do other tasks.~ Argenti Aertheri(Chat?) — Preceding undated comment added; no datestamp specified.

Since you didn't get any direct replies to this, here's a late comment:
We're trying to present this as a guideline that involved reasonable people making a reasonable choice about reasonable things, rather than a bunch of ill-informed AI haters. The guideline is less likely to seem unreasonable or to be challenged by pro-AI folks if it acknowledges reality before taking away their tools. Therefore the guideline acknowledges and agrees with their POV ("can be useful"), names the community's concern ("not good at creating entirely new Wikipedia articles"), and then states the rule ("should not be used to generate new Wikipedia articles from scratch").WhatamIdoing (talk)20:07, 4 December 2025 (UTC)[reply]
Agreed. The rules are principles, not lists of things that editors should and should not do.SuperPianoMan9167 (talk)20:10, 4 December 2025 (UTC)[reply]
Agreed.Alaexis¿question?21:04, 5 December 2025 (UTC)[reply]

Supplemental essay proposal on identifying AI-generated text

[edit]

Seeing as it has been noted (particularly by the RfC closer) that the existence of a guideline which prohibits AI-generated articles necessitates a consensus standard for identifying AI-generated articles, I've drafted a proposal which aims to codify ways that AI text can be identified for the purpose of enforcing this guideline (and any other future AI-restricting guideline).

The essay content is largely redundant toWP:AISIGNS but rather than just a list of AI indicators it specifically aims to be a standard by which content can be labelled as AI-generated.

Your feedback and proposed changes/additions are most welcome atUser:Athanelar/Identifying AI-generated text. If reception is positive I will submit an RFC.

Pinging some editors who were active in this discussion: @Qcne @Voorts @Gnomingstuff @Festucalex @Mikeycdiamond @Argenti Aertheri @LWGAthanelar (talk)17:55, 26 November 2025 (UTC)[reply]

I agree a consensus standard is implied, but I would guess any rate of false positives or negatives will render either a guideline or tools controversial. I have a few suggestions: 1) I prefer a 'weak' or humble standard: using various criteria or methods may suggest but not prove AI use. 2) Checking the volume of changes, either as a single submission or in terms of bytes/second from a given IP or account, may occasionally serve as a cheaper semi-accurate proxy for AI detection, although once again there will be false positives and negatives. 3) Given the rapid development and diversity of AI tools, and the resources involved, I do not think developing uncontroversial tools for AI detection is a feasible goal in the near future. Deploying automatic tools sitewide or on-demand would likely be cost-prohibitive, but if individual users wish to run them, I think their findings could contribute evidence towards a finding - so long as we guard against bias and overconfidence in the use of these tools. --Edwin Herdman (talk)19:32, 26 November 2025 (UTC)[reply]
The "suggest" wording is a good idea. For those who worry it may not be workable, our entire concept ofnotability rests on similar wording (e.g. "presumed to be suitable", "typically presumed to be notable"). If we're going down this road, I'd support wording like this and judgement by consensus in case of dispute.Toadspike[Talk]21:08, 26 November 2025 (UTC)[reply]
Regarding AI tools changing quickly, I did somevery very very rough analysis of text pre- and post-GPT-5 if anyone is interested. Will revisit once I have more data.Gnomingstuff (talk)03:57, 27 November 2025 (UTC)[reply]
I made one small tweak -- adding the bit about edits having to be post-2022 for AI use to even be possible. "Strongly suggest" is the best we can do, unfortunately. If the burden of proof is on the person tagging/identifying AI-generated text, then that is almost literally impossible to provide because no one knows how someone made an edit but that person.
As far as automated tools, you could do worse than just scraping all articles containing >5 instances (or whatever) of the listed "AI vocabulary" words, and then manually checking those to see what's up. (This is basically what I've been doing, minus the tools.) The elephant in the room, though, is that LLMs are changingright now --GPT-5.1 came out just 2 weeks ago. We also almost never know which tools people are using, let alone the version or prompt or provided sources. And all that is compounded by the fact that even researchers don't know why AI sounds the way it does. The whole thing is largely a black box, and it's honestly kind of surprising we (as in we-the-public) have figured anything out at all.Gnomingstuff (talk)00:11, 27 November 2025 (UTC)[reply]
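As a concrete illustration of the crude frequency scan described above, here is a minimal sketch in Python. The phrase list, the threshold, and the assumption of a local directory of saved article text files are all hypothetical and chosen only for illustration; a hit count is a prompt for human review, never proof of AI use.

import re
from pathlib import Path

# Hypothetical excerpt of an "AI vocabulary" list; see WP:AISIGNS for real examples.
FLAGGED_PHRASES = [
    "pivotal role", "rich cultural heritage", "stands as a testament",
    "delve into", "underscores the importance",
]
THRESHOLD = 5  # flag pages with more than this many total hits for manual review

def count_hits(wikitext: str) -> int:
    # Count case-insensitive occurrences of every flagged phrase in one page.
    text = wikitext.lower()
    return sum(len(re.findall(re.escape(phrase), text)) for phrase in FLAGGED_PHRASES)

def flag_candidates(dump_dir: str) -> list[tuple[str, int]]:
    # Scan locally saved article text files and return pages worth a human look.
    results = []
    for page in Path(dump_dir).glob("*.txt"):
        hits = count_hits(page.read_text(encoding="utf-8"))
        if hits > THRESHOLD:
            results.append((page.stem, hits))
    return sorted(results, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for title, hits in flag_candidates("article_dump"):
        print(f"{title}: {hits} flagged-phrase hits -- needs manual review")

Nothing about such a scan distinguishes AI text from a human who happens to like those phrases, which is why every hit would still need the kind of manual source-checking described elsewhere in this discussion.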
Thanks for your tweak. I haven't had any adverse reaction to this essay yet, so I'll give it until the 24 hour mark and if nobody's raised any major objections I'll put it up for RfC, and providing that passes then we can link to my essay from the NEWLLM page and that'll at least solve one of the RfC close's two problems. Then it'll just be a matter of codifying what we do if something breaches NEWLLM; but people seem to be generally on board with 'send it to AfD' as a solution for that already.
My fingers are crossed we can move onto RfC for a proposal to expand NEWLLM to include all AI-generated contributions and not just new articles.Athanelar (talk)00:16, 27 November 2025 (UTC)[reply]
This is redundant toWP:AISIGNS. Perhaps some content can be merged with AISIGNS. —Alalch E.23:47, 27 November 2025 (UTC)[reply]
Note for everyone subscribed to this discussion; I have raised an RfC at the essay's talk page.Athanelar (talk)00:20, 28 November 2025 (UTC)[reply]

A hypothetical scenario

[edit]

Here's a hypothetical scenario to consider. Say you have an editor writing an article. It's a well-written, comprehensive article. They publish their draft and it gets approved at AfC and moved to mainspace. If that editor then says "I used AI to write the first draft of this article", does this guidelinerequire the article be deleted, even though the content is perfectly acceptable?SuperPianoMan9167 (talk)00:52, 27 November 2025 (UTC)[reply]

Personally I believe that if the article has been comprehensively rewritten and checked line by line for accuracy prior to asking other editors to spend time on it at AfC, the tools used for the initial draft don't matter. --LWGtalk01:04, 27 November 2025 (UTC)[reply]
To me, "from scratch" implies a lack of rigorous review or corrections from a human editor. I attempted to clarify this in[6], but it got reverted. No reasonable person would require a perfectly-written and verified article to be deleted merely because an early draft was written with software assistance.Anne drew (talk ·contribs)01:05, 27 November 2025 (UTC)[reply]
It's possibly already happened, and AI has certainly been used for edits. One temporary account recently asked about it atthe help desk. I wrote my questions 0 and 1 for this case. Reasons I think are good for disallowing it are: 1) We don't like the 'moral hazard' of letting a part of the process not have human input, and the larger the change without human input and oversight, the greater the potential problem. 2) Openly allowing AI use might cause human reviewers to be overwhelmed. 3) The copyright status of Wikipedia content could be challenged, especially if 'substantive' AI edits are allowed to stand, a concern I think may be decisive for Wikimedia Foundation and ArbCom given the potential for losses. I think a lot of the rest of it is similar to the risks we accept in ordinary editing - bias and errors may propagate for a long time, but we hope that eventually somebody spots the problem. --Edwin Herdman (talk)02:34, 27 November 2025 (UTC)[reply]
It has absolutely already happened, to the tune of thousands of articles that we know about. And the ones we know about, we know about because there were enough signs in the text to be identifiable as AI.Gnomingstuff (talk)02:35, 27 November 2025 (UTC)[reply]
@Edwin Herdman, I don't think I understand "The copyright status of Wikipedia content could be challenged, especially if 'substantive' AI edits are allowed to stand, a concern I think may be decisive for Wikimedia Foundation and ArbCom given the potential for losses".
Does this mean that:
  • some of Wikipedia's contents will not be eligible for copyright protection? In that case, the WMF isn't going to care (they're willing to host public domain/CC-0 content, though they would prefer that it was properly labeled), and protecting editors' copyrights is none of ArbCom's business. (ArbCom cares about editors' behavior on wiki. They are not a general-purpose governance group.)
  • someone might (correctly) claim that they own the copyright for the AI-generated/AI-plagiarized contents of an article? In that case, the WMF will point them to theWP:DMCA process to have the material removed. If the copyright holder wishes to sue someone over this copyvio, they will need to sue the editor who posted it (not the WMF or ArbCom). This is in thefoundation:Policy:Terms of Use; look for sentences like "Responsibility — You take responsibility for your edits (since we onlyhost your content)" (emphasis in the original) and "You are responsible for your own actions: You are legally responsible for your edits and contributions" (ditto).
WhatamIdoing (talk)05:48, 27 November 2025 (UTC)[reply]
I wrote that badly, but you've clarified the issue. I can't assume Wikipedia will always benefit from the Safe Harbor provision - the DMCA might be amended again or even repealed, or Wikipedia might be found to fail the Safe Harbor criteria. Even without a suit seeking damages, the DMCA process imposes at least some administrative burdens which I would consider worth a rough worst-case scenario estimate. I'll be happy if wrong; AI risks on copyright aren't totally unlike what any editor can do without AI, what's different is mainly spam potential and the changing legal landscape. My final thought is that LLMs don't inherently bring copyright issues - it's possible an LLM with a clear legal status might be developed. --Edwin Herdman (talk)08:38, 27 November 2025 (UTC)[reply]
Based purely on the plain meaning of 'from scratch,' I would say that if the majority of the article's text is AI generated, then this guideline would suggest that the article should be deleted.
If a 'first draft' was written with AI and then substantially rewritten by a human, it would essentially be the same as doing it from scratch by the human, so it gets a pass.
'From scratch' to me implies you had nothing before, now you have an article. If that article was written with AI, then it falls afoul of this guideline.Athanelar (talk)15:07, 27 November 2025 (UTC)[reply]
I would argue that there are actually two ways to parse how the “from scratch” guideline applies:
1. (as intended) You may not use an LLM to write a wholly new article that does not exist on Wikipedia as of yet.
2. You may not write an article by asking an LLM to generate it “from scratch” - i.e., without putting in any information. (Implied: you may use an LLM if you provide it with raw data.)
In other words, it is entirely possible to read the “from scratch” clause as referring to the LLM generation process, and not the Wikipedia article process.~2025-36891-99 (talk)20:09, 27 November 2025 (UTC)[reply]
The answer is: No. To delete an article, it must be done in accordance with thewp:Deletion policy. —Alalch E.23:37, 27 November 2025 (UTC)[reply]
IMO this misses the point. We don't set policy based on what is possible, but based on the overall impact on the project. For example, I am sure there are users who could constructively edit withinWP:PIA from their first edit, but we don't let them, because on average letting inexperienced users edit in that topic area was leading to huge problems. The same logic applies here. We need to set LLM policy based on overall impact to the project.NicheSports (talk)23:57, 27 November 2025 (UTC)[reply]
We don't let new editors edit in the PIA topic area becauseArbCom remedies are binding and cannot be overturned by fiat. This guideline is not like that.Reasonable exceptions should still be allowed.SuperPianoMan9167 (talk)00:13, 28 November 2025 (UTC)[reply]
I was speaking more generally about how our LLM PAGs should develop in the future. This guideline is far from ideal and clearly is going to change. I don't know the right first step, I just know what I want it to get to.NicheSports (talk)00:16, 28 November 2025 (UTC)[reply]
Is your ideal LLM guideline something likeWP:LLM?SuperPianoMan9167 (talk)00:20, 28 November 2025 (UTC)[reply]
WP:LLM covers a lot, so there are parts I'd probably agree with, but as it relates to usage of LLMs, no. My ideal policies would be
  • LLMs cannot be used to generate articleprose or citations, regardless of the amount of review that is subsequently performed, unless the editor is experienced and possesses thellm-user right
  • Experienced editors could apply for thellm-user right, with the same requirements as autopatrolled
  • Users without thellm-user right could use LLMs for non prose-generating tasks. A few examples of this could be generating tables, doing proofreading, etc. We would need to draft an approved list of uses
  • I want to add a G15 criterion for machine-generated articles with multiple material verification failures. This would efficiently handle problematic LLM-generated articles
  • Content policy compliant LLM-generated articles would not need to be deleted. Although if they were discovered to be created by a user without thellm-user user right, we would warn the user about not doing so in the future.
NicheSports (talk)00:38, 28 November 2025 (UTC)[reply]
So kinda like how AutoWikiBrowser (LLMs, like AWB, could be considered automated editing tools that assist a human editor) requires special approval?SuperPianoMan9167 (talk)00:41, 28 November 2025 (UTC)[reply]
Yes, but with more restrictive criteria than AWB. I think the autopatrolled requirements are a nice fit (and kind of spiritually related)NicheSports (talk)00:46, 28 November 2025 (UTC)[reply]
Please drop tables from the list of approved uses: LLMs will generate them, and at face value seem to do it well, but under the hood it's a different story. Maybe there's some version that does it well, or we could put guide rails on it, but GPTs format tables with overlapping column and row spans that are barely human readable. They're great with templates in general, though, if you check they haven't done more than copy and paste. "Put this text in this template following these rules" usually works beautifully, but not tables; the wiki table formatting is just too weird I guess.~ Argenti Aertheri(Chat?)02:10, 28 November 2025 (UTC)[reply]
This is a very nice proposal, reflecting both the current situation (AI is simply as good as most humans on many technical tasks, so banning its use makes no sense) and concerns about a flood of disastrous content generated with AI due to ignorance, greed, or malice.Викидим (talk)18:24, 2 December 2025 (UTC)[reply]

Content self feedback


I would like to suggest that the concept of a closed-loop system be considered and somehow discussed in the guideline. The LLM nightmare is when other sources pick up half-baked content from AI-generated material, and said sources then pick it up again from each other. The feedback can continue and eventually many sources will affirm each other. The term to use then is: jambalaya knowledge.Yesterday, all my dreams... (talk)16:17, 29 November 2025 (UTC)[reply]

We do haveWP:CITOGENESIS which describes this regarding Wikipedia, not quite the same but Wikipedia is a big feeder for AI training sets.Gnomingstuff (talk)17:06, 29 November 2025 (UTC)[reply]
I did not know about that page, so thank you. The LLM problem is in fact a super turbocharged version of that.Yesterday, all my dreams... (talk)20:59, 29 November 2025 (UTC)[reply]
We do have a mainspace article onmodel collapse which is the term for this phenomenon in large language models. It's not really relevant to this guideline specifically, though.Athanelar (talk)14:29, 30 November 2025 (UTC)[reply]

Nutshell


@Novem Linguae: Nothing personal, but I challenge your assertion that this page is too short to have a nutshell. Having a modicum of humor helps keep this project from drowning in bureaucracy.  —Hextalk14:38, 30 November 2025 (UTC)[reply]

Discussion atWikipedia:Village pump (policy) § RfC: Replace text of Wikipedia:Writing articles with large language models


 You are invited to join the discussion atWikipedia:Village pump (policy) § RfC: Replace text of Wikipedia:Writing articles with large language models. –Novem Linguae(talk)23:40, 5 December 2025 (UTC)[reply]

so... anyone want to RFCBEFORE


this is the official statement that an RFCBEFORE was attempted on December 8, 2025, so anyone saying that there wasn't one can be referred to this timestampGnomingstuff (talk)15:58, 8 December 2025 (UTC)[reply]

I put my ideas for what a comprehensive LLM guideline should have inUser:SuperPianoMan9167/LLM guideline ideas.SuperPianoMan9167 (talk)16:59, 8 December 2025 (UTC)[reply]
The proposals are also considered RFCBEFORE attempts.Mikeycdiamond (talk)19:37, 8 December 2025 (UTC)[reply]
They sure are. Maybe if the magic word "RFCBEFORE" is intoned, people will actually acknowledge that fact.Gnomingstuff (talk)21:09, 8 December 2025 (UTC)[reply]
I think people are mostly complaining about the short time period between when Qcne posted v3 of the proposal and when the RfC was opened.SuperPianoMan9167 (talk)22:04, 8 December 2025 (UTC)[reply]
In support of this idea, I hereby invite all interested parties to leave a short commenton my talk page, especially about what they most want to see in the guideline. Of course I am just a random editor who has had only a little bit of involvement in all this, but I'm here forRFCBEFORE to get this moving to the next stage. --Edwin Herdman (talk)21:53, 8 December 2025 (UTC)[reply]
Imo, this will be easier to keep track of if it's all in one place. Join us on the talk page for SuperPianoMan's ideas instead?~ Argenti Aertheri(Chat?)23:45, 8 December 2025 (UTC)[reply]
Agreed but I think this should be the one place, more visible than a userpageGnomingstuff (talk)23:52, 8 December 2025 (UTC)[reply]
The list of things I would like to see in a comprehensive LLM guideline is too long to put here unless I wrap it up in a{{cot}}. Besides, Qcne did the same thing (write it on a userpage) with his proposed guideline. I can transclude the text on this page so that it is more visible if that's more helpful.SuperPianoMan9167 (talk)23:55, 8 December 2025 (UTC)[reply]
I think Gnomingstuff is talking about discussing it here.Mikeycdiamond (talk)11:52, 9 December 2025 (UTC)[reply]
Also, Qcne is currently hosting a RFC on this guideline at the Village Pump.Mikeycdiamond (talk)13:25, 9 December 2025 (UTC)[reply]
I am aware of the RfC and have contributed to it multiple times. The entire point of this is because people are dismissing the RfC out of hand because "discussion did not take place," when we're coming up on 3 years of discussion of AI policy as well as 4 months of concentrated recent discussion, I'm curious what made you think I didn't know about it?
I would prefer all discussion happen here, in one consolidated place, rather than several scattered places including userpages and the like. That way it's impossible for anyone to say it didn't happen. Or rather, it's very possible for people to say that, and they probably will, but at least there will be a solid timestamp to point to.Gnomingstuff (talk)13:45, 9 December 2025 (UTC)[reply]
That way it's impossible for anyone to say it didn't happen. People definitely still will, but you're right that having it directly here would enable pointing to the "you had your chance" diff. SuperPianoMan,{{cot}} it I guess?~ Argenti Aertheri(Chat?)13:54, 9 December 2025 (UTC)[reply]
I wasn't. I was just referring to the fact that a RFCBEFORE after a RFC started is redundant. Also, I am losing track of who contributed where between the AI and temp account discussions. I currently have 24 notifications, and I'm exhausted.Mikeycdiamond (talk)14:17, 9 December 2025 (UTC)[reply]
It's probably not going to pass because of the contradictory language in the proposal, which is feedback I raised in the (24 hour) RFCBEFORE that Qcne ignored for some reason. Anyways, you know that I'm likely to support anything you propose, but what do you want to discuss here G? I think the best way forward would be to stick with Qcne's proposal but adjust it based on the RFC feedback.NicheSports (talk)14:25, 9 December 2025 (UTC)[reply]
Yes, I agree that editing Qcne's proposal might be the best way forward. Also, my talk page has moved at a glacial pace over the last two decades, but point taken, I'm striking the recommendation to use my talk page. --Edwin Herdman (talk)16:48, 9 December 2025 (UTC)[reply]
RFCBEFORE isn't some checkbox or hoop to jump through. Simply saying "this is an RFCBEFORE" doesn't somehow inoculate the RFC that comes out of this. The problem with the current RFC is that the RFC question--the specific proposal--had only been up for 12 hrs or so before the RFC. There is no specific proposal here, so, if and when somebody drafts an RFC question or RFC proposal, they should still wait days after posting the proposed RFC question/proposal here for feedback, before launching the RFC. That was the RFCBEFORE mistake made in the current RFC.
More substantively, before trying to write a guideline that summarizes consensus, consider first trying to figure out what the consensus is. So, a potential RFC question might be something like "Should all LLM use on Wikipedia be banned?" or, "What LLM use should be allowed and what LLM use should not be allowed?" When we have the answers to those questions, then we'll have a better understanding of what the consensus is, andthen we'll have a better chance of writing a guideline that documents that consensus. Avoid a "cart before the horse" situation.Levivich (talk)20:08, 9 December 2025 (UTC)[reply]
Maybe we need to clarifyWP:RFCBEFORE. The main point of the RFCBEFORE section (I know, I know,Wikipedia:Nobody reads the directions...) is:
  • Before you start an RFC, see if you should be using some other, non-RFC method (e.g., an ordinary discussion on the talk page, or apeer review or whatever)instead of an RFC.
In other words, RFCBEFORE is primarily about ways you can avoid having an RFC at all. That's not viable for aWP:PROPOSAL.
RFCBEFORE is☒Nnot:
  • a requirement,
  • a delaying tactic,
  • a discussion about whether to have an RFC, or
  • a discussion about how to word an RFC question.
In this case, I think that the proposal would have been more successful if the OP had engaged in more discussion before exposing the proposal to friendly fire (see also"It is crucial to improve a proposal in response to feedback received from outside editors. Consensus is built through a process of listening to and discussing the proposal with many other editors" inWP:PROPOSAL), but the problem there isn't a supposed violation of RFCBEFORE "rules". The problem there is that the proposal wasn't strong enough to get accepted. Those "but there wasn't any RFCBEFORE" comments should probably be understood as meaning something like "youjumped the gun with a weak proposal, and now we've lost one of our best chances to get these ideas approved".WhatamIdoing (talk)22:00, 9 December 2025 (UTC)[reply]
I think the main benefit of RFCBEFORE for this particular topic is to hash out the wordings of everything. Once you launch the big RFC, it is difficult to change any wordings. So for example, inWikipedia:Village pump (policy)#RfC: Replace text of Wikipedia:Writing articles with large language models, it's currently at approximately 22 support, 16 oppose. But many of the opposes cite that the proposed wordings contradict each other, with a couple sentences completely forbidding LLM use, then a couple sentences right after saying only raw or lightly edited LLM output is forbidden. This is the kind of thing to hash out in an RFCBEFORE. If this had been properly hashed out, perhaps there'd be fewer opposes in the current RFC. –Novem Linguae(talk)05:06, 10 December 2025 (UTC)[reply]
I agree with you that a discussion of the proposal would have been helpful in this instance.
Can you agree with me that nothing related to that is found inWP:RFCBEFORE, and that advice on discussing RFC questions before starting an RFC is instead found inWikipedia:Requests for comment#Statement should be neutral and brief?WhatamIdoing (talk)05:34, 10 December 2025 (UTC)[reply]
Perhaps the defined meaning of WP:RFCBEFORE and the commonly used meaning have diverged. I think of RFCBEFORE as improving RFC questions and options so that the future RFC isn't bogged down by those issues and can focus more on substance. –Novem Linguae(talk)06:28, 10 December 2025 (UTC)[reply]
I think that might be the case. People sometimes guess from theWP:UPPERCASE what it "ought" to mean, without checking to see what it actually says.WhatamIdoing (talk)19:16, 10 December 2025 (UTC)[reply]
This sort of thing often happens. The use of "Keep per WP:CHEAP" at RfD has now more or less completely diverged from whatWP:CHEAP actually says.Cremastra (talk ·contribs)19:54, 10 December 2025 (UTC)[reply]
Maybe you'd like to add both of those toWP:UPPERCASE.WhatamIdoing (talk)23:42, 10 December 2025 (UTC)[reply]

Ideas from SuperPianoMan9167


Here are my ideas (not proposals yet) for what a comprehensive LLM guideline should have (transcluded from one of my user subpages). This is NOT a draft guideline. This is a list of ideas for what a hypothetical all-inclusive LLM guideline would include.

SuperPianoMan9167's ideas

I think any comprehensive LLM policy or guideline should have:

  • An explanation of what an LLM is and how it differs from other applications of AI and simple automation (e.g. spellchecking)
    • The difference between generative and non-generative use cases
  • A list explicitly outlining acceptable uses of LLMs. These include the following (note that none of them involve directly copy-pasting chatbot output):
    • Analyzing diffs and usernames while doing recent changes patrolling, provided that a human makes the actual moderation decisions (examples includeWikiShield andUser:Ca/Automated RCP)
    • Generating ideas for new articles that a human will write, provided that the human checks for notability and does not copy-paste LLM output.
    • Providing suggestions on how to improve an article that a human then acts on, such as finding typos, provided that the human is aware of the LLM's limitations and does not copy-paste LLM output (à laUser:Polygnotus/Scripts/AI Proofreader)
    • On-demand, client-side summaries/simplified versions/explanations/question answering for articles, provided that it is only client-side, and never modifies article content. This could be as simple as using the chatbot sidebar in a web browser, which is impossible to detect anyway if no content was modified.
    • Wikitext manipulation suggestions/advice, provided that all suggestions are reviewed and approved by a human, and the human does not copy-paste LLM output (like inUser:JPxG/LLM demonstration; noticing a pattern here?)
    • Suggestions on how to improve your own writing, provided that all the writing is still done by a human and that the human does not copy-paste LLM output
    • Coding user scripts or user CSS pages, provided that the human has adequate knowledge of JavaScript and CSS, is aware of the security issues, and fully double-checks the output. Even then, the human should avoid copy-pasting LLM output
  • A list of unacceptable uses of LLMs. This is anything that involves a lack of competency, a lack of direct human supervision (only humans should make decisions), or both.
  • An explanation of how to detect LLM-generated content
    • For this, a heavily simplified version ofWP:AISIGNS that gives a broad overview and a link to the full page would suffice; I also think that page should be tagged as an information/supplement page
  • An explanation on how to deal with problematic LLM-generated content
  • A detailed explanation ofwhy LLMs are unacceptable for tasks like writing articles. This is important; otherwise the guideline will come across as "don't do it because we said so". This should ideally beon the same page as the guideline, but if necessary, it can be simplified for the guideline page and kept on a different page.
    • As part of the above, detailed information about what policies and guidelines LLMs tend to violate and why
  • A disclosure requirement (WP:LLMDISCLOSE). In my opinion this should be policy.
  • A competency requirement (WP:LLMCIR). This should also be policy.
    • An explanation that high-speed rewrites/edits to articles, as well as rapid article generation using LLMs, violate WP:MEATBOT
  • A description of what sanctions could be applied and why
    • A pointer toWP:AINB
    • An explanation ofWP:G15 and its three criteria
    • An explanation of LLM-related blocks
  • A separate explanation for images (and also video and audio, but those are less common on Wikipedia), since LLMs primarily generate text and require extra model architecture, like a token-to-image converter, to output in other modalities. Specifically:
  • A recommendation that newbies should avoid using LLMs until they gain sufficient experience doing the same tasks without LLMs
  • An explicit reminder toassume good faith of editors unaware of the limitations of LLMs
  • A reminder to remaincivil when handling LLM-generated content.

Feedback is welcome.SuperPianoMan9167 (talk)14:38, 9 December 2025 (UTC)[reply]

I think this is far too involved. 1. Our guideline doesn't need to define what an LLM is; that's what our articles are for. 2. This needlessly duplicates a lot of existing policies and guidelines, which do not stop applying once LLMs are involved. 3. I don't think it should contain exhaustive lists of what uses are okay, because editors who screw up may then point to it as a kind of "get out of jail free" card and we may end up misleading folks into getting themselves blocked. 4. Codifying a lot of this as a guideline, including AISIGNS, risks lagging behind the fast-changing technology that LLMs are and eliminating the very editor discretion and common sense that distinguishes us from machines.Toadspike[Talk]14:57, 9 December 2025 (UTC)[reply]
My responses to these points, in order:
  1. We still need to define LLMs because very often, editors are accused of using LLMs, they say "I didn't, I used [some grammar tool]!", and then other editors point out that many of those tools (like Grammarly) do in fact use LLMs. So I think some guidance is necessary to avoid this confusion on what an LLM is for the purposes of the guideline.
  2. I think it is useful to define how exactly existing policies apply to LLMs so that we have solid justifications for the guideline and so editors don't just think that Wikipedia is run by AI-haters when they get blocked for misusing LLMs. The point of a policy or guideline is toinstruct; pointing to other policies and guidelines and saying "you can't use LLMs because they violate [X policy]" is not at all helpful to new editors who have no idea what [X policy] says. For this reason, we should explainboth what [X policy] saysand how LLMs violate it.
  3. This argument has been made many, many,many times before. I believe it is invalid because guidelines arenot supposed to be lists of rules that you must follow or face consequences, despite many users endorsing such rules for LLMs (i.e. they want a guideline consisting only of enforceable rules like "if you use LLMs to write articles they will be deleted. If you repeatedly do this you will be blocked.") Again, the point of a policy or guideline is toinstruct, which means giving both dos and don'ts. For example, thesockpuppetry policy includesbothillegitimateandlegitimate uses of alternative accounts.
  4. LLMs are not a fast-changing technology; they are a stagnating technology. ChatGPT is three years old. Thetransformer, the type of neural network powering LLMs, is eight years old. The AI industry has basically forced itself into a dead end. The problems with LLM-generated articles, like hallucination, are due to theinherent limitations of the models themselves, and these limitationscannot be overcome just by making the models larger.
One last point: the main reason why I think this guideline would be helpful, despite it potentially being redundant to other policies and guidelines, is that it's more helpful to point to one page and say "that's our AI guidelines" than to say "well actually, we have no AI guidelineper se, but youcan't write articles with AI, youcan't write comments with AI, youcan't use AI-generated images..." etc.SuperPianoMan9167 (talk)16:51, 9 December 2025 (UTC)[reply]
LLMs might not be a fast-changing technology, but the characteristics of LLM output are fast-changing, in unpredictable (or at least heavily guarded) ways. AI-generated text from January 2024 reads way differently than AI-generated text from December 2025. Claude text probably reads differently than GPT-5 text than Gemini text than GPT-4o text. (Grok text absolutely does.) OpenAI is apparently rushing out a new ChatGPT update in a couple of days.
This means that yes, AISIGNS is always going to be a lagging indicator (as is the research it cites). The whole thing is like trying to map a black box. I don't think that's a huge problem, though, because A) even the outdated stuff, like the promotional "stands as a testament" crap, is still useful for finding undetected AI edits from 2023-2024, B) it's not meant to be a tool to sanction editors, and C) any AI guideline is useless if it doesn't contain guidance on how to actually find the AI text.Gnomingstuff (talk)20:38, 10 December 2025 (UTC)[reply]
In my mind it's less a policy draft and more a list of requirements for that policy, so we can at least all agree on what exactly we're trying to write here. Thus, since your points 1-3 have been repeatedly requested during various (before) RfCs, they should probably be touched on at least in the final policy. No policy is ever going to make everyone happy, but hopefully, if we can mostly agree on what that policy should include, we can stop getting sidetracked by minutiae.~ Argenti Aertheri(Chat?)18:46, 9 December 2025 (UTC)[reply]
In my mind it's less a policy draft and more a list of requirements for that policy, so we can at least all agree on what exactly we're trying to write here Yes, exactly.SuperPianoMan9167 (talk)19:24, 9 December 2025 (UTC)[reply]
Regarding point 3: I'm not the only one who disagrees with the idea that listing acceptable uses will encourage the unacceptable uses:

I disagree with "anyone who gets accused of using them unacceptably is just going to claim they were doing one of the acceptable things". This is an old argument on Wikipedia that I've seen raised many times, and I think it's bad advice, contrary to the fundamental purpose of guidelines, which is to teach. The notion that we shouldn't outline what is acceptable because people who do unacceptable things will claim it's acceptable is nonsensical to me.
— User:Levivich

Quoted fromthis comment.SuperPianoMan9167 (talk)19:34, 9 December 2025 (UTC)[reply]
I strongly agree with An explanation of what an LLM is and how it differs from other applications of AI and simple automation (e.g. spellchecking). Let's develop a good, shared understanding of what "this" is before we declare that "this" is banned.WhatamIdoing (talk)22:03, 9 December 2025 (UTC)[reply]
On that note, since the real issue for a lot of us is the generative parts specifically, should we actually be using "GPT" instead of "LLM"?~ Argenti Aertheri(Chat?)23:23, 9 December 2025 (UTC)[reply]
A GPT is a specific type of LLM. GPT stands for "generative pre-trained transformer". The problem with using GPT instead of LLM is people areguaranteed to conflate "GPT" with "ChatGPT" (this has led tobrand issues with OpenAI's models).SuperPianoMan9167 (talk)23:41, 9 December 2025 (UTC)[reply]
I know GPTs are a subset of LLMs, but they're the one that's actually causing headaches. Sucks for OpenAI, but for us perhaps something like "GPTs,including but not limited to ChatGPT"? Someone must have market data we can use to build a list of the popular ones.~ Argenti Aertheri(Chat?)02:00, 10 December 2025 (UTC)[reply]
Using the long name ofGenerative pre-trained transformer might interrupt the GPT = ChatGPT assumption.WhatamIdoing (talk)05:36, 10 December 2025 (UTC)[reply]
I don't know whether this is better suited to a guideline or an essay, but I really would like to see a section on "what to do if someone says you used AI." Rough idea below:
When someone asks whether you used AI, they're not trying to ban you or accuse you of editing in bad faith. They are trying to gather information that they currently don't have, and then use that information to improve the article. As the editor, you and you alone know whether you used AI (and if you truly don't know, that's a little concerning). The most productive way to respond, then, is to provide that information:
If you did use AI: Say that you used AI, including the version and prompt, the workflow, and the review you did. (Reminder about LLMDISCLOSE here if it ends up being policy, which it should).
Dodging the question is unlikely to go the way you want it to. In particular, the following are common responses to accusations of AI that usually don't go well:
  • Saying things like "if there is AI," "there shouldn't be AI," etc. We don't want to know whether there should be AI, or whether there could be AI -- we want to know whether there is.
  • Saying that "an AI detector said it wasn't AI." We don't want to hear from an AI detector, we want to hear from you.
  • Asking them to prove that you used AI. No one can prove whether you used AI but yourself; if you want proof, provide it yourself.
  • Responding to the question with AI (reminder here about HATGPT).
  • Asking for the AI cleanup tag to be removed from the article before the cleanup is done. If you put AI-generated text into an article, then a template stating that the article contains AI-generated text is a true statement.
  • Lying (obviously)
  • (am I missing anything? I probably am)
If you didn't use AI: Say that you didn't use AI. There is no reason to dance around the truth here. Also, consider the possibility that you used AI without knowing it. Tools like Grammarly use the same language models as other AI and often produce the same problems as them. An increasing amount of software, including Microsoft Word, incorporates AI into certain features. Most of them are not especially transparent about this; this is not your fault.
Also, keep in mind:
  • If someone says that an article is AI, they are talking about the article, not criticizing you personally.
  • No one owns an article. This is a wiki and anything can be edited or deleted at any time, regardless of how much work you put into it or how much you want it to stay the same.
  • (am I missing anything?)
Gnomingstuff (talk)21:08, 10 December 2025 (UTC)[reply]
Since that keeps coming up in these discussions, your last section should probably include that questions about your AI use are not accusations of bad faith, but might reflect a competence issue andWP:LLMCIR.~ Argenti Aertheri(Chat?)22:25, 10 December 2025 (UTC)[reply]

Where do we go from here?


Sorry for fucking up the RfC. I am fairly certain it will end up as no-consensus after well-reasoned Oppose votes. I rushed into it, my first RfC, after feeling pretty good about my Version 3 posted above.

So, where do we go from here?

I have been avoiding commenting in the RfC so as not to prejudice the result, but have been thinking about the feedback and the clear split in the community.

Would an RfC with four clearly defined options be better?

Option 1: Status Quo


Retain the current text ofWP:NEWLLM.

  • Summary: Large language models should not be used to generate new Wikipedia articles from scratch.
  • Pros: Short, simple, current consensus.
  • Cons: Vague definition of "from scratch"; does not address LLM text added to existing articles; does not address talk page use.

Option 2: Prohibition on Unreviewed LLM Content


User:Qcne/LLMGuidelineOption2

  • Summary: Prohibits adding unreviewed LLM content; permits use only with rigorous verification.
  • Pros: Focuses on editor responsibility; permits some non-disruptive LLM assisted edits; hopefully resolves the contradictory language from the RfC.
  • Cons: Not acceptable to anti-LLM editors; does not define acceptable LLM use; does not have an LLM use disclosure section.

Option 3: Prohibition on LLM Content


User:Qcne/LLMGuidelineOption3

  • Summary: Prohibits LLM content.
  • Pros: Clear and enforceable; unambiguous.
  • Cons: Not acceptable to pro- or neutral-LLM editors; may be overly restrictive for constructive tools; enforcement relies on unreliable detection.

Option 4: Limited LLM Use with Disclosure


User:Qcne/LLMGuidelineOption4

  • Summary: Permits limited LLM assistance with mandatory disclosure; prohibits generation from scratch.
  • Pros: Promotes transparent LLM use; codifies some best practices
  • Cons: Not acceptable to anti-LLM editors; limited-use boundaries may be pushed.

I feel slightly at a loss on the appropriate next steps.qcne(talk)21:58, 10 December 2025 (UTC)[reply]

Differences in a table


This is an AI-generated comparison (see the details in the edit comment), but I am the one responsible for errors and omissions. Feel free to edit or object. --Викидим (talk)23:22, 10 December 2025 (UTC)[reply]

Extended content
Comparison of LLM Guideline Options
  • Title
    • Option 2: Prohibition on Unreviewed LLM Content
    • Option 3: Prohibition on LLM Content
    • Option 4: Limited LLM Use with Disclosure
  • Nutshell
    • Option 2: Focuses on prohibiting unreviewed content.
    • Option 3: Focuses on prohibiting content generation entirely.
    • Option 4: Focuses on prohibiting unreviewed content (same as Option 2).
  • Scope (identical across all three versions): Defines LLMs, applies to all models/outputs, and links to the main information page.
  • Primary restriction policy
    • Option 2: "Do not use an LLM to add unreviewed content". Permits LLM use only if thoroughly reviewed; defines "unreviewed" as output not checked line-by-line against reliable sources.
    • Option 3: "Do not use an LLM to add content". Strict prohibition; states that using an LLM to generate new articles, drafts, or expand existing articles is "not permitted".
    • Option 4: "Do not use an LLM to add unreviewed content" (same text as Option 2). Permits LLM use only if thoroughly reviewed.
  • Specific prohibitions (bulleted list)
    • Option 2: Prohibits pasting "raw or unreviewed" LLM output.
    • Option 3: Prohibits pasting "LLM output" (removes the "raw or unreviewed" qualifier).
    • Option 4: Prohibits pasting "raw or unreviewed" LLM output.
  • Limited use and disclosure
    • Options 2 and 3: Section not present.
    • Option 4: "Limited use" strongly discourages LLM use and suggests it only for narrow tasks (e.g. copyediting) by experienced editors; "Disclosure and responsibility" requires editors to disclose LLM assistance in the edit summary and reaffirms editor responsibility for content.
  • Handling existing content (identical across all three versions): Allows removal, replacement, tagging, or deletion (including speedy deletion G15) of problematic LLM content.
  • See also (identical across all three versions).

Discussion of the options suggested by Qcne


Quick question about "It does not cover spellcheckers, grammar checkers": are you aware that Grammarly is now Grammarly AI? People don't even realize they are using AI; they think it's just a normal spellchecker/grammar checker.Polygnotus (talk)22:11, 10 December 2025 (UTC)[reply]

I believe that our next proposal should still be based on Qcne's drafts. The proposal at the open RFC is roughly the right length and has a lot of support - I want to avoid ping ponging back and forth between totally different approaches. Many decisions and modifications to be made but let's stay on the Qcne path!NicheSports (talk)22:24, 10 December 2025 (UTC)[reply]

The discussion around SuperPianoMan's ideas directly above this isn't an attempt to write a guideline, but rather to reach some consensus as to what the final guideline should include. I'd encourage you to join in regardless of which option you prefer here, as "do we define LLM in the guideline or not" is policy-agnostic.~ Argenti Aertheri(Chat?)22:37, 10 December 2025 (UTC)[reply]

I tried striking my comments but I kept hitting an edit conflict as Polygnotus has already collapsed it. Thank you for that. There's too much nuance to make such a comparison.SuperPianoMan9167 (talk)22:26, 10 December 2025 (UTC)[reply]

@SuperPianoMan9167 Thanks! Yeah its complicated stuff.Polygnotus (talk)22:27, 10 December 2025 (UTC)[reply]

It may be a good idea to explain that LLMs cannot and do not differentiate between training data and input, and cannot summarize text.Polygnotus (talk)22:34, 10 December 2025 (UTC)[reply]

I agree that it may be a good idea to (briefly) explain how LLMs work in general, as that will demonstratewhy the output is not policy-compliant. The text at the start ofthis section of AISIGNS seems like a good foundation to modify.SuperPianoMan9167 (talk)22:37, 10 December 2025 (UTC)[reply]
Yeah it would be nice to have a place you can point people to when they don't understand why LLMs are a problem/which of their shortcomings can harm Wikipedia.Polygnotus (talk)22:44, 10 December 2025 (UTC)[reply]
Which is why I think a short guideline that doesn't explain itself is not the policy path we should be pursuing here. Otherwise, newbies will see the guideline and think that all Wikipedians are AI-haters. The current system, whereWP:LLM provides the actual background information, doesn't seem to be working as WP:LLM is liable to be dismissed as "just an essay".SuperPianoMan9167 (talk)22:55, 10 December 2025 (UTC)[reply]
I mean when wording this we should also be concerned about the flipside, newbies seeing the guideline and thinking that all Wikipedians are AI supporters. I don't know the proportions of those two groups, but given the recent wave of WMF advertising about how this is the human encyclopedia in the age of AI and that's why you should donate money...Gnomingstuff (talk)23:05, 10 December 2025 (UTC)[reply]

Proposals 2-4 share a lot of text. It would be nice to have the textual differences in table form here, so that the eye-straining "visual" diffs are not required. --Викидим (talk)22:48, 10 December 2025 (UTC)[reply]

 Done using AI.Викидим (talk)23:17, 10 December 2025 (UTC)[reply]
Per the edit summary it was Gemini, and it handled colspans properly. Maybe it's just that rowspans are already barely human readable? Could you please plop an example of Gemini doing a more complicated table on my talk page? This is the first "correct" AI-generated table I've seen and I'm curious how you did it, since I've seen so many bad ones.~ Argenti Aertheri(Chat?)23:27, 10 December 2025 (UTC)[reply]

You did not fuck up the RfC.Gnomingstuff (talk)23:01, 10 December 2025 (UTC)[reply]

I think the main problem with the proposed draft is that it needed a few folks with wikilawyering skills to find and fix all the loopholes and similar problems before it was proposed.
I suggest that we pause efforts to expand this guideline. There is nothing that we can do that will improve the behavior of editors in the next few months anyway. The next few months are worth thinking about, because January usually brings an uptick in new editors (or new UPEs, perhaps), and that probably means an uptick in AI-generated contributions. We can come back to it when it's clearer, and we can propose bits and pieces piecemeal.
In the meantime, ifWikipedia:Village pump (idea lab)#Wikipedia as a human-written encyclopedia reaches agreement, then we might be able to turn that agreement into a MediaWiki: message that discourages AI-generated content ("Wikipedia is written by humans. Please don't copy/paste content from AI or chatbots here"?). In the short term, that might be more protective than a guideline thatnobody reads.WhatamIdoing (talk)00:00, 11 December 2025 (UTC)[reply]
That "Wikipedia as a human-written encyclopedia" discussion is very very unlikely to lead to anything, people don't even agree on if that is a good claim to make and certainly not what to do with that idea (banner? policy? something else?).
we might be able to turn that agreement into a MediaWiki: message that discourages AI-generated content People who don't read guidelines are unlikely to read (or care about) such messages.Polygnotus (talk)00:14, 11 December 2025 (UTC)[reply]
I don't think that wikilegalese would help at all. Common-law-like ambiguity is useful: it allows the community to set up the boundaries later without going through any formal processes. This is especially important in a seminal situation like this, when rules that are too lax will cause humans to drown in a sea of slop, while rules that are too tight will cause us to lose to the grokipedias of the forthcoming scary, but IMHO inescapable, world.Викидим (talk)01:06, 11 December 2025 (UTC)[reply]
Yeah the normal procedure is to iron out the details as we go.Polygnotus (talk)01:21, 11 December 2025 (UTC)[reply]
This AI thing is new and big and scary-looking. So my right hand is stretched out for a handshake, while left hasbrass knuckles ready just in case.Викидим (talk)01:29, 11 December 2025 (UTC)[reply]
@Викидим Are you right or left handed?Polygnotus (talk)10:06, 11 December 2025 (UTC)[reply]
Good question. Physically, my left is stronger. Otherwise, I am a righty.Викидим (talk)10:29, 11 December 2025 (UTC)[reply]


I love option three, but could you make a minor change?

Editors should not use an LLM to generate content for Wikipedia
+
Editors should not use an LLM to generate content for Wikipedia, even if you edit the result

Also, is this an RFCBEFORE?Mikeycdiamond (talk)02:36, 11 December 2025 (UTC)[reply]

This is not going to work. It is akin to asking editors to write manuscripts by hand and only accepting the handwritten images, not typed text. This cat is out of the bag, and there is no way to convince people who have tried the AI to abandon the 10x improvement in performance on the most tedious and annoying tasks, like creating {{cite book}}s or tables. The fact is: on these tasks, the AI is not only faster, but it is better than many (definitely better than me). Pretty soon it will be better than most of us at actually writing the texts. We need to learn how to use it to our advantage, not prohibit it. So option 3 will not have any staying power. I personally prefer option 4 with its explicit endorsements of some use.Викидим (talk)08:20, 11 December 2025 (UTC)[reply]
After testing ChatGPT with citation templates, I agree with you that AI is good at menial tasks, but I disagree that AI will ever get better at writing than us. Nonetheless, option 4 has gotten popular support, so I would like to recommend some changes to it.
Check the output they intend to use against suitable reliable sources.
+
Check the output they intend to use against the suitable reliable sources that the AI cited, which they must cite as the source of the text. If the AI didn't cite reliable sources, get the AI to generate text with information from reliable sources.
Editors should disclose LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.
+
Editors must disclose LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.
Mikeycdiamond (talk)19:32, 11 December 2025 (UTC)[reply]
I like this in principle, although I'm afraid that your last addition to the first paragraph might be interpreted by some editors as "get AI to add sources to the text it wrote", which usually won't be great from a text-source integrity perspective (or, for that matter, from a writing perspective).ChaoticEnby (talk ·contribs)19:38, 11 December 2025 (UTC)[reply]
Is this better?
Check the output they intend to use against suitable reliable sources.
+
Check the output they intend to use against the suitable reliable sources that the AI cited, which they must cite as the source of the text. Use AIs that automatically cite sources, such as ChatGPT's search program, and ask the AI to only use reliable sources. Make sure to also manually check the reliability of the sources, either by yourself or through the [[Wikipedia:Reliable sources/Noticeboard|Reliable Source Noticeboard]]
Mikeycdiamond (talk)20:39, 11 December 2025 (UTC)[reply]
That works much better, thanks! Not sure if "by yourself" and "through RSN" are exclusive here – for many sources, there is a consensus at RSN, and we don't want editors to sidestep it by either going "looks trustworthy enough" or starting a new repetitive discussion. Maybe:
Check the output they intend to use against suitable reliable sources.
+
Check the output they intend to use against the suitable reliable sources that the AI cited, which they must cite as the source of the text. Use AIs that automatically cite sources, such as ChatGPT's search program, and ask the AI to only use reliable sources. Make sure to also manually check the reliability of the sources, and look for previous discussions of these sources at the [[Wikipedia:Reliable sources/Noticeboard|Reliable Source Noticeboard]].
ChaoticEnby (talk ·contribs)20:56, 11 December 2025 (UTC)[reply]
This is great! Should we implement it, or wait for more input?Mikeycdiamond (talk)21:03, 11 December 2025 (UTC)[reply]
LLMs are not good at being search engines, despite being able to cite sources. I think a better approach would be to recommend finding sources manually first and then including them in the prompt.SuperPianoMan9167 (talk)21:05, 11 December 2025 (UTC)[reply]
Well, the AI gets the source from somewhere, and it is better to get the sources it used than working backward.Mikeycdiamond (talk)21:10, 11 December 2025 (UTC)[reply]
I agree, Working backward makes it harder.SuperPianoMan9167 (talk)21:18, 11 December 2025 (UTC)[reply]
"This is much harder than doing it forward. Experienced editors would say at least 20 times as hard. Step 2 is of course the difficult bit." The essay you cited seems to disagree with you. Also, take a look at my revision of the edit.Mikeycdiamond (talk)21:21, 11 December 2025 (UTC)[reply]
Oops, I fell victim toWP:UPPERCASE again :) Sorry!SuperPianoMan9167 (talk)21:40, 11 December 2025 (UTC)[reply]
I think I was under the impression that "backwards" meant "sources first", which is not at all what the essay says.trout Self-troutSuperPianoMan9167 (talk)21:49, 11 December 2025 (UTC)[reply]
It is fine. Could you take a look at the revision to the edit I made in response to your criticism?Mikeycdiamond (talk)21:51, 11 December 2025 (UTC)[reply]
I just did. I was alsopolitely asked to be more thoughtful of how often I comment on LLM PAG discussions, so I might not reply here for a bit. (I bludgeoned Qcne's RfC and I apologize.)SuperPianoMan9167 (talk)21:56, 11 December 2025 (UTC)[reply]
How about this?
Check the output they intend to use against suitable reliable sources.
+
Check the output they intend to use against the suitable reliable sources that the AI cited, which they must cite as the source of the text. They should find sources beforehand and prompt the AI to create an article from those sources. Make sure to also manually check the reliability of the sources and look for previous discussions of these sources at the [[Wikipedia:Reliable sources/Noticeboard|Reliable Source Noticeboard]].
Mikeycdiamond (talk)21:16, 11 December 2025 (UTC)[reply]
Yes, I think this is an improvement.SuperPianoMan9167 (talk)21:52, 11 December 2025 (UTC)[reply]
Ok, should I implement the edits? I don't have much experience with RFCBEFOREs, or RFCs in general.Mikeycdiamond (talk)21:58, 11 December 2025 (UTC)[reply]
IMHO, the core of our activity is WP:N and WP:UNDUE, which require human understanding. Therefore we should not allow AI to find sources as it goes. Whatever we do, the guidelines should reflect that the sources are always selected by a human. I see the search for sources and their use as two distinct activities that require a human firewall in between.Викидим (talk)21:15, 11 December 2025 (UTC)[reply]
Is any of that what you meant? Or were you referring to our talk page discussion that resulted in wording along the lines of automation of filling in templates? If so, here's that discussion:User talk:Argenti Aertheri#Gemini 3 and tables, and an example from my own editing:
<ref>Animal Bones, Carcasses Found At Closed SchoolArchived 2012-03-23 at theWayback Machine -Dozens Of Surviving Animals Rescued By Upstate Group, WYFF4, September 10, 2010</ref>
<ref>"Animal Bones, Carcasses Found At Closed School".WYFF4. September 10, 2010. Archived fromthe original on 2012-03-23. Retrieved2025-07-27.</ref>
I already had all the information for the citation, just in plain text, all ChatGPT did was "put it in this template".~ Argenti Aertheri(Chat?)22:40, 11 December 2025 (UTC)[reply]
Anything that happens before an RfC is part of the RFCBEFORE. RFCBEFORE is not a requirement to point at one discussion and say "this satisfies RFCBEFORE!"SuperPianoMan9167 (talk)15:31, 11 December 2025 (UTC)[reply]
so since this discussion seems to have died, can we say that we have done our best to have an RFCBEFORE now, we tried, an attempt was made, the box was ticked, whatever people want from usGnomingstuff (talk)03:14, 27 December 2025 (UTC)[reply]
RFCBEFORE isn't some checkbox or hoop to jump through. ... There is no specific proposal here, so, if and when somebody drafts an RFC question or RFC proposal, they should still wait days after posting the proposed RFC question/proposal here for feedback, before launching the RFC. There is an RFC going on right now and it looks like it will reach consensus. If someone thinks another RFC is needed after this current one, they should still post the draft RFC question and wait days for input before launching it. It's not really a hard thing to do, and it's not hard to understand that this is what (some) people want.Levivich (talk)04:06, 27 December 2025 (UTC)[reply]

I've said it before and I'm going to say it again since you asked again: the best next step is to figure out what the consensus is regarding LLM use before trying to document that consensus in a guideline. I would oppose Option 4 because of the line "Editors are strongly discouraged from using LLMs," among other lines. You might save yourself a lot of time by first asking the community if it agrees that editors should be strongly discouraged from using LLMs.Levivich (talk)03:47, 11 December 2025 (UTC)[reply]

Well, regarding your last sentence, I believe that is exactly what this RfC would be for? Usually, guidelines are established through a consensus-making process, instead of having a separate undocumented consensus-making and then writing a guideline out of it. For that matter, I broadly agree with the text of Option 4.ChaoticEnby (talk ·contribs)11:07, 11 December 2025 (UTC)[reply]
That's not accurate, CE (the part about "usually..."). Look into the history of how various guidelines/policies are written. Look at howWP:RECALL was developed, it came out ofWP:RFA2024. We didn't start with a fully drafted guideline and adopt it, or even a choice among several fully drafted guidelines. It started with first figuring out what the consensus was, in multiple phases of RFCs, and only after the consensus was determined was theWP:RECALL page written. Same withWP:AELECT, which was also only written after RFA2024. The problem is in trying to write the guideline first, and then figuring out what the community wants after. It's a backwards approach. In fact, I can't think of a time that somebody drafted an entirely new guideline, and it was adopted. Can you? (The current RfC, if it passes, might be the first.)Levivich (talk)14:31, 11 December 2025 (UTC)[reply]
The "entirely new" part is where I object (althoughWP:NEWLLM might fall under this). I'm not saying that this is always a one-round voting on an existing guideline proposal, and we do have tools likeWP:RFCBEFORE to workshop these guidelines from early drafts. The thing is, that is exactly what is being done right now, so saying that we should first ask the communityas a prerequisite to the RFCBEFORE isn't helpful.ChaoticEnby (talk ·contribs)14:35, 11 December 2025 (UTC)[reply]
Not as a prerequisite to an RFCBEFORE. I'm not sure if you don't understand what I'm saying or what, but it's about what the RfC question is. You can have RfC questions that ask "what should the rule be," or you can have RfC questions that propose written text and ask to adopt that text. The former should be done first, then the latter. Often, the latter (the written text) doesn't need an RfC at all.Levivich (talk)14:39, 11 December 2025 (UTC)[reply]
WP:RECALL andWP:AELECT were both multi-part RfCs to establish entirely new processes. Shorter guidelines, such asWP:NEWLLM, have been proposed directly, and asking for an RfC on everyone's opinion before a second RfC on the specific wording (which will be debated in the case of AI policy) mostly adds bureaucracy. I'm not opposed to making this a part-by-part RfC (where each aspect of the guideline is voted on separately instead of giving the voters a single choice of full policies), but we shouldn't dismiss it just on the basis that we have well-defined options for each.
In fact, there have been many cases (likeWikipedia:Requests for comment/2024 Wikipedia blackout) where an RfC was opposed specifically because it asked a broad question and didn't present a full, detailed proposal.ChaoticEnby (talk ·contribs)14:55, 11 December 2025 (UTC)[reply]
Ok, well, hey, maybe the current RfC will end up supporting the proposed guideline and that'll be the end of the issue. Or maybe the next fully-written guideline (or one of several proposed guidelines) will gain consensus.
Your diagnosis, though, is incorrect. If it's true that "Shorter guidelines, such as WP:NEWLLM, have been proposed directly," then can you name one, besides NEWLLM? I can't think of any.
Both RECALL and AELECT came out of the same RFC. It wasn't, as you wrote, an RfC that proposed new processes--you got that backwards. The RfC asked what the consensus was, both as to what "the problem" was and what proposed solutions to those problems were. What came out of that was some new processes as well as reforms to existing processes. Various solutions were tried, some stuck and others (like the three day discussion period) were abandoned after trials. The whole point is that RFA2024 asked what the problem was before proposing any solutions. RFA2024 was more extensive than I think an LLM RFC needs to be, but the lesson is to first figure out what the community thinks, and only second to try and craft solutions to those problems (and only third, to try and document it).
FWIW, I don't think your blackout proposal failed because it asked a broad question and didn't present a full, detailed proposal. It didn't ask a broad question, it proposed a specific course of action, for which there was not consensus. I think it failed because of the same issue I'm raising here: proposing a solution without first figuring out what "the problem" is, in the eyes of the community.
But you don't have to listen to me, maybe your approach will work this time, or the next time.Levivich (talk)15:19, 11 December 2025 (UTC)[reply]

Your diagnosis, though, is incorrect. If it's true that "Shorter guidelines, such as WP:NEWLLM, have been proposed directly," then can you name one, besides NEWLLM? I can't think of any.

Yes, many. Just this year alone, I can think ofWikipedia talk:Please do not bite the newcomers#RfC: Rewriting specific sections (a full guideline rewrite) orWikipedia talk:Speedy deletion/Archive 93#RfC: Replacing U5 with a primarily procedural mechanism (where editors !voted on fully written speedy deletion criteria).

Both RECALL and AELECT came out of the same RFC. It wasn't, as you wrote, an RfC that proposed new processes--you got that backwards.

WP:RECALL had a standalone, point-by-point RfC that came after a broader RfC that proposed new processes, of which RECALL was a successful one. A big criticism of how RECALL was handled was exactly that long, multi-part RfC process, and it is not a model to follow in all future RfCs.

FWIW, I don't think your blackout proposal failed because it asked a broad question and didn't present a full, detailed proposal. It didn't ask a broad question, it proposed a specific course of action, for which there was not consensus.

Much of the early opposition (and of the discussion around the proposal) centered around the fact that implementation details weren't provided at first. The #Specifics section was only added later to account for that feedback.ChaoticEnby (talk ·contribs)15:36, 11 December 2025 (UTC)[reply]
If there is an existing best practice that editors commonly follow, then I have seen guidance written that codifies it which gains community consensus approval. I'll agree though that for matters with contradictory opinions from significant numbers of editors among those who like to discuss those matters, it's hard to craft a detailed guideline proposal that will gain enough support to be approved. The inescapable reality is that although many people want to minimize the time they spent considering a matter and so prefer to see one RfC with a proposal, the difficulty in getting such a proposal approved means that it may be better to have multiple phases. Determine broad parameters for guidance in an initial phase, and get a proposal approved in a later phase. (This is a pretty common approach with organizations who need to consult with a broad population.) Unfortunately, the shifting nature of who participates in any given discussion means it can take a while, possibly with resets happening as different concerns come to the forefront.isaacl (talk)15:54, 11 December 2025 (UTC)[reply]
undocumented consensus-making is the consensus-making process.SuperPianoMan9167 (talk)14:33, 11 December 2025 (UTC)[reply]
Yes, but that's what we're doing here as an RFCBEFORE. There's no point in asking for a "before" to the RFCBEFORE.ChaoticEnby (talk ·contribs)14:37, 11 December 2025 (UTC)[reply]
Asking what the problem is is still part of the RFCBEFORE. RFCBEFORE is not just one discussion that you point to and say "that's our RFCBEFORE!"WP:RFCBEFORE actually says "If you can reach a consensus or have your questions answered through discussion, then there is no need to start an RfC." Anything that happens before the RfC is part of the RFCBEFORE.SuperPianoMan9167 (talk)15:29, 11 December 2025 (UTC)[reply]
Does every fucking discussion on-Wiki devolve into arguments over semanticsqcne(talk)15:31, 11 December 2025 (UTC)[reply]
YesSuperPianoMan9167 (talk)15:33, 11 December 2025 (UTC)[reply]
Agreed that these meta discussions about an rfcbefore are largely unproductive, and could become disruptive. If they need to happen, can they happen on a different thread? We should focus on Qcne's proposals here.NicheSports (talk)15:36, 11 December 2025 (UTC)[reply]
I agree that arguing about labels isn't helpful. As has been pointed out by others,Wikipedia:Requests for comment § Before starting the process is actually about alternatives to having a request for comments discussion. The actual concern is more about how much work has gone into developing an RfC question that is likely to result in progress moving forward. So... no matter what it's called, let's just continue the development work.isaacl (talk)16:00, 11 December 2025 (UTC)[reply]
Option 4 is just as contradictory as the current proposal at the RFC :( is there any chance you could create a "CE version" of option 4 that tries to resolve those contradictions? I would love to see what your preferred policy is.NicheSports (talk)14:56, 11 December 2025 (UTC)[reply]
@NicheSports What do you find contradictory about it? I've tried hard to resolve the contradiction.qcne(talk)14:57, 11 December 2025 (UTC)[reply]
Sure thing Qcne!
  • The summary says: Permits limited LLM assistance with mandatory disclosure; prohibits generation from scratch. What is in the scope of "limited LLM assistance"? Does "from scratch" mean "regardless of subsequent review"? Or does it still allow reviewed LLM-generated content?
  • But then the section header in the actual proposed guideline is still titledDo not use an LLM to add unreviewed content. Ok so I guess reviewed content is still fine. But what about the limits mentioned in the summary?
  • And the bulk of the guideline is identical to Option 2 (which permits reviewed LLM-generated content).Editors should not use an LLM to generate content for Wikipedia unless they have thoroughly reviewed and verified the output...Editors should not: Paste raw or unreviewed LLM output as a new article or as a draft intended to become an article... (etc.) Ok no discussion of limits here either, this section is basically identical to Option 2.
  • But then this section is added (not in Option 2).Editors are strongly discouraged from using LLMs. LLMs, if used at all, should assist with narrow, well-understood tasks such as copyediting. How do I square this with the previous section? This seems more consistent with the "limited LLM assistance" language from the summary at least!
Frankly this might be more contradictory than the current proposal, to the extent that I don't know what you are actually suggesting here. It is possible this could be my preferred guideline, if it bans the use of LLMs to generate text for new articles or significant article expansions/rewrites (regardless of subsequent human review), but permits the use of LLMs, with review, on a limited basis for copyediting small sections of prose (a few sentences here or there).NicheSports (talk)15:18, 11 December 2025 (UTC)[reply]
Does "from scratch" mean "regardless of subsquent review"? No. "From scratch" means that reviewed content is okay. Proposal 4 is the same thing as proposal 2 except with a disclosure requirement. It bans the unreviewed use of LLMs to generate text for new articles or article expansions but allows rigorously reviewed content. I think the contradiction is resolved. It makes perfect sense to say both "do not add LLM content unless it is reviewed" and "despite that, LLM use is strongly discouraged, especially for newer editors".SuperPianoMan9167 (talk)15:26, 11 December 2025 (UTC)[reply]
I almost think the correct question right now to ask is:
What LLM use is tolerated BESIDE the actual addition of text to a Wikipedia article?
We know the actual addition of text to the article space is the main sticking point for a wide variety of reasons, which will be the real bear to tackle long-term.
So define what if anything IS tolerated first, encode that as custom, and then you've got the first half of a "policy". —Very Polite Person (talk/contribs)22:49, 11 December 2025 (UTC)[reply]

Option 2 seems best, but as others have highlighted, I would ideally also want a disclosure clause. While I have been somewhat stringent against LLM usage as I've found it, I am not against it completely. As others have suggested, really positive example cases are needed for guidance on usage, and this is something I could potentially help with: I have used locally run LLMs in some writing exercises outside of Wikipedia to see what benefits or pitfalls there may be, in topics where I am able to discern where it gets the information not obviously wrong, but wrong nonetheless. --Cdjp1 (talk)18:16, 11 December 2025 (UTC)[reply]

I prefer something similar to Option 4 with even stricter disclosure requirements. Any text that was generated and/or substantively edited by AI (rule of thumb: if words are replaced with other words that mean other things, so typos and punctuation don't count nor does markup), whether reviewed or not, must be disclosed not just in the edit summary but on the article itself, in a prominent place where readers will see it. (The equivalent of our current cleanup tag, except that the disclaimer remains as long as there is AI-generated text present.) This is consistent with what we do when we incorporate text from elsewhere (e.g. Catholic Encyclopedia as mentioned above). How that disclaimer looks and what it says can be discussed -- it does not necessarily have to resemble a cleanup tag or talk about cleanup -- as long as it is not hidden away in some dark pattern dungeon. (If it were entirely up to me, I would even go so far as to require any AI-generated text to be highlighted in a different color.)

I find this more realistic; if we can't stop people from using AI, we can at least let readers know where their articles came from. Does this mean that we will accumulate a great deal of these disclaimers? Yes, them's the breaks. The text would be there either way. And if AI-generated text is there, readers deserve to know about that, especially if those readers are on Wikipedia expecting to read non-AI-generated text, which many are.Gnomingstuff (talk)19:29, 11 December 2025 (UTC)[reply]

I would support this in theory, although the one worry I do have is that this might discourage editors from being transparent about their use of AI. However, that isn't a deal-breaker, and we shouldn't be writing our policies from a standpoint of appeasing potentially disruptive editors.ChaoticEnby (talk ·contribs)19:41, 11 December 2025 (UTC)[reply]
I mean only a handful of people are proactively transparent about their use of AI right now, so there's nowhere to go but upGnomingstuff (talk)19:53, 11 December 2025 (UTC)[reply]
When would such a tag be removed?SuperPianoMan9167 (talk)20:02, 11 December 2025 (UTC)[reply]
It wouldn't (unless the AI-generated text is removed). Basically, it would be the equivalent of the disclaimers in research studies and news articles.Gnomingstuff (talk)22:03, 11 December 2025 (UTC)[reply]

Do all the prohibitions specifically come down to what gets "into the edit box"?

Because we literally can't tell people not to use it, or police use of it, for things like grammar, searching, dissecting documents, and other actions. It is already culturally ingrained in too many workplaces that work with computers daily. Many places now require it for some things. More of the tools on our computers come pre-loaded with the stuff. Grammarly was mentioned above here.

It would be impossible to police anyone before it gets to the edit box. Is that the intention, to dissuade people from using it at all, or just for what actually gets saved as a revision?

The former is impossible (and beyond any presumed authority or right we have to police). The latter is not. —Very Polite Person (talk/contribs)22:44, 11 December 2025 (UTC)[reply]

I want to add some suggestions to the "unreviewed" section:

  • LLMs are prone to hallucinating facts and citing non-existent sources. Not wrong but a little obsolete (this happens less often with newer AI), and can lead people to think that if the URL isn't broken then everything's fine and their review is done.
  • It should mention how newer AI tends to not outright hallucinate as much, but instead generate its own interpretations then claim a source said that, which is WP:OR (if AI slop can be called "research"). Just went through some examples at Talk:Us_(Gracie_Abrams_song)#AI
  • It should mention close paraphrasing, as AI is prone to do this. I see this a lot in music articles -- if a reviewer says "heavy pulsating beats" or "smooth honeyed harmonies" or whatever, AI will often just regurgitate that in wikivoice. This is WP:CLOP bordering on plagiarism.Gnomingstuff (talk)15:47, 12 December 2025 (UTC)[reply]

I'm an Option 3 guy because by far the majority of LLM additions come from people who are too rushed or incompetent to determine whether their AI-generated text is a good summary of the best available literature and an accurate representation of the topic. If they had the time or competence to research the literature and write a balanced summary, they would not be using LLM tools in the first place. Let's not fool ourselves about catering to the notional expert user who is skilled enough and has enough time to review the AI-generated text with regard to all of the existing literature; these people are few and far between. If these rare people DO produce LLM-assisted text that is indistinguishable from the measured and balanced assessment of a veteran topic scholar then I have no problem with it. But we should be writing guidance for the masses who are rushed and incompetent.Binksternet (talk)23:40, 16 December 2025 (UTC)[reply]

Option 3: any text that incorporates LLM output, even for grammar, will be noticeable and will harm the reputation of Wikipedia. As such, no text generated for Wikipedia should incorporate LLMs. Also, is this supposed to be an RfC or something?Wikieditor662 (talk)17:22, 17 December 2025 (UTC)[reply]

Not every discussion with bold !votes is an RfC.SuperPianoMan9167 (talk)17:52, 17 December 2025 (UTC)[reply]
No, I was just trying to gather further thoughts as my actual RfC got a bunch of criticism for contradictory language/not going far enough/going too far.qcne(talk)17:58, 17 December 2025 (UTC)[reply]
Well then, since so many people are participating, is there consensus for option 3? If so, can we add it to the guideline?Wikieditor662 (talk)19:48, 17 December 2025 (UTC)[reply]
Definitely not consensus (although I personally support it). Also it looks like the existing RfC might actually pass. We need to wait and see.NicheSports (talk)19:51, 17 December 2025 (UTC)[reply]
Also it looks like the existing RfC might actually pass. There's an existing RfC? Where can I find it?Wikieditor662 (talk)19:54, 17 December 2025 (UTC)[reply]
@Wikieditor662 Wikipedia:Village pump (policy)#RfC: Replace text of Wikipedia:Writing articles with large language modelsqcne(talk)20:16, 17 December 2025 (UTC)[reply]
In that RfC it seems you don't even have the option to !vote for anything like option 3 here.Wikieditor662 (talk)20:27, 17 December 2025 (UTC)[reply]
Indeed. I made this post *after* that RfC.qcne(talk)20:30, 17 December 2025 (UTC)[reply]
Well that's not good, is there a way to update that RfC or do something else that incorporates this? Or perhaps we could wait until the RfC is over, and if the essay is added then we could have a new RfC about whether to incorporate option 3?Wikieditor662 (talk)20:35, 17 December 2025 (UTC)[reply]
I was going to wait until the RfC was over, then re-assess.qcne(talk)20:36, 17 December 2025 (UTC)[reply]
A question for your RfC. Since the proposed text expands LLM use guidelines beyond writing articles, "Writing articles with large language models" is an unsuitable title. What title would work?SuperPianoMan9167 (talk)20:39, 17 December 2025 (UTC)[reply]
Option 3 bans all use of LLMs. That idea has received, and will continue to receive, pushback.SuperPianoMan9167 (talk)20:37, 17 December 2025 (UTC)[reply]
I think it's more about all use of LLMs inside of articles. Not that you can't use it outside of the article, for example for help with finding rules (as long as you don't cite it).Wikieditor662 (talk)20:41, 17 December 2025 (UTC)[reply]
I'd favor wording such as GPTs (and other LLMs) are fundamentally incapable of writing in wp:wikivoice, and thus must not be used to add prose. WP:KISS?~ Argenti Aertheri(Chat?)02:51, 18 December 2025 (UTC)[reply]
It's not bad, but unfortunately we're in the middle of a large RfC so these suggestions can't work right now. We have to wait for the RfC to be over, and if the essay is added, then we can suggest it.Wikieditor662 (talk)03:06, 18 December 2025 (UTC)[reply]

The guideline has in my opinion several qualities but also some remaining critical flaws. For example, the "Scope" section is pretty good and not overly long, and the proposed guideline indicates actions that may be taken. But I will focus here on what I believe is problematic. A major point of contention stems from the fact that there is a broad spectrum of LLM uses. Some are minor, others not. The main issue most of us have in mind is adding large blocks of AI-generated content, which most of us agree is a serious problem, especially when not properly reviewed. But LLM usage can also be minor/benign. For example, searching for a source or fixing grammar issues. These are use cases that we may not have in mind when making guidelines because they are rarely a problem that attracts attention, but banning it would likely be harmful. We need a good way to differentiate between these and to be more selective about what we don't want.

I think the option 4 of forcing disclosure in edit summaries is problematic for several reasons. First, it would make the edit summaries significantly more bloated. Secondly, it seems indiscriminate, and would force editors to disclose LLM usage in the edit summary even if it just resulted in fixing a typo. But fixing typos is a factual improvement for which it doesn't really matter whether an LLM was involved. Thirdly, the incentives are not aligned. It most often wouldn't be in a user's best interest to disclose LLM usage. Personally, I often use LLMs to double-check what I change, verify that it's grammatical and accurate, and get a second opinion on whether it's an improvement or not. Having that feedback is really useful in refining my modifications and filtering out bad edits. If I had to indicate any LLM contribution in the edit summary, I might just not use LLMs for verification, and that would degrade the quality of the edits without benefit. Furthermore, many editors may not be (or would potentially just pretend not to be) aware of this rule. I also oppose the idea of limiting LLM usage to experienced editors, because LLMs are actually more helpful to inexperienced editors, who are unfamiliar with the norms, the wiki syntax and the writing style, and who need some kind of tutoring at the beginning, which LLMs can provide. To make option 4 more practical, I would suggest either requiring disclosure only when there is a significant amount of LLM-generated content (e.g. a paragraph or more), or allowing people to disclose LLM usage once and for all on their user page.

Some of these issues also apply to option 3. Option 3 also says "Do not use an LLM to add content", but something like "Do not add LLM-generated content" or "Do not add content mostly generated by LLMs" would be better, because there are various peripheral uses of LLMs that don't involve copy-pasting content into Wikipedia.

Option 2 looks more reasonable, but has some remaining issues that make it impractical. For example it says that users should check line by line the generated content for copyright problems, which almost no one is going to do, because it's unclear how to efficiently do it. The "at least as rigorously as if the editor had written the text themselves" is vague and unactionable, and verification is already better covered by the next sentence which looks good: "You must verify every substantive claim against the cited sources." Cambalachero also made relevant comments in Wikipedia:Village pump (policy)/Replace NEWLLM that need to be addressed. I suggest also adding something like "despite being warned not to do so" to "Repeatedly making problematic LLM-assisted edits". I think there should be at least a warning before considering an indefinite block for that. It's also not needed to assume incompetence here: "may be treated as a competence issue". The bullet list about not pasting raw LLM output can probably be refactored to a single sentence, for example: "In general, editors should not paste raw or unreviewed LLM output to articles or talk pages." I suggest adding "In general" here because some cases of copying-pasting raw but reviewed LLM output can be ok, such as copying a translated sentence or categories proposed by an LLM for an article.Alenoach (talk)11:37, 4 January 2026 (UTC)[reply]

What LLM use is tolerated BESIDE the actual addition of text to a Wikipedia article?


We know the actual addition of text to the article space is the main sticking point for a wide variety of reasons, which will be the real bear to tackle long-term. Remember too that non-technically savvy people may be inadvertently using LLMs without even knowing it.

A couple of what-ifs and examples:

  1. I sometimes use Google Translate to, unsurprisingly, translate. Is that AI/LLM? If it is, can I use such translations at all on-wiki? If someone whose first language isn't English uses Google Translate to sound coherent here, is that OK? What if they use GPT for the same ends?
  2. What if MS 365 turns Word's spell and grammar check into AI/LLM instead of dictionary/file/code type backing? Do I need to now edit in Notepad instead? What if I have no idea what Word is doing under the hood?
  3. What if I set a tool like this loose to find me every web page that mentions a certain text string published between the years 2005-2018, as a reference source? Can I not use the links if an AI tool found them versus my hand Googling them down?
  4. What if someone wants to check if their article is MOS compliant, but isn't familiar with them all, and asks a GPT tool? The tool says something like, "These lines are problems for these reasons," gives you MOS pages, and suggested language edits. There's only so many permutations for certain language problems that both make sense, fit MOS, and don't suck. Is it ok for them to audit their content like that to learn how to fix it?
  5. What if someone stuffs a PDF of a book into one of these to find references to XYZ?

The above examples are all "before saving a page version".

So define what if anything IS tolerated first, encode that as custom, and then you've got the first half of a "policy".

For the addition of text: that's the easiest example and also the hardest nut to crack. Put it aside for now. What IS tolerated? Define what reasonable people, or the majority, do not object to or are willing to not care about. A total hard "no way" prohibition is going to be impossible to police, especially as integration of these tools becomes more endemic. —Very Polite Person (talk/contribs)22:53, 11 December 2025 (UTC)[reply]

Regarding #1, the consensus is explicitly against using unedited machine translations. #2 doesn't have any regulations against it as far as I know, and I don't think any reasonable person would object to AI-assisted spellchecking. However, if it starts changing word choices or sentence structure (similar to your "suggested language edits" in #4), that can be more problematic. LLMs usually have a poor grasp of MOS (and especially of NPOV/editorializing/weasel words), and asking one to "copyedit for neutrality" usually makes things worse, while giving the user the impression that they helped.ChaoticEnby (talk ·contribs)23:18, 11 December 2025 (UTC)[reply]
Right, that's the sort of answer I was talking about! Every time I read one of these (no offense to anyone in particular), I often get "harumph" vibes as opposed to, and I hate the business terminology, "actionable things".
This makes perfect sense: Wikipedia consensus is that an unedited machine translation, left as a Wikipedia article, is worse than nothing, especially with the connective tissue there on why, with the explanations. That's basically a section/passage we can airlift straight into an LLM policy.
For your points on #2, again, I agree. But can we ever actually police that? What's the line between man and machine there? I just wrote free hand: "Very Polite Person is a Wikipedia editor who is noted for sometimes being terribly pedantic." I then asked my GPT five times in "temp chats" (to not pollute with any data about me; sandboxes it; normally used for sports stuff/recipes more than anything): I just wrote that free hand (I am that person). I need an experiment -- without changing that message at ALL, but only terminology and structure, how many unique ways do you think you can reword that in English? Don't actually show them. Just give me an integer number. Best effort. If you can clear 100+, estimate percentage likelihood of that. Every way I ask, it says it can get that passage to 100+ variants with 99% likelihood. I honestly wasn't expecting numbers or odds that high just now.
I honestly can't think of how to police that. Any combo a person can come up with, an AI can produce more easily. If some editor is writing that about a BLP or something, and changes "terribly" to "famously" and the sourcing fits, we'd never be capable of knowing. —Very Polite Person (talk/contribs)23:31, 11 December 2025 (UTC)[reply]
Any combo a person can come up with, an AI can produce more easily. That's assuming a lot of current AI models. Yes, they could theoretically produce any possible sentence, but there are clear known patterns that make them more problematic most of the time.ChaoticEnby (talk ·contribs)23:39, 11 December 2025 (UTC)[reply]
Yeah, like shark eyes, dead. There's rarely creative spark. We're supposed to write dry though, and half of these things are trained on us. That's why it's so hard to tell unless people are stupid and just copy/paste walls of AI into here.
But am I making sense? Any policy that isn't a hard unenforceable "not one single letter or character from GPT et al is allowed" must inherently be somewhere on the spectrum of what can be policed in a realistic context. Not what we may prefer. What we can do.
All these that I've read are variants of "How I want us to handle LLM usage," but no one's asking the actually important first question: "What can we actually police of LLM usage?" —Very Polite Person (talk/contribs)23:43, 11 December 2025 (UTC)[reply]
In general I feel like your focus is too much on "will someone punish me for this" when it should be "is this text problematic."
I sometimes use Google Translate to, unsurprisingly, translate. Is that AI/LLM? If it is, can I use such translations at all on-wiki? If someone whose first language isn't English uses Google Translate to sound coherent here, is that OK? What if they use GPT for the same ends?
As far as I know, Google has not yet shoved Gemini into Google Translate; Google Translate does not use generative AI as it is currently understood. In my experience it tends to err on the opposite side, being too awkwardly literal a translation rather than turning meaning into slop.
What if MS 365 turns Word's spell and grammar check into AI/LLM instead of dictionary/file/code type backing? Do I need to now edit in Notepad instead? What if I have no idea what Word is doing under the hood?
It already has. (So has Notepad.) Again, if you have no idea Word is doing this under the hood, that doesn't mean you're editing in bad faith, but it does mean that the content you are adding is problematic.
What if I set a tool like this loose to find me every web page that mentions a certain text string published between the years 2005-2018, as a reference source? Can I not use the links if an AI tool found them versus my hand Googling them down?
If you have to, although AI is not great at distinguishing reliable sources and may contain gaps/biases in what it surfaces. Humans can obviously do this too, but they at least sometimes have a framework in their brain to tell the difference. AI does not.
What if someone wants to check if their article is MOS compliant, but isn't familiar with them all, and asks a GPT tool? The tool says something like, "These lines are problems for these reasons," gives you MOS pages, and suggested language edits. There's only so many permutations for certain language problems that both make sense, fit MOS, and don't suck. Is it ok for them to audit their content like that to learn how to fix it?
They should cut out the middleman and RTFM -- read the MOS. AI's suggestions are frequently wrong (introducing promotional tone while saying it's making it neutral), superficial (pointlessly twiddling words), or impossible (pointing out gaps in an article when the gaps are there because that information simply does not exist in reliable sources).
What if someone stuffs a PDF of a book into one of these to find references to XYZ?
I think this is OK as long as it's used as a tool to go back and check the actual source, not as the source of truth. (Full disclosure, I do this sometimes with AI-generated transcripts of videos that I'm fact-checking, to find out what timestamp I need to listen to.)Gnomingstuff (talk)23:51, 11 December 2025 (UTC)[reply]
They should cut out the middleman and RTFM -- read the MOS.
I agree, but the point I'm trying to make is we can't expect this. The real question is what we can know. We keep circling back to what we ideally or ideologically prefer, versus what is really achievable here. If every discussion always turns into variants of "don't do it, do it this way instead", we'll be having these discussions through 2030 with no progress, as these tools get ever more present. This is why I'm saying: proscriptions against it as a hard prohibition already don't work, so writing it more formally won't make it work any better. —Very Polite Person (talk/contribs)00:41, 12 December 2025 (UTC)[reply]
The risk of an inexperienced user asking an llm if an article meets MOS is that llms probably don't understand MOS, and will give a falsely confident answer. There's nothing to stop any user doing it, but at some point they will be misled.CMD (talk)01:37, 12 December 2025 (UTC)[reply]

What if someone wants to check if their article is MOS compliant, but isn't familiar with them all, and asks a GPT tool? The tool says something like, "These lines are problems for these reasons," gives you MOS pages, and suggested language edits. There's only so many permutations for certain language problems that both make sense, fit MOS, and don't suck. Is it ok for them to audit their content like that to learn how to fix it?

Checking for things like curly quotes is a simple regex, and for more complicated MOS fixes there are Wikipedia:User scripts/List. Including a words to watch highlighter: User:Danski454/w2wFinder. Editors have already written good, Wikipedia-specific programs for this shit, why reinvent the wheel as a many sided polygon when we've already got a circle?~ Argenti Aertheri(Chat?)00:32, 12 December 2025 (UTC)[reply]
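To make the regex point concrete, here is a minimal sketch in Python (rather than as an on-wiki user script; the pattern and function name are only illustrative and do not correspond to any existing tool) of the kind of purely mechanical check being described:

    import re

    # Typographic ("curly") quotes and apostrophes that MOS:CURLY asks editors
    # to replace with straight ones.
    CURLY = re.compile(r"[\u2018\u2019\u201C\u201D]")

    def find_curly_quotes(wikitext: str):
        """Return (line, column, character) for every curly quote in the text."""
        hits = []
        for lineno, line in enumerate(wikitext.splitlines(), start=1):
            for match in CURLY.finditer(line):
                hits.append((lineno, match.start() + 1, match.group()))
        return hits

    if __name__ == "__main__":
        sample = "She said \u201chello\u201d, and that\u2019s all."
        for line, col, ch in find_curly_quotes(sample):
            print(f"line {line}, col {col}: {ch!r}")

A deterministic check like this either finds the character or it doesn't, which is the point of preferring scripts over an LLM for this class of fix.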
why reinvent the wheel as a many sided polygon when we've already got a circle?
Like I said in my last reply, I agree, but writing how we want users to do things ever more formally won't really do anything. The real question is what can we do, with the tools and capabilities available to us? Unless we have a way to somehow police AI/LLM usage that happens BEFORE the user saves, we have no way to make a policy that has any actual way to act against that or detail how it should be done.
The outcome we want is the least important question (yet). The most important question is what can we actually detect. That informs everything after. Everyone just immediately transitions into variants of "nope" to AI, as if dislike or disfavor has actionable authority. It does not. —Very Polite Person (talk/contribs)00:44, 12 December 2025 (UTC)[reply]
The most important question is what can we actually detect. This keeps coming up and frankly I don't get it. It's virtually impossible to detect undisclosed paid editing or CoI; we still don't allow them.~ Argenti Aertheri(Chat?)02:04, 12 December 2025 (UTC)[reply]
Exactly, the COI thing is functionally the exact same thing. If the whole thing is just the "message" as with that policy, then that's reasonable. But with that, it's a binary. You are or you kinda aren't.
With AI, there's that entire spectrum of "GPT, give me an article that is about this species of bat for Wikipedia please" and suddenly making an awesome bat article, compared to people who use it as a fact-checker, research tool, grammar helper, or "does this sound like shit?" assistant.
If we say NO usage is allowed, we have to actually explain what that means: nothing; it's supposed to be what YOU by HAND found online, to your brain, to your fingers, to the text entry box -- if we take it to the far extreme of ideology, or of maximizing licensing safety and minimizing copyright risk. And we need to explain, "that includes this, and this, and all these tools and products like this", so unaware users aren't caught out.
Or, we say X is allowed if you Y, and then we need to spell it out the same. —Very Polite Person (talk/contribs)02:45, 12 December 2025 (UTC)[reply]
there's that entire spectrum with CoI too though. No one is really going to get upset if someone corrects a typo in the article about themselves, versus, idk, politicians erasing scandals.~ Argenti Aertheri(Chat?)02:59, 12 December 2025 (UTC)[reply]
Hit enter too soon on mobile, sorry. I agree we need to reach consensus on what exactly is and isn't allowed, but my opinion remains to favor built-in tools and user scripts written by fellow editors over what an LLM can piece together from its training.~ Argenti Aertheri(Chat?)03:04, 12 December 2025 (UTC)[reply]
I'm agnostic, because I believe it's going to be increasingly hard to catch as it improves. Even if the entire "industry" implodes, the technology won't go away, and like every tech ever, it'll get cheaper, easier, and dumber to run as years go by. Things rednecks do today would be deity hijinks an eon ago. We just need it to be VERY clear and particular, given the nuance. If it's all-out or graduated, we gotta explain EXACTLY what that means at a level anyone can understand on day zero of their Wikipedia experience. —Very Polite Person (talk/contribs)03:11, 12 December 2025 (UTC)[reply]
AI detectors are reasonably reliable in my experience, so a pattern of quasi-robotic changes should not be hard to spot. The approach should be more one of self-declaration, patterned on WP:UPE.Викидим (talk)02:05, 12 December 2025 (UTC)[reply]
As I've mentioned in previous discussions, the implementation of LLMs in other tools could be fine, if they are implemented in a smart way (specifying to the task, having guardrails to counter pitfalls in the specific task, etc.), but we unfortunately in most cases will not know the technicals behind any such implementation.
Hypothetical 1 caught my attention, as I have already seen issues in translation from Google's outfits using LLMs. In a YouTube video that advertised a new French documentary on the Cagots, I was unfortunately hit by having it auto-play with YouTube's new "autodubbing", where a transcript of the video is autotranslated and a voice generated for the translated text, both by LLMs. A rather prominent thing with regard to the Cagots is that they were a discriminated-against group. The autodubbing provided the sentence that the Cagots oppressed the people around them. This is patently false. When I set the video to French and listened to the presenter, she in fact said that the Cagots were oppressed by the people around them. So, in YouTube's LLM translation and dubbing, they turned an oppressed people into the oppressor, and this isn't even a case where it is a language that is very different from English; it was from French to English, two of the languages with the biggest corpuses available, and in which we have literal millennia of translation work. So, if the LLM translation can make such a fundamental mistake translating in a circumstance where it should be the easiest for it to get correct, I have little hope for its ability to do so between languages that are more different from English, or that have a very limited number of speakers and very limited corpuses. --Cdjp1 (talk)11:32, 17 December 2025 (UTC)[reply]
I assume you've seen this already but there was a recent MIT Technology Review article about LLM translations on languages with smaller corpora.Gnomingstuff (talk)18:49, 17 December 2025 (UTC)[reply]
I saw its headline when it was first published, and it can be summed up in two words for those who know: "Scots Wikipedia".
In a less vulgar manner than I would normally express the phrase, "bad data in, bad data out". And, for all the good we can espouse of something like English Wikipedia, there is a lot of bad data across the Wikipedias and they've been ingested. --Cdjp1 (talk)14:44, 18 December 2025 (UTC)[reply]
A salient and recent example of good LLM use besides the actual addition of text to a Wikipedia article: vibe coding. See the close at WP:Requests for comment/Recall check-in, and in particular, the LLM-assisted code at User:Dr vulpes/code/R/2025 RfC Recall check-in.R that generated helpful graphs. This would have been prevented by a rule prohibiting all LLM use. This is an example of the baby that I fear will be thrown out with the bathwater.Levivich (talk)04:38, 28 December 2025 (UTC)[reply]
I disagree with this point; allowing LLM-generated code would encourage a great deal of harm as well.
I would like to point to this incident for an example of how LLM-generated code can result in disaster: Wikipedia:Administrators'_noticeboard/IncidentArchive1211#TattooedLeprechaunMEN KISSING(she/they)T -C -Email me!04:50, 8 January 2026 (UTC)
The problem in this case isn't LLM use but the operation of an unauthorized bot.Catalk to me!04:01, 20 January 2026 (UTC)[reply]
Running an unauthorized bot is one thing. But the LLM errors in the code caused another layer of problems.MEN KISSING(she/they)T -C -Email me!06:09, 20 January 2026 (UTC)[reply]

Advice I give to new editors


Generally, here's what I advise new editors trying to use AI, and it's how I've used it occasionally. I thought it might be useful for this brief guideline to describe (briefly) ways that an AI can be used productively.

When used as a collaborator or assistant rather than as an author, an AI can be helpful with these tasks:

  • Collecting reliable sources about a topic that you might not otherwise find yourself. You can ask the AI to restrict its results to sources that are reliable, independent of the topic, and provide significant coverage.
  • Identifying salient points or common themes among the sources found.
  • Suggesting a rough outline of an article based only on what those sources say.
  • Correcting grammar and sentence structure, only after you have written the article in your own words without help from the AI.

This way, you're still doing the work as an editor, but the AI is your assistant, not the author.

This is analogous to how the justices on the US Supreme Court write their opinions: They have clerks who do research, gather case histories, provide summaries, and so on, but the justice ends up writing the opinion. ~Anachronist (who / me)(talk)19:38, 28 December 2025 (UTC)[reply]

Yes, for what it's worth, traditional search engines are getting increasingly worse at finding sources. LLMs are one way to cast a wider net, so long as you check the reliability of the sources they link.Katzrockso (talk)20:12, 28 December 2025 (UTC)[reply]
What I find useful is that an AI can search based on context, whereas a traditional search engine searches based on keywords. I do both, and the AI manages to find a couple of useful things that I couldn't. ~Anachronist (who / me)(talk)00:25, 29 December 2025 (UTC)[reply]

Seeking clarification regarding human-reviewed translations


I have started a discussion at the Village Pump (Policy) regarding how this guideline affects machine-assisted translations when they have gone through thorough human review. Please participate there.7804j (talk)14:12, 3 January 2026 (UTC)[reply]

Policy on creating articles


@Cremastra This is a guideline about one scenario of creating articles. However, the overall subject of creating articles is covered at policy level, at Wikipedia:Editing policy#Creating articles. So this guideline is a subtopic of that policy in scope, but for no particular reason it is at an inferior level, and, unhelpfully, it is not mentioned in it. I'm seeing room for improvement. The first thing that comes to my mind is ... —Alalch E.17:52, 5 February 2026 (UTC)

In what ways are LLMs useful?


This policy says that they can be useful tools, but I can't seem to think of a reason they would be useful for Wikipedia. It should either list useful reasons for use on Wikipedia, or the claim should be removed entirely. Of course they are not useful for creating articles, and that part should remain in the policy.Isla🏳️‍⚧23:25, 17 February 2026 (UTC)

One application I've found to be useful is to drop the PDF of a source I just read into the LLM with the prompt "please generate a Wikipedia-format citation for this document. If any bibliographic information is missing, search the web to find it, and provide me with links to where you found the missing information so I can verify it is correct". --LWGtalk(VOPOV)23:48, 17 February 2026 (UTC)[reply]
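For anyone who wants to script part of the verification half of that workflow, here is a minimal sketch (Python with the pypdf library; the file name is only a placeholder, and this is an assumption about one way to cross-check, not how LWG actually works): pull the PDF's embedded metadata so it can be compared against the title, author, and date the LLM put in the proposed citation.

    from pypdf import PdfReader

    def pdf_metadata(path: str) -> dict:
        """Read the bibliographic fields embedded in a PDF, if any are present."""
        reader = PdfReader(path)
        info = reader.metadata  # may be None if the PDF carries no metadata
        return {
            "title": info.title if info else None,
            "author": info.author if info else None,
            "creation_date": info.creation_date if info else None,
            "pages": len(reader.pages),
        }

    if __name__ == "__main__":
        # "source.pdf" is a placeholder; compare the printed fields against the
        # citation the LLM generated before adding it to an article.
        print(pdf_metadata("source.pdf"))

Embedded metadata is often incomplete or wrong, so this only narrows the manual check; it does not replace reading the source itself.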