Wikipedia:Village pump (proposals)/RfC LLMCOMM guideline

From Wikipedia, the free encyclopedia

RfC: Turning LLMCOMM into a guideline

initial discussion

Given that at this point the prohibition against using LLMs in user-to-user communication (WP:LLMCOMM) has become something of a norm, I think it would be sensible to make it an official guideline as part of the ongoing attempt to strengthen our LLM policy.

Rather than just promote the exact text of LLMCOMM, I've decided to try to create something which synthesises LLMCOMM, HATGPT and general advice about LLMs in user-to-user communication. My proposal as it currently stands is at User:Athanelar/Don't use LLMs to talk for you.

My proposed guideline would forbid editors from using LLMs to generate or modify any text to be used in user-to-user communications. Please take a look at it and let me know if there's anything that should be added or modified, and if you agree with the proposed restrictions. I'd love to workshop this a bit and get it to a stage where it can be RfCed. Athanelar (talk) 13:33, 7 January 2026 (UTC)[reply]

My thoughts:
  • Your proposal is much more strict than LLMCOMM (which is already enshrined in guidelines as WP:AITALK). It doesn't just synthesize LLMCOMM and HATGPT, which both allow exceptions for refining one's ideas; it goes beyond that and bans LLM use entirely for writing comments. This, combined with the "Editors should not use an LLM to add content to Wikipedia" phrasing of the proposed NEWLLM expansion, would effectively ban all use of LLMs anywhere on the English Wikipedia. This makes sense given your stated opinions on LLM policy, but I'm sure this is going to get significant opposition.
  • Your proposal also goes beyond commenting to basically say that LLMs are useless for any Wikipedia editing at all, as indicated by the section about copyediting. This seems out of place for a guideline that is supposed to be about using LLMs for comments. Again, this makes sense given your stated anti-LLM sentiments, but I have seen it repeatedly demonstrated that such a sentiment is far from universal.
  • I am concerned mainly because this guideline assumes bad faith from LLM-using editors. Most LLM-using editors are unaware of LLMs' limitations because of the massive hype surrounding them. My opinion is that instead of setting down harsh sanctions for LLM use, we should instead educate new users on why LLMs are bad and teach them to contribute to Wikipedia without them.
  • Finally, a lot of editors are just worn out at this point from having so many LLM policy discussions in such a short period of time. Can we at least wait until the NEWLLM expansion proposal is over?SuperPianoMan9167 (talk)14:23, 7 January 2026 (UTC)[reply]
I appreciate your feedback and your continued presence as a moderate force in these discussions.
  1. I recognise my proposal is quite extreme. My goal was to shoot for 'best case' and compromise from there as necessary.
  2. The subsection on copyediting exists to justify the restriction against using LLMs to refactor, modify, fix punctuation, etc., because at best the LLMs are unfit for this task anyway, and at worst it provides a get-out-of-jail-free card for bad-faith editors. The overall section is in fact expressly intended to demonstrate that LLMs simply are not any good at doing the things people might want them to do in discussions.
  3. I have tried to avoid that by pointing out that I believe the motivation to use LLMs in these cases comes from a good place (concerns about one's abilities)
  4. I understand; but I still have the passion and energy, and I hope others do too. We are in something of a race against the clock here; every month we wait before strengthening our policies is another month of steadily being invaded by this type of content.
Athanelar (talk)14:43, 7 January 2026 (UTC)[reply]
Some quick notes:
  • Missing the biggest reason not to use LLMs for your comments: it will make people more likely to dismiss your comments, not less.
  • As usual I think we need to specifically name what tools we are talking about. People genuinely don't know that some things actually are AI, and if we can't convince them of that, we can at least say "don't use ____" in the guideline.
  • "they are not specifically trained in generating convincing-sounding arguments based on Wikipedia policies and guidelines, and considering they have no way to actually read and interpret them" - Technically you could provide policies and guidelines in a prompt. Most people probably aren't doing that, but they could.
  • There are probably better copyedit examples; the first one seems like splitting hairs, and the original sentence had the same problem with different punctuation. The one where an AI copyedit turned "did not support Donald Trump" to "withdrew her support for Donald Trump" comes to mind. Better yet would be a copyedit to a talk page comment, though that might be hard to come by without using AI yourself.
Gnomingstuff (talk)15:07, 7 January 2026 (UTC)[reply]
Much appreciated, I'll replace the example in the copyedit section later and add the info from your first two bullets.Athanelar (talk)17:53, 7 January 2026 (UTC)[reply]
Might be worth saying that people are much more likely to engage with a comment in broken English than an LLM-generated one. Kowal2701 (talk) 21:16, 7 January 2026 (UTC)[reply]
I agree with this.SuperPianoMan9167 (talk)21:19, 7 January 2026 (UTC)[reply]
Is it true? I've seen plenty of broken English comments reverted without comment, or the response has been unhelpful.WhatamIdoing (talk)03:20, 9 January 2026 (UTC)[reply]
"Editorsare not permitted to use large language models to generateor modify any text for user-to-user communication" will have adisparate impact that discriminates against people with some kinds of disabilities, such asdyslexia. A blanket ban is therefore in conflict withWP:ACCESS and possibly withfoundation:Wikimedia Foundation Universal Code of Conduct.
I think it is patronizing to tell people "You don't need it" when some of them actually do. I oppose telling English language learners to simply go away ("If your English is insufficient to communicate effectively, then once again, you unfortunately lack the required language ability to participate on the English Wikipedia, and you should instead participate on the relevant Wikipedia for your preferred language"), because (a) that's rude, and (b) sometimes we need them to bring information to us. If you don't speak English, but you are aware of a serious problem in an English Wikipedia article, I want you to use all reasonable methods to alert us to the problem.
Here's the ✓ goal:
  • A: I don't know English very well, but the name on the picture in this article is wrong.
  • B: Thanks for letting us know about this factual error. I'll fix it.
Here's the ☒ anti-goal:
  • A: I don't know English very well, but the name on the picture in this article is wrong.
  • B: This is obvious AI slop. If you can't write in English without using a chatbot to translate, then just go away and correct the errors at the Wikipedia for your native language instead!
  • A: But the error is at the English Wikipedia.
  • B: I don't have to read your obvious machine-generated post!
Unfortunately, that's not a made-up example.WhatamIdoing (talk)19:06, 7 January 2026 (UTC)[reply]
Discussion on English competence requirements on enwiki
At minimum there must be a carve-out for machine translation because basically all machine translation nowadays uses the LLM architecture, as it typically performs better than other types of neural networks. (In fact, the very first transformer from the 2017 paper Attention Is All You Need was not designed for text generation; it was designed for machine translation. The generative aspect was pioneered by OpenAI's GPT model architecture with the release of GPT-1 in 2018.)
There's currently a proposed guideline for LLM translation that I think will help with this issue. SuperPianoMan9167 (talk) 19:26, 7 January 2026 (UTC)[reply]
I understand your point, but what you're essentially arguing then is that WP:CIR also needs to be modified because we shouldn't require communicative English proficiency.
"I think it is patronizing to tell people 'You don't need it' when some of them actually do." My point is that the people who need AI to talk for them, translate for them, interpret PAGs for them, etc. have a fundamental CIR issue that the LLM is being used to circumvent. We can't simultaneously say "competence is required" and also "if you lack competence you can get ChatGPT to do it for you". Athanelar (talk) 19:27, 7 January 2026 (UTC)[reply]
From WP:CIR: "It does not mean one must be a native English speaker. Spelling and grammar mistakes can be fixed by others, and editors with intermediate English skills may be able to work very well in maintenance areas. If poor English prevents an editor from writing comprehensible text directly in articles, they can instead post an edit request on the article talk page."
Interpreting CIR in such a way leads to a worsening of Wikipedia's systemic bias. SuperPianoMan9167 (talk) 19:32, 7 January 2026 (UTC)[reply]
Nor am I saying anyone must be a native speaker, merely that if someone's English level is so low that they require an LLM to communicate legibly, then they are blatantly not meeting the CIR requirement to have "the ability to read and write English well enough [...] to communicate effectively".
Saying "actually, if you can't communicate effectively then you can just have an LLM talk for you" seems to be sidestepping this requirement.
I also simply don't see the reason. Other-language Wikipedias already struggle for editors compared to enwiki, why should we encourage editors without functional English to find loopholes to edit here rather than being productive members of the wider Wikipedia project? Athanelar (talk) 19:41, 7 January 2026 (UTC)[reply]
Because we need people to tell us about errors in our English-language articles even if they can't communicate easily in English. It is better to have someone using LLM-based machine translation to say "Hey, this is wrong!" than to have our articles stay wrong.
This should not be a difficult concept: Articles must be accurate. If the only way to make our articles accurate is to have someone use an LLM-based machine translation tool to tell us about errors, then that's better than the alternative of having our articles stay wrong. WhatamIdoing (talk) 19:55, 7 January 2026 (UTC)[reply]
We really don't need English competence. If you don't know English, you can post in your native language, and someone else can translate it. By the way, the CIR discussion seems to be tangential.Nononsense101 (talk)19:35, 7 January 2026 (UTC)[reply]
WP:Competence is required directly states editors must have "the ability to read and write English well enough to avoid introducing incomprehensible text into articles and to communicate effectively", and I have absolutely never heard of it being acceptable to participate in the English Wikipedia by typing in another language and having others translate. Athanelar (talk) 19:42, 7 January 2026 (UTC)[reply]
There's also WP:ENGLISHPLEASE. fifteen thousand two hundred twenty four (talk) 19:45, 7 January 2026 (UTC)[reply]
ENGLISHPLEASE says: "This is the English-language Wikipedia; discussions should normally be conducted in English. If using another language is unavoidable, try to provide a translation, or get help at Wikipedia:Embassy." (emphasis mine)
If the rule prevents non-English speakers from being able to help improve Wikipedia, ignore it. SuperPianoMan9167 (talk) 20:38, 7 January 2026 (UTC)[reply]
Athanelar, I don't know how else to say this: This is a huge project, and you've only been editing for two years. There's a lot you've never heard of. For example, I'd guess that you've never heard of the old Wikipedia:Local Embassy system, in which the ordinary and normal thing to do was "typing in another language and having others translate". Just because one editor (any editor, including me) hasn't seen it before doesn't mean that it doesn't happen, or even that it isn't officially encouraged in some corner of this vast place. WhatamIdoing (talk) 19:58, 7 January 2026 (UTC)[reply]
Yes, I get what you mean, but I've also seen the contrary plenty of times; people show up to the teahouse or helpdesk and ask questions not in English, and the response is universally "sorry, this is the English Wikipedia"
It just seems needlessly obtuse to say "well, there's technically hypothetically a carveout for occasional non-English participation here, sometimes, maybe" when in practice that really isn't (and shouldn't be) the case.Athanelar (talk)21:14, 7 January 2026 (UTC)[reply]
"well, there's technically hypothetically a carveout for occasional non-English participation here, sometimes, maybe": There's always a carveout. SuperPianoMan9167 (talk) 21:17, 7 January 2026 (UTC)[reply]
Okay, sure, but IAR can never be used as a justification to not prohibit something, because by that logic we can't forbid anything because IAR always provides an exception.Athanelar (talk)21:21, 7 January 2026 (UTC)[reply]
Yes, editors are sometimes inhospitable and dismissive. Yes, editors sometimes misquote and misunderstand the rules. I could probably fill an entire day just writing messages telling people that they'd fallen into another one of the common WP:UPPERCASE misunderstandings. It is literally not possible for anyone to know and remember all the rules. Even if you tried to read them all, by the time you finished, you'd have to start back at the beginning to figure out what had changed while you were reading. None of this should be surprising to anyone who's spent much time in discussions. But the fact that somebody said something wrong doesn't prove that the rule doesn't exist. It only shows their ignorance.
The ideal in the WP:ENGLISHPLEASE rule (part of Wikipedia:Talk page guidelines) is for non-English speakers to write in their own language, run it through translation, and paste both the non-English original and the machine translation on wiki. A guideline that says not to use machine translation on talk pages would conflict with that. WhatamIdoing (talk) 21:21, 7 January 2026 (UTC)[reply]
I really have an issue with this line of logic, because what does "if using another language is unavoidable" even mean? It seems to directly conflict with both itself and WP:CIR.
Please use English on talk pages, and also you are required to be able to communicate effectively in English, but if you can't then actually you aren't required and you can just machine-translate it.
Never mind my guideline proposal, it sounds like the existing guidelines and norms are already in a quantum superposition on this issue. Athanelar (talk) 21:24, 7 January 2026 (UTC)[reply]
CIR also says: "It does not mean we should ignore people and not try to help improve their competence." SuperPianoMan9167 (talk) 20:40, 7 January 2026 (UTC)[reply]
@WhatamIdoing, spelling out this scenario has helped me think through some of what I'm seeing in this discussion. I think that a weak point in LLMCOMM, CIR, and similar guidelines is that there are really at least three different broad categories of "editors" who have different needs and interests:
  • People who genuinely want to help build an encyclopaedia and may be in this for the long term ("Wikipedians") – most of our policies and guidelines are written with these editors in mind
  • People who have identified serious problems in specific articles (regardless of whether they're article subjects or have a COI, or are uninvolved) – if there are serious problems that need to be fixed, we need to fix them, and we should be thanking these helpful non-Wikipedians, not putting up barriers based on CIR or LLMCOMM
  • People who are here for self-promotion, not to build an encyclopaedia – we have rules and procedures for dealing with these
Trying to address these different types of editors/visitors with one broad brush does a disservice to the well-intentioned non-Wikipedians, as WAID has illustrated.ClaudineChionh(she/her ·talk ·email ·global)21:22, 7 January 2026 (UTC)[reply]
Amazingly I don't think this is said anywhere in LLM PAGs or essays, but we should say somewhere that "Wikipedia does have a steep learning curve and it is very normal for a new editor to struggle. Some learn quicker than others, and people are obligated to be patient with new editors and help them improve." Basically, don't worry if you find it hard. I'd rather something like that replaced "You don't need it". Kowal2701 (talk) 21:23, 7 January 2026 (UTC)[reply]
"People are obligated to be patient with new editors and help them improve" =Wikipedia:Please do not bite the newcomers (a behavioral guideline).WhatamIdoing (talk)03:27, 9 January 2026 (UTC)[reply]
Revision 1

Note that I have slightly rewritten the "You don't need it" section to focus a bit more on the encouragement, and also to soften the language around English proficiency. @WhatamIdoing @SuperPianoMan9167 et al, is this something more in line with your ideal spirit?Athanelar (talk)21:34, 7 January 2026 (UTC)[reply]
Yes! I'm still somewhat opposed to the general premise, banning all use of LLMs in comments, but that section is much better now.
My ideal version of such a guideline would be:
  • Generating comments with LLMs (outsourcing your thinking to a chatbot) is prohibited. You have to be able to come up with your own ideas.
  • Modifying comments with LLMs, such as using them for formatting, is strongly discouraged. This is due to the risk of the LLM going beyond changing formatting and fundamentally changing the meaning of the comments.
SuperPianoMan9167 (talk)21:43, 7 January 2026 (UTC)[reply]
I think this is a compromise I would absolutely be on board with, if others agree.Athanelar (talk)21:46, 7 January 2026 (UTC)[reply]
I like this a lot.Gnomingstuff (talk)22:21, 7 January 2026 (UTC)[reply]
I do also like this. Many editors say that they used LLMs "only for grammar" while having the kind of issues that only come with LLM generation (for example, the same vague, nonspecific boilerplate reassurances that can be found almost word-for-word in at least half of the unblock requests I've seen), and others might genuinely not realize that the LLM has completely changed the meaning of their comment behind a facade of "grammar fixes". ChaoticEnby (talk · contribs) 23:04, 7 January 2026 (UTC)[reply]
Revision 2

Per the feedback given, I have changed the scope of the proposal. The proposal now:

  • Forbids the use of LLMs to generate user-to-user communication, including to generate a starter or idea that a human then edits. (This clause is added to close the inevitable loophole that would otherwise arise.)
  • Strongly discourages the use of LLMs to review or edit human-written user-to-user communication, and explains that if doing so results in text which appears wholly LLM-generated, then it may be subject to the same remedies as for LLM-generated text

So: LLM-written and LLM-written-then-human-reviewed communications are not allowed.

Human-written, LLM-reviewed communications are strongly discouraged.

Please re-review accordingly and let me know if there's anything else that needs changing or adding before we move to RfC.Athanelar (talk)00:15, 8 January 2026 (UTC)[reply]

The sentence about people unwilling or unable to communicate/interpret/understand feedback etc. should be reworded to the following: "People unable to communicate with other editors, interpret and apply policies and guidelines, understand and act upon feedback given to them, etc. should ask for help at the Teahouse." If you keep the current wording, the word "incompatible" should not be linked, as the linked page is about categories and redirects, not related to the linking sentence. In any case, I support the proposal. Nononsense101 (talk) 02:39, 8 January 2026 (UTC)[reply]
  • Proposal: "LLM-generated communications in talk pages, deletion discussions,requests for comment and the like should becollapsed and excluded from assessments of consensus".
  • Scenario:
    • A: Hi, I found a serious factual error in the article. Here's a source to show I'm not making this up.
    • B: Collapsed! GPTZero said this is AI generated. Come back when you can write in your own words.
    • A: But I didn't generate this idea; I only used it for copyediting.
    • B: Collapsed again! The rules say I should collapse your AI-generated nonsense, so I'm following the rules!
This is looking like the ☒ anti-goal to me. WhatamIdoing (talk) 00:26, 9 January 2026 (UTC)[reply]
Nobody is arguing that we should treat text as AI generated just because GPTZero says so; this is a strawman. I even have another proposal specifically to address the identification of AI generated text, but that's for another time.Athanelar (talk)00:39, 9 January 2026 (UTC)[reply]
Nobody (here) is arguing that we should trust GPTZero, and I suspect that everybody here has seen editors actually do that, and believe they are completely justified in doing that.WhatamIdoing (talk)03:30, 9 January 2026 (UTC)[reply]
Sure, but if someone quoted my hypothetical guideline to justify collapsing an evidently good-faith, human-written edit request just because GPTZero said it's AI generated, I think any sensible editor seeing that would say it's not a reasonable application of the guideline.
You can't argue against a guideline by taking the worst possible way a person could misinterpret it. It constantly happens that editors accuse other editors of personal attacks because they get told their contribution was bad, does that meanWP:NPA isn't fit for purpose?Athanelar (talk)03:48, 9 January 2026 (UTC)[reply]
For many editors, "GPTZero said it's AI generated"proves that it's not a "human-written edit request". If you don't want that to happen per your proposal, then you need to increase its already bloated (~1800 words) size even more, to tell editors not to believe GPTZero.WP:NPA might be a viable model for this, as it explains both what isand isn't a personal attack, and how to respond to differing scenarios.
I can, and have, since before some of our editors were even born, argued against potentially harmful rules by taking the worst possible way a person could misinterpret them, and then deciding whether that worst-case wikilawyer is both tolerable and likely. Thinking about how your wording might be misunderstood or twisted out of recognition is how you're supposed to write rules.
This has been known since at least the 18th century, when James Madison wrote in Federalist No. 10 that "It is in vain to say, that enlightened statesmen will be able to adjust these clashing interests, and render them all subservient to the public good. Enlightened statesmen will not always be at the helm: Nor, in many cases, can such an adjustment be made at all, without taking into view indirect and remote considerations, which will rarely prevail over the immediate interest which one party may find in disregarding the rights of another, or the good of the whole", and went on to propose a large federal republic as a way of keeping individual liberty (which is a necessary precondition for factionalism) and national diversity (which leads to factionalism through an us-versus-them mechanism) while reducing the opportunity for any one faction to seize power over the others.
I recommend Madison's work on factionalism to anyone who wants a career in policy writing, but for now, spend a few minutes thinking about how we could adapt Madison's definition of a faction: "a number of Wikipedians...who are united and actuated by some common impulse of passion against AI...adverse to the rights of other Wikipedians (e.g., to have others focus on the content, instead of focusing on the tools used to write it), or to the permanent and aggregate interests of the community (e.g., to not WP:BITE newcomers or have hundreds of good-faith contributors told they're not welcome and not WP:COMPETENT)."
In the present century, we call this phenomenon things like misaligned incentives (e.g., editors would rather reject comments on a technicality than go to the trouble of correcting errors in articles or explaining why it isn't actually an error, but articles need to be corrected, and explanations help real humans), and we address it through processes like designing for evil (e.g., don't write "rules" that can be easily quoted out of context; don't optimize processes for dismissive or insulting responses) and use cases (e.g., How will this rule affect a person who doesn't speak English well? A WP:UPE? A person with dyslexia? An autistic person? A one-off or short-term editor?).
For example:
  • Protect the English language learner by declaring AI-based machine translation to be acceptable.
  • Ignore the UPE's AI use as small potatoes and block them for bigger problems.
  • Educate anti-AI editors that both human- and AI-based detectors make mistakes, and that these mistakes are more likely to result in editors unintentionally discriminating against editors with communication disabilities.
  • Remind editors to WP:Focus on content, which sometimes means saying "Thanks for reporting the error" instead of collapsing AI-generated comments.
These are only examples; the more important part is to think through each bit yourself.WhatamIdoing (talk)05:10, 9 January 2026 (UTC)[reply]
I do understand your point, and am truly appreciative of the time and effort you're taking to make it. I still have two concerns with it:
  1. The first is bloat; as you've indicated, words are precious in any policymaking effort and the longer people have to read to 'get to the point' the less chance they will. I'm concerned at how much weight should be added to cover things like "it's also possible to make mistakes without AI" that in any case should be assumed by any reasonable audience. It also feels redundant, i.e., AGF and BITE still apply even if I don't explicitly restate them. The existence of a guideline prohibiting AI-generated text is by no means a carte blanche to ignore those other, more fundamental principles.
  2. Given that your primary cause for concern seems to be about collapsing AI-generated comments; well, that already exists as WP:HATGPT, all I'm doing is restating it here. However, on rereading that, I suppose I could (and will) add some language specifying that conversations should not be collapsed if their content proves otherwise extraordinarily useful, which should cover the edge cases you're concerned about, with super-useful AI users and overly anal-retentive wikilawyers.
Athanelar (talk)10:13, 9 January 2026 (UTC)[reply]
@Athanelar, when I'm working on policy-type pages, the definitions in RFC 2119 are never far from my mind. Here are the most relevant bits:
  • SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
  • MAY This word, or the adjective "OPTIONAL", mean that an item is truly optional.
And now let's compare what you wrote vs the text at HATGPT:
In short, you've said that this "SHOULD" normally happen unless someone has carefully considered the situation and decided to make a special exception (e.g., for "extraordinarily useful" comments), and the existing guideline says that this "MAY" happen, but it's strictly optional and not ever required. Can you see the gap between what you're proposing and the existing guideline? If you genuinely believe that "well, that already exists as WP:HATGPT, all I'm doing is restating it here", then I don't think we're speaking the same language. WhatamIdoing (talk) 01:31, 12 January 2026 (UTC)[reply]
Sure; nevertheless, the kind of person you're describing would do what you're saying regardless of whether it's 'should' or 'may', and my entire contention is whether it's a realistic enough concern to affect anything, which I simply doubt. People who fail to AGF won't get a free pass just because "Athanelar's guideline says 'should', so that means I can collapse whatever I want".
I think the real solution to your concern is to establish a consensus standard for determining what is "AI-generated text" for the purpose of these types of guidelines, which I already tried before. Athanelar (talk) 10:29, 12 January 2026 (UTC)[reply]
I do think that if someone reads an AI-generated comment and collapses it per your proposal, they should "get a free pass", because they were actually following the guideline to the best of their good-faith ability.
As a minimum, I suggest that when you change "may" optionally to "should" normally, you don't present that as a non-change that is already enshrined in guidelines. This is a significant change; either own it as being a change, or don't propose it. WhatamIdoing (talk) 23:40, 12 January 2026 (UTC)[reply]
Who are these editors who are relying on GPTZero and nothing else? That doesn't describe anyone I'm aware of working on AI cleanup, and it doesn't describe most of what goes on at ANI (the people who bring in GPTZero or whatever tend to be uninvolved participants).Gnomingstuff (talk)14:27, 9 January 2026 (UTC)[reply]

There was a discussion not long ago about LLMs creating spam; see here. As I said there, I think this is one way to look at it -- we will not be able to detect all uses of LLMs, but if our rules force LLMs to become hard to detect (because they have improved the usefulness of their posts) maybe that's the best outcome we can hope for. I can see why we want to ban LLMs for user communication, and for things like FAC and GAN reviews, but there is no guaranteed way to detect LLM-generated text. Plus I'd argue that in the right hands they are useful. I have used them myself to find problems in articles I have worked on, for example. TL;DR: I am not strongly opposed to a rule like the one suggested here, but I doubt it will be very useful. I don't have a better suggestion, though. Mike Christie (talk - contribs - library) 03:56, 8 January 2026 (UTC)[reply]

I wonder how long it will be before attempting a ban is just pointless, either because we can't detect it at all, or because the amount of time spent arguing over whether a comment is a prohibited type of AI overtakes the cost of permitting it. WhatamIdoing (talk) 00:19, 9 January 2026 (UTC)[reply]
The amount of effort spent arguing whether something is AI-generated is already at times greater than the amount of effort spent determining whether the content of that something is actually problematic.Thryduulf (talk)03:23, 9 January 2026 (UTC)[reply]
In my anti-goal scenario above, what's motivating B to ignore the reported error and focus on the communication style? Why is B failing at Postel's law of practical communication? Assume that B would respond in a practical manner if he was realistically able to. Is the problem within B himself (e.g., B is fixated on rule-following, to the point of not being able to recognize that the error report is more important than the method by which the error report was communicated)? Is the problem accumulated pain (this is the 47th one just today, and B's patience expired several comments ago)? Is the problem in our systems (e.g., if B can quickly dismiss a request as AI-generated, then harder work can be avoided)? WhatamIdoing (talk) 05:28, 9 January 2026 (UTC)[reply]
I have to admit I've never really thought about that before. My gut reaction is to say it's mostly "accumulated pain" with a bit of over-focus in there too: Someone finds and fixes a problem, then they find another similar one and fix that. At some point they realise they've seen quite a few of these and start looking for others to fix. This becomes an issue if they get overwhelmed by the scale of the issue and/or stop looking at the wider context to see whether it is actually a problem that needs to be fixed. Thryduulf (talk) 12:27, 9 January 2026 (UTC)[reply]
It's kind of a mix of:
  • This is the 47th one today, and the majority of the last 46 people were either hostile or uncooperative.
  • AI copyedits go much farther than "normal" copyedits do in terms of rewriting meaning -- they're more akin to line edits -- but AI companies do not always make it clear how much they're changing. So when someone hears "I just used it for copyediting," they're inclined to distrust that.
  • A subtler one - there are at least some cultural differences in what "copyediting" entails and how transformative it is. I don't know all of these but I know American copyeditors and British subeditors have different expectations of the job.
  • In reality, this kind of conversation does not usually begin with "Hi, I found a serious factual error in the article. Here's a source to show I'm not making this up"; it begins with a wordy behemoth full of AI platitudes. But even those -- at least on article talk pages -- often don't result in that kind of response, because many editors watching individual articles aren't aware that AI is even a thing (still). Where this conversation usually happens is someone asking a question, and receiving an AI reply.
Gnomingstuff (talk)14:30, 9 January 2026 (UTC)[reply]
  1. Plus there's another 103 to go, and it always feels like "I" am the only person doing this (because there's no way to know how many people have already checked the thing that I'm checking now).
  2. How much an AI copyedit changes shouldn't be visible to the other talk page participants.
  3. I love that link, and I hope someone gets her book and improves Copyediting with it. I wonder if British editors are more irritated by AI 'copyediting' for tone/voice reasons.
  4. That's a testable claim. Here are the first five recently edited talk pages that contain a Template:Collapse AI top: Talk:Kristi Noem#Semi-protected edit request on 8 January 2026 (2), Talk:Primerica#Request for balanced description reflecting regulator sources + independent media sources, Talk:Scott Wiener#Wiener’s CLJC role warrants a stand-alone section, Talk:Pythagorean triple, and Talk:2026 Tamil Nadu Legislative Assembly election#Major cleanup required. In all five cases, it was either the first comment in the section or a later reply from the editor who started the section.
WhatamIdoing (talk)19:27, 9 January 2026 (UTC)[reply]
Not sure these are great examples:
  • Kristi Noem seems like a standard "why is Wikipedia mentioning this public scandal, it must be politically biased" type comment, so the good outcome was already off the table
  • Primerica seems like a standard "why is Wikipedia mentioning this negative coverage of a company instead of promoting it?" type comment, which, same
  • Scott Wiener comment hatted four months after it was posted; there was a substantive discussion before then
  • Pythagorean triple thread began with a wordy platitude behemoth yet still was not hatted until several comments deep (after the LLM had already provided low-quality sources when asked about them)
  • Hard to tell what's going on in the 2026 Tamil Nadu Legislative Assembly election thread but it looks like there was some backstory and at least one block prior to the comment
Gnomingstuff (talk)03:25, 10 January 2026 (UTC)[reply]
I'm not saying that none of these deserved to be hatted. I'm saying that they're not evidence supporting the claim that what "usually happens is someone asking a question, and receiving an AI reply".
You can repeat the search yourself if you'd like, and pick a different sample set. It's sorted to have the most recently edited Talk: pages containing that template at the top (e.g., /Archive pages if an archive bot just did its daily run – it's the most recent edit of any kind that counts, not specifically the most recent addition of the template). WhatamIdoing (talk) 05:53, 10 January 2026 (UTC)[reply]
As a Polish native speaker, my English is strong, but it is not at a native level. I can easily understand written and spoken English, but expressing myself in English - especially writing comments - is much harder for me than it is for native speakers. Banning the use of machine translation tools (which increasingly rely on LLMs) to edit messages would be exclusionary and would push people like me out of discussions about building the encyclopedia, even when we can judge whether a translation faithfully conveys what we meant.
This is even more pronounced for non-native speakers with dyslexia. Without tools that help with grammar and punctuation, even strong substantive arguments can come across as weaker or less persuasive - not because the reasoning is bad, but because the English reads poorly.Grudzio240 (talk)10:49, 12 January 2026 (UTC)[reply]
Myself and many others have expressed and will continue to express that a poorly-worded non-native comment will always read better and be stronger and more persuasive than a comment which reads like LLM generation. In the former we can at least be confident that the ideas and convictions presented are your own, whereas in the latter we have no means to differentiate you from any of the other people that generate some boilerplate slop and call it a day.Athanelar (talk)10:55, 12 January 2026 (UTC)[reply]
I understand the concern: when a comment reads like it was generated, it’s harder to trust that the wording reflects the editor’s own thinking. That said, as machine translation and AI-assisted editing become more common - and harder to detect - “imperfect English” will increasingly become a marker that singles people out. In practice, that can discourage non-native speakers (and people with dyslexia) from participating, even when their underlying points are solid. I think the better approach is to focus on the substance and evidence, and allow limited language assistance (especially translation), while still discouraging using LLMs to generate arguments or positions.
Also, reactions to AI-assisted text vary a lot. Not everyone reacts negatively to AI-assisted wording, and I don’t think policy should be optimized for the most suspicious readers. If the content is clear and sourced, that should matter more than whether the phrasing “sounds too polished”.Grudzio240 (talk)11:08, 12 January 2026 (UTC)[reply]
Regarding "that should matter more than whether the phrasing 'sounds too polished'", the fact is that this isn't the most glaring sign of AI writing. We've seen many editors write in a quite refined way, without being suspected of using LLM assistance, as LLMs will overuse specific kinds of sentence structures (e.g. WP:AISIGNS). This is very different from the pop-culture idea of "anyone who uses refined or precise language will sound like an LLM".
As these get associated with people using these tools to generate arguments completely divorced from policy, they get picked up as cues that make readers tune out from the substance of arguments, and end up hurting the non-native speakers they hope to help. As a non-native speaker myself, I might worry about flawed grammatical structures here and there, but I would much prefer that to other editors reading my answers with immediate suspicion due to obvious AI signs.ChaoticEnby (talk ·contribs)11:17, 12 January 2026 (UTC)[reply]
Defaults treating comments that have “AI SIGNS” as suspicious may undermine Wikipedia:Assume good faith. "Unless there is clear evidence to the contrary, assume that fellow editors are trying to improve the project, not harm it." We should start by evaluating the content; proceed to distrust only when the content or behavior indicates a real problem. Grudzio240 (talk) 11:38, 12 January 2026 (UTC)[reply]
The entire conceit of this guideline is that AI-generated comments are problematic in and of themselves. If the guideline said "ignore whether the comment is AI generated and just assess whether it violated any other policy or guideline" then it would be a pointless guideline. Obviously AI-generated comments which violate other PAGs are already forbidden -- because they violate other PAGs. The point of this is to forbid AI-generated comments regardless of whether their content breaks any other PAG (obviously subject to the usual exception). Athanelar (talk) 11:42, 12 January 2026 (UTC)[reply]
I wonder how many editors agree that all AI-generated comments are problematic in and of themselves. WhatamIdoing (talk) 23:43, 12 January 2026 (UTC)[reply]
Well, the close of the VPP discussion emphasized that “The word ‘generative’ is very, very important” and that “This consensus does not apply to comments where the reasoning is the editor's own, but an LLM has been used to refine their meaning… Editors who are non-fluent speakers, or have developmental or learning disabilities, are welcome … [and] this consensus should not be taken to deny them the option of using assistive technologies to improve their comments.” WP:AITALK was written on the basis of that discussion, and it seems to rest on exactly this distinction: LLMs should not be used to generate the substance of user-to-user communication (i.e., the arguments/positions themselves), but meaning-preserving assistance (e.g. translation or limited copyediting where the editor’s reasoning remains their own) was explicitly not the target of that consensus. Grudzio240 (talk) 11:58, 13 January 2026 (UTC)[reply]
The core issue is that AI signs are heavily correlated with fully AI-generated arguments, themselves usually detached from policy. AGF is not a suicide pact, and editors used to the preponderance of flawed AI-generated arguments (compared to the few meaningful arguments where AI has only played a role in translation/refinement) might discount all as falling in the former category. This is magnified by many editors choosing to defend clearly abusive uses of AI (for example, adding hallucinated citations) as only using it to refine grammar or correct typos, even when that manifestly wasn't the case. ChaoticEnby (talk · contribs) 12:10, 12 January 2026 (UTC)[reply]
For approximately the gazillionth time, saying that text is likely to have been generated by AI says nothing about good faith or bad faith. It is pointing out characteristics of text. By this logic, adding a "copyedit" or "unreferenced" tag is assuming bad faith.Gnomingstuff (talk)17:20, 12 January 2026 (UTC)[reply]
Yes, but it's just a short step from "I think he used AI" to Chaotic Enby's "heavily correlated with fully AI-generated arguments" to "This person is just wasting my time with fake arguments and has no interest in helping improve Wikipedia".WhatamIdoing (talk)23:45, 12 January 2026 (UTC)[reply]
That is very much assuming bad faith in what I said – I'm not saying that we should discount comments on that basis, only that some editors will do it, and I was explaining the source of that distrust rather than defending it.ChaoticEnby (talk ·contribs)00:07, 13 January 2026 (UTC)[reply]
I agree with you: Some editors will do it.WhatamIdoing (talk)00:28, 13 January 2026 (UTC)[reply]
I know this is not entirely what you are arguing, but this kind of LLM-nihilist stance I keep hearing, like "why do you care how it was made if the content is policy compliant?", seems patently absurd to me. If the only thing we care about is the substance and not who made it or how, we might as well use Grokipedia. It's rather like presenting someone with a dish of Ortolan and saying "if it tastes good, why concern yourself with the ethical implications of its production?" The ends simply do not justify the means, in my mind. Athanelar (talk) 11:33, 12 January 2026 (UTC)[reply]
I think there are huge differences in values that people have, from a cultural or philosophical perspective. Some people see AI as inherently evil and some just see it as a tool. If there was an end product of English Wikipedia, something we were actually trying to finalise one day, it seems silly to the faction that sees AI as a tool to make humans do the job of machines. You get the same thing either way, we're just making it harder on ourselves. They see no value in human labour over machine labour. The means don't need to be justified because they don't see anything wrong with them. People who prefer human output do. This is especially true in the larger context of modern society, where the system requires people to work even when there's no work to be done. If a machine is doing your job for you, that doesn't mean you don't have to work, it means you have to create a need for yourself. If robots are writing encyclopaedias, that's just another existing need filled.~2026-24291-5 (talk)12:13, 12 January 2026 (UTC)[reply]
You're missing the third camp of "AI is a tool, but a flawed one". Using AI as a tool to write an encyclopedia would work in theory, and might be a very real possibility in the future, but has shown its current limits, and regulating it is necessary to address those immediate concerns, rather than for more abstract philosophical reasons.ChaoticEnby (talk ·contribs)12:18, 12 January 2026 (UTC)[reply]

I would vote no on any guideline that purports to tell editors what technology they can and can't use to draft communications. It's none of our business. You don't like somebody's writing style? Too bad; don't read it. It doesn't matter if it was generated by ChatGPT or polished by Grammarly or if it's just bad writing: we can judge people for the content of their posts (e.g., WP:NPA, WP:NOTFORUM, WP:BLUDGEON, etc.), but not for the tools they use to draft that content. Also, if you've been active on Wikipedia for 3 months, maybe you don't try to write a new guideline that purports to tell everybody else what tools they can and can't use to communicate on Wikipedia. If most of your edits are about trying to fight against LLM use, you might be WP:RGW instead of WP:HERE. Levivich (talk) 17:59, 12 January 2026 (UTC)[reply]

We can and already do judge people for the tools they use to edit: that is why we have a bot policy, for example, or limitations on fast tools such as AWB. In these cases, the reason is the same as the proposed reasons to limit AI-generated writing. Namely, the potential for fast disruption at scale: someone can generate 50 proposals in a few minutes, leaving other editors in need of a disproportionate effort to address them all – or leave the unread ones to be accepted as silent consensus, as no one will take the time to analyze 50 different proposals in detail.
Additionally, it isn't necessarily helpful to say that "if you've been active on Wikipedia for 3 months, maybe you don't try to write a new guideline", as newcomers can absolutely learn fast and have worthy insights – especially as you wish to judge others for the content of their posts. ChaoticEnby (talk · contribs) 18:44, 12 January 2026 (UTC)[reply]
Has anyone posted 50 proposals in a few minutes? Or even 10 long talk-page comments in a few minutes?WhatamIdoing (talk)22:55, 12 January 2026 (UTC)[reply]
This isn't a new guideline. It's a refinement of WP:AITALK, which has existed for an entire year now. The RfC that produced the guideline was closed stating in part (bolding mine): "There is a strong consensus that comments that do not represent an actual person's thoughts are not useful in discussions. Thus, if a comment is written entirely by an LLM, it is (in principle) not appropriate. The main topic of debate was the enforceability of this principle."
So, if you wanted to "vote no," on that, you should have done that one year ago, which you didn't.Gnomingstuff (talk)18:44, 12 January 2026 (UTC)[reply]
Sorry, but both of you missed what I was saying. Re CE: I didn't say to edit, I said to draft communications, and our existing bot policy already prohibits spam (as you point out). Re GS: WP:AITALK is about the content--the output, what gets published on this website--not about the method. AITALK doesn't say editors can't use AI to start or refine their posts, or to copyedit or fix grammar. Any proposed guideline that says anything like "This prohibition includes the use of large language models to generate a 'starter' or 'idea' which is then reviewed or substantially modified by a human editor." or "Editors are strongly discouraged from using large language models to copyedit, fix tone, correct punctuation, create markup, or in any way cosmetically adjust or refactor human-written text for user-to-user communication." would draw an oppose vote from me. Re both: how long before the community thinks repeated anti-LLM RFCs are a bigger problem than the use of LLMs on Wikipedia? Be judicious, mind the backlash, note the difference between the LLM proposals that have passed, and the ones that have failed. (Hint: the super-anti-AI proposals are the ones that have failed. The ones that allow use within reasonable boundaries have passed.) Levivich (talk) 18:52, 12 January 2026 (UTC)[reply]
The main problem with those RfCs is that the stricter proposals get shot down by editors wanting reasonable boundaries, and the more lenient proposals get shot down by "all-or-nothing" editors. Given that, and the speed at which the technology advances, it isn't surprising that we are often discussing these issues – especially since recent proposals have been closed with consensus that the community wants some regulation but disagreed on the exact wording proposed. In that regard, the disruption doesn't come from the RfCs themselves, but from the inability of editors on both sides to compromise.
Additionally, we also regulate what someone may do to draft communications, with proxy editing being the best example – if we can disallow proposals coming from a banned user, we can disallow proposals coming from a tool that has repeatedly proven disruptive. ChaoticEnby (talk · contribs) 19:09, 12 January 2026 (UTC)[reply]
Re proxy edits, that is not regulating the technology used to draft communications. We don't tell people what word processor to use, or whether they can use a typewriter, or which spellchecker to use, etc. etc. This proposed guideline would be a first in that sense, and I believe is doomed for that reason.
As to the main problem with the RFCs, yes, I agree with you, but does this proposed guideline look like any kind of compromise? It's proposing rules that are stricter than the rules we have for mainspace (for Pete's sake!), and it's still trying to do the thing that the community has repeatedly said no to, which is to stop or "strongly discourage" all or almost all use of LLMs (as opposed to just "bad" use of LLMs). The drafter, in comments above, below, and elsewhere, is very transparent that the goal of the proposed guideline is to get people to stop using LLMs (as opposed to getting them to use LLMs correctly rather than incorrectly).
I'll say again the same thing I said about the last doomed RFC: hey, go ahead and run it, maybe I'm wrong and it'll get consensus, or maybe the next one will :-P
But really, CE, you've been around long enough to know what's up, I think you know I'm right... the reason your proposal at WT:TRANSLATE is on its way to passing is because that was a good proposal that compromised and is obviously responsive to community concerns from other RFCs (btw, great job there!). This proposal is not like that, it's almost the opposite in its stubbornness.
And I know you've personally put in a lot of time and effort into trying to get a handle on increased LLM usage on Wikipedia, what with the WikiProject and all, and I hate to see those productive efforts get sunk because we (collectively) aren't being clear enough to the hard liners in saying: "No. Stop trying to stop everybody from using LLMs, it's counterproductive." Because right now, NEWLLM is still laughably short, and it's not getting any better, because we're wasting time on uncompromising proposals like this one, instead of on compromise proposals like the translation one. And, frankly, it's because people who have no experience building consensus are being allowed to drive the bus, and are driving it off the road, rather than deferring to people who do know how to build consensus (like you).Levivich (talk)21:06, 12 January 2026 (UTC)[reply]
Yep, I think we agree on the broad strokes here. I still respectfully disagree that proxy edits are that far away from using ChatGPT to generate an argument from scratch (as in both cases, you're delegating the thoughts to someone/something else), but the crux of the issue isn't a specific policy detail, but the fact that compromises end up being overshadowed by more hardline proposals on which a consensus can't realistically be reached.ChaoticEnby (talk ·contribs)21:42, 12 January 2026 (UTC)[reply]
I am explicitly open to compromise here. I want people to propose compromises that they find acceptable. I have already changed my initial proposal in response to one such compromise. I know you know that, I just want to put it out there. Athanelar (talk) 21:52, 12 January 2026 (UTC)[reply]
You also explicitly said that you want an "extreme", maximalist rule. Demanding everything and then taking as much as you can get in practice isn't exactly the same as compromising. WhatamIdoing (talk) 23:13, 12 January 2026 (UTC)[reply]
The problem with allowing starters or ideas generated by AI is that, first, it permits an unfalsifiable loophole ("My comment isn't subject to this guideline because it's not AI generated, I just used AI to tell me what to say and then reworded it") and second, while the style of AI-generated posts is certainly problematic, another problem (as addressed in my guideline) is the content, and generating a starter with AI means the idea is still not yours but is rather the AI's, which is the whole thing this guideline aims to address.
If the AI tells someone to wikilawyer by citing a nonexistent policy or misapplying one that does exist, it doesn't matter if they do it in their own words or not.
So the point is to say that the ideas need to be your own, not just the presentation thereof.
As for "how long before the community thinks repeated anti-LLM RFCs are a bigger problem than the use of LLMs on Wikipedia?", to take a page from your own book in dismissing one's interlocutor: perhaps a person who is not active in the constant organised AI cleanup efforts doesn't have the best perspective on how much of a problem LLMs are.
I really encourage you to take some time and tackle one of the tracking subpages at WP:AINB. Take a look at this one, of a user who generated 200+ articles in mainspace wholesale using AI with no review or verification, and tell us again how the people trying to fight the fire are the real problem because they're getting everything wet in the process. Athanelar (talk) 19:10, 12 January 2026 (UTC)[reply]
Yeah, ironically, "If most of your edits are about trying to fight against LLM use, you might be WP:RGW instead of WP:HERE" is closer to actually assuming bad faith than anything people doing AI cleanup have been accused of. Gnomingstuff (talk) 21:04, 12 January 2026 (UTC)[reply]
"...which is the whole thing this guideline aims to address." Yes, that's the problem, in my view: you are trying to address something Wikipedia has absolutely no business addressing, which is what technology people use to communicate. As has been pointed out by others above, there are, first and foremost, the accessibility issues and the issues for non-native English speakers (like me btw). But beyond that, how a human being gets from a thought in their head, to a policy-compliant non-disruptive comment posted on Wikipedia, is none of our (the community's) business. It doesn't matter if they use a typewriter or what spellcheck or Grammarly or an LLM. If the output is not disruptive--if it's not bludgeoning or uncivil, etc.--we have no business telling an editor what technology they can and can't use to generate that output. (And btw if you think 200+ bad articles is a lot, lol, we've had people generate tens of thousands of bad articles, redirects, etc., without using LLMs, and that's happened for the entire history of Wikipedia--we still never banned people from using scripts or bots, despite the fact that they've been abused by some, and with much worse consequences than what's being reported at AINB). Levivich (talk) 21:15, 12 January 2026 (UTC)[reply]
"we still never banned people from using scripts or bots" But... we do? As pointed out before, we absolutely do restrict what technology people use to edit. You need express permission to operate a bot because of the potential for rapid, large-scale disruption.
You cannot seriously compare an LLM to a word processor or typewriter. Neither of those things is capable of wholesale generating a reply without any human thought behind it.Athanelar (talk)21:21, 12 January 2026 (UTC)[reply]
WP:MEATBOT comes to mind.SuperPianoMan9167 (talk)21:22, 12 January 2026 (UTC)[reply]
We don't require permission to use a script. You don't need permission to use the WP:API. What is regulated is the output--specifically, BOTPOL and MEATBOT prevent unauthorized bot-like editing regardless of whether a script is actually used or not. It's the effect, not the method, that's regulated (in fact, the effect is regulated the same way -- bot or meat -- regardless of the method!). And yes, I am absolutely comparing LLMs to the pen, the typewriter, the word processor, the spellchecker, the grammar checker, autocorrect, predictive text, etc. It's just the latest technological advance in writing tools. And LLMs are not capable of generating anything "without any human thought behind it"; they require prompts, which require human thought, and their training data is a bunch of human thought.Levivich (talk)22:49, 12 January 2026 (UTC)[reply]
Sure, but that's like arguing that paying someone to do your homework is materially the same as if you did it yourself, because you still had to describe the task to somebody else and then they still came up with an answer. You must know you're splitting hairs by now.Athanelar (talk)22:53, 12 January 2026 (UTC)[reply]
Maybe hiring a secretary to write a letter on your behalf would be a more relevant analogy: Bob Business tells his secretary to send a letter saying he accepts their offer to buy 1,000 widgets but wants to change the delivery date slightly. He glances over the letter, decides that it makes the points that he wanted to communicate, and signs it before mailing it.
Do you think the typical recipient of that letter would be offended to discover that Bob didn't choose every single word himself? Is the recipient likely to believe that the facts communicated did not represent Bob's own thoughts?WhatamIdoing (talk)23:33, 12 January 2026 (UTC)[reply]
That analogy only makes sense if you assume AI never makes up new arguments, and that it is only ever used to clarify existing thoughts that have been communicated in the prompt, rather than something like "please write me an unblock request". In the latter case, the fact that the substance of the unblock request isn't an original thought (but only the request to write one) is problematic, as we can't evaluate whether or not the blocked user properly understands the issues. That specific case is very much not theoretical, as around half of unblock requests have strong signs of LLM writing.ChaoticEnby (talk ·contribs)23:42, 12 January 2026 (UTC)[reply]
That analogy makes lots of sense, if you've ever worked with (or been) a human secretary.
(You might be interested in reading about the students who used AI to apologize after getting caught using AI to cheat,[1][2] if you haven't already seen it.)WhatamIdoing (talk)23:55, 12 January 2026 (UTC)[reply]
The problem is that this analogy is very far removed from the actual situations we're facing, and makes it harder to talk about them in precise terms. In one case, you have a secretary playing a purely functional role of transmitting a message and helping convey thoughts to an interlocutor, possibly adding some context of their own. The key task is to transmit the information, and using a secretary (or AI) to do it makes sense. On the other hand, an unblock request aims to show that the blocked user has some level of understanding of the situation. If a secretary (or AI) writes the unblock request, with the blocked user having only told them "write me an unblock request", then the unblock request fails at its purpose.ChaoticEnby (talk ·contribs)00:12, 13 January 2026 (UTC)[reply]
But how do we know what the prompt was? If the prompt was "write me an unblock request" and that's it, then your point holds true. But what if the prompt was "write an unblock request that says [user's own understanding]"? Like, for example, "write an unblock request that says I lost my cool and said something I shouldn't have and in the future I'll be sure to walk away from the keyboard when things get too heated and also I'm going to avoid this topic area for a while"? Could you tell what the prompt was based on the output? I don't think so...Levivich (talk)00:25, 13 January 2026 (UTC)[reply]
We don't know what the prompt was exactly, but we can get some strong indications when the user leaves unfilled phrasal templates, or apologizes for nonexistent issues completely unrelated to their behavior, or only writes generic, nonspecific commitments that could apply to literally any unblock request. In many of these cases (and, again, these are a large proportion of the unblock requests I'm seeing), I'd probably be even more worried if the prompt came from the user's own "understanding".ChaoticEnby (talk ·contribs)00:56, 13 January 2026 (UTC)[reply]
The unblock process might fail at its intended purpose, but that's entirely within the realm of normal secretary behavior. Have you never read tales like the bedbug letter (https://www.snopes.com/fact-check/the-bedbug-letter/)? Or heard stories about secretaries who make sure that the boss always remembers to buy a present for his wife's birthday, send flowers on their wedding anniversary, and so forth?
In the end, I think that it might make more sense for us to re-design the unblock process (to make it more AI-resistant) than to tell people they shouldn't use AI. Maybe a series of tickboxes, setting up a sort of semi-customizable contract? "▢ I agree that I won't put the word poop in any more articles" or "▢ I agree that I won't write long comments on talk pages" or whatever.WhatamIdoing (talk)00:37, 13 January 2026 (UTC)[reply]
Regarding that latter point, you might want to take a look at theWikipedia:Unblock wizard, which is a step in that direction.ChaoticEnby (talk ·contribs)00:50, 13 January 2026 (UTC)[reply]
Yes, I remember you starting that project. I think a wizard would be the right approach for a new, AI-resistant unblocking request model.WhatamIdoing (talk)00:56, 13 January 2026 (UTC)[reply]
To note: last time someone generated tens of thousands of redirects, we had to create a whole new speedy deletion criterion for it. More generally, there have been many discussions on article creation at scale (the other WP:ACAS) and attempts at building a framework to regulate it. So, while I don't disagree that we can't control everything, the issue of disruption at scale isn't new to Wikipedia, and efforts to address it aren't new either.ChaoticEnby (talk ·contribs)21:45, 12 January 2026 (UTC)[reply]
Yeah, we did that this time, too, and kudos to the community, it got to WP:G15 much faster than it took to get to WP:X1. But you know what we didn't do about the redirects or sports articles? Prohibit, or try to prohibit, people from using scripts or templates or bots, etc. We never went after the technology that made that spam possible, we went after the editors who did the spamming, and made new tools to efficiently deal with the spam (csd's). And those were 100,000-page problems; whereas this is like thousands of articles? (How many G15s have there been so far? I see 46 in the logs in the last two days.) So like an order or two orders of magnitude less? And our response, or some folks' response, has been an order or two orders of magnitude stronger.Levivich (talk)22:41, 12 January 2026 (UTC)[reply]
More to the point: We didn't try to "Prohibit, or try to prohibit" everyone else "from using scripts or templates or bots, etc." just because a few people abused those tools.WhatamIdoing (talk)23:20, 12 January 2026 (UTC)[reply]
But I never said we should prohibit anything entirely, just have a framework to regulate it. Which is exactly what we've done with bots (through WP:BRFA), with mass creation of articles and redirects (through draftification and new page patrolling), etc.ChaoticEnby (talk ·contribs)23:22, 12 January 2026 (UTC)[reply]
You: "I never said we should prohibit anythingentirely".
Proposal: "Editorsare not permitted to use large language models to generate user-to-user communications" (emphasis in the original)
One of These Things (Is Not Like the Others).WhatamIdoing (talk)00:10, 13 January 2026 (UTC)[reply]
As you might have noticed, I did not write the proposal.ChaoticEnby (talk ·contribs)00:12, 13 January 2026 (UTC)[reply]
The main worry I have with AI is that it is much more widely distributed. We don't have a few editors who can be blocked to get rid of the spamming, but tools that have been causing issues in the hands of a much broader range of editors, mostly because, sadly, many of them don't know how to use them responsibly. Banning the tool entirely is too harsh, and blocking individual editors doesn't solve the underlying problem, meaning we're in this problem zone where it's hard to craft good policy.
G15 is for the most extreme, blatant cases, but Category:Articles containing suspected AI-generated texts contains nearly 5000 pages, while Category:AfC submissions declined as a large language model output adds another 4000, just from the last 6 months. With all the smaller tracking categories, plus the expired drafts, we're easily above 10,000 pages.ChaoticEnby (talk ·contribs)23:20, 12 January 2026 (UTC)[reply]
I agree that we're in a difficult place. I don't like the idea of Wikipedia appearing to be AI-generated (even if it's not). I don't like the idea of Wikipedia having the problems associated with AI-generated content (including, but not limited to, factual errors).
But if:
  • We can't accurately detect/reject AI-generated content before it's posted
  • Many people believe that it's normal, usual, and reasonable to use AI tools to create the content they need for Wikipedia
  • The individual incentives to use AI (e.g., being able to post in a language you can barely read; being able to post an article quickly) exceed the expected costs (e.g., the UPE's throwaway account may get blocked)
then I think that having a rule, or even having an ✨Official™ Policy🌟, will not change anything (except maybe making our more rule-focused editors even angrier, which is not actually helpful).WhatamIdoing (talk)00:27, 13 January 2026 (UTC)[reply]
@Gnomingstuff I think it would help readers if the summary also reflected an important nuance from the earlier RfC: it explicitly carved out cases where the reasoning is the editor’s own and an LLM is used only to refine meaning (e.g. for non-fluent speakers or users with disabilities). This consensus does not apply to comments where the reasoning is the editor's own, but an LLM has been used to refine their meaning... Editors who are non-fluent speakers, or have developmental or learning disabilities, are welcome ...
The current proposal seems materially more restrictive than that consensus, because it prohibits even “starter/idea” use and goes further by strongly discouraging copyediting/tone/formatting with LLMs:
Editors are strongly discouraged from using large language models to copyedit...
If the intent is to align with the earlier consensus, it may be worth explicitly stating that assistive uses that don’t outsource the editor’s reasoning (especially accessibility/translation-adjacent cases) are not what the guideline is trying to discourage.Grudzio240 (talk)09:25, 13 January 2026 (UTC)[reply]
And I have one more concern about the “copyedit/tone/formatting” section: it reads as shifting the downside risk onto the editor in a way that can chill legitimate assistive use. The proposal first strongly discourages even cosmetic LLM assistance, and then says that editors who do so “should be understanding” if their LLM-reviewed comment “appears to be LLM-generated” and is therefore subject to collapsing/discounting/other remedies.
Editors who choose to do so despite this caution … should be understanding if their LLM-reviewed comment/complaint/nomination etc. appears to be LLM-generated and is subject to the remedies listed above.
That framing seems to pre-emptively validate adverse outcomes based on appearance (“looks LLM”) rather than on whether the editor’s reasoning is their own. If the intent is for accessibility/meaning-preserving assistance to remain acceptable, it may be worth rewording this to avoid implying that a “looks LLM” judgment is presumptively correct, and explicitly protecting meaning-preserving copyedits/formatting from being treated as fully LLM-generated.Grudzio240 (talk)09:31, 13 January 2026 (UTC)[reply]
Revision 3

Updated based on the ongoing feedback. Relevant diffs:[3][4] Please review and give feedback, particularly Levivich and WhatamIdoing.Athanelar (talk)15:42, 13 January 2026 (UTC)[reply]

Like the idea of the different prompt examples. That said, if someone is writing I understand that edit warring and insulting other editors was disruptive, and that in the future I plan to avoid editing disputes which frustrate me in that way to prevent a repeat of my conduct, and that I am willing to accept a voluntary 1RR restriction if it will help with my unblock, it seems like they could just... say that, instead, without AI, and that doing so would be more likely to produce a positive outcome.Gnomingstuff (talk)15:52, 13 January 2026 (UTC)[reply]
I have added the explanatory paragraph "These are examples of a prompt that would result in an obviously unacceptable output and a prompt that would result in a likely acceptable one, to act as guidance for editors who might use LLMs. They should not be taken as a standard to measure against, nor is the prompt given necessarily always going to correlate with the acceptability of the output. Whether or not the output falls afoul of this guideline depends entirely on whether it demonstrates that it reflects actual thought and effort on the part of the editor and is not simply boilerplate."Athanelar (talk)16:28, 13 January 2026 (UTC)[reply]
I fully support it as is. And to address @Gnomingstuff:, it is an example of an acceptable prompt.Nononsense101 (talk)16:20, 13 January 2026 (UTC)[reply]
I appreciate the effort but I'm probably not the best person to give feedback, given I think (1) there shouldn't be a new guideline at all (Wikipedia needs fewer WP:PAGs, not more); (2) there shouldn't be a new guideline about "LLM communication" (as opposed to about LLM use in mainspace or LLM translation); (3) "Large language models are unsuited for and ineffective at accomplishing this, and as such using them to generate user-to-user communication is forbidden." is a deal breaker for me, in principle (I don't agree it's ineffective or unsuited or that it should be forbidden); (4) I do not support "a prohibition against outsourcing one's thought process to a large language model"; (5) I do not support "Editors are not permitted to use large language models to generate user-to-user communications"; (6) I do not agree with "It is always preferable to entirely avoid the use of LLMs and instead make the best effort you can on your own"; (7) the entire section "Large language models are not suitable for this task" is basically wrong, including "Large language models cannot perform logical reasoning" (false/misleading statement, they do perform some logical reasoning); and (8) I disagree with the entire section "Anything an LLM can do, you can do better". This is a guideline that says, in a nutshell, LLMs are bad and you shouldn't use them, and since I think LLMs are good, and people should use them, I don't think we're going to find a compromise text here. For me. But I'm just one person.Levivich (talk)18:34, 13 January 2026 (UTC)[reply]
Understandable. So long as your disagreements are ideological and not "there's a fundamental contradiction" or the like, that's still a good indication for me that I'm headed in the right direction. Much appreciated.Athanelar (talk)18:44, 13 January 2026 (UTC)[reply]
I don't think this is an ideological disagreement. (Some proponents of a ban on LLMs may be operating from an ideological position; consider what Eric Hoffer said about movements rising and spreading without a God "but never without belief in a devil". AI is the devil that they blame for many problems.) I do think that as someone hoping to have a successful WP:PROPOSAL, it's your job to seek information about what the sources of disagreement are, and to take those into account as much as possible, so that you can increase your proposal's chance of success (which I currently put at rather less than 50%, BTW). Feedback is a gift, as they say in software development.
For example:
  1. Levivich expresses concerns about the proliferation of new guidelines. There have been several editors saying things like that recently. Do you really, really, really need a {{guideline}} tag on this? Maybe you should consider alternatives, like putting it in the project namespace and waiting a bit to see if/how editors use it.
  2. He wonders whether a new guideline against "LLM communication" should be prioritized over AI problems in the mainspace. What are you going to say to editors who look at your proposal and say that it's weird to advocate for a total ban on the Talk: pages, when it's still 'legal' to use AI in the mainspace? You don't have to agree with him, but you should consider what he's telling you and think about whether you can re-write (or re-schedule) to defend against this potential complaint.
  3. Your statement that "Large language models are unsuited for and ineffective at accomplishing this" is a claim of fact (getting us back to that ideology power word: opponents of LLMs are entitled to their own opinions, but not to their own facts). Are LLMs really unsuited and ineffective? Can you back that up with sources? Does it logically follow from "success depends on the ability of its participants to communicate" that a tool helping people communicate is always going to be ineffective at accomplishing our goals?
    • What if the use of AI in a particular instance is "Dear chatbot, please re-write the following profanity-laced tirade so that it is brief and polite, because I am way too angry to do this myself"? Does that interfere with the goal of "civil communication"? Or would that use of a chatbot actually improve compliance with ourWikipedia:Civility policy? Is it really true that "Anything an LLM can do, you can do better" – right now?
    • What if the use is a newbie who is pointing out a problem and who used an LLM to try to present their information as "professionally" as possible? What I'm seeing in my news feed is that Kids These Days™ aren't doing so well with reading and writing in school. Does trying to communicate clearly interfere with our goals of reaching consensus, resolving disagreements, and finding solutions?
    • What if the realistically available alternatives are also less than ideally effective? You've added a paragraph about dyslexia and English language learners (thank you), but how is the average editor supposed to know whether the person has a relevant limitation? For comparison, many years ago, we briefly had an editor who typedallthewordstogetherlikethis and said that pressing the space bar was painful due to Repetitive strain injury, which he thought we should accept on talk pages as a reasonable accommodation for his disability. I never have been able to decide whether he had a surprisingly inflated sense of entitlement or if it was a piece of performance art, but we sent him on his way with a recommendation to look into speech-to-text software. Thinking back, I'd have preferred that he use an LLM. It would have been more effective at supporting communication than what he was doing. But: If he was here today, and used an LLM today, how would the other editors know that he had (in his opinion) a true medical reason for using an LLM? More importantly, if LLMs are effective for those groups of people, does that invalidate the factual claim that LLMs are "unsuited for and ineffective at" discussions?
You should go through the rest of Levivich's feedback and see whether there is any adjustment you can make that might reduce the likelihood that anyone else would vote against your proposal on the same grounds. Can you re-write it to be less strident? Less absolute?
Or take the opposite approach: Write an essay, and tell us how you really feel. Don't say "The substance of this guideline is a prohibition against outsourcing one's thought process to a large language model"; instead say something like "Whenever I see LLM-style comments on talk pages, I feel like I'm talking to a machine instead of a human. I worry that if you aren't writing in your own words, you won't read or understand my reply. I worry that if you're misunderstanding something, you won't care – you'll just tell the LLM 'she said I'm wrong; write a reply that explains why I'm right anyway'. That's not what I'mWP:HERE for."WhatamIdoing (talk)20:51, 13 January 2026 (UTC)[reply]
This is all very good, and gives me something to work with for another round of improvements on this thing, so I appreciate it greatly. One thing I want to address specifically in this reply is the question of "why a guideline rather than an essay in WPspace?" and the answer is that while I absolutely do have a lot to say about LLMs on Wikipedia, I want to materially improve the situation by doing something about it, not just vent. The community norm is already against the use of LLMs in talk pages. People who use LLMs for that pretty much universally get told "hey, quit it" so I thought it would be sensible to make the unwritten rule written rather than having it exist in this nebulously-enforceable grey area.Athanelar (talk)21:16, 13 January 2026 (UTC)[reply]
It's not an unwritten rule. It's written down at WP:AITALK, which is already a guideline. So – why another guideline? Why a separate guideline?WhatamIdoing (talk)21:18, 13 January 2026 (UTC)[reply]
AITALK doesn't forbid the use of LLMs for discussions; it merely suggests that they may be hatted (which, by the way, if you didn't notice, I also changed my 'should' to 'may'). The only time LLM use in talk pages tends to escalate to sanctions is when a user persistently lies about it, which, to be fair, is common, but what I'm proposing is that any persistent (i.e., continuing after being notified and obviously subject to the limited carveouts) LLM usage for discussions should be considered disruptive. As my title here says, it's more WP:LLMCOMM than it is WP:AITALK. LLMCOMM begins with the sentence Editors should not use LLMs to write comments generatively, and my whole goal here is to basically turn that 'should' into a 'must' (while giving reasoning, addressing loopholes, and also synthesising AITALK into it to provide remedies/sanctions for the prohibited action).Athanelar (talk)21:23, 13 January 2026 (UTC)[reply]
I have been looking at this with fresh eyes, and I think that the entire ==Large language models are not suitable for this task== section can be safely removed.
Overall, I feel like the bulk of the page is trying to persuade the reader to hold the Right™ view, instead of laying out our (proposed) rules.
The ===Boldness is encouraged and mistakes are easily fixed=== subsection is irrelevant. Boldness is encouraged in articles. Mistakes can be fixed in articles (though if you're listening to what people are saying about fixing poor translations and LLM-generated text, "easily" is not true). In the context of user-to-user communication, boldness has costs, and some mistakes are not fixable. Maybe a decade ago, we had an influx of Indian editors (a class?) who had some problems, and in a well-intentioned effort to be warm and friendly, they addressed other editors as "buddy" (e.g., "Can you help me with this, buddy?"). This irritated some editors to the point that there were complaints about the whole group being patronizing, rude, etc. As the sales teams say, you only have one chance to make a first impression. Even if you're just trying to fix grammar errors and simple typos, the Halo effect is real, and it is especially real in a community that takes pride in our brilliant prose (←the original name for Wikipedia:Featured articles). A well-written comment really does get a better reception here than broken English or error-filled posts.
Also, "using an LLM to communicate on your behalf on Wikipedia fails to demonstrate that you...have the required competence to communicate with other editors" might feelableist to people with communication disorders. The link toWikipedia:Not compatible with a collaborative project is misleading (it's about people who are arrogant, mean, or think they should be exempt from pesky restrictions like copyrights; it's not about people who are trying to cooperate but struggle to write in English).
I have been thinking about an essay along these lines:
How to encourage non-AI comments
There are practical steps experienced editors can take to encourage non-AI participation.
  • Please do not bite the newcomers. People who use AI regularly are often surprised that this community rejects most LLM-style content. Gently inform newcomers about the community's preferences.
  • Focus on content, not on the contributor or your perception of their skills. Don't tell newcomers that the Wikipedia:Competence is required essay says they have to be able to communicate in English. Kind and helpful responses to broken English, machine translation, non-English comments, typos, and other mistakes encourage people to participate freely. If people see that well-intentioned comments written in less-than-perfect English sometimes produce rude responses, they will be more motivated to use AI tools.
  • Accept mistakes, apologies, corrections, and clarifications with grace. Ask for more information if you think the person's comment doesn't make sense. Ask for a short summary if it is particularly long.
 
but I'm not sure it would actually help. People who are most irritated by "AI slop" don't automatically all have the social and emotional skills to be patient with the people who are irritating them.
I've posted a much shorter (~20%) and softer version of this proposal in my sandbox. I tried to remove persuasive content and examples from the mainspace, as well as shortening the few explanations that I kept. I also added practical information for experienced editors (so we're permitting dyslexic editors to use LLMs, but you're permitted to HATGPT, so...let's at least not edit war?). Maybe the contrast between the two will be informative.WhatamIdoing (talk)19:53, 14 January 2026 (UTC)[reply]
I much prefer WAID's version as it restricts itself to the point and doesn't preach or demonise anyone or anything. I would, though, rephrase the authorised uses section so as to focus on the uses rather than actions, advice or specific conditions. Perhaps something like:
The following uses are explicitly permitted:
  • Careful copyediting: You may use an LLM to copyedit what you have written (for example to check your spelling and grammar), but you must always check the output as the tools sometimes change the meaning of a sentence.
  • As an assistive technology: If you have a communication disorder, for example severe dyslexia, LLM tools are permitted as a usefulassistive technology. You are not required to disclose any details about your disability.
  • Translation. People with limited English, including those learning the language, may use AI-assisted machine translation tools (e.g., DeepL Translator) to post comments in English. Please consider posting both your original text plus the machine translation.
You are not required to state why you are using an LLM but in some cases doing so may help other editors understand you.
That last sentence in particular needs significant improvement though.Thryduulf (talk)00:37, 15 January 2026 (UTC)[reply]
I do plan to synthesise some of WAID's into mine, but I still have major issues with the suggestions for how to handle some of these carveouts, because they provide any bad-faith editor (which, given the number of people I see lie about using LLMs, is a lot) a get-out-of-jail-free card. Or rather, it means we essentially can't enforce the guideline in good faith at all. We can't simultaneously say "you shouldn't generate comments with LLMs" and also say "but if you have certain exempting circumstances, you can essentially do whatever you want with LLMs with no disclosure whatsoever", because it makes it impossible for us to enforce against users using LLMs 'wrong' without inevitably catching, for example, a dyslexic editor who decides they want an LLM to compose their entire comment and so it sounds 100% AI-generated.Athanelar (talk)02:28, 15 January 2026 (UTC)[reply]
Yes, this is a problem. We can declare a total ban and thereby officially write discrimination against people with disabilities and English language learners into our guidelines.
Alternatively, we can permit reasonable accommodations and give editors no way to be certain that the person using it truly qualifies for it. We can predict that we will have a number of emotional support peacocks in addition to people who don't know that it's banned, people who legitimately do fall into one of the reasonable exceptions, some rule-breaking jerks, and some people who believe that what they're doing is reasonable (in their eyes) and therefore the community's rule is unreasonable and shouldn't be enforced against them. (I'm pretty sure psychology has a name for the belief that rules don't apply to you unless you agree with/consent to them, but I don't remember what the word is.)
Plus, of course, no matter what we write, there would still be the problem of editors incorrectly hatting comments written by English language learners and autistic editors, because AI-generated text resembles some common ESL and autistic writing styles (e.g., simpler sentence structure).
Ultimately, to loosely adapt Blackstone's formulation, the question is: Would you rather discriminate against autistic people and people with limited English proficiency, or would you rather give the benefit of the doubt to lazy liars?WhatamIdoing (talk)02:59, 15 January 2026 (UTC)[reply]
I support revision 3 as is, without any changes that would further weaken its language. Having seen how LLM use is currently being handled by the community at other venues, including article talk pages, content-related noticeboards, and WP:ANI, my impression is that the discussion here is not representative of the community sentiment toward LLM use as a conduct issue, which is much more negative than is being portrayed here. A request for comment will invite input from the editors who spend more time resolving issues resulting from LLM use but do not closely follow all of the relevant village pump discussions. — Newslinger talk05:11, 15 January 2026 (UTC)[reply]
You'd think so, but there are large numbers of people who don't seem to be aware they happened (plural)Gnomingstuff (talk)06:49, 15 January 2026 (UTC)[reply]

Request for Comment


Should the proposed guideline atUser:Athanelar/Don't use LLMs to talk for you be accepted?

Please see the collapsed sections above for the pre-RfC workshopping and discussion of the topic.

Please indicate whether you Support or Oppose the proposed guideline and why.Athanelar (talk)19:37, 15 January 2026 (UTC)[reply]

  • Oppose for all the reasons that WhatamIdoing explained in the discussion far more eloquently than I can. It's not a guideline to help editors understand the issues and good practice around LLM use on talk pages; it's an overly long essay proselytising about the evils of LLMs (well, that's a bit hyperbolic, but not by huge amounts). Don't get me wrong, we should have a guideline in this area, but this is not it.Thryduulf (talk)19:44, 15 January 2026 (UTC)[reply]
  • Support I see no problem with it.Nononsense101 (talk)19:51, 15 January 2026 (UTC)[reply]
  • Oppose the guideline per WhatamIdoing and support her alternative proposal at User:WhatamIdoing/Sandbox. This policy conflates all sorts of problems with AI (what is the section User:Athanelar/Don't use LLMs to talk for you#Yes, even copyediting doing here when the substance of that section is about copyediting article text in a guideline that is about talk page comments?), makes a number of dubious claims about LLMs that, rather than being supported by evidence, are supposed to be taken on faith, and is once again either dubiously unclear or internally contradictory (the claim that the guideline does not aim to restrict the use of LLMs [for those with certain disabilities or limitations], for example). This would be great as an WP:Essay, but definitely not as a guideline.Katzrockso (talk)23:03, 15 January 2026 (UTC)[reply]
    what is the section User:Athanelar/Don't use LLMs to talk for you#Yes, even copyediting doing here when the substance of that section is about copyediting article text in a guideline that is about talk page comments? Showing that LLMs have trouble staying on task when copyediting is relevant regardless of where that copyediting takes place, whether it's in article text or talk page comments. It's a supplement to the caution in the 'Guidance' section about using LLMs to cosmetically enhance comments.Athanelar (talk)23:11, 15 January 2026 (UTC)[reply]
    Elsewhere I have used LLMs to copyedit a few times and I have noticed this phenomenon (LLMs making additional changes beyond what you asked) using the freely available LLMs (I believe that the behavior of models is wildly variable so I cannot speak about the paid ones, which I refuse to pay for on principle). However, this was not a problem when I gave the LLM more specific instructions (i.e. do not change text outside of the specific sentence I am asking you to fix). The gist of the argument in that section is a non sequitur: from the three examples given, the conclusion LLMs cannot be trusted to copyedit text and create formatting without making other, more problematic changes does not follow.Katzrockso (talk)23:41, 15 January 2026 (UTC)[reply]
    Athanelar, guidelines don't normally spend a lot of time trying to justify their existence. Think about an ordinary guideline, like Wikipedia:Reliable sources. You don't expect to find a section in there about what would happen to Wikipedia if people used unreliable sources, right? This kind of content is off topic for a guideline.WhatamIdoing (talk)00:24, 16 January 2026 (UTC)[reply]
    Sure, but LLMs are a topic that people are uniquely wont to quibble about, whether because their daily workflow is already heavily LLM-reliant or simply because they have no idea why anybody would want to restrict the use of LLMs. I think it's sensible to assume that our target audience here will be people who aren't privy to LLM discourse, especially Wikipedia LLM discourse, and so some amount of thesis statement is sensible.Athanelar (talk)01:44, 16 January 2026 (UTC)[reply]
    I think that separating simple clearly defined guidelines from longer essays justifying the reasoning would be the prudent thing to do here. Some people just need to be told that no means no without getting bogged down in the weeds, while others might genuinely be interested in a deeper understanding of the issues.ChompyTheGogoat (talk)09:25, 28 January 2026 (UTC)[reply]
  • Oppose We should do something, but this manifesto isn't it. For example:
    • This is supposed to be about Talk: pages, and it spends 200+ words complaining about LLMs putting errors into infoboxes and article text.
    • Sections such as A large language model can't be competent on your behalf repeatedly invoke an essay, while apparently ignoring the advice in that same essay (e.g., "Be cautious when referencing this page...as it could be considered a personal attack"). In fact, that same essay says If poor English prevents an editor from writing comprehensible text directly in articles, they can instead post an edit request on the article talk page – something that will be harder for editors to do, if they're told they can't use machine translation because the best machine translation for the relevant language pair now uses some form of LLM/AI – especially DeepL Translator.
    • Anything an LLM can do, you can do better is factually wrong. It's a nice slogan, but LLMs are better at some tasks than humans, for the same reason that Wikipedia:Bots are better at some tasks than humans.
  • Overall, this is an extreme, maximalist proposal that doesn't solve the problems and will probably result in more drama. In particular, if adopted, I expect irritable editors to improperly revert comments that sound like they were LLM-generated (in their personal opinion) when they shouldn't. IMO "when they shouldn't" includes comments pointing out errors and omissions in articles, people with communication disorders such as severe dyslexia (because they'll see "bad LLM user" and never stop to ask why they used it), people with autism (whose natural, human writing style is more likely to be mistaken for LLM output), and people who don't speak English and who are trying to follow the WP:ENGLISHPLEASE guideline.WhatamIdoing (talk)23:07, 15 January 2026 (UTC)[reply]
    This has probably (hopefully?) been mentioned elsewhere and I just haven't found it yet, but there needs to be a clear distinction drawn between simple tools that correct spelling and grammar or provide direct translation vs LLMs that actually generate original (using the term loosely) content. I can't imagine anyone taking issue with using spellcheck. Direct translations are usually sufficiently intelligible for a simple request like the examples given here when they're in common languages, though that may be less true if the original language is obscure, and I don't know whether or not LLMs are any better at those. Personally I'm quite willing to engage in conversations via simple translations and attempt to work out any errors, whereas blatant LLM text from a user that can't possibly understand what it's saying on their behalf is very off-putting - and concerning IMHO. I would suggest a simple requirement for a disclaimer when someone is using any method of translation for their comments to help prevent misunderstandings. I would also like to see a preference for simple tools, but in the name of WP:ACCESS LLMs could be permitted if the user truly feels they cannot explain themselves sufficiently without it, with emphasis on minimizing use as much as possible. That would require an assumption of good faith as no one else is entitled to make such a judgement call on someone else's behalf. Of course this still only applies to TPs etc, not article content.ChompyTheGogoat (talk)10:03, 28 January 2026 (UTC)[reply]
  • Oppose We should be judging edits and comments on their actual content, not on whether any of it appears to be LLM-generated. -Donald Albury00:31, 16 January 2026 (UTC)[reply]
    I agree in principle. That said, from the discussion above, I do think we need to redesign the unblock process to make it less dependent on English skills, because needing to post a well-written apology is why many people turn to their favorite LLM. I'm looking at the Wikipedia:Unblock wizard idea, which I think is sound, but it still wants people to write "in your own words". For most requests, it would probably make more sense to offer tickboxes, like "Check all that apply: □ I lost my temper.  □ I'm a paid editor.  □ I wrote or changed an article about myself, my friends, or my family.  □ I wrote or changed an article about my client, employer, or business" and so forth.WhatamIdoing (talk)00:40, 16 January 2026 (UTC)[reply]
    From my limited experience reading unblock requests, it appears that the main theme that administrators are looking for is admission of the problem that led to the block and a genuine commitment to avoiding the same behavior in future editing. I think some people might object to such a formulaic tickbox (likely for the same reasons they oppose the use of LLMs in unblock requests) as it removes the ability of editors to assess whether the appeal is 'genuine' (whether editors are reliable arbiters of whether an appeal is genuine or not is a different question), which is evinced by the wording and content of the appeal.Katzrockso (talk)01:25, 16 January 2026 (UTC)[reply]
    I think we need to move away from a model in which we're looking for an emotional repentance and towards a contract or fact-based model: This happened; I agree to do that.WhatamIdoing (talk)04:06, 16 January 2026 (UTC)[reply]
    I think the key thing that needs to be communicated is that they understand why they were blocked. Not just an "I got blocked for edit warring" but an "I now understand edit warring is bad because...". Agreeing on what happened is a necessary part of that (if you don't know why you were blocked you don't know what to avoid doing again) but not sufficient, because if you don't understand why we regard doing X as bad, then you're likely to do something similar to X and get blocked again.Thryduulf (talk)04:33, 16 January 2026 (UTC)[reply]
    @WhatamIdoing, I have good news for you: most or even all of us already use the model you desire (for which tickboxes would be effectively useless). You may be interested in watching CAT:RFU yourself to see how these work out. I've also got an in-progress guide on handling unblocks. --asilvering (talk)17:43, 16 January 2026 (UTC)[reply]
    My thought with tickboxes is that there is no opportunity to use an LLM when all you're doing is ticking a box.
    I partly agree with your view that "It doesn't matter whether they're sorry". It doesn't matter in terms of changing their behavior, but it can matter a lot in terms of restoring relationships with any people they hurt. This is one of the difficulties.WhatamIdoing (talk)17:50, 16 January 2026 (UTC)[reply]
    Sure, there's no opportunity to use an LLM. But then we have exactly the same problem that we have when they're using LLMs: we don't actually know that they understand anything at all. --asilvering (talk)18:38, 16 January 2026 (UTC)[reply]
    I think it depends on what you put in the checkboxes. Maybe "□ I believe my actions were justified under the circumstances" or "□ I was edit warring, but it was for a good reason".WhatamIdoing (talk)20:28, 16 January 2026 (UTC)[reply]
    Are we still talking about language barriers, or just trolls? Because if someone doesn't have a firm grasp on English it's quite possible they also don't grasp the underlying problem behind the block for that reason, as opposed to being an ESL troll. Neither tick boxes nor LLM rules can solve that. Perhaps (and feel free to correct me if such things already exist) there needs to be an easy and obvious way to request translation assistance regarding blocks and bans, so the translator can ensure the user knows what they're being told and has communicated the extent of their understanding in their own words? I know that might cause a delay, but if users choose to bypass it because they're impatient they'd need to live with the consequences.ChompyTheGogoat (talk)10:14, 28 January 2026 (UTC)[reply]
    We're talking about a general purpose unblock request system. Currently, there is no easy and obvious way to do anything related to unblock requests.WhatamIdoing (talk)19:04, 28 January 2026 (UTC)[reply]
    I have no experience with the process, but there's generally a template involved with blocks, right? My recommendation would be to add something along the lines of a warning that their request/response may have permanent consequences and should be in their own words, plus a link to wherever translation requests are usually handled. It won't make any difference with bad actors, but might help when someone with a language barrier is acting in good faith and doesn't realize that a well-phrased LLM response will hurt their case.ChompyTheGogoat (talk)11:48, 29 January 2026 (UTC)[reply]
  • I'd support it if it were tweaked First, a preamble. We continue to nibble around the edges of the LLM issue without addressing the core issues. I still think we need to make disclosure of AI use mandatory before we're going to have any sort of effective discussion about how to regulate it. You can't control what you don't know is happening. That might take software tools to auto-tag likely AI revisions, or us building a culture where it's okay to use LLMs as long as you're being open about it.
    General grumbles aside, let's approach the particular quibbles with this proposal. This guideline is contradictory. The lead says that using LLMs is forbidden...but the body is mostly focused on trying to convince you that LLM use is bad. It's more essay than guideline. I also think that it doesn't allow an exemption for translation, which is...let's be honest...pervasive. Saying you can't use translation at all to talk to other editors will simply be ignored. I think this needs more time on the drawing board, but I'd tentatively support this if the wording was "therefore using them to generate user-to-user communication is strongly discouraged." rather than forbidden.CaptainEekEdits Ho Cap'n!01:33, 16 January 2026 (UTC)[reply]
    The text of the guideline explicitly allows translation subject to review, but if that's unclear, that's a problem in itself.Athanelar (talk)01:45, 16 January 2026 (UTC)[reply]
    Just one small point, but from a literal reading of two current rules, you are already required to disclose when you produce entirely LLM-generated comments or comments with a significant amount of machine-generated material; the current position of many Wikipedia communities (relevantly, us and Commons) is that this text is public domain, and all editors, whenever they make an edit with public domain content, "agree to label it appropriately".[5] Therefore, said disclosure is already mandatory - mainspace, talkspace, everywhere. The fact that people don't disclose, despite agreeing that they will whenever they save an edit, is a separate issue to the fact that those rules already exist.GreenLipstickLesbian💌🧸06:31, 16 January 2026 (UTC)[reply]
    @GreenLipstickLesbian, I think that's a defensible position, but not one that will make any sense to the vast majority of people who use LLMs. So if we want people to disclose that they've used LLMs, we have to ask that specifically, rather than expecting them to agree with us on whether LLM-generated text is PD. --asilvering (talk)18:40, 16 January 2026 (UTC)[reply]
    @Asilvering Yes, but the language not being clear enough for people to understand is, from my perspective, a separate issue from whether or not the rule exists. We don't need to convince editors to agree with us that LLM-generated text is PD, just the same way I don't actually need other editors to agree with me on whether text they find on the internet is public domain or that you can't use the Daily Mail for sensitive BLP issues - there just needs to be a clear enough rule saying "do this", and they can follow it and edit freely, or not and get blocked.
    And just going to sandwich on my point to @CaptainEek - it is becoming increasingly impossible to determine if another editor's text in any way incorporates text from an LLM, given their ubiquity in translator programs and grammar/spellcheck/tone-checking programs, which even editors themselves may not be aware use such technology. So LLMDISCLOSE, as worded, will always remain unenforceable and can never be made mandatory - and that's before getting into the part where it says you should say what version of an LLM you used, when a very large segment of the population using LLMs simply is not computer literate enough to provide that information. (Also, I strongly suspect that saying "I used an LLM to proofread this" after every two-line post which the editor ran through Grammarly, which is technically what LLMDISCLOSE calls for, would render the disclosures somewhat equivalent to the Prop 65 labels - somewhere between annoying and meaningless in many cases, and something which a certain portion of editors would stick on the end of every comment because they believe that's less likely to get them sanctioned than forgetting to mention they had Grammarly installed.)
    However, conversely, what the average enWiki editor cares about is substantial LLM interference - creation of entire sentences, extensive reformulation - aka, the point at which the public domain aspect of LLM text and the PD labeling requirement start kicking in. It's not a perfect relationship, admittedly, but it covers the cases that I believe most editors think should be disclosed, while leaving alone many of the LLM use cases (like spellcheck, limited translation, formatting) that most editors are fine with or can, at the very least, tolerate.GreenLipstickLesbian💌🧸19:35, 16 January 2026 (UTC)[reply]
    Perhaps a mandatory checkbox:
    I confirm that no part of this text was generated by AI/LLM, with an accompanying "What's this?" explanation script. A second option disclosing AI use (actual generated content) could require an additional review process, and that delay might help discourage it.ChompyTheGogoat (talk)10:31, 28 January 2026 (UTC)[reply]
    People will tick whatever boxes they need to tick to accomplish their goal. If they have to swear to hand over their first-born child to post something, they'll do that.WhatamIdoing (talk)19:05, 28 January 2026 (UTC)[reply]
    Longer explanation now at Wikipedia:Checking boxes encourages people to tell lies.WhatamIdoing (talk)20:53, 28 January 2026 (UTC)[reply]
    People will (and do) lie in any format if they're acting in bad faith. @GreenLipstickLesbian said So LLMDISCLOSE, as worded, will always remain unenforceable and can never be made mandatory...when a very large segment of the population using LLMs simply is not computer literate enough to provide that information.
    This would make the point clear every time something is posted (with the "What's this" explanation being an important inclusion), providing simple justification for deleting/ignoring if they DO lie, which is the bigger purpose. "You checked the box agreeing that you understand the policy and did not use AI, then copypasta'd blatant slop anyway. Into the bin." Done.
    People who truly believe they're using it ethically could agree to the additional review option, so they don't feel forced into the "lie or don't post anything" corner. And I don't think built-in review tools are strictly necessary, because if it's just copyediting their own work it's not likely to cause the same issues as full-on intentional generation. Reviewers can use their own judgement and let it slide or maybe just give a warning if the content itself is acceptable. It usually won't be obvious that an LLM was used in that case anyway, which is kind of the point. We want to discourage problematic use and minimize time and energy spent on trolls without violating WP:BITE.ChompyTheGogoat (talk)12:11, 29 January 2026 (UTC)[reply]
    If the "reviewers" (you mean "ordinary editors who are supposed to befocusing on content", right?) take the "simple" step of reverting or warning people who use AI tools under this proposal, they'd probably be screwing up. This proposal would explicitly authorize their use for people who have relevant disabilities or are English language learners. Therefore, complying with the proposed guideline would probably mean that your first response as a reviewer would be a polite question: "Did you use ChatGPT or a similar tool to write this? We don't normally like those on wiki, though we make a few exceptions as explained inUser:Athanelar/Don't use LLMs to talk for you#Caution."WhatamIdoing (talk)19:00, 29 January 2026 (UTC)[reply]
    Sorry, I got a bit off track - I was referring to the hypothetical checkbox option and people who DON'T disclose more subtle uses. Theoretically that could qualify it to be summarily deleted for lying, the same way a wall of slop would be under the same potential policy, but editors could give more leeway if it's less obvious and/or possibly unintentional. By "reviewers" I meant volunteers for the additional AI process, or anyone who notices that it's been dodged. Ideally people who are using it legitimately and knowingly would disclose and go through that process, so not doing so would be either a mistake or bad faith. If it's ChatGPT copypasta it's hard to claim that checking the "no AI" box was a mistake because they didn't know.ChompyTheGogoat (talk)23:25, 29 January 2026 (UTC)[reply]
    Do you want to check a "no AI" box every time you post a comment? I don't.WhatamIdoing (talk)23:58, 29 January 2026 (UTC)[reply]
    @GreenLipstickLesbian WP:LLMDISCLOSE isn't mandatory though, just advised. In a system where it is not mandated, it won't be done unless folks are feeling kindly. But I acknowledge that with the current text of LLMDISCLOSE, we could begin to foster a culture that encourages, rewards, and advertises the importance of LLM disclosure. We may need a sort of PR campaign where it's like "are you using AI? You should be disclosing that!" But I think it'd be more successful if we could say you *must*.CaptainEekEdits Ho Cap'n!18:57, 16 January 2026 (UTC)[reply]
    For the most part, people do what's easy and avoid what's painful. If you want LLM use disclosed, then you need to make it easy and not painful. For example, do we have some userboxes, and can that be considered good enough disclosure? If so, let's advertise those and make it easy for people to disclose. Similarly, if we want people to disclose, we have to not punish them for doing so (e.g., don't yell at them for being horrible LLM-using scum).WhatamIdoing (talk)20:18, 16 January 2026 (UTC)[reply]
    Userbox is easy to make and a good start. A checkbox for every edit, like the minor edit checkbox...now that could get us somewhere!CaptainEekEdits Ho Cap'n!21:09, 16 January 2026 (UTC)[reply]
    Maybe... or maybe a checkbox would just get some people to check it always, even if they're not using an LLM (we don't have a rule against false disclosures), and still be ignored by most LLM-using editors.WhatamIdoing (talk)21:23, 16 January 2026 (UTC)[reply]
    One argument I've seen against a per-edit checkbox is that it presumes acceptability. I.e., if you tick the box then it means your use of LLMs was fine because you disclosed it.Athanelar (talk)21:28, 16 January 2026 (UTC)[reply]
    See my above suggestion re: additional review process. That would help to discourage it without an outright ban. It could also include a dropdown menu of common AI programs if someone does disclose it. An example for the explanation box might be something like: Wikipedia strongly encourages editors to use their own words when writing articles or comments and to avoid AI use whenever possible. However, we recognize that some users find value in such tools, so use is permitted with full disclosure. If you are not sure whether a program you use is considered AI, please check the dropdown list below. There is an additional review process for AI content, which may result in a delay.

    Someone else could probably phrase it better - maybe with a mention of accessibility, but you don't want people to assume "I don't have a disability so it doesn't apply to me." Keep it broad. But you get the gist. Heavy emphasis on avoidance but an option to disclose without being punished, and justification for swift consequences if they reject those options in favor of lying. It could also contain a "Why is this required" link, preferably to a new guideline written for that express purpose.

    As far as the review process goes, my suggestion would be to send them into a queue at WP:AIC with any article content requiring review BEFORE it goes live (and maybe a bot could create a TP section for it on the article in question), while TP comments could be posted immediately and just given a quick skim whenever someone gets to them. That might take longer to implement, but it could start as just a simple flag that anyone can glance over if they see it.ChompyTheGogoat (talk)12:53, 29 January 2026 (UTC)[reply]
    Please be mindful of WP:BLUDGEON; I see you're replying a lot in this RfC, especially considering you're proposing something entirely different and unrelated to the topic here.Athanelar (talk)13:05, 29 January 2026 (UTC)[reply]
    I wouldn't say wholly unrelated, though I do see how it's off track in terms of !votes for your proposal. My intention with multiple comments was only to expand on the concept as I see more input from others, not to keep pushing the same point, but I'll step back regardless since it's better suited for discussion elsewhere (somewhere). My brain tends to take off like that whenever something sparks an idea.ChompyTheGogoat (talk)13:29, 29 January 2026 (UTC)[reply]
  • Support, enough is enough and the level of obstructionism is mind-blowing.Gnomingstuff (talk)01:45, 16 January 2026 (UTC)[reply]
  • Unfortunately, there's no foolproof way to tell whether a comment was LLM-generated or not (sure, there are WP:AISIGNS, but again, those are just signs). Agree with Katzrockso that this would work better as an essay than a guideline.Some1 (talk)02:30, 16 January 2026 (UTC)[reply]
  • Support creating a guideline banning LLM usage on talk pages.OpposeUser:Athanelar/Don't use LLMs to talk for you for being too verbose and complex. Perhaps we can just add a paragraph or two to an existing talk page guideline. –Novem Linguae(talk)05:39, 16 January 2026 (UTC)[reply]
  • Oppose this proposal, wouldsupport a simpler proposal, especially a simple addition to existing guidelines as NL says above.~~ AirshipJungleman29 (talk)13:30, 16 January 2026 (UTC)[reply]
    'Simpler' in terms of scope or prose?Athanelar (talk)13:53, 16 January 2026 (UTC)[reply]
  • Oppose. Too long, and I don't think a fourth revision would address the problems; this is trying to do too much, some of which is unnecessary and some of which is impossible to legislate. I agree with those who say a paragraph (or even a sentence) somewhere saying LLMs should not be used for talk page communication would be reasonable.Mike Christie (talk -contribs -library)13:38, 16 January 2026 (UTC)[reply]
  • Support the crux of the proposal, which would prohibit using an LLM to"generate user-to-user communication". This is analogous toWP:LLMCOMM's"Editors should not use LLMs to write comments generatively", and would close the loophole of how the existingWP:AITALK guideline does not explicitly disallow LLM misuse in discussions or designate it as a behavioral problem. A review of theWP:ANI archives shows that editors are regularly blocked for posting LLM-generated arguments on talk pages and noticeboards, and the fact that our policies and guidelines do not specifically address this very common situation is misleading new editors into believing that this type of LLM misuse is acceptable. Editors with limited English proficiency are, of course, welcome to use dedicatedmachine translation tools (such as the ones inthis comparison) to assist with communication. The passage of theWP:NEWLLM policy suggests that LLM-related policy proposals are more likely to succeed when they are short and specific, so I recommend moving most of the proposed document to aninformation or supplemental page that can be edited more freely without needing a community-wide review. — Newslinger talk14:17, 16 January 2026 (UTC)[reply]
    I do wonder if some of the 'overlong' complainants, @Mike Christie @Novem Linguae etc would support the guideline with the explanatory sections removed and only the substance (i.e., the 'Guidance for Editors' section) remainingAthanelar (talk)14:24, 16 January 2026 (UTC)[reply]
    For what it's worth, @Athanelar, I think User:Athanelar/Don't use LLMs to talk for you would make an excellent essay. Agree it's too long to be a Guideline (as learnt from my recent LLM policy RfCs).qcne(talk)14:37, 16 January 2026 (UTC)[reply]
    I said above under version 2 that I don't think much of what is being addressed here is legislatable at all, but if anything is to be added I'd like to see a sentence or two added to a suitable guideline as Novem Linguae suggests. I think making this into an essay is currently the best option. Essays can be influential, especially when they reflect a common opinion, so it's not the worst thing that can happen to your work.Mike Christie (talk -contribs -library)16:41, 16 January 2026 (UTC)[reply]
    @Newslinger, I've read that some of the "dedicated machine translation" tools are using LLMs internally (e.g.,DeepL Translator). Even some ordinary grammar check tools (e.g., inside old-fashioned word processing software like MS Word) are using LLMs now. Many people are (or will soon be) using LLMs indirectly, with no knowledge that they are doing so.WhatamIdoing (talk)17:44, 16 January 2026 (UTC)[reply]
Which is one of the reasons why 1) people who can't communicate in English really shouldn't be participating in discussions on enwiki and 2) people who use machine translation (of any type) really should disclose this and reference the source text (so other users who either speak the source language or prefer a different machine translation tool can double-check the translation themselves). --LWGtalk(VOPOV)17:52, 16 January 2026 (UTC)[reply]
We sometimes need people who can't write well in English to be communicating with us. We need comments from readers and newcomers that tell us that an article contains factual errors, outdated information, or a non-neutral bias. When the subject of the article is closely tied to a non-English speaking place/culture, then the people most likely to notice those problems are those who don't write easily in English. If one of them spots a problem, our response should sound like "Thanks for telling us. I'll fix it" instead of "People who can't communicate in English really shouldn't be participating in discussions on enwiki. This article can just stay wrong until you learn to write in English without using machine translation tools!"WhatamIdoing (talk)19:51, 16 January 2026 (UTC)[reply]
IMO if they are capable of identifying factual errors, outdated information, or non-neutral bias in content written in English, then they should be capable of communicating their concerns in English as well, or at least of saying "I have some concerns about this article, I wrote up a description of my concerns in [language] and translated it with [tool], hopefully it is helpful." With that said, I definitely don't support biting newbies, and an appropriate response to someone who accidentally offends a Wikipedia norm is "Thanks for your contribution. Just so you know, we usually do things differently here, please do it this other way in the future." --LWGtalk(VOPOV)20:04, 16 January 2026 (UTC)[reply]
Because English is thelingua franca of the internet, millions of people around the world use browser extensions that automatically translate websites into their preferred language. Consequently, people can be capable of identifying problems in articles but not actually be able to write in English.WhatamIdoing (talk)20:22, 16 January 2026 (UTC)[reply]
I don't speak Dutch or Portuguese beyond a handful of words, but I can tell you that if I found an article about a political party or similar group saying "De leider is een vreselijke man." or "O líder é um homem horrível." that it needs to be changed. Similarly, I can tell you that an article infobox saying the gradient of a railway is 40% is definitely incorrect. In neither case can I tell you what the text should be changed to, and I can't articulate the problem in Dutch or Portuguese, but I can use machine translation to give any editors there who don't speak English enough of a clue that they can fix it. The same is true in reverse.Thryduulf (talk)21:48, 16 January 2026 (UTC)[reply]
@WhatamIdoing @Thryduulf those are fair points, and situations like that wouldn't bother me (though transparency and posting the source text would still be preferred to reduce possible misunderstandings). --LWGtalk(VOPOV)22:17, 16 January 2026 (UTC)[reply]
I don't know. A recent example: I removed a paragraph containing a hallucinated source from an article here recently. That paragraph had made it to the Korean Wikipedia (I checked dates to confirm the direction of transit), so I removed it there too, and used Google Translate to post an explanation because otherwise it'd look like I was just removing text for no reason.Gnomingstuff (talk)01:38, 18 January 2026 (UTC)[reply]
Yes, I also recognize that many machine translation tools now incorporate LLMs in their implementation. (The other active RfC atWikipedia talk:Translation § Request for comment seeks to address this for translated article content, but not translated discussion comments.) When an editor in an LLM-related conduct dispute mentions that they are using an LLM for translation, I have always responded that there is a distinction between using a dedicated machine translation tool (such asGoogle Translate orDeepL Translator) that aims to convey a faithful representation of one's words in the target language, and an AI chatbot that can generate all kinds of additional content. If someone uses a language other than English to ask an AI chatbot to generate a talk page argument in English, the output would not be acceptable in a talk page discussion. But, if someone uses an LLM-based tool (preferably a dedicated machine translation tool) solely to translate their words to English without augmenting the content of their original message, that would not be a generative use of LLM and should not be restricted by the proposal. — Newslinger talk01:00, 17 January 2026 (UTC)[reply]
Unless there is somereliable way for someone other than the person posting the comment to know which it was then the distinction is not something we can or should incorporate into our policies, etc.Thryduulf (talk)01:02, 17 January 2026 (UTC)[reply]
Oppose this version,support in principle. I agree with every point, but it's overly long and essay-y. I'd back something more likeUser:WhatamIdoing/Sandbox.JustARandomSquid (talk)16:17, 16 January 2026 (UTC)[reply]
Support. I agree with the concerns that it is too long, and certainly far from a perfect proposal, but having something imperfect is better than a consensus against having any regulation at all. I do also agree with Newslinger's proposal of moving the bulk of it to an information page if there is consensus for it.ChaoticEnby (talk ·contribs)17:22, 16 January 2026 (UTC)[reply]
  • Oppose - Too restrictive and long. There is a reasonable way to use LLMs, and this effectively disallows it, which is a step too far. That, coupled with the at-best educated guessing about whether something actually is an LLM and the assumption that it is all unreviewed, makes it untenable.PackMecEng (talk)17:31, 16 January 2026 (UTC)[reply]
  • Support the spirit of the opening paragraph, but too long and in need of tone improvements. Currently the language in this feels like it is too internally-oriented to the discussions we have been having on-wiki about this issue, whereas I would prefer it to be oriented in a way that will help outsiders with no context understand why use-cases for LLMs that might be accepted elsewhere aren't accepted here. The version at User:WhatamIdoing/Sandbox is more appropriate in length and tone, but too weak IMO. I would support WhatamIdoing's version if posting the original text/prompt along with LLM-polished/translated output were upgraded from a suggestion to an expectation. With that said, upgrading WP:LLMDISCLOSE and WP:LLMCOMM to guidelines is the simplest solution and is what we should actually do here. --LWGtalk(VOPOV)17:34, 16 January 2026 (UTC)[reply]
    I would also support those last two proposals, with the first one being required from a copyright perspective (disclosure of public domain contributions) and the second one being a much more concise version of the proposal currently under discussion.ChaoticEnby (talk ·contribs)17:36, 16 January 2026 (UTC)[reply]
    Also, if we did that thenUser:Athanelar/Don't use LLMs to talk for you would be a useful explanatory essay. --LWGtalk(VOPOV)17:38, 16 January 2026 (UTC)[reply]
Support per Chaotic Enby and Newslinger. I don't see an issue with length, since the lead and nutshell exist for this reason, but am fine with some of it being moved to an information page. LWG's idea above is also good, though re LLMDISCLOSE, "Every edit that incorporates LLM output should be marked as LLM-assisted by identifying the name and, if possible, version of the AI in the edit summary" is something nobody is going to do unprompted (and personally I've never seen it done).Kowal2701 (talk)19:01, 16 January 2026 (UTC)[reply]
"Something nobody is going to do unprompted" is true, but it's something people should be doing. Failing to realize you ought to disclose LLM use is understandable, but failing to disclose it when specifically asked to do so is disruptive - there's simply no constructive reason to conceal the provenance of text you insert into Wikipedia. So while I don't expect people to do this unprompted, I think we should be firmly and kindly prompting people to do it. --LWGtalk(VOPOV)19:11, 16 January 2026 (UTC)[reply]
I'd rather have something like "Transparency about LLM use is strongly encouraged.", and we should have practically zero tolerance for people denying LLM use in unambiguous cases; that ought to be met with a conditional mainspace block. I'll be bold and add something.Kowal2701 (talk)20:47, 16 January 2026 (UTC)[reply]
Oppose as written. For an actual guideline, I would prefer something like User:WhatamIdoing/Sandbox. It makes clear the general expectations of the community should it be adopted. This proposal reads like an essay; it's trying to convince you of a certain viewpoint. Guidelines should be unambiguous declarations about the community's policies. For me, the proposed guideline is preaching to the choir; I agree with basically all of it, but I don't see it as appropriate for a guideline. I second what Chaotic Enby, Newslinger, and CaptainEek have said, and absolutely support the creation of a guideline of this nature. --Agentdoge (talk)19:27, 16 January 2026 (UTC)[reply]
  • Support per Newslinger and Chaotic Enby.fifteen thousand two hundred twenty four (talk)19:50, 16 January 2026 (UTC)[reply]
  • Oppose. This guideline is too long and too complicated. The guideline should be pretty simple - the length and the complexity should be similar to User:WhatamIdoing/Sandbox. Explaining the problems with LLMs could be added later in an explanatory essay, but not in this guideline. Provisions that editors are expected to WP:AGF before alleging LLM use should also be made.SunDawnContact me!02:05, 17 January 2026 (UTC)[reply]
  • Weak support for the original proposal because the crux of it is still better than the status quo in spite of its flaws (too long, unfocused, essay-like);Strong supportWhatamIdoing's version, whether or not tweaks happen to it.Choucas0 🐦‍⬛14:32, 17 January 2026 (UTC)[reply]
  • I support a ban on LLM-written communication, but oppose this draft, as I find it to be poorly written in several respects. It is too long. The "Remedies" section entirely duplicates current practice elsewhere, some (maybe all?) of which is already documented in other guidelines. It says something is "forbidden" but then goes on to give an example of how LLMs actually can be used. And it generally contains far too much explaining of its own logic, which makes it much weaker and open to wikilawyering. "Anything an LLM can do, you can do better"? Nope. "Large language models cannot interpret and apply Wikipedia policies and guidelines"? Dubious.Toadspike[Talk]18:22, 17 January 2026 (UTC)[reply]
    Isupport WhatamIdoing's proposal, which essentially addresses all of my concerns with the original proposal by Athanelar.Toadspike[Talk]18:24, 17 January 2026 (UTC)[reply]
Oppose as written, support a ban/regulation on LLM communication. Toadspike and WhatamIdoing expressed my concerns here quite nicely; I think this proposal is not yet ready to become a guideline. For example, it uses examples from article editing for comparisons to communication. I also think that it insufficiently addresses WP:CIVILITY: some people may write in a way others interpret as LLM-generated, and if we fire the harshest wording we have at them we might scare away some great contributors from the project. Therefore I too support WhatamIdoing's proposal. Best,Squawk7700 (talk)22:23, 17 January 2026 (UTC)[reply]
Just to be clear: The thing I bashed together in my sandbox is IMO not ready for aWP:PROPOSAL. I created it only as an illustration of what could be done.WhatamIdoing (talk)23:10, 17 January 2026 (UTC)[reply]
Thanks for the clarification! I myself meant it as more of a literal proposal (not a WP:PROPOSAL); I can't speak for the others here, of course, and I do think we'd probably need to do some workshopping. That said, I think you already did quite a good job on that draft. Kind regardsSquawk7700 (talk)23:21, 17 January 2026 (UTC)[reply]
Now I feel bad knowing I supported someone's bashed together draft over a third iteration proposal.JustARandomSquid (talk)23:49, 17 January 2026 (UTC)[reply]
Don't feel bad. I've been writing Wikipedia's policies and guidelines for longer than some of our editors have been alive, and we spent hours discussing the problems we're having with AI comments before Athanelar launched this RFC. Given all that, it would be surprising if I couldn't throw together something that looks okay.WhatamIdoing (talk)23:59, 17 January 2026 (UTC)[reply]
Any outcome here which results in more restriction against LLM usage is a positive one for me. I might not be fully satisfied with WAID's approach to the matter, but more community consensus against AI will only ever be an improvement as far as I'm concerned. I'll be happy if my RfC leads to that, whether I wrote the final result is irrelevant.Athanelar (talk)02:19, 18 January 2026 (UTC)[reply]
It could use some tweaks but I definitely think you're closer to a viable solution. It needs to be simple and concise, and it needs to recognize that some people ARE going to use them regardless of what we decide, so a blanket ban probably isn't the best way to go. There's nothing preventing others from composing longer essays on the matter, but newbies and casual users probably aren't going to read something that long.ChompyTheGogoat (talk)13:20, 29 January 2026 (UTC)[reply]
Regarding examples from articles: Finding an example of a talk page edit of this nature would be difficult bordering on impossible; people aren't supposed to edit others' comments, and they almost never do except to vandalize them or occasionally fix typos.Gnomingstuff (talk)23:33, 18 January 2026 (UTC)[reply]
Oppose. Most of the section "Large language models are not suitable for this task" digresses from the main topic and relies on a very outdated understanding of what LLMs can do. Among its issues, the idea that LLMs only repeat text from their training data is untenable in 2026. The section also completely ignores LLM fine-tuning. However, like dlthewave, I wouldSupport something similar toWhatamIdoing's sandbox, which is well-reasoned, properly nuanced and relatively concise.Alenoach (talk)20:37, 18 January 2026 (UTC)[reply]
[It] relies on a very outdated understanding of what LLMs can do.[citation needed] Among its issues, the idea that LLMs only repeat text from their training data is untenable in 2026.[citation needed]
SuperPianoMan9167 (talk)21:08, 18 January 2026 (UTC)[reply]
For example, it claims that LLMs are not able to perform arithmetic operations, and that they instead only retrieve memorized results. But what if you ask it to calculate, e.g., 73616*3*168346/4? Surely, this can't be in its training data, and yet ChatGPT gets the exact answer even without code execution.Alenoach (talk)21:19, 18 January 2026 (UTC)[reply]
ChatGPT secretly writes Python code to do the math. The actual LLM cannot do math.SuperPianoMan9167 (talk)21:21, 18 January 2026 (UTC)[reply]
Who cares about the LLM part of the final product? People aren't going in and using specifically the LLM part of ChatGPT.Katzrockso (talk)21:44, 18 January 2026 (UTC)[reply]
It can use code interpreter, but it generally doesn't and it's often visible in the chain-of-thought when it uses it. You can also see that when the number of digits in a multiplication is too high (e.g. > 20 digits), it will start making mistakes (similarly to a human brain), whereas a Python interpreter would get an exact answer. I haven't found any great source assessing ChatGPT on multiplications, butthe graph shown here gives an idea of what ChatGPT's performance profile on multiplications looks like.Alenoach (talk)21:54, 18 January 2026 (UTC)[reply]
Because there's such a staggering amount of correct calculations in its training data that it usually gets things right, but it's fundamentally still a largelanguage model.JustARandomSquid (talk)21:23, 18 January 2026 (UTC)[reply]
That, and it literally writes Python code to do the math when some non-AI code detects the prompt has arithmetic. Seethis article.SuperPianoMan9167 (talk)21:25, 18 January 2026 (UTC)[reply]
...which means that we're wrong to say that thesetools can't do these things.
Remember that not everyone is going to draw a distinction between "the LLM part of ChatGPT" and "the non-LLM parts of ChatGPT". Non-technical people will often say "LLM" or "AI" when they mean something vaguely in that general category. The proposal here doesn't distinguish between the LLM and non-LLM components of these tools. It seeks to ban them all.WhatamIdoing (talk)21:32, 18 January 2026 (UTC)[reply]
Mine doesn't seem to disclose that. If they've started secretly doing stuff under the hood that's messed up. Still though, guidelines shouldn't have to apologetically explain themselves like this proposal does.JustARandomSquid (talk)21:32, 18 January 2026 (UTC)[reply]
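    (For readers following the arithmetic example above: the point being debated is that exact multiplication is trivial for a code-execution tool, while a bare language model has to produce the digits as text. Below is a minimal sketch of the kind of computation a code interpreter would run, using the numbers from the example above; nothing here demonstrates how ChatGPT actually routes such requests internally, which is the contested claim.)

```python
# Exact integer arithmetic, as a code interpreter would evaluate it.
# A bare language model, by contrast, must emit these digits as text tokens,
# which is where very long multiplications tend to go wrong.
result = 73616 * 3 * 168346 // 4   # the product is divisible by 4, so integer division is exact
print(result)                      # 9294719352
```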
The section "Large language models are not suitable for this task" relies on outdated information about LLMs, which undermines its relevance and accuracy. (Or possibly even earlier.) The subsection's premise that LLMs are limited to "repeating training data" is outdated and has been disproven by recent research showing a "neuron-level differentiation" between memorization and generalization. Distributional generalization benchmarks also show that LLMs are not merely recalling information, but actively generating new data. A further fatal flaw with this subsection is that it does not account for Parameter-Efficient Fine-Tuning (PEFT) and LoRA, techniques which allow for fine-grained control over a model's domain and rigidly enforcing task-specific constraints, nor does it reflect the reality of how LLMs are used in practice. Furthermore, it fails to acknowledge the rapid pace at which developers are delivering improvements and addressing concerns.
I would, like dlthewave, support WhatamIdoing's sandbox, which is well-reasoned and properly nuanced and does not fall prey to any of the above flaws.Sparks19923 (talk)22:21, 28 January 2026 (UTC)[reply]
[citation needed]SuperPianoMan9167 (talk)00:12, 29 January 2026 (UTC)[reply]
recent research showing a "neuron-level differentiation" between memorization and generalization Would you be willing to provide a link to said research?
Distributional generalization benchmarks also show that LLMs are not merely recalling information, but actively generating new data. Which benchmarks? And what aboutthis research?SuperPianoMan9167 (talk)00:16, 29 January 2026 (UTC)[reply]
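    (For readers unfamiliar with the fine-tuning terminology in the exchange above: "PEFT" and "LoRA" refer to training small adapter weights on top of a frozen base model. A minimal sketch using the Hugging Face peft library follows; the base model and hyperparameters are illustrative assumptions only, and nothing in it speaks to whether fine-tuning addresses the concerns raised about the proposal.)

```python
# Minimal LoRA (parameter-efficient fine-tuning) setup with Hugging Face peft.
# Only the small low-rank adapter matrices are trained; the base model stays frozen.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
config = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # scaling factor for the adapter output
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of parameters are trainable
```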
Support. LLM outputs tend to be repetitive and irrelevant, and to reference common guidelines that everyone already knows in support of the poster's argument. They are a time sink, because even an AI-generated response with no effort put into it has to be addressed by human effort. It is better to just ban it all.Zalaraz (talk)01:50, 20 January 2026 (UTC)[reply]
  • Oppose per Thryduulf, "we should have a guideline in this area, but this is not it."Benjamin (talk)06:28, 20 January 2026 (UTC)[reply]
  • Oppose as guideline, but retain this laudable effort on the part of Athanelar to address a growing problem as a proto-essay. While I am in favor of adapting WhatamIdoing's proposal into a badly-needed guideline, I fully understand OP's frustration. I say this as someone fresh out of a not-so-pleasant encounter with a disruptive LLM-using sockpuppet. As someone who uses next to no automated tools or bots himself (I appreciate 'Find on page'—though it feels a bit like cheating—and (sarcasm alert) it's nice to not have to constantly format a custom sig), you can imagine the frustration of having to watch someone spit out (I use the phrase advisedly) rapid-fire artificial responses to your questions. Worse, the LLM argues on its own behalf due to the lack of a clear guideline forbidding it from doing so, while repeating things you've brought up—such as articles, guidelines, and shortcuts—in an imbecilic way. I don't feel the same way about participating in a translated discussion, and the fact that I've never noticed being in one probably means the process works well. In any case, the sooner an LLM-talk guideline forbidding AI-produced discussion is implemented, the better.StonyBrookbabble07:55, 20 January 2026 (UTC)[reply]
    As a practical matter, if someone is breaking the WP:SOCK rules, then they'll also break any anti-LLM rules we adopt. Having rules only prevents behavior if the user wants to follow the rules (which is most people) and they know about the rules and the cost to them(!) of following the rule isn't too high in their(!) opinion.WhatamIdoing (talk)17:52, 20 January 2026 (UTC)[reply]
    It's actually not about the cost - it's about the perceived cost to them of complying with the rules vs the perceived cost to them of not complying with the rules. In some cases whether they understand why the rule exists and whether they think it makes sense are also part of the equation.Thryduulf (talk)18:22, 20 January 2026 (UTC)[reply]
    This was essentially my line of thinking in including the 'essay-like' content of the guideline.Athanelar (talk)01:27, 21 January 2026 (UTC)[reply]
    I agree. It is not unreasonable to expect our colleagues to use their own words for communicating with each other instead of relying on AI.Zalaraz (talk)01:14, 21 January 2026 (UTC)[reply]
    Earlier today, I removed some controversial and misleading content from the Polish Wikipedia. I don't speak any Polish. How would you have me "use my own words" there? I assume you think the rules should be the same, and that the English Wikipedia shouldn't make rules that we wouldn't want to follow ourselves.WhatamIdoing (talk)03:02, 21 January 2026 (UTC)[reply]
    Not sure how this is related. You can use a language you do speak to speak your own words. There's no ideal way to edit in a language you don't speak.CMD (talk)03:09, 21 January 2026 (UTC)[reply]
    I can use English, and get the equivalent ofWP:ENGLISHPLEASE in reply, because not everyone there reads English. Alternatively, I can use some type of machine translation, and everyone will figure it out. Which would you prefer, if the situation was reversed?WhatamIdoing (talk)05:00, 21 January 2026 (UTC)[reply]
    Why is this a hypothetical? People post in non-English here all the time. Usually I take a second or two to translate with browser tools. (99% of the time it's nonsense, but that is not limited to non-English posts.)CMD (talk)13:04, 21 January 2026 (UTC)[reply]
    You do know that those browser tools incorporate LLM technologies to the same degree as things like Google Translate (if you are using Chrome then it is Google Translate).Thryduulf (talk)13:07, 21 January 2026 (UTC)[reply]
    Not on Chrome, but yes, and? I am not posting the results as my own words.CMD (talk)13:13, 21 January 2026 (UTC)[reply]
    So LLM translations are good enough for you to use, but not good enough for the editor who is posting? Even though you don't have access to machine translation for every language, and even though you often won't know which translation tool is best for the language in question?
    I prefer our recommendation of providing the original plus a machine-translated version for a talk-page comment, but I don't think it makes sense for each of us to use machine translation while trying to ban the editor from doing exactly what we're each individually doing and posting the results.WhatamIdoing (talk)18:19, 21 January 2026 (UTC)[reply]
    This is a bizarre interpretation. Machine and LLM translations are not "good enough"; when I use them, they are simply the best tools at hand. They are not optimal. Furthermore, when I am using a translator I know I am using a translator. I would have the original text of what the person was saying on hand, and I would know that this original text had gone through both a linguistic filter and a pattern-matching filter. For some languages I am even aware of common translation issues, and if other editors can see the original text as well, that means a higher chance another editor will have such information or even know the original language. I note the motte-and-bailey switch in the second paragraph; that would be great, but it was not your original argument, and what flows from it follows the bizarre interpretation.CMD (talk)07:29, 22 January 2026 (UTC)[reply]
    They're obviously good enough (for some purposes), because you keep using them. If they weren't doing something desirable for you, you'd stop using them entirely.
    I do take your point about the benefits of knowing that machine translation was being used.
    From my POV, and remembering that we're specifically talking here about editor-to-editor communication (and not, e.g., writing an article), there's a spectrum of functionality: labeled machine translation → unlabeled machine translation → non-English → not getting information we need.WhatamIdoing (talk)18:57, 22 January 2026 (UTC)[reply]
    Something being good enough "for some purposes" does not mean that thing is good enough forother purposes. Those are distinct purposes and equating them is a fallacy. Editor to editor communication is enhanced if editors know what they're saying; theDirty Hungarian Phrasebook does not enhance functionality beyond the native language. "unlabeled machine translation" can equal "not getting the information we need" in some situations, and in all situations does equal "not getting some of the information we need" because it means we lack the very knowledge of machine translation.CMD (talk)08:20, 23 January 2026 (UTC)[reply]
  • Oppose as written: very wordy and convoluted. And honestly, User:WhatamIdoing/Sandbox seems to be a much more straightforward guideline. --Aunva6talk -contribs18:29, 20 January 2026 (UTC)[reply]
  • Support something. Athanelar's version is more of an explanatory essay, and WhatamIdoing's is too weak. I fully support LWG's idea of upgrading LLMDISCLOSE and LLMCOMM; they would work well with LLMTALK. --LCUActivelyDisinterested«@» °∆t°19:09, 20 January 2026 (UTC)[reply]
Oppose. This guideline is too strict, and will chill legitimate use of assistive devices. This proposal would essentially prohibit LLM-assisted text even when the fundamental point and argument are presented by the editor themselves. I don't feel that this treatment squares with Wikipedia's mission of fostering helpful communication. Just as we currently expect editors to communicate in their own voice and with their own ideas, Grammarly or LLM tone checkers can assist with quality contributions without taking away an editor's ability to have original thoughts. Allow editors to edit grammar/tone while maintaining the argument.Sparks19923 (talk)19:59, 20 January 2026 (UTC)[reply]
  • Oppose: I would prefer a simpler rule: if you write a comment, by whatever means you use to do so, that comment is yours and you are responsible for it. If you use an AI and the AI misrepresents your idea and you did not notice... that is your problem, not ours. Besides, the page has a strong anti-AI bias: "AI may commit mistakes, therefore it always does and it's useless". Actually, more often than not AI does a good job; that's why it's so widespread.Cambalachero (talk)07:18, 21 January 2026 (UTC)[reply]
  • Oppose as written, works better as an essay than as a guideline. WAID's proposed guideline reads far better and I would support it, with caution (as we have no tool to reliably detect AI-generated comments other than the "duck test" cases already covered by WP:HATGPT).JavaHurricane14:17, 21 January 2026 (UTC)[reply]
Oppose for the eight reasons I listed in the RFCBEFORE discussion. WAID's draft is much, much better, for the various reasons people have listed above, but I wouldn't support it as written (e.g., I disagree with "Don't outsource your thinking"). I prefer to expand an existing guideline rather than make a new one; WAID's is a good start for that expansion.
Fundamentally, though, the rule should simply be "you are responsible for your edits, regardless of what technology you used to make them". Doesn't matter if you used a pen or Visual Editor or WikiPlus or a script or a bot or an API or LLM or typewriter or carrier pigeon, what you publish on this website with your account is your responsibility. If you cut and paste LLM text and it's a hoax, that's the same as if you wrote the hoax yourself. If it has fake cites, it'll be treated as if you added fake cites, because you did. If the talk page comment is verbose and repetitive, no one might read it, whether it's AI slop or you're just a bad writer. That's what the guidelines should warn editors about. Telling people not to use LLM at all is not only wrong, but pointless. It's like trying to tell people not to use a computer or not to use their mobile phones. Good luck with that. :-)Levivich (talk)18:40, 21 January 2026 (UTC)[reply]
Support – I don't think it's perfect or even decent, but it is something that we can discuss later and improve, rather than proposing a brand new guideline every time. Ping on reply.FaviFake (talk)16:34, 26 January 2026 (UTC)[reply]
Support per Gnomingstuff, enough is enough. More workshopping will just lead to this becoming even more verbose and complex, when the core of "do not use LLMs to communicate" just really needs to be written as a PAG yesterday. Dealing with LLM editors/pasters is exhausting enough as it is.--Gurkubondinn (talk)11:12, 27 January 2026 (UTC)[reply]
Oppose - A classic case of "scope creep." I can understand why people would want to do something about LLM slop, but this proposed policy is literally unenforceable, and will do more damage than it will prevent.
Sparks19923 (talk)22:02, 28 January 2026 (UTC)[reply]
Support. Chatbot text is spam applied to human attention. Nobody should be expected to treat spam with human consideration as if it were not spam attacking human consideration.David Gerard (talk)21:33, 21 January 2026 (UTC)[reply]
  • Support. WhatamIdoing's take is, however, probably more workable than Athanelar's, even if it doesn't go as far as I might like. I am actually of the opinion that LLM-generated content should be banned outright on Wikipedia (except, obviously, in direct quotes and the like when it's the subject of an article!), with no other exceptions of any kind under any circumstances whatsoever -- though I'm going to admit such a hardline stance is both hard to practically enforce and unlikely to gain traction. Something like this, at any rate, is the least we can do.Gimubrc (talk)20:36, 23 January 2026 (UTC)[reply]
  • Support the proposal of turning the prohibition on using LLMs even for user-to-user communication into a guideline. There isn't a need for AI junk here; it's a waste of time if people aren't communicating themselves, and I can't imagine any circumstance where it is better than direct communication, other than for those who may not be proficient enough in English. But if you can't communicate directly, then you are likely not going to be able to edit well. LLMs should just be kept outside Wikipedia under any circumstance.Omen2019 (talk)10:21, 25 January 2026 (UTC)[reply]
  • Oppose - it does not differentiate sample AI content presented as such, like "here's what the AI can do" or "here's what the AI says about that, and I'm inclined to agree". Ask perplexity.ai what will likely happen to Wikipedia, and you might be shocked. If all you want to do is ask for others' input on it, why would you have to re-compose it from scratch? Just quote the AI and be done with it.   —The Transhumanist  11:28, 25 January 2026 (UTC)[reply]
  • Oppose - Too wordy (which causesWP:CREEP issues) and essay-like. I would support the further development and future consideration ofUser:WhatamIdoing/Sandbox. Cheers,Suriname0 (talk)18:07, 25 January 2026 (UTC)[reply]
  • Support any version that can reach consensus. We need some sort of guideline, ideally yesterday; LLMs threaten to overwhelm volunteers and waste huge amounts of editorial time and energy. This is certainly better than nothing, and putting it in place will encourage people to compromise on future incremental improvements rather than stalling out and leaving us with no guideline at all. And the fear that it will lead to arguments due to the inability to clearly identify LLM-generated comments is absurd; the same is true for huge swaths of our other conduct policies, such as WP:CANVASS, WP:MEAT, sockpuppetry and COIs - sometimes it is easy to tell when someone is in violation; sometimes it is harder; and sometimes, yes, we do get spurious accusations. We've always been able to deal with them in the past and we'd be able to deal with them in the future; it's not a reason to have no guideline at all. Likewise, many people say "not this one but some hypothetical better one in the future" - we've been doing this song-and-dance with AI policies for years now; no version will be perfect. It's better to get something in to encourage people to compromise on improving it, rather than constantly arguing over a perfect version that will never exist. All our other policies took time to grow incrementally; expecting AI policies to come into existence fully-formed and perfect from the get-go is unrealistic and not a valid reason to oppose something that can, at least, serve as a good starting point. --Aquillion (talk)17:32, 29 January 2026 (UTC)[reply]
    Do you really think that LLM misuse specifically in discussions threatens to overwhelm editors? My experience is that it irritates editors (extremely so, for a small number of editors), but there's just not that much of it total. I did a quick survey of AFDs some time ago, and I think it turned up less than 1% of AFDs having any LLM-style wall of text. I glanced through the Wikipedia:Teahouse and found that less than 10% of the conversations mention LLMs (and mostly that people were asking or disclosing their use in an article or draft, not so much using it in the discussion). There are over 100 discussions there and almost 700 comments. That's not what I'd expect to find if LLMs were threatening to overwhelm us. Similarly, there are about 300 comments on this page, and I think they're also (detectable) LLM-free.
    Therefore I wonder whether the "threatening to overwhelm" part is about article creation, rather than in discussions.WhatamIdoing (talk)19:14, 29 January 2026 (UTC)[reply]
  • Support in principle, per the rationale provided by Aquillion above. I agree that, at the very least, we should give this time to mature if it becomes a guideline (and while it is one).XtraJovial (talkcontribs)03:37, 1 February 2026 (UTC)[reply]
  • Editors are not permitted to use large language models to generate user-to-user communications, including but not limited to talk page comments, noticeboard complaints, and comments or nominations in deletion discussions.
    This prohibition includes the use of LLM-generated text which is then reviewed, reworded or otherwise modified by the human editor, but where the fundamental idea or argument is ultimately still from the LLM output.
    That's fine. Delete the rest. Add it as a policy section toWP:TPG. Done.~ ToBeFree (talk)18:35, 4 February 2026 (UTC)[reply]
  • Oppose as too wordy and mostly dedicated to the evils of LLMs, as opposed to the real problem: LLMs can generate text much faster than a human can reply to or even just read. I would argue that in the use of LLMs in discussions we should concentrate on the volume of comments, not their quality. Yes, LLMs can generate nonsense, but so do humans. We already know how to wade through human bullshit; the problem is the amount of text, IMHO.Викидим (talk)00:05, 8 February 2026 (UTC)[reply]
  • Oppose as written per what User:CaptainEek said above as far as non-talk-page use goes. The copyediting bit looks out of place and will lead to questions once it's a guideline (I can see the questions answered in the previous revisions collapsed above, but it needs to be clearer, or removed, to be a guideline). I agree with CaptainEek that we need better data to make better decisions; a way of getting that data is to collect statistics on how many edits are LLM-driven, and one way of making that easier is to make disclosure mandatory first.

    I would support conversion to a guideline, if it was limited to talk page use of LLMs (basically what the title of the guideline is) and if it was (as others like User:ToBeFree said) much more focused and to the point (so that it can be easily integrated into existing policy).

    Practically: editors may find it difficult to understand what copyediting is allowed -- because, if you put something through Grammarly (or a similar tool), chances are it's being reviewed and copyedited by a language model even though "language model" is not written on the tin. Removing the copyediting part may make the change more amenable to discussion here.

    🔥Komonzia (message)14:24, 8 February 2026 (UTC)[reply]
  • Oppose as a guideline; better used as an essay explaining the potential downsides to using LLMs. A user shouldn't be required to "know" that regular online searches and MS Word now use AI and have therefore become prohibited tools. Come on! There is also so much that is incorrect in that manifesto.   ▶ I am Grorp ◀18:27, 8 February 2026 (UTC)[reply]
Support I'm surprised this isn't already a guideline.Rosaecetalkcontribs09:20, 10 February 2026 (UTC)[reply]
Support the principle and the spirit, but agree with @LWG that it would be best to turn the (fairly clear and simple) summary atWP:LLMCOMM into a formal guideline. I also likeUser:WhatamIdoing/Sandbox (for anyone reading this years from now,permalink) as another fairly short and simple alternative. Ultimately, though, all of these options are moving in the right direction - we do need something to discourage the problem of wholly LLM-generated responses.Andrew Gray (talk)19:17, 12 February 2026 (UTC)[reply]

Discussion

  • As policy, this proposal assumes that it is the case (and will for some time continue to be the case) that people can confidently identify the output of Large Language Models. I am skeptical that, for short texts, this can reliably be done today. Worse, I can see this shifting debate on talk pages and drama boards toward partisan allegations of AI use that can neither be confirmed nor refuted. Worse still, in the coming months, expect to see LLM integration into operating systems, browsers, and text editors for writing and editing assistance; many people won’t know whether the dotted red underlines derive from an LLM or a dictionary.MarkBernstein (talk)23:08, 19 January 2026 (UTC)[reply]
    in the coming months, expect to see LLM integration into operating systems, browsers, and text editors for writing and editing assistance We're already there. Gemini is in everything Google, you can't really use any Meta product without accidentally triggering Meta AI, Microsoft Word has Copilot, Windows 11 is an "AI-focused operating system", there are multiple competingAI browsers likeChatGPT Atlas, etc. etc. etc.SuperPianoMan9167 (talk)23:19, 19 January 2026 (UTC)[reply]
  • Putting this in the discussion section, as I feel it's too lengthy to put in the survey section (and I don't think a simple support or oppose would be nuanced enough)
I generally agree with most points made in the proposed guideline. In my opinion, there's no place for LLM content on Wikipedia. Obviously, writing articles is the primary thing I think most people can agree that LLMs shouldn't be involved in, but furthermore I don't think there should be LLM-generated content in any part of the consensus/decision making process on here; whether that be on talk pages, discussion and deletion venues, and administrative boards. The latter part is where it seems many people disagree on the methods of finding LLM-generated content and what to do with it, both proactively and after it happens. We ultimately need some kind of guideline, as the current method of beating around the bush and dealing with LLM usage as it comes up is not a productive use of anyone's time. And this proposal gets many things right, if it is a bit lengthy. I disagree with the restriction of translation, and theExamples section, but even then I wouldn't say they're big enough issues to make me oppose the proposition. Even if this proposal specifically isn't accepted, I would be amenable to one of the many other suggestions above simply for the fact that there needs to be some kind of guideline about LLM usage.SmittenGalaxy|talk!01:18, 22 January 2026 (UTC)[reply]
  • Those commenting on this RfC may want to take a look atWP:Administrators' noticeboard#Should people be using AI to create AfDs? which has links to a few AI-generated AfDs which cite now-obsolete notability guidelines (presumably leftovers in the AI's older training data) as it's very relevant to the point made in my proposal (and other supporting arguments) that LLMs are no good at formulating arguments based on or interpretations of PAGs.Athanelar (talk)12:11, 25 January 2026 (UTC)[reply]
  • (The discussion below is moved from the survey section above to avoid clogging the survey)Athanelar (talk)19:17, 25 January 2026 (UTC)[reply]
    "Ask perplexity.ai what will likely happen to Wikipedia, and you might be shocked." The question is why on Earth one should care? I will never understand the compulsion to "ask" an AI for an answer to opinion-based questions and then treat the result like it has any weight whatsoever.
    It's also interesting to me (and I'm not saying this applies to you) that such things are often said by the same people who argue that AI is just a tool and should be viewed morally-neutrally as a result. Personally I've never asked my screwdriver or hammer for its opinion on the project I'm using it in.Athanelar (talk)12:03, 25 January 2026 (UTC)[reply]
    I've been given plenty of answers from AIs without asking. Sometimes that answer is useful, sometimes it isn't, and sometimes (including for the most recent Google search I did) the answer was partly useful and partly not useful. At Wikipedia:Redirects for discussion/Log/2025 December 4#Thank you amendment I didn't quote the AI, but I did extensively reference the answer it gave me (without asking). In slightly different circumstances it would have made sense to quote the AI response when analysing that response. While I'm not sure what relevance asking an AI about the future of Wikipedia has to this discussion, TTH's underlying point is directly relevant. LLMs are tools; they are very different tools to hammers and screwdrivers, but that doesn't stop them being tools - you wouldn't use a hammer to unscrew something, nor would you use a dishwasher to mow your lawn or a lawnmower to clean your dishes, but not being appropriate for a task unrelated to their purpose does not make them any less of a tool. How good a tool is at the task it is intended to perform is also not related to whether or not it is a tool - the other day the disposable wooden knife I was given with my meal at a food outlet on Paddington station snapped rather than cut a piece of chicken, but that doesn't mean knives are not a tool, it just means that this knife was a bad tool.Thryduulf (talk)13:37, 25 January 2026 (UTC)[reply]
    I agree TTH's underlying point is relevant, I didn't address it because I don't think any supporter here, including my original proposal, is suggesting that any ban on AI generated communication would include a ban on quoting AI for demonstrative purposes, considering that is, in my eyes, very obviously not covered by a prohibition against generating user-to-user communication.
    Unless TT means, like, "I asked the AI to generate a response to your comment and here's what it said:" which would be banned, but would also very clearly just be an attempt at finding a loophole to the ban in the first place, so I don't think it needs a carveout as an acceptable use case.Athanelar (talk)13:47, 25 January 2026 (UTC)[reply]
    p.s., my point re: tools is one of agency. Every tool has a different function, but the thing that makes it a tool is that it needs to be utilised in order to do anything. You don't ask a screwdriver about a screw or your lawnmower about your lawn, you use it to accomplish the task.
    I think people who simultaneously say "AI is just a tool" but then also "ask" AI 'how' or 'why' or about other opinion-based topics are essentially trying to have it both ways. I think you would agree with me that anything capable of having an opinion cannot be described as simply a tool, and so you cannot simultaneously believe that AI is merely a tool which assists in human workflows and also that it is capable of giving you an original opinion worth listening to.Athanelar (talk)13:51, 25 January 2026 (UTC)[reply]
    I don't agree with that. (Also, AI doesn't have opinions.) Lots of tools perform analysis. Software, for example, is a tool that has been performing that function for almost a century.Levivich (talk)14:02, 25 January 2026 (UTC)[reply]
    Even analysis software doesn't draw conclusions for you, though. Sure, it can tell you that there's a certain likelihood a Wikipedia edit is vandalistic based on its training data, but it's not going to go that next step of saying "This is probably vandalism, you should remove it." It's a tool, it gives you the information and leaves you to do what you will with it.
    I'm well aware that LLMs can't actually have opinions, but they act as though they do, with enough efficacy that people genuinely use them for that purpose, as seen above. "Ask perplexity what it thinks is going to happen to Wikipedia" is something we're told to do, as if the conclusion perplexity will assemble based on its training data is something we're supposed to consider compelling. It's exactly the kind of 'thought-outsourcing' I'm cautioning against in my proposal.Athanelar (talk)18:25, 25 January 2026 (UTC)[reply]
    but it's not going to go that next step of saying "This is probably vandalism, you should remove it." Cluebot and similar tools do exactly that.Thryduulf (talk)18:27, 25 January 2026 (UTC)[reply]
    I suppose I'm not communicating my point well, I don't want to sound like I'm constantly moving the goalposts; those things are still far more deterministic than LLMs are. "If the likelihood of potential vandalism is >X, then revert" is still quite a markedly different sort of task than "Tell me what is going to happen to Wikipedia in the future." The way that people use LLMs is uniquely 'personal' because of their nature as language models, in a way that I think both hampers their usefulness as tools and also damages the people using them.Athanelar (talk)18:45, 25 January 2026 (UTC)[reply]
    "think both hampers their usefulness as tools and also damages the people using them" Are they tools (as you say now) or not tools (as you said earlier)? Whether or not they "damage the people using them" is an entirely subjective opinion that, regardless of whether it is correct or not, is completely irrelevant to this proposed guideline.Thryduulf (talk)19:03, 25 January 2026 (UTC)[reply]
    All I said earlier was that I perceive a contradiction in the pro-AI camp between the defense that they are merely tools (which has been often invoked as an argument for why they should not be too harshly restricted) and the tendency to use them in ways that are distinctly non tool-like.Athanelar (talk)19:10, 25 January 2026 (UTC)[reply]
    Even a calculator draws conclusions (2+2=4), and as Thryd points out, some software like Cluebot even acts on those conclusions.Levivich (talk)18:40, 25 January 2026 (UTC)[reply]
    Cluebot has near perfect accuracy in detecting vandalism btw.Levivich (talk)18:40, 25 January 2026 (UTC)[reply]
    ....no? Not to get sidetracked, but Cluebot is very good. It's not "near perfect" by any means - I just looked at its ten most recent reverts, and this one[6] is certainly not vandalism, while [7] is most likely not; the very first episode memorialized her. Good, not perfect. Nowhere close.GreenLipstickLesbian💌🧸19:05, 25 January 2026 (UTC)[reply]
    LLMs have (more or less stable) opinions. The issue is that the opinions come either from the pre-training (reading and predicting text from the internet) or the fine-tuning (where the developers shape the character and writing style, including viaRLHF). These opinions don't necessarily come from methodical, reflective reasoning, it may just come from common opinions on the internet or things the company trained the LLM to say.
    The question of whether they have moral agency may hinge on the exact definition, but events like LLMsstrategically lying in order to remain harmless suggest at least some functional moral agency.Alenoach (talk)19:11, 25 January 2026 (UTC)[reply]
    (edit conflict) I do disagree with you (mostly). A tool doesn't have agency, but neither does an AI - it requires a prompt in order to do anything. You prompt an AI in order to accomplish the task you want to achieve, which is a very different task to a screwdriver but not so different to a grammar checker or reading ease calculator, which also give opinions about the input based on their training data and programming.
    I frequently ask search engines how to do something, and it gives me answers that allow me to accomplish the task (at least in theory, you'd be more than welcome to refit the soft closer to my cupboard door as despite numerous diagrams, descriptions and youtube videos I have consistently failed!). Search engines are unquestionably a tool.Thryduulf (talk)14:04, 25 January 2026 (UTC)[reply]
    I've assumed that if people are asking a chatbot for an opinion, they're mostly getting a summary of what the internet says about the subject. That is, if you asked something like "Is Wikipedia reliable?", then the chatbot would assemble a string of words similar to other webpages that say something about Wikipedia's reliability.WhatamIdoing (talk)18:16, 25 January 2026 (UTC)[reply]
    If anyone wants to see an example of what LLMs can do when you ask for suggestions for improving an article, I did a demonstration of that last year at VPT here. LLMs are demonstrably useful for this purpose (but still require human review before making any edits, of course). That was before GPT-5 was released, so it'd probably be even better now.Levivich (talk)14:00, 25 January 2026 (UTC)[reply]
  • Question: Can anybody point to an example of an instance where an LLM actually helped correct an error on a page, or was not used disruptively (intentionally or unintentionally)?Rhinocrat (t · c) 19:20, 26 January 2026 (UTC)[reply]
    Like a link to the revision?Rhinocrat (t · c) 19:21, 26 January 2026 (UTC)[reply]
    @Rhinocrat, I think you're looking for something likeTalk:Böksta Runestone#Clarification on Image Caption – "Possibly Depicting" vs. "Showing", in which someone who isn't a native English speaker pointed out an error in an article. I don't think it'd be fair to blame his use of an LLM for the response from the Wikipedia editors, but of course YMMV.WhatamIdoing (talk)19:51, 26 January 2026 (UTC)[reply]
    You may be interested inthis Signpost article, which evaluates ChatGPT's ability to find errors in featured articles. The author (HaeB) agreed with 68% of the errors reported by ChatGPT, and disagreed with 23%. ChatGPT found at least one confirmed error in 90% of the featured articles.Alenoach (talk)19:49, 26 January 2026 (UTC)[reply]
    Using it to identify errors for editors to fix is a different implementation than generating content, however.ChompyTheGogoat (talk)11:31, 29 January 2026 (UTC)[reply]
    True. It isn't the same as posting LLM-generated communication onwiki.Rhinocrat (t · c) 10:00, 6 February 2026 (UTC)[reply]
    Here's a couple thousand LLM edits. Still working through them, and they are fairly indiscriminate, but the infobox additions are mostly OK, and some of the extremely minor copyedits are fine or only in need of minor tweaks (when the AI copyediting goes beyond extremely minor, it starts to not be fine). Still a bit of using a sledgehammer to hammer a nail, though.
    Similarly I am OK with people who already know what they're doing using Claude Code or the like for plugins, etc.Gnomingstuff (talk)15:30, 27 January 2026 (UTC)[reply]
    I looked at one (1) of those thousands. All it did was cite sources. One had a broken URL (easily fixed). I'd class the added sources in the "kind of weak, but better than nothing, I guess" range.WhatamIdoing (talk)20:38, 27 January 2026 (UTC)[reply]
    @Rhinocrat, I've been using this script to identify inaccurate citations: diff1, diff2.Alaexis¿question?07:11, 5 February 2026 (UTC)[reply]
    I've had LLMs catch typos in numbers in a couple of my drafts. Good pattern recognition and so on.CMD (talk)12:35, 6 February 2026 (UTC)[reply]
    It's also decent at catching things that aren't typos, but are instead incorrect or unidiomatic use of words, e.g. as I used it here:[8] and[9] (and others on the same page). I don't believe a word processor spellcheck would have found many of those instances (though there are instances that a word processor grammar check might have found).
    In this case (it was another student contribution), I didn't use it via chatgpt.com but through OpenAI's API "playground", so I don't have a link to the chatlog (a rough sketch of that kind of API call appears after this comment).
    Specific example from the first linked diff:

    Some users express that they preferred Replika as it is always available and shows interest in what the users have to say which makes them feel safer around an AI chat bot than other people.

    The tenses didn't agree, but this is something I would usually not even notice. I applied most of the LLM suggestions, but in other cases I opted to remove sentences or rewrite them in an entirely different way. 🔥Komonzia (message) 19:01, 8 February 2026 (UTC)[reply]
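    To make the API-side workflow described above concrete, here is a minimal sketch of that kind of call, with the model asked only to flag problems rather than rewrite anything, so every change is still typed by a human. The openai Python package (v1+) and the gpt-4o model name are assumptions; the actual playground settings used aren't shown here.

```python
# Minimal sketch of an API-side copyedit check, as described above: the model
# flags grammar/tense/idiom problems and the editor decides what (if anything)
# to change. Assumes the openai Python package (v1+); the model name is an assumption.
from openai import OpenAI

client = OpenAI()

def flag_prose_issues(text: str) -> str:
    """Return a list of suspected grammar, tense, or idiom problems.

    The model is instructed not to rewrite the text, only to point at
    problem sentences, so the actual edits remain a human decision.
    """
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a copyediting assistant. List sentences with "
                    "grammatical, tense, or idiom problems and briefly say why. "
                    "Do not rewrite the text."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

# Example with the sentence quoted above (note the tense disagreement):
sample = ("Some users express that they preferred Replika as it is always "
          "available and shows interest in what the users have to say.")
print(flag_prose_issues(sample))
```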
    Perhaps in some situations, but I've often found it offers bad or pointless advice as often as good advice when it comes to detailed wording. There can also be issues with English varieties; I often see it preferring American English constructions. CMD (talk) 01:38, 9 February 2026 (UTC)[reply]
    See here: https://chatgpt.com/share/69889d00-5514-8009-8165-c81d03fc141b -- I found one or two errors myself in the article based on my general knowledge of the subject, and then shoved the rest into ChatGPT to find more errors. You can see my changes here: Environmental movement in South Africa. There was a high volume of errors because there is a trend among student editors to make huge edits and be lazy about checking sources. Double checking, and using an LLM with search tools enabled, are essential. In this case I was essentially only using it as a better semantic search engine. 🔥Komonzia (message) 14:27, 8 February 2026 (UTC)[reply]
    I think a good standard would be something like "air-gapped positive verification", which is presumably what you've done here: verify every detail and type the edit yourself, just like you would for an edit request from a random IP on a controversial topic. Anything less than that, especially copy-pasting entire passages, risks the pitfall of trusting text that has good tone and style and passes a few spot checks. We know that it's extremely unusual for a human editor to add well-written, well-sourced content with a random nonexistent river thrown in, and it's extremely difficult to overcome that bias when assessing LLM-written content. Limiting LLM use to these check-and-verify applications would be a good move. – dlthewave 17:26, 8 February 2026 (UTC)[reply]
    Even those who want to use AI more should be in favour of a careful method like this for editing & posting things online (anywhere), otherwise the tools they like so much will end up ingesting their own slop (pardon the graphic metaphor -- better known as model collapse), since those tools do use and are trained on text from Wikipedia. The users who want to improve Wikipedia with AI, and are pro-AI generally, should prefer careful positive verification for their own sakes. 🔥Komonzia (message) 19:22, 8 February 2026 (UTC)[reply]
    With quite high probability, an edit request from a random IP on a controversial topic is malicious. AI has no agency, thus IMHO such an anonymous edit request and input from AI are not the same thing by a very wide margin. Викидим (talk) 19:48, 8 February 2026 (UTC)[reply]

On paper, I would like to see LLM use be a blockable offense. In reality, we are probably better off giving editors the space to admit it and steering them in the right direction. If it were totally barred, then they would make more of an effort to hide it, though maybe that would actually be better. But this is my main hesitation. ← Metallurgist (talk) 19:36, 10 February 2026 (UTC)[reply]
