This is an essay. It contains the advice or opinions of one or more Wikipedia contributors. This page is not an encyclopedia article, nor is it one of Wikipedia's policies or guidelines, as it has not been thoroughly vetted by the community. Some essays represent widespread norms; others only represent minority viewpoints.
“Large language models have limited reliability, limited understanding, limited range, and hence need human supervision.”
Michael Osborne, Professor of Machine Learning, University of Oxford[1]
While large language models (colloquially termed "AI chatbots" in some contexts) can be very useful, machine-generated text, much like human-created text, can contain errors or flaws, or be outright useless.
Specifically, asking an LLM to "write a Wikipedia article" can sometimes cause the output to be outright fabrication, complete with fictitious references. The output may be biased, may libel living people, or may violate copyrights. Thus, all text generated by LLMs should be verified by editors before use in articles. The same applies to edits using references generated largely or fully by an LLM, for which editors must use other sources instead.
Editors who are not fully aware of these risks and not able to overcome the limitations of these tools should not edit with their assistance. LLMs should not be used for tasks with which the editor does not have substantial familiarity. Their outputs should be rigorously scrutinized for compliance with all applicable policies. In any case, editors should avoid publishing content on Wikipedia obtained by asking LLMs to write original content. Even if such content has been heavily edited, alternatives that do not use machine-generated content are preferable. As with all edits, an editor is fully responsible for their LLM-assisted edits.
Wikipedia is not a testing ground. Using LLMs to write one's talk page comments or edit summaries, in a non-transparent way, is strongly discouraged. LLMs used to generate or modify text should be mentioned in the edit summary, even if their terms of service do not require it.
Wikipedia articles must not contain original research – i.e. facts, allegations, and ideas for which no reliable, published sources exist. This includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources. To demonstrate that you are not adding original research, you must be able to cite reliable, published sources. They should be directly related to the topic of the article and directly support the material being presented.
LLMs are pattern completion programs: they generate text by outputting the words most likely to come after the previous ones. They learn these patterns from their training data, which includes a wide variety of content from the Internet and elsewhere, including works of fiction, low-effort forum posts, unstructured and low-quality content for search engine optimization (SEO), and so on. Because of this, LLMs will sometimes "draw conclusions" which, even if they seem superficially familiar, are not present in any single reliable source. They can also comply with prompts with absurd premises, like "The following is an article about the benefits of eating crushed glass". Finally, LLMs can make things up, which is a statistically inevitable byproduct of their design, called "hallucination". All of this is, in practical terms, equivalent to original research, or worse, outright fabrication.
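The "most likely next word" idea can be illustrated with a toy sketch. This is not how any real LLM is implemented; it is a minimal bigram counter, assuming a made-up two-sentence training text, and the function names `train_bigrams` and `complete` are hypothetical. It shows how a pattern completer will happily continue a prompt about a subject it has never seen.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def complete(follows, prompt, length=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    out = prompt.lower().split()
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # no known continuation for this word
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Made-up toy corpus for illustration only (not sourced facts).
corpus = (
    "red-necked pademelons are shy and nocturnal . "
    "red-legged pademelons are shy and nocturnal . "
)
model = train_bigrams(corpus)

# "greater-toothed" never appears in the corpus, yet the model completes
# the sentence from the pattern it learned for the word "pademelons".
print(complete(model, "greater-toothed pademelons", length=4))
```

The sketch has no notion of whether the subject of the prompt exists; it only continues the statistical pattern, which is the failure mode described above in miniature.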
“In short, the "hallucinations" and biases in generative AI outputs result from the nature of their training data, the tools’ design focus on pattern-based content generation, and the inherent limitations of AI technology...”
In: "When AI Gets It Wrong", MIT Sloan School of Management, AI basics[2]
As LLMs often output accurate statements, and since their outputs are typically plausible-sounding and given with an air of confidence, people may have difficulty detecting the above problems whenever an LLM delivers a useful-seeming result. An average user who believes that they are in possession of a useful tool, and who perhaps did a spot check for accuracy and "didn't see any problems", is biased to accept the output as provided; but it is highly likely that there are problems. Even if 90% of the content is okay and 10% is false, that is a huge problem in an encyclopedia. LLMs' outputs become worse when they are asked questions that are complicated or about obscure subjects, or when they are given tasks to which they are not suited (e.g. tasks which require extensive knowledge or analysis). An example of this tendency is shown below:
Prompt:
Greater-toothed pademelons, Thylogale gigadonti, can be seen in wildlife preserves in Queensland
Note that "greater-toothed pademelon" is a fictitious species.
Response:
Yes, greater-toothed pademelons are found in wildlife preserves in Queensland. They are shy and nocturnal, so it may be difficult to see them during the day. However, you may be able to see their tracks or hear them rustling in the bushes at night
This entire statement is false and is based on the LLM's assumption that this was a real species. In order to provide the most plausible answer, it extracted general information about a different kind of pademelon (shown in underline) from an Australian Museum article. This serves to demonstrate that LLMs can offer statements with a confident tone even when that information is factually incorrect or unverifiable.
(LLM used: Gemini)
Readers must be able to check that any of the information within Wikipedia articles is not just made up. This means all material must be attributable to reliable, published sources. Additionally, quotations and any material challenged or likely to be challenged must be supported by inline citations.
LLMs do not follow Wikipedia's policies on verifiability and reliable sourcing. LLMs sometimes exclude citations altogether or cite sources that don't meet Wikipedia's reliability standards (including citing Wikipedia as a source). In some cases, they hallucinate citations of non-existent references by making up titles, authors, and URLs.
LLM-hallucinated content, in addition to being original research as explained above, also breaks the verifiability policy, as it can't be verified because it is made up: there are no references to find.
Articles must not take sides, but should explain the sides, fairly and without editorial bias. This applies to both what you say and how you say it.
LLMs can produce content that is neutral-seeming in tone, but not necessarily in substance. This concern is especially salient for biographies of living persons.
If you want to import text that you have found elsewhere or that you have co-authored with others (including LLMs), you can only do so if it is available under terms that are compatible with the CC BY-SA license.

LLMs can generate material that violates copyright.[a] Generated text may include verbatim snippets from non-free content or be a derivative work. In addition, using LLMs to summarize copyrighted content (like news articles) may produce excessively close paraphrases.
The copyright status of LLMs trained on copyrighted material is not yet fully understood. Their output may not be compatible with the CC BY-SA license and the GNU license used for text published on Wikipedia.
Wikipedia relies on volunteer efforts to review new content for compliance with our core content policies. This is often time-consuming. The informal social contract on Wikipedia is that editors will put significant effort into their contributions, so that other editors do not need to "clean up after them". Editors should ensure that their LLM-assisted edits are a net positive to the encyclopedia, and do not increase the maintenance burden on other volunteers.
LLMs are assistive tools, and cannot replace human judgment. Careful judgment is needed to determine whether such tools fit a given purpose. Editors using LLMs are expected to familiarize themselves with a given LLM's inherent limitations and then must overcome these limitations, to ensure that their edits comply with relevant guidelines and policies. To this end, prior to using an LLM, editors should have gained substantial experience doing the same or a more advanced task without LLM assistance.[b]
Some editors are competent at making unassisted edits but repeatedly make inappropriate LLM-assisted edits despite a sincere effort to contribute. Such editors are assumed to lack competence in this specific sense. They may be unaware of the risks and inherent limitations, or be aware of them but unable to overcome them to ensure policy compliance. In such a case, an editor may be banned from aiding themselves with such tools (i.e., restricted to only making unassisted edits). This is a specific type of limited ban. Alternatively, or in addition, they may be partially blocked from a certain namespace or namespaces.
Every edit that incorporates LLM output should be marked as LLM-assisted by identifying the name and, if possible, version of the AI in the edit summary. This applies to all namespaces.
Pasting raw LLM output directly into the editing window to create a new article or add substantial new prose to existing articles generally leads to poor results. LLMs can be used to copyedit or condense existing text and to generate ideas for new or existing articles. Every change to an article must comply with all applicable policies and guidelines. This means that the editor must become familiar with the sourcing landscape for the topic in question and then carefully evaluate the text for its neutrality in general, and verifiability with respect to cited sources. If citations are generated as part of the output, the editor must verify that the corresponding sources are non-fictitious, reliable, relevant, and suitable sources, and check for text–source integrity.
If using an LLM as a writing advisor, i.e. asking for outlines, how to improve paragraphs, criticism of text, etc., editors should remain aware that the information it gives is unreliable. If using an LLM for copyediting, summarization, and paraphrasing, editors should remain aware that it may not properly detect grammatical errors, interpret syntactic ambiguities, or keep key information intact. It is possible to ask the LLM to correct deficiencies in its own output, such as missing information in a summary or an unencyclopedic, e.g., promotional, tone, and while these could be worthwhile attempts, they should not be relied on in place of manual corrections. The output may need to be heavily edited or scrapped. Due diligence and common sense are required when choosing whether to incorporate the suggestions and changes.
Raw LLM outputs should not be added directly into drafts either. Drafts are works in progress and their initial versions often fall short of the standard required for articles, but enabling editors to develop article content by starting from an unaltered LLM-outputted initial version is not one of the purposes of draft space or user space.
Editors should not use LLMs to write comments generatively. Communication is at the root of Wikipedia's decision-making process and it is presumed that editors contributing to the English-language Wikipedia possess the ability to come up with their own ideas. Comments that do not represent an actual person's thoughts are not useful in discussions, and comments that are obviously generated by an LLM or similar AI technology may be struck or collapsed. Repeating such misuse forms a pattern of disruptive editing, and may lead to a block or ban.
This does not apply to using LLMs to refine the expression of one's authentic ideas: for instance, a non-native English speaker might permissibly use an LLM to check their grammar or to translate words they are unfamiliar with, but even in this case, be aware that LLMs may make mistakes or change the intended meaning of the comment. For proofreading, it is recommended to use a word processor (see comparison) or dedicated grammar checker (see category) instead of an AI chatbot. Editors with limited English proficiency are advised to use a machine translation tool (see comparison), instead of an AI chatbot, when needed to translate their comments to English.
LLMs should not be used for unapproved bot-like editing (WP:MEATBOT), or anything even approaching bot-like editing. Using LLMs to assist high-speed editing in article space has a high chance of failing the standards of responsible use due to the difficulty in rigorously scrutinizing content for compliance with all applicable policies.
Wikipedia is not a testing ground for LLM development; for example, experiments or trials should not be run on Wikipedia for this sole purpose. Edits to Wikipedia are made to advance the encyclopedia, not a technology. This is not meant to prohibit editors from responsibly experimenting with LLMs in their userspace for the purposes of improving Wikipedia.
LLM-created works are not reliable sources. Unless their outputs were published by reliable outlets with rigorous oversight, and unless it can be verified that the content was evaluated for accuracy by the publisher, they should not be cited.
While it is not always possible to determine whether text is LLM-generated, LLM outputs may sometimes exhibit characteristics that allow readers to tell them apart from human-generated content. For example, a verbose and information-dense talk page comment that is written in an impersonal tone with correct spelling and grammar, yet contains non-wikitext markup and lacks links or citations, is likely to be LLM-generated.
Do not rely solely on artificial intelligence content detection tools (such as GPTZero) to evaluate whether text is LLM-generated, as these tools are unreliable due to their high error rates.
User scripts like WP:UPSD can help identify sections of articles that may have been generated by LLMs.
An editor who identifies LLM-originated content that does not comply with our core content policies, and decides not to remove it outright (which is generally fine to do), should either edit it to make it comply or alert other editors of the issue. The first thing to check is that the referenced works actually exist. All factual claims then need to be verified against the provided sources. Text‑source integrity must be established. Anything that turns out not to comply with the policies should then be removed. Original research, synthesis, and non-neutral point of view should especially be addressed.
To alert other editors, the editor who responds to the issue should place {{AI-generated|date=October 2025}} at the top of the affected article or draft (only if that editor does not feel capable of quickly resolving the issue on their own). In biographies of living persons, non-policy compliant LLM-originated content should be removed immediately, without waiting for discussion, or for someone else to resolve the tagged issue.
If removal as described above would result in deletion of the entire contents of the article or draft, it then becomes a candidate for deletion.[c] If the entire page appears to be factually incorrect or relies on fabricated sources, speedy deletion per WP:G3 (Pure vandalism and blatant hoaxes) may be appropriate; if the entire page is obviously LLM-generated yet does not qualify for speedy deletion under G3, an alternative is to nominate the page for speedy deletion under the WP:G15 criterion.
On talk pages, apply the templates {{Collapse AI top}} and {{Collapse AI bottom}} to collapse discussions that are disruptive due to the use of LLM-generated text.
The following templates can be used to warn editors on their user talk pages:
The following templates can be used to nominate obviously LLM-generated articles for speedy deletion: