Wikipedia:Bots/Noticeboard

From Wikipedia, the free encyclopedia
Noticeboard for bot-related issues
Wikipedia's centralized discussion, request, and help venues. For a listing of ongoing discussions and current requests, see the dashboard. For a related set of forums which do not function as noticeboards, see formal review processes.
    Bots noticeboard

    Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments will be welcome. Just make sure you are aware of our bot policy and know where to post your issue.


    Bot-related archives
    Bots (talk)
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    11, 12, 13, 14, 15, 16, 17, 18, 19, 20
    21, 22
    Newer discussions at WP:BOTN since April 2021
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28
    29, 30
    Pre-2007 archived under Bots (talk)
    Bot requests (talk)
    1, 2
    Newer discussions at WP:BOTN since April 2021
    BRFA (talk)
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    11, 12, 13, 14, 15
    Newer discussions at WP:BOTN since April 2021



    Discussion at Wikipedia:Administrators' noticeboard § SineBot, benign helper or closet vandal?


     You are invited to join the discussion at Wikipedia:Administrators' noticeboard § SineBot, benign helper or closet vandal?, which may be of interest to this noticeboard. Tenshi! (Talk page) 16:25, 10 October 2025 (UTC)

    The discussion got archived without a closure statement. ~2025-34098-99 (talk) 18:25, 16 November 2025 (UTC)

    AI-driven article review bot


    I am in the process of developing an AI-based article improvement suggestions bot. Before I go any further, I'd like to raise possible policy issues up front, before I put in the effort of actually implementing it and making a formal proposal.

    Background: I've been experimenting with using LLMs and web searching to do multi-step systematic review to find errors in articles, with very promising results. Here's my methodology:

    1. Select an article using 'Random article'
    2. Get Claude to perform a review of that article, giving it the article's wikitext as an input (Claude, and I imagine other LLM agents, has been blocked from accessing Wikipedia directly.)
    3. Based on that, tell it to perform a set of web searches to find sources to confirm or deny any factual errors it thinks it may have found. (It incorrectly 'believes' that it cannot access the web unless actually told to.) It is forbidden to use Wikipedia as a source. I will soon add more stringent criteria on sources.
    4. Based on the output of those searches, perform a review of the claims based on the evidence it has found. It is instructed to generate a detailed rationale for each claim, together with source URLs to back up its assertions.
    5. Finally, based on that, select the single correction out of the remaining errors that it is most confident about.

    Note that this is a multi-stage process, with each round of checking being isolated from the previous rounds.
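    As a rough structural illustration only (not working bot code), the staged pipeline above could be organized like this. Here `call_llm` and `search` are stand-ins for a real LLM API and a web-search backend, and each stage makes a fresh, isolated call, matching the isolation described above:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str                # the suspected factual error
    sources: list = field(default_factory=list)  # supporting URLs
    confidence: float = 0.0  # filled in at the verification stage

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call. Each call is a fresh
    conversation, so review stages stay isolated from each other."""
    raise NotImplementedError

def review_article(wikitext: str, llm=call_llm) -> list:
    """Stage 2: ask the model for suspected errors, one per line,
    giving it the article wikitext as input."""
    out = llm("List suspected factual errors in this wikitext:\n" + wikitext)
    return [Claim(text=line.strip()) for line in out.splitlines() if line.strip()]

def verify_claims(claims, search, llm=call_llm) -> list:
    """Stages 3-4: web-search each claim (Wikipedia excluded as a
    source), then keep only claims the model still believes given
    the evidence, with a confidence score."""
    survivors = []
    for claim in claims:
        evidence = search(claim.text)  # external search, no Wikipedia
        verdict = llm("Given this evidence:\n" + evidence +
                      "\nRate confidence 0-1 that this is an error: " + claim.text)
        claim.confidence = float(verdict)
        if claim.confidence > 0.5:
            survivors.append(claim)
    return survivors

def pick_best(claims):
    """Stage 5: report only the single highest-confidence correction."""
    return max(claims, key=lambda c: c.confidence, default=None)
```

The stage boundaries are ordinary function calls, so each one can be logged and audited independently.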

    So far, the results have been stunning. Not a single bogus error has passed through this multi-step review process, with the LLM revising its opinion at the systematic review phase, and some of the errors detected have been based on foreign-language sources, with the script finding and correctly interpreting them without even being directed to do so.

    If I can make this accurate and reliable enough, I am then contemplating using it to drive a bot that would notify other editors by putting a comment on the article's talk page, wrapped in a template that could be used to style the text and also put the talk page into appropriate tracking categories. The template would clearly label the report as AI-generated, and warn editors that they are responsible for fact-checking the reports themselves against the sources, and should not blindly incorporate its suggestions into articles or use its words verbatim.
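    As a sketch of the talk-page reporting step: the template name `AI review report` below is hypothetical (no such template exists), but building the message with a pure function makes the AI-generated labelling and the fact-checking warning easy to test before any bot posts anything (the actual posting could then be done with a framework such as pywikibot):

```python
def build_report_wikitext(article_title, suggestion, source_urls):
    """Build a talk-page report wrapped in a (hypothetical) template
    that a real deployment would use for styling and tracking
    categories. The AI-generated label and warning are mandatory."""
    sources = "\n".join("* " + url for url in source_urls)
    body = (
        "'''AI-generated review of [[" + article_title + "]]'''\n"
        "Please verify this report against the sources below before "
        "editing; do not copy its text into the article verbatim.\n\n"
        + suggestion + "\n\nSources found:\n" + sources
    )
    return "{{AI review report|1=\n" + body + "\n}}"
```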

    My aim would be a false positive rate of under 1%. If the bot can successfully review 100 articles without a single error, I would regard that as evidence that this criterion is likely being met, and would then bring the bot to this forum as a bot proposal.
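    One statistical caveat worth flagging (my arithmetic, not part of the proposal): 100 clean reviews is weaker evidence than it sounds. With zero errors observed in n independent reviews, the exact binomial 95% upper bound on the error rate is 1 − 0.05^(1/n), which for n = 100 is about 3%; demonstrating a rate under 1% at that confidence level takes roughly 300 consecutive clean reviews.

```python
def fp_upper_bound(n_clean, confidence=0.95):
    """Upper bound on the false-positive rate after observing zero
    errors in n_clean independent reviews: solve (1-p)^n = 1 - conf."""
    return 1 - (1 - confidence) ** (1 / n_clean)

def reviews_needed(target, confidence=0.95):
    """Smallest n of consecutive clean reviews such that the upper
    bound falls below the target rate."""
    n = 1
    while fp_upper_bound(n, confidence) >= target:
        n += 1
    return n
```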

    Just as with my geocoding bot, I can then record that the article has been visited, and not go back to the same article again for some time, if ever, to avoid annoying other editors with repeated reports. There are, after all, millions of articles to review, so this isn't really a limitation. To avoid overwhelming the editing community with noise, I could limit the bot to perhaps 100 edits a day - the aim is to be a helper to editors, not a taskmaster. Not to mention that LLM access costs money, and this sort of multi-stage review consumes a lot of tokens per article reviewed.
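    The revisit-cooldown and daily-cap behaviour described above is straightforward to implement; a minimal in-memory sketch follows (a real bot would persist the visit log between runs, and the numbers are illustrative):

```python
import time

class VisitTracker:
    """Track reviewed articles so the bot does not re-report the same
    page too soon, and cap the number of reports per day."""

    def __init__(self, cooldown_days=365, daily_cap=100):
        self.cooldown = cooldown_days * 86400  # seconds
        self.daily_cap = daily_cap
        self.last_visit = {}   # article title -> unix timestamp
        self.today = None
        self.count_today = 0

    def may_report(self, title, now=None):
        now = time.time() if now is None else now
        day = int(now // 86400)
        if day != self.today:          # new day: reset the daily counter
            self.today, self.count_today = day, 0
        if self.count_today >= self.daily_cap:
            return False
        last = self.last_visit.get(title)
        return last is None or now - last >= self.cooldown

    def record(self, title, now=None):
        now = time.time() if now is None else now
        self.last_visit[title] = now
        self.count_today += 1
```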

    In the interests of transparency, I will of course make the code open source.

    Before I go any further in working on this, there is an obvious policy issue, which is the legitimacy of making automated edits containing text generated by LLMs, even on talk pages and clearly marked as such.

    I would like to hear your opinions on this. —The Anome (talk) 11:35, 6 November 2025 (UTC)

    Wasting the Earth's resources to find some minor errors like this is ethically rather hard to defend. Fram (talk) 11:36, 6 November 2025 (UTC)
    Hello again, Fram! That cost perhaps $0.001 to $0.01 of electricity to find. I can provide exact costings for further queries, if you'd like. Perhaps you might find this a bit more impressive? —The Anome (talk) 11:45, 6 November 2025 (UTC)
    Not really, no. It's an additional detail, not an error. Obviously you only know after the fact whether you will find something worthwhile or have just wasted an LLM run. And I don't really care about the cost of a single run anyway; it's the general "let's use LLMs for everything, from the important to the trivial" which in toto leads to the creation of these huge resource-guzzling centers, with datasets created by misusing the often copyrighted work of others. If you want to improve articles, read articles, use your brain and skills, and improve them based on your own work. Fram (talk) 12:11, 6 November 2025 (UTC)
    Your rejection of this on these grounds is ludicrous. We sit in our warm houses, heated and lit by resource-guzzling heating, using computers made by resource-guzzling manufacturing, communicating via the (surprisingly) resource-guzzling internet, eating food made by resource-guzzling agriculture... there are far more serious problems than GPU barns to worry about. Can we please have a serious discussion about improving the encyclopedia instead? —The Anome (talk) 12:23, 6 November 2025 (UTC)
    @The Anome We'd best find some common ground before wasting precious resources on debating. Dismissing concerns is not as effective as finding common ground and exploring our options from there (even though debating online can be a lot of fun). Polygnotus (talk) 12:25, 6 November 2025 (UTC)
    Bye. Fram (talk) 12:30, 6 November 2025 (UTC)
    @Fram You may be interested in joining the meta:Sustainability Initiative.
    It is true that in the grand scheme of things the fact that we try to write and improve an encyclopedia is rather hard to defend; people are literally dying of hunger as we speak.
    It is also true that when a new tech hype shows up, idiots try to cram it into anything, and we don't really need vibrators to be on the blockchain.
    On the other hand, it is wise to explore to make sure we understand it, and to not throw out the baby with the bathwater. LLMs are both a giant threat and an opportunity for Wikipedia. Polygnotus (talk) 12:15, 6 November 2025 (UTC)
    Thanks, but no, I'm not interested in editing meta in any way or shape. Fram (talk) 12:32, 6 November 2025 (UTC)
    @Fram And you don't appear to have many userboxen. Switching to (relatively) green energy is a no-brainer, and many data centers support it. Polygnotus (talk) 12:35, 6 November 2025 (UTC)
    At the risk of derailing the conversation further, full electrification of the economy using sustainable power and energy storage is indeed the way to go. I'm an enthusiastic booster of green energy. —The Anome (talk) 12:49, 6 November 2025 (UTC)
    Resource-guzzling centers? I thought DeepSeek proved that a mid-range, off-the-shelf PC was all you needed for AI? ~2025-31723-49 (talk) 16:27, 6 November 2025 (UTC)
    See DeepSeek#Training framework: as of 2022, Fire-Flyer 2 had 5,000 PCIe A100 GPUs in 625 nodes, each containing 8 GPUs. Polygnotus (talk) 09:19, 9 November 2025 (UTC)
    As far as WP:Bot policy goes, I don't see a problem other than needing community consensus. As for that, your discussion at WP:VPI#AI citation-checking bot is a good start, but we'd probably want to see approval in a WP:VPR discussion before letting it go ahead, considering general community sentiment against using LLMs to directly produce content or discussion comments. Anomie 12:20, 6 November 2025 (UTC)
    Just to be clear, I'm emphatically against using LLMs to generate Wikipedia content; that would mark the beginning of the end for Wikipedia and its descent into a Grokipedia clone. This isn't that - it's flagging things for human attention. Following the earlier discussion, I realised that in-page annotation was a bad idea, so I've already changed my proposal based on that. —The Anome (talk) 12:30, 6 November 2025 (UTC)
    That doesn't change anything I said. There seems to be enough sentiment against LLMs in general that we'd want to see a well-attended WP:VPR discussion showing consensus for your idea. Anomie 12:34, 6 November 2025 (UTC)
    @Anomie: I'm in total agreement with you; perhaps I didn't make myself clear enough above. I'm more than emphatically against abuse of LLMs to generate 'facts'; I'm violently opposed to it. Grokipedia stands as an awful warning of what happens if you try. Any use of LLMs on Wikipedia, in any way, needs justification, and that's why I'm discussing this in public, in a number of appropriate forums, trying to get feedback on all the different aspects of this. (Oh - to give another example of what I would consider a legitimate use of LLMs, adding WikiProject tags to talk pages of articles without them comes to mind. Proposal to follow there, too.) —The Anome (talk) 12:40, 6 November 2025 (UTC)
    Grokipedia is not an attempt to generate facts, it is an attempt to create alternative facts to not offend tiny-brained far right idiots. Polygnotus (talk) 12:42, 6 November 2025 (UTC)
    Indeed. To quote Frank Herbert: "Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them." —The Anome (talk) 12:43, 6 November 2025 (UTC)
    I find the argument that "using electricity on bots is wasteful" to be unpersuasive. Electricity isn't something we need to be rationing at the user level. If it's truly that expensive, limits will be introduced via raising prices, or Toolforge will introduce restrictions.
    However, the inaccuracy of LLMs often wastes an incredible amount of editor time that could be spent doing other productive things onwiki. So my presumption for anything LLM is that it's bad until proven otherwise.
    Anyway, got any diffs of what kinds of reports this bot would make, so that we can evaluate the LLM's accuracy? – Novem Linguae (talk) 13:19, 6 November 2025 (UTC)
    I will code up an implementation of my hand-driven experiments so far, and post some. You can see a little bit of the early results at User:The Anome/Claude experiment. —The Anome (talk) 13:22, 6 November 2025 (UTC)
    My objection was not about "using electricity on bots", but about the costs of creating and running LLMs. Wikipedia should not support or rely upon such a dreadful, wasteful, thieving, mind-numbing technology. Fram (talk) 13:31, 6 November 2025 (UTC)
    The WMF essentially caused us to miss out on a generation of new editors by ignoring that the common way to interact with the internet was via mobile devices. Now the community is doing the same thing by ignoring that LLMs have changed the way many people research and interact with the internet. That will be two generations of new editors shut out.
    We should be seriously considering ways that LLMs can be used constructively on Wikipedia so that we don't wither away. The number of English speakers on the internet has ballooned while active editor numbers have stagnated, active admin numbers are dropping, and our page views are starting to decline. Maybe we shouldn't accelerate the decline? ScottishFinnishRadish (talk) 13:32, 6 November 2025 (UTC)
    At the risk of seeming to play both sides, I think the real value of Wikipedia is that it represents the human distillation of the consensus reality of human knowledge. The NPOV principle and the social infrastructure built around Wikipedia have made it the nearest thing to a single source of truth in the modern world - even if that truth is "sources differ" or "there is no consensus", Wikipedia reports the controversy as neutrally as we possibly can. And there is huge value in that. It's also the case that LLMs are to a substantial extent driven by Wikipedia as one of their central sources of knowledge, as no other single source binds all human knowledge together to the same degree. Nothing we do should imperil that. But at the same time, to ignore the potential utility of LLMs to aid human curation of human knowledge would also be crazy. We can either use the tool to help human endeavour - by checking facts, finding citations, and so on - or have it used against us to take that away. Feeding LLM-generated text directly into Wikipedia will help bring about model collapse, and that would be a disaster. (Ironically, another valid use of LLMs would be to detect LLM editing of Wikipedia so it can be rooted out - but that's another discussion for another day.) —The Anome (talk) 14:07, 6 November 2025 (UTC)
    I broadly agree. Human hands need to be involved, and I don't think it's very likely we'll reach a point where we can just say "hey, make me an article on shit flow diagrams" and accept whatever it shits out. But coming up with constructive use cases, and with what editors need to know and do to use those tools to contribute, is necessary for the continued relevancy of Wikipedia. ScottishFinnishRadish (talk) 14:39, 6 November 2025 (UTC)
    Exactly. Man discovered how to use fire in prehistoric times; we discovered deep learning just recently. But man learned that there is a big difference between using contained, controlled fire to cook and power things, and setting yourself on fire. And that's the lesson we need to learn now. —The Anome (talk) 15:20, 6 November 2025 (UTC)
    To use the chainsaw analogy from after my RFA: I used a chainsaw to cut down the tree I used to build my bed, but I didn't use it to cut lap joints or mortices. Tools used in the right way are effective. Used incorrectly, they end in a ruined bed and a missing finger. ScottishFinnishRadish (talk) 15:57, 6 November 2025 (UTC)
    I am dubious that this particular proposal would encourage LLM-friendly editors to get involved. It seems to me (and I don't claim to be an expert) that LLM fans generally want AI to do their work for them, while this would be the other way 'round: the AI is making the choices, then asking human hands to do the work for it. --Nat Gertler (talk) 15:35, 6 November 2025 (UTC)
    I support The Anome's efforts thus far. Running various reports to create lists for human editors (and sometimes bots operated by human editors) to review and fix is something that we have been doing on Wikipedia for a long time. This proposed report looks like the latest version of that. Whether it's analyzing a database dump for possible typos, or generating a report of possibly invalid ISBNs, we have helpful tools that create hundreds of lists for humans to analyze. One more, if it is of high quality, is welcome. And as for bringing in LLM-friendly editors, who knows? The goal is to help editors find errors in Wikipedia articles and fix them. We don't know what kinds of editors will show up to do that. – Jonesey95 (talk) 16:06, 6 November 2025 (UTC)
    On the sample page, the output looks pretty verbose. If the bot could be configured to print reports as bulleted lists, with a page wikilink and one suggestion per bullet, that could be a pretty efficient format. It wouldn't even need a BRFA, since the reports could be printed to the bot's userspace.
    I think asking to let the bot put these suggestions on article talk pages would be much more controversial, so you might want to avoid that route. – Novem Linguae (talk) 16:13, 6 November 2025 (UTC)
    Just generally speaking: yes, page views are declining, but I don't know if the correct answer to all of this is to embrace LLMs. With respect to this specific bot idea, while I'm not against what is being done here, I do feel I would be a bit out of my depth if I were to describe "why" the LLM is able to say "yes, this claim is supported" (which imo is not a good thing for a "fact checking" bot). I would personally favor a more low-tech approach of feeding RoBERTa chunks of data/sentences from a URL to create a vector DB indexing each chunk, and then using cosine similarity or some kind of textual entailment model to verify whether a statement is supported by the text. I'm not sure how it would compare with an LLM in terms of emissions or electricity usage, but it would allow us to say "the model saw X claim and Y text and gave the wrong prediction here" rather than "god knows why it thought X is fine". Sohom (talk) 16:46, 6 November 2025 (UTC)
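    To make the suggested alternative concrete, here is a toy version of the similarity idea using bag-of-words vectors and cosine similarity. A real system would substitute dense sentence embeddings (e.g. RoBERTa-based) or an NLI model for the toy `embed` function, but the key property (an inspectable claim/evidence score) is the same:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'. A real pipeline would use dense
    sentence vectors from a model such as RoBERTa instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_support(claim, source_sentences):
    """Return (score, sentence) for the source sentence most similar
    to the claim; the score is fully inspectable, so we can say
    exactly which text drove the verdict."""
    q = embed(claim)
    return max((cosine(q, embed(s)), s) for s in source_sentences)
```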
    There's a difference between broadly embracing LLMs and figuring out constructive use cases for them. ScottishFinnishRadish (talk) 17:19, 6 November 2025 (UTC)
    @ScottishFinnishRadish, we should consider each technical approach on its technical merits and not shoehorn in technical solutions out of a sense of urgency fueled by external factors, as you lay out above. I don't have problems with using LLMs in constructive use cases, but it needs to be the square key in the square hole; I feel there is a flavor of "let's just smash the hexagonal key into the square hole" in this thread. Sohom (talk) 19:19, 6 November 2025 (UTC)
    I guess there is no real policy precedent yet for an LLM "AI" bot. It all still falls under WP:CONTEXTBOT and the normal need for consensus. I guess the closest we have is ClueBot's machine learning stuff, where some false positives are deemed acceptable. But of course that's not creating content. Then again, in this case it's manually reviewed, so it's "just" assisted editing. So it's under WP:LLM (but even that hasn't moved beyond an essay). I suppose my main concern would be "who is going to verify that the editor isn't just mass-adding content without verifying?" In fact, how is the operator themselves verifying it? That has always been one of the main reasons for the bot policy to exist. No one can review all the millions of edits made by bots. And it's much more difficult when every edit is unique. When LLMs hallucinate, they're confidently wrong, and it can be really hard to tell if they're wrong. But I can see how with some effort one can assemble an agent LLM that has the tools for feedback to validate its own work, i.e. searching the web or looking at its own addition "critically", and reducing errors greatly. Anyway, I'm just thinking out loud. Because this is very novel, I don't think we'll find out the broader community reaction until something like it goes live. Though I can definitely predict there will be a lot of both reasonable and reactionary division on this, so we might want to update the policy/guidelines sooner rather than later. I just don't know with what exactly... — HELLKNOWZ TALK 16:43, 6 November 2025 (UTC)
    I'm working my way through all this as I try to pin the proposal down to something that can be supported by community consensus. But I have thought of a way to deal with the problem of people just banging the bot's corrections in verbatim: the bot itself can re-visit articles to check for over-literal, non-trivial use of the bot's output in edits, which can then in turn be reported for review. As I said earlier, writing the bot itself is the easy bit; it's engagement with the Wikipedia ecosystem of editors, culture, processes and rules that's the hard bit. —The Anome (talk) 17:10, 6 November 2025 (UTC)
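    The verbatim-reuse check described here could be as simple as a longest-common-substring scan between the bot's report and later article revisions; a sketch using Python's standard difflib (the 40-character threshold is an arbitrary placeholder, not a tested value):

```python
from difflib import SequenceMatcher

def longest_shared_run(report, article_text):
    """Longest common substring between the bot's report and a later
    article revision. A long shared run suggests the report was
    pasted in verbatim rather than fact-checked and rewritten."""
    m = SequenceMatcher(None, report, article_text, autojunk=False)
    match = m.find_longest_match(0, len(report), 0, len(article_text))
    return report[match.a:match.a + match.size]

def looks_verbatim(report, article_text, threshold=40):
    """Flag the edit for human review when the shared run is long."""
    return len(longest_shared_run(report, article_text)) >= threshold
```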
    To copy over somewhat my comment at the pump: an automatic post-to-talkpage bot has its drawbacks, and may go the way of being mostly ignored, like the EL bot. Such a tool would work best as an on-request tool, especially if it's only picking out one issue each run. CMD (talk) 08:40, 7 November 2025 (UTC)
    An on-request review bot would seem like a good way to go. —The Anome (talk) 17:38, 7 November 2025 (UTC)
    By the way, I've searched for "EL bot" and can't find what it refers to. Can you tell me? It sounds like something I should know. —The Anome (talk) 10:59, 9 November 2025 (UTC)
    The Anome, it's the bot that checked whether the external links in an article were dead, updated them to archive links, and then posted a talkpage notice about fixing them. E.g. Talk:Links (series)#External links modified 2. — Qwerfjkl (talk) 14:07, 9 November 2025 (UTC)

    Discussion at Wikipedia:Administrators' noticeboard/Incidents § citation bot malfunctioning


     You are invited to join the discussion at Wikipedia:Administrators' noticeboard/Incidents § citation bot malfunctioning. 45dogs (they/them) (talk page) 00:10, 25 November 2025 (UTC)

    BOTINACTIVE isn't followed?


    According to WP:BOTINACTIVE, bots which haven't edited in 2 years get de-botted. But we have Category:Inactive Wikipedia bots, with some bots that haven't edited in over ten years but still have the bot user group. Examples of bots in that category which are still members of the bot user group are User:Acebot (last edit 2019), User:AndreasJSbot (2013), User:Arbitrarily0Bot (2012), User:ArmbrustBot (2020), User:ArticlesForCreationBot (2013), User:AttributionBot (no edits, granted 2014), User:AudeBot (2012, apart from one edit in 2018?)... Perhaps time to check them all and remove the bot user group where warranted? Fram (talk) 16:30, 26 November 2025 (UTC)

    The rule isn't "bot > 2 years since edit", it's "bot > 2 years since contribution and operator > 2 years since contribution". The latter part of this rule keeps a lot of bots on the list. Izno (talk) 16:36, 26 November 2025 (UTC)
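    The two-part rule described here is easy to state as a predicate; a sketch follows (an actual audit script would fetch last-edit timestamps from the MediaWiki API rather than hard-coding them):

```python
from datetime import datetime, timedelta

# WP:BOTINACTIVE as quoted above: the flag is removed only when BOTH
# the bot and its operator have been inactive for over two years.
INACTIVITY = timedelta(days=2 * 365)

def debot_eligible(bot_last_edit, operator_last_edit, now):
    """True when both the bot and its operator have been inactive
    for over two years. A bot with no edits at all (None) counts
    as inactive."""
    def inactive(last):
        return last is None or now - last > INACTIVITY
    return inactive(bot_last_edit) and inactive(operator_last_edit)
```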
    Oh, right. Then the next question probably is: wouldn't it be better (mainly for security) if bots which haven't edited for X years (2, 5, 10) were de-bot-grouped anyway, regardless of the status of the operator? It's not as if all these operators are around anyway; e.g. the operator of AceBot has made two edits in the last 2 years... Fram (talk) 16:49, 26 November 2025 (UTC)
    I started a discussion about nudging the relevant requirements north at WT:Bot policy/Archive 29#Bot and operator inactivity - blocks, which had Wikipedia:Bots/Noticeboard/Archive 18#Inactive bots as its immediate background. Some other editors tried to make it about some sort of timeline, but either way I took the reception as lukewarm. Eyeballing both archives since then, I don't see any other discussion about changing the inactivity requirements. Izno (talk) 17:17, 26 November 2025 (UTC)
    Though there has been at least one discussion that circles that drain, regarding whether bots are open source and freely published, because one usual objection to tightening the belt has been "we don't have access to the source, so we can't make X bot not a bot anymore". This doesn't really apply to the ones you've pointed out, but it might to ones you didn't. Izno (talk) 17:20, 26 November 2025 (UTC)
    Thanks for those links. I don't understand that last point, though; removing the "bot" user group does nothing to make the code more or less accessible. It's just that if such a bot were suddenly to restart without the bot flag attached, it would appear in recent changes and so on, and so would get more (or faster) scrutiny than if it could operate "stealthily" in the background as a bot. Fram (talk) 17:39, 26 November 2025 (UTC)

    New maintainers needed for Citation bot


    See the topic a couple of sections up. User:Citation bot's maintainers @Smith609, @Kaldari, and @AManWithNoPlan are all inactive, and while the bot still works, it has been blocked due to a new bug (which is pretty serious, though it only impacts a few pages). Considering that the only person who has been maintaining the bot recently has said they are no longer able to maintain it, it seems to be time for someone new to jump on board. I'm posting this here in hopes that someone who sees this will be interested in stepping up. Jay8g [VTE] 03:29, 28 November 2025 (UTC)

    Smith has full power over the repository. There are things he can do that I cannot. AManWithNoPlan (talk) 21:31, 28 November 2025 (UTC)
    Retrieved from "https://en.wikipedia.org/w/index.php?title=Wikipedia:Bots/Noticeboard&oldid=1324645611"
