On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2025/11.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose oldest comment is older than 7 days.
Latest comment: 4 days ago · 11 comments · 5 people in discussion
What policy do we have about historical official websites? If we can keep them as history (as usual with normal vs. preferred ranks), what if they are now redirects to spam or even malicious scripts? Infovarius (talk) 22:02, 6 November 2025 (UTC)
Dead links are often domain-squatted. They come and go and change hands. None of them have your best interest at heart. We're not in the business of classifying exactly how malicious a domain squatter is. Once we mark the link as deprecated, no one should be treating it as the live link. Further classification of where the link currently goes is ephemeral and uninteresting. Bovlb (talk) 20:13, 24 November 2025 (UTC)
> None of them have your best interest at heart
Links may go down temporarily and get marked as dead only to come back later on, though, perhaps due to a DNS issue, a page move without proper redirection, temporary server issues, etc.
> Further classification of where the link currently goes is ephemeral and uninteresting.
As I mentioned in the above reply:
> Additional info may be useful for when the replacement is the cause of notable events as well.
If the redirection is malicious and results in notable security events, lack of service, or other notable incidents, it might be useful to have this relation in the DB (cause of event, factor of, related to, etc.) along with the status history of the link. This would allow one to check, for instance, how many times and for what total duration a major social network or similar service has had significant downtime. AHIOH (talk) 20:38, 24 November 2025 (UTC)
I see what you are getting at, though, and my suggested implementation may not necessarily be the right "domain." For instance, I am not sure the "wikidata item reference" would be the best place for the info to go. It's not really a cause of any event. These kinds of relations would likely be better represented on the events themselves. AHIOH (talk) 20:52, 24 November 2025 (UTC)
I did notice a bit of an error, though: since I preferentially look at English and then Japanese, if there is a mul label but no English label, it shows the ja label instead of the mul label. But that is a very technical issue. Immanuelle (talk) 23:26, 7 November 2025 (UTC)
Those problems show up as "no label" in the Wikidata info displayed at the very top of articles on the nn and no Wikipedias, because a user claims to remove all other language labels except mul, saying that mul will cover all other languages. Also, for queries where one is, for instance, asking for persons having articles in the nn or no Wikipedias, a local-language label is needed. Also pinging Lucas Werkmeister. Pmt (talk) 22:38, 16 November 2025 (UTC)
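As an illustrative sketch of the kind of query meant here (the exact shape is an assumption, not an established practice), one could ask the Wikidata Query Service for persons with an article on the nn Wikipedia and check whether they carry an nn label:

```sparql
# Sketch: persons (instance of human, Q5) with an article on the nn Wikipedia;
# the OPTIONAL clause shows whether an nn-language label exists for each.
SELECT ?person ?nnLabel WHERE {
  ?person wdt:P31 wd:Q5 .
  ?article schema:about ?person ;
           schema:isPartOf <https://nn.wikipedia.org/> .
  OPTIONAL { ?person rdfs:label ?nnLabel . FILTER(LANG(?nnLabel) = "nn") }
}
LIMIT 100
```

With mul-only labelling, the `?nnLabel` column comes back empty for such items, which is the gap being described.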
Of course, it won't get implemented. Migrating all data consumers to even this most simple solution, with just one mul code, is a multi-year effort. Having multiple mul codes will make it a mess. Midleading (talk) 08:02, 13 November 2025 (UTC)
I am not sure how they would be represented in Cyrillic and Arabic scripts, since I do not know the conventions of transliterating Chinese and Japanese to those scripts. Immanuelle (talk) 22:48, 16 November 2025 (UTC)
Just leave the mul label empty for these items, because there is no majority of languages that use the same label. For regional mul labels, please contact the relevant Wikimedia community before making a proposal on your own. Chinese doesn't really need a mul-Hani label; the zh label already serves that purpose effectively. Midleading (talk) 02:57, 18 November 2025 (UTC)
zh is the fallback language of lots of Chinese languages (ref. File:MediaWiki_fallback_chains.svg). This is also fully supported by Wikidata right now. Japanese doesn't fall back to zh for good reason. You can ask a Japanese friend. Probably they don't understand Chinese, or they don't want to use Chinese characters anymore in Japanese text. Anyway, you need to contact communities that use this native language before making a proposal on their behalf. Midleading (talk) 07:47, 19 November 2025 (UTC)
If Chinese/Japanese is hard to understand, I have a better example. I just thought Ukrainian should fall back to Russian, because lots of Ukrainians understand Russian better than English. But no, that fallback does not exist. Actually, the Ukrainian fallback to Russian used to exist ([1]), but it was removed! Midleading (talk) 07:55, 19 November 2025 (UTC)
Good point. Currently mul is used only for Latin scripts. In the long run, we should rename the property to mul-Latn. In the example Boris Godunov (Q170172) (from Help:Default values for labels and aliases) the same Cyrillic label is used for at least a dozen different languages. A mul-Cyrl label would be very useful for items like that. (But we should probably wait until the current issues with mul have been resolved. Let's iron out the current can of worms (ouch) before we open another.) — Chrisahn (talk) 10:30, 17 November 2025 (UTC)
I didn't realize that it was only supposed to be used for Latin scripts. I had thought that disambiguation pages often got other scripts, at least. Immanuelle (talk) 23:29, 18 November 2025 (UTC)
Disambiguation pages are an example of items that would not use mul. Remember, mul is not intended to be the great be-all and end-all. It's intended to reduce duplication where possible. — Huntster (t@c) 23:41, 18 November 2025 (UTC)
@Huntster: in Wikidata, a disambiguation-page item links Wikipedia pages that list different concepts for the same string.
If we take the German word "Baum" and the English translation "tree", we would have "Baum (disambiguation)" and "tree (disambiguation)" as two separate Wikidata items. ChristianKl ❪✉❫ 20:48, 23 November 2025 (UTC)
Maybe a new property for Wikidebates could solve this, but then the Wikiversity pages wouldn't show up anymore underneath the Wikiversity links, where one may expect them. On the other hand, this is already the case for Commons categories: for example, no Commons link is shown in hobby (Q47728) under the section "Multilingual sites" at the bottom, which usually contains Commons links, but only in Commons category (P373). Prototyperspective (talk) 13:24, 10 November 2025 (UTC)
I had this idea about how it could be specified after I asked: maybe it could be set like the badges and intentional sitelink to redirect (Q70894304) etc. are for Wikipedia articles – would this be possible, and if so, how? I think it would be better to separate it like this, so that Wikiversity links are still in, and only in, the Wikiversity links section instead of being set on a new property.
Also note that v:Wikidebate links don't have anything standardized in the URL except a question mark at the end.
If the better solution would be to have Wikidebates have something like "Wikidebate:" in the title, I'd still need to build a SPARQL query that returns only pages with that in the title, assuming that's possible. But I don't know whether that would be a better solution, as it would be useful to distinguish between various subtypes of project pages (maybe not just Wikiversity but also e.g. Wikibooks). Prototyperspective (talk) 14:10, 21 November 2025 (UTC)
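A title-prefix query of this kind is possible in principle. The following is only a sketch against the Wikidata Query Service; the "Wikidebate:" prefix is the hypothetical naming convention discussed here, not an existing one:

```sparql
# Sketch: items whose English Wikiversity sitelink title starts with
# "Wikidebate:" (a hypothetical prefix, per the discussion above).
SELECT ?item ?page WHERE {
  ?page schema:about ?item ;
        schema:isPartOf <https://en.wikiversity.org/> ;
        schema:name ?title .
  FILTER(STRSTARTS(STR(?title), "Wikidebate:"))
}
```

Filtering on sitelink titles like this works but is brittle compared to an explicit statement or badge, since it depends on pages actually following the naming convention.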
I added instance of Wikidebate to Is morality objective? (Q56420123) along with the main subject. This should allow distinguishing it as a Wikidebate-type article and finding Wikidebates associated with a topic. I was going to merge Is morality objective? (Q56420123) to it, but figured it would be good to run it past you first in case you can think of any issues. It looks like the wikidebate item could use some enhancement to indicate it is a type of Wikiversity article. It also appears there is only one other wikidebate currently listed as such. Perhaps a batch edit could add the instance statements to items from the appropriate category. AHIOH (talk) 21:22, 25 November 2025 (UTC)
Thanks for adding the main subject, but I reverted the addition of instance of Wikidebate. There are Wikidebates about all kinds of things (such as the existence of God), and those things aren't instances of Wikidebates but e.g. instances of questions or political issues or philosophical problems etc. This is not a "wikidebate item". It's the subject of a wikidebate (and/or possibly 'described by' a wikidebate). Prototyperspective (talk) 21:37, 25 November 2025 (UTC)
No problem. Isn't each item associated with a Wikidebate (Q28043977) article that has a particular topic, though? I believe the wikidebates are the form of the debate, a type of wikisource article, with instances representing each debate. Each has an instance of subject which is discussed. These are part of Wiki Debates (Q136806310). I don't believe the Wikidebates are "instances" or types of questions or issues. For if they are not wikidebates, then they are just concepts. AHIOH (talk) 22:17, 25 November 2025 (UTC)
I don't fully understand your first sentence – as described, items can have a wikidebate but then also another Wikiversity link, or a Wikiversity link that isn't a wikidebate, so currently there's no way to know. Wikidebates themselves are not instances of questions/problems/issues – that's not what I said. These are set on the items about the subject (the question/problem/issue) that the Wikidebate is about, and that's how it should be. It's hard to make sense of what you're saying, honestly. And Wiki Debates (Q136806310) seems to be something nearly completely different. Prototyperspective (talk) 23:47, 25 November 2025 (UTC)
My apologies. Doing my best.
> These are set on the items about the subject (the question/problem/issue) that the Wikidebate is about and that's how it should be.
Key point: the sitelink points to the article, not the items about the subject.
The topics (morality in this instance) should only have a wikisource article sitelink on the topic itself (morality), not every article/debate that it is a subject of. (This is where you are being prevented from adding multiple articles.)
If you create a (wikidebate) article it should have an associated (wikidebate) item in wikidata with it (the particular debate article) as a sitelink.
Multiple topics can be associated with this wikidebate article item by adding main subject property statements if needed.
Articles (Wikidebates) related to a topic can be determined by examining the derived statements of topic items, or by querying for instance of (P31) wikidebate items with the topic(s) as the subject.
This allows querying all aspects of the debate and allows multiple debates on the same issue to be raised.
Removing instance of (P31) removes the ability to easily identify WikiSource articles that are wikidebates.
> And Wiki Debates (Q136806310) seems to be something nearly completely different.
You are correct. It appears it is an affiliate program. A separate item should be created for the wikidebate project itself if it is indeed different. AHIOH (talk) 01:26, 26 November 2025 (UTC)
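The querying approach described above could look roughly like this. It is a sketch only: it assumes Wikidebate (Q28043977) is the item used as the P31 class, and the topic Q-id is an example to be substituted:

```sparql
# Sketch: pages that are instance of (P31) Wikidebate (Q28043977)
# and have a given topic as main subject (P921).
SELECT ?debate ?debateLabel WHERE {
  VALUES ?topic { wd:Q48324 }      # example topic (morality); substitute as needed
  ?debate wdt:P31 wd:Q28043977 ;   # instance of: Wikidebate
          wdt:P921 ?topic .        # main subject
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

This only returns results if the wikidebate items actually carry the P31 statement, which is the point being argued for here.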
Just my humble opinions, BTW. Not trying to say what is or isn't. I appreciate your bringing up and discussing the topic and hope my comments are productive and don't seem confrontational. AHIOH (talk) 02:37, 26 November 2025 (UTC)
Your first point again makes little sense, as the item has a Wikipedia article.
So you'd suggest a Wikidebate property and then setting the Wikiversity page there? If not, that would be another way, and I referred to that at "Maybe a new property for Wikidebates could solve this". Prototyperspective (talk) 15:41, 26 November 2025 (UTC)
I think "linking" information using property statements and "sitelinking" are getting confused here. I'm not implying we need a new property; we've got a wikidebate item which can have any properties necessary and only requires assignment (instance of). It works perfectly fine and does everything needed, as long as it isn't changed to an instance of something it isn't. The wikidebate isn't the philosophical problem. You're literally making it one by assigning it as an instance of problem :) Is there a reason you feel the need to change it from a type:wikidebate to type:problem? This is the part that is confusing me. AHIOH (talk) 19:58, 26 November 2025 (UTC)
> Wikidebates themselves are not instances of questions/problems/issues – that's not what I said.
This was implied by setting the Is morality objective? (Q56420123) wikidebate item as an instance of philosophical problem. I think we probably agree on more things than not, BTW, and are just having difficulty due to various reasons.
> but then also another wikiversity link or a wikiversity link that isn't a wikidebate so currently there's no way to know
This is what instances of wikidebate indicate. Main Wikiversity articles sitelinked to topics are often a resource or overview article (although it could be any). Unless the wikisource article is set as an instance of a "subclass" of a wikisource article type, such as wikidebate, essay, etc., there is no way to know what it is without going to the page. Another option mentioned would be to use indicators in the sitelinks, but this doesn't allow multiple resource links, which are often an aspect of wikisource articles. AHIOH (talk) 20:58, 26 November 2025 (UTC)
The wikidata book item is typically an instance of version, edition, or translation, and has a sitelink to the Wikipedia article that is about it, as well as main subject properties linking it to the subject concepts with their own Wikimedia articles. AHIOH (talk) 21:15, 26 November 2025 (UTC)
Do you propose that we create items for every philosophical question (including variations)? Wouldn't it be better to just add the general topics to the article/debate items? The articles/debates are often titled with the actual thing being considered, so I'm not sure that would be necessary or efficient. AHIOH (talk) 01:56, 26 November 2025 (UTC)
For every philosophical question with sufficient sourcing, that would be great and would make WD much more useful, if they also have other data set and it was queryable in useful ways, like 'What are philosophical questions relating to biology that were first asked before 1700?' or 'What are philosophical questions relating to physics for which there are not yet any Wikipedia articles?' or 'What are nonethical philosophical problems with identified hypotheses in science?' or whatever. See for example List of unsolved problems in biology. Prototyperspective (talk) 15:37, 26 November 2025 (UTC)
I agree: if it is notable, it would be worthwhile to have, and could save setting multiple subjects on the items that are related in these cases. The wikidebate items could have the philosophical question item (or political issues, questions, etc.) entered as a main subject. AHIOH (talk) 20:03, 26 November 2025 (UTC)
> The wikidebate isn't the philosophical problem. You're literally making it one by assigning it as an instance of problem :)
The wikidebates aren't the topic of the philosophical question, though. Is a Wikipedia article linked to an item about a book an instance of a book? Absurd. Prototyperspective (talk) 20:41, 26 November 2025 (UTC)
The book has a wikidata item, and the subject is set to the concept (using property statements, not sitelinks), which is represented by a Wikipedia article. The article is an instance of Wikipedia article. If the book is notable, then an article is created for it as well and sitelinked to the book item. AHIOH (talk) 20:46, 26 November 2025 (UTC)
Wikipedia articles are created independently of wikidata. They can and should be linked to wikidata items. A sitelink is sufficient. As the sitelinks can be any variety of Wikimedia content, it would not make sense to assign the item as an instance of Wikipedia article. This is not the case when a Wikimedia list article, disambiguation, or other type is used. AHIOH (talk) 21:20, 26 November 2025 (UTC)
Sitelinks just represent the Wikimedia content that best represents the sense of a concept identified by a wikidata item in various languages. It's a priority list, essentially, and doesn't really say any more than that. The sitelink may be moved to a different article if it is a better match. The information relating the items to the subjects they are about should be in the statements of the item. AHIOH (talk) 21:36, 26 November 2025 (UTC)
Same goes for philosophical questions. Also, please try to reply within one comment, not 3 separate ones. Wikiversity pages are created independently of wikidata. They can and should be linked to wikidata items. Commons needs a sitelink and the Commons category prop to be set. "Sitelinks just represent Wikimedia content that best represents the sense of a concept identified by a wikidata item in various languages" – same here. Prototyperspective (talk) 21:45, 26 November 2025 (UTC)
I apologize for branching off; however, I am trying to keep the answers concise and focused on each point. As you have probably noticed, it is easy for me to start writing an essay addressing multiple points and introducing more, which can be difficult to comprehend. AHIOH (talk) 22:00, 26 November 2025 (UTC)
I see you were probably referring to multiple comments in my responses rather than the multiple comments overall. Trying to balance conciseness with flooding you with notifications :) This has been an interesting conversation, but perhaps other points of view would help get past what seems to be a bit of an impasse. I think we agree on much, but think the implementations to resolve the issue of a single Wikiversity sitelink differ. I'm gonna move on to something else for a bit, but enjoyed the discussion thus far. Hope you have an enjoyable day; I will try to check back after a while. Take care! AHIOH (talk) 22:25, 26 November 2025 (UTC)
Right now the English Wikipedia has a proposal to ban articles created by LLMs (Wikipedia:Large language models (Q118877760)). Although Wikidata is structured data far less amenable to LLM generation, I believe the amount of AI-generated slop on the Internet, and its demonstrated potential to hinder communication here, means that we should similarly disallow LLM usage to create item/property/lexeme descriptions or any non-database content (like comments here). If there's an appetite for this, I can work on drafting a policy and setting up an RfC. Jasper Deng (talk) 03:19, 16 November 2025 (UTC)
AI systems, when mature, will give huge benefits to WD: in data gathering, in applying consistent policies, in avoiding database bloat when producing automatic descriptions. I'm very conscious, as I make manual efforts, of the metaphor of termites toiling away producing their huge mound, which a machine could replicate in hours. We are not creatives fearful of imitation; we are striving to gather facts, and need machines to automate the huge and often terminally dull tasks that involves.
What is needed is not a ban, but guidelines on supervision, on whether AI-generated sources like Grokipedia are acceptable, and on the issues of our data training LLMs. If we reject AI, there is a danger that an AI could replicate our efforts and then surge past us. Of course we can point fingers at AI mistakes, or worry about their directed use to distort facts, but we need to accept they will revolutionise knowledge curation very soon. So an RfC is desirable, but I think the debate needs to be framed differently.
Regarding non-database content: using an LLM if you are a fluent user seems disrespectful of other users, but, like translation tools, it has its place if the user is struggling to express themselves. But I'd always wonder about nuance, which often matters here. Vicarage (talk) 05:50, 16 November 2025 (UTC)
No, because there is no way to definitively automatically filter for it, and people are not always transparent about whether one was used or not. Jasper Deng (talk) 11:42, 16 November 2025 (UTC)
I think I'll have to agree with Vicarage here: what we need is a kind of acceptable-use policy for LLMs. It's true LLMs can be abused to waste people's time on generated BS or hoaxes; those are some of the things we would want to avoid. LLMs can, on the other hand, be an invaluable tool, especially on Wikipedia, where they could help with creating a first draft of an article, saving people a lot of time, or you can use them to do initial research into a topic. The areas where LLMs aren't so good are well understood at this point. It would be a bad idea to ban LLMs categorically. Although the only area where I could picture LLMs being useful for Wikidata editing would be a tool that uses a private LLM to assist with editing tasks by generating suggestions the user will then have to manually OK. If someone uses LLMs or other tools to generate plausible-looking identifiers, that's clear abuse, so it's covered by the blocking policy. Another concern is that it can be hard to identify what is LLM-generated, and we don't want a policy where being accused of using an LLM with poor evidence can be used to sanction someone. Infrastruktur (talk) 09:21, 16 November 2025 (UTC)
If we used our own, in-house LLM that was ethically trained and rigorously checked for bias by the community, and refined accordingly, then I can see a use case. But we should not allow the use of ChatGPT any more than we allow third-party ads.
Remember, Wikidata is a source of training data for LLM's. We thus cannot ethically use such LLM's due to the potential for circular reasoning/sourcing, and reinforcement of existing biases (such as how femme people disproportionately have much less representation than masculine people).
Grokipedia has already been shown to be horribly unreliable so it in particular is completely out as a source, to the point that I would want it spam-blocklisted on all items unrelated to itself.
"If we reject AI there is a danger that an AI could replicate our efforts and then surge past us." – this is not an argument. At the current time, the flaws of existing LLMs are far too much of a liability. When they have matured enough, then maybe we can reconsider, as consensus can change. Jasper Deng (talk) 11:38, 16 November 2025 (UTC)
"Grokipedia has already been shown to be horribly unreliable, so it in particular is completely out as a source, to the point that I would want it spam-blocklisted on all items unrelated to itself." As long as websites like Metapedia and Conservapedia are allowed on items unrelated to themselves, I cannot see any reason to justify banning Grokipedia. Trade (talk) 19:16, 16 November 2025 (UTC)
@Jasper Deng that's not an article about anyone using Grok on Wikidata in a way that's problematic. The point of policy is to solve problems we have on Wikidata. If people use citations to Grok on Wikidata in a way that produces a problem on Wikidata, we should look at the actual problem.
As far as the article goes, it doesn't provide any evidence of hallucinations. It discusses the case of Tylenol, where Wikipedia says things that are false and Grok says things that are correct. There are observational studies that link Tylenol to the problems. Observational studies are low-quality scientific evidence. Saying that observational studies are no scientific evidence is wrong.
I also find Wikipedia's position that meta-reviews published in reputable scientific journals are "unreliable sources" when they contradict establishment medical views to be bad, and would not fault Grok for considering meta-reviews in reputable journals as valid sources. Valuing meta-reviews in reputable journals has nothing to do with hallucinating. ChristianKl ❪✉❫ 16:38, 22 November 2025 (UTC)
I agree with you and Vicarage. I'd emphasize that whether an AI or a human is producing the content, the same rules apply: it should be relevant, useful, and accurate. Use of optional tags for AI content should be encouraged, to allow easy identification and analysis. Users of the tag could have content reviewed and be assigned a "reliable LLM content orchestrator skill level" based on accepted content in conjunction with the tag. Limitations could potentially be applied to the number of API/MCP calls based on a similar method. AHIOH (talk) 08:48, 23 November 2025 (UTC)
Over time, I've developed a complete aversion to the habit of having something generated and then passing it off as one's own. For example, appealing against a block [4] or item deletion (November so far: [5][6][7][8]). So I was all for banning that before you posted this.
When talking about banning LLM-generated content, some believe this means banning LLMs at all. But that's not what I am for. As Vicarage says, there is a great potential for LLM-assisted knowledge curation, especially where natural language processing arises. Let's leave the door for experiments open. But also preserve humans as the ultimate arbiter of (not) making a change. -- Matěj Suchánek (talk) 11:07, 16 November 2025 (UTC)
If LLM-generated content could be reliably identified, it would be a good idea to ban it from talk pages, but if we start banning all posts that look like they might be LLM-generated, I fear there will be a lot of false positives. And what's to stop anyone from flooding our appeal system with baseless appeals that are not made with an LLM? Or from appealing twice after having a deletion appeal rejected because it looked like LLM? Now we're at a total of four potential appeals, so all we got from the suggested policy was a doubling of our existing workload. Yet I've heard no one else say they think there is a problem with this. Infrastruktur (talk) 12:00, 16 November 2025 (UTC)
It can be reliably detected through the use of detectors. The ultimate way is to make sure they understand the content themselves; if they don't, then they didn't write it. Jasper Deng (talk) 12:11, 16 November 2025 (UTC)
I recognise the discomfort of LLMs being used to write appeals against blocks and deletions. On the other hand, I can imagine that people use an LLM because they feel they are not capable of writing English well enough. It is also clear that LLMs are a lot more patient and stubborn in repeating rejected arguments than any human can be. Pfff. I think there are several crystal-clear cases of LLM use in appeals, and I would think it is a good idea to reject such appeals by policy. Some shorter LLM-generated translations of a self-written appeal in another language may slip through, and that may be okay. I must say (and appreciate) that Wikidata is relaxed about less well-formulated English or use of other languages. -- Lymantria (talk) 13:02, 16 November 2025 (UTC)
On this - I think we should have a clear policy that English is not required for communication with administrators or other users on Wikidata. Translation pages (like this one) are language-specific, but in general I think we should have a policy that expressing your view in your native language is preferred to LLM-generated English text. ArthurPSmith (talk) 13:49, 16 November 2025 (UTC)
How about we just exempt Google Translate from the LLM prohibition completely? I would much rather deal with that than AI slop. Trade (talk) 19:17, 16 November 2025 (UTC)
I think no one considers Google Translate an LLM. (Also, some people prefer DeepL.)
Machine translation is not prohibited from any Wikimedia project. In fact, it is even integrated in Wikimedia-developed tools.
I would much prefer that people use Google Translate for a more direct translation of their appeal/undeletion request if they must translate it. AI is completely rewriting their requests, structuring them all in the same way, and making errors. It makes it difficult to know whether it's a legitimate appeal from a non-English speaker or a spammer using AI to pump out spam requests. Ternera (talk) 14:53, 16 November 2025 (UTC)
Use of LLMs cannot be reliably detected, and building a policy on the assumption that they can would be a mistake. Misuse of LLMs is a different story, but I'd rather educate people on the risks and mitigations and make it clear they are responsible for what they post. Bovlb (talk) 17:33, 16 November 2025 (UTC)
Well, no. My essay is about poor communication, which I believe we can detect, at least manually. And it seeks to educate about risks and inform about mitigations, not impose a policy. Bovlb (talk) 19:29, 16 November 2025 (UTC)
Would a policy against undisclosed LLM usage in discussions be too harsh, in your opinion? I feel like if the user in question is indeed acting in good faith, they shouldn't have any issues with being upfront and honest about it. Trade (talk) 23:04, 17 November 2025 (UTC)
Do the people who use an LLM to make unblock requests on their own behalf fall under that category? If not, why should we not be allowed to ban it? Trade (talk) 22:22, 24 November 2025 (UTC)
I think it's sufficient if administrators take the signs of LLM generation into account when reviewing unblock requests. If they post walls of text that are not wiki-formatted, repeat the same information in multiple responses, and are not responsive to points raised, then we can safely conclude that there is no evidence that the editor understands our policies and is going to abide by them. It won't help anyone for us to have an official policy against using LLMs just because some editors make doomed unblock requests. Bovlb (talk) 23:09, 24 November 2025 (UTC)
If I look at property descriptions, many proposals are quite awful and would be improved by smart use of LLMs. Asking an LLM to summarize the prior art that exists for defining a term, and then argue for a version that might make a good Wikidata description, would likely result in a better description than what most people who make a property proposal without investing the energy to engage with prior art come up with.
When it comes to solving a conflation of multiple concepts into a single item, asking an LLM for the concepts that are intermingled and the potential ways you can define the concepts is very useful. It's again an area where an LLM that can check prior work on how the concept is defined elsewhere makes it easy to engage with prior art, when presently Wikidata users are often too lazy to engage with prior art for the definition of the concept.
Given that all the models that people use these days are reasoning models, LLMs aren't just doing circular recitation of their training data. They are a very useful tool for doing background research about how terms are defined elsewhere.
As far as requests go, https://www.wikidata.org/wiki/User_talk:HammadHP123#Unblock_Request_-_User:HammadHP123 does not look LLM-generated to me, even as it gets tagged by detector websites. There's a missing "," after "Thank you". That's a mistake that humans make easily, while LLMs generally don't make that mistake. It has a lot of sentences that start with "I", which is again a sign of bad writing that's not typical of LLMs but more typical of humans who don't write well. The message looks to me like it's written by someone who read some SEO guide about how to get an item on Wikidata undeleted. While I think the term slop might accurately describe that request, I don't think it's AI slop. ChristianKl ❪✉❫ 21:28, 16 November 2025 (UTC)
I'm going to respond more fully later, but a comma is not necessary after "Thank you" when it is followed by "for", any more than it is needed after "merci" in "merci pour...". Jasper Deng (talk) 08:12, 17 November 2025 (UTC)
I think I was wrong here. I actually tested how ChatGPT writes an unblock request when you ask it to do so, and it does end up looking like that.
I think a general rule that would auto-reject requests for admins to undelete and unblock that get 100% on a GPT detector would be fine.
When it comes to property requests, a property request that is just created via a one-shot prompt is likely bad, but especially for users who haven't done property requests before, a multiple-prompt approach with ChatGPT might lead to something better than what happens in many property requests at the moment. Our property creation process does include complex bureaucratic rules, and I think it's okay when ChatGPT helps new users with that. ChristianKl ❪✉❫ 15:38, 17 November 2025 (UTC)
So if someone does manage to get unblocked despite getting 100% on the GPTdetector you would declare the unblock invalid and reinstate the block. Is that correctly understood?Trade (talk)23:00, 17 November 2025 (UTC)Reply
@Trade that's not something I wrote or a position I hold. When it comes to making judgments about blocking/unblocking I mostly support decisions that are made by admins unless it's clear that the decisions produce problems.
>When it comes to solving a conflation of multiple concepts into a single item, asking an LLM for the concepts that are intermingled and the potential ways you can define the concepts is very useful. I agree, with the caveat that with many current LLMs, even if they are prompted with the current date, that is only a context clue that can easily be lost within a short conversation, depending on context length and reiteration methods. The LLMs "think" in the moment of when they were trained, with a rolling context window. They have a tendency to stick to their "beliefs" unless specifically set to widen their parameters (with the tendency to lose accuracy and coherence). I guess what I am trying to say is that you can't really reason with an LLM; it reasons you, if that makes sense. True novelty, with reason, will be the true test, and not one that should be taken lightly. Combining this innate ability of ours with an LLM's ability to process and generate massive amounts of data is what can make the difference between slop and innovation.AHIOH (talk)09:14, 23 November 2025 (UTC)Reply
"You do not have permission to see details of this entry."
It was a message on a talk page that offered a "useful source" with a link. I don't know if it was LLM generated, but it bore no relevance to the context of the discussion.Bovlb (talk)19:46, 17 November 2025 (UTC)Reply
It's a link where someone wanted to promote a website by talking about the website being a useful source and an abuse filter prevented the message from going through. If I put the diff from the message into gptzero it does judge it to be 100% AI generated.
Jasper Deng did create the filter that caught the message. That filter is currently:
page_namespace == 1
& user_editcount < 10
& (contains_any(added_links, 'arrest', 'case')
| contains_any(added_lines, 'tool', 'tools'))
Policy-wise, one question would be whether we could run a tool like GPTZero automatically in an abuse filter and filter out messages where the diff is classified as 100% AI. I wouldn't want that for autoconfirmed users, but I would be okay with running it for "user_editcount < 10", or maybe even for all users who aren't autoconfirmed.ChristianKl ❪✉❫21:06, 17 November 2025 (UTC)Reply
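A rough sketch of the decision logic behind such a gate, assuming a hypothetical detector score between 0.0 and 1.0 (the function name and threshold are illustrative; this is not an existing AbuseFilter feature, and the score would have to come from an external classifier such as GPTZero):

```python
def should_filter(ai_probability: float, user_editcount: int,
                  is_autoconfirmed: bool, threshold: float = 1.0) -> bool:
    """Return True if the edit should be held back.

    Mirrors the proposal above: only edits by new users (fewer than
    10 edits, not autoconfirmed) whose diff text is classified as
    100% AI-generated are filtered.
    """
    if is_autoconfirmed or user_editcount >= 10:
        return False
    return ai_probability >= threshold

# Exercising the decision logic with made-up scores:
assert should_filter(1.0, 3, False) is True    # new user, 100% AI: filtered
assert should_filter(1.0, 50, False) is False  # established user: allowed
assert should_filter(0.7, 3, False) is False   # below threshold: allowed
```

Note that, as pointed out above, high-quality AI-generated text usually does not score 100%, so a threshold this strict would only catch the most blatant cases.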
How about just every account less than a month old and with fewer than 100 edits in the item namespace? Plenty of spammers can easily get autoconfirmed.Trade (talk)22:54, 17 November 2025 (UTC)Reply
Have open-source locally run LLMs been considered? It doesn't seem the LLM would need to be very large. A crew of agents checking submissions and generating customized help guides for new users might be pretty nifty. Could probably add a report at the end detailing recommended improvements based on the common issues. Some clever person here could probably mash together some short python scripts fairly quick I'd bet.AHIOH (talk)09:37, 23 November 2025 (UTC)Reply
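As a rough illustration of what such a locally run reviewer could look like, here is a sketch that builds a review prompt for a submission. The actual model call is left as a stub: with a local server such as Ollama one would POST the prompt to its HTTP API, but the model name and payload shape here are assumptions, not a tested integration.

```python
import json

def build_review_prompt(submission: str, common_issues: list[str]) -> str:
    """Assemble a prompt asking a local model to review a new-user
    submission against a checklist of common issues."""
    checklist = "\n".join(f"- {issue}" for issue in common_issues)
    return (
        "You are reviewing a submission from a new Wikidata user.\n"
        f"Check it against these common issues:\n{checklist}\n\n"
        f"Submission:\n{submission}\n\n"
        "Reply with a short report of recommended improvements."
    )

def review_locally(submission: str) -> str:
    """Stub: prepare the payload that would be sent to a locally
    hosted model (endpoint and model name are placeholders)."""
    prompt = build_review_prompt(
        submission, ["missing references", "promotional tone"]
    )
    payload = json.dumps({"model": "some-small-model", "prompt": prompt})
    return payload  # in a real script: POST this and return the reply
```

The point of splitting the prompt builder out is that the checklist of common issues can be maintained separately from whichever model or agent framework ends up doing the checking.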
Could you explain to me why simply having a policy against undisclosed use of LLMs would be insufficient? I would much prefer if we just started with something more lenient. We can always change it later if the community wants. @ChristianKl:--Trade (talk)23:03, 17 November 2025 (UTC)Reply
@Trade When it comes to making rules that increase the bureaucratic burden, I think a good heuristic is to start with skepticism, because it's quite easy to grow the bureaucratic burden over time and make everything harder.
To make good policy, we need to understand the goal of the policy.
Wikidata has a certain amount of bandwidth from admins and other users who patrol for potential spam and vandalism. If a lot of such spam is AI-generated, a policy against undisclosed use of LLMs doesn't prevent that spam and vandalism from landing on Wikidata. An abuse filter that filters out posts by new users that hit 100% AI-generated at GPTZero, on the other hand, would filter out the spam in a way that does not take up admin and user bandwidth. It's worth noting here that high-quality AI-generated text usually does not get classified as 100% AI-generated by GPTZero. To the extent that the goal is to free up admin bandwidth, an abuse filter does that, while a policy against undisclosed use doesn't.
Another practical problem is that there's no obvious place to make a disclosure for an LLM-generated item description. Many users are also likely not going to be aware of the policy and will violate it, which creates additional issues that take up admin bandwidth.ChristianKl ❪✉❫00:08, 18 November 2025 (UTC)Reply
What I described is a proposal for how to configure abuse filters. An abuse filter is something that gets evaluated automatically when a user tries to make an edit.ChristianKl ❪✉❫13:25, 19 November 2025 (UTC)Reply
The property number of pages (P1104) has limitations that I do not know how to work around. For example, in this edit, the item Oedipus the King (Q106099149) was labelled as having 114 pages. While the source makes that claim, that is only half true. There are 114 pages numbered 1 to 114, but there are also 14 additional pages of text at the front of the book numbered with Roman numerals. Library databases would indicate this as xiv + 114 or as xiv, 114 p. However, neither format is permitted by this property, so I cannot enter the data as found in library databases. Does anyone have a suggestion? --EncycloPetey (talk)21:24, 17 November 2025 (UTC)Reply
It would make sense to have one 114 value and one 128 value along with some qualifiers that explain what each number refers to. I don't know which qualifier is the best for it. It might also make sense to set one of the two to preferred rank.ChristianKl ❪✉❫21:56, 17 November 2025 (UTC)Reply
Doing either one would mean that we can't simply import data from library databases, because there would need to be a conversion. The result would differ from the original database, which means we could not cite the library database as a reference, since we've altered the data. This applies also to published articles where the number of text pages and number of plates are counted separately, which is a common occurrence. --EncycloPetey (talk)01:24, 18 November 2025 (UTC)Reply
There is no "unnumbered page" item, but one could easily be created for the purpose of distinguishing from plates, regular numbered pages and Roman-numbered pages. As for qualifiers, the one to be used is very obviously applies to part (P518).Circeus (talk)21:21, 24 November 2025 (UTC)Reply
To answer this question: for a massive bot import, see WD:Bot request. But if you have community consensus to do that, you don't actually need a bot, and a simple Help:QuickStatements session is OK. For example, to add "date" statements to calendar day items, just prepare and upload a CSV file to QuickStatements; it will probably be fine.author TomT0m /talkpage17:10, 24 November 2025 (UTC)Reply
The equivalent classes are easily retrieved by a query, so … just use a query. We could create a gadget for this, similar to Classification.js.
I notice calendar day of a given year (Q47150325), which seems useless as an item because … there are no instances of a day without a specific year. It should just be merged with its parent class calendar date (Q205892); it's essentially the same class.
A query that finds all calendar days with the same point in time (P585) value as a given one: here
It does not work because the other calendar days do not seem to have a point in time (P585) statement with a normalized calendar. @Immanuelle: instead of creating items en masse, just add the "date" statements with a normed calendar; this will do the trick. (My query will need to be refined, this is just a first round.)author TomT0m /talkpage21:13, 20 November 2025 (UTC)Reply
A more refined query where you can select an item to find its equivalent dates: https://w.wiki/GCTW. More satisfying (although it seems we have a problem with the Gregorian date item, which is weirdly an indirect part of both the Julian and the Gregorian calendar).
Yes, the "part of" path of November 17, 1901 (Gregorian) is: November 17, 1901 (Gregorian) -> November 1901 -> 1901 -> 1900s -> 20th century -> 2nd millennium -> Common Era -> Julian calendar. Maybe there should be November 1901 (Julian) and November 1901 (Gregorian) etc.?Difool (talk)02:19, 21 November 2025 (UTC)Reply
The most dubious thing here is that last "Common Era -> Julian/Gregorian calendar" part, I think (put there by @Verdy p: and @Schekinov Alexey Victorovich:).
Conceptually, there is a question of whether a calendar date "points to" a time period or "is" the time period. The fact that there are several calendars, and so different "addresses" for the same day, tends to prove, imho, that we are closer to something analogous to the "unit of measure <-> thing it measures" relationship than to "dates are themselves time intervals". Then calendar dates are not parts of time periods, and the "part of" hierarchy should be reserved for relationships between the calendar objects themselves, not the time periods.
Maybe we should have a property dedicated for this relation, like "time period covered by the calendar".
It's even more complicated, I think, because a Monday in the USA is not exactly the same time period as a Monday in Italy … they do not begin/end at the same time. I think this makes clear that there is no exact correspondence between calendar dates and time (they are relative; time is also relative to the observer, to make the confusion total :) )author TomT0m /talkpage11:02, 21 November 2025 (UTC)Reply
Good points. I'd suggest dates are properties and a calendar is an index of objects (days) mapped to ranges: weeks, months, years, etc. The days have the dates as properties, and the value maps to indexes in particular calendars. This allows us to map objects to time, period, and location using occurrences (events at a particular place and time). A day in the USA is an event. The start of the event is designed to coincide with the time the westernmost time zone starts a new day (00:00). So we have objects with days (a type of event) which are indexed to multiple calendars. The day in California starts earlier than in New York, but the date of the day in the USA remains the same for each. The dates correspond, but the day steps across time zones as the Earth spins. Breaking this down, we have objects occurring at locations experiencing events, which are a type of occurrence with a duration. Simplifying further, we have objects with events (types of occurrences) relating to occurrences (types of events) at locations on particular dates which represent days on a particular calendar. So we can see this allows separation of the objects, events, locations, days, dates, and calendars while allowing relationships between all. Or at least I think it does; please let me know if I missed something.AHIOH (talk)14:31, 23 November 2025 (UTC)Reply
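A minimal sketch of the separation being discussed, with illustrative (not Wikidata-backed) types; the class and field names are assumptions. The same day carries one "address" per calendar, which is the unit-of-measure-like relationship mentioned above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CalendarDate:
    """An 'address' of a day in a particular calendar, kept separate
    from the time interval it covers."""
    calendar: str   # e.g. "Gregorian" or "Julian"
    year: int
    month: int
    day: int

@dataclass(frozen=True)
class Day:
    """A day as an event with a location-dependent start/end, indexed
    by one or more calendar dates."""
    location: str
    dates: tuple[CalendarDate, ...]

# 17 November 1901 (Gregorian) is 4 November 1901 in the Julian
# calendar (13-day offset in the 20th century).
gregorian = CalendarDate("Gregorian", 1901, 11, 17)
julian = CalendarDate("Julian", 1901, 11, 4)
day_in_usa = Day("USA", (gregorian, julian))
day_in_italy = Day("Italy", (gregorian, julian))

# Same dates, different events: the days are distinct per location,
# echoing the point that a Monday in the USA and a Monday in Italy
# are not the same time period.
assert day_in_usa.dates == day_in_italy.dates
assert day_in_usa != day_in_italy
```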
Yes, I was thinking about that. I agree with you. I could do it with my bot once approved, but first I would like to be sure it has consensus.Paucabot (talk)06:01, 22 November 2025 (UTC)Reply
I looked at your contributions, and you are using described at URL (P973) a lot. It would be better to use described by source (P1343) with the URL (P2699) qualifier. That ensures the links are to notable sources, and makes it much easier to change them if the format changes or we want to promote the source to have its own property ID. described at URL (P973) should be used sparingly. References need to be made to notable sources, and so need to refer to an item here, even if the source material is not online and you are quoting a page number.Vicarage (talk)08:23, 22 November 2025 (UTC)Reply
Is there some straightforward way to get images in English Wikipedia and set them on the associated Wikidata items?
Take a look at this list: Wikidata:List of skin diseases with a Commons category but no image set. It contains many items (206 now), and having at least one image would be very useful and adequate for these, but probably nobody wants to (and nobody did so far) go through them all to add images manually.
In contrast, probably most of the linked English Wikipedia articles do have an image set. Couldn't that be used to set the images?
I guess Harvest Templates could be used for that, but the last time I tried it, it aborted the process early on due to this bug, and nobody here seems to care about developments to that tool (per it being rarely discussed, not having a help page here, albeit I may create one soon despite my little skill with it, it still being broken, and no support on this Community Wishlist wish). So I don't know whether it would work and, more importantly, how to get just skin diseases with the tool; also, it can only retrieve the images from templates, but some are used outside of templates. Templates used include en:Template:Infobox medical condition (new).
Once it has added them, one could view the changes via some Listeria list, for example, that is set up before the import, so one can see the diff of the import and check the images. However, in this case it's probably not needed if it imports the image from either inside the template or the one that is first in the article.Prototyperspective (talk)13:30, 22 November 2025 (UTC)Reply
Thanks! That's interesting. But I wonder why this is on Wikipedia and not Wikidata. Even worse, a user subpage on German Wikipedia, and mostly just unexplained links. I think this needs to be in a Wikidata help/meta page to be even slightly discoverable (and discussable + improvable). The actual needed info, as far as I understood it, is at mw:Manual:Pywikibot/illustrate_wikidata.py, which has basically the same issue, and also I don't know how to use it for the task described above as e.g. examples are missing; but it seems like it could be done via the -cat parameter, limiting it to Wikipedia categories for skin diseases, along with the -subcatsr parameter to make it also scan the subcategories.Prototyperspective (talk)17:48, 23 November 2025 (UTC)Reply
Hello, regarding commonscats I did not find a solution yet:
Above I was talking about Wikipedia categories for skin diseases. I don't know how and why one would use a Commons category to set the images on Wikidata items. Specifically: how would one then select a very fitting image? The files there aren't sorted by how well they represent the subject or by quality, and whether they depict the subject often needs checking.Prototyperspective (talk)18:30, 23 November 2025 (UTC)Reply
Because of the page title, List of skin diseases with a Commons category but no image set, I assumed it was about Commons categories.
Often there is only one image per Commons category, e.g. for cultural heritage monuments. Even if there are more pictures, the first image could be selected and checked afterwards to see if it is the best fit, and changed afterwards if there is a better choice.
Thanks for explaining. I would not set any image files based on Commons categories except when one is checking each suggested edit before it's added OR checking all the changes after they have been written (which can be done by creating a Listeria table and then checking the diff; if there were a way to SPARQL the date when a statement was added, that could also be used and would be better). Even if there is only one file, the file could be not useful / not sufficiently representative of the item, and if there are more, selection would be arbitrary. But categories with just one file are rather rare.
Background for those interested (can be skipped): there is a lot of public attention on the subject of Web feeds right now, which in some documentaries, for example, are portrayed as some kind of unethical invention that's causing mental health problems like increased depression and anxiety, misinformation proliferation, Trump getting elected, addictiveness, etc. For example, in the still-new social media site Bluesky (Q78194383), the default feed and more or less all usable feeds just sort things chronologically, which some hail as the solution. Using that site, I quickly found it a waste of time because the posts I see there are mostly boring and not e.g. those I'm likely interested in mixed with posts that got a lot of traction/feedback. I don't think just sorting all posts from 'followed' accounts chronologically is a good idea, or something that will be widely adopted, or that has more positives than negatives. Another observation I made is that nearly no feeds factor in things like accuracy and, more broadly, rationality. Such could be done, for example, by downranking posts from users with high misinformation scores, posts that make claims without sources, and posts where inaccuracies/falsehoods have been pointed out by users tagging them, as well as by clearly labelling (probably partly using some bot) posts that have been refuted or contain false info. There has been at least one study that tested such a feed. Web feed algorithms are not evil monoliths; they can and do differ as much as websites, which also are not an all-under-one-blanket thing that can be measured by the one scientific measure of screen time (Q28130149). It's a big difference if you read and edit Wikipedia and read studies vs mindlessly swipe a TikTok feed full of Trump posts or whatever. There is little awareness, however (and also little research, documentation, news coverage, testing, and innovation), about Web feeds in terms of their shape and how they can be designed and differ.
This brings me to the topic of the thread:
Web feed algorithms can be assessed and evaluated along various dimensions. One example is transparency: whether the algorithm can be publicly viewed and studied. Another quality would be whether the code is free and open source (if the algorithm were merely explained using pseudocode and not freely licensed, then it e.g. can't be adopted in modified form by other platforms). Could this be set on the item somehow? I set some things as measurement scale (P1880), but I don't know if this is right. If there is currently no way to properly set this, are there some ideas on how to enable it?
I think qualities by which things are being evaluated would also be useful to set on other items. For example, product tests and so on sometimes test a subset of such qualities (note: it's rare that any one study or product test tests all major qualities). For example, food (Q2095) has dimensions that include healthiness and tastyness (Q123663924). Such qualities are of course each the subject of studies, if not of entire research fields, and are multifactorial. Other examples of qualities include customizability (Q136936204) (new) and durability (Q10269116). I think such qualities are, or could become, quite useful, and setting the qualities characteristic of items would also be insightful and eventually enable useful queries and data. I don't know how all of this could be done; do you have input on these things?Prototyperspective (talk)19:28, 23 November 2025 (UTC)Reply
It's possible to do original work when it comes to evaluating "web feed algorithms", but the first approach should be to research which metrics people are already using to evaluate them. It's generally better to see what metrics sources are already using outside of Wikidata, then use the same metrics and cite the outside sources for the values.ChristianKl ❪✉❫19:44, 23 November 2025 (UTC)Reply
Yes, of course, and I could add references for the qualities that I've added. I'm mostly asking about how to set these; i.e., setting them via measurement scale (P1880) doesn't seem like the right way, or at the very least I haven't seen qualities set with that property on items so far.Prototyperspective (talk)20:36, 23 November 2025 (UTC)Reply
Quick way to add videos from Commons category to respective items about places?
The place is in the file title (it may be difficult to extract it / reconcile this to items)
Most of the files in that category do not have depicts structured data or other SD about the location, so this probably can't be used
The best thing to use may be the place-specific Commons category that is also set on the item, such as c:Category:Bristol city centre or c:Category:Videos from Bristol (one would need a list of which files do not have such a category, because some of the files in there aren't yet well-categorized)
Note that if a video is set on an item, it could be replaced by a better one, or a complementary or equally good one could be added next to it. Also note that WikiShootMe is so far only used to set images for location items, not for setting videos, which for items about places of larger regional size, such as a small city, are usually more illustrative / useful / informative.Prototyperspective (talk)20:57, 23 November 2025 (UTC)Reply
Is there a trick to stop large Wikidata:Database reports from freezing my browser?
Then we have Wikimedia disambiguation page (Q4167410) for the Wikimedia disambiguation page type, and finally Disambiguation (Q1151870), which appears to be the Wikimedia disambiguation page for "disambiguation". I was thinking a "disambiguates" property, or perhaps an "includes" qualifier, might work. However, it might be good to stick with different from (P1889) and add it as a qualifier using part of (P361). I believe this would link the criteria to the relation, which might be useful. Then one could examine the disambiguated terms along with the criteria in Wikidata, which may help in writing the descriptions. It might be helpful to include the language as well, since the concepts may not be ambiguous in other languages.AHIOH (talk)03:36, 24 November 2025 (UTC)Reply
To the Wikidata editors: we request the creation of a new item for NetFM Inc., the organization recognized as the world's oldest internet-only radio station. The information is fully verifiable with high-authority, independent sources (NFSA). This entity is crucial for accuracy in the Google Knowledge Graph and AI services. Label (en): NetFM Inc. Description (en): World's oldest internet-only radio station. Aliases: NetFM, NetFM Classic Rock Radio. Core statements to be added: instance of (P31) = organization; instance of (P31) = radio station; country (P17) = Australia; official website (P856) = https://netfm.net/; inception (P571) = 13 November 1998. Source for inception date: reference URL (P854): https://www.nfsa.gov.au/collection/curated/asset/101722-net-fm-broadcasting-first-internet-radio-broadcast; retrieved (P813): November 24, 2025. Thank you for your assistance.Netfmradio (talk)05:47, 24 November 2025 (UTC)Reply
You will have to create the item yourself. It will have to satisfy our notability policy. It is mentioned on Wikipedia, and the source on it being the first one in Australia is at least to me enough for it to be considered notable per #2, in the spirit of #1, and for its historical relevance. To add sources to the claims, type "reference url" in the first box and click on the first entry, then in the second box type the reference URL. HTH.Infrastruktur (talk)12:34, 24 November 2025 (UTC)Reply
The Wikipedia mention was added by the above account, and was considered promotional by Wikipedia. It may very well fall below the threshold of notability now.Infrastruktur (talk)17:37, 26 November 2025 (UTC)Reply
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment.Bovlb (talk)21:31, 28 November 2025 (UTC)Reply
Wikipedia category links to empty item instead of item about subject
The empty item was created by a bot; I agree it's useless without multiple sitelinks, statements linking between it and other items, or languages, as all it does is say it's a category (which is obvious from the namespace), but the task was approved at Wikidata:Requests for permissions/Bot/Pi bot 19. I added links between the category and topic.Peter James (talk)13:39, 24 November 2025 (UTC)Reply
Thanks for elaborating. So it seems the answer is 1) because Wikipedia categories can't (or shouldn't?) link to the WD item about the subject, and 2) because your bot hasn't yet run on many items for unknown reasons, so the link to the item was missing.
I don't run any bots, and I don't know if there is a bot that links the category and topic; searching links to P301 with the prefix "Wikidata:Requests for permissions/Bot/", I could only find requests that mention adding links based on other statements already in Wikidata. If there is one, it likely missed this item, as there was no main article specified for the category in Wikipedia until now. If by "category pages link to the item" you mean sitelinks appearing on category pages in Wikipedia, I don't know if that has been discussed or if it's possible.Peter James (talk)15:24, 24 November 2025 (UTC)Reply
This seems to conflate two different concepts, an instance of technology (Q11016) and a subclass of broadcaster (Q15265344). To address this, I have created Internet television station (Q136954577) as the concept that I believe should be split off. The current usage of Internet television (Q841645) (<200 items) is messy and inconsistent, and most usages actually refer to specific channels, stations, or service providers rather than the technology itself:
phab:T142082 "Add another 'Add statement' button on top to ensure consistent position" trundles on towards its tenth anniversary, with no resolution and no apparent consensus.
phab:T137681 "Add table of contents for items/properties on Wikidata" is not far behind.
Upcoming LD4 Wikidata Affinity Group session on 25 November 2025: Join the community-driven effort led by WikiProject Personal Pronouns to improve pronoun data modeling and ethics in Wikidata. This is part of a three-session series (Oct 14, Nov 25, Dec 9) focused on implementing new best practices. No prior Wikidata experience required. Join at 17:00 UTC. Event page.
LDF endpoint retirement considered: The unstable and low-traffic LDF endpoint may be retired to reduce maintenance effort and unnecessary load on WDQS. If your workflow depends on it and you would hate to see it go, please let the Wikidata development team know at Wikidata talk:Data access.
WDQS Legacy endpoint deprecation: The legacy endpoint (query-legacy-full.wikidata.org) will be fully decommissioned on 7 January 2026. Please migrate tools and workflows to the supported endpoints: query.wikidata.org (Main) or query-scholarly.wikidata.org (Scholarly). Assistance is available on the Data Access and Request a Query pages.
Insights from Data Governance Research Process 2025: Results of the research to better understand how the communities currently think about which data and communities are best served by Wikidata, Wikibase Cloud or Wikibase Suite respectively.
Abstract Wikipedia naming contest: Help pick a name for the new Wikimedia wiki project which is provisionally known as Abstract Wikipedia. The second phase of voting is now open until December 1.
Bookbindings - share metadata and images from the currently offline Database of Bookbindings at the British Library
Irish Traditional Music:Projects - aims to systematically create and enhance Wikidata items for people profiled in the 2024 edition of Fintan Vallely's The Companion to Irish Traditional Music.
Showcase Lexemes:Bogucin (L1405411) - Polish proper noun (bɔˈɡu.t͡ɕin) meaning "village in Greater Poland", "village in Kuyavian-Pomeranian Voivodeship", or "village in Lublin Voivodeship"
Weekly Tasks
Add labels, in your own language(s), for the new properties listedabove.
While this article mentions Wikipedia's notability criteria, it somehow manages to avoid mentioning that Wikidata also has them.Bovlb (talk)18:05, 24 November 2025 (UTC)Reply
Because this is basically a guide how to exploit Wikipedia and Wikidata for promotion. Not to mention the overall low quality, e.g. a study is mentioned but neither named nor linked. --Dorades (talk)22:30, 25 November 2025 (UTC)Reply
Good idea to get rid of the LDF endpoint. It was sold as more reliable and less computationally expensive, wasn't it? The problem is that you almost never need just a set of triples matching a single triple pattern, so you end up requesting data you eventually throw away, wasting network bandwidth. And SPARQL doesn't have to be expensive either, if certain features were disallowed; it is likely more efficient if you can eliminate disk reads. A lot of possible use cases might just as well have done a CONSTRUCT query to grab a subset of the graph, stored the triples in a local triplestore, and queried it from there; the query could be as expensive as you like, since you're running it on your own server, and this also eliminates the need to write custom scripts to do the equivalent operations. Blazegraph actually already has an API call that basically lets you request a triple pattern, so this is a drop-in replacement for LDF (for now).Infrastruktur (talk)21:43, 24 November 2025 (UTC)Reply
The bot runs the script property_uses.py from Toolforge. It sequentially runs 3 queries (mainsnak, qualifier snak, reference snak) per property per endpoint (query and query-scholarly) and fetches the total number of triples for the requested predicate in the result set from the first response in the JSON-LD representation. This takes around 2 days per run, and currently runs 80k queries (2 servers times 3 snak types times 13k properties). Execution is once per week. You can probably see the user agent in your log files—in case you have access to them—as the bot identifies as DeltaBot.
All data is then aggregated in the script, and ultimately written to these templates:
The data on the template pages is not visible when accessing the template page, as it is fairly large (currently 138k page size). You need to open the templates in editor view to see the data. The data from all four templates is currently mainly used by Module:Property documentation, which by itself is transcluded on every property talk page. See Property talk:P31 for example: the table under "current uses" displays results from the templates.
I am currently not aware of any other method to efficiently fetch the total number of uses for properties, particularly heavily used ones. You can for instance run curl --get -H 'Accept: application/ld+json' --data-urlencode 'predicate=http://www.wikidata.org/prop/P31' 'https://query.wikidata.org/bigdata/ldf' on the console to find that there are currently 78.109.759 mainsnak uses for P31 on the main query server (i.e. without query-scholarly), and the response is pretty much instantaneous.
The first method relies on a finicky Blazegraph specific optimization that can be a PITA to get right and is inherently not portable. On the flip side it is blow-your-socks-off fast:https://w.wiki/GMAZ . A downside to this method is due to how the optimization works it is not possible to count anything other than a single triple pattern, so if you want to count distinct items or exclude deprecated claims you're out of luck. Obviously this will fail spectacularly once they sunset Blazegraph, but I think it has been added to Qlever too at this point, see commit f9c9ff4ff538484c92690143e0e400d4f34f0e70 (2024-02-09).
The other approach is brute force. I have an old script that churns through the dump file to produce a huge amount of general statistics. When it's run on the gzip dump file IIRC it only takes around 10-12 hours or so, I think that was before the graph split. The code is already written, I just never bothered publishing it, thinking it would just end up being infrastructure looking for users, pun intended. I put up an example reporthere if this is something people would want to have regular reports of on Toolforge.Infrastruktur (talk)23:17, 25 November 2025 (UTC)Reply
Your WDQS-specific query is interesting, but clearly above my paygrade after neglecting SPARQL and Blazegraph for years now. I need to check whether it is sufficiently similar to the current LDF approach, which does not scale well anyway. If so, I would not mind it being Blazegraph-specific and would just use it, as it seems to yield much better performance than what I have right now.
That said, I also would not mind if someone else took over the task of updating the templates entirely. I still have far more responsibilities here than I can reasonably oversee, and it does not seem that my time budget for Wikidata will improve in the future. —MisterSynergy (talk) 23:35, 25 November 2025 (UTC)
I think I might take you up on that. I want to see if I can replace the LDF calls with API calls, to see if I can make it faster without having to change much of the script: curl -G https://query.wikidata.org/bigdata/namespace/wdq/sparql --data-urlencode ESTCARD --data-urlencode 'p=<http://www.wikidata.org/prop/P31>'. Infrastruktur (talk) 07:19, 26 November 2025 (UTC)
Ugh! Discovered the reason for it being slow: it is nothing other than being rate-limited by the endpoint. So even if we can get the count in like 1 millisecond, it doesn't matter, gotta wait 2 seconds per request, 44 hours in total. *facepalm* Guess that means it's time for a rewrite, unless WMDE can exempt this script from the rate limit. Infrastruktur (talk) 23:30, 26 November 2025 (UTC)
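As a sanity check on those figures: assuming roughly 79,200 requests (a hypothetical count chosen to be consistent with the quoted total; the script's real request count isn't stated in the thread), the arithmetic works out as follows, and shows why per-request query speed is irrelevant once the forced wait dominates:

```python
# Back-of-the-envelope cost of the rate limit: with a mandatory ~2 s
# interval per request, a ~1 ms query time is pure noise.
min_interval_s = 2            # enforced delay between requests
requests = 79_200             # hypothetical request count (see lead-in)

total_hours = requests * min_interval_s / 3600
print(f"{total_hours:.0f} hours")  # → 44 hours
```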
Thanks for the response, super helpful. We're going to look into this internally and see whether there are better alternative methods available for your use case. For now, you don't have to do anything/put any work into migration. Thanks again!Ifrahkhanyaree WMDE (talk)16:19, 26 November 2025 (UTC)Reply
Latest comment: 4 days ago · 1 comment · 1 person in discussion
When P1050 is used with the qualifiers P580/P585, does that indicate the time the condition started or the time of the medical diagnosis? Trade (talk) 01:22, 25 November 2025 (UTC)
How can newcomers suggest edits/fixes on protected pages?
Latest comment: 2 days ago · 10 comments · 4 people in discussion
For example, the Balıkesir province's population statistics are almost 5 years old, and the existing reference is a dead link. New official population statistics from 2024 exist.
But as a new user, there is no way for me to suggest a change to be reviewed by moderators. And when I read the Autoconfirmed user guideline, the instructions for obtaining this privilege are unclear, unlike the English Wikipedia's instructions.
I've literally read all the guidelines already and my initial question is still the same.
I still don't understand how a new user who doesn't have "autoconfirmed" status can add new information to pages with semi-protected status.
We can't propose additions, and we can't contact anybody responsible for that specific category; there is simply no way.
For the sake of argument, please answer:
Take Q47117, for example, a random province from Turkey. The most recent population information is from 2021, but official data from 2024 exists. How can a newcomer like me propose that the newer data be added, considering the fact that the page is protected?
New users are able to create new talk pages and discussions on semi-protected pages so they can utilize the template, correct?AHIOH (talk)22:01, 25 November 2025 (UTC)Reply
Thanks for the addition. I believe that if newcomers were directed to this guide correctly, Wikidata could receive more contributions. Also, another quick question: there is no information on how one can be given "autoconfirmed" status on this wiki. Not that I'm interested in becoming one, as I'm not planning to be a regular contributor and would just request additions on some pages rarely, but I am really curious about the process. It would be great if that were also documented. Yoruk1337 (talk) 17:45, 26 November 2025 (UTC)
Latest comment: 2 days ago · 2 comments · 2 people in discussion
If a website provides interfaces in different languages via parameters rather than domain names, is it necessary to list them all? (like Steam (Q337535)) They are not part of the domain name; any Steam link can specify the language in this way. Some users have encountered a problem. ——Rinna (Talk) 17:42, 26 November 2025 (UTC)
Latest comment: 13 hours ago · 2 comments · 2 people in discussion
I am interested in potentially making a bot that would take linking data on Wikipedias and add a property to items indicating that a relationship is predicted between two items, along with possible predictions of what the relationship is. The property would be essentially meaningless on its own, but it could be used to suggest tasks for other users, who would look at the relationships. If we did this, we could potentially have multiple relationship-predicting bots use the same property, with qualifiers saying which bot did it; and if a relationship is added between the two items, a bot would remove this property.
Is there any qualifier like that? As far as I understand, it's up to individual wikis to create the infobox templates, but is there anything to say whether or not it should be inverted in dark mode?
I believe each is a representation based on criteria. One is only the sense when used as a given name, the other the surname, and then there is the concept of being a "name" class. The middle name "King" could also be considered its own concept with characterizations. This allows finding instances of given names, middle names, surnames, or of any of the three. The Joseph K. Mansfield (Q3820408) given name you mentioned is set as an instance of the combined term King (Q71508911) (name: given name and family name) and is causing the issue "Values of given name statements should be instances or subclasses of one of the following classes (or of one of their subclasses), but King currently isn't: given name". It's a good consideration to point out. Perhaps a hint or constraint might be possible to prevent this from occurring at the time of form entry. AHIOH (talk) 01:07, 28 November 2025 (UTC)
King (Q71508911) isn't doing anything and doesn't need to exist. Those items are only required when there is a Wikipedia article (nearly always English Wiki) that covers both given names and family names. There is no such article in this case. —Xezbeth (talk) 05:45, 28 November 2025 (UTC)
I'd leave it in. I feel it is beneficial with regard to structure. It is the base name object, and the others are just characterized by usage. They are already represented using ordinals, so one could just qualify them. The name itself could have an article covering both given-name and surname usages, and the others could redirect to it. One can query instances of it if modeled correctly and get all individuals with any variation of it. AHIOH (talk) 07:40, 28 November 2025 (UTC)
Latest comment: 7 hours ago · 3 comments · 3 people in discussion
I saw a lot of links being added to it. Is the Toki Pona Wikipedia something that was just approved and imported from elsewhere? Most articles on it do not appear to have Wikidata links yet. Immanuelle (talk) 04:42, 28 November 2025 (UTC)
Note that because toki pona is designed to use simple (pona) concepts combined to build up more complex concepts, many pages may not have a direct match on other language editions. Which is fine; see WD:Sitelinks to redirects for discussion of how this is resolved in the general case. Arlo Barnes (talk) 22:59, 28 November 2025 (UTC)
Disambiguation pages are structurally needed to ensure we cover every WP page, but otherwise have no merit. They should contain the instance of (P31) and WP links, but nothing else; they should not be extended. The information they contain should be coded in different from (P1889) on other items. Vicarage (talk) 06:43, 28 November 2025 (UTC)
Kraków Old Town (Q652634): de: Altstadt von Krakau, en: Kraków Old Town – it's good
Old Town (Q11695631): de: Dzielnica I Stare Miasto, en: Stare Miasto, Kraków (district) – it's good, but it would be better to change the name in Wikidata to "Old Town (district)"
Kraków Old Town (Q136785685): de: Historisches Zentrum von Krakau, en: Kraków Historic Centre – the name in Wikidata is wrong; it should be "Historic Centre of Kraków" (like the UNESCO name)
Thank you very much for your responses and answers. There are many errors in the articles about Krakow. I'm trying to correct them. I've only been on Wikipedia for a short time and haven't mastered the technical aspects.
I also have a request to remove the redirection of the Stradom, Kraków page (enWiki).
A careless update in January 2025 trashed the effort by a wikivoyage/ja admin. Qids have been mismatched for the second time: how can we clear up the mess, and prevent it from happening a third time?
The mixed-up uses of the geographic name "Tokyo" include:
the pre-1943 prefecture (Wikipedia page);
the current Tokyo prefecture (both Wikipedia and Wikivoyage); and
inner-city Tokyo as a place to go/see/stay (Wikivoyage).
The original mismatch was reported by an IP user and was corrected once in 2021; and
the mismatch in early 2025 affects two Wikivoyage cities on three interwiki pages. I list them for the Japanese language as:
Latest comment: 6 hours ago · 2 comments · 2 people in discussion
I am appealing the deletion of Item Q136943232 (NetFM), which was deleted today for "Does not meet the notability policy." The item meets the Wikidata criteria for inclusion, as it is clearly identifiable and can be described using a serious and publicly available reference from an authoritative national archive. The notability is established by this independent, reliable source:
• Source: Australian National Film and Sound Archive (NFSA)
• Link: https://www.nfsa.gov.au/collection/curated/asset/101722-net-fm-broadcasting-first-internet-radio-broadcast
This source verifies NetFM's status as a pioneer in media history, specifically noting it as the first internet-only radio broadcast to meet industry criteria. Recognition by a national archive (NFSA) establishes the necessary historical and cultural notability for inclusion. The item contained statements for instance of (Q43229, Q1455248) and inception (P571), which were fully sourced to the above link. Could you please review the NFSA source and consider restoring Item Q136943232? Thank you for your time. Netfmradio (talk) 22:52, 28 November 2025 (UTC)