Time allocated expired/memory overflow for long pages?
I don't know how often this happens for anyone else, but I occasionally get Wiktionary pages breaking from being too long. If I try to look up a page like a, for instance, and want to look at the last few entries, it all becomes illegible because I get the repeated message "the time allocated for running scripts has expired". I'm attaching an imgur screenshot here of what I mean.
Let me know if anyone has a solution or is planning a fix for this. At the moment I'm just working around it by opening the edit page for the individual section I want to view. Thanks! -Znex (talk) 02:05, 1 December 2025 (UTC)
- @Znex: That's the only such page right now; see Category:Pages with module errors, where more information is found. How the code is executed is somewhat unstable and unpredictable, so occasionally a few other pages are affected. Half a decade ago, the memory limits we were given were stricter and there were more such pages, like those for certain Chinese characters, but our local coders worked assiduously to bring the number down, and we also got more leeway from the backend. Technological progress will fix this one too. Fay Freak (talk) 02:28, 1 December 2025 (UTC)
- And adding "features" will unfix it. A features vs. resources treadmill.DCDuring (talk)17:04, 1 December 2025 (UTC)Reply
- @Znex see the previous discussion on this: Wiktionary:Grease pit/2025/October#Long entry failure: "The time allocated for running scripts has expired.". This, that and the other (talk) 02:58, 1 December 2025 (UTC)
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Updates for editors
- The Wikipedia Year in Review 2025 will be available on December 2 for users of the iOS and Android Wikipedia apps, featuring new personalized insights, updated reading highlights, and refreshed designs. Learn more on the review's project page.
- The Growth team is working on improving the text and presentation of the Verification Email sent to new users to make it more welcoming, useful and informative. Some new text has been drafted for A/B testing, and you can help by translating it. See Phabricator.
- Add a link will now be deployed at the Japanese, Urdu and Chinese Wikipedias on December 2. Add a link is based on a prediction model that suggests links to be added to articles. While this feature has already been available on most Wikipedias, the prediction model could not support certain languages. A new model has now been developed to handle these languages, and it will be gradually rolled out to other Wikipedias over time. If you would like to know more, please contact Trizek (WMF).
View all 34 community-submitted tasks that were resolved last week. For example, the issue where search boxes on some Commons pages showed no results due to a switch from SpecialSearch to MediaSearch has now been fixed.[1]
- Two new wikis have been created:
Updates for technical contributors
Detailed code updates later this week: MediaWiki
In depth
- The Wikimedia Foundation is in the early stages of exploring approaches to Article guidance. The initiative aims to identify interventions that could help new editors easily understand and apply existing Wikipedia practices and policies when creating an article. The project is in the exploration and early experimental design phase. All community members are encouraged to learn more about the project and share their thoughts on the talk page.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 18:58, 1 December 2025 (UTC)
Add /æɪ/ as an English diphthong
Can someone add /æɪ/ as an English diphthong to Module:IPA/data? Right now, day is categorized as a two-syllable word due to the Australian pronunciation. Tc14Hd (aka Marc) (talk) 19:09, 1 December 2025 (UTC)
- And why do editors virtually never use ◌̯? This sign is not restricted to phonetic detail, for which it is mentioned in Wikipedia, in the sense of being opposed to phonemic notation; see also my recent comment on the false dichotomy in lexicography. Two consecutive vowel signs are by default to be parsed as two syllables. Fay Freak (talk) 19:55, 1 December 2025 (UTC)
- It's the classic "because we've always done it this way", at least for most English transcriptions. I would be in favor of requiring ◌̯ to be used, but right now it is how it is.Tc14Hd (aka Marc) (talk)20:13, 1 December 2025 (UTC)Reply
- @Tc14Hd Done. Benwing2 (talk) 02:45, 5 December 2025 (UTC)
- @Benwing2 Thanks! Tc14Hd (aka Marc) (talk) 08:29, 5 December 2025 (UTC)
Request for category deletion
As part of the recent R: template cleanup, I generated a bunch of now-unneeded categories to check for invalid calls to each template. If some admin has a script for mass-deleting pages, here's the list of categories that can be deleted. I can re-format the list as needed if it helps with cleanup. Thanks! JeffDoozan (talk) 17:50, 3 December 2025 (UTC)
Done.Benwing2 (talk)02:58, 4 December 2025 (UTC)Reply
<templatedata> doesn't support basic wikicode?
I found that the documentation for {{ja-vp}} was a bit wonky (inconsistent and confusing wording), so I set to doing some basic copy-editing. The Template:ja-vp/documentation page makes use of the <templatedata> element, which I've never had to deal with before. Apparently the contents are not allowed to contain any wikicode or HTML? These wind up just rendered on-screen as straight text, so things like [[this]] or ''this'' or <i>this</i> get shown on the rendered page as-is, without the expected linking or formatting and with the markup displayed.
This strikes me as pretty awful usability.
It looks like this content is only used to generate a table.
Is there any good reason why we should use <templatedata> instead of just creating a wikicode table? Wikicode tables can be a royal PITA in their own right, but at least the wikicode is more straightforward to deal with than the odd constraints imposed by <templatedata>. ‑‑ Eiríkr Útlendi │Tala við mig 23:28, 5 December 2025 (UTC)
- @Eirikr: TemplateData isn't there for the documentation page; it's there so editors using apps like the mw:VisualEditor can view information on the parameters while editing elsewhere on Wiktionary. See mw:Help:TemplateData. Chuck Entz (talk) 01:24, 6 December 2025 (UTC)
- Thank you, @Chuck. Unfortunately, mw:Help:TemplateData doesn't seem to document anywhere that the description string is not parsed and is instead output as a bare string. That would seem to be a failure of the documentation, but then I don't know how this is supposed to work -- maybe the description string is supposed to be parsed, and something has broken in this feature?
- I did notice this somewhat amusing bit, over at w:VisualEditor#Original_rationale:
The decline in new contributor growth was viewed as the single most serious challenge facing the Wikimedia movement. VisualEditor was built with the goal of removing avoidable technical impediments associated with Wikimedia's editing interface, as a necessary pre-condition for increasing the number of Wikimedia contributors. Subsequent research found no measurable gains over wikitext for new contributors.
- As far as Wiktionary goes, is this a solution in search of a problem? Do we have any data on the use of the VisualEditor here at Wiktionary? Is this TemplateData rigamarole a MacGuffin of sorts, or is it actually useful? ‑‑ Eiríkr Útlendi │Tala við mig02:06, 6 December 2025 (UTC)Reply
- @Chuck Entz @Eirikr The problems with templatedata are worse than just what you note. It's impossible to generate templatedata using a template or module, and as a result it tends to be woefully out of date; furthermore, there's no way to hide its ugliness from the documentation page and still have it work with VisualEditor. I strongly believe we should nuke all uses of templatedata.Benwing2 (talk)04:06, 6 December 2025 (UTC)Reply
- Thank you @Benwing2. I've never before encountered <templatedata>, and I can't say my first brush has left a favorable impression. Barring anyone presenting compelling evidence of its utility, I would support its removal. ‑‑ Eiríkr Útlendi │Tala við mig 05:15, 7 December 2025 (UTC)
- I'd not support removal of TemplateData (I believe it is valuable for those using the so-called 2017 wikitext editor as well as VE), but it would be very nice to find a way to visually hide it. It might be good to have some JS that encloses it in a collapsible box. This, that and the other (talk) 21:42, 7 December 2025 (UTC)
- How many people actually make use of the TemplateData? Do we have evidence for this? It's usually wrong or out-of-date so I still question its utility.Benwing2 (talk)21:50, 7 December 2025 (UTC)Reply
- @Benwing2: Something I thought was quite cool is the ability to use the TemplateData API outside of Wiktionary for making external edits using some other text editor or program; in theory, that would be a good use for it, e.g. providing a Wiktionary language server inside a text editor that is aware of the exact parameters available to the user for each template. Unfortunately, in reality, I doubt this is actually done in practice. I'm not aware of any implementation like this, although someone did do something similar for Wikipedia (though I can't remember where I saw that). If it's not possible to maintain this easily, then it might be best to either find a way to generate the TemplateData from the params object given to Module:parameters and/or from the template parameters used in raw templates, or just not to use TemplateData anymore, like you suggest. The first one would take a lot of effort, so maybe we could just not use TemplateData, as sad as that is. Kiril kovachev (talk・contribs) 23:52, 7 December 2025 (UTC)
- @Kiril kovachev So I tried to find a way to auto-generate TemplateData from a module when I wrote Module:form of doc, which auto-generates the documentation for form-of templates, but at the time I wrote this, it was apparently not possible. This means the only way to auto-generate it is using a bot, which is far from ideal because it's somewhat of a maintenance headache. The least painful way I can think of is for Module:form of doc or similar to provide a bot interface that generates the TemplateData junk and then use a bot to add it to each doc page. We'd need a JS script to hide it, though, because it's big and butt-ugly. Note also that I have a partly-written project to provide a general mechanism to auto-generate doc pages, which could make it much easier to create things like Module:form of doc and would allow one-off templates to have standardized wording for parameters that recur in many templates; this sort of module could also potentially generate TemplateData data.
- OTOH maybe it is in fact possible to auto-generate TemplateData, because there are things like w:Template:Format TemplateData and w:Module:Format TemplateData on Wikipedia that are designed to make TemplateData less awful (the doc page to the latter has a long screed about how bad TemplateData is by default) and might be auto-generating it. I haven't looked enough into this to figure out exactly what it does or how it works. Benwing2 (talk) 00:09, 8 December 2025 (UTC)
- @Benwing2 Do you mean that there's a technical reason why generating TemplateData using a module is impossible, such as, I imagine, that the TemplateData needs to be a literal HTML-like tag, not transcluded via a template? If so, it might be possible to use that project you mentioned working on to subst: the output of a module, which would then result in the desired TemplateData being output directly, whilst also being easily re-generatable by bot and by hand.
- For now, any user can easily block the TemplateData for themselves using uBlock Origin, with a CSS selector for .mw-templatedata-doc-wrap. I suppose writing a gadget for this would also be really easy, but I don't know whether it varies by skin, so I'm not sure about that... but at any rate, at least that part of the problem should be easy to solve.
- What do you think about the viability of that subst idea? Kiril kovachev (talk・contribs) 00:34, 8 December 2025 (UTC)
- Or perhaps have a subpage with the Lua-generated version, and a script that copies the text from it in the expected format to the documentation page on demand (like the one that generates the CSS from the language modules), or some process that does it automatically from time to time? Chuck Entz (talk) 00:43, 8 December 2025 (UTC)
- I didn't think of the subst: idea and it would work, but it suffers the same problem in that there are > 100 form-of pages, and any time any change is made to the module, we'd need to update all 100+ pages, so it may as well be done by bot (or possibly by a JS script similar to the one that currently updates language name caches when changes are made to language data).Benwing2 (talk)02:02, 8 December 2025 (UTC)Reply
- But yes, it seems that templatedata is parsed before expanding templates or modules, so if a template or module generates the <templatedata> tag, it doesn't properly display as a TemplateData structure, but displays raw. Maybe this doesn't actually matter for the purposes of VisualEditor, but I suspect it does. Benwing2 (talk) 02:04, 8 December 2025 (UTC)
- The workaround would be to provide some sort of function in Lua to insert TemplateData, but it doesn't exist AFAIK. Benwing2 (talk) 02:05, 8 December 2025 (UTC)
- @Benwing2 These edits were made using the 2017 wikitext editor; it would be interesting to know whether these users find TemplateData useful.
- If TemplateData is kept, I'd support only providing it for the most common templates, and even then only the most commonly used parameters (like 1=, 2=, 3= and 4= of {{m}}), unless/until some improvements are made on the MediaWiki side. That avoids issues of keeping the TemplateData up to date, as these very common parameters are unlikely to ever change. This, that and the other (talk) 07:14, 8 December 2025 (UTC)
- OK, that seems a reasonable compromise. Benwing2 (talk) 08:10, 8 December 2025 (UTC)
- The idea itself is good IMO, but perhaps the implementation not so much. It would only work if it were the only place where parameters are documented; then we could have human- and machine-readable information about the templates. The machine-readable part could help with input validation or after-the-fact validation, for example. Regarding usage stats, I'm also curious; we'd have to look at revisions and filter for the tags added by the visual editor? Jberkel 23:10, 7 December 2025 (UTC)
- The biggest issue is that it's impossible (AFAICT) to auto-generate the templatedata using a template or module. This makes it a major pain in the ass to keep the main documentation and templatedata in sync. I ran into this issue with Module:form of doc, trying to make it auto-generate the corresponding templatedata. It's typical of the horrendous designs that have come out of incompetent programmers at MediaWiki. Benwing2 (talk) 23:15, 7 December 2025 (UTC)
Page up for deletion, debate ended five to one year ago.
Category:Reference templates should be deleted, per this discussion: https://en.wiktionary.org/wiki/Wiktionary:Category_and_label_treatment_requests#h-Category:Reference_templates-2017-2017-02-23T15:24:00.000Z —LeastConcern 13:14, 8 December 2025 (UTC)
- I should add that Wiktionary:Entry layout#References led me to the above category, so I guess that should be edited as well. —LeastConcern 13:15, 8 December 2025 (UTC)
- Maybe it should link to Category:Citation templates instead. —LeastConcern 13:43, 8 December 2025 (UTC)
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Weekly highlight
- Anybody who wishes to secure their user account can now use two-factor authentication (2FA). This is available to all registered users of all Wikimedia projects. This is part of the Account Security initiative. Later, 2FA will be required for all users who can take security- or privacy-sensitive actions.
Updates for editors
- Following last week's deployments, the Add a link feature, which allows editors to add suggested links during editing, will be available to an additional 33 Wikipedias starting on 9 December. This expansion is possible thanks to the new prediction model that now supports all languages, including those that were previously not covered. While the feature has been available on most Wikipedias for some time, this rollout brings us closer to using the improved model everywhere. If you have any questions or would like more details, please contact Trizek (WMF).
- Last week, the Search Platform team added transliterated as-you-type search suggestions to Georgian wikis. If there are only a few regular search suggestions, then queries in Latin or Cyrillic script are now rewritten into Georgian script to look for more matches. For example, searching for either bedniereba or бедниереба will now suggest the existing article about ბედნიერება ("happiness"). You can recommend other languages where transliterated suggestions would be useful on Phabricator for future development.
- Later this week, a controlled experiment will begin for editors on the 100 largest Wikipedias who are editing a section in the mobile web visual editor. 50% of these editors will notice a new "Edit full page" button that will enable them to expand their editing session to the whole page. This feature is intended to make it easier for people on mobile web to edit any article section, regardless of which section-edit icon they tapped to begin. The experiment will last ~4 weeks. You can find more details about the project.
- Later this week, the Reader Growth team will launch a mobile web experiment to expand all article sections by default (currently they are collapsed by default) and pin the section header the user is currently reading to the top of the page. The experiment will affect 10% of users on the Arabic, Chinese, French, Indonesian, and Vietnamese Wikipedias.[4]
- The Wikipedia Year in Review 2025, a feature in the Wikipedia mobile apps (iOS and Android) that provides users with a personalised summary of their engagement with Wikipedia over the year, is now available on the iOS and Android apps. This edition includes expanded personalised insights, improved reading highlights, new donor messaging, and updated designs. Open the app to view your Year in Review and explore your reading journey from 2025.
- A recent software bug caused edits made with VisualEditor to make unintended changes to wikitext, including removing whitespace and replacing spaces with underscores in wikilinks inside citations. This was partially fixed last week, and further fixes are in progress. Editors who used VisualEditor between November 28 and December 2 should review their edits for unexpected modifications.[5]
View all 23 community-submitted tasks that were resolved last week. For example, the incorrect handling of URLs copied from the address bar by Microsoft Edge users has been resolved.[6]
Updates for technical contributors
- Starting this week, users of the "Improved Syntax Highlighting" beta feature will have CodeMirror as the editor for Lua, JavaScript, CSS, JSON and Vue content models, instead of CodeEditor. With this, the linters will be upgraded. This is part of a larger effort to eventually replace CodeEditor and provide a consistent code editing experience.[7]
- Developers are encouraged to take the 2025 Developer Satisfaction Survey, which remains open until 5 January 2026. If you build software for the Wikimedia ecosystem and would like to share your experiences or feedback, your participation is greatly appreciated.[8]
- There is no new MediaWiki version this week.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 17:45, 8 December 2025 (UTC)
Category:Japanese Han characters
Should we delete Category:Japanese Han characters? It has been superseded by Category:Japanese kanji, and nothing besides user pages links to it. Kiril kovachev (talk・contribs) 03:10, 11 December 2025 (UTC)
- Yes, I just deleted it.Benwing2 (talk)19:56, 11 December 2025 (UTC)Reply
The display box generated by {{attention}} is one of the ugliest things I have ever seen at Wiktionary. Why does it have to be so ugly, with the color clash and the traffic-warning-style triangle? (I know how to turn it off.)
Its ugliness is compounded when {{tea room}} seemingly generates it automagically, even though {{tea room}} has its own large display box. (See joint#Etymology 2 for an example if you have the "catch my attention" gadget turned on.) Why the duplication of attention-drawing displays? DCDuring (talk) 23:18, 11 December 2025 (UTC)
uncaught invalid IPA
Why are edits like this and this not categorized as using invalid IPA? Is the check not language-specific? Can we make it language-specific? (At least some of the categories are language-specific, e.g. Category:English IPA pronunciations with invalid separators.) Or are there varieties of English which do have the mid floating tone? If so, could the check be made to take into account the declared accent, too? GenAm does not have the mid floating tone, so the macron should register as not being a valid lang=en, a=GA, /phonemic/ input, no? It is clearly an enPR-like transcription that the user has incorrectly listed as IPA. (PS: if anyone would like to also make the enPR template flag invalid input, that'd be great.) - -sche (discuss) 07:16, 13 December 2025 (UTC)
- @-sche Is the only thing needed for the enPR flagging to check whether the input uses non-enPR characters and just categorize if so?Kiril kovachev (talk・contribs)00:34, 14 December 2025 (UTC)Reply
- AFAICT, yes. (If anyone has extra energy, making it display a Lua-error-like warning to the user only when previewing the entry, like {{IPA}} does, would be nice.) - -sche (discuss) 06:24, 14 December 2025 (UTC)
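For reference, a minimal sketch of the kind of check being discussed here, assuming a simple whitelist approach; the character set, function names and category name below are hypothetical placeholders, not taken from the actual {{enPR}} or Module:IPA code:
<syntaxhighlight lang="lua">
-- Illustrative only: flag enPR input containing characters outside a
-- (hypothetical, abridged) allowed set, and emit a tracking category plus a
-- preview-only warning.
local function find_invalid_enpr(text)
	-- ASCII letters, whitespace and basic punctuation plus a few enPR vowel
	-- symbols; the real inventory would have to come from
	-- Appendix:English pronunciation.
	local leftover = mw.ustring.gsub(text, "[%w%s%-'(),./āăäēĕīĭōŏôŭûə]", "")
	if leftover ~= "" then
		return leftover -- the offending characters
	end
end

local function format_enpr_warning(frame, bad_chars)
	if not bad_chars then
		return ""
	end
	-- Hypothetical category name.
	local out = "[[Category:English enPR pronunciations with invalid characters]]"
	-- {{REVISIONID}} expands to the empty string during preview, so this is a
	-- common way to show a message only to the previewing editor.
	if frame:preprocess("{{REVISIONID}}") == "" then
		out = out .. '<span class="error">Invalid enPR characters: '
			.. mw.text.nowiki(bad_chars) .. "</span>"
	end
	return out
end
</syntaxhighlight>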
- @-sche: it may be obviously wrong to you or me, but computers are inherently ignorant and stupid. Making them smart takes work and resources, especially with something as complex as English pronunciation. Factor in all the regional variation, and it becomes a challenge of epic proportions, with isoglosses criss-crossing all over the place.
- Plus, many of the vowels in your examples can be found in American English: day, sane, uh, odd, un / duh, fee, bruh, law, un. Some consonant clusters are obviously wrong for English, like "tr" and "sh", but many of them are context-dependent. Context checking is anything but simple.
- It looks to me like the best we can manage is to catch a few obvious "gotchas" like "tr" and "sh" for English. The rest will have to wait until the groundwork is done for an English IPA module, and will have to be disabled for the resource-hogs like many of the 1-3 character pages.Chuck Entz (talk)01:37, 14 December 2025 (UTC)Reply
- There are words where "a" and "e" (without macrons) occur in diphthongs, but is there any English pronunciation (and if so, is there any GenAm pronunciation) which is correctly notated, in IPA, using "ā", or "ē" (with macrons)? AFAICT, the mere presence of "ā" or "ē" should be treated as an error, when lang=en, or at least when a=GA.- -sche(discuss)06:24, 14 December 2025 (UTC)Reply
Some single-letter entries other than a (such as o and u) are also pretty uncomfortably long. Their load time (including se's) is too high; they load sluggishly. Even though the wiki parser's limits are not exceeded there as easily and as often, I still think such optimization will do them good. By the way, I have edited the template's docs to that end. Bytekast (talk) 18:51, 15 December 2025 (UTC)
- Maybe this can be fully automated by imposing a limit of 100 L2s per page? Looking at Category:Pages with entries, there are only a handful of pages that large. Ioaxxere (talk) 19:25, 15 December 2025 (UTC)
- I like that idea! Some bot ought to be implemented and run over the pages in the non-empty subcategories there after Category:Pages with 100 entries (not including it) to mammothize them (se has 109!). We could use an existing bot, but I think a specialized one for that would be better. (Name ideas for the bot: mammobot, MannyEntries [a wordplay between Manny and the phrase "many entries"], Woollybot... pick your favorite.)
- Moreover, I think that someday the MediaWiki software should be tweaked and adapted, at first for Wiktionary and Wikipedia, with some optimizations for huge pages (which Wikipedia also has, in a different way, with more text bulk).
- Perhaps with lazy loading of entries (or, more generally, sections), whereby they would be parsed and loaded only after the user scrolls down to them; or something similar to what we are doing here (with subpages), but implemented in the background in such a way that it is not visible to either the user or the editor, like a server-side abstraction that uses automatic anonymous subpages while making it all seem like one page.
- This part of the discussion should be brought over to Phabricator.
- Bytekast (talk)20:19, 15 December 2025 (UTC)Reply
- I'm documenting the mammothisation process here for information. It can only be carried out by template editors or admins.
- Suppose you want to mammothise page X.
- Copy the content that would go to the "X/languages A to L" subpage to the creation form at mammoth page test/languages A to L and press Preview.
- Look for instances where the entry name is being detected/displayed as "mammoth page test/languages A to L" and repair the applicable templates or modules to use the so-called logical pagename (by substituting {{safe page name}} in place of {{PAGENAME}}, for instance).
- Where the entry name is detected as "mammoth page test" with no subpage, that is the correct behaviour and no change is needed.
- There's no need to publish the mammoth page test/languages A to L page; preview mode should be sufficient.
- Add X to the mammoth_pages list at Module:links/data. This requires template editor rights.
- Create the subpages of X by moving the relevant L2s. Only the Translingual and English L2s remain on X itself.
- Add {{mammoth page}} at the top and {{mammoth page footer}} at the bottom of X and all subpages.
- Copy watchers from X to the new subpages. Do this by moving Talk:X (only) to Talk:X/languages A to L, then move back to Talk:X, then move to Talk:X/languages M to Z, then move back to Talk:X. Template editors (and other non-admins) need to complete the moves in exactly this sequence! (Admins can move Talk:X/languages A to L directly to Talk:X/languages M to Z.)
- Step 1 is not very easily bottable, but eventually it won't be required, as @Benwing2 is working on eliminating the problem it solves.This, that and the other (talk)23:10, 15 December 2025 (UTC)Reply
- @Bytekast @Ioaxxere @Benwing2 if no objections, I'm going to mammothise o next, as it is currently in CAT:E. @Chuck Entz may have insight on other priority candidates for this treatment. This, that and the other (talk) 23:12, 15 December 2025 (UTC)
- I would hold off a bit; I'm currently in the process of writing a post about some more subtle problems I found in the mammoth code, which IMO we should address before we "mammothize" more pages.Benwing2 (talk)23:15, 15 December 2025 (UTC)Reply
- Sure, no worries.This, that and the other (talk)23:29, 15 December 2025 (UTC)Reply
- @This, that and the other: I think we should get rid of the table of contents box at the bottom, because it bloats the already-long page. Also, on mobile it's weird because the box ends up getting tucked inside the last L2, so someone just scrolling down wouldn't see it anyway. It would be interesting to consider a JS-based solution for these mammoth pages, where the L2 of your choice would be dynamically fetched and displayed.Ioaxxere (talk)01:15, 16 December 2025 (UTC)Reply
- @This, that and the other I completely agree. I didn't even realize the table of contents is duplicated at the bottom, and I think it's quite unnecessary as well as (currently) doubling the number of expensive calls being done. (This will be fixed but the page will still end up being parsed twice unless we do something clever with loadData().)Benwing2 (talk)01:56, 16 December 2025 (UTC)Reply
- @Benwing2 @Ioaxxere I put a rationale for this at the {{mammoth page}} documentation. Yes, I agree the second box needs to have its appearance fixed somehow on mobile, and on desktop it needs a <hr> above it. But I still think it's important to help users navigate their way around a page that departs from Wiktionary's norms in a potentially confusing manner. You could argue that the page title "a/languages A to L" or whatever is enough of a clue, but if you follow a # link, you won't necessarily see that. This, that and the other (talk) 05:20, 16 December 2025 (UTC)
- In that case, if you really think users will be confused by the split pages and wonder where the remaining languages are, I would recommend one of the following:
- Put a simple boldfaced line at the bottom of A-L saying, "For languages beginning with M through Z, click here" which takes you to the M through Z page table of contents at the top; or
- have the TOC at the bottom of A-L contain only the M-Z languages, and no TOC at the bottom of M-Z.
- The full TOC in both cases seems overkill.Benwing2 (talk)05:36, 16 December 2025 (UTC)Reply
- I like idea 1. Good thinking. I will do it.This, that and the other (talk)05:57, 16 December 2025 (UTC)Reply
- @This, that and the other: mostly it's i, with e, o and A also fairly frequent (u sometimes). I haven't seen any Han character pages since a workaround was added for the big ones. I notice that an is in Category:Pages with too many expensive parser function calls. In general the "Pages where […]" categories have some interesting things that need fixing, though some are side issues (Template:cop-conj's documentation says: "This template handles all Coptic verb conjugations from all available dialects". Coptic is an agglutinative language, and the tables are huge even before they're repeated for each of the 5 dialects implemented; but I digress...). Chuck Entz (talk) 05:10, 16 December 2025 (UTC)
- @Chuck Entz I am going to look into the expensive parser function calls in an next. Benwing2 (talk) 05:16, 16 December 2025 (UTC)
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Updates for editors
View all 18 community-submitted tasks that were resolved last week. For example, one of the fixes addressed an issue for temporary accounts adding an external URL, which triggered an hCaptcha request in more cases than intended and did not display the required popup on the first attempt to publish the edit.[9]
Updates for technical contributors
- To improve database and site performance, external links to Wikimedia projects will no longer be stored in the database. This means they will not be searchable in Special:LinkSearch, will not be checked by the Spam Blacklist or AbuseFilter as new links, and will not be in the externallinks table on database replicas. In the future this may be extended to other highly-linked trusted websites on a per-wiki basis, such as Creative Commons links on Wikimedia Commons.[10]
Detailed code updates later this week: MediaWiki
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 19:03, 15 December 2025 (UTC)
mammoth page issues
I'm starting a new GP thread to track continuing issues related to the "mammoth page" split. Currently the discussion is happening in Wiktionary:Grease_pit/2025/October#Long_entry_failure:_"The_time_allocated_for_running_scripts_has_expired.", which is from two months ago and thus no longer transcluded into WT:GP. @Polomo reported some weirdness in the character info box on a, in that, for example, the Unicode name shows up as <reserved-0061> instead of its correct name, and the next character shows up as U+3400 instead of as U+0062 "b". I went down a rabbit hole looking into this, and when I came out I found out some interesting things. The following is structured similarly to the "five whys" component of an Amazon-style COE ("Correction of Errors"); see https://aws.amazon.com/blogs/mt/why-you-should-develop-a-correction-of-error-coe/.
- Why do we see <reserved-0061>? At its root, it's because, as of July, instead of having our own Unicode data modules, @Surjection changed things to call out to Commons to fetch the data, including things such as the Unicode name. This counts as an "expensive" call, and the expensive call limit has been hit on a, so this call fails, which (I'm not yet exactly sure why) causes lookup_name() in Module:Unicode data to return <reserved-0061>.
- Why does the next character show as U+3400? The code in Module:character info to retrieve the next and previous characters handles the failure in (1) badly (IMO), and gets in a loop looking up and down the codepoints, one by one, for a non-reserved character until it either hits 0 (when going down) or hits U+3400, which for some reason returns a non-reserved name (maybe because we still have our own data on CJK characters).
- Why do we hit the expensive call limit? It's because of the redirectTarget check in Module:title/redirectTarget, which happens during template parsing. This is something that is due to @Theknightwho, and TKW insists this redirectTarget check is necessary even though it seems to cause lots of issues.
- Why is Module:title/redirectTarget getting invoked repeatedly? I put some logging in, which shows that all the templates on a/languages A to L and a/languages M to Z are being run through Module:title/redirectTarget. This is happening because Module:mammoth page line 33 calls process_page() in Module:headword/page on each subpage in order to get the list of L2s on that page. This parses the entire page, unnecessarily running every template through Module:title/redirectTarget. Somewhere in the O's (I think), the expensive call limit is hit.
My conclusions:
- We can fix the immediate issue by having a version of process_page() that only returns the L2s and doesn't parse the whole page (a rough sketch of the idea appears just after this comment). I've already previously hacked process_page() with a `no_fetch_content` flag for use on documentation pages that have many examples with the |pagename= param set, which causes each such page to be parsed and makes the documentation pages take a long time to process (the actual slowdown comes mostly from line 815, frame:callParserFunction("DEFAULTSORT", data.pagename_defaultsort), which is disabled when no_fetch_content is set). We would need another flag here.
- More generally, we need to rethink the promiscuous use of Module:title/redirectTarget every time a page is parsed. For many uses of the template parser, checking redirects isn't needed, and having it enabled by default will almost certainly lead to further issues. I would suggest that the template parser code have a flag to enable redirect target processing, which is not set by default; or at least, if that is deemed unacceptable, a flag to turn off redirect processing.
- We also need to see if there's another way to mitigate the "expensive" redirectTarget checking. For example, if there's some way to, say, use the slower but non-"expensive" check after processing a certain number of redirect checks, that would be ideal. Maybe MediaWiki prevents us from doing this, but we could at least have the template parser itself stop redirectTarget checking the "expensive" way after 100 templates processed or so.
- We also need to clean up Module:character info and Module:Unicode data to more gracefully handle failures in retrieving Commons data.
Pinging the "usual suspects": @Surjection @Theknightwho @This, that and the other @Ioaxxere @Fenakhay @Erutuon.Benwing2 (talk)23:31, 15 December 2025 (UTC)Reply
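As a rough illustration of the first conclusion above (not the actual Module:headword/page code, which goes through the template parser), the L2 names alone could in principle be pulled out with a plain pattern scan over the raw wikitext:
<syntaxhighlight lang="lua">
-- Minimal sketch: return the L2 (language) section names of a page without
-- expanding any templates. get_l2_sections is a hypothetical helper, not an
-- existing function in Module:headword/page.
local function get_l2_sections(pagename)
	local title = mw.title.new(pagename)
	local content = title and title:getContent() or ""
	local l2s = {}
	-- Pad with newlines so headings at the very start or end still match;
	-- "==Foo==" matches but "===Foo===" (an L3) does not.
	for heading in ("\n" .. content .. "\n"):gmatch("\n==([^=\n][^\n]-)==[ \t]*\n") do
		table.insert(l2s, mw.text.trim(heading))
	end
	return l2s
end
</syntaxhighlight>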
- It looks like process_page doesn't just fetch the L2 list but has dozens of other fields, including one that checks for the presence of certain templates for the sake of some kind of categorization (hence the need for redirect checking). I think some separation of concerns is necessary; otherwise it's like buying an entire sandwich and then only eating the lettuce. Ioaxxere (talk) 01:08, 16 December 2025 (UTC)
- Indeed, I am in the process of doing that now. Benwing2 (talk) 01:19, 16 December 2025 (UTC)
- @This, that and the other Hi. I am rewriting Module:mammoth page and noticed some coding issues you should beware of (a rough sketch pulling several of these together follows at the end of this comment):
- Snake case should be used throughout for local variables and function names rather than camel case. This is in keeping with Wiktionary coding style. There are a few modules that don't follow this style in their function names for historical reasons, but we should not create new code or modules using camel case.
- Strings should use double quotes, not single quotes, except when the text of the string contains a single quote.
- Module references should begin with m_.
- Some modules (e.g. Module:links/data) should be loaded with mw.loadData() rather than require(), which ensures they get loaded only once per page instead of once per template invocation.
- You should generally avoid things like string.find(foo, bar) in favor of foo:find(bar).
- Be very careful using the Unicode functions in mw.ustring.*, and even more careful if you mix byte-level functions like string.find with character-level functions like mw.ustring.*. In this case you were retrieving the byte position of a slash using string.find and then passing it to mw.ustring.sub, which expects character positions. This will fail if the portion before the slash has any non-ASCII characters in it.
- More generally, consider using foo:match(bar) to pull out portions of a string instead of finding the index of a delimiter and then using substring retrieval.
- pairs() does not return its items in any particular order. You have a loop using pairs() that builds up a list; without sorting the resulting list, there is no guarantee that the items in the list will follow the order of the data as found in the source file; in fact, with more than 2-3 items, it almost certainly will not. In this case, it's just blind luck that the order of the two items in the table is coming out right. The same issue happens more seriously on line 1565 of Module:languages, which has a similar loop; if the items happen to be iterated in reverse order, everything will end up on the L-Z page. Instead of a table, you should probably use a list of two-item tuples.
- process_page() does a lot of unnecessary things for the purposes of finding L2 sections, and is generally intended to be run only for the current page (e.g. it calls a parser function to set the default sort key, which affects sorting on the current page). Instead, we need to be calling the template parser directly and telling it to iterate only over L2 sections (I am rewriting the code to do this).
- The current implementation of mammoth pages assumes all mammoth pages are split the same way. Since some pages have very different characteristics from others (e.g. one-character Han pages may hit limits with only 4 languages on them, and some pages will end up having many more languages than others), I think we need to redo it to allow for different splits for different pages. I would suggest a level of indirection where you have types of mammoth pages and splits accordingly; so you have e.g. "CJKV pages" that may put Chinese on a page by itself, "two-way pages" that split A-L and M-Z, "three-way pages" that split e.g. A-H, I-P, Q-Z, etc.
- The current implementation in Module:languages to determine which subpage to place a given language on needs some work for things like the 'Are'are language, which sorts under A, as if the apostrophe isn't there. Also, your Lua pattern that looks like "^[A-LÀÁÄ]" seems fragile in that it depends on there not being any languages with other accents on them, which could change at any time. I think the correct thing is rather to canonicalize the language name before lookup. This is done in a complex fashion on line 840 of Module:headword/page: weight=toNFC(ugsub(ugsub(toNFD(L2),"["..comb_chars_all.."'\"ʻʼ]+",""),"[%s%-]+"," ")), which essentially decomposes the language name; removes all combining characters (i.e. diacritics) as well as apostrophes, double quotes and Unicode quotes; converts whitespace (including Unicode whitespace) and hyphens to a generic space character; and re-composes the result. This line should be exported as its own function so it can be called elsewhere, such as by the mammoth splitting code that's currently part of makeEntryName().
- Benwing2 (talk)03:16, 16 December 2025 (UTC)Reply
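As a rough illustration only (the data and function names below are hypothetical and do not reflect the actual contents of Module:links/data or Module:mammoth page), a fragment pulling several of the above points together might look like this:
<syntaxhighlight lang="lua">
-- Module reference prefixed with m_ and loaded via mw.loadData(), so the data
-- table is built once per page rather than once per #invoke. (Not used
-- further in this fragment.)
local m_links_data = mw.loadData("Module:links/data")

-- snake_case names, double-quoted strings, method-call syntax, and
-- foo:match() instead of manual index arithmetic. Because the only delimiter
-- here is an ASCII "/", plain string methods are safe even if the rest of the
-- pagename contains UTF-8 characters.
local function split_mammoth_pagename(full_pagename)
	-- e.g. "a/languages M to Z" -> "a", "languages M to Z"
	local base, subpage = full_pagename:match("^([^/]+)/(.+)$")
	return base or full_pagename, subpage
end

-- Iterate a list of two-item tuples with ipairs(), whose order is guaranteed,
-- rather than a key-value table with pairs(), whose order is not. The split
-- definitions are purely illustrative, and the language name is assumed to
-- have already been canonicalized (diacritics and apostrophes stripped).
local split_defs = {
	{ "languages A to L", "^[A-L]" },
	{ "languages M to Z", "^[M-Z]" },
}

local function choose_subpage(canonical_language_name)
	for _, def in ipairs(split_defs) do
		local subpage, pattern = def[1], def[2]
		if canonical_language_name:find(pattern) then
			return subpage
		end
	end
end
</syntaxhighlight>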
- @Benwing2 thanks for apprising me of these points, especially point 8. Should items like points 1, 2 and 4 not be written down at Wiktionary:Coding conventions#Lua?
- As for point 10, you have probably seen from my code comments that one splitting scheme for all mammoth pages, rather than different splits for each mammoth page, was intentional (to make detection of mammoth pages more straightforward and to make life easier for bot and script users). My assumption, admittedly without much effort put into verifying it, was that the Chinese section alone would push CJK entries over the line, with other sections being relatively small by comparison, so per-language splitting for CJK entries would be futile. If I'm wrong about this, then I agree some change is needed, as C, J and K all come before L. I'd see one split for CJK pages and a second split for non-CJK pages as a logical progression from the status quo, but if you want to go the whole hog, fine by me.
- As for point 6, I was aware of this; I used the non-Unicode function in the critical path for speed, as this code runs for every single link template invocation (this was also the rationale for point 7). Since I anticipated mammoth pages to be Latin only (as above), I felt this would be a non-issue, and if it was to become an issue, the breakage would be obvious.
- As for point 11, I went for the simplest option, but yes, it is fragile. I'm glad to defer to your expertise here and elsewhere, and I apologise for leaving a trail of dodgy Lua in my wake!This, that and the other (talk)05:16, 16 December 2025 (UTC)Reply
- @This, that and the other I inserted item #2 later on, so your point numbers may be off by one; could I request you to update them? BTW as for the one about Unicode vs. non-Unicode functions, you could have used the non-Unicode function everywhere; that would have been correct in all circumstances. In general, there's no need to use Unicode functions when dealing with Unicode text except under certain circumstances (e.g. if you are matching an individual Unicode character or a character range containing Unicode characters). In this case, the slash is ASCII so if you're just fetching the stuff on either side, it doesn't matter if there are Unicode (properly UTF-8) chars among that stuff. As for the splitting scheme, we can leave it as-is for now but I suspect we will need to revisit it later.Benwing2 (talk)05:31, 16 December 2025 (UTC)Reply
- @Benwing2 numbering updated. Yes, now that you mention it, I wonder why I used the Unicode version at all. I'm sure there was some reason, even if it was a wrong one!This, that and the other (talk)05:36, 16 December 2025 (UTC)Reply
- @This, that and the other I'll add 1, 2 and 4 to WT:Coding conventions#Lua. @Ioaxxere I just pushed my rewritten Module:mammoth page and changes to Module:languages and Module:links/data to use a list with ipairs() instead of a table with pairs(). The expensive function call count is now 145 on a and 405 on a/languages M to Z. I think I can reduce it a lot by not doing redirect checks at all during the process_page() run over the page itself; the only need for this is to check for redirects pointing to the single template {{reconstructed}}, and we can just hardcode the one redirect {{reconstruction}} (which has not changed in a long time). Benwing2 (talk) 06:08, 16 December 2025 (UTC)
Deprecation of language-specific templates
there are many old language-specific templates for linking to entries using other languages. as the Wiktionary codebase has grown more complex over the years, these have had their functionality incorporated into the main {{l}} (link) and {{m}} (mention) templates:
two other templates ({{jra-l}} and {{ryu-l}}) that were completely unused, I simply nominated for speedy deletion. besides these, whose functionality is already implemented, there are others that require little modification.
the remaining template {{fa-l}} (for Persian) is a bit weirder for me to understand; I'm leaving its consideration up to other editors.
for these templates, my proposal is that transclusions be updated to use the singular {{l}} and {{m}} templates (or other link templates depending on context) and that the templates then be deprecated. the details will be tracked here. Juwan (talk) 01:55, 16 December 2025 (UTC)
Booglinee - a real entry
[edit]- Discussion moved toWiktionary:Requests for deletion/English#Booglinee.
request to stop translation lists from expanding automatically
like when loading or refreshing a "translations" section. Brawlio (talk) 04:38, 16 December 2025 (UTC)
- @Brawlio: there's a link on the menu at the right side of the screen to hide translations by default. You'll need to have cookies for the website turned on, I think. —Sgconlaw (talk) 04:49, 16 December 2025 (UTC)
- @Sgconlaw: oh interesting, I have it set to hide translations. The problem is that the website temporarily sets itself to show translations when you load or refresh the webpage when "translations" is in the browser's address bar. Directly loading a translations section of a webpage happens when you click on the in-line "see also" hyperlinks on a translations list gloss bar. Refreshing a webpage with "translations" in the URL happens after clicking on a translations section under the table of contents, or after editing a translations section and then hitting refresh. Brawlio (talk) 05:30, 16 December 2025 (UTC)
"alt + z" shortcut conflict
there appears to be a shortcut conflict wherein this shortcut is meant to undo the last change when using the translations template, but instead takes you to the website's homepage. Brawlio (talk) 04:45, 16 December 2025 (UTC)
- @Brawlio: this sounds like something set on your own computer (maybe the browser settings?). I don’t experience this issue. —Sgconlaw (talk)04:48, 16 December 2025 (UTC)Reply