| This Lua module is used on approximately 18,300,000 pages, or roughly 28% of all pages. To avoid major disruption and server load, any changes should be tested in the module's /sandbox or /testcases subpages, or in your own module sandbox. The tested changes can be added to this page in a single edit. Consider discussing changes on the talk page before implementing them. |
| This module can only be edited by administrators because it is transcluded onto one or more cascade-protected pages. |
This module provides functions that help with the complex edge cases faced by modules such as Module:Template parameter value, which need to process the raw wikitext of a page reliably while respecting nowiki tags and similar content. This module is designed to be called by other modules, and does not support being invoked directly.
| This module is rated as ready for general use. It has reached a mature state, is considered relatively stable and bug-free, and may be used wherever appropriate. It can be mentioned on help pages and other Wikipedia resources as an option for new users. To minimise server load and avoid disruptive output, improvements should be developed through sandbox testing rather than repeated trial-and-error editing. |
| This module is currently protected from editing. See the protection policy and protection log for more details. Please discuss any changes on the talk page; you may submit an edit request to ask an administrator to make an edit if it is uncontroversial or supported by consensus. You may also request that this page be unprotected. |
PrepareText(text, keepComments) will run any content within certain tags that normally disable processing (<nowiki>, <pre>, <syntaxhighlight>, <source>, <math>) through mw.text.nowiki and remove HTML comments. This allows tricky syntax to be parsed through more basic means, such as %b{}, by other modules without worrying about edge cases.
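As a rough illustration of why this matters, the sketch below shows a %b{} match ending too early when a stray }} sits inside a <nowiki> tag, and covering the whole template once that content is escaped. It is plain Lua so that it can run outside Scribunto. The helpers escape_basic and prepare_sketch are hypothetical stand-ins and not part of this module: they only handle a bare <nowiki>...</nowiki> pair and three characters, whereas PrepareText relies on mw.text.nowiki and covers many more cases.

```lua
-- Hypothetical stand-in for the escaping done via mw.text.nowiki:
-- only braces and pipes are encoded, which is enough for this sketch.
local function escape_basic(s)
    return (s:gsub("[{}|]", function(c)
        return "&#" .. string.byte(c) .. ";"
    end))
end

-- Hypothetical, heavily simplified version of the idea behind PrepareText:
-- escape the content of plain <nowiki>...</nowiki> pairs only.
local function prepare_sketch(text)
    return (text:gsub("(<nowiki>)(.-)(</nowiki>)", function(open, content, close)
        return open .. escape_basic(content) .. close
    end))
end

local raw = "{{Text|<nowiki>}}</nowiki>|tail}} after"

-- The naive balanced match stops at the }} inside the nowiki tag.
local naive = raw:match("%b{}")
-- After escaping, the balanced match covers the whole template.
local prepared = prepare_sketch(raw):match("%b{}")

print(naive)     -- {{Text|<nowiki>}}
print(prepared)  -- {{Text|<nowiki>&#125;&#125;</nowiki>|tail}}
```

The escaped form still contains the entities, which is why a caller may want to decode the text again before returning it.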
If the second parameter, keepComments, is set to true, the content of HTML comments will be passed through mw.text.nowiki instead of being removed entirely.
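The two comment modes can be sketched in plain Lua as follows. This is an illustration under stated assumptions rather than the module's code: handle_comments and escape_basic are hypothetical names, escape_basic only approximates mw.text.nowiki for a handful of characters, and only comments are handled here (no-processing tags are left out).

```lua
-- Hypothetical stand-in for mw.text.nowiki; encodes only a few characters.
local function escape_basic(s)
    return (s:gsub("[{}|<>]", function(c)
        return "&#" .. string.byte(c) .. ";"
    end))
end

-- Hypothetical helper: remove HTML comments, or keep them with escaped content.
local function handle_comments(text, keepComments)
    local out, pos = {}, 1
    while true do
        local open = string.find(text, "<!--", pos, true)
        if not open then
            out[#out + 1] = string.sub(text, pos)
            break
        end
        out[#out + 1] = string.sub(text, pos, open - 1)
        local close = string.find(text, "-->", open + 4, true)
        local content = string.sub(text, open + 4, close and close - 1 or -1)
        if keepComments then
            out[#out + 1] = "<!--" .. escape_basic(content) .. (close and "-->" or "")
        end
        if not close then
            break -- An unterminated comment consumes the rest of the text
        end
        pos = close + 3
    end
    return table.concat(out)
end

print(handle_comments("A<!--x|y-->B", false)) -- AB
print(handle_comments("A<!--x|y-->B", true))  -- A<!--x&#124;y-->B
```

Keeping the comment but escaping its content means a pipe or brace inside it can no longer be mistaken for template syntax by later parsing.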
Code that calls this function directly and returns part of the processed text should consider running mw.text.decode on the output at the end to undo the escaping. Note that this also decodes any entities that were already encoded in the input outside a no-processing tag. This is unlikely to be a significant issue, but it is worth keeping in mind.
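The side effect of decoding can be seen in the following plain-Lua sketch. The helpers encode_basic and decode_basic are hypothetical stand-ins for mw.text.nowiki and mw.text.decode, limited to decimal numeric entities so that the example runs outside Scribunto; the real functions handle more characters and entity forms.

```lua
-- Hypothetical stand-ins, limited to decimal numeric entities.
local function encode_basic(s)
    return (s:gsub("[{}|]", function(c)
        return "&#" .. string.byte(c) .. ";"
    end))
end
local function decode_basic(s)
    return (s:gsub("&#(%d+);", function(n)
        return string.char(tonumber(n))
    end))
end

-- The input already contains encoded braces outside the nowiki tag.
local input = "&#123;kept&#125; <nowiki>{{x}}</nowiki>"

-- Escape only the nowiki content, as a simplified model of the preparation step.
local prepared = input:gsub("(<nowiki>)(.-)(</nowiki>)", function(open, content, close)
    return open .. encode_basic(content) .. close
end)

-- Decoding restores the nowiki content, but also decodes the entities
-- that were present in the original input.
local decoded = decode_basic(prepared)

print(prepared) -- &#123;kept&#125; <nowiki>&#123;&#123;x&#125;&#125;</nowiki>
print(decoded)  -- {kept} <nowiki>{{x}}</nowiki>
```

The decoded text therefore differs from the original input in the part that was encoded beforehand, which is the caveat described above.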
| This module is rated as alpha. It is ready for limited use and third-party feedback. It may be used on a small number of pages, but should be monitored closely. Suggestions for new features or adjustments to input and output are welcome. |
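ParseTemplates(InputText, dontEscape) will attempt to parse all {{Templates}} on a page, handling multiple factors such as [[Wikilinks]] and {{{Variables}}} among other complex syntax. Due to the complexity of the function, it is relatively slow and should be used carefully. The function returns a list of template objects in order of appearance, which have the following properties:
Text: The raw text of the template
Name: The name of the template
Args: A list of arguments
ArgOrder: A list of the argument keys in the order they appeared
Children: A list of immediate template children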
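A caller might consume the returned list roughly as sketched below. The sample table is written by hand to mirror the fields the module source sets on each template object (Text, Name, Args, ArgOrder, Children); it is an assumption made for illustration and not real output of ParseTemplates. The describe helper is likewise hypothetical. Positional arguments are keyed by strings such as "1", since the source stores them with tostring.

```lua
-- Hand-written sample shaped like the list ParseTemplates returns
-- (illustrative only; not produced by the module).
local sample = {
    {
        Text = "{{Infobox|name=Foo|{{Small|bar}}}}",
        Name = "Infobox",
        Args = { name = "Foo", ["1"] = "{{Small|bar}}" },
        ArgOrder = { "name", "1" },
        Children = {},
    },
    {
        Text = "{{Small|bar}}",
        Name = "Small",
        Args = { ["1"] = "bar" },
        ArgOrder = { "1" },
        Children = {},
    },
}
-- The nested template is an immediate child of the outer one.
sample[1].Children[1] = sample[2]

-- Hypothetical helper: list a template's arguments in their original order.
local function describe(template)
    local parts = {}
    for _, key in ipairs(template.ArgOrder) do
        parts[#parts + 1] = key .. "=" .. template.Args[key]
    end
    return template.Name .. "(" .. table.concat(parts, ", ") .. ")"
end

for _, template in ipairs(sample) do
    print(describe(template), "children: " .. #template.Children)
end
```

Iterating over ArgOrder rather than Args keeps the arguments in source order, which a plain pairs loop over Args would not guarantee.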
If the second parameter, dontEscape, is set to true, the input text won't be run through the PrepareText function.
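The way runs of braces are split between {{templates}} and {{{variables}}} can be sketched in plain Lua as follows. This is a simplified adaptation of the pairing step described in the module's implementation notes, not the module's own function: pair_braces is a hypothetical name, it ignores wikilinks and argument parsing, and it only reports the bounds of each pair it finds.

```lua
-- Hypothetical helper: pair runs of { and }, preferring {{{ }}} over {{ }}
-- whenever both sides still have three braces to spare.
local function pair_braces(text)
    local found, open, pos = {}, {}, 1
    while true do
        local s, _, ch = string.find(text, "([{}])%1", pos)
        if not s then
            break
        end
        local _, e = string.find(text, "^" .. ch .. "+", s) -- End of this run
        if ch == "{" then
            open[#open + 1] = { Start = s, End = e }
        else
            local count, closePos = e - s + 1, s
            while count >= 2 and #open >= 1 do
                local set = table.remove(open)
                local width = (set.End - set.Start + 1 >= 3 and count >= 3) and 3 or 2
                found[#found + 1] = {
                    Type = width == 3 and "Variable" or "Template",
                    Start = set.End - width + 1,
                    End = closePos + width - 1,
                }
                count, closePos = count - width, closePos + width
                set.End = set.End - width
                if set.End - set.Start + 1 >= 2 then
                    open[#open + 1] = set -- Enough braces left for another pair
                end
            end
        end
        pos = e + 1
    end
    return found
end

for _, input in ipairs({ "{{{{A}}}}", "{{{{A}} }}" }) do
    for _, item in ipairs(pair_braces(input)) do
        print(input, item.Type, string.sub(input, item.Start, item.End))
    end
end
```

For {{{{A}}}} this reports a single variable, {{{A}}}, leaving one stray brace on each side. For {{{{A}} }} the closing side never has three braces available, so it reports two templates, the inner {{A}} and the outer pair spanning the whole string.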
require("strict")

--Helper functions
local function startswith(text, subtext)
    return string.sub(text, 1, #subtext) == subtext
end
local function endswith(text, subtext)
    return string.sub(text, -#subtext, -1) == subtext
end
local function allcases(s)
    return s:gsub("%a", function(c)
        return "[" .. c:upper() .. c:lower() .. "]"
    end)
end
local trimcache = {}
local whitespace = {[" "] = 1, ["\n"] = 1, ["\t"] = 1, ["\r"] = 1}
local function cheaptrim(str) --mw.text.trim is surprisingly expensive, so here's an alternative approach
    local quick = trimcache[str]
    if quick then
        return quick
    else
        -- local out = string.gsub(str, "^%s*(.-)%s*$", "%1")
        local lowEnd
        local strlen = #str
        for i = 1, strlen do
            if not whitespace[string.sub(str, i, i)] then
                lowEnd = i
                break
            end
        end
        if not lowEnd then
            trimcache[str] = ""
            return ""
        end
        for i = strlen, 1, -1 do
            if not whitespace[string.sub(str, i, i)] then
                local out = string.sub(str, lowEnd, i)
                trimcache[str] = out
                return out
            end
        end
    end
end

--[=[ Implementation notes
---- NORMAL HTML TAGS ----
Tags are very strict on how they want to start, but loose on how they end.
The start must strictly follow <[tAgNaMe](%s|>) with no room for whitespace in
the tag's name, but may then flow as they want afterwards, making
<div\nclass\n=\n"\nerror\n"\n> valid

There's no sense of escaping < or >
E.g.
 <div class="error\>"> will end at \> despite it being inside a quote
 <div class="<span class="error">error</span>"> will not process the larger div

If a tag has no end, it will consume all text instead of not processing

---- NOPROCESSING TAGS (nowiki, pre, syntaxhighlight, source, etc.) ----
(In most comments, <source> will not be mentioned. This is because it is the
deprecated version of <syntaxhighlight>)

No-Processing tags have some interesting differences to the above rules.
For example, their syntax is a lot stricter. While an opening tag appears to
follow the same set of rules, a closing tag can't have any sort of extra
formatting, period. While </div a/a> is valid, </nowiki a/a> isn't - only
newlines and spaces/tabs are allowed in closing tags.
Note that, even though <pre> tags cause a visual change when the ending tag has
extra formatting, it won't cause the no-processing effects. For some reason, the
format must be strict for that to apply.

Both the content inside the tag pair and the content inside each side of the
pair is not processed. E.g. <nowiki |}}>|}}</nowiki> would have both of the |}}
escaped in practice.

When something in the code is referred to as a "Nowiki Tag", it means a tag
which causes wiki text to not be processed, which includes <nowiki>, <pre>,
and <syntaxhighlight>

Since we only care about these tags, we can ignore the idea of an intercepting
tag preventing processing, and just go straight for the first ending we can find
If there is no ending to find, the tag will NOT consume the rest of the text in
terms of processing behaviour (though <pre> will appear to have an effect).
Even if there is no end of the tag, the content inside the opening half will
still be unprocessed, meaning {{X20|<nowiki }}>}} wouldn't end at the first }}
despite there being no ending to the tag.

Note that there are some tags, like <math>, which also function like <nowiki>
which are included in this as well. Some other tags, like <ref>, have far too
unpredictable behaviour to be handled currently (they'd have to be split and
processed as something separate - it's complicated, but maybe not impossible.)
I suspect that every tag listed in [[Special:Version]] may behave somewhat like
this, but that's far too many cases worth checking for rarely used tags that may
not even have a good reason to contain {{ or }} anyways, so we leave them alone.

---- HTML COMMENTS AND INCLUDEONLY ----
HTML Comments are about as basic as it could get for this
Start at <!--, end at -->, no extra conditions. Simple enough
If a comment has no end, it will eat all text instead of not being processed

includeonly tags function mostly like a regular nowiki tag, with the exception
that the tag will actually consume all future text if not given an ending as
opposed to simply giving up and not changing anything. Due to complications and
the fact that this is far less likely to be present on a page, as well as being
something that may not want to be escaped, includeonly tags are ignored during
our processing
--]=]
local validtags = {nowiki = 1, pre = 1, syntaxhighlight = 1, source = 1, math = 1}
--This function expects the string to start with the tag
local function TestForNowikiTag(text, scanPosition)
    local tagName = (string.match(text, "^<([^\n />]+)", scanPosition) or ""):lower()
    if not validtags[tagName] then
        return nil
    end
    local nextOpener = string.find(text, "<", scanPosition+1) or -1
    local nextCloser = string.find(text, ">", scanPosition+1) or -1
    if nextCloser > -1 and (nextOpener == -1 or nextCloser < nextOpener) then
        local startingTag = string.sub(text, scanPosition, nextCloser)
        --We have our starting tag (E.g. '<pre style="color:red">')
        --Now find our ending...
        if endswith(startingTag, "/>") then --self-closing tag (we are our own ending)
            return {
                Tag = tagName,
                Start = startingTag,
                Content = "", End = "",
                Length = #startingTag
            }
        else
            --Only spaces, tabs and newlines are allowed before the > of a closing tag
            local endingTagStart, endingTagEnd = string.find(text, "</" .. allcases(tagName) .. "[ \t\n]*>", scanPosition)
            if endingTagStart then --Regular tag formation
                local endingTag = string.sub(text, endingTagStart, endingTagEnd)
                local tagContent = string.sub(text, nextCloser+1, endingTagStart-1)
                return {
                    Tag = tagName,
                    Start = startingTag,
                    Content = tagContent,
                    End = endingTag,
                    Length = #startingTag + #tagContent + #endingTag
                }
            else --Content inside still needs escaping (also linter error!)
                return {
                    Tag = tagName,
                    Start = startingTag,
                    Content = "", End = "",
                    Length = #startingTag
                }
            end
        end
    end
    return nil
end
local function TestForComment(text, scanPosition) --Like TestForNowikiTag but for <!-- -->
    if string.match(text, "^<!%-%-", scanPosition) then
        local commentEnd = string.find(text, "-->", scanPosition+4, true)
        if commentEnd then
            return {
                Start = "<!--", End = "-->",
                Content = string.sub(text, scanPosition+4, commentEnd-1),
                Length = commentEnd-scanPosition+3
            }
        else --Consumes all text if not given an ending
            return {
                Start = "<!--", End = "",
                Content = string.sub(text, scanPosition+4),
                Length = #text-scanPosition+1
            }
        end
    end
    return nil
end

--[[ Implementation notes
The goal of this function is to escape all text that wouldn't be parsed if it
was preprocessed (see above implementation notes).

Using keepComments will keep all HTML comments instead of removing them. They
will still be escaped regardless to avoid processing errors
--]]
local function PrepareText(text, keepComments)
    local newtext = {}
    local scanPosition = 1
    while true do
        local NextCheck = string.find(text, "<[NnSsPpMm!]", scanPosition) --Advance to the next potential tag we care about
        if not NextCheck then --Done
            newtext[#newtext+1] = string.sub(text, scanPosition)
            break
        end
        newtext[#newtext+1] = string.sub(text, scanPosition, NextCheck-1)
        scanPosition = NextCheck
        local Comment = TestForComment(text, scanPosition)
        if Comment then
            if keepComments then
                newtext[#newtext+1] = Comment.Start .. mw.text.nowiki(Comment.Content) .. Comment.End
            end
            scanPosition = scanPosition + Comment.Length
        else
            local Tag = TestForNowikiTag(text, scanPosition)
            if Tag then
                local newTagStart = "<" .. mw.text.nowiki(string.sub(Tag.Start, 2, -2)) .. ">"
                local newTagEnd =
                    Tag.End == "" and "" or --Respect no tag ending
                    "</" .. mw.text.nowiki(string.sub(Tag.End, 3, -2)) .. ">"
                local newContent = mw.text.nowiki(Tag.Content)
                newtext[#newtext+1] = newTagStart .. newContent .. newTagEnd
                scanPosition = scanPosition + Tag.Length
            else --Nothing special, move on...
                newtext[#newtext+1] = string.sub(text, scanPosition, scanPosition)
                scanPosition = scanPosition + 1
            end
        end
    end
    return table.concat(newtext, "")
end

--[=[ Implementation notes
This function is an alternative to Transcluder's getParameters which considers
the potential for a singular { or } or other odd syntax that %b doesn't like to
be in a parameter's value.

When handling the difference between {{ and {{{, mediawiki will attempt to match
as many sequences of {{{ as possible before matching a {{
E.g.
 {{{{A}}}} -> { {{{A}}} }
 {{{{{{{{Text|A}}}}}}}} -> {{ {{{ {{{Text|A}}} }}} }}
If there aren't enough triple braces on both sides, the parser will compromise
for a template interpretation.
E.g.
 {{{{A}} }} -> {{ {{ A }} }}

While there are technically concerns about things such as wikilinks breaking
template processing (E.g. {{[[}}]]}} doesn't stop at the first }}), it shouldn't
be our job to process inputs perfectly when the input has garbage ({ / } isn't
legal in titles anyways, so if something's unmatched in a wikilink, it's
guaranteed GIGO)

Setting dontEscape will prevent running the input text through PrepareText.
Avoid setting this to true if you don't have to set it.

Returned values:
A table of all templates. Template data goes as follows:
 Text: The raw text of the template
 Name: The name of the template
 Args: A list of arguments
 Children: A list of immediate template children
--]=]
--Helper functions
local function boundlen(pair)
    return pair.End - pair.Start + 1
end

--Main function
local function ParseTemplates(InputText, dontEscape)
    --Setup
    if not dontEscape then
        InputText = PrepareText(InputText)
    end
    local function finalise(text)
        if not dontEscape then
            return mw.text.decode(text)
        else
            return text
        end
    end
    local function CreateContainerObj(Container)
        Container.Text = {}
        Container.Args = {}
        Container.ArgOrder = {}
        Container.Children = {}
        -- Container.Name = nil
        -- Container.Value = nil
        -- Container.Key = nil
        Container.BeyondStart = false
        Container.LastIndex = 1
        Container.finalise = finalise
        function Container:HandleArgInput(character, internalcall)
            if not internalcall then
                self.Text[#self.Text+1] = character
            end
            if character == "=" then
                if self.Key then
                    self.Value[#self.Value+1] = character
                else
                    self.Key = cheaptrim(self.Value and table.concat(self.Value, "") or "")
                    self.Value = {}
                end
            else --"|" or "}"
                if not self.Name then
                    self.Name = cheaptrim(self.Value and table.concat(self.Value, "") or "")
                    self.Value = nil
                else
                    self.Value = self.finalise(self.Value and table.concat(self.Value, "") or "")
                    if self.Key then
                        self.Key = self.finalise(self.Key)
                        self.Args[self.Key] = cheaptrim(self.Value)
                        self.ArgOrder[#self.ArgOrder+1] = self.Key
                    else
                        local Key = tostring(self.LastIndex)
                        self.Args[Key] = self.Value
                        self.ArgOrder[#self.ArgOrder+1] = Key
                        self.LastIndex = self.LastIndex + 1
                    end
                    self.Key = nil
                    self.Value = nil
                end
            end
        end
        function Container:AppendText(text, ftext)
            self.Text[#self.Text+1] = (ftext or text)
            if not self.Value then
                self.Value = {}
            end
            self.BeyondStart = self.BeyondStart or (#table.concat(self.Text, "") > 2)
            if self.BeyondStart then
                self.Value[#self.Value+1] = text
            end
        end
        function Container:Clean(IsTemplate)
            self.Text = table.concat(self.Text, "")
            if self.Value and IsTemplate then
                self.Value = {string.sub(table.concat(self.Value, ""), 1, -3)} --Trim ending }}
                self:HandleArgInput("|", true) --Simulate ending
            end
            self.Value = nil
            self.Key = nil
            self.BeyondStart = nil
            self.LastIndex = nil
            self.finalise = nil
            self.HandleArgInput = nil
            self.AppendText = nil
            self.Clean = nil
        end
        return Container
    end

    --Step 1: Find and escape the content of all wikilinks on the page, which are stronger than templates (see implementation notes)
    local scannerPosition = 1
    local wikilinks = {}
    local openWikilinks = {}
    while true do
        local Position, _, Character = string.find(InputText, "([%[%]])%1", scannerPosition)
        if not Position then --Done
            break
        end
        scannerPosition = Position+2 --+2 to pass the [[ / ]]
        if Character == "[" then --Add a [[ to the pending wikilink queue
            openWikilinks[#openWikilinks+1] = Position
        else --Pair up the ]] to any available [[
            if #openWikilinks >= 1 then
                local start = table.remove(openWikilinks) --Pop the latest [[
                wikilinks[start] = {Start = start, End = Position+1, Type = "Wikilink"} --Note the pair
            end
        end
    end

    --Step 2: Find the bounds of every valid template and variable ({{ and {{{)
    local scannerPosition = 1
    local templates = {}
    local variables = {}
    local openBrackets = {}
    while true do
        local Start, _, Character = string.find(InputText, "([{}])%1", scannerPosition)
        if not Start then --Done (both 9e9)
            break
        end
        local _, End = string.find(InputText, "^" .. Character .. "+", Start)
        scannerPosition = Start --Get to the {{ / }} set
        if Character == "{" then --Add the {{+ set to the queue
            openBrackets[#openBrackets+1] = {Start = Start, End = End}
        else --Pair up the }} to any available {{, accounting for {{{ / }}}
            local BracketCount = End-Start+1
            while BracketCount >= 2 and #openBrackets >= 1 do
                local OpenSet = table.remove(openBrackets)
                if boundlen(OpenSet) >= 3 and BracketCount >= 3 then --We have a {{{variable}}} (both sides have 3 spare)
                    variables[OpenSet.End-2] = {Start = OpenSet.End-2, End = scannerPosition+2, Type = "Variable"} --Done like this to ensure chronological order
                    BracketCount = BracketCount - 3
                    OpenSet.End = OpenSet.End - 3
                    scannerPosition = scannerPosition + 3
                else --We have a {{template}} (both sides have 2 spare, but at least one side doesn't have 3 spare)
                    templates[OpenSet.End-1] = {Start = OpenSet.End-1, End = scannerPosition+1, Type = "Template"} --Done like this to ensure chronological order
                    BracketCount = BracketCount - 2
                    OpenSet.End = OpenSet.End - 2
                    scannerPosition = scannerPosition + 2
                end
                if boundlen(OpenSet) >= 2 then --Still has enough data left, leave it in
                    openBrackets[#openBrackets+1] = OpenSet
                end
            end
        end
        scannerPosition = End --Now move past the bracket set
    end

    --Step 3: Re-trace every object using their known bounds, collecting our parameters with (slight) ease
    local scannerPosition = 1
    local activeObjects = {}
    local finalObjects = {}
    while true do
        local LatestObject = activeObjects[#activeObjects] --Commonly needed object
        local NNC, _, Character --NNC = NextNotableCharacter
        if LatestObject then
            NNC, _, Character = string.find(InputText, "([{}%[%]|=])", scannerPosition)
        else
            NNC, _, Character = string.find(InputText, "([{}])", scannerPosition) --We are only after templates right now
        end
        if not NNC then
            break
        end
        if NNC > scannerPosition and LatestObject then
            local scannedContent = string.sub(InputText, scannerPosition, NNC-1)
            LatestObject:AppendText(scannedContent, finalise(scannedContent))
        end

        scannerPosition = NNC+1
        if Character == "{" or Character == "[" then
            local Container = templates[NNC] or variables[NNC] or wikilinks[NNC]
            if Container then
                CreateContainerObj(Container)
                if Container.Type == "Template" then
                    Container:AppendText("{{")
                    scannerPosition = NNC+2
                elseif Container.Type == "Variable" then
                    Container:AppendText("{{{")
                    scannerPosition = NNC+3
                else --Wikilink
                    Container:AppendText("[[")
                    scannerPosition = NNC+2
                end
                if LatestObject and Container.Type == "Template" then --Only templates count as children
                    LatestObject.Children[#LatestObject.Children+1] = Container
                end
                activeObjects[#activeObjects+1] = Container
            elseif LatestObject then
                LatestObject:AppendText(Character)
            end
        elseif Character == "}" or Character == "]" then
            if LatestObject then
                LatestObject:AppendText(Character)
                if LatestObject.End == NNC then
                    if LatestObject.Type == "Template" then
                        LatestObject:Clean(true)
                        finalObjects[#finalObjects+1] = LatestObject
                    else
                        LatestObject:Clean(false)
                    end
                    activeObjects[#activeObjects] = nil
                    local NewLatest = activeObjects[#activeObjects]
                    if NewLatest then
                        NewLatest:AppendText(LatestObject.Text) --Append to new latest
                    end
                end
            end
        else --| or =
            if LatestObject then
                LatestObject:HandleArgInput(Character)
            end
        end
    end

    --Step 4: Fix the order
    local FixedOrder = {}
    local SortableReference = {}
    for _, Object in next, finalObjects do
        SortableReference[#SortableReference+1] = Object.Start
    end
    table.sort(SortableReference)
    for i = 1, #SortableReference do
        local start = SortableReference[i]
        for n, Object in next, finalObjects do
            if Object.Start == start then
                finalObjects[n] = nil
                Object.Start = nil --Final cleanup
                Object.End = nil
                Object.Type = nil
                FixedOrder[#FixedOrder+1] = Object
                break
            end
        end
    end

    --Finished, return
    return FixedOrder
end

local p = {}
--Main entry points
p.PrepareText = PrepareText
p.ParseTemplates = ParseTemplates
--Extra entry points, not really required
p.TestForNowikiTag = TestForNowikiTag
p.TestForComment = TestForComment

return p

--[==[ console tests

local s = [=[Hey!{{Text|<nowiki | ||>Hey! }}A</nowiki>|<!--AAAAA|AAA-->Should see|Shouldn't see}}]=]
local out = p.PrepareText(s)
mw.logObject(out)

local s = [=[B<!--Hey!-->A]=]
local out = p.TestForComment(s, 2)
mw.logObject(out); mw.log(string.sub(s, 2, out.Length))

local a = p.ParseTemplates([=[{{User:Aidan9382/templates/dummy|A|B|C {{{A|B}}} { } } {|<nowiki>D</nowiki>|<pre>E|F</pre>|G|=|a=|A = [[{{PAGENAME}}|A=B]]{{Text|1==<nowiki>}}</nowiki>}}|A B=Success}}]=])
mw.logObject(a)

]==]