m3api extension package to use the MediaWiki “query” API
lucaswerkmeister/m3api-query
m3api-query is an extension package for m3api, simplifying some common operations when working with the `query` action.
The module exports several functions, which typically take a `session` parameter that would be constructed separately (see the m3api README for details on that). The more important functions are documented below; some others can be found in the source code.
`queryFullPageByTitle` gets the full data for a single page with the given title, according to the `prop`s (and possibly other parameters).

    import Session, { set } from 'm3api/node.js';
    import { queryFullPageByTitle } from 'm3api-query/index.js';

    const session = new Session('en.wikipedia.org', {
      formatversion: 2,
    }, {
      userAgent: 'm3api-query-README-example',
    });
    const title = 'List of common misconceptions';
    const page = await queryFullPageByTitle(session, title, {
      prop: set(
        'categories',
        'contributors',
        'coordinates',
        'description',
        'pageimages',
        'pageprops',
      ),
      clprop: set('sortkey'),
      cllimit: 'max',
      colimit: 'max',
      pclimit: 'max',
      pilimit: 'max',
    });
    console.log(`${page.title} (${page.description}) ` +
      `is in ${page.categories.length} categories.`);
    for (const contributor of page.contributors) {
      console.log(`Thank you, ${contributor.name}, ` +
        `for contributing to ${page.title}!`);
    }
    // ...

If the API doesn’t return the full page information in a single response, the function automatically follows continuation and merges the responses back into a single object.
There is also a `queryFullPageByPageId` function that does what you’d expect, and a similar `queryFullRevisionByRevisionId` function as well.
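The continuation-following described above can be illustrated with a small self-contained sketch. It does not use m3api-query and is not the library's implementation: `fakeRequest`, `fakeResponses`, `mergePage`, `fetchFullPage` and the `step` continuation parameter are all hypothetical names invented for this illustration, and the merge rule (concatenate arrays, overwrite everything else) is an assumption.

```javascript
// Hypothetical stand-in for an API: each response carries part of the page,
// plus a `continue` object unless it is the last response.
const fakeResponses = [
  { page: { title: 'Example', categories: ['A', 'B'] }, continue: { step: 1 } },
  { page: { title: 'Example', categories: ['C'], contributors: ['X'] }, continue: { step: 2 } },
  { page: { title: 'Example', contributors: ['Y'] } },
];

async function fakeRequest(params) {
  // The (made-up) `step` parameter selects which partial response to return.
  return fakeResponses[params.step ?? 0];
}

// Merge `source` into `target`: arrays are concatenated, other values overwritten.
function mergePage(target, source) {
  for (const [key, value] of Object.entries(source)) {
    if (Array.isArray(value)) {
      target[key] = [...(target[key] ?? []), ...value];
    } else {
      target[key] = value;
    }
  }
  return target;
}

// Keep requesting until no `continue` object is returned,
// merging every partial page into one object.
async function fetchFullPage(request, params) {
  const page = {};
  let continuation = {};
  do {
    const response = await request({ ...params, ...continuation });
    mergePage(page, response.page);
    continuation = response.continue;
  } while (continuation !== undefined);
  return page;
}

fetchFullPage(fakeRequest, { titles: 'Example' })
  .then((page) => console.log(page));
```

With the real functions you never write this loop yourself; the sketch only shows why a single returned object can contain data from several responses.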
`queryFullPages` gets the full data for a collection of pages, typically produced by a generator.

    let n = 0;
    for await (const page of queryFullPages(session, {
      generator: 'allpages',
      gapnamespace: 10, // NS_TEMPLATE
      gaplimit: 100,
      prop: set('revisions'),
      rvprop: set('content'),
      rvslots: set('main'),
      formatversion: 2,
    })) {
      const content = page.revisions[0].slots.main.content;
      if (content.includes('style=')) {
        console.log(`${page.title} seems to contain inline styles`);
        if (++n >= 10) {
          break;
        }
      }
    }

In this example, we ask the `allpages` generator for 100 pages at a time, but the `revisions` prop will actually only return 50 revisions per request, so we need to follow continuation once for the revisions of the second half of pages, before continuing with the next batch of 100 pages. The function handles all of this for you.
It’s worth noting that the function only starts yielding pages at the end of a complete batch, i.e. in this example only after the first 100 pages all have their revisions. If we used `gaplimit: 'max'`, the generator would produce 500 pages at once, and the function would make 10 requests internally before yielding any pages; since this example quickly breaks from the loop anyway, a shorter `gaplimit` makes more sense here.
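The request counts mentioned above can be checked with a small simulation. This is only a sketch of the batching behaviour as described in the text, not the library's code: `simulateBatches`, `requestsBeforeFirstPage` and the `hasRevision` flag are hypothetical, and the figure of 50 revisions per request is taken from the paragraph above.

```javascript
// Taken from the text: the revisions prop returns 50 revisions per request.
const REVISIONS_PER_REQUEST = 50;

// Simulated batch-wise iteration: for each generator batch, keep "requesting"
// until every page in the batch has its revision, then yield the whole batch.
function* simulateBatches(totalPages, gaplimit, log) {
  for (let start = 0; start < totalPages; start += gaplimit) {
    const batchSize = Math.min(gaplimit, totalPages - start);
    const pages = [];
    for (let done = 0; done < batchSize; done += REVISIONS_PER_REQUEST) {
      log.requests++; // one simulated API request
      const count = Math.min(REVISIONS_PER_REQUEST, batchSize - done);
      for (let i = 0; i < count; i++) {
        pages.push({ pageid: start + done + i + 1, hasRevision: true });
      }
    }
    yield* pages; // pages only become visible once the batch is complete
  }
}

// Count how many simulated requests happen before the first page is yielded.
function requestsBeforeFirstPage(gaplimit) {
  const log = { requests: 0 };
  simulateBatches(1000, gaplimit, log).next();
  return log.requests;
}

console.log(requestsBeforeFirstPage(100)); // 2
console.log(requestsBeforeFirstPage(500)); // 10
```

A smaller batch size means fewer requests before the first page arrives, which is why it suits loops that are expected to break early.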
Also, when you use a generator, the order of pages in the actual API result will usually be unrelated to the order in which the generator produced them. You can restore the meaningful order using the `m3api-query/comparePages` option; for example, the `search` generator adds an `index` property to each page, so by comparing by this property we can get the pages in the search order again:

    let n = 0;
    for await (const page of queryFullPages(session, {
      generator: 'search',
      gsrsearch: 'example',
      gsrlimit: 'max',
    }, {
      'm3api-query/comparePages': ({ index: i1 }, { index: i2 }) => i1 - i2,
    })) {
      console.log(page.title);
      if (++n >= 1000) {
        break;
      }
    }

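To see what such a comparison function does on its own, here is a minimal sketch that applies the same comparator to a plain array with `Array.prototype.sort`. The page objects are made-up sample data, and how the library applies the comparator internally is not shown here.

```javascript
// Made-up sample data: pages in an arbitrary order, each with the `index`
// that a search generator would attach.
const pagesAsReturned = [
  { title: 'Example (disambiguation)', index: 3 },
  { title: 'Example', index: 1 },
  { title: 'Example.com', index: 2 },
];

// Same shape as the comparator in the example above:
// negative if the first page should come first, positive if second.
const comparePages = ({ index: i1 }, { index: i2 }) => i1 - i2;

// Sort a copy so the original array is left untouched.
const sortedTitles = [...pagesAsReturned]
  .sort(comparePages)
  .map((page) => page.title);

console.log(sortedTitles);
// [ 'Example', 'Example.com', 'Example (disambiguation)' ]
```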
Similar to `queryFullPages`, `queryFullRevisions` provides a stream of revision objects. It can be used to get all the revisions of a page, following continuation as needed:

    for await (const revision of queryFullRevisions(session, {
      titles: 'MediaWiki',
      rvprop: set('timestamp', 'user', 'comment'),
      rvlimit: 'max',
    })) {
      const { timestamp, user, comment } = revision;
      console.log(`${timestamp} ([[User:${user}]]): ${comment}`);
    }

Or to get the current revision of a set of pages produced by a generator:

    for await (const revision of queryFullRevisions(session, {
      generator: 'categorymembers',
      gcmtitle: 'Category:Member states of the United Nations',
      gcmtype: ['page'],
      gcmlimit: 'max',
      rvprop: set('size'),
    })) {
      const page = revision[pageOfRevision];
      console.log(`${page.title}: ${revision.size} bytes`);
    }

The above example also demonstrates how to get the page that a revision belongs to: the `pageOfRevision` key can be imported from this module just like the other functions. (This also applies to other functions returning revisions, such as `queryFullRevisionByRevisionId`.)
You can sort the revisions within each batch using the `m3api-query/compareRevisions` option; the comparison may also involve the page the revision belongs to, e.g. for the `search` generator as seen before (under `queryFullPages`):
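The idea of attaching the page under a special key can be sketched as follows. This assumes the key behaves like a JavaScript symbol, which the text does not state outright; `pageOfRevisionStandIn` is a local stand-in created for this illustration, not the key exported by the module, and the page and revision objects are made-up data.

```javascript
// Local stand-in for the imported key (assumption: the real key is a symbol).
const pageOfRevisionStandIn = Symbol('pageOfRevision');

const samplePage = { pageid: 1, title: 'Example' };
const sampleRevision = {
  revid: 10,
  size: 1234,
  // The owning page is reachable through the symbol key...
  [pageOfRevisionStandIn]: samplePage,
};

// ...but a symbol key does not collide with string properties and is
// skipped by Object.keys() and JSON.stringify().
const visibleKeys = Object.keys(sampleRevision);
const serialized = JSON.stringify(sampleRevision);

console.log(sampleRevision[pageOfRevisionStandIn].title); // Example
console.log(visibleKeys); // [ 'revid', 'size' ]
console.log(serialized); // {"revid":10,"size":1234}
```

One consequence, under the same assumption, is that serializing a revision would not drag the whole page object along with it.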

    for await (const revision of queryFullRevisions(session, {
      generator: 'search',
      gsrsearch: 'example',
      gsrlimit: 'max',
      rvprop: set('timestamp'),
    }, {
      'm3api-query/compareRevisions': (revision1, revision2) => {
        const { index: i1 } = revision1[pageOfRevision],
          { index: i2 } = revision2[pageOfRevision];
        return i1 - i2;
      },
    })) {
      const { timestamp } = revision,
        { title } = revision[pageOfRevision];
      console.log(`${title} (last edited ${timestamp})`);
    }

If you’re using `queryFullPages()` or `queryFullRevisions()`, it’s a good idea to also add `maxEmptyResponses()` to the options, especially if you’re using various different or dynamic combinations of parameters. This prevents your application from potentially making a never-ending stream of API requests. Usage example:

    const session = new Session('en.wikipedia.org', {
      formatversion: 2,
    }, {
      userAgent: 'm3api-query-README-example',
      ...maxEmptyResponses(100),
    });

The appropriate limit depends on the requests you’re going to make; if you’re not expecting gaps due to miser mode, you can use a much lower limit, perhaps 5 or 10.
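The general idea behind such a limit can be sketched without the library. This is an illustration only: `takeWithEmptyLimit` and the response objects are hypothetical, and the sketch counts consecutive empty responses, which is an assumption about the exact rule rather than something the text specifies.

```javascript
// Yield pages from a sequence of responses, but give up once more than
// `maxEmpty` responses in a row contain no pages at all.
function* takeWithEmptyLimit(responses, maxEmpty) {
  let emptyInARow = 0;
  for (const response of responses) {
    if (response.pages.length === 0) {
      emptyInARow++;
      if (emptyInARow > maxEmpty) {
        throw new Error(`More than ${maxEmpty} empty responses in a row`);
      }
      continue;
    }
    emptyInARow = 0; // any non-empty response resets the counter
    yield* response.pages;
  }
}

// Made-up responses: one useful response surrounded by empty ones.
const sampleResponses = [
  { pages: [] },
  { pages: ['A'] },
  { pages: [] },
  { pages: [] },
  { pages: [] },
];

const seen = [];
let stoppedEarly = false;
try {
  for (const page of takeWithEmptyLimit(sampleResponses, 2)) {
    seen.push(page);
  }
} catch (error) {
  stoppedEarly = true;
}
console.log(seen, stoppedEarly); // [ 'A' ] true
```

A request pattern that legitimately produces long runs of empty responses needs a higher limit; one that should return results almost every time can use a low one.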
m3api-query follows the same slightly modified version of semantic versioning as m3api; see the m3api README for details. In brief, changes to the internal interface, announced as internal breaking changes in the changelog, may take place between different minor versions. (Non-internal breaking changes only occur on major versions, as usual.)
Published under the ISC License. By contributing to this software, you agree to publish your contribution under the same license.