Manual:Job queue/For developers

Jobs are non-urgent tasks. For a general introduction and management of job queues, see Job queue.

Deferred updates


Deferred updates (or deferrable updates) are a useful way to postpone time-consuming tasks in order to speed up the main MediaWiki response. Refer to the DeferredUpdates class API and Database transactions for how to use these.

Deferred updates are represented as callable functions that we queue in an array and then call at the end of the MediaWiki PHP process. Typically the call takes place after the response to a web request has been finished (e.g. everything echoed and flushed to the browser), but before we actually exit or return to the web server. This is internally powered by fastcgi_finish_request() in MediaWiki::doPostOutputShutdown().

Deferrable updates are executed at the end of the current process. They are only memorised within that same web request (or other process, such as a CLI maintenance script).
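
As a minimal sketch of queuing such an update (the DeferredUpdates class lives in the MediaWiki\Deferred namespace in current MediaWiki and was a top-level class in older releases; the log channel name is illustrative):

<?php
use MediaWiki\Deferred\DeferredUpdates;

// Queue a callable that runs after the response has been flushed
// to the client (POSTSEND), but before the process exits.
DeferredUpdates::addCallableUpdate( static function () {
	wfDebugLog( 'myext', 'Running non-urgent work post-send' );
}, DeferredUpdates::POSTSEND );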

This is unlike jobs, which are scheduled via a persistent storage backend and run some minutes or hours in the future, independent of and after the original request that queued them. The job queue in MediaWiki is a pluggable service. The default backend adds jobs to the job table in the wiki's main database. The default job runner executes up to one job at the end of random page views.


Which one to use?


Deferrable updates should be used for tasks that generally take only a few milliseconds to complete, as a way to speed up the web response. By nature of being deferred, failure is hidden from clients, since the response has already been sent.

Normal


Examples of critical tasks that we don't run via deferred updates. Failure must be known to users, and more generally people should know how and when their action was completed, so that they can act further knowing the change is complete, e.g. make further edits that depend on previous ones, possibly scripted or batched through some automation.

  • Database write that creates a page or saves an edit.
  • Create account, change password.
  • Explicit "send email" feature.

Deferred update


Examples of "urgent" tasks that we run via post-response deferred updates after saving an edit.These small transactions are expected to be reflected if the client looks for it afterward, but the result of these is not needed to render the response to the edit itself.

  • Metadata update that adds the article to a certain category listing.
  • Publishing the edit event to the Recent changes feed.
  • Updating the account's user_editcount field.

Job queue


Examples of "non-urgent" tasks that we run via the job queue:

  • After saving an edit to a template, iterate through potentially millions of affected pages to re-parse and purge them (known as "refresh links" or LinksUpdate).
  • Periodically prune old rows from the recentchanges table.
  • After uploading a photo, pre-render common thumbnail sizes.
  • After saving an edit to an article, send emails to the accounts that watch this page with email notifications enabled.

Use jobs if you need to save data in the context of a GET request


For scalability and performance reasons, MediaWiki developers should generally not perform database writes during page views or other GET requests. If this becomes difficult to avoid, check the Backend performance guidelines first and consider seeking advice from other developers or the Performance Team on how to approach the problem in a different way.

Note that large wiki farms (such as Wikimedia's) may operate from multiple data centers and thus serve GET requests (which don't expect database writes) from a secondary data center, which should be able to respond to such requests without relying on communication with the primary DC.

If you're reasonably certain that your feature will only rarely discover the need for a database write during a GET request, and if the write is not urgent, then one option you have is to queue a job during the GET request. Job queues can be buffered and synced across data centers asynchronously and thus do not require immediate cross-DC communication. You can then rely on the job eventually being transmitted to the primary DC, where it will execute at some point in the future.
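
A minimal sketch of this pattern, assuming a hypothetical 'myExtRecordView' job type registered via $wgJobClasses and a $pageId supplied by the surrounding code:

<?php
use MediaWiki\JobQueue\JobSpecification;
use MediaWiki\MediaWikiServices;

// During a GET request: enqueue the write as a job instead of
// writing to the primary database directly.
$jobQueueGroup = MediaWikiServices::getInstance()
	->getJobQueueGroupFactory()
	->makeJobQueueGroup();
$jobQueueGroup->lazyPush( new JobSpecification(
	'myExtRecordView', // hypothetical job type
	[ 'pageId' => $pageId ]
) );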

Deferred updates should not be used to perform database writes after a GET request. Attempting this will log a DBPerformance warning message.

Deferred update fallback


Deferrable updates can choose to implement the EnqueueableDataUpdate interface. Such updates can be automatically converted to a job as needed. For example, if the update fails, MediaWiki will convert it to a job and queue it to try again later. There are also other situations in which we improve reliability or optimise throughput by proactively converting updates to jobs where possible.
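
A hedged sketch of such an update (the class name and 'myExtSync' job type are illustrative, not part of core): getAsJobSpecification() returns the wiki's database domain together with a JobSpecification that can redo the work later.

<?php
use MediaWiki\Deferred\DeferrableUpdate;
use MediaWiki\Deferred\EnqueueableDataUpdate;
use MediaWiki\JobQueue\JobSpecification;
use MediaWiki\WikiMap\WikiMap;

// Illustrative update that DeferredUpdates can turn into a job.
class MyExtSyncUpdate implements DeferrableUpdate, EnqueueableDataUpdate {
	private int $pageId;

	public function __construct( int $pageId ) {
		$this->pageId = $pageId;
	}

	public function doUpdate() {
		// The actual (ideally idempotent) write goes here.
	}

	// Called when MediaWiki decides to enqueue this update as a job
	// instead of (re-)running it in the current process.
	public function getAsJobSpecification() {
		return [
			'domain' => WikiMap::getCurrentWikiDbDomain()->getId(),
			// 'myExtSync' would need to be registered in $wgJobClasses.
			'job' => new JobSpecification( 'myExtSync', [ 'pageId' => $this->pageId ] ),
		];
	}
}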

Since any MediaWiki code can queue deferred updates, it is also possible for a CLI maintenance script or a job to implicitly build up a list of deferred updates. If these batch operations end up queuing a lot of updates, MediaWiki will proactively convert tasks to jobs where possible (handled by the DeferredUpdates class internally).

Registering a job


To use the job queue to do your non-urgent jobs, you need to do these things:

Create a Job subclass


You need to create a class that will perform the work of your job:

<?php
namespace MediaWiki\Extension\MyExt\Job;

use MediaWiki\JobQueue\Job;
use Wikimedia\Rdbms\IConnectionProvider;

class SomeExpensiveOperationJob extends Job {
	private IConnectionProvider $dbProvider;

	public function __construct( array $params, IConnectionProvider $dbProvider ) {
		// Replace SomeExpensiveOperationJob with an identifier for your job.
		parent::__construct( 'SomeExpensiveOperationJob', $params );
		$this->dbProvider = $dbProvider;
	}

	/**
	 * @inheritDoc
	 */
	public function run() {
		$dbw = $this->dbProvider->getPrimaryDatabase();
		$dbw->newUpdateQueryBuilder()
			->update( 'mytable' )
			->set( [
				'foo' => $this->params['foo'],
				'bar' => $this->params['bar'],
			] )
			// The query builder requires explicit conditions;
			// 'id' is an assumed job parameter here.
			->where( [ 'id' => $this->params['id'] ] )
			->caller( __METHOD__ )
			->execute();
		return true;
	}
}

Add your Job class to the global list


Add the Job class to the global $wgJobClasses array. In extensions, this is done in the extension.json file, and in core it's done in DefaultSettings.php. The key must be unique and match the value in the job's constructor, and the value is the class name. So for this example, that would look like this:

{"JobClasses":{"SomeExpensiveOperationJob":{"class":"MediaWiki\\Extension\\MyExt\\Job\\SomeExpensiveOperationJob","services":["ConnectionProvider"]}}}

How to queue a job

See also: wikitech:MediaWiki Engineering/Guides/Backend performance practices
<?php
use MediaWiki\MediaWikiServices;

/**
 * 1. Access the JobQueueGroup for the current wiki.
 *
 * For MW 1.36 and earlier, call JobQueueGroup::singleton() instead.
 */
$services = MediaWikiServices::getInstance();
$jobQueueGroup = $services->getJobQueueGroupFactory()->makeJobQueueGroup();

/**
 * 2. Create a Job object.
 *
 * Construct the subclass, and pass the relevant parameters.
 * These will be available as $this->params in your Job class when
 * it executes later.
 *
 * Some older jobs require a $title parameter even if they internally
 * ignore it. In that case, you may pass TitleFactory->newMainPage().
 */
$titleFactory = $services->getTitleFactory();
$title = $titleFactory->newFromText( 'User:Example/Foobar' ); // or $example->getTitle()
$job = new MyDataJob( $title, [
	'example' => true,
	'mydata' => [ 'x' ],
] );

/**
 * 3. Push the job into the queue.
 */
$jobQueueGroup->lazyPush( $job );

Queuing via JobQueueGroup::lazyPush() allows MediaWiki to send the job in a batch together with any other queued jobs at the end of the web response. If queuing is inseparable from your request's main purpose, and you would like any queuing failure to result in (for example) database writes being rolled back and an error page presented to the user, then consider calling JobQueueGroup::push() instead.
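
For example, reusing $jobQueueGroup and $job from the sketch above:

<?php
// Synchronous variant: failures surface immediately to the caller,
// so the request can be aborted and its writes rolled back.
$jobQueueGroup->push( $job );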

Other


Job queue type


A job queue type is the command name you give to the parent::__construct() method of your job class; e.g., using the example above, that would be SomeExpensiveOperationJob.

getQueueSizes()


JobQueueGroup->getQueueSizes() will return an array of all job queue types and their sizes.

Array
(
    [refreshLinks] => 1
    [refreshLinks2] => 3
    [synchroniseThreadArticleData] => 10
)

getSize()


While getQueueSizes() is handy for analysing the entire job queue, for performance reasons it's best to use JobQueueGroup->get( /* <job type> */ )->getSize() when analysing a specific job type, which will only return the job queue size of that specific job type.

100
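
A combined sketch of both calls (the job type name is illustrative):

<?php
use MediaWiki\MediaWikiServices;

$jobQueueGroup = MediaWikiServices::getInstance()
	->getJobQueueGroupFactory()
	->makeJobQueueGroup();

// Sizes of all queue types, as shown above.
$sizes = $jobQueueGroup->getQueueSizes();

// Size of one specific queue; cheaper if that is all you need.
$size = $jobQueueGroup->get( 'synchroniseThreadArticleData' )->getSize();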

Internals


Pushing jobs


The primary function is JobQueueGroup::push(). It selects the job queue corresponding to the job type and, depending on the job queue implementation (database or Redis), the job will be pushed either through a Redis connection (Redis case) or as a deferrable update (database case).

The lazy push function (JobQueueGroup::lazyPush()) keeps the jobs in memory. At the end of the current execution (end of the MediaWiki request or end of the current job execution), the jobs kept in memory are pushed as the last deferrable update (of type AutoCommitUpdate). As a deferrable update, the jobs are pushed at the end of the current execution, and as an AutoCommitUpdate the jobs are pushed in a single database transaction. See JobQueueGroup::lazyPush() and JobQueueGroup::pushLazyJobs() for details.
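
As a rough illustration of that pattern (not the actual JobQueueGroup internals; the callback body is illustrative):

<?php
use MediaWiki\Deferred\AutoCommitUpdate;
use MediaWiki\Deferred\DeferredUpdates;
use MediaWiki\MediaWikiServices;
use Wikimedia\Rdbms\IDatabase;

$dbw = MediaWikiServices::getInstance()
	->getConnectionProvider()
	->getPrimaryDatabase();

// An AutoCommitUpdate runs its callback in its own transaction,
// outside any pending transaction round.
DeferredUpdates::addUpdate( new AutoCommitUpdate(
	$dbw,
	__METHOD__,
	static function ( IDatabase $dbw, $fname ) {
		// Batched inserts would happen here, committing as one unit.
	}
) );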

In CLI mode, note that deferrable updates (whether from JobQueueGroup::push() (JobQueueDB implementation) or JobQueueGroup::lazyPush()) are directly executed if no database transaction round is pending (see LBFactory::hasTransactionRound()). See DeferredUpdates::addUpdate() and DeferredUpdates::tryOpportunisticExecute() for details.

When some jobs are pushed through JobQueueGroup::lazyPush() but never actually pushed (and hence lost), usually because an unhandled exception is thrown, the destructor of JobQueueGroup shows a warning in the debug log:

PHP Notice: JobQueueGroup::__destruct: 1 buffered job(s) never inserted

See T100085 for an example of such a warning. This occurred before the MediaWiki 1.29 release for Web-executed jobs: when a job internally lazy-pushes another job and the former job is executed in the shutdown part of a MediaWiki request, the latter job is not pushed (because JobQueueGroup::pushLazyJobs() was already called). The fix for this specific bug was to call JobQueueGroup::lazyPush() in JobRunner::executeJob() to always push lazily-pushed jobs after the execution of each job.

Execution of jobs


Jobs are ordinarily executed at the end of a web request, at the rate of $wgJobRunRate per request. If $wgJobRunRate == 0, no jobs are run at the end of a web request. The default value of $wgJobRunRate is 1.

All enqueued jobs can be executed at any time by running maintenance/runJobs.php. This is particularly important when $wgJobRunRate == 0.
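
For example, a common setup on larger wikis is to disable on-request execution in LocalSettings.php and run php maintenance/runJobs.php periodically from cron instead (a sketch; scheduling specifics vary):

<?php
// LocalSettings.php: never run jobs during web requests;
// rely on a periodic "php maintenance/runJobs.php" invocation instead.
$wgJobRunRate = 0;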

The jobs are run by the JobRunner class. Each job is given its own database transaction.

At the end of the job execution, deferrable updates are executed. Since MediaWiki 1.28.3/1.29, lazily-pushed jobs are pushed through a deferrable update in order to use a dedicated database transaction (with AutoCommitUpdate).

Retrieved from "https://www.mediawiki.org/w/index.php?title=Manual:Job_queue/For_developers&oldid=7688062"
