Help:Export

From Wikipedia, the free encyclopedia

This help page is a how-to guide.

It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.

Shortcut

H:EX


Wiki pages can be exported in a special XML format, either to import into another MediaWiki installation or to use elsewhere, for instance to analyse the content. See also m:Syndication feeds for exporting information other than pages, and see Help:Import on importing pages.

How to export

There are at least six ways to export pages:

Paste the names of the articles into the box in Special:Export or use https://en.wikipedia.org/wiki/Special:Export/FULLPAGENAME.
Use action=raw. (This fetches just the page's wikitext and not the XML format described below.) For example: https://en.wikipedia.org/w/index.php?title=Wikipedia&action=raw. It is important to use /w/index.php?title=PAGENAME&action=raw and not /wiki/PAGENAME?action=raw (see Phab T126183).
Use the API to fetch data in XML or JSON packaging.
The backup script dumpBackup.php dumps all the wiki pages into an XML file. dumpBackup.php only works on MediaWiki 1.5 or newer. You need direct access to the server to run this script. Dumps of Wikimedia projects are (more or less) regularly made available at http://download.wikipedia.org. More help is at http://www.mediawiki.org/wiki/Manual:DumpBackup.php
There is an OAI-PMH interface to regularly fetch pages that have been modified since a specific time. For Wikimedia projects this interface is not publicly available. OAI-PMH contains a wrapper format around the actual exported articles.
Use the Python Wikipedia Robot Framework. This won't be explained here.
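The first two methods in the list above come down to building a URL. The following is a minimal sketch of that step only; the helper names `special_export_url` and `raw_wikitext_url` are hypothetical, and the sketch builds the URLs without fetching them.

```python
from urllib.parse import quote, urlencode

BASE = "https://en.wikipedia.org"


def special_export_url(full_page_name: str) -> str:
    """Build the Special:Export URL for one page (returns the XML format)."""
    # Spaces in page names are conventionally written as underscores in URLs.
    name = full_page_name.replace(" ", "_")
    return f"{BASE}/wiki/Special:Export/{quote(name, safe=':/')}"


def raw_wikitext_url(page_name: str) -> str:
    """Build the action=raw URL (returns wikitext only, not XML)."""
    # Uses /w/index.php?title=... rather than /wiki/PAGENAME?action=raw,
    # as the text above recommends.
    query = urlencode({"title": page_name.replace(" ", "_"), "action": "raw"})
    return f"{BASE}/w/index.php?{query}"


print(special_export_url("Help:Export"))
print(raw_wikitext_url("Wikipedia"))
```

Either URL can then be passed to any HTTP client; sending a descriptive User-Agent header is advisable when making automated requests.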

By default only the current version of a page is included. Optionally you can get all versions with date, time, user name and edit summary.

Additionally you can copy the SQL database. This is how dumps of the database were made available before MediaWiki 1.5 and it won't be explained here further.

Using 'Special:Export'

For example, to export all pages of a namespace:

1. Get the names of pages to export

Go to Special:Allpages and choose the desired namespace.
Copy the list of page names to a text editor.
Put all page names on separate lines.
Prefix the namespace to the page names (e.g. 'Help:Contents'), unless the selected namespace is the main namespace.
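The clean-up described in these steps can also be scripted. This is a small sketch, assuming the copied list is available as one string with a page name per line; the function name `prefix_page_names` is hypothetical.

```python
def prefix_page_names(raw_list: str, namespace: str = "") -> list[str]:
    """Return one cleaned page name per entry, prefixed with the namespace.

    An empty namespace stands for the main namespace, which takes no prefix.
    """
    # Strip surrounding whitespace and drop empty lines, since the export
    # form expects a list without blank lines.
    names = [line.strip() for line in raw_list.splitlines()]
    names = [name for name in names if name]
    if not namespace:
        return names
    return [f"{namespace}:{name}" for name in names]


copied = "Contents\n\n  Export  \nImport\n"
print("\n".join(prefix_page_names(copied, "Help")))
```

The printed text can be pasted directly into the Special:Export textbox.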

2. Perform the export

Go to Special:Export and paste all your page names into the textbox, making sure there are no empty lines.
Click 'Submit query'.
Save the resulting XML to a file using your browser's save facility.

And finally:

Open the XML file in a text editor. Scroll to the bottom to check for error messages.

Now you can use this XML file to perform an import.

Exporting the full history

A checkbox in the Special:Export interface selects whether to export the full history (all versions of an article) or only the most recent version of each article. A maximum of 1000 revisions is returned; further revisions can be requested as detailed in MW:Parameters to Special:Export.
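As a rough sketch of requesting revisions beyond the first batch, the form fields can be encoded for a POST request. The field names `pages`, `offset` and `limit` are assumptions recalled from the MediaWiki manual page named above and should be checked against it; the helper name `export_form_body` is hypothetical, and the sketch only builds the request body without sending it.

```python
from urllib.parse import urlencode


def export_form_body(pages, offset, limit=1000):
    """Encode form fields for a Special:Export POST request.

    pages  : list of full page names, sent one per line
    offset : timestamp after which revisions should start (assumed field)
    limit  : maximum number of revisions to return (assumed field)
    """
    fields = {
        "pages": "\n".join(pages),
        "offset": offset,
        "limit": str(limit),
    }
    return urlencode(fields)


# Ask for the next batch, starting after the last timestamp already received.
print(export_form_body(["Help:Export"], "2001-01-15T13:15:00Z"))
```

Repeating the request with the timestamp of the last revision received is one way to walk through a long history in batches.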

Export format

The format of the XML file you receive is the same whichever of these methods produced it. This format is codified in XML Schema at http://www.mediawiki.org/xml/export-0.6.xsd. It is not intended for viewing in a web browser, though some browsers show pretty-printed XML with "+" and "-" links to show or hide selected parts. Alternatively, the XML source can be viewed using the browser's "view source" feature or, after saving the XML file locally, with a program of your choice. If you read the XML source directly, the actual wikitext is not difficult to find. If you don't use a special XML editor, "<" and ">" appear as &lt; and &gt;, to avoid a conflict with XML tags; to avoid ambiguity, "&" is coded as "&amp;".
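The entity coding described above can be reproduced with Python's standard library. This is a minimal illustration, not part of the export tooling; the sample wikitext is made up.

```python
from xml.sax.saxutils import escape, unescape

# Made-up wikitext containing the three characters that XML must encode.
wikitext = 'a < b && "<nowiki>" stays literal'

# escape() turns & into &amp;, < into &lt; and > into &gt;.
encoded = escape(wikitext)
print(encoded)

# unescape() reverses the coding and gives back the original wikitext.
decoded = unescape(encoded)
print(decoded == wikitext)
```

An XML parser performs the decoding step automatically, so manual unescaping is only needed when the file is read as plain text.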

In the current version, the export format does not contain an XML replacement for wiki markup (see Wikipedia DTD for an older proposal, or Wiki Markup Language). You only get the wikitext, exactly as you see it when editing the article. (After export you can use alternative parsers to convert wikitext to other formats.)

Example

<mediawiki xml:lang="en">
  <page>
    <title>Page title</title>
    <!-- page namespace code -->
    <ns>0</ns>
    <id>2</id>
    <!-- If page is a redirection, element "redirect" contains title of the page redirect to -->
    <redirect title="Redirect page title" />
    <restrictions>edit=sysop:move=sysop</restrictions>
    <revision>
      <timestamp>2001-01-15T13:15:00Z</timestamp>
      <contributor><username>Foobar</username><id>65536</id></contributor>
      <comment>I have just one thing to say!</comment>
      <text>A bunch of [[text]] here.</text>
      <minor />
    </revision>
    <revision>
      <timestamp>2001-01-15T13:10:27Z</timestamp>
      <contributor><ip>10.0.0.2</ip></contributor>
      <comment>new!</comment>
      <text>An earlier [[revision]].</text>
    </revision>
    <revision>
      <!-- deleted revision example -->
      <id>4557485</id>
      <parentid>1243372</parentid>
      <timestamp>2010-06-24T02:40:22Z</timestamp>
      <contributor deleted="deleted" />
      <model>wikitext</model>
      <format>text/x-wiki</format>
      <text deleted="deleted" />
      <sha1/>
    </revision>
  </page>
  <page>
    <title>Talk:Page title</title>
    <revision>
      <timestamp>2001-01-15T14:03:00Z</timestamp>
      <contributor><ip>10.0.0.2</ip></contributor>
      <comment>hey</comment>
      <text>WHY DYOU LOCK PAGE ??!!! i was editing that jerk</text>
    </revision>
  </page>
</mediawiki>
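For a small export such as this one, the whole file can be loaded into memory and walked as a tree. The sketch below uses a trimmed copy of the example and a hypothetical helper `list_revisions`. It assumes the elements carry no XML namespace, as in the example; real exports normally declare an `xmlns` on the root element, in which case the tag names need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example, kept short for illustration.
SAMPLE = """<mediawiki xml:lang="en">
  <page>
    <title>Page title</title>
    <ns>0</ns>
    <revision>
      <timestamp>2001-01-15T13:15:00Z</timestamp>
      <contributor><username>Foobar</username><id>65536</id></contributor>
      <text>A bunch of [[text]] here.</text>
    </revision>
    <revision>
      <timestamp>2001-01-15T13:10:27Z</timestamp>
      <contributor><ip>10.0.0.2</ip></contributor>
      <text>An earlier [[revision]].</text>
    </revision>
  </page>
</mediawiki>"""


def list_revisions(xml_source):
    """Return (title, timestamp, contributor) for every revision."""
    root = ET.fromstring(xml_source)
    rows = []
    for page in root.findall("page"):
        title = page.findtext("title")
        for rev in page.findall("revision"):
            contributor = rev.find("contributor")
            who = None
            if contributor is not None:
                # Registered users have <username>, anonymous edits have <ip>;
                # a deleted contributor has neither, leaving None.
                who = contributor.findtext("username") or contributor.findtext("ip")
            rows.append((title, rev.findtext("timestamp"), who))
    return rows


for row in list_revisions(SAMPLE):
    print(row)
```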

DTD

Here is an unofficial, short Document Type Definition version of the format. If you don't know what a DTD is, just ignore it.

<!ELEMENT mediawiki (siteinfo?,page*)>
<!-- version contains the version number of the format (currently 0.3) -->
<!ATTLIST mediawiki
  version CDATA #REQUIRED
  xmlns CDATA #FIXED "http://www.mediawiki.org/xml/export-0.3/"
  xmlns:xsi CDATA #FIXED "http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation CDATA #FIXED
    "http://www.mediawiki.org/xml/export-0.3/ http://www.mediawiki.org/xml/export-0.3.xsd"
>
<!ELEMENT siteinfo (sitename,base,generator,case,namespaces)>
<!ELEMENT sitename (#PCDATA)>      <!-- name of the wiki -->
<!ELEMENT base (#PCDATA)>          <!-- url of the main page -->
<!ELEMENT generator (#PCDATA)>     <!-- MediaWiki version string -->
<!ELEMENT case (#PCDATA)>          <!-- how cases in page names are handled -->
   <!-- possible values: 'first-letter' | 'case-sensitive'
                         'case-insensitive' option is reserved for future -->
<!ELEMENT namespaces (namespace+)> <!-- list of namespaces and prefixes -->
<!ELEMENT namespace (#PCDATA)>     <!-- contains namespace prefix -->
<!ATTLIST namespace key CDATA #REQUIRED> <!-- internal namespace number -->
<!ELEMENT page (title,id?,restrictions?,(revision|upload)*)>
<!ELEMENT title (#PCDATA)>         <!-- Title with namespace prefix -->
<!ELEMENT id (#PCDATA)>
<!ELEMENT restrictions (#PCDATA)>  <!-- optional page restrictions -->
<!ELEMENT revision (id?,timestamp,contributor,minor?,comment,text)>
<!ELEMENT timestamp (#PCDATA)>     <!-- according to ISO8601 -->
<!ELEMENT minor EMPTY>             <!-- minor flag -->
<!ELEMENT comment (#PCDATA)>
<!ELEMENT text (#PCDATA)>          <!-- Wikisyntax -->
<!ATTLIST text xml:space CDATA #FIXED "preserve">
<!ELEMENT contributor ((username,id) | ip)>
<!ELEMENT username (#PCDATA)>
<!ELEMENT ip (#PCDATA)>
<!ELEMENT upload (timestamp,contributor,comment?,filename,src,size)>
<!ELEMENT filename (#PCDATA)>
<!ELEMENT src (#PCDATA)>
<!ELEMENT size (#PCDATA)>

Processing XML export

Many tools can process the exported XML. If you process a large number of pages (for instance a whole dump), the document probably won't fit in main memory, so you will need a parser based on SAX or other event-driven methods.
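One event-driven option in Python is `xml.etree.ElementTree.iterparse`, which hands over each element as soon as its closing tag has been read. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the namespace URI in the sample data is modelled on the schema URL mentioned earlier rather than taken from a real dump.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name, if any."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(stream):
    """Yield the title of every <page>, reading the stream incrementally."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        for child in elem:
            if local_name(child.tag) == "title":
                yield child.text
                break
        # Release the revisions and text of the page that was just handled.
        elem.clear()


# Tiny in-memory stand-in for a dump file opened in binary mode.
data = (
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/">'
    b"<page><title>A</title><revision><text>x</text></revision></page>"
    b"<page><title>Talk:A</title><revision><text>y</text></revision></page>"
    b"</mediawiki>"
)
print(list(iter_page_titles(io.BytesIO(data))))
```

Clearing each page after use discards its revisions and text; the emptied page elements stay attached to the root, so very large dumps may need them removed from the root as well.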

You can also use regular expressions to directly process parts of the XML code. These run fast but are difficult to maintain.
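A short sketch of the regular-expression approach, pulling page titles out of the raw XML text. The pattern and the function name `titles_by_regex` are illustrative only, and the pattern assumes that `<title>` elements carry no attributes.

```python
import re
from xml.sax.saxutils import unescape

# Non-greedy match of whatever sits between <title> and </title>.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def titles_by_regex(xml_text):
    """Return all page titles found in the XML text, with entities decoded."""
    # The regex sees the raw text, so &amp; etc. must be decoded by hand.
    return [unescape(title) for title in TITLE_RE.findall(xml_text)]


sample = "<page><title>AT&amp;T</title></page><page><title>B</title></page>"
print(titles_by_regex(sample))
```

The manual entity decoding is one example of the maintenance burden mentioned above: a real XML parser handles it automatically.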

Please list methods and tools for processing XML export here:

Parse::MediaWikiDump is a Perl module for processing the XML dump file.
m:Processing MediaWiki XML with STX - Stream based XML transformation

Details and practical advice

To determine the namespace of a page, you have to match its title against the prefixes defined in

/mediawiki/siteinfo/namespaces/namespace
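The matching can be sketched as follows, assuming the export includes a siteinfo block without an XML namespace declaration. The sample data and the helper names `load_prefixes` and `namespace_key` are hypothetical, and the sketch assumes that key 0 denotes the main namespace.

```python
import xml.etree.ElementTree as ET

# Made-up siteinfo fragment with three namespaces.
SITEINFO = """<mediawiki><siteinfo><namespaces>
  <namespace key="0" />
  <namespace key="1">Talk</namespace>
  <namespace key="12">Help</namespace>
</namespaces></siteinfo></mediawiki>"""


def load_prefixes(xml_source):
    """Map each namespace prefix to its numeric key."""
    root = ET.fromstring(xml_source)
    nodes = root.findall("./siteinfo/namespaces/namespace")
    # The main namespace has no prefix text, so it is left out of the map.
    return {node.text: int(node.get("key")) for node in nodes if node.text}


def namespace_key(title, prefixes):
    """Return the namespace key for a title, or 0 if no prefix matches."""
    prefix, separator, _rest = title.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix]
    return 0


prefixes = load_prefixes(SITEINFO)
print(namespace_key("Help:Export", prefixes))
print(namespace_key("Export", prefixes))
```

A title that contains a colon but no known prefix, such as a subtitle after a colon, falls through to the main namespace.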

Possible restrictions are
- sysop (protected pages)

Wikipedia-specific help

Wikipedia:WikiProject Transwiki/exporting - instructions on how to export the entire history of a Wikipedia article.


Retrieved from "https://en.wikipedia.org/w/index.php?title=Help:Export&oldid=1299605012"

Category: Wikipedia how-to
