Hi everyone,

I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/) and I need to extract the content from it and convert it into LaTeX syntax for printed documentation. I have googled for a suitable OSS solution but nothing was apparent.

I would prefer a script written in Python, but any recommendations would be very welcome. Do you know of anything suitable?

Kind Regards,
Hugo Vincent,
Bluewater Systems.
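[A minimal sketch of one possible starting point, not an existing tool: MediaWiki can serve raw wikitext via index.php's action=raw parameter, and a small regex converter can handle the most common markup. The conversion rules below cover only headings, bold and italics; everything else (templates, tables, links) would need real parsing.]

```python
import re

def wikitext_to_latex(wikitext):
    """Convert a small subset of wikitext markup to LaTeX.

    Handles == headings == (three depths) and '''bold'''/''italic''.
    Deeper markup (templates, tables, [[links]]) is left untouched.
    """
    text = wikitext
    # Headings: deepest first, so ==== is not swallowed by the == rule.
    text = re.sub(r"^====\s*(.*?)\s*====\s*$", r"\\subsubsection{\1}", text, flags=re.M)
    text = re.sub(r"^===\s*(.*?)\s*===\s*$", r"\\subsection{\1}", text, flags=re.M)
    text = re.sub(r"^==\s*(.*?)\s*==\s*$", r"\\section{\1}", text, flags=re.M)
    # Inline styles: bold (''') before italics ('') for the same reason.
    text = re.sub(r"'''(.*?)'''", r"\\textbf{\1}", text)
    text = re.sub(r"''(.*?)''", r"\\emph{\1}", text)
    return text

# Raw wikitext for a page can be fetched with something like (URL layout
# is an assumption about the wiki's configuration):
#   urllib.request.urlopen("http://server.bluewatersys.com/w90n740/index.php"
#                          "?title=Some_Page&action=raw").read()
```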
I've been tinkering with an extension to provide a captcha to reduce automated linkspamming while still staying out of the way for common use. My preliminary code is running now on test.leuksman.com; the actual "captcha" part is a really primitive plain text hack which would take all of a few minutes for a dedicated attacker to crack, but don't worry about that -- I'm not testing the protection yet, just the framework it plugs into.

By default the captcha prompt will only kick in if an edit adds new URLs to the text. Most regular editing shouldn't trip this -- wiki links, plain text, or just preserving existing links. But if you add new HTTP links that weren't there before, it'll then make you pass the captcha before it saves.

The captcha step can also be bypassed based on user group (eg registered bots, sysop accounts, optionally all registered users), and can also be set to skip for any user who has gone through confirmation of their account e-mail address.

I haven't coded it yet, but it should also be possible to add a URL whitelist, for instance for the site's own local URLs.

As for a 'real' captcha generator to put into this system: I'm not too sure what code is already out there that's not awful. There's a Drupal plugin which would be easy to rip GPL'd PHP code from, but it doesn't seem very robust.

There's a set of samples of various captcha output and their weaknesses here:

http://sam.zoy.org/pwntcha/

Obviously it would be good to either find something on the 'hard captchas' list rather than 'defeated captchas', or roll our own that doesn't suck too bad.

There's also the question of whether we can feasibly provide an audio alternative or whathaveyou.

-- brion vibber (brion @pobox.com)
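[The "only trip on newly added URLs" rule can be sketched like this. This is a Python illustration of the idea only, not the actual extension code; the URL regex and the whitelist-of-prefixes handling are assumptions.]

```python
import re

# Very rough external-link matcher; the real extension's rules may differ.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"']+", re.IGNORECASE)

def added_urls(old_text, new_text):
    """Return the external URLs present in the new revision but not the old one."""
    return set(URL_RE.findall(new_text)) - set(URL_RE.findall(old_text))

def needs_captcha(old_text, new_text, whitelist=()):
    """Trigger the captcha only when an edit introduces new external links.

    URLs starting with a whitelisted prefix (eg the site's own local
    URLs) are ignored; preserving existing links never triggers it.
    """
    def is_whitelisted(url):
        return any(url.startswith(prefix) for prefix in whitelist)

    return any(not is_whitelisted(url) for url in added_urls(old_text, new_text))
```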
This is a minimal change to add a new magic token that prevents numbering of the TOC, and to do the same when the user option "Auto-number headings" under "Misc" in "Preferences" is set. This option is already retrieved through $this->mOptions->getNumberHeadings(); but is currently not used effectively.

The token directive is useful because some sections in a wiki may already include numbering as part of the heading, and suppressing auto-numbering is useful in those cases.

Index: includes/Parser.php
===================================================================
RCS file: /cvsroot/wikipedia/phase3/includes/Parser.php,v
retrieving revision 1.509
diff -u -r1.509 Parser.php
--- includes/Parser.php	23 Sep 2005 12:10:39 -0000	1.509
+++ includes/Parser.php	24 Sep 2005 12:57:48 -0000
@@ -2461,8 +2461,15 @@
 	 */
 	function formatHeadings( $text, $isMain=true ) {
 		global $wgMaxTocLevel, $wgContLang, $wgLinkHolders, $wgInterwikiLinkHolders;
-
-		$doNumberHeadings = $this->mOptions->getNumberHeadings();
+
+		# if the string __NOTOCNUM__ (not case-sensitive) occurs in the HTML,
+		# or if the user prefers not to, do not add TOC numbering
+		$mw =& MagicWord::get( MAG_NOTOCNUM );
+		if( $mw->matchAndRemove( $text ) ) {
+			$doNumberHeadings = false;
+		} else {
+			$doNumberHeadings = $this->mOptions->getNumberHeadings();
+		}
 		$doShowToc = true;
 		$forceTocHere = false;
 		if( !$this->mTitle->userCanEdit() ) {
@@ -2647,7 +2654,7 @@
 			$anchor .= '_' . $refcount[$headlineCount];
 		}
 		if( $doShowToc && ( !isset($wgMaxTocLevel) || $toclevel<$wgMaxTocLevel ) ) {
-			$toc .= $sk->tocLine($anchor, $tocline, $numbering, $toclevel);
+			$toc .= $sk->tocLine($anchor, $tocline, $doNumberHeadings?$numbering:'', $toclevel);
 		}
 		if( $showEditLink && ( !$istemplate || $templatetitle !== "" ) ) {
 			if ( empty( $head[$headlineCount] ) ) {
Index: languages/Language.php
===================================================================
RCS file: /cvsroot/wikipedia/phase3/languages/Language.php,v
retrieving revision 1.684
diff -u -r1.684 Language.php
--- languages/Language.php	23 Sep 2005 12:10:39 -0000	1.684
+++ languages/Language.php	24 Sep 2005 12:58:06 -0000
@@ -190,6 +190,7 @@
 	# ID                 CASE  SYNONYMS
 	MAG_REDIRECT        => array( 0, '#redirect' ),
 	MAG_NOTOC           => array( 0, '__NOTOC__' ),
+	MAG_NOTOCNUM        => array( 0, '__NOTOCNUM__' ),
 	MAG_FORCETOC        => array( 0, '__FORCETOC__' ),
 	MAG_TOC             => array( 0, '__TOC__' ),
 	MAG_NOEDITSECTION   => array( 0, '__NOEDITSECTION__' ),

--
http://members.dodo.com.au/~netocrat
Hi,

We requested a new 'portal' namespace about one year ago, but it came to nothing. Some days ago, we discovered that the English and German Wikipedias are now using this namespace (sic!), so we are requesting to be able to do the same [1]. The French word for portal is 'portail'.

Regards,
Aoineko

[1] http://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/30_ao%C3%BBt_2005#Un_…
Hi,

I have just joined. I am from Mumbai, India. I would like to get the articles translated into Marathi, my mother tongue. Looking at the effort required and the number of volunteers, this will not be usable in any reasonable amount of time.

That has made me think of alternatives: machine translation. A state-funded institute has software available, but I don't have access to it yet. Please comment on this approach. Has this been tried for any other language before?

Thanks & regards,
Prasad Gadgil
On 28/09/05, Phil Boswell wrote:
> "Mark Ryan" wrote:
> > Multilingual error messages have now been implemented on the Wikimedia
> > squids. I would like to thank everyone who helped to make this a
> > reality over the past couple of weeks. I was keeping a running list of
> > everyone who had helped, but I lost track of everyone :)
>
> Kudos to you and your helpers!
>
> Can you remind us of where we can see these messages *without* requiring a
> WP failure?

Well, as I just discovered looking for something else, you can always "use the source, look":

http://cvs.sourceforge.net/viewcvs.py/wikipedia/tools/downtime/language-sup…

--
Rowan Collins BSc [IMSoP]
Thank you so much, Ashar and Tim, for setting up the new subdomains!

With regards,
Jay B.

2005/9/25, wikitech-l-request(a)wikimedia.org <wikitech-l-request(a)wikimedia.org>:
> Message: 5
> Date: Sun, 25 Sep 2005 15:29:50 +0200
> From: Ashar Voultoiz <hashar(a)altern.org>
> Subject: [Wikitech-l] nap, war, lad wikipedias created
>
> Hello,
>
> With the technical assistance of Tim Starling, I created three new
> wikipedia projects which were pending somewhere on meta.
>
> The projects are:
>
> Ladino : http://lad.wikipedia.org/
> Neapolitan : http://nap.wikipedia.org/
> Waray-Waray : http://war.wikipedia.org/
>
> For any trouble with those newly created projects, please reply to
> wikitech-l mailing list only (followup-to set).

--
ilooy.gaon(a)gmail.com
I am interested in making a map of Wikipedia in order to streamline the content, provide an overview of different areas, and connect Wikipedia to digital archives maintained by museums and laboratories all around the world. For more information, please see http://meta.wikimedia.org/wiki/CDT_proposal .

If you would like to collaborate, or if you already have similar efforts underway, please contact me.

Thank you,
Deborah MacPherson

*************************************************
Deborah MacPherson, Projects Director
Accuracy & Aesthetics, A Nonprofit Organization for the Advancement of Education, Cultural Heritage, and Science
www.accuracyandaesthetics.com
www.deborahmacpherson.com

mailing address: PO Box 52, Vienna VA 22183 USA
phones: 703 585 8924 and 703 242 9411
mailto:debmacp@gmail.com

The content of this email may contain private and confidential information. Do not forward, copy, share, or otherwise distribute without explicit written permission from all correspondents.
**************************************************
Pakaran suggested on IRC the use of 7zip's LZMA compression for data dumps, claiming really big improvements in compression over gzip. I did some test runs with the September 17 dump of es.wikipedia.org and can confirm it does make a big difference:

  10,995,508,118  pages_full.xml       1.00x  uncompressed XML
   2,320,992,228  pages_full.xml.gz    4.74x  gzipped output from mwdumper
     775,765,248  pages_full.xml.bz2  14.17x  "bzip2"
     155,983,464  pages_full.xml.7z   70.49x  "7za a -si"

(gzip -9 makes a negligible difference versus the default compression level; bzip2 -9 seems to make no difference.)

The 7za program is a fair bit slower than gzip, but at 10-15 times better compression I suspect many people would find the download savings worth a little extra trouble.

While it's not any official or de-facto standard that we know of, the code is open source (LGPL, CPL) and a basic command-line archiver is available for most Unix-like platforms as well as Windows, so it should be free to use (in the absence of surprise patents):

http://www.7-zip.org/sdk.html

I'm probably going to try to work LZMA compression into the dump process to supplement the gzipped files; and/or we could switch from gzip back to bzip2, which provides a still respectable improvement in compression and is a bit more standard.

(We'd switched from bzip2 to gzip at some point in the SQL dump saga; I think this was when we had started using gzip internally on 'old' text entries and the extra time spent on bzip2 was wasted trying to recompress the raw gzip data in the dumps.)

-- brion vibber (brion @pobox.com)
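[For anyone who wants to reproduce a comparison like this on their own data, Python's standard library exposes all three algorithms; the lzma module implements the same LZMA algorithm 7-Zip uses. A small sketch, with numbers that will of course vary by input:]

```python
import bz2
import gzip
import lzma

def compression_ratios(data):
    """Compress one byte string with gzip, bzip2 and LZMA (7-Zip's
    algorithm) and return uncompressed/compressed size ratios."""
    compressed = {
        "gzip": gzip.compress(data, compresslevel=9),
        "bzip2": bz2.compress(data, compresslevel=9),
        "lzma": lzma.compress(data),
    }
    return {name: len(data) / len(blob) for name, blob in compressed.items()}
```

On highly repetitive XML such as dump files, LZMA's much larger dictionary and better entropy coding are what give it the edge over gzip's 32 KB deflate window.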
Hi,

Sorry if this is not the right place for this. I'm a sort of a black sheep here, running a static HTML clone of Wikipedia. Right now I'm trying to improve the program to include {{msg}}. Half a year ago this was the URL to get the content: http://no.wikipedia.org/w/wiki.phtml?title=Template:Akershus&action=raw&cty… . Is there a new way or URL to get the content?

Regards,
Stefan Vesterlund
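[MediaWiki still serves raw wikitext through its index.php entry point with action=raw; a quick way to build such URLs programmatically. The /w/index.php script path is an assumption about the site's configuration, since older installs exposed wiki.phtml instead:]

```python
from urllib.parse import urlencode

def raw_page_url(script_url, title):
    """Build a URL asking MediaWiki for the raw wikitext of a page.

    script_url is the path to index.php; the exact path depends on
    how the wiki is set up.
    """
    return script_url + "?" + urlencode({"title": title, "action": "raw"})
```

For example, raw_page_url("http://no.wikipedia.org/w/index.php", "Template:Akershus") yields a URL whose response is the template's raw wikitext (the colon in the title is percent-encoded by urlencode).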