Wikitech-l, May 2003

wikitech-l@lists.wikimedia.org

59 participants
147 discussions

Countering the vandals who attack users (not just pages)
by David A. Wheeler 18 Jun '03

Sadly, it appears that there are some hurtful vandals out there who are attacking the people trying to counter them. For example, User:Zoe has just posted that she's abandoning her efforts to counter vandals; see:
http://www.wikipedia.org/wiki/User%3AZoe
which begins: "I'm tired of fighting, I'm tired of arguing, I'm tired of being called names." The last straw seems to have been an edit by No-Fx to Zoe's user page, in which No-Fx made it appear that Zoe was "into oral sex". I don't know enough about this situation to know for sure if this is an example, but I am concerned about the long-term dangers if this starts a trend.

Attacks on users and sysops - particularly highly dedicated ones - are much more dangerous to the Wikipedia than simple attacks on a few pages. If these kinds of attacks cause people to stop weeding out bad pages or vandals for fear of retribution, the project is doomed.

Is there any way the software could be modified to make it harder for vandals to counter-attack the people who are trying to remove vandalism?

At the least, why not let the User:NAME pages be editable ONLY by NAME? The "User_talk:" spaces need to be editable in some way, but I don't see a need for others to "fix" someone's User: space; it's not critical that that content be fixed, and there are advantages to having some areas that are "precious" to each user.

Here's a more controversial idea: perhaps some information relating to deletion of pages and banning of users should be hidden from non-sysops. For example, since "delete" can only be done by sysops, why not just tell non-sysops that a deletion occurred, but not WHICH sysop did it? By the same token, perhaps some discussion areas should be readable/writeable only by sysops, in particular a discussion area to discuss banning someone. Perhaps there could be a way for anyone (including non-sysops) to suggest that someone be banned without having their name revealed to non-sysops. Since real deletes and banning can only be done by sysops anyway, and sysops are trusted, there's no reason this information MUST be public.

A related idea might be to modify the "talk" system so that it's more like a bulletin board, with threaded messages and a clear identification of who wrote each one (click on "reply" to reply to that item, maybe in a threaded way). That way, any message is clearly identified with its REAL author. A side-effect would be that the attribution would happen automatically (no more forgetting ~~~~). That way, when people discuss things, they can't make it appear that someone else made an outrageous/nasty statement.

The goal here would be to prevent people from attacking each other, or at least limit the effectiveness of such attacks.

Thoughts?

13 34

Static html
by Alfio Puglisi 12 Jun '03

After some delays and bug-hunting, my script for the static HTML version is in acceptable shape.

Here you can see an example, built from a SQL file of some weeks ago (don't try the Search box! I explain below):
http://www.arcetri.astro.it/~puglisi/wiki/dump/ma/main_page.html

Please don't DoS the connection, it's not a very fast line.

Interested parties can find the script here:
http://www.arcetri.astro.it/~puglisi/wiki/wiki2static.txt
(renamed to .txt due to some server misconfiguration)

Use a wide terminal for this one. Everything (HTML code included) is in one single file. The whitespace may appear weird because I use 4-space tabs. There's no need to tell me you don't like the coding style, I already know :-)))

Some issues:

- the topbar links do not work (known bug :-). The Edit link goes to the online Wikipedia site.
- interlanguage links are ignored
- some wiki markup is not recognized yet.
- no images are present (of course!)
- filenames should be OK for most filesystems that are not "8.3" limited (max 63 chars, only a-z, 0-9 and underscore)
- despite the two-letter subdirectories, some of them have over 4,000 files in them!
- Time: the script takes more than 2 hours on my 1.3 GHz Athlon...
- Size: this dump is about 800MB (the tar.gz is just 110MB). I think that I can bring it down to 600-650MB with a bit of trimming and eliminating unnecessary redirects. BUT, without some form of compression, the English Wikipedia will soon overflow a single CD. Maybe we should target DVDs? :-)
- Images: no images are present here. AFAIK, each of them has a SQL record (that my script skips), but the actual image data is not included. How many megabytes of images do we have? I think it will be impossible to store the full images on a CD. Certainly it's possible on a DVD. Maybe a low-res version could be included on a CD.
- Search: I tried a JavaScript search that worked well for small databases: it's basically a big array of strings (article titles and filenames) with a few lines of code that do a regexp match against them. For full-sized databases like this one, the search page becomes an 8-megabyte monster that takes forever to process (IE grabs 100 MB of memory and stops there, Opera is even worse). I'll see if I can find a different solution.

Enough for now. While I carry on development, any input is welcome.

Ciao,
Alfio
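The filename rule in the issues list (at most 63 characters, only a-z, 0-9 and underscore, stored under a two-letter subdirectory) can be sketched roughly as follows. This is not taken from wiki2static; the function name `static_path`, the choice to collapse runs of disallowed characters into one underscore, and the assumption that the 63-character limit excludes the ".html" suffix are all illustrative guesses. The only anchor is the example URL, where "Main Page" ends up at ma/main_page.html.

```python
import re

MAX_NAME_LEN = 63  # limit quoted in the message; assumed to exclude ".html"


def static_path(title: str) -> str:
    """Map an article title to 'xx/name.html' (hypothetical layout)."""
    # Lowercase, then collapse every run of characters outside a-z / 0-9
    # into a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    # Truncate to the length limit; fall back to "_" for empty results.
    name = name[:MAX_NAME_LEN] or "_"
    # Two-letter subdirectory taken from the start of the name,
    # padded with underscores for one-character names.
    subdir = name[:2].ljust(2, "_")
    return f"{subdir}/{name}.html"


print(static_path("Main Page"))
```

A mapping like this is lossy: distinct titles that differ only in case, punctuation, or non-ASCII letters collapse to the same name, so a real script would need some collision handling on top of it.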

15 109

request for a redirect
by giskart＠gmx.net 02 Jun '03

Can somebody set up a redirect from
http://nds.wikipedia.org and
http://www.nds.wikipedia.org
to
http://za.wikipedia.com

The Plattdüütsch Wikipedia is now using an old UseMod wiki with the wrong language code. With the redirect, they can spread the correct URL even while they do not have a phase III wiki.

Giskart

6 6

Re: Fair use
by Daniel Mayer 02 Jun '03

Marco wrote:
> No, you lose much more. You can not easily combine the content of two
> "free" encyclopedias and get something that is "free". You can not copy
> images from the English Wikipedia to the German Wikipedia anymore because
> the "fair use" right does not work this way in Germany.

What? Since when has the German Wikipedia moved to a German-based server? Well, I know for a fact that it hasn't, so German law has no bearing on the legality of having "fair use" (per US law) images on the German Wikipedia. However, those people who are subject to German law may be legally barred from uploading such images. But there are plenty of German-speaking Wikipedians living outside of Germany to do this.

-- Daniel Mayer (aka mav)

8 9

3 new InterWiki prefixes
by erik_moeller＠gmx.de 01 Jun '03

I have added

* PageHistory
* UserContributions
* BackLinks

InterWiki prefixes, because we currently do not support parameters in the [[Special:]] namespace, and this was the lazy way to provide a much needed quickfix. Among other things, this allows us to put

#REDIRECT [[UserContributions:Username]]

on the user page of a known vandal, making it easier to fix his edits from RC.

As you might expect, these InterWiki links point to en:. I pondered picking a name like EnHistory, EnContris etc., but I wanted something intuitive. If other languages want the same functionality, prefixes like "SeitenHistorie", "BenutzerBeitraege" and "LinksAuf" can be easily added (i.e. local equivalents).

Note that if we change the functionality of [[Special:]], things like [[Special:MovePage->nul|Click here]] also become possible unless we specifically forbid them.

Regards,
Erik
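As a rough illustration of how a prefix such as UserContributions can stand in for a parameterized special page, the sketch below resolves a [[Prefix:Target]] link through a lookup table. Everything concrete here is an assumption made for the example: the `INTERWIKI` table, the `en.example.org` URL templates, the `$1` placeholder convention, and the `resolve` helper are hypothetical and are not the actual entries that were added.

```python
import re
from urllib.parse import quote

# Hypothetical prefix table. The URL templates are placeholders only;
# "$1" marks where the link target is substituted (an assumed convention).
INTERWIKI = {
    "PageHistory": "http://en.example.org/history?title=$1",
    "UserContributions": "http://en.example.org/contributions?user=$1",
    "BackLinks": "http://en.example.org/whatlinkshere?title=$1",
}

# Matches [[Prefix:Target]] and [[Prefix:Target|label]].
LINK = re.compile(r"\[\[([^:\[\]|]+):([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def resolve(link: str):
    """Return the URL for an interwiki-style link, or None if the prefix is unknown."""
    match = LINK.fullmatch(link)
    if match is None:
        return None
    prefix, target, _label = match.groups()
    template = INTERWIKI.get(prefix)
    if template is None:
        return None
    # Spaces become underscores, then the target is percent-encoded.
    return template.replace("$1", quote(target.replace(" ", "_")))


print(resolve("[[UserContributions:Username]]"))
```

The point of the design is that the parameter travels in the link target, so no change to the [[Special:]] namespace handling is needed.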

2 3

Bug in Search
by Thomas Corell 31 May '03

If you enter something like this (tested in the German Wikipedia):

"5 AND Dezember"

you get:

1064: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near 'AND (MATCH (si_title) AGAINST ('dezember')) ) AND cur_namespace":

SELECT cur_id,cur_namespace,cur_title,cur_text
FROM cur,searchindex
WHERE cur_id=si_page
AND ( AND (MATCH (si_title) AGAINST ('dezember')) )
AND cur_namespace IN (0)
LIMIT 0, 20

I see two ANDs one after the other, which means that "this->mTextcond" is empty (in the source code). This happens with any single character as a search term, not only numbers.

Someone good at SearchEngine.php should take a look.

--
Smurf
smurf(a)AdamAnt.mud.de
------------------------- Anthill inside! ---------------------------
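The failure mode reported above, an operator left dangling after its operand was dropped, can be reproduced with a small sketch. This is not the logic in SearchEngine.php; the function names, the `min_len` cutoff, and the assumption that single-character terms are silently discarded are hypothetical and chosen only to mirror the symptom. The second function shows one possible guard.

```python
OPERATORS = {"AND", "OR"}


def _clause(term, field):
    # Illustration only: real code must escape the term or use bound parameters.
    return "(MATCH (%s) AGAINST ('%s'))" % (field, term.lower())


def build_naive(query, field="si_title", min_len=2):
    """Drop short terms but still emit every operator (reproduces the bug)."""
    parts = []
    for token in query.split():
        if token.upper() in OPERATORS:
            parts.append(token.upper())
        elif len(token) >= min_len:
            parts.append(_clause(token, field))
    return "( " + " ".join(parts) + " )"


def build_guarded(query, field="si_title", min_len=2):
    """Emit an operator only between two clauses that both survived."""
    parts = []
    pending = None  # operator waiting for a right-hand operand
    for token in query.split():
        if token.upper() in OPERATORS:
            pending = token.upper() if parts else None
        elif len(token) >= min_len:
            if parts:
                parts.append(pending or "AND")  # implicit AND between bare terms
            parts.append(_clause(token, field))
            pending = None
        else:
            pending = None  # operand dropped, so its operator goes too
    return "( " + " ".join(parts) + " )" if parts else None


print(build_naive("5 AND Dezember"))
print(build_guarded("5 AND Dezember"))
```

With the guarded version, a query consisting only of dropped terms yields `None`, so the caller can skip the condition entirely instead of sending an empty pair of parentheses to MySQL.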

1 0

InnoDB monitor -- how to turn off?
by Brion Vibber 31 May '03

Is somebody deliberately turning on the InnoDB monitor, or is some setting turning it on automatically?

Every 15 seconds it dumps data to the MySQL error log file, listing a bunch of status information and every transaction that's been done since the last dump. That comes to several hundred megs of log file after a few days, which can only be freed from the disk by deleting the log file and restarting MySQL.

So if anyone knows a way to have it not start up, that would be nice. :)

-- brion vibber (brion @pobox.com)

1 0

RE: Static html
by Erik Zachte 30 May '03

Hi Alfio,

I looked at your code. Nice job.

Superficially it may seem we did almost the same job, but the overlap is minimal. My Perl script addresses a lot of issues that are only relevant in a Palm/Pocket PC/TomeRaider environment. Your version has quite some code which is specific to a static HTML version. Still, there are some areas where we can be of help to each other.

You mentioned Unicode support as an open issue. Coincidentally, I was looking into this issue the past few days while preparing a TomeRaider version of the Esperanto Wikipedia, which would be unreadable without it. You will also find the UTF-8 coding scheme on which this is based below.

Here is some Perl code to translate multi-byte UTF-8 sequences into HTML character references of type &#nnn;

# unicode -> html character codes &#nnnn;
$entry =~ s/([\x80-\xFF]+)/&UnicodeToHtml($1)/ge ;

sub UnicodeToHtml
{
  my $text = shift ;
  my $html = "" ;
  my ($c, $byte, $ord, $unicode, $bytes) ;

  for ($c = 0 ; $c < length ($text) ; $c++)
  {
    $byte = substr ($text,$c,1) ; # optimize with regexp ?
    $ord = ord ($byte) ;
    if ($ord < 128) # plain ascii character
    { $html .= $byte ; } # (will not occur in this script)
    else
    {
      # the lead byte tells how many bytes the sequence has
      if    ($ord < 224) { $bytes = 2 ; }
      elsif ($ord < 240) { $bytes = 3 ; }
      elsif ($ord < 248) { $bytes = 4 ; }
      elsif ($ord < 252) { $bytes = 5 ; }
      else               { $bytes = 6 ; }
      $unicode = substr ($text,$c,$bytes) ;
      $html .= &UnicodeToHtmlTag ($unicode) ;
      $c += $bytes - 1 ;
    }
  }
  return ($html) ;
}

sub UnicodeToHtmlTag
{
  my $unicode = shift ;
  my $char = substr ($unicode,0,1) ;
  my $ord = ord ($char) ;
  my ($c, $value) ;

  if ($ord < 128) # plain ascii character
  { return ($unicode) ; } # (will not occur in this script)
  else
  {
    # strip the length-marker bits from the lead byte
    if    ($ord >= 252) { $value = $ord - 252 ; }
    elsif ($ord >= 248) { $value = $ord - 248 ; }
    elsif ($ord >= 240) { $value = $ord - 240 ; }
    elsif ($ord >= 224) { $value = $ord - 224 ; }
    else                { $value = $ord - 192 ; }

    # each continuation byte (10xxxxxx) contributes 6 more bits
    for ($c = 1 ; $c < length ($unicode) ; $c++)
    { $value = $value * 64 + ord (substr ($unicode, $c,1)) - 128 ; }
    return ("\&\#" . $value . ";") ;
  }
}

Found this somewhere on the web:

#UTF-8 works as follows:
#ENCODING
# The following byte sequences are used to represent a char-
# acter. The sequence to be used depends on the UCS code
# number of the character:
#
# 0x00000000 - 0x0000007F:
#   0xxxxxxx
#
# 0x00000080 - 0x000007FF:
#   110xxxxx 10xxxxxx
#
# 0x00000800 - 0x0000FFFF:
#   1110xxxx 10xxxxxx 10xxxxxx
#
# 0x00010000 - 0x001FFFFF:
#   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
#
# 0x00200000 - 0x03FFFFFF:
#   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
#
# 0x04000000 - 0x7FFFFFFF:
#   1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
#
# The xxx bit positions are filled with the bits of the
# character code number in binary representation. Only the
# shortest possible multibyte sequence which can represent
# the code number of the character can be used.

By the way, I enjoyed your contribution about Ant Power.

If you have any questions or suggestions you can reach me at
xxx(a)chello.nl (!spam: read xxx as epzachte)

Cheers, Erik Zachte
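For comparison, the same decoding step can be sketched in Python by following the bit layout in the table above. The function name `utf8_to_ncr` is made up for this example, and the sketch deliberately handles only sequences of up to four bytes, since current UTF-8 (RFC 3629) no longer permits the 5- and 6-byte forms. Unlike the Perl version, it rejects stray or missing continuation bytes instead of decoding them.

```python
def utf8_to_ncr(data: bytes) -> str:
    """Turn UTF-8 bytes into ASCII text with &#nnn; numeric character references."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                      # 0xxxxxxx: plain ASCII
            out.append(chr(lead))
            i += 1
            continue
        if 0xC0 <= lead < 0xE0:              # 110xxxxx: 2-byte sequence
            length, value = 2, lead & 0x1F
        elif 0xE0 <= lead < 0xF0:            # 1110xxxx: 3-byte sequence
            length, value = 3, lead & 0x0F
        elif 0xF0 <= lead < 0xF8:            # 11110xxx: 4-byte sequence
            length, value = 4, lead & 0x07
        else:
            raise ValueError("invalid UTF-8 lead byte at offset %d" % i)
        tail = data[i + 1:i + length]
        if len(tail) != length - 1 or any(b & 0xC0 != 0x80 for b in tail):
            raise ValueError("bad continuation bytes at offset %d" % i)
        for b in tail:                       # 10xxxxxx: six payload bits each
            value = (value << 6) | (b & 0x3F)
        out.append("&#%d;" % value)
        i += length
    return "".join(out)


sample = "Eĥoŝanĝo ĉiuĵaŭde"
print(utf8_to_ncr(sample.encode("utf-8")))
```

When the input is already a decoded Python string, the built-in error handler gives the same result without any manual bit handling: `sample.encode("ascii", "xmlcharrefreplace").decode("ascii")`.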

1 0

RE: Unicode and polices...Have you tried this?
by Erik Zachte 29 May '03

>> If you run IE6 and right click on any web page, you will get a drop
down menu with "encoding" as an entry. Follow the arrow to a long list
of encodings. In my case, I chose Japanese and it was installed on
demand, in under a minute. Then I left "Encoding" set to "Autoselect."
<<

I tried this but it did not work for me. I remember that when I installed XP and then ran the 'Windows Update' wizard I clicked 'Remove' for all foreign language packages (a little short-sighted, looking back). Maybe this explains why. I could not find how to undo this.

There is also an "Enable Install On Demand (Explorer)" checkbox in Explorer -> Options -> Advanced (unchecked by default, or because of my actions above). Enabling this did not help me either.

Finally I found a link in the Wikipedia to "Alan Wood's Unicode Resources": http://www.alanwood.net/unicode/
Lots of info and useful links there.

He says that Microsoft has some very complete TrueType fonts. They are only shipped with MS Office. I copied the Arial Unicode font (Arialuni.ttf, 24 MB) from another machine running Office and all was well.

Erik Zachte

1 0

Re: [Wikitech-l] Unicode and polices...Have you tried this?
by rose.parks＠att.net 29 May '03

Hi,

I just got a new computer with Windows XP. I, also, was wondering where the old "Input Methods" for foreign languages were. If you run IE6 and right-click on any web page, you will get a drop-down menu with "Encoding" as an entry. Follow the arrow to a long list of encodings. In my case, I chose Japanese and it was installed on demand, in under a minute. Then I left "Encoding" set to "Autoselect."

If you are aware of this already, apologies...

As Ever,
Ruth Ifcher

--
> On Tue, 27 May 2003 12:32:19 +0900, Guillaume Blanchard
> <gblanchard(a)arcsy.co.jp> gave utterance to the following:
>
> > <older attribution for the >> was snipped by Guillaume>
> >> So... perhaps I understood nothing, but do you think
> >> Opera 5 is not accepting unicode because of missing
> >> fonts, or does it just not tolerate it at all?
> >
> > I think both are problems. Even if your browser can handle unicode,
> > you can't see characters not defined in your font. I'm using MS Arial
> > Unicode with IE6.0 and I am still not able to see 100% of unicode
> > characters. In my case I think it's only a font problem. You can go to
> > this page and look at what percentage of characters you can see:
> > http://www.columbia.edu/kermit/utf8.html (it's a UTF-8 sample page).
> >
> Opera 5 has no unicode support - Opera 6 was the unicode rewrite.
> Both Opera (6+) and Mozilla support unicode natively - the only thing you
> have to do to get it working is to install an appropriate font.
> However, even if you have the font, IE doesn't display some writing systems
> until you "install support" by downloading a large patch to your operating
> system. (A fully multilingual installation of IE6 weighs in at around 85MB)
>
> --
> Richard Grevers
> I hate Victor Hugo said Les miserably

1 0
