Blacklisting
The internet isn't the place it used to be. Everything good gets corrupted, and so it is with wikis. WikiSpam, like spam in blogs and email, is on the rise. If you use DokuWiki on your intranet, this is no problem for you. But if you intend to use it on the open Internet, you may want to blacklist some known spam words.
To use a blacklist in DokuWiki:
- enable the usewordblock option in the config manager (on by default)
- edit the conf/wordblock.local.conf file. You can look inside the file conf/wordblock.conf for a list of existing word blocks.
The file contains regular expressions (Perl compatible). If any of them matches the text being saved, saving is disallowed. To understand why a certain text was blocked as spam, you can use the whyspam plugin to analyze the text.
IP-based blocking can be done using Apache's deny from directives or the ipban plugin.
Adding/removing block words from list
You can add your own block words by creating a conf/wordblock.local.conf file and placing your expressions in it, one per line.
- conf/wordblock.local.conf
(long|loud) shouting
Disable default block words by prefixing them with an exclamation mark (!):
- conf/wordblock.local.conf
!woww gold
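Before adding an expression to the file, it can help to try it against a sample text. The following is a minimal sketch, not part of DokuWiki: the function name wordblock_check is hypothetical, it assumes that # starts a comment in the file, and it uses grep -E, which only approximates Perl compatible syntax, so expressions relying on Perl-only features may behave differently than in the wiki.

```shell
#!/bin/sh
# Hypothetical helper (not part of DokuWiki): print every expression of a
# wordblock file that matches a sample text file.
# usage: wordblock_check PATTERN_FILE TEXT_FILE
# returns 0 if at least one expression matched, 1 otherwise
wordblock_check() {
    patterns=$1
    text=$2
    found=1
    # Read one expression per line; -r keeps backslashes intact.
    while IFS= read -r line || [ -n "$line" ]; do
        # Assumption: '#' starts a comment; surrounding whitespace is ignored.
        pattern=$(printf '%s\n' "$line" |
            sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
        [ -n "$pattern" ] || continue
        # Lines starting with '!' remove a default entry; they are not patterns.
        case $pattern in
            '!'*) continue ;;
        esac
        # -E extended regex (approximation of PCRE), -i ignore case, -q quiet.
        if grep -Eiq -e "$pattern" "$text"; then
            printf 'match: %s\n' "$pattern"
            found=0
        fi
    done < "$patterns"
    return $found
}
```

A call such as wordblock_check conf/wordblock.local.conf sample.txt lists the expressions that would block the text stored in sample.txt (a file name chosen here for illustration).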
Blacklist Sources
Updating the blacklist from a public source through a daily cron job is recommended. Here is a list of sources you can use to do so.
Wikipedia
The nice people at Wikipedia maintain a similar blacklist. You can use the following command to update your blacklist from this source:
$> curl -s 'https://meta.wikimedia.org/wiki/Spam_blacklist?action=raw' | grep -Ev '</?pre>' > conf/wordblock.local.conf
Don't forget to create the conf/wordblock.local.conf file.
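For the recommended daily cron job, a small wrapper script is more robust than the bare pipeline, because a failed download would otherwise leave an empty blacklist behind. The following is only a sketch under assumptions: the function names, the --run switch and the default TARGET path are made up for illustration and need to be adapted to your installation. Like the command above, it overwrites the whole conf/wordblock.local.conf file.

```shell
#!/bin/sh
# Sketch of a daily blacklist update. The TARGET default is an assumed
# installation path, not something DokuWiki prescribes.
BLACKLIST_URL='https://meta.wikimedia.org/wiki/Spam_blacklist?action=raw'
TARGET=${TARGET:-/var/www/dokuwiki/conf/wordblock.local.conf}

# Drop the <pre> and </pre> wrapper lines of the raw wiki page.
strip_pre() {
    grep -Ev '</?pre>'
}

# Download to a temporary file first and overwrite the target only if the
# download succeeded and the filtered list is not empty.
update_blacklist() {
    raw=$(mktemp) || return 1
    new=$(mktemp) || { rm -f "$raw"; return 1; }
    if curl -fsS -o "$raw" "$BLACKLIST_URL" &&
        strip_pre < "$raw" > "$new" &&
        [ -s "$new" ]; then
        # cat instead of mv keeps owner and permissions of an existing target.
        cat "$new" > "$TARGET"
        rc=$?
    else
        rc=1
    fi
    rm -f "$raw" "$new"
    return $rc
}

# Only act when called with --run, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    update_blacklist
fi
```

Saved under a name of your choice, for example /usr/local/bin/update-wordblock.sh (a hypothetical path), the script could be scheduled with a crontab entry like: 17 3 * * * /usr/local/bin/update-wordblock.sh --run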
Logging of blocked Attacks
This small change to the original DokuWiki file inc/common.php makes it possible to log blocked attacks in /data/meta/wordblock.log
and can also be used for debugging block lists.
Search Line:
function checkwordblock($text=''){
    [...]
    if(count($re) && preg_match('#('.join('|',$re).')#si',$text,$matches)) {
        // prepare event data
        $data['matches'] = $matches;
        $data['userinfo']['ip'] = $_SERVER['REMOTE_ADDR'];
        [...]
Change it to:
function checkwordblock($text=''){
    [...]
    if(count($re) && preg_match('#('.join('|',$re).')#si',$text,$matches)) {
        // log the blocked attempt as one tab-separated line:
        // date, matched text, page ID, user, IP:port, hostname, user agent
        // (assumes $conf and $ID are declared global in the elided part above)
        io_saveFile($conf['metadir'].'/wordblock.log',
            strftime($conf['dformat'])."\t".$matches[0]."\t".$ID."\t".$_SERVER['REMOTE_USER']."\t".
            $_SERVER['REMOTE_ADDR'].":".$_SERVER['SERVER_PORT']."\t".
            gethostbyaddr($_SERVER['REMOTE_ADDR'])."\t".$_SERVER['HTTP_USER_AGENT']."\n",
            true);
        // prepare event data
        $data['matches'] = $matches;
        $data['userinfo']['ip'] = $_SERVER['REMOTE_ADDR'];
        [...]
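Once the log is being written, a quick summary shows which matched texts trigger the block most often, which helps when debugging a block list. This is a minimal sketch: the function name wordblock_top is hypothetical, and it relies only on the log being tab-separated with the matched text in the second field, as written by the io_saveFile call above.

```shell
#!/bin/sh
# Hypothetical helper: list the most frequent matched texts in the log.
# usage: wordblock_top LOG_FILE [COUNT]
wordblock_top() {
    log=$1
    count=${2:-10}
    # cut splits on tabs by default; field 2 holds the matched text.
    # Matching is case-insensitive, so fold case before counting.
    cut -f2 "$log" |
        tr '[:upper:]' '[:lower:]' |
        sort |
        uniq -c |
        sort -rn |
        head -n "$count"
}
```

For example, wordblock_top data/meta/wordblock.log 5 prints the five most frequent matches, each preceded by its count.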
See also
- You can install and use the Configuration File Manager to edit the files via the admin interface of the wiki