February 2, 2012

MediaWiki spam is such a headache

MediaWiki, like any other popular software product that accepts user input, has become a massive target for spam. Now, I’m not talking about vandals running around on Wikipedia trying to convince you that the African elephant population has tripled in six months. I’m talking about plain old spammers, just like email spammers and comment spammers on blogs. These automated bots run around and slam small wikis with all sorts of dumb advertisements… and they drive me nuts.

Why is Wiki spam so much worse to me than other types of spam? Well, first off I run a lot of wikis (Snowulf, Au RCC 2010, RCC 2010, RecentChangesCamp, WestCoastWikiCon 2011, IPv6Wiki to name a few). Every wiki that I run is an open door for spammers who could potentially get my server marked as a spam zone, thereby hurting other sites (like this one). Mostly I hate MediaWiki spam because it is such a slow process to deal with.

When comment spam makes it through the spam filter here at the blog (and I do get a fair bit), I just click the trash button and it’s all gone. With wikis, there is a lot more cleanup work. More importantly, WordPress has the most amazing plugin/service called Akismet. It catches probably 99% of the spam comments to the blog. It’s like SpamAssassin for blog commentary (actually it’s better, since SpamAssassin is fairly limited).

Combatting spam, the hard way
I spend way more of my wiki administration time fighting spam than I do actually getting fun things done. As with any other game of cat-and-mouse, so much time is spent trying to outwit the spammers. At one point, simply requiring a user to verify their email address before editing was enough to stop all automated spammers. Now there are pages devoted to the topic of combating spam on MediaWiki. Wikipedia has fairly good tools, but most of them are user powered (like AbuseFilter) and require constant manual adjustment and updating.

What I’d love is for someone to make an Akismet for wikis. The bots that hit all my wikis are the same bots that run all around the web. They post the same type of comment and the same shitty pictures everywhere they go. The plugin needs to allow administrators to mark an edit as spam, report that back to a central server, and deal with the edit. Additionally, each edit would be checked against some sort of central database (much like how Akismet works). You could even check images in a similar way, using a SHA-1/256/512 hash of the image contents. I assure you that I’ve been hit hard by the image spammers and at least 75% of their images (uploaded to one wiki) are identical.
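To make the image idea a bit more concrete, here is a rough Python sketch of the hash check. Everything in it is hypothetical: SharedSpamList is just an in-memory stand-in for the central server (no such service or MediaWiki extension exists as far as I know), and I picked SHA-256 arbitrarily from the hash options above.

```python
import hashlib


def image_fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw image bytes."""
    return hashlib.sha256(data).hexdigest()


class SharedSpamList:
    """Hypothetical stand-in for the central spam database.

    A real service would sit behind a network API; this keeps the
    fingerprints in memory so the idea can be tried out locally.
    """

    def __init__(self):
        # Fingerprints of images that administrators have marked as spam.
        self._known_spam = set()

    def report_spam(self, data: bytes) -> str:
        """Record an image as spam and return its fingerprint."""
        fingerprint = image_fingerprint(data)
        self._known_spam.add(fingerprint)
        return fingerprint

    def is_known_spam(self, data: bytes) -> bool:
        """Check an upload against every fingerprint reported so far."""
        return image_fingerprint(data) in self._known_spam
```

An admin on one wiki would call report_spam() with the bytes of a bad upload, and every other wiki would call is_known_spam() before accepting a new file. Keep in mind that an exact hash only catches byte-for-byte copies, so an image that has been resized or re-saved would slip through.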

Wikimedia, Wikia, and the big farms have people who are devoted to the spam problem, but we little guys don’t. We need something that is WordPress-simple for dealing with the problem. It can be an optional extension (which those with privacy concerns could skip), but we need something easier than constantly updating AbuseFilters and AntiBot extensions.