Today I noticed something strange while backing up the database from another website I manage, ComWiki. The database backup took much longer than usual, and I was surprised when the size of the backup started to go over a gigabyte. Either someone had been adding lots of pages to the wiki, or something weird was going on.
So I went to the “All pages” page on the wiki and noticed a bunch of pages with strange titles that looked like spam URLs. Visiting those pages gave me a big shock, as the content appeared to be some kind of binary code. It reminded me of the old Usenet RAR files: line after line of gobbledegook that looked suspiciously like a slice of warez.
Then I got an even bigger surprise, when I got a message on my screen that Google had detected malware on the site. Oh boy.
After doing some more digging, I found a note from something called the Graffiti Networks Project. Apparently, this was a project started by a couple of students at Brown University to exploit a weakness in MediaWiki, the open-source software that runs Wikipedia (and the software I use on ComWiki). Essentially, the project demonstrated how one could use this weakness to establish a peer-to-peer file sharing network.
Here’s the more technical description from their website:
In response to the lack of user anonymity and long-term data persistence in existing P2P systems, we developed the Graffiti Network distributed file sharing protocol that uses multiple third-party storage sites as a data replication and transfer medium between clients. Our approach is to use publicly available web sites to store multiple copies of shared content. We use the term graffiti for our work since we are storing data in a way that non-network participants may regard as unsightly or unwanted vandalism.
Employing the same concept of a central tracker as in the BitTorrent protocol, a Graffiti client will connect to a tracker and receive well-defined instructions on where and how to retrieve segments of shared files from a remote storage site. Upon successfully downloading and decrypting some portion of the shared data, the client will receive further instructions to replicate that same data at a different storage site. If the client succeeds in replicating the data, it notifies the tracker of the new replica location to receive the next data segment it needs and then repeats the process. Our approach has several key benefits over other P2P systems where clients transmit data directly with each other:
A newly arriving peer can still download files even if all other peers have long disconnected
A peer does not need to know about the existence of other peers
A tracker does not need multiple peers in order to enforce tit-for-tat policies.
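To make that description a bit more concrete, here is a minimal Python sketch of the loop as I understand it from the quote above. It is not the project's actual code. Everything in it is an assumption for illustration: the `StorageSite` and `Tracker` classes, the `download_file` function, the page titles, and especially `xor_cipher`, which is a toy stand-in for whatever real encryption they used. The "sites" are just in-memory dictionaries, so nothing here touches a network.

```python
from dataclasses import dataclass, field


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for real encryption. XOR is symmetric, so it also decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class StorageSite:
    """Hypothetical third-party site, modeled as page title -> stored bytes."""
    name: str
    pages: dict = field(default_factory=dict)


@dataclass
class Tracker:
    """Hypothetical tracker that remembers where each segment's replicas live."""
    site_names: list
    replicas: dict = field(default_factory=dict)  # segment index -> [(site, title)]

    def locate(self, index):
        """Tell the client where to fetch a segment (first known replica)."""
        return self.replicas[index][0]

    def replication_target(self, index):
        """Pick a site that does not yet hold this segment, or None if all do."""
        used = {site for site, _ in self.replicas[index]}
        for name in self.site_names:
            if name not in used:
                return name
        return None

    def report_replica(self, index, location):
        """Record a new replica location reported by a client."""
        self.replicas[index].append(location)


def download_file(tracker, sites, key, segment_count):
    """Fetch and decrypt each segment, re-posting it elsewhere along the way."""
    plaintext = b""
    for index in range(segment_count):
        site_name, title = tracker.locate(index)
        encrypted = sites[site_name].pages[title]
        plaintext += xor_cipher(encrypted, key)

        # Replicate the still-encrypted segment at a different site,
        # then tell the tracker about the new copy before moving on.
        target = tracker.replication_target(index)
        if target is not None:
            new_title = f"segment-{index}-copy"
            sites[target].pages[new_title] = encrypted
            tracker.report_replica(index, (target, new_title))
    return plaintext
```

The thing to notice is that the client never talks to another client. It only reads pages, writes pages, and reports back to the tracker, which is why the data outlives the people who put it there, and why a wiki owner ends up holding it.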
Wow. You would think they would have at least asked me first before they started hacking at ComWiki. But then I guess that would spoil the fun.
Anyway, I’ve taken ComWiki down for now and put up a “parking page” until I can sort out this mess. When I do get ComWiki back up, I’ll probably have to put up a bunch of security measures, like CAPTCHA-style “type the letters you see in the box” routines, in order to keep out spammers…and Brown University students.
I can understand the theory behind this “experiment.” But I don’t appreciate the ethics, or lack of them, in its execution. I get the impression that these students felt that they were simply testing a “proof of concept,” and that no harm was done by storing their “encrypted data payloads” on wiki pages. But just because something CAN be done doesn’t mean it SHOULD be done. In the social sciences, I doubt that this kind of “experiment” would ever be approved by a human subjects research board.
Sure, running an open wiki means one has to expect some vandalism. I’ve also come to expect some “edit wars,” as people try to use the wiki to advance a particular agenda. Yet I’m still a real believer in the value of open wikis. I like the fact that on an open wiki, one can quickly correct a typo or add an important point to an article. No need to register, no need to squint at a CAPTCHA. Just hit edit and do it. Free and open. Anyone can edit. Yes, that means you have to expect edit wars, but that’s part of the wiki culture. And sometimes you can learn a lot from edit wars. If nothing else, you learn something about those who feel so strongly about their views that they take the time to engage in an edit war.
But to set up a P2P system that exploits this openness takes the “edit wars nuisance” to a whole new level, one that just seems wrong to me. I don’t really care if people want to use the internet to share music or movies or warez. Indeed, that’s become part of the culture of the internet, and there’s not much I can do about it. Nor does it seem there is much the RIAA and MPAA can do about it. But to exploit a weakness in MediaWiki (and in particular, a default open installation of MediaWiki) just seems to spit in the face of the Wikimedia Foundation, one of the biggest defenders of openness on the internet.
In my opinion, the real shame in all of this is that when I finally do get comwiki.org back up, it will have to be a more closed wiki, which defeats one of the major advantages of a wiki: the fact that “anyone can edit it.” In fact, at one point I did have comwiki.org more closed, so that only registered users could edit articles. But when I did so, I noticed a significant decrease in edits from users. So I opened it back up, thinking that this might encourage a more open, freely-editable wiki experience. It was just such a freely open wiki environment that these students sought to exploit with their P2P experiment. And now it looks like I’ll have to lock it back up. What a pity.
By the way, even though the Graffiti Networks Project’s web site claims they used their “removal tool” to delete their “encrypted data payloads” as of April 11, three weeks later I am still getting tons of hits to the wiki from bots. In the time it took me to manually delete a bogus wiki page and its edits, another page or two would pop up. So far this month, the traffic on this site is over 12 gigs. And even after completely removing MediaWiki and putting up a temporary parking page, the domain name is still getting hundreds of hits every day.