Testing Wikipedia on the iPod (WikiPod)

Update: Colour me amazed! Just a few hours after I sent Matt my changes (noted at the bottom of this page), he updated the script to take them into account, so that criticism no longer applies. Wikipod now gives the user better feedback, reporting how many articles have been downloaded as it progresses, and I’m overjoyed to see how responsive Matt was. I’ve left the review intact, but bear in mind that the feedback problem has been corrected in the current version of Wikipod.
My newly acquired iPod 30GB is sitting before me, quietly holding a bunch of Wikipedia articles downloaded just now using Matt Swann’s Wikipod script. The script allows the user to download entire Wikipedia articles — sans images, infoboxes and other hard-to-scale information — to view on their iPod using the built-in notes functionality.
Here are some impressions from my time with Wikipod.
Installation
Wikipod is a script, rather than the easily executable application one might expect. This has both an upside and a downside: on a positive note, this inherently limits the number of people that will use this utility, and so saves the Wikimedia Foundation’s bandwidth and server resources in no small way. On the downside, it can be a little daunting to use if you’re inexperienced with a UNIX-style command line. This wasn’t a problem at all for me, but it does raise the question of building a nicer way to do the same thing.
A GUI frontend couldn’t be that difficult to construct, although making it cross-platform and reliable would be a bit of a challenge. Fink is a collection of simple scripts that can be manipulated through the FinkCommander frontend, so it’s certainly reasonable to think that Wikipod would eventually be packaged with a GUI as well. I hope.
Downloading articles
Wikipod is damn slow, and with very good reason. Wikipedia demands that scripts like this build in a period of sleep between each article they download, to prevent the website slowing to a crawl. Wikipod complies with this, to the point that it took me around five minutes to download 1MB of articles on a 512kbps ADSL connection. Not fast, but again, I’m glad that it doesn’t destroy the WMF’s server resources.
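To make the idea of sleeping between downloads concrete, here is a minimal sketch in Python. It is not Matt's code (Wikipod is a Perl script I haven't reproduced here), and the names `download_politely`, `fetch` and `delay_seconds`, as well as the one-second default, are my own assumptions for illustration.

```python
import time
from typing import Callable, Iterable


def download_politely(
    titles: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,  # assumed value, not Wikipod's actual setting
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, str]:
    """Fetch each title in turn, pausing between consecutive requests."""
    pages: dict[str, str] = {}
    for index, title in enumerate(titles):
        if index > 0:
            sleep(delay_seconds)  # pause before every request after the first
        pages[title] = fetch(title)
    return pages
```

With a pause like this, the waiting alone grows with the number of articles, whatever the speed of your connection, which fits with how slow the download felt.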
On the other hand, though, the script provides no feedback at all to the user until it has finished running. The first time I attempted to run it with just a 1MB size limit (Matt uses 10 in his example), it appeared to hang for the entire time until it finally finished downloading the articles. Ideally, it would update the output to let you know it’s doing something, even if it’s technically impossible to provide an ETA or tell you how close it is to finishing. Telling the user how many megabytes have been downloaded should be trivial (since presumably the script keeps track of this as it runs), or else how many articles have been copied. Either solution would be great, but is missing from Wikipod in its current incarnation.
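The kind of feedback I have in mind could look something like the sketch below: one status line per article, with a running total against the size limit. The function name `report_progress` and the output format are hypothetical, and this is written in Python rather than the script's own Perl.

```python
import sys
from typing import Iterable, TextIO


def report_progress(
    article_sizes: Iterable[tuple[str, int]],
    limit_bytes: int,
    out: TextIO = sys.stdout,
) -> int:
    """Print one status line per article and return how many were counted."""
    total = 0
    count = 0
    for title, size in article_sizes:
        if total + size > limit_bytes:
            break  # the next article would exceed the requested size limit
        total += size
        count += 1
        out.write(
            f"[{count}] {title}: {total / 1_000_000:.2f} MB "
            f"of {limit_bytes / 1_000_000:.2f} MB\n"
        )
        out.flush()  # show the line immediately rather than at exit
    return count
```

Even without an ETA, a line per article is enough to show that the script hasn't hung.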
It also doesn’t handle errors well at all. Almost any error produces the same, equally useless message:
Done transferring 0 pages!
Great! So, what’s the problem? 🙂 I have found that the above error message is most commonly triggered when the iPod’s Notes folder is already full, or already contains the article that you ask the script to start from. I don’t have a fix for this (I suck at Perl), but checking that folder will generally solve the problem.
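For what it's worth, the two causes I ran into could be checked before the download even starts. The sketch below is a guess at what such a check might look like in Python. The function `explain_zero_pages` is hypothetical, and I am assuming each article is stored as a file named after its title, which may not match what Wikipod actually writes.

```python
from pathlib import Path

MAX_NOTES = 1000  # the iPod's note limit mentioned later in this review


def explain_zero_pages(notes_dir: Path, start_title: str) -> list[str]:
    """Return likely reasons why a run would transfer nothing."""
    if not notes_dir.is_dir():
        return [f"Notes folder not found: {notes_dir}"]
    reasons = []
    existing = [p for p in notes_dir.iterdir() if p.is_file()]
    if len(existing) >= MAX_NOTES:
        reasons.append(f"Notes folder is full ({len(existing)} of {MAX_NOTES} notes)")
    # Assumed naming scheme: one file per article, named after its title.
    if any(p.name == start_title or p.stem == start_title for p in existing):
        reasons.append(f"Start article already present: {start_title}")
    return reasons
```

An empty list means neither of the usual suspects applies, and the cause lies somewhere else.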
Indeed, Wikipod generally gives the user as little information as possible. Below is a screenshot of the script running, demonstrating just how little it tells you about anything. Only the line “Done transferring 65 pages!” is output from the script.
By the way, it’s damn slow, did I mention that? This time I mean on the iPod itself, and unfortunately this second slowdown is not something the developer can prevent. See, the links between articles are included when they’re downloaded, so that when the iPod loads the listing for the first time it must validate each and every link to ensure that an article actually exists for it. This takes time and battery power (it’s very disk-intensive); in fact, it took just as long to validate one megabyte of articles as it did to download them in the first place, on my connection. To put that in some context for you, 1,000 notes (or between about 200 and 1,000 articles, depending on their length) took well over 20 minutes to validate on the iPod. This isn’t great, but at least it’s a one-time thing.
One very major benefit is that the user is allowed to specify a starting article (the suggested default is “IPod“), which it downloads first. Then, it runs through the links in that article and downloads all the articles that the first one linked to… and so on, and so forth. It’s nicely done, and allows users to really “explore the web”, and combined with the ability to choose how much space you want to let Wikipod take up, it’s quite an interesting way to do things.
For instance, asking for “15MB of articles” using “IPod” as a starting point (which in reality downloaded just 4MB due to the iPod’s restriction on the number of notes that can be saved) allowed Wikipod to download articles as far away from the iPod as AIDS, Hogwarts and, strangely enough, New York City Police Department. That’s only about two degrees of separation between articles, but it’s still quite impressive.
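The behaviour described here, starting from one article and then following its links outward until a limit is hit, reads like a breadth-first walk. Below is a minimal sketch of that idea in Python, under my own assumptions: `crawl` and `get_article` are hypothetical names, and for brevity the cap counts one note per article, whereas on a real iPod long articles take up several notes.

```python
from collections import deque
from typing import Callable


def crawl(
    start: str,
    get_article: Callable[[str], tuple[str, list[str]]],
    max_bytes: int,
    max_articles: int = 1000,  # simplification: treats one article as one note
) -> list[str]:
    """Breadth-first walk over article links, stopping at either limit."""
    queue = deque([start])
    seen = {start}
    saved: list[str] = []
    used = 0
    while queue and len(saved) < max_articles:
        title = queue.popleft()
        text, links = get_article(title)
        size = len(text.encode("utf-8"))
        if used + size > max_bytes:
            break  # size budget exhausted
        used += size
        saved.append(title)
        for link in links:
            if link not in seen:  # queue each article only once
                seen.add(link)
                queue.append(link)
    return saved
```

Because the queue is processed in order, everything linked from the starting article is saved before anything two links away. That would explain why a run cut short by the note limit only got about two degrees from “IPod”.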
Usage
As I said before, the script retains the links created between each article. The user scrolls through each article with the iPod’s wheel, using the select button to click links and move between articles. The spirit of Wikipedia is thus retained, and users are free to move between articles as they see fit, with the benefit that there are no redlinks. Thank god.
On the other hand, the length of each article is limited. They’re saved as notes when moved to the iPod, meaning that each one can only be fairly short. To compensate, the end of each note contains a “next page” link that takes you to the second part of the article, allowing the user to read through the article without much difficulty at all.
Unfortunately, these additional notes reduce the total number of articles that can be downloaded: the iPod only supports a maximum of 1,000 notes, and longer articles take up more than one. The value of Wikipod is therefore inherently limited by Apple’s own iPod. Don’t expect to download the whole of Wikipedia any time soon.
In addition, the need to split articles between multiple notes can make browsing harder than need be. Each additional file is labelled with the “Z” prefix, dropping it back to the bottom of the list, but unfortunately they’re still visible to the user. It allows a certain degree of freedom, in that users can skip to certain sections of an article if they so wish, but it also introduces quite a lot of clutter.
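To illustrate the splitting described above, here is a rough Python sketch of how one article might be cut into linked notes, with continuation pages given a “Z” prefix. The function `split_into_notes`, the page-numbering scheme, the link markup and the 4,000-character default are all assumptions on my part. I'm assuming a per-note cap of roughly 4KB, and I haven't shown what the real script actually writes.

```python
def split_into_notes(title: str, text: str, max_chars: int = 4000) -> dict[str, str]:
    """Split text into linked notes: the first keeps the title, the rest get a Z prefix."""
    # max_chars covers the body only; it should leave headroom for the link line.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
    names = [title] + [f"Z{title} {n}" for n in range(2, len(chunks) + 1)]
    notes: dict[str, str] = {}
    for index, (name, chunk) in enumerate(zip(names, chunks)):
        if index + 1 < len(chunks):
            # Assumed link markup pointing at the following note.
            chunk += f'\n<a href="{names[index + 1]}">next page</a>'
        notes[name] = chunk
    return notes
```

Under that same 4KB assumption, 1,000 notes works out to about 4MB in total, which lines up with the 4MB I actually got when I asked for 15MB.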
Finally, wikilanguage sucks hard, as Kelly Martin explained in a recent episode of Wikipedia Weekly. It’s extremely difficult to build a decent parser for the language, and it shows in Wikipod: it doesn’t handle tables nicely at all. The simple table on iPodLinux doesn’t display properly, due in part to the lack of images in Wikipod articles, but mostly because the script simply doesn’t understand the language well enough. This would be an enormous challenge to get around, especially in the confines of the tiny screen of an iPod, so I don’t blame Matt for this not working. It’s just a pity to see.
Conclusion
Wikipod is a very impressive script, seeking to achieve what other projects have been working towards for a long time: offline Wikipedia content. We spoke with Martin Walker about Wikipedia 0.5, which seeks to produce a CD version of Wikipedia, but Matt took the different approach of allowing users to determine the files the script downloads. In this respect, it differs from alternatives like TomeRaider and Webaroo, which allow users to download pre-packaged content from any of a number of websites. It’s a custom version of those, if you like, and the first for the iPod.
Obviously, the software has its limitations. It’s slow on both the computer and the iPod, it’s sometimes difficult to use, and its utility is confined by the maximum possible number of notes imposed by Apple. But despite these faults, it’s an effective and well-implemented means of carrying up to 1,000 articles in your pocket, which I’ll be grateful for in the future. If Apple ever lifts the 1,000-note limit, this really would be formidable.
Update
I have modified Wikipod to address the problem of not giving the user feedback while articles are being downloaded. This raises the script to version 1.6 — I’ve emailed Matt about the change, too. My own version can be found here