Posted on 03/01/2009. By Pete Otaqui.
If you edit Wikipedia and have any kind of tumblog or lifestream, you might have found that the RSS feed the MediaWiki software gives you is not especially useful: each item contains the entire edited page, so it can be huge.
To fix this, I created a Yahoo Pipe, “Wikipedia Contributions – comments only”, which strips out the page content and leaves only the edit comment. I’ve also given it a URL parameter (“target”) so you can use it for your own (or any other) Wikipedia username.
The idea is relatively simple, but I’ll list the process if you are interested in learning about the power of Pipes:
- Add a “Text Input” and give it a name of “target” (which is the same as the URL parameter required for the Wikipedia Contributions feed for a given user)
- Add a “URL Builder” with the basic URL for contributions and query parameters for name (which is “Special:Contributions”), feed (which is “rss”) and then one for target.
- Connect the output of the Text Input to the target parameter in the URL Builder – now you can add &target=MyUserName to the pipe’s URL to change the user
- Add a “Fetch Feed” and connect its URL to the output of the URL Builder
- Wikipedia pages can be long, too long in fact for Pipes’ regular expression parser, so add a “Loop” and then drag a “Substring” into it. I trimmed each description to 1024 characters, which I think is reasonable: it should be more than enough to cover the comment (which is limited to 200 characters) plus any leading “junk” that we don’t want. The Loop should operate on “item.description” and assign its results back to “item.description”
- Now, to strip out everything except the comment, add a “Regex” and connect the Loop’s output to it. It needs two rules, both running on item.description: the first strips newlines (find: “\n”, replace with “”, “g” option) and the second extracts the comment (find: “[^:]+:(.*)<\/p>.*”, replace with “$1”, no options).
- Connect the Regex output to the Feed output … and you’re done!
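
If you want to try the same logic outside Pipes, the steps above can be sketched in Python roughly as follows. This is only an illustration, not what the pipe runs internally. The base URL, the `title` parameter name (the post calls it the name parameter), the function names and the sample description are all assumptions made for the example; the two regex rules and the 1024-character limit come from the list above.

```python
import re
from urllib.parse import urlencode

# Assumed base URL; the post only refers to "the basic URL for contributions".
BASE_URL = "https://en.wikipedia.org/w/index.php"


def build_feed_url(target):
    """Mimic the URL Builder step: page name, feed type and target user."""
    params = {
        "title": "Special:Contributions",  # assumed parameter name for the page name
        "feed": "rss",
        "target": target,
    }
    return BASE_URL + "?" + urlencode(params)


def extract_comment(description, max_length=1024):
    """Mimic the Loop/Substring and Regex steps on one item description."""
    # Substring step: keep only the first max_length characters.
    text = description[:max_length]
    # Regex rule 1: strip newlines everywhere (the "g" option).
    text = re.sub(r"\n", "", text)
    # Regex rule 2: keep what sits between the first colon and the last
    # "</p>" that is left; no "g" option, so replace the first match only.
    return re.sub(r"[^:]+:(.*)</p>.*", r"\1", text, count=1)


# Hypothetical description, shaped like "user: comment" followed by page text.
sample = "<p>Example: fixed a typo</p>\n<hr />\n<div>Very long page text...</div>"
comment = extract_comment(sample)
feed_url = build_feed_url("Example")
```

Note that the second rule keeps the space after the colon, so you may want to call `.strip()` on the result. Fetching and parsing the feed itself is left out here, since that part depends on whichever feed library you prefer.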