Sunday, July 31, 2011

Overviewing the Government Web Presence

The Government released a list of about 440 current websites, listed by department, which was fairly easy to turn into some PHP code. I'm hoping to release the code soon (once I've got GitHub working), but for now you can see the results below. Both projects try to give an overview of the Government's total web presence; it'd be interesting to repeat this, say, yearly to see what changes.

Click here for a page showing screenshots for all 400+ homepages

I also scraped the listed pages and did a bit of processing to turn the content into Wordles. OK, not terribly exciting. But I kind of like the idea of turning open data into "art".
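
For anyone wanting to try something similar, here is a rough sketch of the "bit of processing" step in Python rather than the PHP I actually used. It assumes the page HTML has already been downloaded into a string; the names TextExtractor and word_counts are made up for this illustration, and this is not the code behind the images in this post.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_counts(html):
    """Return a Counter of lower-cased words found in the visible page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))
```

Summing the counters for every homepage would give one overall word-frequency table to feed into a word cloud tool.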

Here's the front page content with some common and site-related words removed (click through for original):

Wordle: UK Government websites [unfiltered]

Here's the same data but with more common words removed:

Wordle: UK Government websites [filtered]
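
The difference between the two versions comes down to which words get thrown away before drawing. A minimal sketch of that idea in Python follows; the word lists COMMON_WORDS and SITE_WORDS and the function filter_counts are invented for the example and are not the lists used for the images above.

```python
from collections import Counter

# Hypothetical stop lists, for illustration only.
COMMON_WORDS = {"the", "and", "of", "to", "a", "in", "for"}
SITE_WORDS = {"home", "contact", "cookies", "accessibility", "sitemap"}


def filter_counts(counts, *stop_lists):
    """Return a new Counter without any word that appears in the stop lists."""
    blocked = set().union(*stop_lists)
    return Counter({w: n for w, n in counts.items() if w not in blocked})


counts = Counter({"the": 10, "home": 4, "health": 3, "tax": 2})
light = filter_counts(counts, SITE_WORDS)                # a lighter pass
heavy = filter_counts(counts, SITE_WORDS, COMMON_WORDS)  # a heavier pass
```

The heavier the filtering, the more the remaining words say about what the sites are actually for rather than how they are built.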

As I say, hope to post code and data soon, even though it's not much work to re-create... In the meantime, any suggestions welcome.

