Monday, February 16, 2009

A Step-by-step Guide to Visualising Tweets

In response to psychemedia's request, I've put up a commented version of the code used to generate Wordles and Google Timelines.

The very rough and ready process was along the lines of:


  1. Run the Perl script to output the tweets in CSV format, in chronological order (oldest first). The script also counts the number of tweets in each 10-minute time slot, to give some idea of how activity varied over time.

  2. Import the CSV into a fresh Google Docs spreadsheet.

  3. I found it handy to duplicate this sheet, to play with subsets of the data e.g. just tweets for certain days.

  4. For the Wordles, I simply selected the relevant column (i.e. usernames or tweet texts) and pasted it into a decent text editor. For the tweet texts, I removed half of the #ukgc09 tags to stop the tag overpowering the rest. Then I headed over to http://www.wordle.net/create, pasted the text, and played with the formatting until I got something I liked the look of.

  5. For the Google Timeline, I created a version of the sheet with 4 columns: tweet date/time (to become the X axis), number of tweets in that 10-minute time band (for the Y axis, i.e. activity), tweet author, and tweet content (the last two for the annotations). Then it was just a case of opening the "Insert gadget" menu, choosing the "Interactive Time Series Chart" gadget, and setting the Range to include all the data in these columns.

    I found it easiest to limit the tweets to 3 days (otherwise the number of annotations makes the browser very slow), and to move the gadget to a separate sheet.

    Then I published the whole document as a web page (see the "Share" menu in the top right of the spreadsheet - you need to publish the data for the chart to work).
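
As a rough illustration of step 1, here is a minimal Python sketch of the 10-minute counting and CSV output. It is not the Perl script itself: the function names, the (datetime, author, text) input shape, and the column headings are all assumptions made for the example. The column order follows the four-column sheet described in step 5.

```python
import csv
from collections import Counter
from datetime import datetime

SLOT_MINUTES = 10  # width of each activity band, as in step 1


def slot_start(when: datetime) -> datetime:
    """Round a timestamp down to the start of its 10-minute slot."""
    return when.replace(minute=when.minute - when.minute % SLOT_MINUTES,
                        second=0, microsecond=0)


def tweets_to_csv(tweets, out):
    """Write tweets oldest-first, one row each, with their slot's tweet count.

    `tweets` is an iterable of (datetime, author, text) tuples. That shape is
    an assumption for this sketch, not the format the original script reads.
    Returns the per-slot counts so they can be inspected separately.
    """
    ordered = sorted(tweets, key=lambda tweet: tweet[0])
    counts = Counter(slot_start(when) for when, _, _ in ordered)
    writer = csv.writer(out)
    # Hypothetical column headings: date/time, activity, author, content.
    writer.writerow(["datetime", "tweets_in_slot", "author", "text"])
    for when, author, text in ordered:
        writer.writerow([when.strftime("%Y-%m-%d %H:%M:%S"),
                         counts[slot_start(when)], author, text])
    return counts
```

Calling `tweets_to_csv(tweets, open("tweets.csv", "w", newline=""))` would produce a file ready for the import in step 2; the filename is only an example.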


And that's it. As yet, I haven't found a way to publish just the gadget - Google have code to embed it in a page, but this doesn't seem to work.

I'm pretty sure there's a whole lot more you could do - I was just intrigued as to how activity varied through the weekend. (Perhaps it's good that the wifi on the day was down - I'm not sure how well that Time Series gadget scales...) The Wordles also seem quite a nice way to remember the day. It might even be possible to generate a similar, animated version, to see how words and authors proliferate over the course of a day.
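
One possible starting point for that animated idea, sketched in Python: count the words used in each hour, so that each hour's counts could feed one frame of a word cloud. Nothing here comes from the original script. The function name, the (datetime, text) input shape, and the `skip` parameter are assumptions, and dropping a dominant tag outright is a simpler stand-in for removing half of its occurrences by hand.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# A word is a run of letters/digits, optionally led by # or @ so that
# hashtags and usernames survive as single tokens.
WORD = re.compile(r"[#@]?\w+")


def words_by_hour(tweets, skip=()):
    """Return {hour_start: Counter of words} in chronological order.

    `tweets` is an iterable of (datetime, text) pairs (an assumed shape).
    `skip` lists words to leave out entirely, e.g. an overpowering hashtag.
    """
    skipped = {word.lower() for word in skip}
    frames = defaultdict(Counter)
    for when, text in tweets:
        hour = when.replace(minute=0, second=0, microsecond=0)
        for word in WORD.findall(text.lower()):
            if word not in skipped:
                frames[hour][word] += 1
    return dict(sorted(frames.items()))
```

Each Counter could then be expanded back into repeated words and pasted into Wordle, one hour at a time. Swapping the tweet text for the author column would give the per-author version.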

Anyway, if you have any questions, leave a comment or get in touch via Twitter.

2 comments:

Mike Chelen said...

Can you allow viewer access to http://spreadsheets.google.com/ccc?key=p7PIlhDM1IlNUYT2Vbca6CQ ? The timeline chart is really nice. You could include a word cloud from Google Docs as well; I'll give it a shot too.

Scribe said...

Done. I've not played a huge amount with sharing Google Docs before, so I'm picking this up as I go along - let me know if you have any problems seeing the sheet or the data.

Thinking I should have made the original raw CSV output available as well; will do so this evening.

Wondering if it's easy to a) bypass Google Docs to pipe data straight into Google Gadgets/Visualisations, and b) make this dynamic. Then you could have some interesting "live dashboard" things going on...