LazyWeb Request: I Want to Build a Data Cloud

I’m both pleased and dismayed that I received 1203 responses to my Why Do You Blog survey. Pleased because that’s a great data set, but dismayed because that’s a lot of data to wade through and present. What can I tell you? I asked for it.

There are a couple of things I’d like to do with the responses:

  1. Take all of the anecdotal (essay-style, as opposed to multiple-choice) text and run it through a magic word counter, which will spit out a spreadsheet showing how often each word is used.
  2. Once I’ve got that spreadsheet, I want to convert it into a kind of data cloud: the more frequently a word has been used, the bigger its text. IBM’s Many Eyes bubble chart is pretty close, but I’d like it to look like a tag cloud (see, for example, my Flickr tags).
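
To make the first step concrete, here is a rough sketch of the kind of word counter I have in mind. It assumes Python and that the essay answers have already been pulled out into a list of strings. The names `word_frequencies` and `to_csv` and the sample answers are made up for illustration, not taken from any existing tool.

```python
import csv
import io
import re
from collections import Counter


def word_frequencies(responses):
    """Count how often each word appears across all responses."""
    counts = Counter()
    for text in responses:
        # Lowercase, then treat runs of letters (with an optional
        # straight apostrophe, as in "don't") as words. Curly
        # apostrophes would need to be normalized first.
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequent word first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["word", "count"])
    for word, n in counts.most_common():
        writer.writerow([word, n])
    return out.getvalue()


# Hypothetical sample answers, standing in for the real survey text.
responses = ["I blog to share ideas.", "I blog for fun; ideas come later."]
counts = word_frequencies(responses)
print(to_csv(counts))
```

The CSV text opens in any spreadsheet program. A real run would probably also want to drop common filler words ("the", "to", "I") before counting, which this sketch does not do.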

Any suggestions on the tools or services I could use to make these happen?


  1. Sounds like a lot of work. We’re doing something similar with a survey at UBC, but we’re using the software ATLAS.ti to do a qualitative analysis of the survey commentaries.

  2. The magic word counter sounds quite easy – I could likely do up a Perl script in 10 minutes that does the trick. That’s Programming 101 stuff right there.

    Tag clouds aren’t too hard either. I don’t know of any software that does this (because I don’t pay attention, not because it doesn’t exist), but anyone with a little bit of HTML/PHP-fu should be able to whip up something pretty quickly for you. Google will help as well (“php tag cloud” comes up with a bunch of results).
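
To give a feel for how little is involved in the tag cloud itself, here is a minimal sketch in Python rather than PHP. It assumes the word counts are already in a dictionary, and it scales font size linearly between a smallest and largest pixel size. The function name `tag_cloud_html`, the pixel sizes, and the sample counts are all invented for illustration.

```python
from html import escape


def tag_cloud_html(counts, min_px=12, max_px=48):
    """Build tag-cloud HTML: more frequent words get bigger text."""
    if not counts:
        return ""
    lo, hi = min(counts.values()), max(counts.values())
    # Avoid dividing by zero when every word has the same count.
    span = (hi - lo) or 1
    pieces = []
    for word in sorted(counts):  # alphabetical, like most tag clouds
        size = min_px + (counts[word] - lo) * (max_px - min_px) / span
        pieces.append(
            f'<span style="font-size:{size:.0f}px">{escape(word)}</span>'
        )
    return " ".join(pieces)


# Hypothetical counts, standing in for the real spreadsheet.
sample = {"blog": 10, "fun": 1, "ideas": 4}
cloud = tag_cloud_html(sample)
print(cloud)
```

Linear scaling is the simplest choice. With a few very common words dominating, scaling on the logarithm of the count may give a more readable cloud.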
