Following on from my previous post about the methods we’re starting to use to make sense of the Australian blogosphere data we’re receiving from our colleagues at Sociomantic Labs, here’s a first look at what happens when we begin to visualise those data in the open source network visualisation software Gephi. Let me begin by making one thing very clear, though: this is based on as yet incomplete data, and should not be seen to say anything comprehensive about the shape of the Australian blogosphere. What we’re currently working with is:
- a highly incomplete list of Australian blogs that is biased towards those genres of blogging that we already know quite a bit about, and
- hyperlink data that hasn’t yet been cleaned up to contain only those links present in the blog posts themselves, rather than links elsewhere on the page.
So, as we’ve explained in our previous work, we can expect plenty of false positives: sites like WordPress.org, for example, appear to be central to the blog network only because many blogs run on and link to WordPress, not because their posts actually discuss WordPress-related topics. We can also expect a network structure which overrepresents those sectors of the overall Australian blogosphere where we already know and track a majority of existing blogs (e.g. Australian politics, which we’ve studied in detail over the past few years).
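To make the false-positive problem concrete, here is a minimal sketch in Python. It is not our actual processing pipeline: the site names, the record layout and the `found_in_post_body` flag are all hypothetical, invented purely for illustration. It simply counts how many distinct blogs link to each site, first using every link on the page and then using only links found in the posts themselves.

```python
from collections import Counter

# Hypothetical link records: (source_blog, target_site, found_in_post_body).
# The names and the boolean flag are illustrative assumptions, not real data.
links = [
    ("blog-a.example", "wordpress.org", False),   # footer "powered by" link
    ("blog-b.example", "wordpress.org", False),
    ("blog-c.example", "wordpress.org", False),
    ("blog-a.example", "blog-b.example", True),   # link inside a post
    ("blog-c.example", "blog-b.example", True),
    ("blog-b.example", "blog-c.example", True),
]

def indegree(records, posts_only=False):
    """Count the distinct blogs linking to each target site."""
    seen = set()
    for source, target, in_post in records:
        if posts_only and not in_post:
            continue  # skip sidebar, footer and other non-post links
        seen.add((source, target))
    return Counter(target for _, target in seen)

raw = indegree(links)                       # every link on the page
cleaned = indegree(links, posts_only=True)  # links in post bodies only

print(raw.most_common())
print(cleaned.most_common())
```

In this toy example the platform site tops the raw count and vanishes from the cleaned one, which is the kind of distortion we would expect in uncleaned data.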
With those caveats in mind, though, in this post I’ll work through the data as they are at the moment, largely to test our methods as we’ve established them and to see what insights can emerge from this process. I’m drawing here on a slice of hyperlink data from the nearly 8,300 blogs that we follow (a list which also includes a number of mainstream news sites with RSS feeds; these will be sorted into a separate category at a later stage), collected between 17 July and 27 August 2010, i.e. roughly coinciding with the Australian federal election campaign between 17 July and 21 August. (Given this heightened activity, we should expect an overrepresentation of political blogs, even beyond the skew towards politics in our overall list of blogs.)