Posts Tagged ‘visualisation’

Visualising Twitter Dynamics in Gephi, Part 2

OK, so this is the second part of my post on turning Twitter data from Twapperkeeper into a dynamic network visualisation in Gephi. Last night’s post did the groundwork, generating a GEXF file from our #spill hashtag dataset, which covers Twitter discussion of an Australian Labor Party leadership spill between 7 p.m. and midnight AEST on 23 June 2010. In this post, we’ll work with this data file to generate a number of dynamic visualisations of the @reply activity (including old-style ‘RT @username’ retweets) during this time.

Essentially, here’s the overall network of the most active participants which we ended up with last night, now with each node’s degree value (number of @replies sent + number of @replies received, from within this most active group) next to its name. (If positions of nodes have shifted slightly from what they were, that’s because I had to recalculate the map again.) As noted at the end of part one, this overall map somewhat underestimates the weight of connections within the network, due to a limitation in how Gephi currently calculates its edge weight averages, but hopefully this will be fixed soon. What I’ve done in this new version of the map, though, is to highlight a number of interesting nodes in the network whom we’ll want to follow further:

Read the rest of this entry →

30/12/2010

Visualising Twitter Dynamics in Gephi, Part 1

In the following posts I’m finally keeping my promise to explore in earnest the use of Gephi’s dynamic timeline feature for visualising Twitter-based discussions as they unfolded in real time. A few months ago, Jean posted a first glimpse of our then still very experimental data on Twitter dynamics, with a string of caveats attached – and I followed up on this a little while later with some background on the Gawk scripts we’re using to generate timeline data in GEXF format from our trusty Twapperkeeper archives (note that I’ve updated one of the scripts in that post, to make the process case-insensitive). Building on those posts, here I’ll outline the entire process and show some practical results (disclaimer: actual dynamic animations will follow in part two, tomorrow – first we’re focussing on laying the groundwork).

First, a quick overview: what we’re after is a process that provides us not only with a static map of all connections (i.e., @replies – including old-style ‘RT @user’ retweets) made between a specific group of users on Twitter during a given period of time, but also with a dynamic visualisation of how those connections unfolded over the course of that period: how specific users assume more or less central positions in the @reply network as time unfolds; how discussion activity waxes and wanes; how particular tweets stimulate further activity in the network (for example as users reply to them or retweet them).
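To make the idea of time-stamped connections a little more concrete, here is a minimal Python sketch of the first step: pulling @reply edges (including ‘RT @user’ retweets) out of tweets and recording when each edge was active. This is not the Gawk script mentioned above; the `(timestamp, sender, text)` tuple layout, the function names and the sample tweets are all made up for illustration, and a real Twapperkeeper export would need its own parsing.

```python
import re
from collections import defaultdict

# Matches @username mentions (letters, digits and underscores).
MENTION = re.compile(r"@(\w+)")


def reply_edges(tweets):
    """Yield (sender, recipient, timestamp) for every @mention in every tweet.

    `tweets` is an iterable of (timestamp, sender, text) tuples. This layout
    is an assumption for the sketch, not the Twapperkeeper export format.
    """
    for timestamp, sender, text in tweets:
        for recipient in MENTION.findall(text):
            # Lower-case both ends so that @User and @user become one node.
            yield sender.lower(), recipient.lower(), timestamp


def edge_timelines(tweets):
    """Group timestamps by directed edge, so each edge knows when it was active."""
    timelines = defaultdict(list)
    for sender, recipient, timestamp in reply_edges(tweets):
        timelines[(sender, recipient)].append(timestamp)
    return dict(timelines)


# Invented sample tweets, purely for illustration.
sample = [
    ("2010-06-23T19:05:00", "Alice", "@bob is this really happening? #spill"),
    ("2010-06-23T19:07:30", "bob", "RT @alice: is this really happening? #spill"),
    ("2010-06-23T19:10:00", "Alice", "@Bob @carol watch the press conference #spill"),
]
timelines = edge_timelines(sample)
```

The static map only needs the keys of `timelines` (who replied to whom); the dynamic view additionally needs the timestamps attached to each edge, which is the information a timeline-aware network file has to carry.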

Read the rest of this entry →

30/12/2010

Fun with Gephi’s new dynamic visualisation feature

This is a quick demo of how the new timeline feature works in Gephi 0.7 beta. We’ve used five hours’ worth of @reply data from the Twapperkeeper archives for the #spill hashtag. This period corresponds to the ‘acute event’ in Australian politics that kicked off the election that sidetracked our research (in all kinds of productive ways, of course) – the day (the evening, and then the next morning) when now-PM Julia Gillard overthrew then-PM Kevin Rudd. Please don’t read too much (or indeed anything) into the actual analysis here, but for the sake of completeness: I’ve indicated betweenness centrality with both colour (red at the high end, yellow at the low end) and size.

The possibilities here are very interesting, particularly if we use better quality data that is properly set up for longitudinal analysis – e.g. so the nodes scale up and down properly through time. I’m pretty sure Axel has one of his epic and highly detailed methods posts up his sleeve in relation to all this, but for now, enjoy the pretty moving pictures – and apologies for the jerky cursor movements – I’m on the road and so without a mouse.

If you’re interested in any of the detail it is probably best viewed at the YouTube website in HD and fullscreen:

06/10/2010

Mapping the Australian Blogosphere Some More

My previous post outlined a few more steps I’ve taken in cleaning up our emerging dataset of links in the Australian blogosphere (current limitations of our data are also listed there). It’s time to take those cleaner data for a spin, then. Beyond mapping the interlinkages between our known blogs during the period of 17 July to 27 August 2010 (roughly coinciding with the Australian federal election campaign), as I did a couple of posts ago, I’ll now work off the cleaned dataset which contains only those links which:

  • originate from those sites in our list which we have confirmed to be (independent or professional) Australian blogs; and
  • point to sites which are more than merely functional (i.e. sites which aren’t on the destination filter list at the bottom of my previous post).

 
What I’m especially interested in as I work with these network data is:

  1. Which non-blog sites appear prominently in the network, and in what contexts; and
  2. which blog sites appear to serve as connectors between the various components of the overall network.

 
So, feeding the network data (close to 3.4 million links) into Gephi and filtering out any sites which don’t receive at least ten incoming links from anywhere in the network, here’s what we get (PDF here):
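For readers who want to try the same kind of cut outside Gephi, the filtering step just described can be sketched in a few lines of Python. This is only an approximation under stated assumptions: links are plain `(source, target)` pairs, in-degree is counted once on the full network (the filter is not re-applied after sites drop out), and a link survives only if both of its ends survive. The function name and the toy data are hypothetical, and the demo uses a threshold of 2 rather than 10 to keep the example small.

```python
from collections import Counter


def filter_sites(links, min_incoming=10):
    """Keep sites that receive at least `min_incoming` links, plus the links between them.

    `links` is a list of (source_site, target_site) pairs; repeated pairs
    count as separate links. In-degree is computed once, on the full network.
    """
    incoming = Counter(target for _, target in links)
    kept = {site for site, count in incoming.items() if count >= min_incoming}
    kept_links = [(s, t) for s, t in links if s in kept and t in kept]
    return kept, kept_links


# Toy data, invented for illustration.
links = [("a", "hub"), ("b", "hub"), ("hub", "a"), ("b", "a"), ("a", "c")]
kept, kept_links = filter_sites(links, min_incoming=2)
```

Here `b` and `c` fall below the threshold and disappear, taking their links with them, while `a` and `hub` remain connected to each other.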

Read the rest of this entry →

22/09/2010

Visualising topic-based conversation networks: the #masterchef edition

In future analysis we’ll be interested in doing some form of comparison between the #ausvotes data we’ve been looking at (and that Axel has already blogged about earlier this week), and other topics of shared interest among Australian Twitter users. As an exceptionally high-rating Australian prime-time TV show that was also a trending topic on Twitter, Masterchef is a particularly interesting example of such a topic drawn from popular culture. The patterns of Twitter use around this highly popular, nationally-based show (perhaps even more so than around the pre-election debate) can hopefully help us to understand something about the practices of the networked television audience as a public.

Read the rest of this entry →

30/07/2010

Twitter Concept Mapping with Wordstat and Gephi: First Steps

Continuing my series of posts on methods for doing quantitative research using Twitter data, this will be a fairly tentative post. I’m currently looking into ways to examine the terms and concepts used by tweeters as they discuss specific issues; we’ve done similar work looking at the content of blog-based debates in the past, using the (commercial) concept mapping software Leximancer, but I’ve never been fully satisfied with the information generated by Leximancer, and especially with its data visualisation functionality, so it’s time to look at the alternatives.

Ideally, I’d like to leave the visualisation aspects to the open source software Gephi, which I’ve already used for some useful network visualisations (more on that in another post), so what I’m really after is software that produces word and concept co-occurrence data for my source texts (in this case, a database of tweets on a specific subject), and pushes this out in a format that Gephi can understand (e.g. UCINet or Pajek, or even Gephi’s own network data format). At the ICA conference in Singapore last month, I came across a (commercial, sadly) quantitative text analysis software called WordStat (part of a larger software package available from Provalis Research that includes various other statistical tools which are less relevant for me here), so that’s where I’ll start.
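To illustrate what ‘co-occurrence data in a format Gephi can understand’ might look like, here is a rough Python sketch that counts how often two words appear in the same tweet and writes the result as a Pajek-style network. It says nothing about how WordStat works internally; the tokenisation (lower-cased words of at least four letters), the function names and the sample texts are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence(texts, min_length=4):
    """Count, for each pair of words, the number of texts containing both."""
    counts = Counter()
    for text in texts:
        # Crude tokeniser: lower-case runs of letters, short words dropped.
        words = {w for w in re.findall(r"[a-z']+", text.lower()) if len(w) >= min_length}
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts


def to_pajek(counts):
    """Render co-occurrence counts as a Pajek .net string with weighted edges."""
    terms = sorted({term for pair in counts for term in pair})
    index = {term: i for i, term in enumerate(terms, start=1)}  # Pajek ids start at 1
    lines = [f"*Vertices {len(terms)}"]
    lines += [f'{index[term]} "{term}"' for term in terms]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]} {n}" for (a, b), n in sorted(counts.items())]
    return "\n".join(lines) + "\n"


# Invented sample texts, purely for illustration.
texts = ["gillard challenges rudd", "rudd versus gillard tonight"]
counts = cooccurrence(texts)
pajek = to_pajek(counts)
```

Saving `pajek` to a file with a `.net` extension gives a small undirected, weighted network in which the edge weight is the number of texts where both words occur. A real analysis would need stop-word handling and stemming well beyond this sketch.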

Read the rest of this entry →

28/07/2010