Posts Tagged ‘Methods’

Taking Twitter Metrics to a New Level (Part 4)

Update: revision 1.2 of metrify.awk is now available (still at the link below), and introduces some further functionality, which is outlined here.

This is the final instalment of my four-part introduction to the metrify.awk script for generating detailed metrics for specific Twapperkeeper/yourTwapperkeeper hashtag archives. Over the last couple of posts, we’ve mainly dealt with overall stats for the hashtag, as well as for specific, definable percentiles of more or less active users. Finally, now, it’s time to look more closely at patterns within the overall userbase.


02

01 2012

Taking Twitter Metrics to a New Level (Part 3)

Update: revision 1.2 of metrify.awk is now available (still at the link below), and introduces some further functionality, which is outlined here.

Over the past couple of posts, I’ve introduced our new metrify.awk Twitter metrics script, and looked at the first of the three metrics tables produced by the script. Let’s move on now to the second table. For this I’ll use a snapshot of Australian political discussion on Twitter under the #auspol hashtag between February and August 2011, instead of #qldfloods – the overall metrics for the different user percentiles in the #qldfloods dataset turn out not to be particularly interesting… As before, we’re dividing the total userbase according to the 1/9/90 rule into the 1% of most active users, the next 9% of moderately active users, and the final 90% of least active users. (In the case of #auspol, the first group contains 142 users, the second 1291, and the third 12700, of a total of 14133 users.)
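As a quick sanity check on those group sizes, the shares can be worked out with a few lines of awk. This is only arithmetic on the numbers quoted above; it is not part of metrify.awk, and the labels are my own shorthand. The shares come out close to, but not exactly, 1/9/90, and this excerpt does not say how the script draws the group boundaries.

```shell
# Arithmetic on the #auspol group sizes quoted in the post (not metrify.awk itself).
awk 'BEGIN {
	n = split("142 1291 12700", size, " ")
	split("most active,moderately active,least active", label, ",")
	total = 14133
	for (i = 1; i <= n; i++) {
		sum += size[i]
		printf "%s: %d users (%.2f%%)\n", label[i], size[i], 100 * size[i] / total
	}
	printf "sum of groups: %d (reported total: %d)\n", sum, total
}' > auspol_groups.txt
cat auspol_groups.txt
```

The three groups add up to the reported total of 14133 users.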


02

01 2012

Taking Twitter Metrics to a New Level (Part 2)

Update: I’ve clarified/corrected some of the details relating to the percentile metrics contained in the first table which metrify.awk generates.

Update 2: revision 1.2 of metrify.awk adds further functionality in addition to what is described below. These changes are detailed here.

In the previous post, I introduced metrify.awk, our new multi-purpose tool for generating Twitter metrics. Over the next instalments in this series of posts, I’ll take you through the results it produces. And seeing as we’re coming up to the anniversary of the January 2011 south-east Queensland floods, and as I needed to generate those metrics anyway, for a report on social media in the floods which we’re publishing soon, I’ll be using an archive of #qldfloods tweets between 10 and 17 January 2011 as an example here.

I’m running metrify.awk as follows for this:

gawk -F , -f metrify.awk divisions=90,99 time=day qldfloods.csv >qldfloods-metrics.csv

In other words, we’re using a 1/9/90 division of users, and we’re tracking activities per day; the skipusers switch is not set, so full stats for all users will be generated.
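For readers less familiar with awk: operands of the form name=value on the command line are variable assignments, which awk applies when it reaches them in the argument list, before it reads the file that follows. The sketch below uses a hypothetical mini-script (show_options.awk, with a made-up sample.csv) to show how a script could pick up divisions and time. It is an illustration under those assumptions, not an excerpt from metrify.awk, and it uses plain awk, which should behave like gawk here.

```shell
# Hypothetical mini-script, NOT metrify.awk: it only shows how awk receives
# name=value operands such as divisions=90,99 and time=day.
cat > show_options.awk <<'EOF'
# Assignments given before the file name are applied before its first record
# is read, so they are visible in main rules (but not yet in a BEGIN block).
FNR == 1 {
	n = split(divisions, cut, ",")   # "90,99" -> cut[1]="90", cut[2]="99"
	printf "time unit: %s\n", time
	for (i = 1; i <= n; i++)
		printf "percentile boundary %d: %d\n", i, cut[i]
}
EOF

# Made-up two-line input, just so that awk has something to read.
printf 'header\nrow\n' > sample.csv
awk -F , -f show_options.awk divisions=90,99 time=day sample.csv > options.txt
cat options.txt
```

Because the assignments are ordinary operands, they have to appear before the input file name to take effect for that file.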


02

01 2012

Taking Twitter Metrics to a New Level (Part 1)

So, 2011 is finally over – and what a year it’s been. While the confluence of natural disasters, political crises, and other major events has also provided us with the basis for a new research programme in crisis communication, let’s hope that 2012 is a little less intense, please…

To start the new year on a positive note, I’m finally getting around to sharing some more information about the new approach to generating Twitter metrics which we’ve developed over the past few months – this actually started during the research workshops we had with Stefan Stieglitz’s group at the University of Münster in August, so it’s taken some time to gestate into its present form. What it’s now turned into is quite a powerful tool for generating detailed information about a specific Twitter dataset – intended mainly for the study of hashtags, but with applications well beyond this as well. Amongst other things, it enables us to distinguish more effectively between different groups of participating users (from highly active lead users to much less active casual participants), and to track different types of participation, in total or by these specific groups, over time.


02

01 2012

A Belated Post of Our DIATA11 Keynote, and More…

It’s been a busy few days: last week, Jean, Stephen and I participated in the magnificent Düsseldorf Workshop on Interdisciplinary Approaches to Twitter Analysis (DIATA11), which our colleagues and collaboration partners from the University of Düsseldorf organised – it featured a veritable who’s who of Twitter and social media researchers from Europe and beyond. Stephen has already posted the slides and audio for his own talk here, and belatedly, I’m now following suit with our joint keynote from the event (audio also included). The Düsseldorfers have also set up a Slideshare group for the event, and are currently compiling a collection of all the presentations – keep an eye on it, there’s some excellent work there!


20

09 2011

Quick Update from the Road: Twitter Research Methods

Cardiff.
Another week, another presentation: Jean, Stephen, and I have now made it to Cardiff, where we’re participating in the Future of Journalism conference. Today, we presented our paper on Twitter research methods for journalists and journalism researchers, which offers a quick overview of our major ways of studying Twitter (and Twitter hashtags in particular). Our slides and audio from the presentation are below – the full paper is also online. For my liveblogging from the conference, check the Future of Journalism posts on snurb.info – and there’s also the #foj11 hashtag, of course.

09

09 2011

Talking Twitter in Amsterdam

Amsterdam.
After the ECPR conference in Reykjavík, I’ve been lucky enough to spend a week in Amsterdam, where I was invited to present a guest lecture as part of the festive opening of the University of Amsterdam’s ‘new media season’: the official welcoming of the 2011/12 cohort of students in the MA in New Media. My talk presented an overview of our work in Mapping Online Publics so far, with special attention to our work on Twitter. In particular, I spoke about the role of Twitter during the Queensland floods and other crises, as well as our recent breakthroughs in identifying different tweeting activities taking place in the context of different hashtags.

Below are my slides for the talk, with audio (unfortunately I placed my voice recorder in front of the laptop exhaust fan, resulting in a very noisy recording that needed substantial noise reduction, so the audio quality is somewhat below par…). My sincere thanks to Richard Rogers for the invitation to speak to the MA students – looks like a very exciting course.


05

09 2011

Extracting images from Twapperkeeper archives

This is just a quick post to share another new script – this one takes a list of tweets with pre-resolved URLs, and filters the list for known image-hosting services. I whipped this up as part of our ongoing efforts to go deeper into the dynamics of communication at various phases of the Queensland Floods disaster – prompted in part by the observations I made on the link data, which showed a very high prevalence of user-uploaded images being posted and retweeted. Besides that, our project aims to investigate not only text-based public communication, but also the role of image- and video-sharing (as well as the communities that have emerged around these practices, particularly on the Flickr and YouTube platforms). I’m partway through drafting a substantial post taking a closer look at the role of image sharing (and communication around images) in both Twitter and Flickr during the floods, but for now here is the script and the instructions.

Please note that this script won’t work unless the urlextract.awk and urlresolve.awk scripts have been run on the archive first.


# extractimages.awk - extract tweets containing links to images
#
# this script takes a preprocessed CSV of tweets based on the Twapperkeeper format, looks at the longurl field, and removes any lines that do not contain a link to a known image hosting service
# the urlextract.awk and urlresolve.awk scripts should be run prior to running this script
# expected data format:
# longurl,url,text,[other columns]
#
# Released under Creative Commons (BY, NC, SA) by Jean Burgess - je.burgess@qut.edu.au and Axel Bruns - a.bruns@qut.edu.au
# Project website: http://mappingonlinepublics.net

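BEGIN {
	# pass the header row through unchanged (only if there is any input at all)
	if ((getline) > 0)
		print $0
}

# add more services below as you find them
# note: $1 is only the longurl column if awk splits fields on commas (e.g. -F ,)
$1 ~ /(twitpic\.com|flickr\.com|yfrog\.com|plixi\.com|instagr\.am|photobucket\.com|occip\.it|picasaweb\.google|sphotos\.ak\.fbcdn\.net|facebook\.com\/photo|imgur\.com)/ {
	# keep rows whose resolved URL points to a known image hosting service
	print $0
}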
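
To see what the filter does, here is a small sketch that runs a shortened variant of it over three made-up rows in the expected longurl,url,text layout. The sample URLs, the file names, and the reduced service list are all assumptions for illustration; for real archives, use the full script above.

```shell
# Made-up sample rows in the expected longurl,url,text layout.
cat > sample_tweets.csv <<'EOF'
longurl,url,text
http://twitpic.com/abc123,http://bit.ly/x1,Flooding in the street #qldfloods
http://www.abc.net.au/news/,http://bit.ly/x2,Latest updates #qldfloods
http://yfrog.com/h4xyz,http://bit.ly/x3,View from the bridge #qldfloods
EOF

# Shortened variant of the filter: pass the header through, then keep only
# rows whose first comma-separated field matches an image host.
awk -F , '
NR == 1 { print; next }
$1 ~ /(twitpic\.com|yfrog\.com|imgur\.com)/ { print }
' sample_tweets.csv > sample_images.csv
cat sample_images.csv
```

The header and the two image rows are kept, while the news link is dropped.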

18

02 2011

Media use in the #qldfloods

As I’m sure you’re aware, last week was pretty rough for Queensland (and then New South Wales and Victoria), as devastating flash floods ripped through Toowoomba and the Lockyer Valley, quickly followed by extreme river flooding in Ipswich and Brisbane that saw thousands of homes inundated. As in any emergency situation or other ‘acute event’, public communication played a vital role during all phases of the flooding – from warning, to emergency, and – eventually – to recovery, relief and rebuilding.

In this and the related Media Ecologies project in the CCI, we’re trying to understand how public communication is constituted through the operation of the broader media ecology, including social media as well as the full range of other communication technologies and practices that individual citizens have at their disposal. So we’re throwing all the research tools we have in our kit (and developing some new ones) at analysing public communication during the floods – initially through the lens of social media, and particularly, Twitter.

Axel has already posted a first look at some overall patterns of Twitter activity during the most acute period of the event, and at the end of the post asked our readers to nominate research questions and ideas for us to investigate – thanks very much to those who’ve contributed ideas so far. There is much more to do of course, and we’re on the case. In this and subsequent posts, I’m focusing on some patterns in the uses made of various media platforms and sources by Twitter users during the flood.


22

01 2011

Top 20 election-related YouTube videos (according to Twitter)

Update: this analysis covers a few fewer days than I originally stated – the results should look quite different once we add in this week’s links (and next week’s!).

Here are the top 20 Australian election-related YouTube videos up to last Friday morning, according to the Twitterati. Or to be more precise, here are the 20 videos which have been linked to the most in tweets containing the #ausvotes hashtag posted between 17 July and 6 August, according to the Twapperkeeper archive.

Couple of interesting things to note:

  • the mismatches between the Twitter link rankings of some of these videos and the number of views they have received on YouTube;
  • the low numbers of links generally (could be a glitch with the scripts, but I’m reasonably confident it isn’t);
  • the reasonably solid performance of ‘made-for-web’ comedy videos performed and/or produced by professionals;
  • the high retweet value of ‘official’ campaign videos (in which I’d probably count GetUp!) – although it’s important to note that the tweets that go alongside the videos are frequently less than flattering…
  • and if I may add a personal note, the only mild sharpness or funniness of even the sharpest and funniest of these videos…


12

08 2010