A #HASTAC2014 Conference Tweets Archive

HASTAC 2014, Lima, Perú

Like last year, I attempted to archive the tweets tagged with the HASTAC annual conference’s official hashtag (this year #HASTAC2014).

The resulting dataset is a CSV file containing 3748 tweets tagged with #HASTAC2014 (the match is case-insensitive).

The first tweet in the dataset is dated 19/04/2014 23:10:50 and the last one is dated 27/04/2014 15:00:54, both in Lima, Perú time. The file also contains the equivalent times in GMT.
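As a rough illustration of how the Lima and GMT timestamps relate, here is a minimal Python sketch. It assumes the timestamps use the dd/mm/yyyy HH:MM:SS layout shown above and that Lima was on a fixed UTC-5 offset with no daylight saving in April 2014. The function name `lima_to_gmt` is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Lima was on a fixed UTC-5 offset (no daylight saving) in April 2014.
LIMA = timezone(timedelta(hours=-5), name="PET")


def lima_to_gmt(stamp: str) -> datetime:
    """Parse a 'dd/mm/yyyy HH:MM:SS' Lima timestamp and return it in GMT/UTC."""
    local = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S").replace(tzinfo=LIMA)
    return local.astimezone(timezone.utc)


first = lima_to_gmt("19/04/2014 23:10:50")
last = lima_to_gmt("27/04/2014 15:00:54")
print(first.strftime("%d/%m/%Y %H:%M:%S"))  # 20/04/2014 04:10:50
print(last.strftime("%d/%m/%Y %H:%M:%S"))   # 27/04/2014 20:00:54
```

Note that the first tweet falls on 20 April in GMT, so date-based counts will differ slightly depending on which time column is used.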

HASTAC is an alliance of humanists, artists, social scientists, scientists and technologists working together to transform the future of learning for the 21st century. Since 2002, HASTAC (“haystack”) has served as a community of connection where 11,500+ members share news, tools, research, insights, and projects to promote engaged learning for a global society.

HASTAC 2014: Hemispheric Pathways: Critical Makers in International Networks, the 6th international conference for the Humanities, Arts, Science, and Technology Alliance and Collaboratory, was hosted by the Ministerio de Cultura in Lima, Perú, from 6pm Wednesday 23 April to 1pm Sunday 27 April 2014 local time.

To reduce the inclusion of spam tweets, an account needed at least two followers for its tweets to be included in the archive.
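For anyone who wants to apply a similar follower threshold to an exported CSV after the fact, a minimal Python sketch follows. It is not how TAGS applies the rule internally. The column name `user_followers_count` is an assumption about the export format, and the sample rows are invented.

```python
MIN_FOLLOWERS = 2  # threshold mentioned in the post


def has_enough_followers(row, minimum=MIN_FOLLOWERS):
    """Return True if the row's follower count meets the minimum.

    Rows with a missing or non-numeric count are treated as failing the check.
    """
    try:
        return int(row.get("user_followers_count", "")) >= minimum
    except (TypeError, ValueError):
        return False


# Invented rows for illustration; the column name is an assumption.
rows = [
    {"from_user": "alice", "user_followers_count": "150"},
    {"from_user": "spambot", "user_followers_count": "0"},
    {"from_user": "bob", "user_followers_count": "2"},
    {"from_user": "unknown", "user_followers_count": ""},
]
kept = [r["from_user"] for r in rows if has_enough_followers(r)]
print(kept)  # ['alice', 'bob']
```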

I harvested the tweets with (several!) Twitter Archiving Google Spreadsheets (TAGS version 5.1, by Martin Hawksey).

Please note that both research and experience show that the Twitter search API isn’t 100% reliable. Large tweet volumes affect the search collection process as well. The API might “over-represent the more central users”, not offering “an accurate picture of peripheral activity” (González-Bailón, Sandra, et al. 2012). Therefore, it cannot be guaranteed this file contains each and every tweet tagged with #HASTAC2014 during the indicated period.

[It should go without saying, but it is worth noting that some conference tweets might have used other variations of the hashtag. Those were not included in this collection. Therefore, even a complete set of tweets tagged #HASTAC2014 would not represent all the Twitter activity around the 2014 conference.]

The file contains raw data and might require refining, including deduplication. The data is shared as is.
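One possible way to deduplicate such a file is to keep only the first row seen for each tweet ID. The Python sketch below assumes the CSV has an `id_str` column holding the tweet ID, which is an assumption about the export. It uses a tiny invented sample in place of the real file.

```python
import csv
import io


def deduplicate(rows, key="id_str"):
    """Yield rows in their original order, skipping repeated key values."""
    seen = set()
    for row in rows:
        tweet_id = row[key]
        if tweet_id in seen:
            continue
        seen.add(tweet_id)
        yield row


# Tiny invented sample standing in for the real CSV file.
sample = io.StringIO(
    "id_str,from_user,text\n"
    "1,alice,Hello #HASTAC2014\n"
    "2,bob,Great panel #hastac2014\n"
    "1,alice,Hello #HASTAC2014\n"
)
unique_rows = list(deduplicate(csv.DictReader(sample)))
print(len(unique_rows))  # 2
```

To run this on the real file, replace `sample` with `open("archive.csv", newline="", encoding="utf-8")`, where `archive.csv` is a placeholder file name.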

The file is openly accessible via figshare:

Priego, Ernesto (2014): #HASTAC2014 Conference Tweets Archive from 19 April to 25 April 2014. figshare.

[I have just published this and the DOI might take some time to become active.]

The URL for the dataset is


The file is shared under a Creative Commons Attribution license (CC BY).

I have been archiving conference tweets and sharing backchannel datasets for some time now. I am keen on promoting the study of academic conference networks on Twitter. By openly sharing the resulting datasets and blogging about the process over time, I have also been openly documenting my own learning curve in archiving tweets and in doing it better. If you use or refer to this data in any way, please cite and link back using the citation information above.

I hope to have time soon to finish and publish another post with more detail about the HASTAC conference backchannels.

Thank you for reading and sharing. If you attended the conference, I hope you had a nice time. As usual, I am sorry I could not attend in person.


Visualising #digitrans

Screenshot of a fragment of a #digitrans TAGSExplorer visualisation, 20/11/2012 1:07 PM GMT

Yesterday I attended the Digital Transformations Moot organised by the Arts and Humanities Research Council in London. My colleague Sarah Quinnell and I participated in the ‘Yack Space’ with a ten-minute flash presentation on our Networked Researcher project. You can view our slides here.

This morning I used Martin Hawksey’s TAGSExplorer to create a visualisation of a Google spreadsheet archive of the #digitrans tweets. You can view it here.

By tweaking the visualisation’s URL you can also see the nodes connected by @ mentions and @ replies, here.

And if you want to push your browser to the limit and see web entanglement in full effect, the archive can also visualise RTs (here).

Note that the visualisation is in fact an interactive, searchable archive. You can click on nodes to find out more and also search by keyword.

The Google spreadsheet archive was created once the event had finished (this morning around 9:00am GMT) and it updates itself every fifteen minutes. Nevertheless, since the real-life event officially concluded last night, we can argue that most of the event’s backchannel tweets have been collected. At the time of writing this post the archive had collected 1517 unique tweets:

#digitrans archive summary with top 20 tweeters. Screenshot taken 20/11/2012 12:48 PM GMT.

As expected, most of the tweets were posted during the day of the event (19 November 2012), with some activity in the days before and on the day after:

Graph of #digitrans tweet volume over time. Screenshot taken 20/11/2012 12:54 PM GMT.
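A volume-over-time count like the one in the graph can be approximated from an exported archive by grouping tweets by calendar day. The Python sketch below assumes timestamps in dd/mm/yyyy HH:MM:SS form, which is an assumption about the archive's time column. The sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar day from 'dd/mm/yyyy HH:MM:SS' strings."""
    days = (datetime.strptime(t, "%d/%m/%Y %H:%M:%S").date() for t in timestamps)
    return Counter(days)


# Invented timestamps for illustration only.
sample_times = [
    "18/11/2012 21:15:00",
    "19/11/2012 09:30:12",
    "19/11/2012 14:02:45",
    "19/11/2012 16:48:03",
    "20/11/2012 08:05:59",
]
volume = tweets_per_day(sample_times)
for day in sorted(volume):
    print(day.isoformat(), volume[day])
```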

The top tweeters were divided among the organisers, speakers and attendees:

Pie chart of #digitrans top tweeters percentages. Screenshot taken 12:53 PM GMT.
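The same kind of top-tweeters summary can be sketched in a few lines of Python by counting how often each account appears in the archive. The helper name `top_tweeters` and the sample usernames are invented. In a real archive the usernames would come from whichever column holds the tweet author.

```python
from collections import Counter


def top_tweeters(users, n=20):
    """Return (user, count, share) tuples for the n most active accounts."""
    counts = Counter(users)
    total = sum(counts.values())
    return [(user, count, count / total) for user, count in counts.most_common(n)]


# Invented usernames, one entry per tweet.
sample_users = ["alice", "bob", "alice", "carol", "alice", "bob"]
for user, count, share in top_tweeters(sample_users, n=2):
    print(f"{user}: {count} tweets ({share:.0%})")
```

Each share here is a fraction of all tweets in the sample, not only of the tweets by the top accounts. A pie chart limited to the top tweeters would need the shares recalculated over that subset.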

I have found Martin Hawksey’s tool very useful to collect, archive, visualise and analyse Twitter activity, particularly academic conference backchannels. It offers a way of revealing the intrinsically networked and social (as in, involving human interaction) nature of a Twitter stream’s data.

As a form of data mining and distant reading, visualising archives of Twitter backchannels (and therefore networks) can be a useful way of demonstrating an event’s public impact and of discovering key participants, topics, sentiment and links.