The Digital Humanities 2014 conference took place Monday 7 July 2014 – Saturday 12 July 2014 in Lausanne, Switzerland. (I was lucky to attend and present a poster there.) Though other hashtags were used to tweet about the conference, the main hashtag was #dh2014.
I started collecting #dh2014 Tweets on the 7th of September 2013. Having attempted to collect and reconstruct #dh2012 and #dh2013 archives (and having seen the growth in conference live-tweeting in DH since 2009) I knew the volume would exceed expectations. I broke several Google spreadsheets along the way. In the end I resigned myself to trying to reconstruct an archive for the duration of the conference proceedings, 7-12 July 2014.
I have now shared on figshare an .XLS file containing a dataset of Tweets tagged with #dh2014 (case-insensitive).
Priego, Ernesto (2014): An Incomplete #dh2014 Twitter Archive (Conference Days Only). figshare.
The archive as a whole contains 16,154 publicly published Tweets tagged with #dh2014 between Monday 07/07/2014 00:03:00 (CEST) and Saturday 12/07/2014 23:48:00 (CEST).
The Tweets contained in this file were collected using Martin Hawksey’s TAGS 5.1. Due to the volume of Tweets, nine Google Spreadsheets were created during the period of the event, which were later refined to four. The data was then organised manually into various sheets, which have been included here.
Sheet 0. A ‘Cite Me’ sheet, including the provenance of this file, citation information, information about its contents, the methods employed and some context.
Sheet 1. Monday 7 July 2014 (1,052 Tweets; gap between 07/07/2014 10:19 and 07/07/2014 11:20)
Sheet 2. Tuesday 8 July 2014 (3,605 Tweets)
Sheet 3. Wednesday 9 July 2014 (4,372 Tweets)
Sheet 4. Thursday 10 July 2014 (2,879 Tweets; significant gap between 10/07/2014 01:51 and 10/07/2014 10:10)
Sheet 5. Friday 11 July 2014 (3,843 Tweets)
Sheet 6. Saturday 12 July 2014 (403 Tweets)
Tweets were collected under local time in Lausanne, Switzerland. Times in GMT are also included.
Only users with at least 2 followers were included in the archive. Retweets have been included. Data might require deduplication.
Unfortunately the metadata in the sheets for Monday – Thursday is incomplete (the lack of ISO language metadata in these sheets is particularly disappointing, as it would have provided interesting insights); Friday and Saturday do contain the standard metadata available from TAGS.
Some work was done to check that the chronology was complete; I have highlighted a gap in the Tweets on Monday 7 July 2014 between 07/07/2014 10:19 and 07/07/2014 11:20 and on Thursday 10 July 2014 between 10/07/2014 01:51 and 10/07/2014 10:10.
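For anyone wishing to repeat the chronology check on their own copy of the data, here is a minimal sketch in Python. It assumes the timestamps have already been read into strings in the dd/mm/yyyy hh:mm format used in this post; the function name find_gaps, the 30-minute threshold and the sample timestamps are all illustrative assumptions, not part of the dataset or of TAGS.

```python
from datetime import datetime, timedelta

# Assumed timestamp format, matching the dd/mm/yyyy hh:mm style used in this post
FMT = "%d/%m/%Y %H:%M"

def find_gaps(timestamps, threshold=timedelta(minutes=30)):
    """Return (start, end) pairs where consecutive Tweets are further apart
    than the threshold. The 30-minute default is an arbitrary assumption."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]

# Hypothetical sample timestamps, not taken from the actual archive
sample = [
    "07/07/2014 10:05",
    "07/07/2014 10:19",
    "07/07/2014 11:20",
    "07/07/2014 11:24",
]
times = [datetime.strptime(s, FMT) for s in sample]
gaps = find_gaps(times)

for start, end in gaps:
    print(start.strftime(FMT), "->", end.strftime(FMT))
```

A suitable threshold depends on the time of day: a long silence overnight is probably expected, whereas the same silence during conference sessions suggests missing data.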
I was not able to recover the Tweets missing from these gaps. Yannick Rochat and Martin Grandjean’s archive has what seems to be the complete set (available at http://goo.gl/6W3dol; last accessed Tuesday 15 July 2014 11:55 BST). Please see:
- Rochat, Yannick, “The DH 2014 Conference in Lausanne – A feedback”, 2014/07/14, http://yro.ch/?p=417, accessed 15 July 2014; and
- Grandjean, Martin, “[DataViz] The digital humanities network on Twitter (#DH2014)”, 2014/07/14, http://www.martingrandjean.ch/dataviz-digital-humanities-twitter-dh2014/, accessed 15 July 2014.
Please note Rochat and Grandjean’s dataset has 16,903 Tweets, whereas my collection only harvested 16,154 Tweets (749 fewer Tweets).
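The figures quoted in this post can be checked with a few lines of Python. This is only a sanity check on the arithmetic, using the per-sheet counts and the two totals as reported here; the variable names are my own.

```python
# Per-sheet Tweet counts as listed in this post
sheet_counts = {
    "Monday 7 July 2014": 1052,
    "Tuesday 8 July 2014": 3605,
    "Wednesday 9 July 2014": 4372,
    "Thursday 10 July 2014": 2879,
    "Friday 11 July 2014": 3843,
    "Saturday 12 July 2014": 403,
}

total = sum(sheet_counts.values())

# Total reported in this post for Rochat and Grandjean's dataset
other_total = 16903
shortfall = other_total - total

print("Tweets in this archive:", total)
print("Fewer Tweets than the other dataset:", shortfall)
print("Share of the other dataset missing here:", f"{shortfall / other_total:.1%}")
```

The sheets add up to 16,154 Tweets and the difference is 749, roughly 4.4% of the larger dataset.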
Please note that both research and experience show that the Twitter search API isn’t 100% reliable. Large tweet volumes affect the search collection process. The API might “over-represent the more central users”, not offering “an accurate picture of peripheral activity” (González-Bailón, Sandra, et al. 2012).
The Tweet volume was higher than the available collecting methods allowed, so the data is likely to be incomplete. There is no guarantee that this file contains each and every Tweet tagged with #dh2014 during the indicated period, and it is shared for comparative and indicative educational and research purposes only.
Please note the data in this file is likely to require further refining and even deduplication. The data is shared as is. This dataset is shared to encourage open research into scholarly activity on Twitter. If you use or refer to this data in any way please cite and link back using the citation information above.
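As a starting point for the deduplication mentioned above, here is a minimal sketch in Python. It assumes each sheet has been exported and read into a list of dictionaries (for example with csv.DictReader), and that the rows have columns named from_user, text and time. Those column names are assumptions on my part, so please check them against the headers in the actual sheets, especially for Monday to Thursday, where the metadata is incomplete. The sample rows are invented for illustration.

```python
def deduplicate(rows, key_columns=("from_user", "text", "time")):
    """Keep the first occurrence of each row, comparing only key_columns.

    The default column names are assumptions; adjust them to match the
    headers actually present in each sheet.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical rows, not taken from the actual archive
rows = [
    {"from_user": "user_a", "text": "Keynote starting #dh2014", "time": "08/07/2014 09:00"},
    {"from_user": "user_b", "text": "RT @user_a: Keynote starting #dh2014", "time": "08/07/2014 09:01"},
    {"from_user": "user_a", "text": "Keynote starting #dh2014", "time": "08/07/2014 09:00"},
]

unique_rows = deduplicate(rows)
print(len(rows), "->", len(unique_rows))
```

Retweets are kept because they come from a different user and so count as distinct rows; only exact repeats of the same user, text and time are dropped. Where a sheet does include a unique Tweet identifier, passing that single column as key_columns would be a more reliable key.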