#rfringe17: Top 230 Terms in Tweetage

 

 

tl; dr

Repository Fringe is a gathering for repository managers and others interested in research data repositories and publication repositories.

I collected an archive of #rfringe17, containing 1118 Tweet IDs. I then analysed the text of the tweets with Voyant Tools to identify the most frequent terms and manually refined the results to 230 terms.

I collected an archive of #rfringe17 tweets using TAGS. The key stats from the archive:

Number of Tweets in Archive: 1,118
Number of usernames in Archive: 215
First Tweet Collected: 26/07/2017 14:58:12
Last Tweet Collected: 05/08/2017 08:00:06

From http://www.repositoryfringe.org/:

Repository Fringe is a gathering for repository managers and others interested in research data repositories and publication repositories. Participation is a key element – the event is designed to encourage all attendees to share their repository experiences and expertise.

2017 marks the 10th Repo Fringe where we will be celebrating progress we have made over the last 10 years to share content beyond borders and debating future trends and challenges.

It took place in Edinburgh on 3–4 August 2017.

If you are not new to this blog, you will have guessed that I could not resist running the text of the collected tweets through Voyant Tools to obtain term counts for the corpus with its Terms tool. As usual, I applied the English stop-word filter, which I customised to include Twitter-specific terms (such as https, t.co, etc.) and the list of usernames.

I then manually refined the resulting data to remove smileys and any remaining usernames (some might have survived, as it is sometimes hard to tell ordinary terms from usernames). I limited the results to the top 230 terms.
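For readers who would like to approximate this counting step outside Voyant Tools, here is a minimal Python sketch. It is not the workflow used for this post: the stop-word sets are short, hypothetical stand-ins for Voyant's much longer English list, and the function name `top_terms` and the sample tweets are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated stop lists (assumptions for illustration).
ENGLISH_STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "at", "for", "on"}
TWITTER_STOP_WORDS = {"https", "http", "t.co", "rt", "amp"}

# A token is a run of letters/digits, optionally joined by ' ’ or . (e.g. "it’s", "e.g").
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:['’.][a-z0-9]+)*")


def top_terms(tweets, usernames=(), limit=230):
    """Return the `limit` most frequent terms as (term, count) pairs."""
    stop = (
        ENGLISH_STOP_WORDS
        | TWITTER_STOP_WORDS
        | {name.lower().lstrip("@") for name in usernames}
    )
    counts = Counter()
    for text in tweets:
        text = re.sub(r"https?://\S+", " ", text.lower())  # drop links
        text = re.sub(r"@\w+", " ", text)                   # drop @mentions
        for token in TOKEN_PATTERN.findall(text):
            if token not in stop:
                counts[token] += 1
    return counts.most_common(limit)


# Made-up sample tweets, only to show the call.
sample = [
    "Open data at #rfringe17 https://t.co/abc",
    "RT @someone: open research and open data",
]
for term, count in top_terms(sample, usernames=["someone"], limit=3):
    print(term, count)
```

A sketch like this still leaves smileys and username look-alikes to be weeded out by hand, as described above.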

Do take the counts with a pinch of salt: I did not clean the export from TAGS, so duplicate tweets and perhaps even some spam (who knows) might have remained.
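Had I cleaned the export, removing duplicates could have looked something like the sketch below. It assumes the archive has been exported as CSV with an `id_str` column holding each tweet's ID; the function name `drop_duplicate_tweets` and the sample rows are hypothetical.

```python
import csv
import io


def drop_duplicate_tweets(rows):
    """Keep only the first row seen for each tweet ID (column `id_str`)."""
    seen = set()
    unique = []
    for row in rows:
        tweet_id = row["id_str"]
        if tweet_id in seen:
            continue  # this tweet is already in the output
        seen.add(tweet_id)
        unique.append(row)
    return unique


# Made-up export with one repeated row.
sample_export = io.StringIO("id_str,text\n1,hello\n2,world\n1,hello\n")
rows = list(csv.DictReader(sample_export))
print(len(rows), "rows before,", len(drop_duplicate_tweets(rows)), "after")
```

This only catches exact repeats of the same tweet ID; retweets have their own IDs and spam would need a separate check.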

Term Count
research 109
open 106
data 104
wikidata 75
oa 72
openscience 66
repository 63
repofringe 56
repositories 53
libraries 51
openresleeds 49
copyright 46
just 43
science 42
good 41
impact 41
thanks 41
day 39
access 38
poster 36
work 35
openaccess 34
talk 34
edinburgh 30
today 30
great 29
ucl 29
sherpa 28
read 27
want 27
event 26
project 26
really 26
time 26
cool 25
fringe 25
policy 24
metadata 23
publishers 23
publishing 23
says 23
colleague 22
policies 22
wikipedia 22
workflow 22
guide 21
millar 21
useful 21
comprehensive 20
content 20
fascinating 20
interesting 20
liveblogs 20
rdm 20
institutional 19
issue 19
it’s 19
liveblog 19
look 19
new 19
think 19
workshop 19
check 18
citizen 18
events 18
group 18
ip 18
management 18
need 18
outputs 18
presentation 18
rescue 18
session 18
trump 18
casrai 17
cycle 17
excellent 17
journal 17
lots 17
promotion 17
query 17
resource 17
uk 17
best 16
future 16
press 16
stuff 16
gallery 15
i’m 15
key 15
ref 15
showing 15
successful 15
support 15
thank 15
working 15
art 14
come 14
core 14
fun 14
miss 14
nice 14
process 14
provide 14
reminding 14
university 14
using 14
way 14
add 13
beautiful 13
demo 13
deposit 13
eprints 13
forward 13
funders 13
importance 13
keynote 13
looking 13
paper 13
phd 13
researchers 13
vote 13
e.g 12
era 12
especially 12
feedback 12
generation 12
got 12
let 12
needed 12
observation 12
recent 12
report 12
review 12
showcase 12
site2cite 12
star 12
theses 12
try 12
we’re 12
weirdness 12
advises 11
attendees 11
boat 11
broken 11
coar 11
control 11
criteria 11
exposure 11
global 11
institutions 11
like 11
model 11
prof 11
scholarly 11
survey 11
trek 11
use 11
years 11
articles 10
award 10
case 10
excited 10
exposing 10
figshare 10
gifts 10
hear 10
highlighted 10
important 10
initiative 10
integrating 10
introducing 10
live 10
opening 10
platform 10
ref2021 10
spend 10
vision 10
week 10
won 10
workshops 10
altmetric 9
colleagues 9
current 9
discussion 9
evidence 9
field 9
getting 9
i’ll 9
infrastructure 9
inspiring 9
library 9
link 9
list 9
local 9
long 9
make 9
meeting 9
peer 9
post 9
practice 9
preservation 9
problem 9
role 9
service 9
shoutout 9
shows 9
slides 9
sure 9
team 9
thought 9
touch 9
tweets 9
works 9
added 8
based 8
believe 8
better 8
change 8
conference 8
contributing 8
days 8
european 8
example 8
far 8
favourite 8
fully 8
here’s 8
image 8
included 8

Of course, sharing this data as an HTML table is not the best way of doing it, but hey. I have the source data if anyone is interested; Twitter developer guidelines allow the sharing of tweet IDs. In this case the source data consists of the dataset of 1118 tweet ID strings (id_str).
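As a rough illustration of what an IDs-only dataset might look like in practice, the sketch below pulls the `id_str` column out of a CSV export and writes one ID per line. The column layout of the sample, the function names and the data are assumptions for illustration, not a description of the actual files.

```python
import csv
import io


def tweet_ids_only(archive_file):
    """Read a CSV export and return the tweet IDs as strings, skipping blanks."""
    reader = csv.DictReader(archive_file)
    return [row["id_str"] for row in reader if row.get("id_str")]


def write_id_list(ids, out_file):
    """Write one tweet ID per line, with no other tweet content."""
    for tweet_id in ids:
        out_file.write(f"{tweet_id}\n")


# Made-up export; only the id_str column ends up in the shared list.
export = io.StringIO("id_str,from_user,text\n891,alice,hi\n892,bob,yo\n")
shared = io.StringIO()
write_id_list(tweet_ids_only(export), shared)
print(shared.getvalue())
```

The IDs are kept as strings throughout because spreadsheets and some tools round very large numbers, which would corrupt them.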

Maybe I missed it, but in the list above I could not find ‘bepress’ or ‘elsevier’, by the way…

#HASTAC2013 Interactive Archive

http://hastac2013.org/

Version 2.0. The figshare and HASTAC versions of this post have been updated accordingly.

UPDATE Thursday 2 May 2013, 08:48am BST. 

Unfortunately I did not have time to run a new collection with a higher number of tweets. The initial collection used the default 1500, and even though I ran it on the Monday morning (BST) after the conference, the archive did not go back far enough (it can only go back 7 days). In retrospect I should have aimed to collect more tweets than the default 1500 the first time around, but I was concerned the script would time out.

I only found some time this morning to try again (the script timed out when I tried 18,000 tweets, which is the maximum output), and using 17,500 worked this time, taking me as far back as 26/04/13, 08:22:43, which is more than 24 hours before my previous collection. The conference information says activities started on 25/04/13 (Thursday), but as the programme and now both #hastac2013 archives confirm, the day with the most activity was 27/04/13 (Saturday). Therefore, though this new collection does not go as far back as the 25th, at least it covers the day before activity peaked. Where the previous archive had 1500 tweets, this new one gathered 3,898.

Here are two screenshots of the second archive’s summary charts, taken right after I ran the collection:

3898 tweets collected, archive started 26/04/13 8:22:43. Archive set up by Ernesto Priego using TAGS.
#HASTAC2013 Tweet Volume Over Time, second collection, with peak on 28/04/13, reaching ∼1200 tweets. Archive set up by Ernesto Priego

You can see a published interactive archive of this new archive here.

I link to the published spreadsheets from the PDF version of this post that can be downloaded from figshare.

HASTAC2013 Interactive Archive. Ernesto Priego. figshare.
http://dx.doi.org/10.6084/m9.figshare.693045

[Version 1.0 below]

As they describe it themselves, the Humanities, Arts, Science, Technology Advanced Collaboratory (HASTAC – “haystack” hastac.org), is “an organisation at the international forefront of knowledge mobilization for our digital present and innovation in the academy.” I have had the honour to be a HASTAC Scholar blogging at their site since 2010.

2013 marks the 10th anniversary of HASTAC’s founding, and on 25th-28th April they celebrated their decennial conference, titled “The Storm of Progress: New Horizons, New Narratives, New Codes”, in Toronto, Canada.

I was able to participate in the conference via pre-recorded video on Saturday the 27th, thanks to Fiona Barnett’s kind invitation. While I was presenting in real life at the Forms of Innovation workshop at Durham, UK, my video was being shown in Toronto! This also means that while colleagues were live-tweeting about my session at #formsinn, they were also live-tweeting from #hastac2013…

Anyway many of us were able to follow the proceedings of the HASTAC conference through a lively Twitter backchannel. I believe the backchannel is a useful research resource on its own, and of course it allows us to perform some ‘meta’ analysis of the network itself. I set up a Google spreadsheet to collect #HASTAC2013 tweets and created an interactive archive that visualises the interactions in real time. (This will make demands from your browser…)

[I have done an initial archive covering only the latest (at the time of publishing) 1500 tweets, as high values may not work due to script timeouts, but I am currently experimenting to try to capture the majority of the #hastac2013 output. Will update accordingly. Times from my archiving are GMT].

Screen Shot of a moment in #HASTAC2013 interactive archive, 2013-04-29 at 09.03.13
1500 tweets, 599 RTs, 421 links. Archive started by Ernesto Priego 27/04/2013 18:42:19 GMT.
#HASTAC2013 Tweet Volume Over Time, with peak on 28/04/13, reaching ∼500 tweets. Archive set up by Ernesto Priego

I have archived and shared a version of this blog post as a PDF on Figshare, so it gets a digital object identifier. Citation is:

HASTAC2013 Interactive Archive. Ernesto Priego. figshare.
http://dx.doi.org/10.6084/m9.figshare.693045

Retrieved 09:12, Apr 29, 2013 (GMT)

As usual, with many thanks to Martin Hawksey.