The 2018 Altmetric Top 100 Outputs with ‘Comics’ as Keyword

As it’s that time of the year and Altmetric has released its 2018 Top 100, in this post I share the 2018 Top 100 research outputs with ‘comics’ as a keyword according to Altmetric.

I queried the data from the Altmetric Explorer, looking for all outputs with this keyword between 13/12/2017 and 13/12/2018. I then refined the data to concentrate only on the Top 100 outputs about comics.

To see the complete Top 100, you can download the dataset I shared on figshare at

Below you can take a quick look at the top 20 outputs with the keyword ‘comics’, ordered by their Altmetric Attention Score:

| Altmetric Attention Score | Title | Journal/Collection Title | Publication Date |
| --- | --- | --- | --- |
| 524 | Ten simple rules for drawing scientific comics | PLoS Computational Biology | 04/01/2018 |
| 286 | Comixify: Transform video into a comics | | 09/12/2018 |
| 154 | Teaching Confidentiality through Comics at One Spanish Medical School | AMA Journal of Ethics | 01/02/2018 |
| 99 | Bruised and Battered: Reinforcing Intimate Partner Violence in Comic Books | Feminist Criminology | 17/05/2018 |
| 84 | Of Microscopes and Metaphors: Visual Analogy as a Scientific Tool | The Comics Grid: Journal of Comics Scholarship | 10/10/2018 |
| 79 | The potential of comics in science communication | JCOM – Journal of Science Communication | 23/01/2018 |
| 65 | Alter egos: an exploration of the perspectives and identities of science comic creators | JCOM – Journal of Science Communication | 16/01/2018 |
| 61 | Using comics to change lives | The Lancet | 01/01/2018 |
| 50 | The Question Concerning Comics as Technology: Gestell and Grid | The Comics Grid: Journal of Comics Scholarship | 24/09/2018 |
| 47 | A survey of comics research in computer science | | 16/04/2018 |
| 41 | Is There a Comic Book Industry? | Media Industries | 05/06/2018 |
| 38 | The Utility of Multiplex Molecular Tests for Enteric Pathogens: a Micro-Comic Strip | Journal of Clinical Microbiology | 24/01/2018 |
| 38 | Farting Jellyfish and Synergistic Opportunities: The Story and Evaluation of Newcastle Science Comic | The Comics Grid: Journal of Comics Scholarship | 20/03/2018 |
| 35 | Pitfalls in Performing Research in the Clinical Microbiology Laboratory: a Micro-Comic Strip | Journal of Clinical Microbiology | 25/09/2018 |
| 34 | Neural Comic Style Transfer: Case Study | | 05/09/2018 |
| 31 | Comics and the Ethics of Representation in Health Care … | AMA Journal of Ethics | 01/02/2018 |
| 29 | Undemocratic Layout: Eight Methods of Accenting Images | The Comics Grid: Journal of Comics Scholarship | 25/05/2018 |
| 29 | Communicating Science through Comics: A Method | Publications | 30/08/2018 |
| 26 | Of Cornopleezeepi and Party Poopers: A Brief History of Physicians in Comics … | AMA Journal of Ethics | 01/02/2018 |
| 26 | On the Significance of the Graphic Novel to Contemporary Literary Studies: A Review of The Cambridge Companion to the Graphic Novel | The Comics Grid: Journal of Comics Scholarship | 19/09/2018 |


I am obviously very pleased to see The Comics Grid included in the Top 100.

It is interesting to note the diversity of countries associated with the profiles (where the metadata was available) giving attention to the outputs. According to Altmetric, there were 4,588 tweets about research outputs with ‘comics’ as keyword between 13/12/17 and 13/12/18, by 2,866 unique tweeters in 98 different countries. The map looks like this:

Countries and Number of Profiles that Gave Attention to Research Outputs with ‘Comics’ Keyword between 13/12/17 and 13/12/18 according to Altmetric. Chart by Altmetric Explorer.


I shared the countries data on figshare at

For more information and context on Altmetric and using the Altmetric Explorer, see my 2016 post here. Many other posts about alternative metrics and the Altmetric Explorer can be found throughout my blog.


Priego, Ernesto (2018): Altmetric Top 100 Outputs with ‘Comics’ Keyword between 13/12/17 and 13/12/18. figshare. Dataset.

Priego, Ernesto (2018): Countries and Number of Profiles that Gave Attention to Research Outputs with ‘Comics’ Keyword between 13/12/17 and 13/12/18 according to Altmetric. figshare. Dataset.

Metricating #respbib18 and #ResponsibleMetrics: A Comparison

I’m sharing summaries of Twitter numerical data from collecting the following bibliometrics event hashtags:

  • #respbib18 (Responsible use of Bibliometrics in Practice, London, 30 January 2018) and
  • #ResponsibleMetrics (The turning tide: A new culture of responsible metrics for research, London, 8 February 2018).


#respbib18 Summary

Event title Responsible use of Bibliometrics in Practice
Date 30-Jan-18
Times 9:00 am – 4:30 pm  GMT
Sheet ID RB
Hashtag #respbib18
Number of links 128
Number of RTs 100
Number of Tweets 360
Unique tweets 343
First Tweet in Archive 23/01/2018 11:44 GMT
Last Tweet in Archive 01/02/2018 16:17 GMT
In Reply Ids 15
In Reply @s 49
Unique usernames 54
Unique users who used tag only once 26 <–for context of engagement

Twitter Activity

#respbib18 Twitter activity, last three days
CC-BY. Originally published as


#ResponsibleMetrics Summary

Event title The turning tide: A new culture of responsible metrics for research
Date 08-Feb-18
Times 09:30 – 16:00 GMT
Sheet ID RM
Hashtag #ResponsibleMetrics
Number of links 210
Number of RTs 318
Number of Tweets 796
Unique tweets 795
First Tweet in Archive 05/02/2018 09:31 GMT
Last Tweet in Archive 08/02/2018 16:25 GMT
In Reply Ids 43
In Reply @s 76
Unique usernames 163
Unique usernames who used tag only once 109 <–for context of engagement

Twitter Activity

#responsiblemetrics Twitter activity last three days
CC-BY. Originally published as

#respbib18: 30 Most Frequent Terms


Term RawFrequency
metrics 141
responsible 89
bibliometrics 32
event 32
data 29
snowball 25
need 24
use 21
policy 18
today 18
looking 17
people 16
rankings 16
research 16
providers 15
forum 14
forward 14
just 14
practice 14
used 14
community 13
different 12
metric 12
point 12
using 12
available 11
know 11
says 11
talks 11
bibliometric 10

#ResponsibleMetrics: 30 Most Frequent Terms

Term RawFrequency
metrics 51
need 36
research 29
indicators 25
panel 16
responsible 15
best 13
different 13
good 13
use 13
index 12
lots 12
people 12
value 12
like 11
practice 11
context 10
linear 10
rankings 10
saying 10
used 10
way 10
bonkers 9
just 9
open 9
today 9
universities 9
coins 8
currency 8
data 8


Twitter data was mined with Tweepy; for robustness and quick charts, a parallel collection was done with TAGS. Data was checked and deduplicated with OpenRefine, and text analysis was performed with Voyant Tools. Text was anonymised through stoplists: two stoplists were applied (one to each dataset), covering usernames, terms in hashtags and Twitter-specific terms (such as RT, HTTPS, etc.). Event title keywords were not included in the stoplists.
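As a rough illustration of that pipeline, here is a minimal Python sketch of the de-duplication and raw-term-frequency steps. The field names (`id`, `text`) and the exact cleaning rules are my assumptions for illustration; the actual work described above was done with OpenRefine and Voyant Tools, not with this code.

```python
import re
from collections import Counter

def dedupe(tweets):
    """Drop exact duplicates by tweet id, keeping the first occurrence
    (a stand-in for the OpenRefine de-duplication step)."""
    seen, unique = set(), []
    for t in tweets:
        if t["id"] not in seen:
            seen.add(t["id"])
            unique.append(t)
    return unique

def term_frequencies(tweets, stoplist, top_n=30):
    """Raw term frequencies over tweet text: usernames, URLs and
    hashtags are stripped first, then stoplisted words removed."""
    counts = Counter()
    for t in tweets:
        # Remove @mentions, links and hashtags before tokenising.
        text = re.sub(r"@\w+|https?://\S+|#\w+", " ", t["text"].lower())
        for word in re.findall(r"[a-z']+", text):
            if word not in stoplist:
                counts[word] += 1
    return counts.most_common(top_n)
```

A stoplist would then be extended with usernames and event-specific hashtag terms, as described above, leaving the event title keywords in.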

No sensitive, personal or personally identifiable data is contained in this dataset. Any usernames and names of individuals were removed at the data-refining stage, and again from the text analysis results if any remained.

Please note that the two datasets span different numbers of days of activity, as indicated in the summary tables. Source data was refined but some duplicates may have remained, which would affect the resulting raw term frequencies; the numbers should therefore be interpreted as indicative only and not as exact measurements. RTs count as Tweets, and raw frequencies reflect the repetition of terms implicit in retweeting.


As usual I share this hoping others might find it interesting and draw their own conclusions.

A very general insight for me is that we need a wider group engaging with these discussions. At most we are talking about a group of approximately 50 individuals who actively engaged on Twitter across both events.

From the Activity charts it is noticeable that tweeting recedes at breakout times, possibly indicating that most tweeting activity came from within the room. When hashtags create wide engagement, activity is more constant and does not exactly reflect the timings of real-time activity in the room.

It seems to me that the production, requirement, use and interpretation of metrics for research assessment directly affects everyone in higher education, regardless of their position or role. The topic should not be obscure or limited to bibliometricians and RDM, Research and Enterprise or REF panel people.

Needless to say, I do not think everyone ‘engaged’ with these events or topics is, or should be, actively using the hashtag on Twitter (i.e. we don’t know how many people followed on Twitter). An assumption here is that we cannot detect or measure anything if there is no signal: more folks elsewhere might be interested in these events, but if they did not use the hashtag they were logically not detected here. That there is no signal measurable with the selected tools does not mean there is no signal elsewhere, and I’d like this to stand as a comment on metrics for assessment as well.

In terms of frequent terms, it remains apparent (as in other text analyses I have performed on academic Twitter hashtag archives) that frequently tweeted terms are mostly ‘neutral’ nouns, or adjectives when they are keywords in the event’s title, subtitle or panel sessions (e.g. ‘responsible’). When a term like ‘snowball’ or ‘bonkers’ appears, it stands out. Given the lack of more frequent modifiers, it remains hard to distant-read sentiment, critical stances, or even positions. Most frequent terms owe their counts to RTs, not to consensus across ‘original’ Tweets.

It seems that if we wanted to demonstrate the value added by live-tweeting or using an event’s hashtag remotely, quantifying (metricating?) the active users, tweets over time, days of activity and frequent words would not be the way to go for all events, particularly not for events with relatively low Twitter activity.

As we have seen, automated text analysis is more likely to reveal mostly-neutral keywords than any divergence of opinion on, or additions to, the official discourse. We would have to look at the less repeated words, and perhaps at replies that did not use the hashtag, but this is not recommended as it would complicate things ethically: though it is generally accepted that RTs do not imply endorsement, less frequent terms in Tweets with the hashtag could single out individuals, and if a hashtag was not included in a Tweet it should be interpreted that the Tweet was not meant to be part of that public discussion/corpus.





Questions of Access in the Digital Humanities: Data from JDSH

[On 8 August 2017, this post was selected as Editor’s Choice in Digital Humanities Now at]

[N.B. As usual, typos might still be present when you read this; this blog post is likely to be revised post-publication… thanks for understanding. This blog is a sandbox of sorts].

Para Domenico, siempre en deuda

tl;dr, scroll down to the charts

I used the Altmetric Explorer to locate any articles from the Journal of Digital Scholarship in the Humanities that had had any ‘mentions’ online at any time. An original dataset of 82 bibliographic entries was obtained. With the help of Joe McArthur, the Open Access Button API was then employed to detect whether any of the journal articles in the dataset had open access surrogates (for example, self-archived versions in institutional repositories) and, if so, what content these actually provided access to. The API located URLs for 24 of the 82 DOIs corresponding to the articles in the dataset.

I then edited and refined the original dataset to include only the top 60 results. Each result was manually refined and cross-checked to verify that the resulting links matched the correct outputs, to determine what kind of content they provided access to, and to identify the license type and access type of each article’s version of record.

A breakdown of the findings below:

Visualisation of numeralia from the JDSH 60 Articles Altmetric-OA Button Dataset

(Note numbers re OA Button results will not add up as there are overlaps and some results belong to categories not listed).

It must be highlighted that only one of the links located via the Open Access Button API provided access to an article’s full version.

This disciplinarily-circumscribed example from a leading journal in the field of the digital humanities provides evidence for further investigations into the effects of publishers’ embargoes on the ability of institutional open access repositories to fulfil their mission effectively.

The dataset was openly shared on figshare as

Priego, Ernesto (2017): A Dataset Listing the Top 60 Articles Published in the Journal of Digital Scholarship in the Humanities According to the Altmetric Explorer (search from 11 April 2017), Annotated with Corresponding License and Access Type and Results, when Available, from the Open Access Button API (search from 15 May 2017). figshare.


The Wordy Thing

Back in 2014, we suggested that “altmetrics services like the Altmetric Explorer can be an efficient method to obtain bibliographic datasets and track scholarly outputs being mentioned online in the sources curated by these services” (Priego et al 2014). At that time we used the Explorer to analyse a report obtained by searching for the term ‘digital humanities’ in the titles of outputs mentioned at any time as of our query.

It’s been three years since I personally presented that poster at DH2014 in Lausanne, but the topic of publishing practices within the digital humanities remains of great interest to me. It could be thought of as extreme academic navel-gazing, this business of deciding to look into bibliometric indicators and metadata of scholarly publications. For the digital humanities, however, questions of scholarly communications are questions of methodology, as the technologies and practices required for conducting research and teaching are closely related to those required to make the ‘results’ of teaching and research available. For DH insiders, this is closely connected to the good ol’ less-yacking-more-hacking, or rather, no yacking without hacking. Today, scholarly publishing is all about technological infrastructure, or at least about an ever-growing awareness of the challenges and opportunities of ‘hacking’ the modes of scholarly production.

Moreover, the digital humanities have also long been preoccupied with the challenges of getting digital scholarship recognised and rewarded and, also importantly, with the difficulties of ensuring the human, technical and financial preconditions of sustainability. Scholarly publishing, or more precisely ‘scholarly communications’ as we prefer to say today, is also very much focused on those same concerns. If form and content are unavoidably interlinked and codependent in digital humanities practice, surely issues regarding the so-called ‘dissemination’ of said practice through publications remain vital to its development.

Anyway, I have now finally been able to share a dataset based on a report from the Altmetric Explorer looking into the articles published in the Journal of Digital Scholarship in the Humanities (from now on JDSH), one of the leading journals (if not the leading journal) in the field of digital humanities (it was previously titled Literary and Linguistic Computing). I first started looking into which JDSH articles were being tracked by Altmetric as mentioned online for the event organised by Domenico Fiormonte at the University Roma Tre in April this year (the slides from my participation are here).

My motivation was not only to identify which JDSH outputs (and therefore authors, affiliations, topics, methodologies) were receiving online attention according to Altmetric. I wanted, as we had done previously in 2014, to use an initial report to look into what kind of licensing said articles had: whether they were ‘free to read’, paywalled, or labelled with the orange open lock that identifies Open Access outputs.

Back in 2014 we did not have the Open Access Button nor its plugin and API. With it I had the possibility to try to check if any of the articles in my dataset had any openly/freely available versions through the Button. I contacted Joe McArthur from the Button to enquire whether it would be possible to run a list of DOIs through their API in bulk. It was, and we obtained some results.
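For readers curious about what such a bulk check might look like, below is a hedged Python sketch. The endpoint URL and the JSON response field (`url`) are assumptions on my part for illustration; consult the Open Access Button API documentation for the actual interface, and note that the real cross-checking was done with Joe McArthur’s help rather than with this code.

```python
import json
import urllib.request

# Assumed endpoint; check the Open Access Button API docs for the real one.
OAB_ENDPOINT = "https://api.openaccessbutton.org/find?id="

def extract_url(payload):
    """Pull the located open-copy URL out of an API JSON response, if any.
    The 'url' field name is an assumption for illustration."""
    return payload.get("url") or None

def find_open_surrogate(doi):
    """Query the API for a single DOI; return a URL or None."""
    with urllib.request.urlopen(OAB_ENDPOINT + doi) as resp:
        return extract_url(json.load(resp))

def bulk_check(dois):
    """Run a list of DOIs through the API one by one, as described above."""
    return {doi: find_open_surrogate(doi) for doi in dois}
```

Each located URL would then still need the manual checking described above, since a hit may point at a metadata-only or embargoed deposit rather than a full text.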

Here’s a couple of very quick charts visualising some insights from the data.

It should also be highlighted that of the 6 links to institutional repository deposits found via the Open Access Button API, only one gave open access to the full version of the article. The rest were either metadata-only deposits or the full versions were embargoed.

As indicated above, the 60 ‘total articles’ refers to the number of entries in the dataset we are sharing. There are many more articles published in JDSH. The numbers presented represent only the data in question which is in turn the result of particular methods of collection and analysis.

In 2014 we detected that “the 3 most-mentioned outputs in the dataset were available without a paywall”, and we thought that could indicate “the potential of Open Access for greater public impact.” In this dataset, the three articles with the most mentions are also available without a paywall. The most mentioned article is the only one in the set that is licensed with a CC-BY license. The two that follow are ‘free’ articles that require permission for reuse.

The data presented is the result of the specific methods employed to obtain the data. In this sense this data represents as much a testing of the technologies employed as of the actual articles’ licensing and open availability. This means that data in columns L-P reflect the data available through the Open Access Button API at the moment of collection. It is perfectly possible that ‘open surrogates’ of the articles listed are available elsewhere through other methods. Likewise, it is perfectly possible that a different corpus of JDSH articles collected through other methods (for example, of articles without any mentions as tracked by Altmetric) have a different proportion of license and access types etc.

As indicated above the licensing and access type of each article were identified and added manually and individually. Article DOI’s were accessed one by one with a computer browser outside/without access to university library networks, as the intention was to verify if any of the articles were available to the general public without university library network/subscription credentials.

This blog post and the deposit of the data are part of a work in progress, shared openly to document ongoing work and to encourage further discussion and analyses. It is hoped that quantitative data on the limited level of adoption of Creative Commons licenses and Institutional Repositories within a clearly circumscribed corpus can motivate reflection and debate.


I am indebted to Joe McArthur for his kind and essential help cross-checking the original dataset with the OA Button API, and to Euan Adie and all the Altmetric team for enabling me to use the Altmetric Explorer to conduct research at no cost.

Previous Work Mentioned

Priego, Ernesto; Havemann, Leo; Atenas, Javiera (2014): Online Attention to Digital Humanities Publications (#DH2014 poster). figshare. Retrieved: 18:46, Aug 04, 2017 (GMT).

Priego, Ernesto; Havemann, Leo; Atenas, Javiera (2014): Source Dataset for Online Attention to Digital Humanities Publications (#DH2014 poster). figshare. Retrieved: 17:52, Aug 04, 2017 (GMT)

Priego, Ernesto (2017): Aprire l’Informatica umanistica / Abriendo las humanidades digitales / Opening the Digital Humanities. figshare. Retrieved: 18:00, Aug 04, 2017 (GMT)

The 2016 Altmetric Top 100 Outputs with ‘Comics’ as Keyword


Any frequent readers of this blog will be aware I am interested in article level metrics. I am particularly interested in the work done by Altmetric. Last week they published their annual top 100 list. I wrote this post about it.

 The Altmetric Explorer is a tool for measuring the attention that scholarly articles receive online, and its intuitive user interface works as a live searchable database that allows users to browse the journals and repositories Altmetric tracks and obtain detailed reports.

On a weekly basis Altmetric captures hundreds of thousands of tweets, blog posts, news stories, Facebook walls and other content that mentions scholarly articles on the Web. The Explorer can browse, search and filter this data. The data can be exported by the user as ‘reports’ as simple text or spreadsheets, which can be then analysed in different forms. For example, The Explorer provides demographic data of the Twitter users found mentioning specific outputs, and thus works as a mechanism for the study of academic users of social media.

In the past few years I have often suggested, online and in talks, workshops and lectures, that the Altmetric Explorer can be useful to researchers as well. Librarians with access to the tool can help students and researchers get new views of recent articles that are receiving attention online. People often focus on ‘altmetrics’ as indicators of online activity around published outputs, but I often insist the Altmetric Explorer is also useful as a tool for searching, discovering, collecting, creating, archiving, sharing and analysing bibliographic reference collections as datasets: not just bibliographic data (identifiers and/or URLs) but also historical data on any metrics the service has tracked and quantified at the time of the data query/collection.

Inspired by Altmetric’s annual Top 100 list, I used the Altmetric Explorer to search for the top articles with the keyword ‘comics’ mentioned in the past year. I did this particular search on the morning of Tuesday 20 December 2016. Dating the collection (and indicating the specific query) is always important, as social media metrics are dynamic, not static (i.e. we expect an output’s altmetrics to change over time).

After my query I saved my search as usual as a ‘workspace’ in the app and then exported the dataset as a CSV file. I then manually cleaned and refined the data to obtain a file listing the top 100 references specifically on comics, including their altmetrics. Data refining was needed to ensure the list included only articles about comics, eliminating non-relevant outputs (i.e. those that were not about comics), correcting text-rendering errors, adding missing data (such as output titles missing from the initial export) and limiting the set to 100 items by deleting the extra outputs.*
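Parts of that refining could be sketched in Python along these lines. The column names (`doi`, `score`, `title`) and the relevance test are illustrative assumptions; in practice the cleaning was done by hand, and judging whether an output is really about comics cannot be automated away.

```python
def refine(rows, is_relevant):
    """Drop duplicate DOIs and rows judged not to be about comics.
    `is_relevant` stands in for the manual relevance check."""
    seen, kept = set(), []
    for r in rows:
        if r["doi"] not in seen and is_relevant(r):
            seen.add(r["doi"])
            kept.append(r)
    return kept

def top_n_by_score(rows, n=100):
    """Sort refined rows by Altmetric Attention Score (descending)
    and keep only the first n, mirroring the manual trimming step."""
    ranked = sorted(rows, key=lambda r: int(r["score"]), reverse=True)
    return ranked[:n]
```

The rows themselves would come from the exported CSV, e.g. via `csv.DictReader`, before being written back out as the shared dataset.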

I have deposited and shared the dataset as

Priego, Ernesto (2016): The 2016 Altmetric Top 100 Outputs with ‘Comics’ as Keyword Mentioned in the Past 1 Year. figshare. Retrieved: 17 06, Dec 21, 2016 (GMT)

Hopefully it will be of interest to some of you out there. For comparison here’s these other datasets I have deposited on figshare in previous years:

Priego, Ernesto (2015): Almetrics of articles from the comics journals mentioned at least once in the past 1 year as tracked by Altmetric (20 August 2015). figshare. Retrieved: 17 21, Dec 21, 2016 (GMT)


Priego, Ernesto (2014): Comics Journals Articles Tracked by Altmetric in the last year (Dec 2013-Dec 2014). figshare. Retrieved: 17 23, Dec 21, 2016 (GMT)


Though the two datasets above are outputs from different search queries (focusing on specific comics journals tracked by Altmetric rather than on any articles with the keyword ‘comics’), we should be able to continue collecting data for future transversal studies.

Having yearly datasets obtained from the same queries, over a series of years, would provide evidence of comics scholarship’s presence online, and of the field’s (and Altmetric’s) evolving practices.

*It is possible the degree of relevance varies. Some outputs do not have ‘comics’ in their title but do discuss comics, for example ‘A randomized study of multimedia informational aids for research on medical practices: Implications for informed consent’ (Kraft et al 2016). It is possible however that a non-comics article or two remained, if you spot one do please let me know or leave a comment on the figshare output and I will correct and create a new version. It might also be noted that various outputs included are from The Conversation, which is not an academic journal, but it is tracked by Altmetric as it focuses on academic research news written by academics. For information and context about how Altmetric sources the data please read this.

Insights from the Altmetric Top 100 2016

Altmetric Top 100 2016 Affiliations. via Altmetric

The Altmetric Top 100 2016 was published yesterday. If you click on the green ‘read more about this list’ button, you’ll see useful analysis of the data.

[I also wrote about the Altmetric Top 100 2014, here and here.]

It’s very welcome that this year Altmetric has deposited the article and affiliation data as two datasets as a collection on figshare:

Engineering, Altmetric (2016): Altmetric Top 100 2016. figshare. Retrieved: 11 24, Dec 14, 2016 (GMT)

This time the source data provides greater insights, particularly each article’s access type (Open Access, ‘Free’ or paywalled), type of content (article, letter, etc.) and subject.

Altmetric has already provided an analysis of this data (percentage of OA outputs in the list; countries of affiliations, institutions etc.) but having access to the source data means their analysis, visualisations and findings are actually reproducible (reproducibility was identified as a topic gaining interest; see Cat Williams’ post here). By providing access to the source data openly, other types of analysis are not only possible but encouraged (for example text and content analysis of the top 100 output titles).

One insight for me is that this list again demonstrates the dominance of the usual countries of affiliation and, to a certain extent, of the same journals (considering that Altmetric tracks a selection of publications, not all publications that exist).

I was interested in finding out whether the Top 100 would include any articles authored or coauthored by researchers with a Mexican institution as affiliation. There are two:

  1. A genome-wide association scan in admixed Latin Americans identifies loci influencing facial and scalp hair features. Nature Communications 7, Article number: 10815 (2016) doi:10.1038/ncomms10815 (Published online:01 March 2016)
  2. Beverage purchases from stores in Mexico under the excise tax on sugar sweetened beverages: observational study. BMJ 2016;352:h6704 doi: (Published 06 January 2016)

It is notable that both articles are the result of international coauthorship: the Nature Communications article includes authors from other Latin American countries (Argentina, Chile, Colombia); the BMJ one, from Mexico and the United States. Importantly, both articles are open access.

I was also interested in seeing whether any Information Science or Computer Science research had made it into the list. There is only one article whose subject was categorised as “Information and Computer Sciences”:

Mastering the game of Go with deep neural networks and tree search. Nature 529,484–489 (28 January 2016) doi:10.1038/nature16961

This is a paywalled article authored by a team of 21 authors with Google DeepMind (London, UK) as affiliation.

I believe access to this data is useful to understand the evolving landscape of scholarly communications. It can also help us authors to gain insights into what kind of research is receiving attention online.

For example, the data seems to contribute to a body of anecdotal and bibliometric evidence indicating that, for researchers with affiliations in ‘developing’ nations, open access and international collaboration remain key to greater visibility.

This year’s data also shows, again, that some countries (in the case of Africa, a whole continent), fields, and journals, remain under-represented or not present at all. It should also be noted that the only Computer Science article in the list is not by researchers affiliated to universities but to Google.

Yesterday I tweeted some quick thoughts after checking out the datasets, and compiled them using the new-ish ‘Moments’ feature on Twitter, which, for what it’s worth, I have embedded below.




Inequality, Paywalled: Update from a Literature Review

I have done revisions to this post since publication.

[I don’t have time. What is this about?

My view is that altmetrics are not merely tools for the measurement of online attention but tools that can help us discover the literature that is being tracked as mentioned. I used the Altmetric Explorer as a tool to discover articles about inequality. I cleaned the data into three tables to reflect only the articles that interested me from three journals and then checked them for access and license type. Most are paywalled, and where access is free the licensing is not clear. Scroll down to see the tables, or download the dataset here.

It’s better if you read the post, though. ;-) ]



Keep off the grass. Photo CC-BY Kyknoord, some rights reserved.


Using the Altmetric Explorer to Discover Literature

I’ve been doing some research on the concept of ‘inequality’ from an economic and sociological perspective to add background to ongoing research on academic publishing and ‘monopolies of knowledge’. I am interested in finding out more about the potential relationships between inequality of access to information (particularly access to peer-reviewed research publications) and other forms of inequality affecting social and economic development.

As you may (or may not) know, I am also interested in the potential of altmetrics as tools to help us discover research outputs. Some may not like it, but needless to say people do search for and discover all sorts of information online. To give an example, these days many of us rarely get invited to a party with a paper invitation sent in the post (unless it’s a wedding, and even that is culture- and country-dependent now); it’s likely, however, that there will be a Facebook invite, an Instagram account, or an email. OK, you may hate weddings or have never been invited to one. You must like music. If you are reading this, you are likely to know people who discover new (and old!) music by looking into what other people listen to on apps like Spotify or Soundcloud (yes, this sounds so old and so obvious!). We trust other people to recommend us stuff. (Think of how many of us travel today: TripAdvisor is a good example too.)

More to the point, libraries and library web sites are no longer the only gateways to academic information (why should they be?). You don’t have to be a declared open education advocate to share, search for and discover interesting materials on Slideshare or YouTube. The distinction between ‘social networking’ or ‘social media’ sites and the rest of the Web is at best artificial: most platforms today imply inter-linking and therefore social interaction. Surely, I think, web platforms tracking social media activity, like Altmetric, can be used to discover what research people are mentioning online. One does not need a personal or institutional Altmetric account to discover other outputs from the articles themselves when they have Altmetric widgets. In other words, my view is that altmetrics are not merely tools for the measurement of online attention but tools that can help us discover the literature that is being tracked as mentioned.

Collecting a bibliography is an important part of a literature review. We may collect items we are interested in reading before we properly review, or collect as we read and review (hopefully, once one is reading, one follows leads in an article, checks the references and notes, clicks on links, gets elsewhere). To discover published research I have used the Altmetric Explorer many times before (see, as an example, "Ebola: Access and Licenses of 497 Papers Crowdsourced in 7 Days", 14/08/2014).

Three Sets of Articles on Inequality

Recently I have been using it to search for articles on the topic of ‘inequality’. I am interested in which articles on this topic are being tracked by Altmetric as mentioned online, but I am also interested in the access and license types of the outputs tracked.

As I normally do in my research workflow, I have been exporting the results of my searches and then cleaning the data. I do this by manually applying spreadsheet filters, adding and deleting columns, and using OpenRefine to deduplicate and standardise the data. I then check each output (i.e. I click on each link) and make a note of whether I can access the full version without academic library credentials.
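As an illustration only, the deduplicate-and-standardise step could be sketched in a few lines of Python. The column names "Title", "DOI" and "Score" are hypothetical (real Explorer exports may differ), and OpenRefine's clustering does much more than this:

```python
def clean_rows(rows):
    """Deduplicate and order rows exported from an Altmetric Explorer search.

    `rows` is a list of dicts with hypothetical keys "Title", "DOI" and
    "Score". Whitespace and case are standardised before deduplicating by
    DOI (a crude stand-in for OpenRefine's clustering), and the result is
    ordered from the highest score to the lowest.
    """
    seen = set()
    cleaned = []
    for row in rows:
        doi = row["DOI"].strip().lower()
        if doi in seen:
            continue  # drop duplicate records, keeping the first occurrence
        seen.add(doi)
        cleaned.append({"Title": row["Title"].strip(),
                        "DOI": doi,
                        "Score": float(row["Score"])})
    return sorted(cleaned, key=lambda r: r["Score"], reverse=True)
```

The access-type check itself remains manual: no export column tells you whether you can read the full version without credentials.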

In this case I am sharing with you three sets of articles, each corresponding to a different journal that has published articles on inequality tracked as mentioned online by Altmetric within the last year. In the tables below I have left the Altmetric score in timeframe (one year) in the first column and have ordered the outputs from the highest score to the lowest. Having checked each article one by one manually, not using any institutional credentials or IP, I have indicated the access type of each article in the last column. As Altmetric scores can change over time, often quite quickly, I have also included the most recent mention online according to Altmetric. This is of course not live data; it merely reflects the score and the most recent mention at the time of my data collection.

Information, Communication & Society

Altmetric Score in timeframe Title URL Most recent mention online according to Altmetric Access Type

Racial formation, inequality and the political economy of web traffic Tue, 16 Aug 2016 20:46:58 +0000 Free access. License not clear.
The Trend of Class, Race, and Ethnicity in Social Media Inequality Fri, 26 Apr 2013 19:03:50 +0000 Paywalled
Social networking sites and low-income teenagers: between opportunity and inequality Mon, 07 Mar 2016 11:57:49 +0000 Paywalled
The contemporary US digital divide: from initial access to technology maintenance Fri, 19 Jun 2015 16:17:31 +0000 Paywalled
The Digital Production Gap in Great Britain Wed, 31 Jul 2013 14:46:59 +0000 Paywalled
Reconceptualizing Digital Social Inequality Tue, 02 Feb 2016 20:04:21 +0000 Paywalled
The disability divide in internet access and use Tue, 06 Dec 2011 17:05:28 +0000 Paywalled
Mapping the two levels of digital divide: Internet access and social network site adoption among older adults in the USA Tue, 24 Nov 2015 18:43:40 +0000 Paywalled

British Journal of Sociology

Altmetric Score in timeframe Title URL Most recent mention Access Type

After Piketty? Wed, 30 Sep 2015 08:47:42 +0000 Paywalled
Capital in the twenty‐first century: a critique Thu, 07 May 2015 09:55:36 +0000 Paywalled
Gendering inequality: a note on Piketty’s Capital in the Twenty-First Century. Thu, 07 May 2015 09:55:58 +0000 Paywalled
The politics of Piketty: what political science can learn from, and contribute to, the debate on Capital in the Twenty-First Century Thu, 21 May 2015 13:11:02 +0000 Paywalled
Income inequality, poverty and crime across nations Mon, 15 Dec 2014 18:00:58 +0000 Paywalled
Why ‘class’ is too soft a category to capture the explosiveness of social inequality at the beginning of the twenty‐first century Tue, 13 Jan 2015 00:56:42 +0000 Free access. Permissions required via RightsLink.
Where’s the capital? A geographical essay. Thu, 07 May 2015 09:56:17 +0000 Paywalled
Capital and time: uncertainty and qualitative measures of inequality. Thu, 07 May 2015 09:55:00 +0000 Paywalled
Class and comparison: subjective social location and lay experiences of constraint and mobility Sun, 31 Jul 2016 08:34:47 +0000 Paywalled
Alleviating poverty or reinforcing inequality? Interpreting micro-finance in practice, with illustrations from rural China. Sat, 17 Oct 2015 07:50:32 +0000 Paywalled
Configurations of gender inequality: the consequences of ideology and public policy Mon, 02 Mar 2015 00:00:00 +0000 Paywalled
Who do you think they were? How family historians make sense of social position and inequality in the past Mon, 07 Jan 2013 09:25:05 +0000 Paywalled
Tom Clark and Anthony Heath 2015 [2014] Hard Times: Inequality, Recession, Aftermath, New Haven and London: Yale University Press Mon, 28 Sep 2015 10:59:31 +0000 Paywalled
Cultural capital or relative risk aversion? Two mechanisms for educational inequality compared Thu, 03 Mar 2016 09:20:51 +0000 Paywalled
Piketty’s capital and social policy. Wed, 24 Dec 2014 10:30:35 +0000 Paywalled
Declining inequality? The changing impact of socio-economic background and ability on education in Australia Tue, 18 Sep 2012 02:04:48 +0000 Paywalled

Journal of Economic Perspectives

Altmetric Score in timeframe Title URL Most recent mention Access Type

Why Hasn’t Democracy Slowed Rising Inequality? Mon, 15 Aug 2016 12:20:10 +0000 Free Access (“Complimentary”)
Income Inequality, Equality of Opportunity, and Intergenerational Mobility Wed, 25 May 2016 10:20:30 +0000 Free Access (“Complimentary”)
The Top 1 Percent in International and Historical Perspective Tue, 02 Aug 2016 14:35:17 +0000 Free Access (“Complimentary”)
Consumption Inequality Sun, 28 Aug 2016 09:59:44 +0000 Free Access (“Complimentary”)
The Rise and Decline of General Laws of Capitalism Wed, 03 Aug 2016 00:00:00 +0000 Free Access. License not clear.
The Inheritance of Inequality Wed, 27 Jul 2016 22:41:32 +0000 Free Access (“Complimentary”)
Crime, the Criminal Justice System, and Socioeconomic Inequality Fri, 08 Jul 2016 22:54:56 +0000 Free Access (“Complimentary”)
Pareto and Piketty: The Macroeconomics of Top Income and Wealth Inequality Tue, 10 Nov 2015 08:02:28 +0000 Free Access. License not clear.

I am not sure whether this humble blog is tracked by Altmetric, so (ironically) I may or may not be contributing to the Altmetric scores of the outputs above by linking to them. (It is telling that altmetrics can be tracked even when people have only reached abstracts, not full texts.) I am not listing them above because I necessarily recommend them, but as a small sample of articles on inequality from recognised journals, with their access type noted.

I do not know if the authors of these articles have deposited open access versions of these papers in their respective institutional repositories or elsewhere (if you are so inclined, you can check the three journals’ archiving policies here), and I am not publishing this post because I cannot personally access the articles above (so thank you very much indeed, but please do not contact me, dear reader, to offer me the PDFs via email or Twitter). I am not saying the articles above are all there is on the subject; I am just sharing these results and detailing their access type (which you cannot easily determine unless you click on them and try to access them; and even if you can access the full versions, you may find it difficult to tell why you happen to have access to them).

In this post I have wanted to make a very simple point: following the links to the publishers’ versions of record of these articles discovered via the Altmetric Explorer, the access conditions were the ones detailed above.


It could be argued that, as an academic, I have used the wrong tool to access these resources. It could be said that in my case, as an academic based in London, UK, it is my own fault for expecting to access these resources from outside my library (you say you can’t access them, dear reader? Your fault!). What I am trying to do here is to see and share what happens when someone who normally has access to this kind of research steps outside their traditional/standard discovery tools and/or position of privilege. If you don’t have the right credentials, how much can you access? [I must also note that the Altmetric Explorer requires registration and normally membership too; however, all the links listed above can be reached via regular search engines and Google Scholar.]

Things are changing, slowly. Academics’ distrust of and complaints about the low quality and lack of trustworthiness of information found on the Web are common, but at the same time we have allowed paywalled online academic journals to remain (to me, weirdly) disconnected from the rest of the Web, with links leading to abstracts that promise you a full version if you pay or have the right library credentials. This breaks the flow of information that has made the Web the amazing invention it is, and contributes to the separation between the outputs of higher education and the ‘general’ public.

In my opinion it is a serious problem that, if you don’t have the right credentials, so much detective work is required to access some important research (or to elucidate articles’ licensing conditions, even when they are ‘free’ or ‘complimentary’). Others, as we know, can’t be bothered at all and simply bypass the hoops altogether, against all policies. The more barriers you impose, the more people will want to circumvent them. Ideally.

In reality, it is more likely that paywalled outputs remain inaccessible and invisible to the larger public, and perhaps even more so to those affected by the very conditions studied in them. Even as an academic or student in an elite institution it is often hard (read: not straightforward, not friction-free) to access them! A non-academic searching for this research online is likely to have already transcended many of the structural barriers created by inequality. Once you finally get to an interesting article, how great it must be to be greeted by a huge ‘pay or keep off’ sign.

Some might say my hypothetical non-academic individual seeking access does not really exist. Some have suggested to me that there is no evidence of interest from the public, and that those who have access are the only ones interested. That the non-academic public wouldn’t understand the research anyway. That those interested could try harder to find surrogates. That if they do exist they are likely to know people who can ‘share’ the research with them anyway. The list of justifications for the current system can be long.

Having lived, studied and worked in a developing country, I know that intelligent, curious, well-informed bilingual individuals who have no access to versions of record do exist. These are people who face the inequalities of access to scientific information. They may be relatively privileged, in that they have transcended their most pressing needs and are able to seek out research. This, however, does not mean they do not exist or that their needs are not important.

I know interested individuals who are not academics exist here in the UK too. I also know for a fact that there are academics worldwide who do not have access to a lot of paywalled research; I am often one of them myself. I know there are others because I know them personally, and because we know that not all libraries can afford to subscribe to the same ‘bundles’ (for the latter there is a growing body of evidence). My personal experience does not count as scientific evidence, but it matters to me and I know it matters to others. I question why we assume that, if there is supposedly no current public demand for research, it is acceptable to paywall it and not encourage further public interest and demand.

I am aware it is getting boring because I have been repeating this for several years now, but legal ‘frictionless sharing‘ wouldn’t go amiss, especially for this type of research. We call it “open access”.


Priego, Ernesto (2016): Inequality: Three sets of Journal Article Titles and URLs/DOIs from Three Different Journals, with Altmetric Score in Timeframe (1 year), Last Mention at the Time of Collection and Access Type Noted. figshare. [CC-0].

A #HEFCEmetrics Twitter Archive (Friday 16 January 2015, Warwick)

HEFCE logo

The HEFCE metrics workshop: metrics and the assessment of research quality and impact in the arts and humanities took place on Friday 16 January 2015, 1030 to 1630 GMT at the Scarman Conference Centre, University of Warwick, UK.

I have uploaded a dataset of 821 Tweets tagged with #HEFCEmetrics (case not sensitive):

Priego, Ernesto (2015): A #HEFCEmetrics Twitter Archive (Friday 16 January 2015, Warwick). figshare.

The Tweets in the dataset were publicly published and tagged with #HEFCEmetrics between 16/01/2015 00:35:08 GMT and 16/01/2015 23:19:33 GMT. The collection period corresponds to the day the workshop took place, in real time.

The Tweets contained in the file were collected using Martin Hawksey’s TAGS 6.0. The file contains 2 sheets.

Only users with at least 2 followers were included in the archive. Retweets have been included. An initial automatic deduplication was performed but data might require further deduplication.

Please note the data in this file is likely to require further refining and even deduplication. The data is shared as is. The contents of each Tweet are responsibility of the original authors. This dataset is shared to encourage open research into scholarly activity on Twitter. If you use or refer to this data in any way please cite and link back using the citation information above.
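For those curious, the kind of deduplication mentioned above could be sketched as follows, assuming the TAGS archive has been read into a list of dicts keyed by column name ("id_str" is the Tweet ID column in TAGS exports; treat the exact key as an assumption here):

```python
def dedupe_tweets(tweets):
    """Remove duplicate rows from a TAGS-style Twitter archive.

    Each tweet is a dict; "id_str" holds the unique Tweet ID. Retweets
    are kept, as in the shared dataset: only repeated rows carrying the
    same Tweet ID are dropped, preserving the first occurrence.
    """
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet["id_str"] not in seen:
            seen.add(tweet["id_str"])
            unique.append(tweet)
    return unique
```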

For the #HEFCEmetrics Twitter archive corresponding to the one-day workshop hosted by the University of Sussex on Tuesday 7 October 2014, please go to

Priego, Ernesto (2014): A #HEFCEmetrics Twitter Archive. figshare.

You might also be interested in

Priego, Ernesto (2014): The Twelve Days of REF- A #REF2014 Archive. figshare.

#HEFCEMetrics: More on Metrics for the Arts and Humanities

Today I’ll participate via Skype at the HEFCE Metrics and the assessment of research quality and impact in the Arts and Humanities workshop, commissioned by the independent review panel. I share below some notes. For previous thoughts on metrics for research assessment, see my 23 June 2014 post.

What metrics?

Traditionally, two main forms of metrics have been used to measure the “impact” of academic outputs: usage statistics and citations.

“Usage statistics” usually refers mainly to two things: downloads and page views (though they are often much more than that). These statistics are often sourced from individual platforms through their web logs and Google Analytics. Beyond downloads and page views, the data platform administrators collect from web logs and Google Analytics includes indicators of which operating systems and devices are being used to access content, and the landing pages for the most popular content. This data is often presented in custom-made reports that collate the different data, and the methods of collection and collation vary from platform to platform and user to user. The methods of collection are not transparent and often not reproducible.
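To make the idea concrete, here is a toy sketch of counting downloads from a web server access log. The Common Log Format pattern and the ".pdf" heuristic are illustrative assumptions, not how any particular platform actually does it, which is precisely part of the reproducibility problem:

```python
import re
from collections import Counter

# Matches the request and status fields of a Common Log Format line,
# e.g. ... "GET /article/1.pdf HTTP/1.1" 200 123
LOG_LINE = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

def count_downloads(log_lines, suffix=".pdf"):
    """Count successful (HTTP 200) requests for paths ending in `suffix`."""
    counts = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and m.group("status") == "200" and m.group("path").endswith(suffix):
            counts[m.group("path")] += 1
    return counts
```

Even this trivial example involves choices (which status codes count, whether bots are filtered, what counts as a "download") that platforms rarely document.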

Citations, on the other hand, can be obtained from proprietary databases like Scopus and Web of Knowledge, or from platforms like PubMed (in the sciences), Google Scholar, and CrossRef. These platforms have traditionally favoured content from the sciences (not the arts and humanities). Part of the reason is that citations are more easily tracked when the content is published with a Digital Object Identifier, a term that remains largely obscure and esoteric to many in the arts and humanities. Citations traditionally take longer to accrue, and therefore take longer to collect. Again, the methods for their collection are not always transparent, and the source data is more often than not closed rather than open. Citations privilege more ‘authoritative’ content from publishers that have the necessary infrastructure and whose content has been available for a longer time.


Altmetrics is “the creation and study of new metrics based on the Social Web for analyzing, and informing scholarship” (Priem et al 2010). Altmetrics normally employ APIs and algorithms to track and create metrics from activity on the web (normally social media platforms such as Twitter and Facebook, but also online reference managers like Mendeley and tracked news sources) around the ‘mentioning’ (i.e. linking) of scholarly content. Scholarly content is recognised by its having an identifier such as a DOI, PubMed ID, arXiv ID, or Handle. This means that outputs without these identifiers cannot be tracked and/or measured. Altmetrics are so far obtained through third-party commercial services such as Altmetric and ImpactStory.
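As a rough illustration of identifier-based tracking, Altmetric exposes a public REST endpoint keyed by DOI (api.altmetric.com/v1/doi/<doi>). The sketch below only builds such a URL and picks a couple of fields out of the JSON the endpoint returns; no request is actually made, and the field names shown ("score", "cited_by_tweeters_count") are examples of what such a payload may contain:

```python
import json
from urllib.parse import quote

def details_url(doi: str) -> str:
    """Build the public Altmetric API v1 URL for a given DOI."""
    return "https://api.altmetric.com/v1/doi/" + quote(doi, safe="/")

def parse_details(payload: str) -> dict:
    """Extract a few example fields from a JSON response body."""
    data = json.loads(payload)
    return {"title": data.get("title"),
            "score": data.get("score"),
            "tweeters": data.get("cited_by_tweeters_count")}
```

The point is structural: without a DOI (or PubMed ID, arXiv ID, Handle) there is no key to query, and so nothing to track.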

Unlike citations, altmetrics (also known as “alternative metrics”, or “article-level metrics” when usage statistics are included too) can be obtained almost immediately, and since in some cases online activity can be hectic, the numbers can grow quite quickly. Altmetrics providers do not claim to measure “research quality” but “attention”; they agree that the metrics alone are not sufficient indicators and that context is therefore always required. Services like Altmetric, ImpactStory and PlumX have interfaces that collect the tracked activity in one single platform (which can also be linked to with widgets embeddable on other web pages). This means that these platforms also function as search and discovery tools where users can explore the “conversations” happening around an output online.

The rise of altmetrics and a focus on their role as a form or even branch of bibliometrics, infometrics, webometrics or scientometrics (Cronin, 2014) has taken place in the historical and techno-sociocultural context of larger transformations in scholarly communications. The San Francisco Declaration on Research Assessment (DORA, 2012) [PDF], for example, was prompted by the participation of altmetrics tools developers, researchers and open access publishers, making the general recommendation of not using journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual researcher’s contributions, or in hiring, promotion, or funding decisions.

The technical and cultural premise of altmetrics services is that if academics are using social media (various web services such as Twitter and Facebook, only made possible by APIs) to link to (“mention”) online academic outputs, then a service “tapping” into those APIs would allow users such as authors, publishers, libraries, researchers and the general public to conduct searches across information sources from a single platform (in the form of a Graphical User Interface) and obtain results from all of them. Through an algorithm, it is possible to quantify, summarise and visualise the results of those searches.

The prerequisites for altmetrics comprise a complex set of cultural and technological factors. Three infrastructural factors are essential:

  1. Unlike traditional usage statistics, altmetrics can only be obtained if the scholarly outputs have been published online with Digital Object Identifiers or other permanent identifiers.
  2. The online platforms that might link to these outputs need to be known, predicted and located by the service providing the metrics.
  3. Communities of users must exist using the social media platforms tracked by altmetrics services linking to these outputs.

The scholarly, institutional, technological, economic and social variables are multiple and platform and culture-dependent, and will vary from discipline to discipline and country to country.

Standards and Best Practices

Michelle Dalmau, Dave Scherer, and Stacy Konkiel led The Digital Library Federation Forum 2013 working session titled Determining Assessment Strategies for Digital Libraries and Institutional Repositories Using Usage Statistics and Altmetrics, and produced a series of recommendations for “developing best practices for assessment of digital content published by libraries”. Dalmau et al emphasised the importance of making data and methods of collection transparent, as well as of including essential context with the metrics.

As open access mandates and the REF make “impact” case studies more of a priority for researchers, publishers and institutions, it is important to insist that any metrics and their analysis, provided by either authors, publishers, libraries or funding bodies, should be openly available “for reuse under as permissive a license as possible” (Dalmau, Scherer and Konkiel).

Arts and Humanities

If altmetrics are to be used in some way for research assessment, the stakeholders involved in arts and humanities scholarly publishing need to understand the technical and cultural prerequisites for altmetrics to work. There are a series of important limitations that justify scepticism towards altmetrics as an objective “impact” assessment method. A bias towards Anglo-American and European sources, as well as towards STEM disciplines, casts a shadow on the growth of altmetrics for non-STEM disciplines (Chimes, 2014). Many academic journals, particularly in the arts and humanities, have yet to establish a significant, sustainable online presence, and many still lack the DOIs that would enable their automated and transparent tracking.

At their best, altmetrics tools are meant to encourage scholarly activity around published papers online. It can seem, indeed, like a chicken-and-egg situation: without healthy, collegial, reciprocal cultures of scholarly interaction on the web, mentions of scholarly content will not be significant. Simultaneously, if publications do not provide identifiers like DOIs, and authors, publishers and/or institutions do not perceive any value in sharing their content, altmetrics will again be less significant. Altmetrics can work as search and discovery tools for scholarly communities around academic outputs on the web, but they cannot and should not be thought of as unquestionable proxies of either “impact” or “quality”. The value of these metrics lies in providing us with indicators of activity; any value obtained from them can only be the result of asking the right questions, providing context and doing the legwork: assessing outputs in their own right and in their own context.

Libraries could do more to create awareness of the potential of altmetrics within the arts and humanities. The role of the library, through its Institutional Repository (IR), in encouraging online mentioning and the development of impact case studies should be readdressed, particularly if ‘Green’ open access is going to be the mandated form of access. Some open access repositories are already using them (City University London’s open access repository has had Altmetric widgets for its items since January 2013), but the institution-wide capabilities of some of the altmetrics services are fairly recent (Altmetric for Institutions was officially launched in June 2014). There is much work to be done, but the opportunity for cultural change that altmetrics can contribute to seems too good to waste.


HEFCE Metrics: A one-day workshop hosted by the University of Warwick

University of Warwick Faculty of Arts banner

Metrics and the assessment of research quality and impact in the Arts and Humanities

A one-day workshop hosted by the University of Warwick, as part of the Independent Review of the Role of Metrics in Research Assessment.

Date: Friday 16th January 2015 (10:30 to 16:30)

Location: Scarman Conference Centre, University of Warwick

The workshop will have the following objectives:

1. Offering a clear overview of the progress to date in the development of metrics of relevance to the arts and humanities, and of persisting challenges.

2. Exploring the potential benefits and drawbacks of metrics use in research assessment and management from the perspective of disciplines within the arts and humanities.

3. Generating evidence, insights and concrete recommendations that can inform the final report of the independent metrics review.

The workshop will be attended by several members of the metrics review steering group, academics and stakeholders drawn from across the wider HE and research community.

Confirmed speakers include:

  • Prof. Jonathan Adams, King’s College London
  • Prof. Geoffrey Crossick, AHRC Cultural Value Project and Crafts Council
  • Prof. Maria Delgado, Queen Mary, University of London
  • Dr Clare Donovan, Brunel University
  • Dr Martin Eve, University of Lincoln and Open Library of Humanities
  • Prof. Mark Llewellyn, Director of Research, AHRC
  • Dr Alis Oancea, University of Oxford
  • Dr Ernesto Priego, City University London
  • Prof. Mike Thelwall, University of Wolverhampton (member of the HEFCE review steering group)
  • Prof. Evelyn Welch, King’s College London

Please register here.

Comparing Altmetric Scores of 2014 Top 25 Paywalled and Top 25 Open Access Outputs

[For context, please see this and this].

If you are new to altmetrics please read:

Comparison of scores of top 25 outputs by access type

Top 25 Paywalled Outputs in 2014 Altmetric Top 100 List

Altmetric Score in timeframe Title Journal URL
4822.008 Variation in Melanism and Female Preference in Proximate but Ecologically Distinct Environments Ethology
3499.068 Artificial sweeteners induce glucose intolerance by altering the gut microbiota Nature
2985.992 Stimulus-triggered fate conversion of somatic cells into pluripotency Nature
2065.172 Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak Science
1868.194 High-resolution global maps of 21st-century forest cover change Science
1837.974 Association of nut consumption with total and cause-specific mortality New England Journal of Medicine
1767.556 Female Penis, Male Vagina, and Their Correlated Evolution in a Cave Insect Current Biology
1691.35 Does Nursing Assistant Certification Increase Nursing Student’s Confidence Level of Basic Nursing Care When Entering a Nursing Program? Journal of Professional Nursing
1596.304 Observation of Dirac monopoles in a synthetic magnetic field Nature
1570.614 Bidirectional developmental potential in reprogrammed cells with acquired pluripotency Nature
1450.328 Post-study caffeine administration enhances memory consolidation in humans Nature Neuroscience
1426.218 A semi-synthetic organism with an expanded genetic alphabet Nature
1403 The Pen Is Mightier Than the Keyboard: Advantages of Longhand Over Laptop Note Taking Psychological Science
1379.79 Sleep Deprivation and False Memories Psychological Science
1343.46 Global sodium consumption and death from cardiovascular causes New England Journal of Medicine
1333.33 Nurse staffing and education and hospital mortality in nine European countries: a retrospective observational study The Lancet
1280.024 An estimation of the number of cells in the human body Annals of Tropical Medicine & Parasitology
1242.21 Young blood reverses age-related impairments in cognitive function and synaptic plasticity in mice Nature Medicine
1208.974 An Earth-Sized Planet in the Habitable Zone of a Cool Star Science
1180.632 Effects of Low-Carbohydrate and Low-Fat Diets: A Randomized Trial Annals of Internal Medicine
1140.892 Emergence of Zaire Ebola Virus Disease in Guinea – Preliminary Report New England Journal of Medicine
1076.698 Social psychology. Just think: the challenges of the disengaged mind. Science
1026.624 Vaccines are not associated with autism: An evidence-based meta-analysis of case-control and cohort studies Vaccine
1017.118 Ebola Virus Disease in West Africa – The First 9 Months of the Epidemic and Forward Projections New England Journal of Medicine
1017.05 The complete genome sequence of a Neanderthal from the Altai Mountains Nature

Top 25 Open Access Outputs in 2014 Altmetric Top 100 List

Score in timeframe Title Journal URL
5043.756 Experimental evidence of massive-scale emotional contagion through social networks Proceedings of the National Academy of Sciences of the United States of America
2955.068 Dogs are sensitive to small variations of the Earth’s magnetic field Frontiers in Zoology
2733.264 Christmas 2013: Research: The survival time of chocolates on hospital wards: covert observational study British Medical Journal
2391.834 Epidemiological modeling of online social network dynamics arXiv
2245.134 Searching the Internet for evidence of time travelers arXiv
2159.53 Conscious Brain-to-Brain Communication in Humans Using Non-Invasive Technologies PLOS ONE
2145.898 Were James Bond’s drinks shaken because of alcohol induced tremor? British Medical Journal
1693.21 Sliding Rocks on Racetrack Playa, Death Valley National Park: First Observation of Rocks in Motion PLOS ONE
1692.7 Information Preservation and Weather Forecasting for Black Holes arXiv
1478.518 Bodily maps of emotions Proceedings of the National Academy of Sciences of the United States of America
1422.932 Milk intake and risk of mortality and fractures in women and men: cohort studies British Medical Journal
1248.87 Female hurricanes are deadlier than male hurricanes Proceedings of the National Academy of Sciences of the United States of America
1137.826 The hipster effect: When anticonformists all look the same arXiv
1119.952 A Gigantic, Exceptionally Complete Titanosaurian Sauropod Dinosaur from Southern Patagonia, Argentina Scientific Reports
1073.29 Skeletal Muscle PGC-1α1 Modulates Kynurenine Metabolism and Mediates Resilience to Stress-Induced Depression Cell
1067.73 Survey of Academic Field Experiences (SAFE): Trainees Report Harassment and Assault PLOS ONE
1059.886 Volatile disinfection byproducts resulting from chlorination of uric acid: Implications for swimming pools Environmental Science & Technology
956.824 How to make more published research true PLOS Medicine
934.958 Low Protein Intake Is Associated with a Major Reduction in IGF-1, Cancer, and Overall Mortality in the 65 and Younger but Not Older Population Cell Metabolism
898.424 Dendrogramma, New Genus, with Two New Non-Bilaterian Species from the Marine Bathyal of Southeastern Australia (Animalia, Metazoa incertae sedis) with Similarities to Some Medusoids from the Precambrian Ediacara PLOS ONE
880.038 Sex differences in the structural connectome of the human brain Proceedings of the National Academy of Sciences of the United States of America
873.144 Identifiable Images of Bystanders Extracted from Corneal Reflections PLOS ONE
869.706 Nutrition and Health – The Association between Eating Behavior and Various Health Parameters: A Matched Sample Study PLOS ONE
842.594 No Evidence of Dehydration with Moderate Daily Coffee Intake: A Counterbalanced Cross-Over Study in a Free-Living Population PLOS ONE
830.81 Oxytocin enhances brain reward system responses in men viewing the face of their female partner Proceedings of the National Academy of Sciences of the United States of America

Priego, Ernesto (2014): A List of the 37 Open Access Outputs Most-mentioned Online in 2014 According to Altmetric. figshare.

More on the Altmetric 2014 Top 100

Yesterday I published a post with some initial readings of the Altmetric 2014 Top 100 articles (I’ll be calling them “outputs” as some are not versions of record).

If you are new to altmetrics please read:

I am interested in being able to investigate whether any trends in publication access preference can be detected, or at least in registering the evidence in order to be able to compare with new data next year for example.

I looked at the countries of affiliation of the authors of the papers (some papers in the list had only authors from the same country, while others had authors from different countries). There are 39 unique countries of affiliation in the list:

Czech Republic
Puerto Rico
Republic of Congo
Sierra Leone
The Netherlands
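Counting the unique countries of affiliation can be sketched with the Python standard library. The column layout below is a hypothetical stand-in for the actual spreadsheet (semicolon-separated country names per output), not the real dataset format:

```python
import csv
import io

# Illustrative sample rows: each output lists its authors' countries,
# separated by semicolons (hypothetical layout).
sample = io.StringIO(
    "title,countries\n"
    "Paper A,USA;UK\n"
    "Paper B,Germany\n"
    "Paper C,USA;Canada;Germany\n"
)

# Collect every distinct country mentioned across all outputs.
unique_countries = set()
for row in csv.DictReader(sample):
    unique_countries.update(c.strip() for c in row["countries"].split(";"))

print(sorted(unique_countries))  # ['Canada', 'Germany', 'UK', 'USA']
```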

Some outputs in the list had affiliations from only one country. The countries with one-country affiliation outputs in the list are:

Czech Republic
The Netherlands
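Identifying the single-country outputs is then a simple check on the size of each output's affiliation set. The data here is illustrative, not the actual Top 100:

```python
# Each output mapped to its set of affiliation countries (illustrative data).
outputs = {
    "Paper A": {"USA", "UK"},
    "Paper B": {"Germany"},
    "Paper C": {"UK"},
}

# Keep only outputs whose authors all share one country of affiliation.
single_country = {
    title: next(iter(countries))
    for title, countries in outputs.items()
    if len(countries) == 1
}
print(single_country)  # {'Paper B': 'Germany', 'Paper C': 'UK'}
```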

Apart from a few countries of affiliation that dominate the majority of the outputs in the list (the USA, Canada, France, Germany, the UK), most other countries have a relatively small number of contributions. As Cat Chimes pointed out in her blog post, “68 of the Top 100 had authors from the United States, 19 had authors from the UK, 10 from Canada, 11 from Germany (the most in Europe), 4 from China, and 9 from South or Central America” [I counted 20 outputs with authors with UK affiliations, but I may have made a mistake; I shall check…].

I like alluvial diagrams because, even though they require context (for example, to know the exact numbers behind the volumes/heights of the nodes), they provide a quick visual insight into the links between different columns/labels in a dataset.

An alluvial diagram of the whole list by access type proved visually chaotic and confusing, so I made some diagrams grouping the outputs by region (I did not use the original region categorisation provided by Altmetric). These are the regions I wanted to look at first, not all the regions/countries included in the list.

I started by comparing number of affiliations from the UK and the USA by the access types of the outputs they contributed to:

Author affiliations in the USA and the UK in the Altmetric 2014 Top 100 list

Please note that, as indicated above, some of the outputs visualised in this diagram were jointly written by authors from the UK, the USA and other countries. What the diagram compares in this case is the access types of outputs with authors from either country, so a section of each access type will always overlap. In other words, the diagram shows the presence of either country of affiliation, but the two are not mutually exclusive.

Following this case of the UK and the USA, a comparison can be made with a diagram showing outputs with authors from a single country of affiliation, which might prove more interesting:

Outputs with authors with affiliations from only one country (USA or UK) by access type in the 2014 Altmetric Top 100 List.

Or, for further clarity, as a bar chart:

Outputs by Authors of Only One Country of Affiliation by Access Type in the Altmetric 2014 Top 100 USA and UK

It is interesting that the only outputs in the 2014 Altmetric Top 100 list by authors affiliated to UK institutions (and no other countries) were open access and not paywalled. (Once again this might be explained by Altmetric’s sources).

With the intention of offering some context, the following diagram shows all the outputs by authors from a single country of affiliation in the 2014 Altmetric Top 100 list by access type:

Outputs in the Altmetric 2014 Top 100 List by Authors of Only One Country of Affiliation by Access Type

Points I found interesting from making and looking at the charts:

  • A clear split (almost 50/50) between paywalled and open access outputs by authors of various international affiliations, including authors from UK and USA institutions
  • 37 of the 100 outputs in the list had authors from the USA only (no other country affiliations). The 37 USA-only outputs were divided into 23 paywalled (62%) and 14 open access (38%). This is in stark contrast with the 5 UK-only outputs, which were all open access.
  • As the UK-only case shows, when countries had more than one single-country output, those outputs did not always split into two access types: while the UK-only outputs were all open access, all the Netherlands, Italy, Israel, Australia and Czech Republic single-affiliation outputs were paywalled.
  • Germany, which had the highest number of outputs from Europe in the list (11), had 9 outputs in collaboration with various countries, all paywalled. The other two outputs were single-affiliation (Germany-only), both open access.
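The USA-only split quoted above (23 paywalled, 14 open access out of 37) can be checked in a couple of lines:

```python
# The USA-only split from the list above (23 paywalled, 14 open access).
usa_only = {"paywalled": 23, "open access": 14}
total = sum(usa_only.values())

# Percentage share of each access type, rounded to whole percent.
shares = {access: round(100 * n / total) for access, n in usa_only.items()}
print(total, shares)  # 37 {'paywalled': 62, 'open access': 38}
```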

I have made a series of other diagrams but I will share them in a forthcoming post.


As always, please take into account that these are not just 100 outputs mentioned online, but those with the most impressive number of “mentions” (links to permanent identifiers) in the online sources tracked by Altmetric. There is no suggestion that the online popularity of these outputs reflects uncontroversially their “quality”. The fact is that they have been mentioned (linked to) online, and in the current landscape of scholarly communication, I find this fact not without its significance.

The sample of 100 outputs might be considered small, and the biases implicit in Altmetric’s sources and tracking mechanisms impose a series of important caveats on offering any robust conclusions at the moment. However, we already have some data that we will be able to compare with future data in the short and long term.

Priego, Ernesto (2014): A List of the 37 Open Access Outputs Most-mentioned Online in 2014 According to Altmetric. figshare.

Reading the Altmetric 2014 Top 100 Articles from a Distance

The London-based article-level metrics provider Altmetric published a list of their top 100 highest-scoring articles yesterday (access the full Top 100 list). For essential context, please read the accompanying blog post by Cat Chimes here, as well as her “Unwanted attention?” November 14 2014 post here. This recent post on how Altmetric tracks global news will also be relevant to understand the context of the list.

If you are new to altmetrics please read

I am interested in how bibliographic datasets can be subject to ‘distant reading’: can we learn something from looking at what words are used in the titles of articles being mentioned online by larger audiences? Are there any correlations between wording in titles, topic, methodology, discipline, or region and an output’s access type?

I won’t even attempt to answer these questions in a blog post… but in the meantime I wanted to share with you a cloud visualisation of the 100 article titles I made with Voyant Tools:

The Altmetric 2014 Top 100 journal article titles as a Cirrus word cloud using Voyant Tools and applying TaporWare English stop words.

With Voyant we can also find out which are the 50 most-frequent words in the 100 journal article titles and their counts:

mortality 6
study 6
disease 5
risk 5
adults 4
cells 4
consumption 4
ebola 4
evidence 4
female 4
global 4
human 4
intake 4
nursing 4
states 4
united 4
virus 4
alcohol 3
associated 3
decline 3
field 3
functional 3
genomic 3
health 3
humans 3
just 3
men 3
observational 3
origin 3
population 3
reduction 3
social 3
west 3
young 3
academic 2
african 2
alzheimer 2
association 2
autism 2
behavior 2
brain 2
cardiovascular 2
causes 2
cave 2
cell 2
change 2
chelyabinsk 2
childhood 2
cognitive 2
cohort 2
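A Voyant-style frequency count can also be reproduced with the Python standard library. A minimal sketch follows; the stop-word set and titles below are tiny illustrative stand-ins for the TaporWare English list and the actual 100 titles:

```python
from collections import Counter
import re

# Tiny illustrative stop-word subset (not the full TaporWare list).
stop_words = {"the", "of", "in", "and", "a", "to"}

# Illustrative titles standing in for the Top 100 dataset.
titles = [
    "Mortality and risk in the United States",
    "A study of mortality in young adults",
]

# Tokenise, lowercase, and drop stop words before counting.
words = []
for title in titles:
    words.extend(
        w for w in re.findall(r"[a-z'-]+", title.lower()) if w not in stop_words
    )

print(Counter(words).most_common(3))
```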

Altmetric’s list usefully labeled the articles in their list by access type (open/paywalled): 37 of the articles in the top 100 were published as open access (63 were originally published under the paywall/subscription model). Again using Voyant, this is the word cloud of the 37 open access articles in the list:


The Altmetric 2014 Top Open Access journal article titles in their Top 100 list as a Cirrus word cloud using Voyant Tools and applying TaporWare English stop words.

And the list of the top 50 most-frequent words of the 37 OA journal articles in the list:

evidence 3
intake 3
study 3
african 2
alcohol 2
brain 2
decline 2
female 2
field 2
health 2
human 2
hurricanes 2
intermediate 2
men 2
mortality 2
new 2
population 2
reduction 2
research 2
risk 2
rocks 2
social 2
time 2
young 2
academic 1
acid 1
actions 1
age 1
alzheimer 1
american 1
angular 1
animalia 1
anticonformists 1
argentina 1
assault 1
assessing 1
associated 1
association 1
australia 1
autism 1
bad 1
bathyal 1
behavior 1
black 1
bodily 1
bond’s 1
boys 1
brain-to-brain 1
burdens 1
byproducts 1

Having the data as a .csv file made it easy to find out quickly that the open access articles in the Top 100 list were published in the following 14 journals and repositories (from highest to lowest score):

Proceedings of the National Academy of Sciences of the United States of America
Frontiers in Zoology
British Medical Journal
Scientific Reports
Environmental Science & Technology
PLOS Medicine
Cell Metabolism
Translational Neurodegeneration
Journal of Neuroscience
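Extracting that journal ranking from the .csv can be sketched as follows. The column names and rows here are hypothetical and may differ from the actual file:

```python
import csv
import io

# Illustrative rows mimicking the dataset layout (hypothetical headers).
sample = io.StringIO(
    "score,title,journal,access\n"
    "956.824,How to make more published research true,PLOS Medicine,open\n"
    "934.958,Low Protein Intake Is Associated...,Cell Metabolism,open\n"
    "880.038,Sex differences in the structural connectome...,PNAS,paywalled\n"
)

# Keep only the open access rows, sorted from highest to lowest score.
rows = [r for r in csv.DictReader(sample) if r["access"] == "open"]
rows.sort(key=lambda r: float(r["score"]), reverse=True)

# Unique journal names, preserving highest-to-lowest score order.
journals = list(dict.fromkeys(r["journal"] for r in rows))
print(journals)  # ['PLOS Medicine', 'Cell Metabolism']
```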

I have deposited on figshare a spreadsheet with the list of the 37 open access articles, including journal/repository names, URLs, published dates and region of authors affiliations.

Priego, Ernesto (2014): A List of the 37 Open Access Outputs Most-mentioned Online in 2014 According to Altmetric. figshare.

Retrieved 14:48, Dec 10, 2014 (GMT)

For what it’s worth these were the open access outputs that were most-mentioned online according to Altmetric in 2014. Let’s see what happens next year.

As part of my ongoing research I also made this alluvial diagram visualising access type by number of outputs per category. In my opinion it exemplifies quite directly, at least from this specific distance, the dominance of certain disciplines in terms of article-level metrics, but also certain disciplines’ preference within this particular corpus for particular access models.

Alluvial Diagram showing access type by category from the  2014 Altmetric Top 100 Articles; height of node indicates number of outputs by category in the list.  Source data: Altmetric Chart CC-BY @ernestopriego