On Co-Design for Digital Resources in the Cultural Sector – Rostros del tiempo, 2nd Colloquium on Everyday Life in Mexico

Today, Tuesday 3 September 2019, I will take part in the Second Colloquium on Everyday Life in Mexico (Segundo Coloquio de Vida Cotidiana en México), “Rostros del tiempo”, at 13:30 in the Museo de Arte de la SHCP, Moneda 4, Centro Histórico, Mexico City. Admission is free.

My presentation will bring together approaches from the social sciences, interaction design or user-centred design (HCID) and the digital humanities, exploring which methods we can use to integrate everyday life in Mexico and open digital resources in the Mexican cultural sector more sustainably.

The main questions guiding my presentation will be:


  • What do we talk about when we talk about digital “resources” in the cultural sector?
  • What kinds of contexts, and what kinds of institutions and digital resources, do we have?
  • How can the discipline of interaction design, or user-centred design (HCI; UX), contribute to ‘connecting’ digital resources with the public in their everyday lives?
  • What would it mean to design to connect with users in a way that is sustainable and specific to the local context?



The full programme is below.

Rostros del tiempo. Programme. Workshops.

Rostros del tiempo. Colloquium. Programme.

It is an honour to be back in Mexico to take part in this event.

#DH2018 and #DH2019 Twitter Archive Counts. A Comparison


My interest in documenting scholarly activity on Twitter through conference hashtags is not new; for the digital humanities I have been looking into it since 2010. Searching this blog or googling related keywords may turn up some results for those interested in the background. I have been collecting conference hashtag archives for a while now, often depositing them as part of the scholarly record, and blogging and giving workshops about my objectives and methodologies.

I like sharing results in real time while conferences are taking place, or shortly after. Any results shared are therefore always-already provisional, perfectible and unfinished. I have always believed that a signal is better than no signal, or than having to wait three years for one, so I insist on sharing any quick insights I can get rather than not sharing them at all, or waiting until I miraculously find the time to do it differently (which I am not likely to, so I’d rather take any opportunity I have to share something). Hopefully someone finds it helpful in some way.

Once again, I have also been critical of the metrication of scholarly activity, so the fact that I share quantitative data from the archives collected does not mean I think this metrication is always-already something to aspire to, or that it means anything in particular. I see it as an ethnographic means to document the existence of scholarly activity on Twitter around academic conferences in specific fields, and perhaps as an entry point to assess academic and public engagement on Twitter with academic hashtags and the events they represent, and/or any increase, decrease or transformation in this type of activity. For example, it is possible to gain insights into Twitter users’ settings preferences, such as the language users have set up, as I looked into in this post on user_lang in #DH2018 tweets.

The Methods

The metrics compared here are the result of a double method of collection, as a means to check the validity of the collected data. I used a Python script to collect both archives, and set the same parameters as for the archives I collected using TAGS (see Priego 2018). Even if the collected data still needs to be refined, when the counts are the same or very similar I get a degree of certainty that the data collected via TAGS from the Twitter Search API is close to being as reliable as it could be.
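The cross-check between the two collection methods can be sketched as a simple comparison of counts. This is a minimal sketch and not the script used for these archives: the function names, the 1% tolerance and the sample counts are all assumptions made for illustration.

```python
def relative_difference(count_a: int, count_b: int) -> float:
    """Absolute difference between two counts, as a fraction of the larger count."""
    largest = max(count_a, count_b)
    if largest == 0:
        return 0.0  # two empty archives agree trivially
    return abs(count_a - count_b) / largest


def counts_agree(count_a: int, count_b: int, tolerance: float = 0.01) -> bool:
    """True when the two counts differ by no more than `tolerance` (assumed 1%)."""
    return relative_difference(count_a, count_b) <= tolerance


# Placeholder counts, NOT figures from the archives discussed in this post.
script_count = 1000  # tweets collected by the Python script
tags_count = 995     # tweets collected by TAGS for the same hashtag and period

print(f"Relative difference: {relative_difference(script_count, tags_count):.2%}")
print("Counts agree:", counts_agree(script_count, tags_count))
```

What counts as “very similar” is a judgement call; the tolerance is a parameter so that it can be tightened or relaxed.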

For 2018 and 2019 I managed to get the settings and timings right to achieve what looks like a complete set of #DH2018 and #DH2019 tweets. Below I share a comparative table where the main metrics can be compared. As indicated in the table, it must be noted that there are important differences in mainly a) the number of days before and after the conference days included in the archive and b) the number of days each conference was held on according to their respective web pages / programmes (I seem to remember the Mexico City conference had activities at least one day prior to the date indicated on the main web site but I may be misremembering- need to check).

The Basic Counts

Needless to say, the most interesting or useful insights from looking at these archives would come from qualitative data, and not necessarily from quantitative data like that presented here. The RT and @ reply stats can give an indication of the level of interaction between accounts, and the number of accounts tweeting with each hashtag each year could be seen as an indication of the interest in the conference or hashtag (this indication may be misleading due to spamming or confusion caused by hashtag overlap, and of course one would need to know which accounts are and are not included in each archive).

There is a series of analyses that can be run with the full data collected, and I hope that, now that I have a more solid longitudinal dataset of yearly archives, I may be able to do that with more robustness soon. In the meantime, for what they are worth, here are the main archive stats compared for last year and this year.


| | #DH2018 | #DH2019 | Notes |
|---|---|---|---|
| First conference day according to programme | 26/06/2018 | 08/07/2019 | |
| Last conference day according to programme | 29/06/2018 | 12/07/2019 | |
| First Tweet Collected in Archive | 24/06/2018 06:19 | 29/06/2019 02:13 | Local conference time zone |
| Last Tweet Collected in Archive | 30/06/2018 06:17 | 14/07/2019 22:56 | Local conference time zone |
| Days collected | 6 days | 16 days | |
| Number of collected tweets (includes RTs) | 13858 | 14101 | Data might require refining and deduplication |
| In Reply Ids | 564 | 1091 | |
| In Reply @s | 747 | 812 | |
| Number of links | 4312 | 9061 | |
| Number of RTs | 8656 | 8650 | Estimate on occurrence of RTs |
| Number of unique accounts | 2329 | 2157 | |
| Conference location | Mexico City, Mexico | Utrecht, the Netherlands | |
Priego, E. (2019): #DH2018 and #DH2019 Twitter Archive Counts. Summary Comparative Data Table. figshare. Dataset.



Even though I collected #DH2019 over a longer period (ten days more than the #DH2018 archive), fewer unique user accounts used #DH2019 than #DH2018. The #DH2019 archive showed more replies, mentions and links than the #DH2018 one, although it also included more collection days and therefore more opportunity for interactions. The number of tweets and RTs in both archives (again, bearing in mind the differences in collection days) remained very close. It could be argued that the Twitter activity indicates neither an increase nor a reduction in engagement (as manifested through tweets or RTs) with the conference hashtag, while showing that fewer accounts participated this year. The next steps are to refine and deduplicate the data if required, limit the archives to the same data collection timings, revise the initial insights, and then perform qualitative text and account analysis in order to determine, amongst other things, whether any differences in unique accounts using the hashtag were relevant to the field or simply reflected bots, spam or other unrelated accounts. That qualitative refining could give us greater certainty about any changes in the demographic engaging with the conference hashtags over the years. This needs to be done carefully and following ethical standards.
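The deduplication and the limiting of both archives to the same collection timings could look roughly like this. It is a minimal sketch under stated assumptions: each tweet is a dictionary with an `id_str` key and a `time` key already parsed into a `datetime` (names borrowed from TAGS-style column headers, which may differ in other exports), and the window of two days before and one day after the conference is an assumption that mirrors the #DH2018 archive. The sample rows are made up.

```python
from datetime import datetime, timedelta


def deduplicate(tweets):
    """Keep the first occurrence of each tweet ID, preserving collection order."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet["id_str"] not in seen:
            seen.add(tweet["id_str"])
            unique.append(tweet)
    return unique


def limit_to_window(tweets, first_day, last_day, days_before=2, days_after=1):
    """Keep tweets posted from `days_before` days before the first conference day
    up to the end of the day that falls `days_after` days after the last one."""
    start = first_day - timedelta(days=days_before)
    end = last_day + timedelta(days=days_after + 1)  # exclusive upper bound
    return [t for t in tweets if start <= t["time"] < end]


# Made-up rows for illustration only.
sample = [
    {"id_str": "1", "time": datetime(2019, 6, 29, 2, 13)},   # long before the conference
    {"id_str": "2", "time": datetime(2019, 7, 6, 9, 0)},     # two days before day one
    {"id_str": "2", "time": datetime(2019, 7, 6, 9, 0)},     # duplicate row
    {"id_str": "3", "time": datetime(2019, 7, 13, 23, 59)},  # the day after the last day
    {"id_str": "4", "time": datetime(2019, 7, 14, 22, 56)},  # outside the window
]

comparable = limit_to_window(
    deduplicate(sample),
    first_day=datetime(2019, 7, 8),
    last_day=datetime(2019, 7, 12),
)
print([t["id_str"] for t in comparable])
```

Applying the same window to each year’s archive would make the counts easier to compare, although it does nothing about spam or unrelated accounts, which still need the qualitative pass.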

A Polite Request

If you are interested in this same topic and you read this, please do not disregard this output only because it has not been published in a peer-reviewed journal. If you get any type of inspiration, value or motivation from this post, my tweets about it or any other blog posts about Twitter archiving, please do cite these outputs: it is not only good academic practice but also a way for us to know about other responses to the same issues and to continue building knowledge together.


Priego, E. (2018) Archiving Small Twitter Datasets for Text Analysis: A Workshop Tutorial for Beginners. figshare.
Priego, E. (2019): #DH2018 and #DH2019 Twitter Archive Counts. Summary Comparative Data Table. figshare. Dataset.


Oligopolies of Knowledge, {Digital Humanities} and Open Access: Looking at Scopus from the Global South… [form the North]



To download and cite the slides: Priego, Ernesto (2019): Oligopolies of Knowledge, {Digital Humanities} and Open Access: Looking at Scopus from the Global South… [form the North]. figshare. Presentation.

Presentation for P-11: Society, Media, Politics, Engagement. Time: Wednesday, 10/Jul/2019, 4:00pm – 5:30pm. Session Chair: Amelia Sanz. DH2019 Conference, Utrecht, the Netherlands. Location: Pandora Zaal. Part of the panel: Twining Digital Humanities and Humanidades Digitales: A set of actual experiences from the South.

All the slides from the panel can be viewed and/or downloaded and cited from:

Fiormonte, Domenico; Numerico, Teresa; Priego, Ernesto; Rodríguez-Ortega, Nuria; Sanz, Amelia; Sapiera, Eugenia (2019): Twining Digital Humanities and Humanidades Digitales: A Set of Actual Experiences from the South [Slides]. figshare. Presentation.

MakeWrite: Supporting Writing with Constrained Creativity


I am pleased to announce that the INCA project has now launched MakeWrite, an iPad app that was co-designed by and for people with aphasia (a language difficulty following brain injury).

The app offers an accessible way for anyone to create and share texts in English. However, you don’t need to live with aphasia to try it out. Users can use existing text to make their own new piece of creative writing in four simple stages: choose, erase, arrange and share.

It was launched yesterday as part of UNESCO’s World Poetry Day.
This is its first release. It is a worldwide release for all iPad models, but if you are not in the UK and you experience difficulties downloading it, please do let us know (there should be no problems, though).

Needless to say I’d personally love to see a multilingual MakeWrite, and of course one with a wider variety of source texts and an Android version too.

Link to the release on iTunes: 

Find out more about the INCA Project at

El extraño caso de los archivos reaparecidos / The Strange Case of the Reappeared Archives: Carta Abierta/Open Letter: Periódico de Poesía 2007-2018

[Read and sign the letter here]


[I have shared this open letter here for the record. When I received news of the letter, initiated by Jorge Fondebrider, on 29 January, and we began collecting the initial signatures, the 2007-2018 issues of Periódico de Poesía were not clearly and visibly available on the Periódico de Poesía site. A day later, once the word had spread, they suddenly reappeared on its home page. For more context on the genealogy of this strange case of disappeared and reappeared archives, please read Jorge’s post.

In my opinion this is a case that proves that issues of academic and humanistic infrastructure, which are issues of information architecture, are political issues. In other words, information architecture is political. Design is political. It is political because bad interface and archive design are endangering cultural heritage (particularly, but not only, in the Global South), in this case leaving ten years (and probably more) of humanistic work out in the open, at constant risk of accident and disappearance. The open letter below is still relevant because the sudden reappearance of the missing archives does not solve the main issue: it is time to call an information professional (a librarian and archivist!) to put things in order at the Periódico de Poesía site. Its future depends on it.]

The undersigned request that UNAM make the complete archive of Periódico de Poesía openly and formally available to the public online again, in an appropriate location within the whole archive, including all the issues published between 2007 and 2018, which until very recently were missing from its online archive or appear/appeared in confusing or unsuitable locations on the site.

More context at:

Periódico de Poesía:

Full text of the open letter (originally in Spanish) and initial signatories:


Open Letter

Periódico de Poesía 2007-2018: We request that the complete archive of Periódico de Poesía be made available to the public online again in a formal, secure, sustainable and permanent manner.

We, the undersigned, contributors to and readers of UNAM’s Periódico de Poesía between 2007 and 2018, earnestly request that all the issues published in that period be made available to the public again, in PDF as well as HTML format (since the Periódico also published interactive material). These issues are currently not where they belong, which is the “Archivo de épocas anteriores de Periódico de Poesía”.

Having published them ad honorem, we contributors understand that the only possible compensation for our work is to allow readers past, present and future to freely access the fruit of our efforts. Likewise, given the current fragility of the archive, we request that UNAM formally safeguard all the issues of Periódico de Poesía in its institutional repository, so as to ensure that the content remains available in a secure, sustainable and permanent manner.

We are proud to have contributed to Periódico de Poesía and that our work is part of its heritage. We therefore ask UNAM to heed our claim and rectify this situation.



Initial signatories:

ALEXIS GÓMEZ ROSA (Dominican Republic)
ANNA CROWE (Scotland)
AURELIO MAJOR (Spain/Mexico/Canada)
BLANCA STREPPONI (Argentina / Venezuela)
EDUARDO MILÁN (Uruguay/Mexico)
EDWARD HIRSCH (United States)
ERNESTO PRIEGO (Mexico/United Kingdom)
HÉLÈNE CARDONA (United States/Spain)
HUGH HAZELTON (United States)
INÉS GARLAND (Argentina)
JAN DE JAGER (Argentina)
JORGE AULICINO (Argentina – National Poetry Prize)
JOSÉ CARLOS CATAÑO (Canary Islands-Catalonia)
JUAN ARABIA (Argentina)
LUIS BRAVO (Uruguay)
MARK SCHAFER (United States)
MARTÍN ESPADA (United States)
MATT BROGAN (United States)
PEDRO POITEVIN (United States)
TOM POW (Scotland)
W. H. HERBERT (Scotland)
VICTOR SOTO FERREL (Tijuana, Mexico)




Fondebrider, Jorge; Priego, Ernesto; et al. (2019): Carta Abierta/Open Letter: Periódico de Poesía 2007-2018. figshare.


Tweets per user_lang in a #DH2018 archive

I collected an archive of #DH2018 tweets from accounts with at least 10 followers. The main quant summary is in the table below, which I also tweeted earlier:

Twitter Activity for #DH2018, archive by Ernesto Priego

I wanted to take a quick look at the number of tweets per user_lang. “user_lang” records the language set in the user’s Twitter profile. (Please note “user_lang” is different from “lang”, which, when present, indicates a BCP 47 language identifier corresponding to the machine-detected language of the tweeted text.)

Filtering the #DH2018 tweets archive by user_lang and then counting the number of tweets per user_lang gives us the following table:

tweet count per user_lang

The archive only collected tweets from accounts with at least 10 followers. The table above can be, just for fun, visualised as a simple bar chart, as a means to quickly show the difference in volume:

user_lang #dh2018 archive bar chart

Please note the archive collects unique tweets including RTs; therefore what contributes to the count for a given user_lang can be a unique tweet by a unique user that has been retweeted several times (or not at all).

In other words, the counts above do not indicate there were x number of users whose Twitter profiles had x language code, but merely the number of tweets in this specific archive organised according to the user_lang code from the tweeter’s Twitter profile.
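The difference between counting tweets and counting accounts per user_lang can be sketched as follows. This is a minimal illustration and not the procedure used for the table above: the column names `from_user` and `user_lang` are assumed to match a TAGS-style export, and the sample rows are made up.

```python
from collections import Counter, defaultdict


def tweets_per_user_lang(rows):
    """Count archived tweets (RTs included) per profile language code."""
    counts = Counter()
    for row in rows:
        counts[row.get("user_lang") or "unknown"] += 1
    return counts


def accounts_per_user_lang(rows):
    """Count distinct accounts per profile language code."""
    accounts = defaultdict(set)
    for row in rows:
        accounts[row.get("user_lang") or "unknown"].add(row["from_user"])
    return {lang: len(users) for lang, users in accounts.items()}


# Made-up rows: one prolific 'en' account, one 'es' account, one with no code.
sample = [
    {"from_user": "account_a", "user_lang": "en"},
    {"from_user": "account_a", "user_lang": "en"},
    {"from_user": "account_a", "user_lang": "en"},
    {"from_user": "account_b", "user_lang": "es"},
    {"from_user": "account_c", "user_lang": ""},
]

print(tweets_per_user_lang(sample).most_common())
print(accounts_per_user_lang(sample))
```

In this made-up sample ‘en’ has three tweets but only one account, which is why a tweet count per user_lang should not be read as a count of users.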

What this can possibly indicate, therefore, is the over- or under-representation of tweets from accounts whose Twitter profiles have specific language codes. It does not mean that x number of tweets in the archive were in this or that language, nor that x number of tweeters using the hashtag speak this or that language.

What becomes apparent is that an overwhelming majority of accounts with tweets in the archive have ‘en’ as the language code in their Twitter profiles; it is interesting that, in the archive, only one tweet was collected from an account with ‘es-MX’ as the language code in its Twitter profile.

One must also take into account that often ‘en’ is or might be the default user_lang code in Twitter profiles.

I still need to go back to my archives from previous years, but it does look as if, in spite of the usual over-representation of the ‘en’ user_lang code, there is at least a diversity of user_lang codes in the archived tweets, with ‘es’ in second place.

Once I refine and anonymise the data I will be depositing the source data for this post.

*This blog post was typed quickly, typos and wonky syntax might have remained.


Open Insights Interview


Thank you to Martin Eve and James Smith from the Open Library of Humanities for interviewing me for their Open Insights series today, part of their EmpowOA programme.

The URL for the interview is:

Make sure to follow the #EmpowOA hashtag for the whole series. Find out more about the Open Library of Humanities’ EmpowOA programme here.

No Wall Blues (78s Mix)


The Great 78 Project is a community project for the preservation, research and discovery of 78rpm records. I think it is one of the most amazing digital humanities projects out there today. As a material culture researcher and music collector I have enjoyed the collections very much. I decided to make a quick mix with some of my favourite tunes.

Copyrights that may exist in these materials have not been transferred to the Internet Archive. I do not own the copyright of the recordings used in this mix/playlist; it has been shared for artistic, preservation and educational use only and obviously no copyright infringement has been intended.

In my mix I modified the equalization slightly and added some subtle effects. I hope this does not annoy those who with all reason have a lot of love and respect for this music. All my gratitude to the Internet Archive, George Blood, Jessica Thompson, Bob George, Brewster Kahle for sharing these cultural treasures. Shared for educational use.


1 No Wall by Claudia;Nicky;Paul Carson;Barbara Fuller;Tom Collins

2 Black And Evil Blues by Alice Moore with Ike Rodgers

3 Moanin’ (Lamentos) by Mills’ Blue Rhythm Band

4 Deep Moaning Blues by Ma Rainey

5 Vibraphone Blues (Queja del Vibrafono) by Benny Goodman Quartet;Goodman;Krupa;Hampton

6 I Know The Blues by Israel Crosby Quartette

7 I’ve Been A Bad Boy by Doc Sausage and his Mad Labs

8 The Boll Weevil by Lead Belly

9 Last Call Blues by The Spirits of Rhythm

10 Working Man’s Blues by Lonnie Johnson

11 How Long, How Long Blues by The Varsity Seven

12 The Hipster’s Blues – Opus 6-7/8 by Harry (The Hipster) Gibson

13 Cryin’ And Sighin’ by Manzie Harris and his Band

14 Someone To Watch Over Me by Ira and George Gershwin;Linda Keene;Henry Levine and his Strictly from Dixie Jazz Band

15 Heat Cuttin’ Blues by Hunter and Jenkins

The Strangest Secret: A Great 78s Mix

The Great 78 Project by the Internet Archive is a community project for the preservation, research and discovery of 78rpm records.

I think it is one of the most amazing digital humanities projects out there today. As a material culture researcher and music collector I have enjoyed the collections very much. I decided to make a quick mix with some of my favourite tunes. This was the music of my grandparents and parents, the music I listened to growing up and an important part of my cultural identity.

As indicated on the project’s web pages, copyrights that may exist in these materials have not been transferred to the Internet Archive. Logically I do not own the copyright of the recordings used in this mix/playlist; it has been shared for artistic, preservation and educational use only and obviously no copyright infringement has been intended.

In my mix I looped some samples from the recordings and modified the equalization slightly. I hope this does not annoy those who with all reason have a lot of love and respect for this music. All my gratitude to George Blood, Jessica Thompson, Bob George, Brewster Kahle and everyone else involved in this amazing project for sharing these cultural treasures.

“Access/Accès”: #DH2017, Montreal, 8-11 August 2017 Tweetage Volume Charts


#DH2017 starts today in Montreal.  The theme is “Access/Accès”. Details in the hyperlink. I wish I were there!

I am sure the tweetage will exceed the limits of my poor Google spreadsheet, but as it’s become kind of customary I am attempting to collect as many tweets with the conference hashtag as possible.

Using Martin Hawksey’s TAGS, here’s what the archive looks like as of 6:35:05 AM Montreal time of the first official day (8 August 2017):

Archive for #DH2017, Top Tweeters and 3 day activity, 6:35:05 of day one Montreal time

As of 9 August 2017, 6:11:33 AM Montreal time

[Screenshot of archive charts]

As of 10 August 2017, 6:07:45 AM Montreal time

[Screenshot of archive charts]

As of 11 August 2017, 7:12:46 AM Montreal time

[Screenshot of archive charts]

As of 12 August 2017, 03:11:57 AM Montreal time. (I would have liked to take this screenshot later, but I would not be online at that time. Considering the conference had finished by then, it will do.)

[Screenshot of archive charts]

As of 13 August 2017, 05:50:54 AM Montreal time

[Screenshot of archive charts]

Do note that on 9 August the hashtag went nuclear and got spammed, particularly with annoying ‘trending topics’ tweets, so the data could do with some refining. However, at a quick glance, the spamming does not look serious. With more time later on, and once I have closed the collection, I could take a closer look and give an indication of the extent of the spamming. In any case, please note that, as always, the counts I am presenting are merely indicative; numbers are not meant to be taken at face value and no inherent quality or value judgements should be inferred from the volumes reported.

As I often state, the data presented is the result of the collection methods employed; different methods are likely to present different results.

Note that this time only tweets from users with at least 10 followers are being collected. For the purpose of the archive, retweets count as tweets (this means not every tweet contains ‘original’ content).
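The two collection rules just described (a minimum of 10 followers, and retweets counted as tweets) can be sketched as a filter applied to already collected rows. This is only an illustration: the column names `text` and `user_followers_count` are assumed to match a TAGS-style export, detecting RTs by the “RT @” prefix is a rough estimate rather than an exact method, and the rows are made up.

```python
def has_enough_followers(row, min_followers=10):
    """True when the tweeting account has at least `min_followers` followers."""
    return int(row.get("user_followers_count") or 0) >= min_followers


def summarise(rows, min_followers=10):
    """Apply the follower threshold, then count tweets, estimated RTs and the rest."""
    kept = [row for row in rows if has_enough_followers(row, min_followers)]
    retweets = sum(1 for row in kept if row["text"].startswith("RT @"))
    return {
        "tweets": len(kept),  # retweets count as tweets
        "retweets": retweets,
        "original": len(kept) - retweets,
    }


# Made-up rows for illustration only.
rows = [
    {"text": "Looking forward to the keynote #DH2017", "user_followers_count": "250"},
    {"text": "RT @someone: Looking forward to the keynote #DH2017", "user_followers_count": "12"},
    {"text": "Trending now: #DH2017", "user_followers_count": "3"},
    {"text": "RT @someone: slides are up #DH2017", "user_followers_count": ""},
]

print(summarise(rows))
```

A follower threshold like this excludes some spam accounts, but it also excludes genuine new or small accounts, so it shapes the counts reported.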

It has been assumed that those scholars or scholarly organisations tweeting publicly from public accounts at very high volumes from an international conference do expect to get noticed by the international community for their tweetage with the hashtag, and are therefore giving implicit consent to get noted by said community for scholarly purposes; if anyone objects to their username appearing in one of the ‘Top Tweeters’ bar charts above, please let me know and I can anonymise their username retrospectively if that helps.

This is the first year I have managed to archive a more or less complete set. It helped that TAGS has improved, that I was able to collect and monitor the collection in real time, and that I set a minimum of 10 followers for accounts to be collected. It also helped that I did not start collecting too far back in advance, as I have sometimes done.

I will be depositing a dataset of Tweet ID’s and timestamps, which is the source data for the charts embedded here, next week.

Speaking of “Access/Accès”, here’s a recent post I wrote about access and license types in a set of articles from the Journal of Digital Scholarship in the Humanities. In case you missed it (you probably did), it might be of interest given this year’s theme.



Questions of Access in the Digital Humanities: Data from JDSH

[On 8 August 2017, this post was selected as Editor’s Choice in Digital Humanities Now.]

[N.B. As usual, typos might still be present when you read this; this blog post is likely to be revised post-publication… thanks for understanding. This blog is a sandbox of sorts].

For Domenico, always in your debt

tl;dr, scroll down to the charts

I used the Altmetric Explorer to locate any articles from the Journal of Digital Scholarship in the Humanities that had had any ‘mentions’ online at any time. An original dataset of 82 bibliographic entries was obtained. With the help of Joe McArthur, the Open Access Button API was then employed to detect whether any of the journal articles in the dataset had open access surrogates (for example, self-archived versions in institutional repositories) and, if so, which content they actually provided access to. The API located 24 URLs for the 82 DOIs corresponding to each article in the dataset.

I then edited and refined the original dataset to include only the top 60 results. Each result was manually refined and cross-checked to verify that the resulting links matched the correct outputs and what kind of content they provided access to, as well as to identify the type of license and type of access of each article’s version of record.

A breakdown of the findings below:

Visualisation of numeralia from the JDSH 60 Articles Altmetric-OA Button Dataset

(Note numbers re OA Button results will not add up as there are overlaps and some results belong to categories not listed).

It must be highlighted that only one of the links located via the Open Access Button API provided access to an article’s full version.

This disciplinarily-circumscribed example from a leading journal in the field of the digital humanities provides evidence for further investigations into the effects of publishers’ embargos on the ability of institutional open access repositories to fulfil their mission effectively.

The dataset was openly shared on figshare as

Priego, Ernesto (2017): A Dataset Listing the Top 60 Articles Published in the Journal of Digital Scholarship in the Humanities According to the Altmetric Explorer (search from 11 April 2017), Annotated with Corresponding License and Access Type and Results, when Available, from the Open Access Button API (search from 15 May 2017). figshare.


The Wordy Thing

Back in 2014, we suggested that “altmetrics services like the Altmetric Explorer can be an efficient method to obtain bibliographic datasets and track scholarly outputs being mentioned online in the sources curated by these services” (Priego et al 2014).  That time we used the Explorer to analyse a report obtained by searching for the term ‘digital humanities’ in the titles of outputs mentioned anytime at the time of our query.

It’s been three years since I personally presented that poster at DH2014 in Lausanne, but the topic of publishing practices within the digital humanities keeps being of great interest to me. It could be thought of as extreme academic navel-gazing, this business of deciding to look into bibliometric indicators and metadata of scholarly publications. For the digital humanities, however, questions of scholarly communications are questions of methodology, as the technologies and practices required for conducting research and teaching are closely related to the technologies and practices required to make the ‘results’ of teaching and research available. For DH insiders, this is closely connected to the good ol’ less-yacking-more-hacking, or rather, no yacking without hacking. Today, scholarly publishing is all about technological infrastructure, or at least about an ever-growing awareness of the challenges and opportunities of ‘hacking’ the modes of scholarly production.

Moreover, the digital humanities have also long been preoccupied with the challenges of getting digital scholarship recognised and rewarded, and, also importantly, with the difficulty of ensuring the human, technical and financial preconditions of sustainability. Scholarly publishing, or more precisely ‘scholarly communications’ as we prefer to say today, is also very much focused on those same concerns. If form and content are unavoidably interlinked and codependent in digital humanities practice, surely issues regarding the so-called ‘dissemination’ of said practice through publications remain vital to its development.

Anyway, I have now finally been able to share a dataset based on a report from the Altmetric Explorer looking into the articles published in the Journal of Digital Scholarship in the Humanities (from now on JDSH), one of the leading journals (if not the leading journal) in the field of digital humanities (it was previously titled Literary and Linguistic Computing). I first started looking into which JDSH articles were being tracked by Altmetric as mentioned online for the event organised by Domenico Fiormonte at the University Roma Tre in April this year (the slides from my participation are here).

My motivation was not only to identify which JDSH outputs (and therefore authors, affiliations, topics, methodologies) were receiving online attention according to Altmetric. I wanted, as we had done previously in 2014, to use an initial report to look into what kind of licensing said articles had, and whether they were ‘free to read’, paywalled or labelled with the orange open lock that identifies Open Access outputs.

Back in 2014 we did not have the Open Access Button nor its plugin and API. With it I had the possibility to try to check if any of the articles in my dataset had any openly/freely available versions through the Button. I contacted Joe McArthur from the Button to enquire whether it would be possible to run a list of DOIs through their API in bulk. It was, and we obtained some results.

Here’s a couple of very quick charts visualising some insights from the data.

It should also be highlighted that of the 6 links to institutional repository deposits found via the Open Access Button API, only one gave open access to the full version of the article. The rest were either metadata-only deposits or the full versions were embargoed.

As indicated above, the 60 ‘total articles’ refers to the number of entries in the dataset we are sharing. There are many more articles published in JDSH. The numbers presented represent only the data in question which is in turn the result of particular methods of collection and analysis.

In 2014 we detected that “the 3 most-mentioned outputs in the dataset were available without a paywall”, and we thought that could indicate “the potential of Open Access for greater public impact.” In this dataset, the three articles with the most mentions are also available without a paywall. The most mentioned article is the only one in the set that is licensed with a CC-BY license. The two that follow are ‘free’ articles that require permission for reuse.

The data presented is the result of the specific methods employed to obtain the data. In this sense this data represents as much a testing of the technologies employed as of the actual articles’ licensing and open availability. This means that data in columns L-P reflect the data available through the Open Access Button API at the moment of collection. It is perfectly possible that ‘open surrogates’ of the articles listed are available elsewhere through other methods. Likewise, it is perfectly possible that a different corpus of JDSH articles collected through other methods (for example, of articles without any mentions as tracked by Altmetric) have a different proportion of license and access types etc.

As indicated above the licensing and access type of each article were identified and added manually and individually. Article DOI’s were accessed one by one with a computer browser outside/without access to university library networks, as the intention was to verify if any of the articles were available to the general public without university library network/subscription credentials.

This blog post and the deposit of the data are part of a work in progress and are shared openly to document ongoing work and to encourage further discussion and analyses. It is hoped that quantitative data on the limited level of adoption of Creative Commons licenses and Institutional Repositories within a clearly-circumscribed corpus can motivate reflection and debate.


I am indebted to Joe McArthur for his kind and essential help cross-checking the original dataset with the OA Button API, and to Euan Adie and all the Altmetric team for enabling me to use the Altmetric Explorer to conduct research at no cost.

Previous Work Mentioned

Priego, Ernesto; Havemann, Leo; Atenas, Javiera (2014): Online Attention to Digital Humanities Publications (#DH2014 poster). figshare. Retrieved: 18:46, Aug 04, 2017 (GMT).

Priego, Ernesto; Havemann, Leo; Atenas, Javiera (2014): Source Dataset for Online Attention to Digital Humanities Publications (#DH2014 poster). figshare. Retrieved: 17:52, Aug 04, 2017 (GMT)

Priego, Ernesto (2017): Aprire l’Informatica umanistica / Abriendo las humanidades digitales / Opening the Digital Humanities. figshare. Retrieved: 18:00, Aug 04, 2017 (GMT)