Productivity
Description
In general, productivity estimates the amount of output relative to the amount of input. In the context of academia, outputs can take various forms, ranging from publications to data, code, or peer reviews. Although productivity is of interest in its own right, it should usually be considered jointly with quality. That is, higher productivity may simply mean more, but lower-quality, outputs. There is some evidence of such an effect (Butler 2003), although this evidence is also disputed (Besselaar, Heyman, and Sandström 2017).
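As a simple, hypothetical illustration of this ratio of output to input: a group that produces 10 publications with an input of 5 full-time equivalent researchers has a productivity of

productivity = output / input = 10 / 5 = 2 publications per full-time equivalent.

What such a number means depends on which outputs (publications, data, code, peer reviews) and which inputs (people, time, funding) are counted.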
Output is usually only measured for a limited set of objects, with scholarly publications being the most typical example. Nonetheless, other relevant outputs should not be ignored, and the limitations of publication-based productivity should be kept in mind. Moreover, we should be aware of potential differences between productivity at the individual level and at the collective level. For instance, consider a research group in which one individual is tasked with data quality assurance and code review. That individual may have a lower productivity in terms of publication output, yet their activities benefit the other researchers in the group, whose productivity might greatly increase as a result (Tiokhin et al. 2023).
In addition, one aspect of productivity that is usually missing is the overall input (Abramo and D’Angelo 2016). That is, we typically do not know how many people are employed at a certain institution. Even if part of that becomes visible in authorships, not every employee’s contribution will become visible in authorship. Hence, institutions that have, for example, more research assistants who are not acknowledged as authors may seem to have relatively few authors, while in reality many more people are active at the institution. Moreover, even if we know that a particular author is affiliated with a certain institution, we do not know how much time they spend at that affiliation, which is particularly challenging in the case of multiple affiliations. Going one step further, the input could also be specified in financial terms. Unfortunately, none of this data is typically available (Waltman et al. 2016). Nonetheless, this is an important limitation to take into account when considering productivity.
Datasources
OpenAlex
OpenAlex covers publications based on previously gathered data from Microsoft Academic Graph, but mostly relies on Crossref to index new publications. OpenAlex offers a user interface that is still under active development, an open API, and the possibility to download the entire data snapshot. The API is rate-limited, but there is the option of a premium account. Documentation for the API is available at https://docs.openalex.org/.
It is possible to retrieve the number of authors for a particular publication in OpenAlex, for example by using a third-party package for Python called pyalex.
import pyalex as alx

# Identify yourself to the OpenAlex API (polite pool)
alx.config.email = "mail@example.com"

# Retrieve a single publication by its OpenAlex identifier
w = alx.Works()["W3128349626"]

# Authors, institutions and countries are nested in the authorship records
authors = [a["author"] for a in w["authorships"]]
institutions = [a["institutions"] for a in w["authorships"]]
countries = [a["countries"] for a in w["authorships"]]
n_authors = len(authors)
Based on this type of data, the above-mentioned metrics can be calculated. When large amounts of data need to be processed, it is recommended to download the full data snapshot and work with it directly.
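As a minimal sketch of such a calculation via the API, the number of publications of a single author can be counted as follows. The author identifier below is a hypothetical placeholder, and we assume a recent version of pyalex that supports count() and group_by().

import pyalex as alx

alx.config.email = "mail@example.com"

# Hypothetical OpenAlex author identifier; replace with a real one
author_id = "A5023888391"

# Total number of works listing this author (OpenAlex filter author.id)
n_publications = alx.Works().filter(author={"id": author_id}).count()

# Publication counts per year, e.g. to study productivity over time
per_year = alx.Works().filter(author={"id": author_id}).group_by("publication_year").get()

The same type of filter can be applied to institutions or countries; for large-scale analyses, working with the full snapshot remains preferable.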
OpenAlex provides disambiguated authors, institutions, and countries. Institutions are matched to the Research Organization Registry (ROR), and countries may be available even when no specific institution could be identified.
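For example, the ROR identifiers and country codes can be read from the authorship records of a work. The sketch below assumes the current structure of OpenAlex work records and uses the same example work as above.

import pyalex as alx

alx.config.email = "mail@example.com"
w = alx.Works()["W3128349626"]

# ROR identifiers of the disambiguated institutions in each authorship
rors = [inst["ror"] for a in w["authorships"] for inst in a["institutions"]]

# Country codes are recorded per authorship, even when no institution was matched
country_codes = [c for a in w["authorships"] for c in a["countries"]]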
Dimensions
Dimensions is a bibliometric database that takes a comprehensive approach to indexing publications. It offers limited free access through its user interface. Access to its API and to its database via Google BigQuery can be arranged for a fee. It also offers the possibility to apply for access to the API and/or Google BigQuery for research purposes. The API is documented at https://docs.dimensions.ai/dsl.
The database is closed access, and we therefore do not provide more details about API usage.
Scopus
Scopus is a bibliometric database with relatively broad coverage. Its data is closed and is generally available only through a paid subscription. It does offer the possibility to apply for access for research purposes through the ICSR Lab. Some additional documentation of its metrics is available at https://www.elsevier.com/products/scopus/metrics, in particular in the Research Metrics Guidebook; documentation for the dataset available through the ICSR Lab is provided separately.
The database is closed access, and we therefore do not provide more details about API usage.
Web of Science
Web of Science is a bibliometric database that takes a more selective approach to indexing publications. Its data is closed and is available only through a paid subscription.
The database is closed access, and we therefore do not provide more details about API usage.
References
Reuse
Citation
@online{apartis2024,
author = {Apartis, S. and Catalano, G. and Consiglio, G. and Costas,
R. and Delugas, E. and Dulong de Rosnay, M. and Grypari, I. and
Karasz, I. and Klebel, Thomas and Kormann, E. and Manola, N. and
Papageorgiou, H. and Seminaroti, E. and Stavropoulos, P. and Stoy,
L. and Traag, V.A. and van Leeuwen, T. and Venturini, T. and
Vignetti, S. and Waltman, L. and Willemse, T.},
title = {Open {Science} {Impact} {Indicator} {Handbook}},
date = {2024},
url = {https://handbook.pathos-project.eu/sections/2_academic_impact/productivity.html},
doi = {10.5281/zenodo.14538442},
langid = {en}
}