PathOS - D2.1 - D2.2 - Open Science Indicator Handbook

S. Apartis; G. Catalano; G. Consiglio; R. Costas; E. Delugas; M. Dulong de Rosnay; I. Grypari; I. Karasz; Thomas Klebel; E. Kormann; N. Manola; H. Papageorgiou; E. Seminaroti; P. Stavropoulos; L. Stoy; V.A. Traag; T. van Leeuwen; T. Venturini; S. Vignetti; L. Waltman; T. Willemse

doi:10.5281/zenodo.8305626

Authors	Affiliations
L. Stoy	Technopolis Group
I. Karasz	Technopolis Group
E. Seminaroti	Technopolis Group
E. Delugas	Centre for Industrial Studies

Labour market impact of OS

History

Version	Revision date	Revision	Author
0.5	30-04-2024	Second Draft	Lennart Stoy (TGB), Elisa Seminaroti (TGB), Erica Delugas (CSIL)
0.4	25-04-2024	Peer Review	V.A. Traag
0.3	09-04-2024	Revised draft	Lennart Stoy (TGB), Elisa Seminaroti (TGB), Erica Delugas (CSIL, reviewer)
0.2	15-03-2024	First full draft	Lennart Stoy (TGB), Elisa Seminaroti (TGB)
0.1		Labour efficiency indicator template	Istvan Karasz, Technopolis Consulting Group

Description

Open Science (OS) is increasingly recognised as having direct effects on the skills and competence needs of researchers and research support staff. For example, academic libraries and research infrastructures need skills and competences for research data management (O’Carroll et al. 2017; Manola et al. 2021; “Digital competencies urgently needed! October 2019,” n.d.). The new role of Research Software Engineer is also gaining traction internationally.^[1] The changing requirements are also well documented in the case of academic libraries (McCaffrey et al. 2020). Open Science is thus integrated into European career and competence frameworks for researchers, such as ResearchComp (Almerud et al. 2022). While such skills initially emanate from public research organisations, they may spill over into industry, both because companies need them to make use of Open Science resources and because professionals move into non-academic jobs and transfer their knowledge, skills, and competences.

In addition, the creation and use of Open Science resources may have indirect impacts on the labour market at large, although this remains speculative at this stage. On the one hand, effects such as reduced access and transaction costs achieved through Open Access to research results, enhanced findability and interoperability of resources, or the potential reduction of double funding may influence the demand for specific job profiles in both academic and non-academic sectors. In the long term, this evolution could lead to the displacement of occupations and job profiles that have become obsolete. On the other hand, using or creating Open Science resources could also spur demand for new skill sets and eventually trigger the emergence of new occupational profiles. It is thus equally possible that the need for profiles equipped with Open Science skills will increase across all sectors, to make better use of the results and opportunities, which may be facilitated by cost savings achieved through Open Science and by other effects.

At the moment, it is difficult to establish whether such effects exist, partly because of the challenge of isolating the causal effects of Open Science. The Cost of Non-FAIR study (Cost-benefit analysis for FAIR research data: cost of not having FAIR research data 2018) identified potential economic growth, including job creation, as one potential impact of FAIR data, but stopped short of providing a detailed indicator or metric for this effect.

Considering the possibilities currently available for measuring such effects, this document describes two potential approaches:

The qualitative change in the skills and competences related to Open Science that are identified in job profiles. This metric can be broken down by sector, e.g., industry, academia, public research, etc.
The quantitative change in job profiles requiring skills and competences related to Open Science. This metric may build on the first one and can likewise be broken down by sector, e.g., industry, academia, public research, etc. In addition, specific metrics may stem from monitoring exercises on Open Science training and skills.

These are not full-fledged instructions that can be implemented immediately. They are suggestions for approaches, with some pointers to potential methods and data sources. Any of them needs to be applied carefully, in the specific context of the research question at hand.

Metrics

New Open Science skills and competences

Open Science requires specific skills and competences. These qualitative changes can be identified by investigating job descriptions and the types of skills and competences they require. Job descriptions can be obtained from platforms such as LinkedIn, IEEE Jobs, or indeed.com, or directly from organisations recruiting specific profiles. This is a relatively well-established practice that has already been applied in the context of data science and research data management.

Measurement

A tested methodology was employed by the EDISON Project (“Data Science Competence Framework (CF-DS) | Edison Project,” n.d.), which collected job advertisements in the context of data science to develop a body of knowledge and a competence framework for data science profiles. A similar approach was used during the FAIRsFAIR project to update the EDISON framework with specific insights on (FAIR) data skills and competences relevant to the “data steward” job title.^[2] However, the application of this method in practice has so far been limited to data science and (FAIR) data management skills. In principle, these approaches can be applied to other Open Science skills as well, although this would likely require additional work in mapping specific competences, skills, and knowledge.

Existing data sources:

Job descriptions from employment platforms and company pages

Recruitment platforms such as LinkedIn, IEEE Jobs, indeed.com and others can be used to collect job advertisements. Several platforms offer research access (see for example https://www.linkedin.com/legal/l/research-api-terms). Employment platforms also publish analyses themselves. For example, a recent analysis of LinkedIn profiles found “data steward” to be the second-fastest growing job in the Netherlands.^[3] Alternatively, job descriptions can also be collected directly from company pages.

There are pros and cons to both approaches. Collecting information from company websites can be time consuming; it requires a sampling strategy, for example covering a specific industry in a specific country, as well as a clear overview of, and website access to, the relevant job descriptions. Using information from employment platforms might offer easier access to large amounts of data, but there can be access restrictions for the platform and the information itself.

Lightcast data

Another viable option for studying skill trends is to access Lightcast data. Lightcast (formerly known as EMSI Burning Glass) is a company that specialises in labour market data analysis and economic modelling. Lightcast provides comprehensive data and insights that are used by educational institutions, employers, and regional planners to understand and forecast labour market trends, skill needs, and economic conditions. The company has a robust methodology for mapping detailed skill requirements from job postings and resumes to specific occupations. This helps in identifying emerging skill needs and the evolution of roles across different industries. These data have been used in several scientific publications to study, e.g., digitalisation and the demand for AI skills in jobs (O’Kane et al. 2020; Squicciarini and Nachtigall 2021).

Stakeholder surveys or interviews

Besides the analysis of job descriptions available from employment and company websites, information can also be collected from relevant organisations or professionals directly. This can be done by surveying or interviewing, for example, companies, professional organisations, or employees directly. Surveys can ask about the profiles a company is hiring, about professionals' direct experience of changing requirements, or about the changing landscape in a given sector.

Surveying and interviewing come with their own advantages and disadvantages. As with collecting data from company websites, these activities can be time consuming and require a sampling strategy, for example covering a specific industry or sector in a specific country. In addition, collecting contact information can be time consuming and response rates may be low.

Existing methodologies

EDISON and FAIRsFAIR

Developing an approach for Data Science competences

The EDISON project (https://edison-project.eu) aimed to develop the Data Science profession through a new competence framework, the EDISON Data Science Framework (EDSF). EDSF sought to help universities shape the content and design of Data Science programmes, and to help companies define the competences and skills required of Data Scientists in their specific domain. As part of EDSF, the project developed a Data Science Competence Framework (CF-DS), which defines the competence groups required for data science.

To develop the data science competence framework and an underlying body of knowledge, the EDISON project collected job descriptions from job advertisement portals in the context of data science. The proposed competences originate from those initially identified through the analysis of the job market, which were then reviewed based on feedback received from experts, universities, professional training organisations, etc.

Extension by FAIRsFAIR to Research Data Management

The FAIRsFAIR project (https://www.fairsfair.eu) built on EDISON to foster the development of FAIR data competences with specific insights on (FAIR) data skills and competences. The project developed a framework, the FAIR Competence Framework for Higher Education (FAIR4HE), which outlines the competences for data stewardship and data science to be integrated into university curricula. As in EDISON, the FAIRsFAIR project also conducted a job market analysis, pinpointing the skills and competences required for data stewards.

The project first analysed competences, skills and knowledge topics available in grey literature and other documents. The next step was the collection and analysis of data steward job vacancies on several portals. Requirements from the job descriptions were then identified and linked to the competence groups outlined by the EDISON CF-DS. FAIRsFAIR used a text extraction approach to identify the required competences, skills and knowledge, as well as qualification level and education. These were then mapped against the skills and knowledge groups of the EDISON Data Science Framework; new competence and skills groups for Research Data Management were added as necessary.

These activities can draw on the competence groups updated by FAIRsFAIR for research data management, but also on other frameworks, such as the ELIXIR Data Stewardship Competence Framework (DSP4LS), the EOSCpilot FAIR4S Data Stewardship Competence Framework, the COAR Librarian’s Competencies for E-Research and Scholarly Communication, and others.
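The step of linking requirements from job descriptions to competence groups can be illustrated with a minimal sketch. It assumes simple substring matching; the group names, trigger terms, and sample vacancies below are hypothetical and are not taken from the EDISON or FAIRsFAIR frameworks, and the projects' actual text extraction approach may have differed.

```python
# Illustrative competence groups and trigger terms. These are assumptions
# made for this sketch, not the actual EDISON or FAIRsFAIR definitions.
COMPETENCE_TERMS = {
    "data management": ("data management plan", "metadata", "repository"),
    "data analytics": ("statistics", "machine learning"),
    "data governance": ("data policy", "gdpr", "licensing"),
}


def map_to_groups(description, terms=COMPETENCE_TERMS):
    """Return the sorted competence groups whose terms occur in the text."""
    lowered = description.lower()
    return sorted(
        group
        for group, triggers in terms.items()
        if any(trigger in lowered for trigger in triggers)
    )


def unmatched(descriptions, terms=COMPETENCE_TERMS):
    """Return descriptions matching no group, as candidates for new groups."""
    return [text for text in descriptions if not map_to_groups(text, terms)]


# Invented vacancy texts, for illustration only.
vacancies = [
    "Maintain metadata standards and advise on GDPR compliance",
    "Train researchers in software carpentry",
]

groups = map_to_groups(vacancies[0])
gaps = unmatched(vacancies)
```

Vacancies that match no existing group are returned separately, which loosely mirrors the idea of adding new competence and skills groups where the existing framework does not cover a requirement. Substring matching is crude and will miss paraphrases, so manual review of the results would still be needed.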

Number of new job positions

The measurement introduced above shows that it is possible to identify the qualitative change in job descriptions and that this can be applied to Open Science. With this information, it is possible not just to identify the qualitative change of profiles but also to describe, and possibly monitor, the change in quantitative terms. While this is not common practice for Open Science skills so far, we describe two different approaches which could extend the metric with a quantitative element. The metric can be applied to different sectors, such as private companies, the public sector, or academia.

From a different angle, the number of new employees within a company has been proposed as a metric for the “economic growth of companies” induced by Open Science. In this case, the metric might be obtained by comparing the growth trajectories of companies that use and/or benefit from Open Science practices with those of companies that do not. Here, a benchmarking approach might be suitable, similar to other metrics proposed for the indicators on economic growth. However, this metric is so far untested and should be understood as a suggestion, not as a fully developed and tested research method for this indicator.
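As a purely illustrative sketch of such a benchmarking comparison, the following Python code contrasts average headcount growth between two groups of companies. The field names (`uses_os`, `staff_start`, `staff_end`) and all figures are invented for the example. The result is a descriptive difference only and does not isolate a causal effect of Open Science.

```python
from statistics import mean


def growth_rate(start, end):
    """Relative headcount change between two observation points.

    Assumes a non-zero starting headcount.
    """
    return (end - start) / start


def mean_growth_by_group(companies):
    """Average headcount growth for Open Science users and non-users."""
    rates = {True: [], False: []}
    for company in companies:
        rates[company["uses_os"]].append(
            growth_rate(company["staff_start"], company["staff_end"])
        )
    # Skip a group entirely if no company falls into it.
    return {uses_os: mean(values) for uses_os, values in rates.items() if values}


# Invented figures, for illustration only.
companies = [
    {"uses_os": True, "staff_start": 100, "staff_end": 120},
    {"uses_os": True, "staff_start": 50, "staff_end": 55},
    {"uses_os": False, "staff_start": 80, "staff_end": 84},
    {"uses_os": False, "staff_start": 200, "staff_end": 210},
]

averages = mean_growth_by_group(companies)
gap = averages[True] - averages[False]
```

In practice, the two groups would need to be comparable in size, sector, and country for the difference in `gap` to be informative, and deciding which companies count as using Open Science is itself a methodological choice.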

Measurement

Building on the previous metric, it is feasible to classify job advertisements according to the Open Science skills, competences, and knowledge they require. The change can then be quantified, e.g., by tracking how often such requirements occur in the collected data. As noted above, classifications for skills, competences, and knowledge exist in the context of Data Science and Research Data Management. These can be used to classify the requirements, either manually or automatically (e.g., based on keywords). The collected data can be compared, for example, over a specific time horizon or across different sectors or countries.
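A minimal sketch of the automatic, keyword-based variant is given below. It flags advertisements that mention any term from a keyword list and computes the yearly share of flagged advertisements. The keyword list and the sample records are assumptions made for illustration; a real study would derive its terms from one of the competence frameworks discussed above.

```python
from collections import Counter

# Hypothetical keyword list; a real study would derive terms from a
# competence framework rather than from this illustrative set.
OS_KEYWORDS = (
    "open science",
    "open access",
    "fair data",
    "research data management",
    "data steward",
)


def mentions_open_science(text, keywords=OS_KEYWORDS):
    """Return True if the advertisement text contains any keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def share_by_year(adverts, keywords=OS_KEYWORDS):
    """Map each year to the share of adverts that mention a keyword."""
    totals, hits = Counter(), Counter()
    for advert in adverts:
        totals[advert["year"]] += 1
        if mentions_open_science(advert["text"], keywords):
            hits[advert["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Invented sample records, for illustration only.
sample_adverts = [
    {"year": 2022, "text": "Data steward for FAIR data services"},
    {"year": 2022, "text": "Laboratory technician, cell biology"},
    {"year": 2023, "text": "Librarian with Open Access experience"},
    {"year": 2023, "text": "Research data management officer"},
]

shares = share_by_year(sample_adverts)
```

Reporting shares rather than raw counts helps when the total number of collected advertisements differs between years. The same grouping could be applied to a sector or country field instead of the year.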

Existing data sources:

Job descriptions from employment platforms and company pages

Recruitment platforms such as LinkedIn, IEEE Jobs, indeed.com and others can be used to collect job advertisements and calculate the number of new job positions with an Open Science dimension. Several platforms offer research access (see https://www.linkedin.com/legal/l/research-api-terms). Employment platforms also publish analyses themselves; for example, a recent LinkedIn analysis found “data steward” to be the second-fastest growing job in the Netherlands.^[4] Alternatively, job advertisements can be collected from company pages.

There are pros and cons to both approaches. Collecting information from company websites can be time consuming and requires a sampling strategy, for example covering a specific industry in a specific country. Using information from employment platforms might offer easier access to large amounts of data, but there can be access restrictions for the platform and the information itself.

Stakeholder surveys or interviews

Besides the analysis of job descriptions available from employment and company websites, information can also be collected from relevant organisations or professionals directly. This can be done by surveying or interviewing, for example, companies, professional organisations, or workers directly. Surveys can ask about the number of OS-related positions within a company, but also about other aspects such as the profiles a company is hiring, professionals' direct experience of changing requirements, or the changing landscape in a given sector.

Surveying and interviewing come with their own advantages and disadvantages. As with collecting data from company websites, these activities can be time consuming and require a sampling strategy, for example covering a specific industry or sector in a specific country. In addition, collecting contact information can be time consuming. Responses are not guaranteed.

Existing methodologies

References

Almerud, Mikaela, Maria Ricksten, Gareth O’Neill, Inge van der Weijden, Wolfgang Kaltenbrunner, Lidia Núñez, and An De Coen. 2022. Knowledge ecosystems in the new ERA: using a competence based approach for career development in academia and beyond. Publications Office of the European Union. https://data.europa.eu/doi/10.2777/150763.

Cost-benefit analysis for FAIR research data: cost of not having FAIR research data. 2018. Publications Office of the European Union. https://data.europa.eu/doi/10.2777/02999.

“Data Science Competence Framework (CF-DS) | Edison Project.” n.d. https://edison-project.eu/data-science-competence-framework-cf-ds/.

Demchenko, Yuri, Lennart Stoy, Claudia Engelhardt, and Vinciane Gaillard. 2021. “D7.3 FAIR Competence Framework for Higher Education (Data Stewardship Professional Competence Framework),” February. https://zenodo.org/records/5361917.

“Digital competencies urgently needed! October 2019.” n.d. https://rfii.de/download/digital-competencies-urgently-needed-october-2019/.

Manola, Natalia, Emma Lazzeri, Michelle Barker, Iryna Kuchma, Vinciane Gaillard, and Lennart Stoy. 2021. Digital skills for FAIR and Open Science: report from the EOSC Executive Board Skills and Training Working Group. Publications Office of the European Union. https://data.europa.eu/doi/10.2777/59065.

McCaffrey, Ciara, Thorsten Meyer, Clara Riera Quintero, Cecile Swiatek, Nathalie Marcerou-Ramel, Camilla Gillén, Karin Clavel, et al. 2020. “Open Science Skills Visualisation,” March. https://zenodo.org/records/3702401.

O’Carroll, Conor, Berit Hyllseth, Rinske van den Berg, Ulrike Kohl, Caroline Lynn Kamerlin, Niamh Brennan, and Gareth O’Neill. 2017. Providing researchers with the skills and competencies they need to practise Open Science. Publications Office of the European Union. https://data.europa.eu/doi/10.2777/121253.

O’Kane, Layla, Rohit Narasimhan, Julia Nania, and Bledi Taska. 2020. “Digitalization in the German Labor Market: Analyzing Demand for Digital Skills in Job Vacancies. Bertelsmann Stiftung July 2020.” http://aei.pitt.edu/id/eprint/103213.

Squicciarini, Mariagrazia, and Heike Nachtigall. 2021. “Demand for AI Skills in Jobs: Evidence from Online Job Postings.” https://www.oecd-ilibrary.org/science-and-technology/demand-for-ai-skills-in-jobs_3ed32d94-en.

Reuse

Citation

BibTeX citation:

@online{apartis2023,
  author = {Apartis, S. and Catalano, G. and Consiglio, G. and Costas,
    R. and Delugas, E. and Dulong de Rosnay, M. and Grypari, I. and
    Karasz, I. and Klebel, Thomas and Kormann, E. and Manola, N. and
    Papageorgiou, H. and Seminaroti, E. and Stavropoulos, P. and Stoy,
    L. and Traag, V.A. and van Leeuwen, T. and Venturini, T. and
    Vignetti, S. and Waltman, L. and Willemse, T.},
  title = {PathOS - {D2.1} - {D2.2} - {Open} {Science} {Indicator}
    {Handbook}},
  date = {2023},
  url = {https://handbook.pathos-project.eu/indicator_templates/quarto/4_economic_impact/labour_market_impacts.html},
  doi = {10.5281/zenodo.8305626},
  langid = {en}
}

For attribution, please cite this work as:

Apartis, S., G. Catalano, G. Consiglio, R. Costas, E. Delugas, M. Dulong de Rosnay, I. Grypari, et al. 2023. “PathOS - D2.1 - D2.2 - Open Science Indicator Handbook.” Zenodo. 2023. https://doi.org/10.5281/zenodo.8305626.
