TOPICS

The RfII focuses on the topics of “Research Data – Sustainability – Internationality” in its first term of office. A series of recommendations and information resources highlight various facets of these topics.

Research Data Management

When the RfII was founded in 2014, several good examples of research data management existed in Germany, but they often took the form of parallel, project-based initiatives. Universal access to services for research data management was lacking, and there was a clear need for action in a variety of areas. For this reason, the first work programme of the RfII focused on research data management. The comprehensive position paper PERFORMANCE THROUGH DIVERSITY (2016) mapped the complex situation in Germany and offered clear, detailed diagnoses. The recommendations ranged from adjustments with a short-term effect (e.g. the evaluation of external funding applications) to strategic development tasks with a long-term effect, such as long-term financing options for services currently financed as projects and the establishment of a National Research Data Infrastructure (NFDI). The latter was subsequently realized by the German Federal Government and the Länder. The process of establishing the NFDI governance was completed in 2020, while the selection process for consortia is still ongoing.

In 2017, the RfII commented on the special requirements of research with personal data in a white paper on Data Security and Research Data. In the interest of science, the RfII advocates a sense of proportion in implementing the European General Data Protection Regulation (GDPR) so that the scientific use of personal data is facilitated, and it offers concrete suggestions to this end, including the establishment of data trusteeships.

Regarding international developments, the first report, AN INTERNATIONAL COMPARISON of the Development of Research Data Infrastructures (2017), depicts parallel formation processes and the paths taken in Canada, Great Britain, the Netherlands, and Australia. Based on these examples, the RfII also derived a series of suggestions for the recommended establishment of the NFDI in Germany (as mentioned above). The RfII has also monitored the complex pan-European developments, in particular the establishment of the European Open Science Cloud (EOSC), and has successfully fed several white papers into international discourses. The next comparative report, with a slightly different focus in terms of countries, is expected in 2022.

In 2019, the RfII responded to the amendment of the European Open Data Directive with a white paper on current developments concerning Open Data and Open Access, and took a position on the “transition to openness” in the scientific publication system.

Recommendations of the RfII

Performance through Diversity. Recommendations regarding structures, processes, and financing for research data management in Germany, Göttingen, 2016, 90 p.

An international comparison of the development of research data infrastructures. Report and Suggestions, Göttingen, 2017, 53 p.

Statement of the Council for Scientific Information Infrastructures (RfII) on current developments concerning Open Data and Open Access, Göttingen 2019, 8 p.

For more see

DOCUMENTS

The National Research Data Infrastructure (NFDI)

The primary recommendation of the RfII in its position paper PERFORMANCE THROUGH DIVERSITY (2016) was to establish a coordinated research data infrastructure for Germany (NFDI). The RfII suggested following a long-term development path with a new kind of federated organizational architecture. The landscape of data infrastructures in science, so far poorly coordinated and not coherently funded, should thus be steered in a more efficient and more cooperative direction. Systematisation of databases, easily accessible research data, and continuous development of the services should strengthen the position of research in Germany as well as its global competitiveness.

The NFDI is conceived as a collaborative, nationwide network that will expand step by step. It is intended to provide reliable and sustainable services that cover the generic and discipline-specific requirements of research data management in Germany. The NFDI will be established in stages, and its creation will be driven by science. Its services will be available to researchers across disciplines, institutions, and federal states.

The programme started on 1 January 2019, and the first consortia officially started work in October 2020. During the negotiations of the Joint Science Conference in 2018 and 2019, the RfII published a series of discussion papers that outline the basic concept of the NFDI in order to stimulate an early discourse in the scientific community. While the operation of the NFDI is now in the hands of the stakeholders, the RfII continues to observe its development. With both the NFDI and the EOSC in mind, the RfII published the discussion paper BUILDING SUSTAINABLE DATA SERVICES in 2020.

Further information on the NFDI, as well as an overview of the actors and their information services, can be found at https://www.nfdi.de/.

RfII Discussion Papers

Building Sustainable Data Services. RfII Discussion Paper on the Enhancement of Research Data Infrastructures, Göttingen 2020, 6 p.

Wide Impact for Research: NFDI Consortia as Stakeholders – Third Discussion Paper on the Development of a National Research Data Infrastructure (NFDI) in Germany, Göttingen 2018, 5 p.

Cooperation as an Opportunity – Second Discussion Paper on the Development of a National Research Data Infrastructure (NFDI) in Germany, Göttingen 2018, 4 p.

For more see

DOCUMENTS

DATA QUALITY

Particularly high standards are applied to data from scientific research. Excellent science demands high quality throughout the entire data lifecycle. In its position paper The Data Quality Challenge (2019), the RfII therefore highlights the task of scientific quality assurance of research data and presents ideas on how it should be anchored as a core methodological task in research practice. The development of quality-assured data products can help to strengthen knowledge transfer. The RfII advises universities and non-university research institutions to incorporate the improvement of data quality and issues of data quality assurance into their research strategies. Research funding should also provide appropriate incentives and an adequate time frame to raise awareness of data quality. The recommendations on funding policies also include support for innovative data products and their impact on researcher reputation. The RfII calls on the Federal Government and the Länder not to let up in their efforts to create prospects for the further development of research data infrastructure and to remain attentive to infrastructures and services beyond the NFDI.

The RfII regards “data quality” as a topic that, more than others, must be actively promoted. For this reason, the council stimulates an overarching discourse on the challenges described in the position paper. To this end, the RfII (with support from the Volkswagen Foundation) organized an interdisciplinary conference at Schloss Herrenhausen in Hanover on 27 and 28 February 2020, which was very well attended and attracted wide attention.

RFII DISCUSSION PAPERS

The Data Quality Challenge. Recommendations for Sustainable Research in the Digital Turn, Göttingen 2020, 120 p.


SKILLS DEVELOPMENT

The increasing digitisation of labour, and of scientific labour in particular, has triggered a strong demand for experts in data management and data analysis, as well as for new methodological skills in the respective research domains. At the same time, there is a lack of qualification and continuing training opportunities tailored to the epistemological and technical requirements of science. In its paper DIGITAL COMPETENCIES – URGENTLY NEEDED!, published in 2019, the RfII therefore focused on the scientific labour market. Based on changing tasks, processes, and organisational forms, the council considers the challenges in this sector. Regarding the increasing “scientification” of research-related support tasks, the RfII recommends, for example, closer interaction between infrastructure staff and research staff. In view of the competition in the labour market, the RfII suggests that scientific institutions should act as a joint network of employers with common interests in the development of human resources. More emphasis should be placed on collaboratively organised, science-specific solutions to the shortage of expertise. For training and professionalisation within the science sector, it proposes building qualification alliances and promoting a campaign for further qualification that also includes the management level. The RfII holds that both universities and non-university research institutions, as teaching and learning organisations, are excellently placed to offer qualification for a digitally skilled workforce. However, they are required to actively fulfil their roles as training facilities, employers, and developers of personnel. In addition, (public) employment in the scientific sector needs to become more attractive and competitive. For this purpose, the support of the Federal Government and the Länder is needed.

RFII DISCUSSION PAPERS

Digital competencies – urgently needed! – Recommendations on career and training prospects for the scientific labour market, Göttingen 2019, 56 p.


KEY TERMS

The problems related to the topics covered by the RfII are often complex and subject to numerous conditions. For this reason, the RfII will provide explanations of important terms at regular intervals. The use of consistent and coherent terminology is indispensable for a critical discourse.

[data quality]

This explanation of terms was revised and adopted by the RfII in 2017.

The term “data quality” refers to general, typical properties of the data itself, including those required due to the methodology, as well as its suitability for further use after the application of appropriate quality assurance measures. The evaluation of the data quality is based on the requirements to be defined for the data, which in turn depend on the research question, and therefore on how the data will be used to obtain research results. These requirements concern the accuracy of measured values, the reliability of a result obtained empirically, the completeness or currency of the data, and the documentation on how the data was acquired and stored. In addition, sustainability aspects are inextricably intertwined with the evaluation of the actual quality of the data. Such aspects include the properties of the data, the transferability of the data, the life expectancies of data media, etc. They affect in particular the preservation of research data for use in the future by science, business, and society; ideally in many different and possibly even currently unknown ways. In terms of further use (“reuse”), data quality is determined by the ease with which databases and data collections can be searched and data found, as well as by whether or not they contain enough additional information. This additional information should be available, if possible, in the form of standardised technical and scientific metadata regarding quality aspects, provide information on how the data was generated and processed, and state which tools and methods were used. A prerequisite for the traceability and, if possible, reuse of digital research findings is that the corresponding data is fully documented in terms of the data models on which it is based (vocabularies used, formats, etc.) and the methods used to acquire it (e.g. measuring instruments, surveys, algorithms, etc.).

Wherever possible, not only the metadata, but also additional and possibly even special documentation should follow recognized and available standards. The availability, accessibility, and citability of research data, including over the long term, are in turn quality aspects of the information infrastructures and services that allow the data to be stored securely, located quickly (retrieval), accessed, and reused (also in the context of long-term archiving). The clarification of the legal framework conditions under which the data can be used in connection with information infrastructure services is also a component of data quality.

RfII (2017) – Arbeitsthema Datenqualität (unpublished), p. 11

[information infrastructures (RfII), e-infrastructures (EU)]

Information infrastructures are technically and organisationally networked services and facilities for accessing and maintaining databases, information bases, and knowledge bases. In the context of the RfII’s advisory work, they primarily serve research purposes, are often objects of research, and always function as enablers.

Information infrastructures must always take into account that knowledge bases in universities, research facilities, archives, libraries, and museums are available in purely analogue or digital form or in a combination of analogue and digital forms. The purpose of the digitisation of analogue knowledge bases is to integrate and merge digitised data and native digital data into uniform, integrated work environments with the goal of achieving dynamic knowledge integration. Like the term ‘e-infrastructures’, the term ‘information infrastructures’ commonly encountered in Germany is also increasingly being used to refer to the digital information and communication technologies employed in research.

The performance of digital information infrastructures depends significantly on the amount invested in digitising the content, user-friendly access methods, technical features, international standards, and effective tools. The level of information literacy of the users and personnel and the associated quality of the custom services provided are equally relevant.

[research data, research data management]

This explanation of terms was revised and adopted by the RfII in 2017.

RESEARCH DATA does not consist solely of the (final) results of research. Rather, it comprises all data generated in the course of scientific activity through measurements and through selecting, preparing, collecting, and storing information, including the large amounts of data used for documentation purposes in scientific projects. Data that is not obtained through direct scientific activity, but that science uses for research purposes and that forms the methodological foundation of a specific research process, is also research data. This is the case, for example, when official statistics, other data from public authorities, or products from non-scientific service providers are processed scientifically. Research data also includes the research tools used as well as the continuously generated traces of scientific activity, i.e. the process data produced automatically and in large quantities through digital research. This is important wherever research processes and research data are documented and archived for quality assurance purposes, and wherever this is advisable for reasons of sustainability or for the purpose of historical research.

In actual research, it is possible to differentiate, although not always clearly, between research data and METADATA. Metadata documents the process through which the research data was created and provides it with a context. In the research process, metadata can itself become the object of research, which is significant in terms of the life cycle of research data.

RESEARCH DATA MANAGEMENT includes all measures, even organisational measures extending beyond research activity in the narrow sense, that need to be taken in order to obtain high-quality data, to follow good scientific practice within the data life cycle, to make results reproducible, and, where applicable, to fulfil existing documentation requirements (e.g. in the health care sector). The availability (possibly across different domains) of data for reuse is an important issue, and data management plans are increasingly being used by scientific institutions. Data management plans, which are developed and written at the beginning of a project or are the result of a research project, are intended to describe the data to be used and generated as well as the documentation, metadata, and standards required, state potential legal restrictions (e.g. data protection) early on, plan the necessary storage resources, and specify the criteria that determine which data should be made available externally, in which form, and how it could be stored in the long term. At the organisational level, research facilities (e.g. universities) must ensure access to the corresponding infrastructure services, either within the facility (e.g. by creating new capacities or expanding existing ones) or in cooperation with external partners (through cooperation agreements, etc.). In this context, organisations should also actively work towards the overall goal of enabling the use of data across domains and scientific communities.

Additional terms and previous versions

Begriffsklärungen: Bericht des Redaktionsausschusses Begriffe an den RfII (RfII Berichte No. 1), Göttingen 2016, 31 p. (German only)

Some of the definitions have been translated into English; see the appendix to the position paper Performance through Diversity (2016), pp. 71–82.