INF

Information infrastructure project

The information infrastructure project (INF) supports all projects in collecting, preprocessing and sharing their research data, and in developing and publishing efficient implementations of their statistical methods in popular open-source software environments. INF ensures that TRR 391 can meet the highest standards with respect to the FAIR data principles, e.g., the reproducibility of research results and the re-use of research data, and it provides training on these principles.

Project Leaders

Dr. Philipp Breidenbach
Research Data Center Ruhr
RWI Leibniz Institute for Economic Research

Prof. Dr. Paul-Christian Bürkner
Department of Statistics - Chair of Computational Statistics
TU Dortmund University

Prof. Dr. Andreas Groll
Department of Statistics - Chair of Statistical Methods for Big Data
TU Dortmund University

Summary

The goal of INF is to ensure that all data processes run smoothly across the participating teams from different universities and disciplines. In addition to the overall research data management, this includes the interoperability of data and software between the participating disciplines and the support and training of researchers in the requirements of interdisciplinary workflows and FAIR data. Research data often come from a variety of sources and in a variety of formats, and different disciplines and researchers use different data structures. This makes it difficult to standardize processes and to ensure consistency. In addition to incoming data, project products such as coding schemes and newly developed software must also be integrated into the collaboration, which is further complicated by different "understandings" of data and information between the disciplines involved in TRR 391.

The data come from different types of sources: simulations, experiments, surveys, open access data, and industry collaborations, the last of which require special efforts to transfer data and results. The interdisciplinary composition and the broad set of data sources also give the INF project a great opportunity to develop a common language that conveys a shared understanding of data, data processing and data provision between different disciplines. Such a successful translation would benefit future interdisciplinary projects beyond TRR 391, especially the National Research Data Infrastructure (NFDI), which aims to establish a common research data infrastructure.

Faced with the challenges described above, INF aims to manage all data processes efficiently, to provide data according to the FAIR principles and to train scientists in these specific tasks. INF will promote the exchange and merging of research data and of efficient software implementations of the statistical methods investigated within our TRR, as well as the availability of the research data and output, both inside and outside of TRR 391. The support and infrastructure provided by this project will ensure that TRR 391 can meet the highest standards with respect to the reproducibility of research results and the re-use of research data. To this end, the following principles will be followed in all projects:

  • Open data: Data created and re-used in our TRR shall be handled according to the FAIR principles and the research data management guidelines of the participating institutions. All data sets will be brought to comparable standards by the researchers. The data will be enriched with metadata according to a uniform standard and with a data description. The data sets will be shared as openly as possible and kept as closed as necessary.
  • Open source: The various research projects have in common that their results will be based on extensive source code. For quality assurance and replication, it is particularly important that this source code can be reproduced by other researchers and follows the guidelines developed within the INF project. The source code of all methods developed in TRR 391 will be published as open source under a suitable license.
  • Reproducible research: Research articles from TRR 391 shall be published jointly with the computational tools necessary to reproduce the results. In particular, this includes the software and a precise description of how the results were obtained. In combination with open data and open source code, we strive for the best possible reproducibility of all results.
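
To make the open data principle more concrete, the following minimal sketch shows what a uniform metadata record and a simple completeness check could look like. The class `DatasetMetadata`, its field names and the helper `missing_fields` are assumptions made purely for illustration; they are not the metadata standard adopted in TRR 391.

```python
from dataclasses import asdict, dataclass, field
from typing import List


# Hypothetical minimal metadata record. The field names are illustrative
# assumptions, not the uniform standard used in TRR 391.
@dataclass
class DatasetMetadata:
    title: str
    creators: List[str]
    description: str
    license: str
    source_type: str  # e.g. "simulation", "experiment", "survey"
    keywords: List[str] = field(default_factory=list)


def missing_fields(meta: DatasetMetadata) -> List[str]:
    """Return the names of required fields that are empty."""
    required = ["title", "creators", "description", "license", "source_type"]
    record = asdict(meta)
    return [name for name in required if not record[name]]


# Example: a record whose data description has not been written yet.
meta = DatasetMetadata(
    title="Example survey",
    creators=["A. Researcher"],
    description="",
    license="CC-BY-4.0",
    source_type="survey",
)
print(missing_fields(meta))
```

A check of this kind could run before a data set is shared, so that incomplete descriptions are noticed early.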

Comprehensive training measures will ensure that the researchers can follow these principles. Special emphasis is placed on ensuring that all solutions allow low-threshold participation of all projects in TRR 391. The INF project will communicate extensively and directly with NFDI sections and consortia, so that NFDI developments flow directly into TRR 391 and, conversely, findings from TRR 391 are shared with the NFDI.

 

Software Courses at TUDO in summer term 2025

  • Einführung in LaTeX (Introduction to LaTeX; 3 LP; block course); lecturer: Schmidt (German)
  • Introduction to Python (3 LP); lecturer: Bürkner (English)
  • Introduction to Julia (3 LP); lecturer: Bürkner (English)
  • Advanced SAS (3 LP; block course); lecturer: Wiedner (English)

Relevant NFDI consortia

The National Research Data Infrastructure (NFDI) initiative in Germany brings together various consortia to enhance research data management across disciplines. These consortia focus on establishing FAIR (Findable, Accessible, Interoperable, and Reusable) data practices, developing standards for data exchange, and fostering collaboration within their respective research communities. Below is an overview of key NFDI consortia relevant to energy research, earth sciences, data science, social sciences, and mathematics, including their main goals and contact details.

NFDI4Energy

Focus: Energy research and energy system transformation
Main Goals:

  • Facilitate FAIR data management in energy research

  • Develop standards for data exchange and interoperability

  • Support researchers with tools and infrastructure for energy-related data

https://nfdi4energy.uol.de/

Contact: stephan.ferenz@uol.de

NFDI4Earth

Focus: Earth system sciences and environmental research
Main Goals:

  • Improve access to geospatial and environmental data

  • Develop tools and standards for sustainable data use

  • Foster collaboration between researchers and institutions

https://www.nfdi4earth.de/

Contact: https://www.nfdi4earth.de/helpdesk

NFDI4DataScience

Focus: Data Science and Artificial Intelligence
Main Goals:

  • Promote FAIR data principles in data science

  • Provide infrastructures for machine learning and AI research

  • Ensure transparency and reproducibility in data-driven research

https://www.nfdi4datascience.de/

Contact: christine.hennig@fokus.fraunhofer.de

KonsortSWD

Focus: Social, behavioral, and economic sciences
Main Goals:

  • Improve access to sensitive and survey-based research data

  • Develop secure and legal data-sharing infrastructures

  • Enhance data literacy and research data management training

https://www.konsortswd.de/

Contact: bernhard.miller@gesis.org

MaRDI

Focus: Mathematical research data
Main Goals:

  • Support the structured storage and sharing of mathematical data

  • Develop community standards for mathematical research data

  • Enable open access and reproducibility in mathematical research

https://www.mardi4nfdi.de/about/mission

Contact: bacher@mardi4nfdi.de

TUDOdata TRR

In addition, in collaboration with TU Dortmund, we provide a repository for the TRR, integrated into TUDOdata:

https://data.tu-dortmund.de/dataverse/trr391
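
The URL suggests that TUDOdata is a Dataverse installation. Assuming it exposes the standard Dataverse native API and that the collection alias is `trr391` (both inferred from the URL above, not confirmed here), the published contents of the repository could be listed programmatically along the lines of the following sketch. The function names are hypothetical, and unpublished content may additionally require an API token.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

# Assumptions: base URL and collection alias are taken from the repository
# link above; availability of the public API is not confirmed.
BASE_URL = "https://data.tu-dortmund.de"
COLLECTION_ALIAS = "trr391"


def contents_url(base_url: str = BASE_URL, alias: str = COLLECTION_ALIAS) -> str:
    """Build the Dataverse native API URL that lists a collection's contents."""
    return f"{base_url.rstrip('/')}/api/dataverses/{alias}/contents"


def fetch_contents(url: str) -> Dict[str, Any]:
    """Download and decode the JSON response (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def dataset_links(payload: Dict[str, Any]) -> List[str]:
    """Extract the persistent URLs of all data sets from an API response."""
    items = payload.get("data", [])
    return [
        item["persistentUrl"]
        for item in items
        if item.get("type") == "dataset" and "persistentUrl" in item
    ]
```

Calling `dataset_links(fetch_contents(contents_url()))` would then return the persistent links of the data sets in the collection, provided the assumptions above hold.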

Needs analysis

We are currently conducting a needs analysis to better understand the specific requirements and challenges related to research data management within the TRR. To gain deeper insights into the kind of support that is needed, we will interview the project leaders of all subprojects.

The interview questions are designed to gather detailed information about the types of data used in each subproject, the challenges researchers face in handling and managing these data, and the requirements for ensuring their reproducibility. Our goal is to identify common needs and to develop tailored solutions that improve data accessibility, transparency, and long-term usability across the TRR.