Research Data
What Is Research Data?
At VU Amsterdam, research data is defined as follows:
“Information that is collected, observed, generated, or reused for the purpose of underpinning academic research. Depending on the discipline, research data may consist of, for example, text, images, audio recordings, video, spreadsheets, databases, statistical data, geographic data, or sensor measurements.”
In the context of the Research Data and Software Management Policy, research data refers to the entirety of the data, including associated metadata, documentation, and contextual information required to understand, interpret, and reuse the data.
This understanding aligns with widely accepted international definitions. For example, the Organisation for Economic Co-operation and Development (OECD, 2007) defines research data as:
“Factual records (numerical scores, textual records, images, and sounds) used as primary sources for scientific research, and that are commonly accepted in the scientific community as necessary to validate research findings.”
Examples of Research Data
Research data can be created in many formats and through a wide range of research methods. Nearly all fields of study and academic disciplines generate research data, including mathematics, social sciences, computer science, humanities, and law.
Some examples of research data include:
- Text-based files such as documents, spreadsheets, and presentation slides
- Images, photographs, films, and other visual materials
- Survey data, interview transcripts, and codebooks
- Physical samples and genomic or sequence data
- Laboratory and field notebooks
- Audio and video recordings
- Computer code, algorithms, models, and scripts
- Research methodologies, protocols, and workflows
- Bibliographies and reference datasets
Because research data can take many forms, it is not always easy to identify what qualifies as research data. The pyramid below, developed by Andorfer (2015) and provided by the University of Geneva, offers a useful framework for understanding how different types of research data function within the research process, particularly in the social sciences and humanities.

What Is Not Research Data?
Not all data used in a research context qualifies as research data.
Administrative or operational data—such as HR records, routine email correspondence, or generic software logs—are generally not considered research data unless they are explicitly collected or repurposed to answer a research question.
Similarly, publicly available datasets that are reused without modification are not newly created research data. However, their use may still require appropriate documentation, citation, and ethical or legal consideration.
Research Data Across the Research Lifecycle
Research data can be generated at multiple stages of the research lifecycle, including data collection, processing and analysis, and validation of results. At each stage, multiple versions of data may exist, such as raw data, cleaned or processed data, and derived or aggregated datasets.
Appropriate management of these different forms of research data is essential for ensuring transparency, reproducibility, and long-term reuse, in line with FAIR data principles (Wilkinson et al., 2016).
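As an illustration of how raw, cleaned, and derived versions relate to one another, the following minimal Python sketch processes a small, made-up set of sensor readings and records each version in a simple manifest. The sample data, the function names (checksum, clean, aggregate), and the manifest layout are assumptions made for this example only; they are not prescribed by the RDSM Policy or by the FAIR principles.

```python
import csv
import hashlib
import io
import statistics


def checksum(text: str) -> str:
    """Return the SHA-256 hex digest of a text payload."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Raw data: hypothetical sensor readings, kept exactly as collected.
RAW_CSV = "sensor_id,temperature_c\nA,21.5\nA,\nB,19.0\nB,20.0\n"


def clean(raw_csv: str) -> list:
    """Cleaned data: drop rows with a missing measurement and parse numbers."""
    rows = csv.DictReader(io.StringIO(raw_csv))
    return [
        {"sensor_id": r["sensor_id"], "temperature_c": float(r["temperature_c"])}
        for r in rows
        if r["temperature_c"].strip()
    ]


def aggregate(cleaned: list) -> dict:
    """Derived data: mean temperature per sensor."""
    by_sensor = {}
    for row in cleaned:
        by_sensor.setdefault(row["sensor_id"], []).append(row["temperature_c"])
    return {sensor: statistics.mean(values) for sensor, values in by_sensor.items()}


cleaned = clean(RAW_CSV)
derived = aggregate(cleaned)

# Manifest (illustrative layout): one entry per version, recording what it
# was derived from and a checksum so later changes can be detected.
manifest = [
    {"version": "raw", "derived_from": None, "sha256": checksum(RAW_CSV)},
    {"version": "cleaned", "derived_from": "raw", "sha256": checksum(repr(cleaned))},
    {
        "version": "derived",
        "derived_from": "cleaned",
        "sha256": checksum(repr(sorted(derived.items()))),
    },
]

for entry in manifest:
    print(entry["version"], "<-", entry["derived_from"], entry["sha256"][:12])
```

Keeping the raw data untouched and recording which version each later dataset was derived from is one way to make the processing steps traceable; in practice, a research group would typically store each version as a separate file alongside its documentation.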
References
- Andorfer, P. (2015). Forschen und Forschungsdaten in den Geisteswissenschaften: Zwischenbericht einer Interviewreihe. Niedersächsische Staats- und Universitätsbibliothek Göttingen.
- Organisation for Economic Co-operation and Development. (2007). OECD principles and guidelines for access to research data from public funding. OECD Publishing. https://doi.org/10.1787/9789264034020-en-fr
- University of Geneva. (n.d.). Identify research data. Researchdata. https://www.unige.ch/researchdata/generate-collect/identifier-donnees-de-recherche
- Vrije Universiteit Amsterdam. Research Data and Software Management (RDSM) Policy, version 3.0. https://rdm.vu.nl/public/policies-regulations/RDSM-policy-VU-EN-v3.0.pdf
- Wilkinson, M. D., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18