Data management plans
Data Management Plans (DMPs) are essential documents that help researchers systematically manage data throughout a project, from data collection to storage and sharing. DMPs enhance research quality and impact by clearly defining how data will be stored, protected, and documented, thus promoting data transparency and future accessibility.
With the growing volume of research data, DMPs have become an important tool for ensuring responsible data management. Many funders and funding programs, such as Horizon 2020, now require DMPs as part of project proposals.
Data Management Plans address critical aspects, including data storage, security, metadata, sharing protocols, and roles. Well-developed DMPs foster scientific collaboration and improve research reproducibility.
In this chapter, you can expect to learn about:
- the importance of data management plans: why DMPs are essential for responsible data processing practices, improving research transparency, and supporting data reuse;
- core components of a data management plan: key areas typically covered by DMPs, such as data description, storage, security, metadata, sharing, preservation, roles, and budget considerations;
- the process of creating a data management plan: steps involved in developing an effective DMP, including defining the scope of research activities, consulting guidelines, describing practices, assigning roles, using tools, and reviewing the plan;
- challenges in implementing data management plans: obstacles researchers may face, such as lack of understanding, resource constraints, privacy concerns, infrastructure limitations, and rapidly changing technologies, along with ways to overcome them;
- the future of data management plans: upcoming directions for plan development, including integration with artificial intelligence and automation, real-time data monitoring, improved alignment with open science, international collaboration, and support for managing sensitive data.
By committing to robust data management practices through effective DMPs, the research community can drive scientific progress and contribute to societal advancement in meaningful ways.
Data Management Plans are essential for responsible data handling practices, improving research transparency and supporting data reuse. A well-designed DMP benefits researchers, institutions, and funding agencies by providing a structured approach to data management, which leads to better project outcomes.
Key reasons why DMPs are important:
- enhanced data integrity: proper data management reduces the risk of data loss or corruption, ensuring that data is accurate and reliable;
- increased research efficiency: a DMP helps researchers organize their data processes, saving time and resources;
- easier collaboration: when data is well documented and accessible, other researchers can more easily understand it and collaborate on projects;
- supports open science: DMPs promote data sharing and reuse, aligning with the goals of open science and FAIR data principles (Findable, Accessible, Interoperable, Reusable);
- compliance with funding requirements: many funding agencies require DMPs, and not providing one can result in losing funding opportunities.
In an era where data is a valuable asset, DMPs help protect it, ensuring that it remains durable and accessible for future research.
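As one illustration of the data integrity point above, a project might record a checksum for every file when data is deposited and re-verify those checksums later to detect loss or corruption. The following is a minimal sketch, not a prescribed method; the helper names `build_manifest` and `verify_manifest` and the idea of a single data directory are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum (hypothetical helper)."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose content has changed."""
    problems = []
    for name, expected in manifest.items():
        target = data_dir / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

A manifest built at deposit time can be stored alongside the data; an empty list from `verify_manifest` then indicates that every recorded file is still present and unchanged.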
A DMP typically covers several key areas, each focusing on a different aspect of data management. While the exact structure of a DMP may vary depending on institutional or funding requirements, most DMPs include the following core components:
- data description: describes the types, sources, and formats of data that will be generated or used during the project. This includes the estimated volume of data, data types (e.g., qualitative, quantitative), and any software or formats required for data use. It also indicates whether the data will be generated by the project, collected from existing datasets, or obtained from third parties;
- data storage and security: outlines how the data will be securely stored and protected during the project. It addresses how sensitive or personal data will be handled to ensure confidentiality, as well as backup procedures, storage locations, and data encryption methods to protect against data loss and unauthorized access;
- metadata and documentation: provides information on how data will be documented and described to ensure it is understandable and reusable. It specifies any metadata schemas or standards (e.g., Dublin Core, ISO 19115) that will be used to describe the data and includes definitions of variables, codes, and any abbreviations used in the dataset;
- data sharing and access: details how and when data will be shared with others. It identifies the repositories or platforms where the data will be shared (e.g., institutional repositories, discipline-specific databases), defines who can access the data and under what conditions (e.g., open access, restricted access), and specifies any licenses (e.g., Creative Commons) governing the use of the data;
- data preservation: explains plans for long-term data storage and preservation, including which repository or archive will be used for data deposition and specifying how long the data will be preserved, following any institutional or funder requirements;
- roles and responsibilities: identifies who is responsible for various data management tasks and assigns specific procedures to ensure that data management is prioritized and consistently applied throughout the project;
- budget considerations: addresses any financial aspects related to data management, including expenses for data storage, backup solutions, software, or personnel time. This section also specifies how these costs will be covered, whether through the project's budget or additional funding.
Each component of a DMP plays a crucial role in establishing a comprehensive framework for data stewardship, ensuring data is handled responsibly at every stage of the research process.
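The core components listed above can also be kept as a structured record, which makes it easy to see which parts of a draft are still unwritten. The sketch below is illustrative only: the class `DataManagementPlan`, its field names, and the sample text are assumptions for this example, not a standard DMP schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text field per core component (illustrative, not a standard)."""
    data_description: str = ""
    storage_and_security: str = ""
    metadata_and_documentation: str = ""
    sharing_and_access: str = ""
    preservation: str = ""
    roles_and_responsibilities: str = ""
    budget: str = ""


def missing_components(plan: DataManagementPlan) -> list[str]:
    """Return the names of components that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Placeholder draft with only two components filled in.
draft = DataManagementPlan(
    data_description="Survey responses (CSV), collected by the project.",
    storage_and_security="Encrypted institutional file share with regular backups.",
)
print(missing_components(draft))
```

For this draft, the function reports the five components that have not been written yet, in the order in which they are declared.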
Creating an effective Data Management Plan involves several steps designed to address the unique data needs of the research project. A good DMP requires careful planning and collaboration among team members, often with support from institutional resources such as data librarians or IT staff.
Steps for creating a DMP:
- define the research scope: before writing the DMP, researchers need to clearly define the project’s data requirements, including the types of data to be generated, data sources, and any specific handling needs;
- investigate institutional and funder requirements: researchers should review any guidelines provided by their institution or funding agency, as these often specify DMP format, content, and submission processes;
- outline data management practices: for each component of the DMP, researchers must document the practices they will use. This includes selecting appropriate storage methods, identifying metadata standards, and setting up data-sharing protocols;
- assign roles and set timelines: assign specific team members to data management tasks and establish timelines for each stage of data handling, from data collection to long-term preservation;
- use DMP tools and templates: many institutions provide DMP templates and tools like DMPTool and DMPonline, which guide researchers through the DMP creation process, ensuring all essential areas are covered;
- review and revise the plan: a DMP is a living document that may need adjustments throughout the project. Regular reviews allow the team to adapt the DMP to any changes in data needs or project scope.
Creating a DMP is not a one-time task; it requires ongoing assessment and revision. A flexible approach helps researchers respond to unforeseen challenges and refine their data practices over time.
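As an example of the "identifying metadata standards" step, a team that chooses Dublin Core could check its dataset records against the standard's element names. The sketch below assumes the 15 elements of the Dublin Core Metadata Element Set; the set `REQUIRED_BY_PROJECT` is a hypothetical project rule (Dublin Core itself makes no element mandatory), and the sample record is placeholder data.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Assumption: this hypothetical project treats these elements as mandatory.
REQUIRED_BY_PROJECT = {"title", "creator", "date", "identifier", "rights"}


def check_record(record: dict[str, str]) -> dict[str, list[str]]:
    """Report non-Dublin-Core keys and required elements that are absent or empty."""
    unknown = sorted(set(record) - DUBLIN_CORE_ELEMENTS)
    missing = sorted(k for k in REQUIRED_BY_PROJECT if not record.get(k, "").strip())
    return {"unknown": unknown, "missing": missing}


record = {
    "title": "Household energy survey (placeholder)",
    "creator": "Example Research Group",
    "date": "2024-11-24",
    "format": "text/csv",
    "licence": "CC BY 4.0",  # not a Dublin Core element; "rights" is
}
print(check_record(record))
```

Here the check flags `licence` as an unknown key and reports `identifier` and `rights` as missing, which is the kind of gap a regular DMP review is meant to catch.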
Despite the benefits of Data Management Plans, researchers may face several challenges during the DMP creation and implementation process:
- lack of awareness or training: many researchers are unfamiliar with data management best practices or the specific requirements of DMPs. This lack of knowledge can make it difficult to create a comprehensive and effective plan;
- resource constraints: small research teams or those with limited funding may struggle to afford secure storage options, reliable backup solutions, or long-term preservation services. Limited resources can hinder the implementation of robust data management strategies;
- privacy and ethical considerations: for research involving sensitive or personal data, strict data protection measures are necessary. Balancing the need for data sharing with the obligation to protect participant privacy can complicate DMP requirements, especially in fields like healthcare, social sciences, or any research involving human subjects;
- infrastructure limitations: not all institutions possess the necessary infrastructure to support effective DMPs. This includes secure data storage facilities, high-capacity servers, and access to reliable repositories for data archiving. Lack of institutional infrastructure can impede the proper implementation of a DMP;
- rapidly changing technology: the field of data management is continually evolving, with new technologies and tools emerging regularly. Researchers may find it challenging to keep their DMPs up to date and to adopt new practices or technologies, especially if these changes occur during an ongoing project. Upgrading data storage or security measures mid-project can be both technically and financially demanding.
Overcoming these challenges often requires:
- institutional support: universities and research institutions can provide resources, infrastructure, and policies that support effective data management;
- adequate funding: securing sufficient funding allows research teams to invest in necessary data management tools and services, such as high-quality storage solutions;
- continuous training and education: providing researchers and their teams with ongoing training helps build the necessary skills for effective data management. This includes education on best practices, regulatory compliance, and the use of data management tools;
- collaboration with data management experts: engaging with data librarians, IT professionals, and data management specialists can help researchers navigate the complexities of creating and implementing a DMP.
By addressing these challenges proactively, researchers can enhance their data management practices, leading to more reliable research outcomes and greater opportunities for data sharing and reuse.
As data management practices continue to evolve, the role of DMPs will expand to address new challenges and opportunities. Advances in technology, policy changes, and a growing emphasis on open science principles are shaping the future of DMPs.
Future directions for DMPs:
- integration with artificial intelligence (AI) and automation: AI could play a role in automating data management tasks. AI can assist with metadata generation, data classification, and data quality checks, simplifying DMP implementation and reducing the administrative burden on researchers;
- integration with research data management platforms: future DMPs may be directly integrated with research data management systems, allowing for seamless implementation of data management practices. This integration can automate compliance checks, streamline data handling processes, and provide researchers with tools to manage data more efficiently;
- real-time data monitoring: future DMPs may include real-time data monitoring features. This enables researchers to track data use, detect anomalies, and ensure compliance with data security protocols as data is collected and processed, enhancing data integrity throughout the research lifecycle;
- enhanced open science and FAIR compliance: as more research aligns with open science and FAIR (Findable, Accessible, Interoperable, Reusable) principles, DMPs will be essential for ensuring data meets these standards. This is likely to lead to greater standardization in DMP templates and practices, facilitating easier data sharing and reuse across disciplines;
- international collaboration and standardization: global collaboration is increasing the need for standardized DMP practices. Harmonizing data management protocols allows researchers from different countries and institutions to follow consistent data management procedures, improving cooperation and ensuring compliance with international regulations;
- support for sensitive data management: as privacy regulations like the GDPR evolve, DMPs will need to include advanced features for handling sensitive data. This involves implementing robust encryption methods, anonymization techniques, and strict access controls to ensure legal and ethical compliance with data protection laws;
- emphasis on data ethics and sustainability: there is a growing focus on ethical considerations and the environmental impact of data management. DMPs will increasingly address issues like ethical data use, minimizing carbon footprints of data storage solutions, and promoting sustainable practices in data handling.
The future of DMPs is promising, as they are poised to become a fundamental part of research data management. By embracing technological advancements, adhering to international standards, and focusing on ethical considerations, DMPs will continue to ensure data stewardship, promote open science, and facilitate responsible data use in research.
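To make the point about handling sensitive data more concrete, the sketch below replaces direct identifiers with keyed hashes before data is shared. This is pseudonymization rather than full anonymization: records from the same participant can still be linked to each other, and anyone holding the key can recompute the mapping. The function name `pseudonymize`, the key, and the sample rows are all assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration; a real key would be generated randomly
# and stored separately from the data under restricted access.
key = b"replace-with-a-random-secret"

# Placeholder rows standing in for collected data.
rows = [
    {"participant": "alice@example.org", "score": 7},
    {"participant": "bob@example.org", "score": 4},
    {"participant": "alice@example.org", "score": 9},
]

# Copy each row, swapping the identifier for its pseudonym.
released = [{**row, "participant": pseudonymize(row["participant"], key)} for row in rows]
```

The same participant always maps to the same pseudonym, so repeated measurements remain linkable in the released data, while the original identifier no longer appears in it. Whether this is sufficient for a given project depends on the applicable data protection requirements.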