LibGuides: Scholarly Communications: Guidance for completing the UEL DMP template

Guidance on the UEL DMP template

The following is a guide to the UEL Data Management Plan template. Select each tab to access guidance and examples for each section of the UEL DMP template. Some funders have their own templates which must be completed and submitted at either grant application stage or within a specified period during the research project: contact us for assistance with funder data management requirements.

What data will you collect or create?

What should I include here?

Research data is information collected, generated, observed or created during a research project, to underpin and validate the content of the final research output. It may vary in nature according to the discipline and can be both digital and non-digital in format. Examples include audio/video recordings, spreadsheets, transcripts, software code, prototypes, photographs, models, experimental measurements, physical specimens, and observations.

List and describe each type of data you will collect, create, or use to answer your research questions, whether they are physical or digital, including raw, processed, and published data. If using secondary data, provide a reference for the data including a DOI where possible.
State what formats each data type will be stored in: to enable wider re-use and long-term preservation, consider whether data collected or created in proprietary formats will be saved to an ‘open’ format later, e.g. .xls spreadsheets stored as .csv files. The UK Data Service has a guide to recommended file formats. For each data type estimate the volume of data you expect to create or collect: this could be, for example, in numbers of experiments or files generated, or size in bytes (MB, GB, TB).
Indicate which data are personal (information that relates to an identified or identifiable individual) and/or special category (personal information that’s likely to be more sensitive).
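
If you already hold pilot or sample files, a short script can help you estimate volume per data type. The following is a minimal sketch only: the function names `format_bytes` and `volume_by_type` are hypothetical, and it assumes decimal units (1 MB = 1,000,000 bytes).

```python
from collections import defaultdict
from pathlib import Path


def format_bytes(size: int) -> str:
    """Express a byte count in a readable unit (decimal units assumed)."""
    if size < 1000:
        return f"{size} bytes"
    value = size / 1000
    for unit in ("KB", "MB", "GB"):
        if value < 1000:
            return f"{value:.1f} {unit}"
        value /= 1000
    return f"{value:.1f} TB"


def volume_by_type(folder: str) -> dict[str, tuple[int, int]]:
    """Return {file extension: (number of files, total bytes)} for a folder tree."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Group by lower-cased extension so .CSV and .csv count together.
            ext = path.suffix.lower() or "(no extension)"
            totals[ext][0] += 1
            totals[ext][1] += path.stat().st_size
    return {ext: (count, size) for ext, (count, size) in totals.items()}
```

Scaling the totals from a small sample up to the planned number of participants or experiments gives a rough figure to quote in the plan.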

Why is this important?

The type, scope, and volume of data will determine appropriate methods of storage, transfer, and long-term preservation. Describing your data can also help you identify and mitigate the risks of data loss and disclosure of personal and sensitive information, and identify actions that enable future data use, such as migration of proprietary file formats to open formats.

Examples

Spreadsheets of Patient Health Questionnaire (PHQ-9) responses (50 participants), created in .xlsx and converted to .csv format; these contain personal and special category data related to mental health.
Interview recordings in .mp4 format (20 files, approx. 8GB total) and transcripts in .docx format.

How will the data be collected or created?

What should I include here?

Describe for each data type your data collection methods and any software, devices, and instruments used. Outline when, how, and to where your data will be transferred or exported.
Describe how you will organise your data, including a logical, meaningful folder & file structure and file-naming conventions.
State what data quality or assurance methods you will use (e.g. calibration of instruments, standardised protocols, controlled vocabularies for data entry, etc.)

Why is this important?

Describing your data collection methods can help identify potential risks with tools, systems, or methodology, such as data loss, security, or quality issues. It can also identify additional resources that may be required, such as specialist software or equipment, so that these can be obtained in time and do not delay the start of the project.

Examples

Interviews will be conducted and recorded remotely using Microsoft Teams installed on the interviewer’s UEL-managed laptop, with the resulting .mp4 files transferred to OneDrive. Recordings will be stored following the file-naming convention: [ProjectCode]-[InterviewerInitials]-[ParticipantNumber]-[Location]-[Date].Ext. An interview schedule will be developed so that a standard format is followed.
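
A file-naming convention like the one in the example above is easier to follow consistently if names are generated rather than typed. This is an illustrative sketch, not a UEL requirement: the function name `interview_filename`, the three-digit participant number, and the YYYYMMDD date format are all assumptions.

```python
from datetime import date


def interview_filename(
    project_code: str,
    interviewer_initials: str,
    participant_number: int,
    location: str,
    recorded_on: date,
    ext: str = "mp4",
) -> str:
    """Build [ProjectCode]-[InterviewerInitials]-[ParticipantNumber]-[Location]-[Date].Ext"""
    parts = [
        project_code,
        interviewer_initials.upper(),
        f"{participant_number:03d}",       # assumed: zero-padded so files sort correctly
        location,
        recorded_on.strftime("%Y%m%d"),    # assumed: year-first dates sort chronologically
    ]
    for part in parts:
        # Hyphens separate the elements, so they must not appear inside one.
        if not part or "-" in part or " " in part:
            raise ValueError(f"Invalid file name element: {part!r}")
    return "-".join(parts) + "." + ext.lstrip(".")


name = interview_filename("DMP01", "ab", 7, "Stratford", date(2024, 3, 5))
print(name)  # DMP01-AB-007-Stratford-20240305.mp4
```

Note that a participant number in a file name is a pseudonymous code; the key linking it to a real person should be stored separately.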

EEG recordings will be captured in .bdf format directly onto PC from electrode headcaps using the BioSemi ActiveTwo system. This data will be imported to MATLAB for analysis. A standardised protocol will be established to ensure data quality.

What documentation and metadata will accompany the data?

What should I include here?

Describe what documentation and metadata you will record, specifying any formal standards used.
You do not have to use a formal standard if it is not appropriate, but otherwise describe your approach to data documentation.
Consider what information you would need to explain your data if returning to it after a long time, or what someone outside of your research group, or your field, would need to understand and re-use your data.

Documentation may be used to describe the research questions of the project as a whole and methods used to answer these, relate specifically to files and how they are organised, or exist at variable level such as a list of variables with definitions and information about permitted values in a spreadsheet or data dictionary.
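
Variable-level documentation can double as a quality check. The sketch below shows one possible shape for a data dictionary and a check of a row of data against it. The variable names, the definitions, and the function `check_row` are hypothetical; the permitted values for the PHQ-9 item reflect its standard 0 to 3 scoring.

```python
# A hypothetical data dictionary: one entry per variable, with a definition
# and, where applicable, the set of permitted values.
DATA_DICTIONARY = {
    "participant_id": {
        "definition": "Pseudonymous participant code",
        "permitted": None,  # free text, no fixed list of values
    },
    "phq9_item1": {
        "definition": "PHQ-9 item 1 response (0 = not at all, 3 = nearly every day)",
        "permitted": {0, 1, 2, 3},
    },
}


def check_row(row: dict, dictionary: dict) -> list[str]:
    """Return a list of problems found in one row of data (empty if none)."""
    problems = []
    for name in row:
        if name not in dictionary:
            problems.append(f"{name}: not defined in the data dictionary")
    for name, spec in dictionary.items():
        if name not in row:
            problems.append(f"{name}: missing")
        elif spec["permitted"] is not None and row[name] not in spec["permitted"]:
            problems.append(f"{name}: value {row[name]!r} is not permitted")
    return problems


print(check_row({"participant_id": "P002", "phq9_item1": 5}, DATA_DICTIONARY))
```

The same dictionary can be exported alongside the dataset so that others can interpret each variable.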

Metadata can broadly be defined as ‘data about other data’ and make up a subset of data documentation, providing standardised, structured information describing a dataset that can be read by machines; metadata is often required when depositing data in a repository or creating a record in a data catalogue.

There are domain-agnostic standards, such as Dublin Core, which includes basic elements like title; description; creator; publisher; date; and rights. There are also discipline-specific standards, for example Data Documentation Initiative (DDI) in the social sciences. The UK Data Service explains documentation in more detail and FAIRsharing.org maintains a searchable catalogue of disciplinary standards.
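
To illustrate what machine-readable metadata looks like, the sketch below writes the basic Dublin Core elements listed above as XML. It is an assumption-laden example: the wrapper element `metadata`, the function name `dublin_core_record`, and the sample values are invented for illustration, and a repository will normally generate its own record from a deposit form.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

BASIC_ELEMENTS = ("title", "description", "creator", "publisher", "date", "rights")


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialise the supplied basic Dublin Core elements as an XML string."""
    root = ET.Element("metadata")  # assumed wrapper element, not part of the standard
    for element in BASIC_ELEMENTS:
        if element in fields:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = fields[element]
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record(
    {
        "title": "Anonymised interview transcripts (hypothetical dataset)",
        "creator": "Researcher, A.",
        "date": "2024-03-05",
        "rights": "CC BY 4.0",
    }
)
print(record)
```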

Why is this important?

Documentation and metadata provide information and context that enable you or others to discover, use, and reproduce your data, helping to make the data FAIR (Findable, Accessible, Interoperable, and Reusable).

Examples

We will not use a formal disciplinary metadata standard but will prepare a README file containing descriptions of: the research aims; data collection methods and instruments; quality assurance protocols; folder structure and file-naming conventions. Additionally, a template consent form and participant information sheet will be included.

A readable codebook in PDF format will be created with information at project, file, and variable level. An XML version to DDI Codebook V2.5 specification will be produced to accompany the survey and interview transcript data when deposited in an appropriate repository. Anonymisation procedures for qualitative and quantitative data will be fully documented. The interview schedule and master copy of survey questions will also be included.

Identify any ethical issues and how these will be managed

What should I include here?

This section should describe ethical issues that could affect the management of your data and how these will be addressed, rather than the ethics of conducting the research project as a whole.

Ethical issues may include protecting the anonymity of participants when handling their personal data and confidentiality of commercially sensitive or environmentally sensitive data (for example, the location of endangered species).

Identify relevant ethical issues and describe how you will manage them. Outline what measures you will take to comply with data protection legislation (e.g. DPA 2018 and GDPR), such as issuing privacy notices, minimising the amount of data collected, storing data within the EU, and robust anonymisation. The ICO has a guide to anonymisation techniques, as does the UK Data Service.
Describe how you will obtain informed consent for data collection and storage, as well as for sharing and archiving research data in the future.

Why is this important?

You have ethical and legal obligations to protect the personal and special category data of research subjects: demonstrate that you are aware of the ethical issues pertinent to your research and have considered the measures to address them. Further information and resources on data protection at UEL: UEL Data Protection

Examples

We will be collecting personal and special category data related to mental health, so the protection of participant privacy is an ethical concern. Direct identifiers will not be collected, but indirect identifiers in the survey data will be anonymised through techniques such as aggregation of age and income ranges with suppression of outlier values. In compliance with GDPR principles, we will only use data for the purposes it was obtained, retain only for as long as necessary, store within the EU on UEL OneDrive, and gain written consent from participants for collection, storage, archiving, and sharing of anonymised data.
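
The aggregation and suppression described in the example above might look like the following sketch. The ten-year band width, the minimum group size, and the function names `age_band` and `band_and_suppress` are assumptions for illustration; appropriate thresholds depend on the dataset and should follow the ICO and UK Data Service guidance.

```python
from collections import Counter


def age_band(age: int, width: int = 10) -> str:
    """Replace an exact age with a range, e.g. 34 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def band_and_suppress(ages: list[int], width: int = 10, min_count: int = 3) -> list[str]:
    """Band ages, then suppress any band shared by fewer than min_count participants."""
    bands = [age_band(age, width) for age in ages]
    counts = Counter(bands)
    # Rare bands (outliers) could single out an individual, so they are hidden.
    return [band if counts[band] >= min_count else "suppressed" for band in bands]


print(band_and_suppress([21, 25, 29, 34, 91], min_count=2))
# ['20-29', '20-29', '20-29', 'suppressed', 'suppressed']
```

Banding alone does not guarantee anonymity: combinations of indirect identifiers should also be reviewed before data are shared.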

Confidentiality is an ethical issue pertinent to this project, as we will be interviewing a small, vulnerable population about sensitive subject matter. Anonymising voices and video content will not be feasible, so we will de-identify upon transcription. Interview recordings will need to be handled securely, so access will be restricted to the PI and supervisor, stored on UEL-managed services, and deleted after transcripts have been checked. The transcripts will be stored separately from the pseudonymisation log which could be used to re-identify participants (further information in the ‘Storage and Back-up’ section).

How will the data be stored and backed up during the research?

What should I include here?

Where you store and back up your data will depend on their scope, format, volume, and sensitivity. UEL storage, such as Microsoft OneDrive and SharePoint, should be used as far as possible rather than third-party services: all staff and students have access to up to 1TB of storage on UEL OneDrive. Avoid synching files containing personal data from OneDrive to your personal laptop or device. Best practice is to hold back-up copies of data on a different type of storage media. IT Services can advise on solutions if additional or specialist storage is required.

Specify the storage and back up locations for each data type (physical and electronic formats) during the ‘active’ phase of your research and why these are the most appropriate.
Define back up procedures & schedule, as well as details of data integrity checks.
Include details of data transfer, such as when storing on devices or instruments off-network or in the field: when and how will you securely move data to UEL storage.
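
One common form of data integrity check is to record a checksum for each file and compare it against the back-up copy after every transfer. The sketch below uses SHA-256 from Python's standard library; the function names and the idea of comparing two folders are illustrative assumptions, not a description of how UEL services perform their own checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: str) -> dict[str, str]:
    """Return {relative file path: checksum} for every file under folder."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_to_backup(manifest: dict[str, str], backup_folder: str) -> list[str]:
    """List files that are missing from, or differ in, the back-up copy."""
    problems = []
    for relative_path, expected in manifest.items():
        copy = Path(backup_folder) / relative_path
        if not copy.is_file() or sha256_of(copy) != expected:
            problems.append(relative_path)
    return problems
```

Saving the manifest with the data, and re-running the comparison on a schedule, gives a simple record that back-ups are complete and unaltered.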

Why is this important?

Robust storage and back up procedures are vital to protect the data your research generates. Appropriate storage and back up should help mitigate risk of damage, loss, and/or unauthorised access. Storage and transfer of personal data and/or sensitive data must comply with data protection legislation; for example, personal data must be stored within the EU (UEL services comply with this, whereas other cloud providers such as Dropbox, Google Drive, and iCloud may not).

Examples

Survey responses will be stored securely in the Qualtrics software, licensed to the UEL School of Psychology, until the end of data collection, upon which they will be exported to UEL OneDrive and backed up on the project’s SharePoint site. Consent forms and coding will be stored on UEL OneDrive in a separate folder from the pseudonymised data.
Paintings and sketches will be stored away from physical hazards in a locked studio at Docklands campus. Digital photographs taken to document the process and of the finished works will be stored on OneDrive, with a back-up copy on the hard drive of my personal laptop. Back-ups of draft sketches will be taken weekly.

How will you manage access and security?

What should I include here?

Describe how you will ensure the security of your data and prevent unauthorised access: this includes both digital and physical data, as well as for instruments and locations such as laboratories. These might be technical measures, such as encryption of files, drives, and devices (e.g. Dictaphones), or physical, such as lockable cabinets or ID card access to rooms. Another safeguard is the anonymisation of personal data.

Describe what measures you will take to keep data secure, in storage and while transferring
State who will have access to the ‘active’ research data and how this access will be controlled

Why is this important?

Sound methods of security and access control mitigate the risk of data loss or disclosure; researchers have regulatory and ethical obligations to keep the personal and sensitive data of their participants safe and accessible only to those who are authorised to see them.

Examples

Data stored on OneDrive is encrypted; access is limited to me and secured through Multi-Factor Authentication. I will share data with my supervisor upon request using OneDrive secure links. My password-secured laptop will be used to access UEL storage, but no data will be stored locally on the laptop itself and synching of files will be deactivated. Consent forms will be stored in my supervisor’s locked filing cabinet at their Stratford campus office.
The Dictaphone used for interviews will be encrypted and data will be transferred from the device’s memory card to UEL OneDrive immediately after interview, after which it will be deleted from the memory card. A Data Confidentiality Agreement will be signed by the professional transcriber outlining their responsibilities to keep the data secure. The recordings will be transferred to the transcriber via SFTP server. After anonymisation has taken place, the recordings will be deleted.

How will you share the data?

What should I include here?

Many research funders view research data as a ‘public good’ and require data to be made open. UEL’s Research Data Management Policy presumes that data will be released for sharing unless there are legal, ethical, or contractual reasons not to do so. The policy recognises that not all data are suitable for sharing and that they should be shared according to the principle ‘as open as possible, as closed as necessary’.

Benefits of open data include: validation and reproducibility of results; increased citations; acceleration of research, as duplication of effort is reduced and other researchers can build upon your work; and opportunities for future collaboration with others in the field.

Research data should be shared via deposit to an appropriate repository: UEL has a Research Repository that should be used if there is not a suitable discipline-specific repository (re3data.org is a searchable database of these). Data should be deposited with appropriate documentation. A metadata record, including a DOI, is then created; the DOI can be included in the Data Availability Statement in your future publications, cited, and linked to by others.

State who might be interested in the data as well as how and when the data will be shared, identifying an appropriate repository.
Identify actions required to make the data suitable for sharing, e.g., anonymisation, or converting to open file formats
Describe how others are permitted to re-use your data: there are licenses that communicate such terms, for example Creative Commons licenses. CC BY is recommended for widest re-use whilst requiring attribution to the data owner.
If data cannot be shared openly, provide the reasons for this.

Why is this important?

It can help to plan for data sharing at the outset of the project, so you can consider which data might be appropriate to share, or what techniques might be necessary to enable the data to be shared, for example through anonymisation. It is also a good opportunity to consider how data should be licensed for re-use and to ensure you meet any contractual obligations to your funder.

Examples

The transcripts are of potential interest to researchers in the field. They will be anonymised before deposit to the ISO27001 certified secure UEL Research Repository at project end alongside appropriate documentation & metadata, assigned a DOI, and shared under a CC BY 4.0 license.

The SPSS and TIFF files will be deposited to the UK Data Service ReShare repository within 3 months of the end of the grant, in line with the ESRC grant conditions. Documentation to DDI standard will accompany the deposit. The data will be at the Safeguarded level of access, only available to registered users under the UKDS End User License.

Are any restrictions on data sharing required?

What should I include here?

Some data are too sensitive to share openly and so need to be safeguarded: this may mean removing them from the dataset, anonymising them, or setting access controls. Other restrictions might include an embargo delaying the release of data, for example to allow you to explore opportunities for publication or IP exploitation, or to comply with contractual agreements with collaborators/funders.

State which data cannot be shared, the reasons for this, and how restrictions will be made.

Why is this important?

You should document any data restrictions so that these can be checked prior to releasing datasets to avoid the risk of data disclosure or breaching any agreements.

Examples

Survey data will be anonymised before sharing, but the video files of interviews must not be shared due to the personal and sensitive data they contain, which cannot be de-identified.

The software code will be under a 24-month embargo agreed with the commercial partner. All other data will be anonymised before release via the designated repository.

Which data are of long-term value and should be retained, shared, and/or preserved?

What should I include here?

You are unlikely to need, or be able, to keep all the data you collect or create, so you will have to decide which are of value and which can be destroyed. Some considerations when selecting data: what data will be needed to validate your research findings; what are the potential purposes for re-use; can the data be easily reproduced; are the data unique; are there legal, ethical, or contractual reasons to keep the data; are the data personal or sensitive.

Describe how you will appraise the data to decide which to retain, share, and/or preserve
State which data are of long-term value and suitable for retention or preservation and why

Why is this important?

Appraisal can take place during the project as well as at completion, so planning for how you will select data can help ensure you retain data of value and keep only what is necessary. It is also important to plan for how you will meet funder expectations on retention and preservation if you are bidding for, or have been awarded, a grant.

Examples

Anonymised survey data underlying publications will be retained and shared on the UEL Research Repository so findings can be validated and for use by other researchers. Video recordings are not suitable for sharing and will be destroyed at close of project.

Raw data are of long-term value to researchers in the field and will be shared in line with funder expectations with relevant documentation where consent has been obtained, apart from data from failed experiments which will be destroyed. Records including lab notebooks, protocols, and participant consent will be securely archived.

What is the long-term preservation or retention plan for the data?

What should I include here?

Preservation of data involves more than just long-term storage: it includes the activities and processes necessary to keep data available and usable into the future, such as file format migration and quality checks. Not all data need to undergo preservation, but they may still need to be retained.

UEL has a digital archiving service, Arkivum, which is suitable for long-term secure safeguarding of data and includes data integrity checks: access is limited to system administrators at Arkivum and in Library, Archives, and Learning Services at UEL, so would not be suitable for data you require regular access to. We can provide more information about this service and whether it would be appropriate.

Funders may have specific retention periods relating to research data: UKRI’s Concordat on Open Research Data states that data underlying publications should be retained for 10 years from the date of publication. UEL’s Research Data Management Policy is that, “Data must be appraised (reviewed) at the end of the research project and every 5 years thereafter, unless another timescale is specified by the research funder, until the data are transferred or destroyed.”
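
The appraisal cycle in the policy quoted above (at project end and every 5 years thereafter) can be turned into concrete dates for your plan. A minimal sketch, assuming hypothetical function names and that a project ending on 29 February is reviewed on 28 February in non-leap years:

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no equivalent in a non-leap year; assume 28 February.
        return start.replace(year=start.year + years, day=28)


def appraisal_dates(project_end: date, reviews: int = 3, interval_years: int = 5) -> list[date]:
    """Dates of the end-of-project appraisal and each later review."""
    return [add_years(project_end, interval_years * i) for i in range(reviews + 1)]


for review in appraisal_dates(date(2025, 9, 30), reviews=2):
    print(review.isoformat())
# 2025-09-30
# 2030-09-30
# 2035-09-30
```

If your funder specifies a different timescale, change `interval_years` accordingly.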

Indicate where data will be retained and/or preserved and for how long
State what actions will take place at the end of that period, such as a review for further retention, transfer elsewhere, or to be destroyed
Identify any processes required for preservation, such as data cleaning, conversion to open formats so data can be used without proprietary software, or creation of metadata

Why is this important?

Planning for retention and/or preservation helps ensure that necessary data are accessible and usable into the future, as well as to meet any funder, legal, or ethical conditions. Some processes or technologies may require additional resources, which may be easier to obtain when planned for.

Examples

Anonymised transcripts, thematic codes, and consent forms will be stored on the PI’s UEL OneDrive for 5 years and backed up on SharePoint, after which they will be reviewed for further retention or deletion. Consent forms will be retained for one year after the project end to allow the PI to share results with participants as outlined in the Participant Information Sheet.

Anonymised data will be deposited and shared on an appropriate domain repository. Raw data and records will be stored securely on Arkivum, an ISO27001 certified digital archiving service with a data integrity guarantee. Paper records will be scanned and digital files migrated to open formats. Data will be retained for 20 years, in line with the MRC’s expectation for clinical studies, upon which it will be reviewed.

Who will be responsible for data management?

What should I include here?

Whilst the PI will have overall responsibility for managing research data, you should include the persons or entities responsible for different aspects of data management: this could be a research assistant or other colleague on the research team, or an external third party such as a professional transcription service.

Why is this important?

An outline of who is responsible for research data management processes can help to ensure consistency and accountability. It can aid the transfer of tasks if someone leaves the research team and identify who may need specific training or support to carry out their role.

Example

The PI will take overall responsibility for data management. The Research Assistant (name) will be responsible for transcription, performing scheduled back-ups, and quality assurance.

What resources will you require to deliver your plan?

What should I include here?

Give details of any resources required for the collection and management of your research data in addition to those available to you through standard provision at UEL: for example, storage for high volumes of data, or specialist software or equipment. State whether there will be costs associated with the extra provision and how these will be met.

Why is this important?

Procurement of resources additional to those provided as standard by UEL can take time and require funds: early identification and planning can help with this.

Examples

The project will generate 8 TB of data requiring additional secure storage space, including non-networked encrypted hard disk storage for analysis of sensitive health data

The research team will require access to an SFTP server from UEL IT Services to securely transfer large data files to the professional transcription service

What should I include here?

State when you will review your DMP: this could be a schedule with specific dates, at project milestones, or to address issues that were unclear when drafting the initial version of the plan. Updated plans should be sent to researchdata@uel.ac.uk

Why is this important?

DMPs should be considered ‘living’ documents, to be reviewed and updated when required during the research project. Some funders require you to provide a schedule for review of your DMP.

Examples

The DMP will be submitted with the grant application and reviewed every 3 months in line with the funder’s required schedule

The DMP will next be updated following review of the project’s ethics application by the UEL Ethics and Integrity Sub-Committee and reviewed regularly should new information arise.
