Research data management
Data storage and archiving

Handling data on a daily basis
In the process of conducting research in an efficient and responsible manner, data security is particularly important. It should be taken care of as early as the project implementation stage, when data is collected, processed and analysed.
Materials which are the basis of any further work should be given special consideration.
They should be kept in a separate location and protected from alteration so that they are not overwritten or deleted. Read-only file settings can be used. Further work should be carried out on copies, documenting the subsequent stages of the research, including procedures and methods.
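The read-only setting mentioned above can be applied from a script. The following is a minimal sketch in Python, assuming the source files live on a local file system; the helper names `make_read_only` and `is_read_only` are illustrative and not part of any standard tool. On Windows, `chmod` only toggles the read-only flag, and an administrator account can still overwrite the file, so this is a safeguard against accidents rather than an access control.

```python
import stat
from pathlib import Path

# All three "write" permission bits (owner, group, others).
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path: Path) -> None:
    """Clear every write bit on the file, leaving the other bits unchanged."""
    current = stat.S_IMODE(path.stat().st_mode)
    path.chmod(current & ~WRITE_BITS)


def is_read_only(path: Path) -> bool:
    """Return True if no write bit is set on the file."""
    return stat.S_IMODE(path.stat().st_mode) & WRITE_BITS == 0
```

A typical use would be to call `make_read_only` on every file in the folder that holds the original materials right after collection, and to do all further processing on copies kept elsewhere.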
In the course of a project, successive versions of data files are often created as the data is processed, analysed, and enriched with other data. Version control plays an important role in keeping files safe and helps preserve data integrity.
A good solution is to keep copies of the main files and temporary working copies separately, and to adopt strict rules for versioning and synchronising files in different locations.
Regular backups are a good practice to prevent data loss.
A standard and recommended approach is to follow the 3-2-1 rule: keep three copies of the data on two different types of storage media, with one copy offsite (in a different physical location).
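The 3-2-1 rule can be checked mechanically against a list of planned copies. The sketch below is illustrative only: the `DataCopy` class, its fields, and the function name are hypothetical, and the rule is read here as "at least" three copies, two media types, and one offsite copy. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    location: str  # free-text label, e.g. "office workstation" (hypothetical)
    medium: str    # type of storage medium, e.g. "internal SSD"
    offsite: bool  # True if kept in a different physical location


def satisfies_3_2_1(copies: list[DataCopy]) -> bool:
    """Check for at least 3 copies, 2 distinct media types, and 1 offsite copy."""
    media_types = {c.medium for c in copies}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(media_types) >= 2 and has_offsite


plan = [
    DataCopy("office workstation", "internal SSD", offsite=False),
    DataCopy("institutional server", "network storage", offsite=False),
    DataCopy("data centre in another city", "network storage", offsite=True),
]
print(satisfies_3_2_1(plan))
```

The check only inspects what the plan declares; it cannot confirm that the copies actually exist or are up to date.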
Both the development of a data management plan and its implementation require consideration of the following issues:
– ways of creating backups,
– choosing storage locations for backup copies,
– planning the frequency and number of backups,
– creating the procedure for recovering lost data,
– allocating different responsibilities for backup and recovery procedures among team members.
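As an illustration of the first and fourth points in the list above, the following Python sketch creates a timestamped backup copy, verifies it byte by byte, and restores it on request. It is a simplified example under stated assumptions, not a replacement for institutional backup software: the function names and the file-naming pattern are hypothetical, and the timestamp has one-second resolution.

```python
import filecmp
import shutil
from datetime import datetime, timezone
from pathlib import Path


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` under a UTC-timestamped name and verify it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    if target.exists():
        # Never silently overwrite an earlier backup made in the same second.
        raise FileExistsError(f"backup already exists: {target}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    # shallow=False forces a comparison of the file contents, not just metadata.
    if not filecmp.cmp(source, target, shallow=False):
        target.unlink()
        raise OSError(f"backup of {source} does not match the original")
    return target


def restore(backup: Path, destination: Path) -> None:
    """Recover a lost or damaged file from a backup copy."""
    shutil.copy2(backup, destination)
```

How often `back_up` runs, where `backup_dir` is located, and who is responsible for testing `restore` are the decisions the data management plan needs to record.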
Encryption is one of the security measures against unwanted disclosure of data.
Secure public-key (asymmetric) algorithms should be used, in which the public encryption key differs from the private decryption key. The private key must be stored in a secure location inaccessible to unauthorised persons.
Data can also be stored on encrypted disk partitions and transferred via encrypted network communication protocols. Both data encryption and secure network communication should be implemented by specialised software, preferably selected by the IT department of the research institution.
Data storage during the project
The general data retention policy is designed to minimise the risks associated with loss, damage or unwanted alteration of data. Failure, destruction or loss of equipment can seriously jeopardise a project. Re-collection of lost data is often not possible. Other risks may involve the premature or unplanned release of data.
Data storage during the project should reflect current needs arising from, among other things, the conditions of data collection (e.g. field work, use of specific apparatus or equipment), the processing and analysis of data (e.g. collaboration with other team members), and the protection of specific types of data (e.g. personal data, sensitive personal data, confidential data).
The data storage strategy requires the identification of data storage locations and procedures related to copying, modifying, versioning, deleting, and granting access to the data. It may further include the establishment of different levels of protection depending on the possible risks of disclosure, damage or loss of the data.
Long-term data storage
Post-project data retention should take into account obligations stemming from grant agreements or from the policies of the institutions conducting and funding the research, as well as good practices and standards adopted in the specific field or area of research. The following should be assessed: the scientific or historical value of the data in relation to the state of the art, the uniqueness of the data, the potential for reuse, the quality of the data, and the completeness of the documentation.
Long-term data storage cannot be limited to simply storing data. Consideration must be given to the process of data degradation that occurs over time and the risk of the obsolescence of specific data storage media, file formats or reading software. Ensuring data security and integrity requires planned and systematic measures, which have specific costs.
For data that is to be made openly available, a data repository is an appropriate solution. A repository typically has its own policy for long-term data preservation: for example, it may recommend depositing files in specific formats, regularly verify checksums and, if a mismatch is found, restore the affected files from backups stored elsewhere.
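The checksum control described above can be sketched in a few lines of Python. This is an illustration of the general technique, not the procedure of any particular repository: the function names are hypothetical, SHA-256 is an assumed choice of algorithm, and the example requires Python 3.9 or later.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under `root`, keyed by relative path."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths of files that are missing or no longer match."""
    problems = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            problems.append(name)
    return problems
```

The manifest would be built once at deposit time and stored separately from the data. Any path returned later by `find_problems` marks a file that should be replaced from a backup copy.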
Research institutions should also provide their own strategies and technical solutions, especially for data that is not intended to be made available and must be given special protection, such as personal data.
Long-term data storage also involves data selection. It may not be possible to retain all data for financial reasons. The volume of data produced and collected is constantly increasing, which translates into ever higher costs for storing, backing up and maintaining an active policy to ensure data security.
Data storage locations
When choosing where to store data, consider matters such as relevant legal frameworks and policies, security, the purpose of storage in relation to the project stage, technical requirements (e.g. data size), and costs.
External hard drives, flash drives, and CDs
Only suitable for temporary, short-term data storage or for data transfer when online transmission is not possible. Devices should be protected by strong passwords and encryption, and their performance should be monitored.
Cloud services, e.g. institutional Google Drive, OneDrive, Dropbox, Nextcloud
Useful for providing remote and easy access to data and other information for everyone involved. Should not be the only solution used for storage and backup, and should not be used to store unencrypted personal data. The terms of use should be checked for any rights the service provider claims over the stored content. Preference should be given to European, national or institutional services that store data in Europe.
Computers and laptops
Only appropriate for projects involving a limited number of people (preferably only one person) and where data and files do not need to be moved frequently between personal computers. A plan for working with different local computers, e.g. a private laptop and a desktop computer at the place of employment, should include a procedure for backup control and file version control.
Shared drives on the servers of an institution
Suitable for collaborative projects with many people who need access to data. Access and permission controls are necessary. They should be used in combination with an appropriate security strategy and detailed versioning rules, as well as a long-term archiving strategy.