Minutes of the Data Sharing Workshop

The COST Connect meeting on data sharing brought together researchers from several COST actions to discuss the state of data sharing in different research fields, the problems encountered, and how these problems can be resolved. In addition to COST action representatives, the meeting was attended by representatives of several societies and organizations dealing with different aspects of data sharing in science. In general, it was an interesting meeting and, as representatives of the MitoEAGLE action, we found it educational in many respects. The meeting was organized around topical discussion tables, with the topics proposed by the participants. To set the scene, several actions, including MitoEAGLE, were selected to present an overview of their activities in the context of data sharing.
Compared with the other actions, our datasets are orders of magnitude smaller than those of several other groups. As such, our action is not constrained by the sheer size of the data, and our main issues are not related to storage, as they are for some others. When dealing with animal experiments, we are also not affected by the EU GDPR. Several research projects, in bioinformatics and the social sciences, for example, have to deal with GDPR on a daily basis, and some projects are rendered impossible by the limits it imposes. A relatively vague understanding of GDPR and its implications does not help either, as some groups stressed. However, mitochondrial research on human samples can be subject to GDPR restrictions, regardless of how anonymous the samples are, and research teams involved in such work should seek advice from a university or hospital lawyer.
General issues of data sharing were discussed: proprietary data, imposing the FAIR principles, and how to share data from a technical point of view. No firm conclusions were reached, and it is not clear who should take the initiative in setting the guidelines and supervising data sharing. Several granting agencies now require a data sharing plan as part of applications, but the formulations of the requirements are rather new and, as it looks, still a work in progress. Meetings like this one organized by COST are therefore intended to help develop standards for data sharing and to suggest ideas on how to impose them.
The financial aspect of data sharing, such as the development of tools and paying for long-term storage, was raised but not answered in any concrete manner. In my opinion, when we compare data sharing initiatives with open access publishing, there are many more obstacles to overcome before data sharing becomes efficient. While open access publishing mainly requires payment of a fee, with the journals taking care of distribution, proper data sharing requires, in addition to infrastructure, educating researchers on how to share their data. This educational aspect was touched on but, in my opinion, not sufficiently, and it should be addressed in the future.
As for interesting organizations, RDA, a voluntary data sharing alliance, should be mentioned. We should look into the infrastructure it provides and see whether we can use it for sharing data that arise in mitochondrial research. This organization is open, voluntary, and composed of scientists who wish to share their data. It is probably also possible to discuss with representatives of RDA how to adjust the data sharing platform to better suit our needs, if that is needed.
Of the table discussions, the one on metadata was probably the most relevant to us. As we have seen in the brilliant work by several laboratories in our action, it is insufficient to share respiration rates alone, even if they are measured under what are supposedly the same conditions. Namely, we saw great variability in permeabilized fiber respiration rates caused by the procedure used to determine weight, the skill level of the researcher, and differences in the state of the chemicals used in the experiments. In the data sharing context, all this information about the data is described using metadata.
Even the definition of metadata differed between researchers from different backgrounds, demonstrating how difficult it is to agree on which metadata should be included in the description of an experiment. No universal rules are possible in this respect, and each field will have to make its own recommendations. Storing such heterogeneous datasets from different fields and analyzing them in accordance with the FAIR principles makes it very challenging to formulate metadata properly. In the discussion of the example of flame research, it was agreed that formulating a proper and agreed vocabulary has to be considered the first step of metadata description. Thus, the definition of respiratory terms made in MitoEAGLE was done in the context of resolving the metadata storage problem.
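To make the metadata point more concrete, here is a minimal sketch in Python of how a shared respiration rate could carry the information mentioned above: the weight determination procedure, the experience of the researcher, and the state of the chemicals. This is only an illustration and not a format agreed on by MitoEAGLE, COST, or RDA. The class name, the field names, the unit string, and the numbers are all made up for the example.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional
import json


@dataclass
class RespirationRecord:
    """One shared measurement together with its metadata.

    Hypothetical structure: every field name here is an assumption
    made for illustration, not an agreed vocabulary.
    """
    rate: float                                  # measured respiration rate
    rate_unit: str                               # unit stated explicitly
    weight_procedure: Optional[str] = None       # how the sample weight was determined
    operator_experience: Optional[str] = None    # skill level of the researcher
    chemical_state: Optional[str] = None         # state of the chemicals used


def missing_metadata(record: RespirationRecord) -> List[str]:
    """Return the names of the fields that were left empty."""
    return [name for name, value in asdict(record).items() if value is None]


def to_json(record: RespirationRecord) -> str:
    """Serialize the record so that data and metadata travel together."""
    return json.dumps(asdict(record), indent=2)


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    record = RespirationRecord(
        rate=1.0,
        rate_unit="pmol O2 / (s * mg)",
        weight_procedure="wet weight after blotting",
    )
    print(to_json(record))
    print("Missing metadata:", missing_metadata(record))
```

A check such as missing_metadata could be run before a dataset is uploaded, so that a rate without a description of how it was obtained is flagged early. Which fields are mandatory would still have to be decided by each field, as discussed above.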
A detailed report of the meeting is being prepared by COST, and I expect it to be posted soon on their website. In addition to a description of the discussion topics, the report will include the uploaded presentations, making it possible to take a detailed look at what was discussed at the meeting. I would like to thank COST for taking the initiative and organizing a meeting like this. It was certainly timely and, while leaving most questions unanswered, it brought together people working on data sharing and allowed them to share their experience. I think this experience will benefit our COST project and beyond.

