Benefits of Data Sharing (and Disadvantages)

Understanding and evaluating the pros and cons of sharing your data is increasingly important as journals are more frequently incentivizing authors to share their data. You can openly share your data as a tool/resource on NITRC.

How To Share Your Data

Should you share your neuroimaging data?

Increasingly, journals are incentivizing authors to share their data – and in some cases, it is becoming required for publication. Given this, it is important to understand and evaluate the pros and cons of sharing your data.

The pros are:

  1. Increased citations. Some researchers might publish using your data and cite you. Even if researchers do not publish using your data, they might run a rudimentary analysis to compare with their own results and cite you. On average, there is a 25% citation boost for linking your data to your publications.
  2. Create new collaborations. Some researchers might contact you to assist them in reanalyzing your data for a specific question that might not have been relevant to you at the time of publication – or they might want to submit a grant application with you to do so.
  3. You might get another publication out of publishing your data. For example, Scientific Data, a Nature Group journal, focuses on the publication of data only: you describe your data and how it is publicly released, and provide a detailed description and basic quality metrics.
  4. It will be easier for you or your collaborators to reuse that data. Admittedly, the person most likely to reanalyze that data in the future is probably you. If you ever have to reanalyze it, having it properly formatted and readily available online could save you dozens of hours of headaches: digging through old hard drives, CDs, or even archival tapes, or re-contacting the student who acquired the data and has since moved to another job.
  5. You might increase your chance of getting funding. This is especially true for agencies such as the NIH (National Institutes of Health in the USA) that promote such practices. Showing a track record of releasing your data will show that you mean business!
  6. You are benefiting science. There is probably a reason why you became an academic scientist that goes beyond personal gain. Perhaps, it was to inspire new generations or advance human knowledge. There is little doubt that sharing your data is more aligned with these beliefs than not sharing your data.

The cons are:

  1. Preparation time. It takes time to format your data so it can be shared online. Admittedly, some researchers simply dump their files with no documentation on NITRC, figshare, OSF, or another repository that does not impose any data formatting. However, for neuroimaging data, we would advocate the use of repositories such as OpenNeuro that enforce formats such as BIDS. There are now many software tools that format your data as BIDS in just a few clicks. For example, for EEG, which is my area of research, there is an EEGLAB plugin that automatically converts raw imported data files to BIDS. Formatting your data as BIDS ensures that it contains all the documentation necessary for reuse and that it is compatible with standard processing pipelines.
  2. Data, especially when you have a lot of it, is power. We all know researchers who have acquired a couple of dozen neuroimaging datasets and restrict access to a handful of collaborators who might publish with the data, so that they can be a co-author. Releasing the data publicly would mean a loss of potential publications and collaborations. If you are one of these researchers and it works for you, then you should probably continue to do that. However, a large number of researchers cling to their data just in case such an opportunity comes up. If you are one of those, we would argue that the potential benefits of publicly releasing your data outweigh the potential loss. First, you are probably more likely to spur collaborations with your data online. Second, if you do not format and release your data once you are finished with the experiment, doing so a couple of years down the road will be much harder.
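
To give a feel for what "formatting as BIDS" involves, here is a minimal Python sketch of the top-level files a BIDS dataset starts with. It is not the EEGLAB plugin mentioned above and it does not convert any recordings; it only creates the skeleton that conversion tools fill in. The function name, dataset name, participant labels, and the BIDS version string are all assumptions chosen for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path


def create_bids_skeleton(root, dataset_name, participant_labels, bids_version="1.8.0"):
    """Create the top-level files of a BIDS dataset (hypothetical helper).

    The version string is an assumed default; use the version of the
    specification you are actually following.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    # dataset_description.json: "Name" and "BIDSVersion" are its required fields.
    description = {"Name": dataset_name, "BIDSVersion": bids_version}
    (root / "dataset_description.json").write_text(
        json.dumps(description, indent=2) + "\n", encoding="utf-8"
    )

    # README: free-text description of the dataset for future users.
    (root / "README").write_text(f"{dataset_name}\n", encoding="utf-8")

    # participants.tsv: tab-separated, one row per participant.
    with open(root / "participants.tsv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["participant_id"])
        for label in participant_labels:
            writer.writerow([f"sub-{label}"])

    # One folder per participant; "eeg" is the modality folder assumed here.
    for label in participant_labels:
        (root / f"sub-{label}" / "eeg").mkdir(parents=True, exist_ok=True)

    return root


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        dataset = create_bids_skeleton(Path(tmp) / "my_study", "My EEG study", ["01", "02"])
        for path in sorted(dataset.rglob("*")):
            print(path.relative_to(dataset))
```

The recordings themselves and their metadata still have to be added under each participant folder, and running a BIDS validator on the result is a sensible check before uploading.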

I hope I have convinced you to share your data, if not for the public good, for your own benefit.

Sharing data is just the first step. The second step is to share your analysis pipeline, and hopefully I will get to write a future blog post on this topic.

- Arnaud Delorme, Ph.D. – shared his first study containing raw EEG datasets from 16 participants in 2002 on his personal website

Licensing Information Available

While licensing in neuroimaging research is a topic that has yet to settle clearly, we want to provide researchers with up-to-date information from the community. This information is now captured on NITRC in the Community Information section of the User Guide. View Licensing Information >

Data Usage on the Image Repository

Each year, NITRC delivers about 125 TB of data to neuroimaging researchers, most of it from the Image Repository. Curious to see what all the excitement is about? View the Image Repository >

NITRC-CE LITE Offers a Lightweight Computational Environment for Data Analysis

NITRC-CE LITE is designed to run neuroimaging pipelines as a lightweight container player, using container technologies such as Docker or Singularity. Many NITRC-hosted software tools (as well as many others) have containers available. See Installed Packages for NITRC-CE LITE >

The Results are in for the 2020 State of Neuroimaging Survey

Encouragingly, the majority of respondents (61.7%) publicly share data they acquire. View the full results here.

Subscribe here to be informed about future State of Neuroimaging Surveys.
Copyright © 2020 NeuroImaging Tools and Resources Collaboratory, All rights reserved.