
Data Sharing Is Key to Overcoming the Reproducibility Crisis

Look at research about why supporting reproducibility is important and how data sharing is key to overcoming the reproducibility crisis.

by Adi Gaskell · Jul. 01, 17 · Big Data Zone · Analysis

The opening up of the data that underpins scientific research is something I’ve long supported, and whilst a degree of progress has been made in the area, there is still much to be done.

For instance, I wrote earlier this year about a study from Elsevier that discussed an apparent paradox in the research world. It revealed that whilst the majority of researchers openly admit that their work would benefit from more open data, few of them actually share their own data.

The study consisted of a survey of over 1,200 researchers from around the world in fields including genetics and humanities. It came to a number of clear conclusions:

  • Researchers support open data, at least when it comes to the benefits their own research derives from it. They’re much less familiar with sharing their own data, citing inexperience and academic culture as significant reasons.
  • Funders are not driving change. Researchers didn’t feel that funders’ wishes for data to be shared more widely were driving change, with most researchers believing they own the data used in their work.
  • Many are still not sharing at all. 34% of researchers don’t publish data at all, and when they do, it’s usually in the form of tables and annexes rather than raw data.
  • Patchy standards. Researchers are evenly divided between those who think that good standards exist for citing published data and those who do not.
  • Subject-specific sharing. There were also pronounced differences in the sharing practices across subject areas, with some subjects firmly embedding data sharing into the design and execution of research.

Supporting Reproducibility

Something the paper didn’t touch on is the challenge the research industry has with reproducibility. A recent paper from researchers at Penn State suggests that this may have as much to do with the difficulties in managing data as any other factor.

“What we researchers try to do is provide the science-consuming public with genuine insights about brain and behavior,” the authors say. “We want to say things that are robust and true. Without reproducibility, it’s hard to say that convincingly.”

A growing number of technology systems are available to help researchers not only ensure their own data is usable but also work easily with the data generated (and shared) by others. There are also data repositories, such as the Databrary platform developed by lead author Rick Gilmore, to assist researchers.

Nowhere are the challenges greater than in cognitive neuroscience. It’s an extremely computationally intensive field, with data produced in a wide range of sizes and formats from devices such as EEGs and MRI scanners. This leads to a fragmented landscape for data sharing in the sector.

“Right now, data sharing is still largely unfunded and unrewarded and is only rarely required,” the authors say. “It’s something that isn’t a universal requirement for federal grant funding, for example.”

They suggest that rather than treating published papers as the finished product, researchers should regard the data underpinning those papers as equally important.

“In addition to publishing scientific papers, behavioral and brain scientists need to be more open about the detailed procedures underlying their studies, more freely share the statistical programs that they use in analyzing data,” they say. “And researchers should share the data itself as openly as possible.”

It’s something that I think most people outside of the research industry understand, but changing behaviors inside it remains challenging.


Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
