  • The reproducibility crisis in science
  • Databrary.org
  • Questions to discuss

The reproducibility crisis in science

What proportion of findings in the published scientific literature (in the fields you care about) are actually true?

  • 100%
  • 90%
  • 70%
  • 50%
  • 30%

How do we define what “actually true” means?

Is there a reproducibility crisis in science?

  • Yes, a significant crisis
  • Yes, a slight crisis
  • No crisis
  • Don’t know

Have you failed to reproduce an analysis from your lab or someone else’s?

Will emphasizing transparency and openness in science…

…yield more robust and reliable findings that others can readily build upon?

(SRCD, 2019)

Is open sharing of research data and materials…

…essential for the conduct of research and its application to practice and policy?

(SRCD, 2019)


Data about people requires protection

  • Breaches of privacy
  • Breaches of confidentiality
  • How are data collected?
  • How are data stored and shared?

Video and audio data pose special risks

  • Faces & voices
  • Names, personal locations
  • Behaviors

Video data have unique research potential

How to protect against risk & realize potential?

Databrary.org

  • World’s only data library specialized for storing and sharing video and audio
  • Hosted at New York University
  • Opened 2014
  • 563 institutions; 1,665 researchers; 53,222 hours of video + other data; 523 shared projects

How Databrary protects personal data

Open sharing (but with restricted audiences)

  • Researchers require institutional authorization
  • Formal access agreement
  • Site-wide access, not dataset-specific
  • Data use and contribution
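
The access rules above can be sketched as a small check. This is an illustrative model only, not Databrary's actual implementation or API: the names `Researcher`, `Dataset`, and `can_access`, and all of their fields, are hypothetical. It assumes that a dataset owner must also have chosen to share the dataset, and that authorization is evaluated once per researcher rather than once per dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    """Hypothetical record of a researcher's standing with the library."""
    name: str
    institution_authorized: bool   # the institution has vouched for this researcher
    signed_access_agreement: bool  # a formal access agreement is in place


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record of a dataset held by the library."""
    title: str
    shared_by_owner: bool  # assumption: the owner decides whether to share


def is_authorized(researcher: Researcher) -> bool:
    # Both conditions are required: institutional authorization and a signed agreement.
    return researcher.institution_authorized and researcher.signed_access_agreement


def can_access(researcher: Researcher, dataset: Dataset) -> bool:
    # Site-wide access: nothing here depends on which dataset is requested,
    # only on whether its owner has shared it with authorized users.
    return is_authorized(researcher) and dataset.shared_by_owner


if __name__ == "__main__":
    alice = Researcher("Alice", institution_authorized=True, signed_access_agreement=True)
    bob = Researcher("Bob", institution_authorized=False, signed_access_agreement=True)
    shared = Dataset("Infant locomotion videos", shared_by_owner=True)
    private = Dataset("Unreleased pilot data", shared_by_owner=False)

    for person in (alice, bob):
        for data in (shared, private):
            print(person.name, data.title, can_access(person, data))
```

The point of the sketch is that an authorized researcher does not apply separately for each dataset, which is what distinguishes site-wide access from dataset-specific access.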


  • Restricted data sharing has a long track record
  • Meaningful sharing permission; clarifies nature of risk
  • Empowers participants
  • Researchers & institutions determine what to share & when

  • Open, but not public, sharing
  • Researchers and institutions need not reinvent the wheel
  • More discoverable than personal websites or institutional repositories
  • More secure than public data and materials services or journal web pages

  • Consistent curation makes reuse easier
  • Works for data beyond video
  • Secure data interaction via API

Databrary 2.0

  • Updated policy framework
  • Rewriting in Node.js, Hasura/GraphQL, Vue.js/Quasar

Questions to discuss


Where do researchers in your field share their data and materials?

If sharing data and materials is not commonplace, why?

What barriers must be overcome to make it commonplace?

Can’t sharing data in repositories make reproducible workflows easier?

Who owns data? Who should?

Does de-identification offer sufficient protection to participants?

Shouldn’t most (all?) human data be shared via restricted means?



