Open Data and Developmental Science

Rick O. Gilmore

2018-10-12 08:23:05


Open Data and Developmental Science (ODDS) Initiative


  • Motivation
  • State of open developmental science
  • Needs assessment and future planning


Gilmore, R. O. (2016). From big data to deep insight in developmental science. Wiley Interdisciplinary Reviews: Cognitive Science, 7(2), 112–126. Retrieved October 9, 2018, from

The advancement of detailed and diverse knowledge about the development of the world’s children is essential for improving the health and well-being of humanity. We regard scientific integrity, transparency, and openness as essential for the conduct of research and its application to practice and policy.

SRCD Task Force on Scientific Integrity and Openness

the principles of human subject research require an analysis of both risks and benefits…such an analysis suggests that researchers may have a positive duty to share data in order to maximize the contribution that individual participants have made.

(Brakewood & Poldrack, 2013)

State of open developmental science

Large-scale research projects

Environmental influences on Child Health Outcomes (ECHO)

Play & Learning Across a Year (PLAY)

\(n=900\) 12-, 18-, 24-mo-olds; \(n=30\) sites; \(n=65\) launch group members


  • Adolph, Tamis-LeMonda, Gilmore (Co-PIs); Buss, Perez-Edgar, Berenbaum, Chi (PSU)
  • Video as data and documentation
  • 900+ infant/mother dyads, 1-hr video recordings of natural home activity
  • Solitary play, dyadic play, video home tour; ambient sound levels
  • Demographics, health, language, media use surveys
  • All openly (w/researchers) shared on

Adolescent Brain Cognitive Development (ABCD) Study

Goings on at Penn State

Needs assessment and future planning

Possible areas of activity

  • How to meet NIMH data sharing requirements
  • Coordinating access to, storage of large-scale datasets (e.g., ABCD)
  • Using restricted data: How and why
  • R-eproducible research with R and R Markdown
  • Reproducible workflows in cognitive and affective neuroscience
  • Support for secondary data analysis
  • Ethics of sharing sensitive data
  • Data management: It’s 12am, do you know where your data are?
  • Sharing data across levels of analysis (geo- and other linkages)
  • Reproducible workflows with data repositories (e.g., databraryapi R package)
  • The 5/5/5 proposal
  • Data repositories as ‘platforms for discovery’
  • Video as data and documentation: Why and how

Video and behavioral analysis

Behavior is the critical factor underlying many issues in public health. Behavior contributes to the progression or prevention of disease; it defines disorders or marks recovery; and it provides leverage points for therapeutic intervention.

Clinicians and health researchers have many tools to measure physical health—from blood assays to brain images. But where are the tools to measure healthy and risky behaviors in the contexts where they naturally occur?

Video analysis of human behavior is the next frontier in AI, health, and biomedicine.

Jayaraman, S., Smith, L.B., Raudies, F. & Gilmore, R.O. (2014). Natural Scene Statistics of Visual Experience Across Development and Culture. Databrary. Retrieved October 11, 2018 from

Ossmy, Gilmore, & Adolph (in press)

Discussion: Where do we want to go? How do we get there?

What kinds of data do you

  • Produce/collect yourself?
  • Gather and analyze from other sources?

What challenges do you face in

  • Data collection or gathering?
  • Data analysis?
  • Data storage or management?
  • Data sharing?

If you share data now, where?

  • Federal data repository
  • Institutional repository
  • Domain repository (e.g., Databrary, ICPSR, OpenNeuro)
  • Personal/lab/project/department website

If you want to share (or share more) data, what barriers must be overcome?

Are you interested in the video analysis of behavior?

Funding opportunities


This talk was produced on 2018-10-12 08:23:06 in RStudio 1.1.453 using R Markdown and the reveal.js framework. The code and materials used to generate the slides may be found at Information about the R session that produced the slides is as follows:

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## other attached packages:
## [1] revealjs_0.9
## loaded via a namespace (and not attached):
##  [1] compiler_3.5.1  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2
##  [5] tools_3.5.1     htmltools_0.3.6 yaml_2.2.0      Rcpp_0.12.18   
##  [9] stringi_1.2.4   rmarkdown_1.10  knitr_1.20      stringr_1.3.1  
## [13] digest_0.6.16   evaluate_0.11