Open Data and Developmental Science

Rick O. Gilmore

2018-10-12 08:23:05

Preliminaries



Open Data and Developmental Science (ODDS) Initiative

http://csc.la.psu.edu/research/collaborative-research-initiatives

Agenda

  • Motivation
  • State of open developmental science
  • Needs assessment and future planning

Motivation

Gilmore, R. O. (2016). From big data to deep insight in developmental science. Wiley Interdisciplinary Reviews: Cognitive Science, 7(2), 112–126. Retrieved October 9, 2018, from http://onlinelibrary.wiley.com/doi/10.1002/wcs.1379/full

The advancement of detailed and diverse knowledge about the development of the world’s children is essential for improving the health and well-being of humanity. We regard scientific integrity, transparency, and openness as essential for the conduct of research and its application to practice and policy.

SRCD Task Force on Scientific Integrity and Openness

the principles of human subject research require an analysis of both risks and benefits…such an analysis suggests that researchers may have a positive duty to share data in order to maximize the contribution that individual participants have made.

(Brakewood & Poldrack, 2013)

State of open developmental science

Large-scale research projects

Environmental influences on Child Health Outcomes (ECHO)

Play & Learning Across a Year (PLAY)

\(n=900\) 12-, 18-, 24-mo-olds; \(n=30\) sites; \(n=65\) launch group members

PLAY

  • Adolph, Tamis-LeMonda, Gilmore (Co-PIs); Buss, Perez-Edgar, Berenbaum, Chi (PSU)
  • Video as data and documentation
  • 900+ infant/mother dyads, 1-hr video recordings of natural home activity
  • Solitary play, dyadic play, video home tour; ambient sound levels
  • Demographics, health, language, media use surveys
  • All shared openly (with researchers) on Databrary.org

Adolescent Brain Cognitive Development (ABCD) Study

https://data-archive.nimh.nih.gov/abcd

https://www.srcd.org/meetings/special-topic-meetings/devsec18

Goings on at Penn State

Needs assessment and future planning

Possible areas of activity

  • How to meet NIMH data sharing requirements
  • Coordinating access to and storage of large-scale datasets (e.g., ABCD)
  • Using restricted data: How and why
  • R-eproducible research with R and R Markdown
  • Reproducible workflows in cognitive and affective neuroscience
  • Support for secondary data analysis
  • Ethics of sharing sensitive data
  • Data management: It’s 12am, do you know where your data are?
  • Sharing data across levels of analysis (geo- and other linkages)
  • Reproducible workflows with data repositories (e.g., databraryapi R package)
  • The 5/5/5 proposal
  • Data repositories as ‘platforms for discovery’
  • Video as data and documentation: Why and how

Video and behavioral analysis

Behavior is the critical factor underlying many issues in public health. Behavior contributes to the progression or prevention of disease; it defines disorders or marks recovery; and it provides leverage points for therapeutic intervention.

Clinicians and health researchers have many tools to measure physical health—from blood assays to brain images. But where are the tools to measure healthy and risky behaviors in the contexts where they naturally occur?

Video analysis of human behavior is the next frontier in AI, health, and biomedicine.

Jayaraman, S., Smith, L.B., Raudies, F. & Gilmore, R.O. (2014). Natural Scene Statistics of Visual Experience Across Development and Culture. Databrary. Retrieved October 11, 2018 from http://doi.org/10.17910/B7988V

Ossmy, Gilmore, & Adolph (in press)

https://pjreddie.com/darknet/yolo/

https://www.youtube.com/watch?v=MPU2HistivI&feature=youtu.be

Discussion: Where do we want to go? How do we get there?

What kinds of data do you

  • Produce/collect yourself?
  • Gather and analyze from other sources?

What challenges do you face in

  • Data collection or gathering?
  • Data analysis?
  • Data storage or management?
  • Data sharing?

If you share data now, where?

  • Federal data repository
  • Institutional repository
  • Domain repository (e.g., Databrary, ICPSR, OpenNeuro)
  • Personal/lab/project/department website

If you want to share (or share more) data, what barriers must be overcome?

Are you interested in the video analysis of behavior?

Funding opportunities

Resources

This talk was produced on 2018-10-12 08:23:06 in RStudio 1.1.453 using R Markdown and the reveal.js framework. The code and materials used to generate the slides may be found at https://github.com/gilmore-lab/DEVSEC-2018/promise-of-open-dev-sci/. Information about the R session that produced the slides is as follows:

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] revealjs_0.9
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.5.1  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2
##  [5] tools_3.5.1     htmltools_0.3.6 yaml_2.2.0      Rcpp_0.12.18   
##  [9] stringi_1.2.4   rmarkdown_1.10  knitr_1.20      stringr_1.3.1  
## [13] digest_0.6.16   evaluate_0.11