Big data behavioral science: From micro- to macro-scale

Rick Gilmore

2018-09-07 07:20:47

Acknowledgements


Karen Adolph (NYU)

Catherine Tamis-LeMonda (NYU)

Kasey Soska (Databrary/PLAY)

Jeff Spies (Databrary, 221b.io)

Agenda

  • Is there a reproducibility ‘crisis’ in science?
  • Video as data and documentation
  • From micro- to macro-scale
  • The future of open social & behavioral science

Reproducibility

Is there a reproducibility crisis?

  • Yes, a significant crisis
  • Yes, a slight crisis
  • No crisis
  • Don’t know

Not just behavioral science…

Why is reproducibility hard?

A manifesto for reproducible science

Improving reproducibility

  • Open data
  • Open materials
  • Better, more widely shared procedural documentation
  • Data interoperability and linkage

Video as data and documentation

Adolph, K., Tamis-LeMonda, C. & Gilmore, R.O. (2017). PLAY Project: Pilot Data Collections. Databrary. Retrieved August 21, 2018 from https://nyu.databrary.org/volume/444

Video…

  • Captures spatial & temporal structure of behavior
  • Huge potential for secondary use

Video as documentation…

The PLAY Project Wiki: https://dev1.ed-projects.nyu.edu/wikis/docuwiki/doku.php/landing

Video’s challenges

  • Faces & voices, names
  • Blurring, alteration diminishes value for secondary use
  • Hard(er) to share
  • Diversity of formats

Meeting the challenges

  • Convert to consistent file formats
  • Restrict access
  • Institutional agreement
  • Secure permission to share

Big data behavioral science: From micro- to macro-scale

Play & Learning Across a Year (PLAY)

Adolph, K., Tamis-LeMonda, C. & Gilmore, R.O. (2017). PLAY Project: Pilot Data Collections. Databrary. Retrieved August 21, 2018 from https://nyu.databrary.org/volume/444

Multiple functional domains

  • Language & gesture
  • Locomotion & physical activity
  • Object interaction
  • Emotional expression

Parent-reported…

  • Demographic & health information
  • Child vocabulary
  • Media use
  • Temperament

Sampling

  • \(n=900\) dyads: 300 12-mo-olds, 300 18-mo-olds, 300 24-mo-olds
  • First-borns
  • Approximating demographics of \(n=30\) data collection sites
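
As a quick check of the sampling arithmetic, assuming dyads are spread evenly across sites (an assumption; the slide does not state the per-site allocation):

\[
3 \times 300 = 900 \ \text{dyads}, \qquad \frac{900}{30} = 30 \ \text{dyads per site on average}.
\]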

Shared on Databrary

  • Permission to share with researchers
  • With geographic codes (Census Block Group)
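
One way block-group codes support linkage: a Census Block Group GEOID is a 12-digit string that nests state (2 digits), county (3), tract (6), and block group (1), so coarser join keys can be derived by truncation. The base R sketch below is a minimal illustration of that idea. `parse_block_group()`, the `sessions` table, and the GEOID values are hypothetical and are not part of the PLAY or Databrary tooling.

```r
# Split 12-digit Census Block Group GEOIDs into nested geographic keys.
# Layout: 2-digit state + 3-digit county + 6-digit tract + 1-digit block group.
parse_block_group <- function(geoid) {
  geoid <- as.character(geoid)
  # Fail early if any code is not exactly 12 digits.
  stopifnot(all(grepl("^[0-9]{12}$", geoid)))
  data.frame(
    geoid       = geoid,
    state       = substr(geoid, 1, 2),
    county      = substr(geoid, 1, 5),    # state + county FIPS
    tract       = substr(geoid, 1, 11),   # state + county + tract
    block_group = substr(geoid, 12, 12),
    stringsAsFactors = FALSE
  )
}

# Hypothetical session records; IDs and GEOIDs are illustrative only,
# not real participant data.
sessions <- data.frame(
  session_id = c("s01", "s02"),
  geoid      = c("420270120001", "360610031002"),
  stringsAsFactors = FALSE
)

keys   <- parse_block_group(sessions$geoid)

# Attach the derived keys to each session; `county` or `tract` can then
# serve as the join column against county- or tract-level data sources.
linked <- merge(sessions, keys, by = "geoid")
print(linked)
```

GEOIDs are best stored as character strings rather than numbers, because state codes such as "01" lose their leading zero when read as numeric.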

Reproducible workflows

Video-enhanced wiki: https://dev1.ed-projects.nyu.edu/wikis/docuwiki/doku.php

  • R + R Markdown, with the acs and choroplethr packages

The future of open social & behavioral science

Improving reproducibility

  • Open data & materials
  • Better, more widely shared procedural documentation

Spanning levels of analysis

  • Planning for linkage (e.g., geographic)
  • Blurring disciplinary boundaries
  • Seeking partnerships
    • What would your colleagues want to know about the microstructure of infant/mother behavior?

Accelerating discovery

  • FAIR (Findable, Accessible, Interoperable, and Reusable) data (Wilkinson et al., 2016)
  • Federal data sources with (where possible) APIs
  • Data repositories

Maintaining participant privacy

  • Ask permission to share (especially for sensitive, identifiable data)
  • Don’t promise to destroy data (but GDPR?)
  • Don’t unduly restrict future reuses

Materials

This talk was produced on 2018-09-07 07:20:47 in RStudio 1.1.453 using R Markdown. The code and materials used to generate the slides may be found at https://github.com/gilmore-lab/2018-09-07-fprs/. Information about the R Session that produced the slides is as follows:

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.5.1  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2
##  [5] htmltools_0.3.6 tools_3.5.1     revealjs_0.9    yaml_2.2.0     
##  [9] Rcpp_0.12.18    stringi_1.2.4   rmarkdown_1.10  knitr_1.20     
## [13] stringr_1.3.1   digest_0.6.15   evaluate_0.11