The Video DatAbservatory: A platform for behavioral discovery

Rick O. Gilmore

2018-01-26 08:50:37

Support


Agenda

Survey/discussion

Status report

New initiative(s)

Survey/discussion

Developmental science could be more open & transparent

Agree

Disagree

Developmental science should be more open and transparent

Agree

Disagree

Data from developmental research should be more widely and readily available

Agree

Disagree

Methods and materials used in developmental research should be more widely and readily available

Agree

Disagree

I have used data shared by others

Agree

Disagree

If data from publication X or project Y were more widely and readily available, I would use it

Agree

Disagree

Unless there are privacy or contractual limitations, data files described in published papers should be readily available in forms reusable by others

Agree

Disagree

I use video or audio recordings in my teaching

Agree

Disagree

I use video or audio recordings in my current research

Agree

Disagree

I can imagine using video or audio recordings in my research

Agree

Disagree

I use video or audio recordings to document my research procedures

Agree

Disagree

I could envision using video or audio recordings to document my research procedures

Agree

Disagree

Video and audio recordings require extensive and expensive post-processing and coding

Agree

Disagree

It’s hard to find and access data that I might want to repurpose

Agree

Disagree

Once found and accessed, there can be a huge cost in “harmonizing” data from different sources

Agree

Disagree

Status update

Funded by NSF (2012-16), NICHD (2013-18), and the Sloan Foundation (2017-18)

Opened spring 2014

Approaching 1,000 researchers (~680 PIs + ~290 affiliates)

500+ data/stimulus sets (~20% shared), 13,700+ hours

Free, open-source, multi-platform video/audio coding tool

Windows OS fix nearly complete

Updates for transcription

Play & Learning Across a Year (PLAY) Project

Play is the central context and activity of early development

What do parents and infants actually do when they play?

Adolph, K., Tamis-LeMonda, C. & Gilmore, R.O. (2016). PLAY Project: Webinar discussions on protocol and coding. Databrary. Retrieved January 24, 2018 from https://nyu.databrary.org/volume/232

Adolph, K., Tamis-LeMonda, C. & Gilmore, R.O. (2016). PLAY Project: Materials. Databrary. Retrieved January 24, 2018 from https://nyu.databrary.org/volume/254

  • \(n=900\) infant/mother dyads; 300 each at 12, 18, and 24 months
  • 30 dyads from each of 30 sites across the US
  • 1 hr natural activity
    • 3 min solitary toy play
    • 2 min dyadic toy play
    • video tour of home
  • Demographics + parent-report questionnaires about health, family, temperament
  • Ambient sound levels

  • Census block group geocoding

  • Video as data AND documentation
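The sample-size figures above fit together as simple arithmetic. The sketch below only restates the numbers on the slide; the even split of ages within each site is an assumption made for illustration, not something the slides state.

```python
# Design figures taken from the PLAY slide.
N_SITES = 30
DYADS_PER_SITE = 30
AGES_MONTHS = (12, 18, 24)
NATURAL_ACTIVITY_HOURS_PER_DYAD = 1

total_dyads = N_SITES * DYADS_PER_SITE            # 30 sites x 30 dyads
dyads_per_age = total_dyads // len(AGES_MONTHS)   # equal split over 3 ages

# ASSUMPTION (not stated in the slides): each site recruits evenly across ages.
dyads_per_site_per_age = DYADS_PER_SITE // len(AGES_MONTHS)

# Natural-activity video alone, at 1 hr per dyad.
natural_activity_hours = total_dyads * NATURAL_ACTIVITY_HOURS_PER_DYAD

print(total_dyads, dyads_per_age, dyads_per_site_per_age, natural_activity_hours)
```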

What questions would you ask about these sorts of data?

How could the data be made maximally useful to other researchers?

Ideas about seeking additional funding

How infrastructure can enable open, transparent, and reproducible “big data” developmental science

https://youtu.be/pW6nZXeWlGM https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation

https://youtu.be/VOC3huqHrss https://pjreddie.com/darknet/yolo/

https://www.youtube.com/watch?v=zEhlimS9feo

Jayaraman, S., et al. (2014). Natural Scene Statistics of Visual Experience Across Development and Culture. Databrary. Retrieved January 24, 2018 from http://doi.org/10.17910/B7988V


Cole, P.M., Gilmore, R.O., Scherf, K.S. & Perez-Edgar, K. (2016). The Proximal Emotional Environment Project (PEEP). Databrary. Retrieved January 24, 2018 from http://doi.org/10.17910/B7.248

Jupyter notebook

From static repository to dynamic analysis platform

NSF Research Implementations for Data Intensive Research in the Social, Behavioral, and Economic Sciences (RIDIR)

https://www.nsf.gov/pubs/2018/nsf18517/nsf18517.htm

3-4 awards, anticipated funding $4.5 M

Due February 28, 2018

Our idea

Aim 1: Enhance Databrary’s shared video & audio recordings with new, machine-generated metadata

Aim 2: Create secure, cloud-based workspace for developing and testing machine learning models on Databrary resources

Aim 3: Develop robust workflows for automated gaze direction analysis from video

Aim 1

Collaborate with Tal Yarkoni and adapt his (NIH-funded) pliers package

images/video: faces, objects, visual saliency, indoor/outdoor, text in image

sound: speech/non-speech, sound spectra

How to return ‘tags’ to Databrary in useful form? (time series + summary stats)

Offer tagging of unshared data volumes?

Privacy/confidentiality issues
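One way to picture the "time series + summary stats" idea from this slide: keep the per-frame tags as the time series, and derive per-tag totals from them. This is a minimal sketch, not the pliers API or a Databrary format; the function name, the tag labels, and the `(time, tags)` layout are all hypothetical.

```python
from collections import defaultdict

def summarize_tags(frames, frame_period_s):
    """Collapse a tag time series into per-tag summary statistics.

    frames: list of (time_s, set_of_tags), one entry per sampled frame.
    frame_period_s: seconds between sampled frames (assumed constant).
    """
    seconds = defaultdict(float)
    for _time_s, tags in frames:
        for tag in tags:
            seconds[tag] += frame_period_s
    total_s = len(frames) * frame_period_s
    return {
        tag: {"seconds": s, "proportion": s / total_s}
        for tag, s in seconds.items()
    }

# Hypothetical time series: one sample every 0.5 s.
frames = [
    (0.0, {"face"}),
    (0.5, {"face", "speech"}),
    (1.0, set()),
    (1.5, {"speech"}),
]
summary = summarize_tags(frames, frame_period_s=0.5)
print(summary)
```

The time series stays available for fine-grained analysis, while the summary gives a searchable description of each recording.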

Aim 2

Cloud-based workspace for analysis & visualization

Linked to Databrary files

Facilitate convenient sshfs or similar file-sharing, version control

Infrastructure to spawn cloud-based virtual machines to manage computationally intensive analyses

‘Ingest’ session metadata from spreadsheets
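Ingesting session metadata from a spreadsheet could start with something like the sketch below, which reads a CSV export and validates a few columns. The column names are hypothetical; the slides do not specify a spreadsheet layout or a Databrary schema.

```python
import csv
import io

# ASSUMPTION: hypothetical column names, chosen only for illustration.
REQUIRED_COLUMNS = ("session_id", "test_date", "age_months")

def ingest_sessions(csv_text):
    """Parse spreadsheet rows (exported as CSV text) into session records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    sessions = []
    for row in reader:
        sessions.append({
            "session_id": row["session_id"].strip(),
            "test_date": row["test_date"].strip(),
            "age_months": int(row["age_months"]),  # fails loudly on bad input
        })
    return sessions

example_csv = (
    "session_id,test_date,age_months\n"
    "S001,2018-01-10,12\n"
    "S002,2018-01-11,18\n"
)
sessions = ingest_sessions(example_csv)
print(sessions)
```

Checking for required columns up front means a malformed spreadsheet is rejected before any partial records are created.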

Aim 3

Collaborate with Kim Scott (LookIt), Rhodri Cusack and others

Develop and test new ML model from existing tagged training data + eye tracking

http://cusacklab.s3.amazonaws.com/html/downloads/annotate3.mp4

Source: Rhodri Cusack
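Testing a gaze model against existing human-coded tags needs some agreement measure. The sketch below uses Cohen's kappa, a common chance-corrected statistic, purely as an illustration; the slides do not say which metric the project would use, and the gaze labels are made up.

```python
from collections import Counter

def cohens_kappa(human, model):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(human) != len(model) or not human:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(human)
    observed = sum(h == m for h, m in zip(human, model)) / n
    human_counts, model_counts = Counter(human), Counter(model)
    # Agreement expected if both coders labeled independently at their base rates.
    expected = sum(human_counts[k] * model_counts[k] for k in human_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical per-frame gaze codes.
human_codes = ["left", "left", "right", "right", "away", "left"]
model_codes = ["left", "right", "right", "right", "away", "left"]
kappa = cohens_kappa(human_codes, model_codes)
print(round(kappa, 3))
```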

The bigger picture

Bring reproducible machine-assisted video/audio tagging to wider range of behavioral scientists

Make data sharing even more appealing, attractive, valuable

Make psychology a more cumulative science (Mischel 2009)

Your turn

Stack

This talk was produced on 2018-01-26 08:50:37 in RStudio 1.1.383 using R Markdown and the reveal.js framework. The code and materials used to generate the slides may be found at https://github.com/gilmore-lab/2018-01-26-p2c/. Information about the R session that produced the slides is as follows:

## R version 3.4.1 (2017-06-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.4.1  backports_1.1.0 magrittr_1.5    rprojroot_1.2  
##  [5] htmltools_0.3.6 tools_3.4.1     revealjs_0.9    yaml_2.1.14    
##  [9] Rcpp_0.12.12    stringi_1.1.5   rmarkdown_1.6   knitr_1.17     
## [13] stringr_1.2.0   digest_0.6.12   evaluate_0.10.1