No foolin’

May all our crises be ‘good’ ones

Rick Gilmore

Support

  • National Institutes of Health, R01-HD094830
  • National Science Foundation, OAS-2032713
  • John S. Templeton Foundation
  • Penn State Child Study Center

Agenda

  • Crises…
  • Struggling…
  • And resolution…

Crises…

Crisis #1

(Baker, 2016)

(Gilmore, Hillary, Lazar, & Wham, 2023)

(Baker, 2016)

(Baker, 2016)

(Baker, 2016)

(Saul, 2023)

(Oransky & Marcus, 2023)

Tim Errington

Center for Open Science

Open Science Bootcamp 2023

…The initial aim of the project was to repeat 193 experiments from 53 high-impact papers…However, the various barriers and challenges we encountered while designing and conducting the experiments meant that we were only able to repeat 50 experiments from 23 papers…

(Errington, Denis, Perfito, Iorns, & Nosek, 2021)

Tim Errington

Center for Open Science

Open Science Bootcamp 2023

…the data needed to compute effect sizes and conduct power analyses was publicly accessible for just 4 of 193 experiments…none of the 193 experiments were described in sufficient detail in the original paper to enable us to design protocols to repeat the experiments…

(Errington et al., 2021)

Tim Errington

Center for Open Science

Open Science Bootcamp 2023

…While authors were extremely or very helpful for 41% of experiments, they were minimally helpful for 9% of experiments, and not at all helpful (or did not respond to us) for 32% of experiments

(Errington et al., 2021)

Tim Errington

Center for Open Science

Open Science Bootcamp 2023

…This experience draws attention to a basic and fundamental concern about replication – it is hard to assess whether reported findings are credible.

(Errington et al., 2021)

What the actual F-test??!

What colleagues have said…

Why can’t we just trust each other?

If I have to review someone else’s statistics, I’ll stop reviewing.

I’d never use data shared by X; I don’t trust what they do.

I don’t have {time, permission, money} to {share data, share materials, pre-register a study, collect a large enough sample to be well-powered…}

Back to the search for (statistical) significance!

Three Little Pigs

Pantheon

Crisis #2a: Metaphysics

  • What are we talking about?
  • Psychology is a science of behavior and internal experience (perception, cognition, emotion…)
  • What is the logical, physical status (\(\Phi\)) of psychological states (\(\Psi\))?
  • If \(\Psi_{internal}\) can’t be measured directly but \(\Psi_{external}\) can be

Crisis #2b: Construct validity

Crisis #2c: What are the “laws” of psychological science?

https://149365049.v2.pressablecdn.com/wp-content/uploads/2014/07/6-blind-men-hans-1024x654.jpg

flowchart LR
  A([Grad-1]) --> B(Prof-1)
  B --> C[/Area-1/]
  C --> D[Dept-1]
  D --> E{College-1}
  F([Grad-2]) --> B
  G([Grad-3]) --> B
  H([Grad-4]) --> I[Prof-2]
  I --> C
  J([Grad-5]) --> K[Prof-3]
  K --> L[/Area-2/]
  L --> D
  N[Prof-4] --> L
  O[Prof-5] --> P[Dept-2]
  P --> E
  Q([Grad-6]) --> R[Prof-6]
  R --> S[Dept-3]
  S --> T{College-2}
  T --> U((Univ-1))
  E --> U

Crisis #2e: Toothbrush culture

Walter Mischel

…psychologists tend to treat other peoples’ theories like toothbrushes; no self-respecting individual wants to use anyone else’s.

(Mischel, 2011)

Walter Mischel

The toothbrush culture undermines the building of a genuinely cumulative science, encouraging more parallel play and solo game playing, rather than building on each other’s directly relevant best work.

(Mischel, 2011)

Struggling

Source: https://images.agoramedia.com/wte3.0/gcms/Why-Babies-Love-Mirrors-722x406.jpg

You may ask yourself…

Talking Heads - “Once In A Lifetime”

If it was fun, they wouldn’t have to pay you to do it.

Jerry Gilmore

Robert Merton

What is scientific culture?

  • a stock of accumulated knowledge (facts & findings)

  • a set of characteristic methods

  • a set of cultural values (Merton, 1973, p. 268)

Richard Feynman

…the idea that we all hope you have learned in studying science in school…It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty

(Feynman, 1974)

Richard Feynman

The first principle is that you must not fool yourself—and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists…

(Feynman, 1974)

Richard Feynman

…a specific, extra type of integrity that is not lying, but bending over backwards to show how you’re maybe wrong, that you ought to do when acting as a scientist. And this is our responsibility as scientists, certainly to other scientists, and I think to laymen.

(Feynman, 1974)

Richard Feynman

…if you’re doing an experiment, you should report everything that you think might make it invalid–not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment…

(Feynman, 1974)

Richard Feynman

…If you’ve made up your mind to test a theory, or you want to explain some idea, you should always decide to publish it whichever way it comes out. If we only publish results of a certain kind, we can make the argument look good. We must publish both kinds of result.

(Feynman, 1974)

…So I have just one wish for you—the good luck to be somewhere where you are free to maintain the kind of integrity I have described, and where you do not feel forced by a need to maintain your position in the organization or financial support, or so on, to lose your integrity. May you have that freedom.

(Feynman, 1974)

Resolution

Qian, Berenbaum, & Gilmore (2022)

Motivation

Vision tasks used in Qian et al. 2022

Mental rotation task

Hobbies task

(Qian et al., 2022)

(Qian et al., 2022)

(Qian et al., 2022)

Lessons learned

  • Replications are good
    • And sadly uncommon
    • Bigger (combined) samples \(\rightarrow\) higher power to detect smaller effects
    • Don’t we want to know the distribution of effect sizes | manipulations X, Y, Z?

Lessons learned

  • Preregistration helps draw bright line between confirmatory and exploratory analyses
    • Hard, but useful discipline
    • Included several post-registration exploratory analyses, including one suggested by reviewer.
    • “Causal” analysis: Vision better predictor of mental rotation scores than sex.

Lessons learned

Power analysis curves from (Qian et al., 2022)

  • Don’t fully understand relationship between sample size, power, and effect size?
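
One way to build intuition for that relationship is to compute the curves yourself. The sketch below is not the analysis from Qian et al. (2022); it assumes a simple two-sample t-test, and the effect sizes (Cohen's d) and sample sizes are illustrative choices only. It uses `power.t.test()` from base R's stats package.

```r
# Sketch: power as a function of per-group sample size for a two-sample t-test.
# Effect sizes and sample sizes are illustrative assumptions, not values from the paper.
effect_sizes <- c(0.2, 0.5, 0.8)
n_per_group <- seq(10, 200, by = 10)

# With sd = 1, delta is the standardized effect size (Cohen's d).
power_for <- function(n, d, alpha = 0.05) {
  stats::power.t.test(n = n, delta = d, sd = 1, sig.level = alpha)$power
}

# Rows index sample sizes; columns index effect sizes.
power_grid <- sapply(effect_sizes, function(d) {
  sapply(n_per_group, power_for, d = d)
})
colnames(power_grid) <- paste0("d = ", effect_sizes)

# One curve per effect size, with the conventional 80% power target marked.
matplot(n_per_group, power_grid, type = "l", lty = 1,
        xlab = "Participants per group", ylab = "Power", ylim = c(0, 1))
abline(h = 0.8, lty = 2)
legend("bottomright", legend = colnames(power_grid), col = 1:3, lty = 1)

# Solve in the other direction: per-group n needed for 80% power when d = 0.2.
n_needed <- stats::power.t.test(delta = 0.2, sd = 1, power = 0.8)$n
```

Reading across a row of `power_grid` shows how quickly power falls as the effect shrinks at a fixed sample size; `n_needed` shows how large a study has to be before a small effect is reliably detectable.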

Lessons learned

  • We need bigger samples…

…Assuming a realistic range of prior probabilities for null hypotheses, false report probability is likely to exceed 50% for the whole (psychology and cognitive neuroscience) literature. (Szucs & Ioannidis, 2017)
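
To see why low power pushes the false report probability up, here is a minimal sketch. It assumes the simplest definition (the share of statistically significant results for which the null hypothesis is actually true) and ignores bias; the function name and the input values are hypothetical and chosen only for illustration, not taken from Szucs & Ioannidis (2017).

```r
# False report probability (FRP): among significant results, the fraction where H0 is true.
# Simplified sketch that ignores bias and questionable research practices.
false_report_prob <- function(power, prior_h0, alpha = 0.05) {
  false_pos <- alpha * prior_h0         # significant results when H0 is true
  true_pos  <- power * (1 - prior_h0)   # significant results when H1 is true
  false_pos / (false_pos + true_pos)
}

# Assumed inputs: 80% of tested hypotheses are true nulls.
frp_low_power  <- false_report_prob(power = 0.2, prior_h0 = 0.8)  # 0.5
frp_high_power <- false_report_prob(power = 0.8, prior_h0 = 0.8)  # 0.2
```

Under these assumed inputs, raising power from 20% to 80% cuts the false report probability from one half to one fifth, which is the arithmetic behind "we need bigger samples."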

Lessons learned

  • Internally reproducible workflows
    • Bolster confidence (“don’t fool yourself”)
    • Easier to share with collaborators, the world
    • Always in a “share-able” form
    • Especially scriptable figures!

https://github.com/gilmore-lab/sex-differences-project

  • Words + Images + Videos + Code + Figures \(\rightarrow\) web pages, .pptx, PDF, .docx, etc.
  • Gilmore & Valcin talk at the Open Science Bootcamp 2023.

Lessons learned

  • Plot (all of) your data
  • And save the plots, so you can use them later
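
A minimal sketch of what "plot all of it, and save it" can look like in a script. The data are simulated stand-ins, and the folder and file names are hypothetical; it uses only base R graphics, so it is not the plotting code from the sex-differences-project repo.

```r
# Simulated scores standing in for real task data (an assumption for illustration).
set.seed(1)
scores <- data.frame(
  group = rep(c("A", "B"), each = 50),
  score = c(rnorm(50, mean = 0), rnorm(50, mean = 0.3))
)

# Hypothetical output folder; in a real project this would live inside the repo.
fig_dir <- file.path(tempdir(), "figs")
dir.create(fig_dir, showWarnings = FALSE, recursive = TRUE)
fig_path <- file.path(fig_dir, "scores-by-group.pdf")

# Open a file device, draw every observation (not just group means), then close it.
pdf(fig_path, width = 6, height = 4)
stripchart(score ~ group, data = scores, method = "jitter",
           vertical = TRUE, pch = 16, xlab = "Group", ylab = "Score")
invisible(dev.off())
```

Because the figure is written by code, rerunning the script after the data change regenerates the same file, ready to drop into slides or a manuscript.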

Contrast Task

Motion Task

Contrast Task

Motion Task

Contrast Task

Motion Task

Lessons learned

  • Working collaboratively “in the open” is hard
  • Open science practices take time and effort
  • Quick (& dirty) is, well…
  • Pay now (incrementally) or pay (a lot) later…

Figure 1: Embed curation & QA or defer it

Opening the file drawer

\(\Psi_{internal}\) states not directly observable, but…

  • Behaviors are!
  • Contexts can be!

Images and video are essential to the study of behavior

(Gilmore & Adolph, 2017)

Play & Learning Across a Year (PLAY)

  • What do mothers and their infants actually do when they are together at home?
  • \(n=\) 1,000 dyads
  • 30+ sites
  • 12-, 18-, & 24-month-olds
  • 1 hr natural behavior; video home tour; video-recorded survey questionnaires
  • All openly shared on Databrary.org

(Soska et al., 2021, Fig 3)

(Soska et al., 2021, Fig 1)

Time series of PLAY recruiting calls

Distribution of prospective participant birthweights

Parent-reported child bedtime

Crawling onset

Comparing Adolph to WHO walk onset criteria

Crawl onset vs. Walk onset

  • Video + other identifiable/sensitive data
  • Share openly, but not publicly, with a restricted community of researchers who are authorized by their institutions.
  • Share with explicit participant/parent permission
  • Python equivalent (databrarypy) in progress.
databraryr::login_db(Sys.getenv("DATABRARY_LOGIN"))
[1] TRUE
df <- databraryr::download_session_csv(vol_id = 4) |> 
  readr::read_csv()
xtabs(~ `participant-gender` + `participant-race`, df)
                  participant-race
participant-gender Asian Black or African American More than one
            Female     2                         2             6
            Male       7                         0             5
                  participant-race
participant-gender Unknown or not reported White
            Female                       2    18
            Male                         3    22

Download and open a video from Databrary: https://nyu.databrary.org/volume/1/slot/9807/-?asset=1

db_vid <- databraryr::download_asset()
system(paste0("open ", db_vid))

https://penn-state-open-science.github.io/bootcamp-2023/

“Good enough” practices

  • Dr. Alaina Pearce, Good enough practices data and project management | talk slides
  • Dr. Alaina Pearce, Data management: Practicalities | talk slides

Gilmore’s additions…

Methods reproducibility

Methods reproducibility refers to the provision of enough detail about study procedures and data so the same procedures could, in theory or in actuality, be exactly repeated.

(Goodman et al., 2016)

  • Does a typical journal article satisfy this?
  • Does project X from your lab group satisfy this?
  • Reproducible to your team \(\rightarrow\) reproducible to others

Open science practices need not be all or none

Richard Feynman

The first principle is that you must not fool yourself—and you are the easiest person to fool….

(Feynman, 1974)

Richard Feynman

So I have just one wish for you—the good luck to be somewhere where you are free to maintain the kind of integrity I have described…May you have that freedom.

(Feynman, 1974)

What about the other crises?

  • Speed vs. accuracy, quantity vs. quality
  • Avoiding cognitive biases in our own work (e.g., Munafò et al., 2017)
  • \(\Phi\leftarrow\Psi_{internal}\)? That’s another talk

(Baker, 2016)

May all our crises be ‘good’ ones…

The replication crisis has led to positive structural, procedural, and community changes.

(Korbmacher et al., 2023)

If it was fun, they wouldn’t have to pay you to do it.

But now that it’s more fun, they still should pay you…probably better than they are now. Especially now that you have my new great-grandson to love and support…

Great Grandpa Jerry Gilmore

No foolin’: May all our crises be ‘good’ ones

Rick Gilmore
rog1 AT-SYMBOL psu PERIOD edu
114 Moore
github.com/gilmore-lab
rick-gilmore.com
github.com/psu-psychology
github.com/penn-state-open-science

Resources

qrcode::qr_code("https://gilmore-lab.github.io/dev-prosem-2023-fall/") |>
  plot()

HTML slides from Rick Gilmore’s talk on 2023-09-06

About

This talk was produced using Quarto with the RStudio Integrated Development Environment (IDE), version 2023.6.2.561, (Posit team, 2023).

The source files are in R and R Markdown, then rendered to HTML using the revealJS framework. The HTML slides are hosted in a GitHub repo and served by the GitHub pages service at the following URL: https://gilmore-lab.github.io/dev-prosem-2023-fall/

References

Abramov, I., Gordon, J., Feldman, O., & Chavarga, A. (2012). Sex & vision I: Spatio-temporal resolution. Biology of Sex Differences, 3(1), 20. https://doi.org/10.1186/2042-6410-3-20
Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature News, 533(7604), 452. https://doi.org/10.1038/533452a
Errington, T. M., Denis, A., Perfito, N., Iorns, E., & Nosek, B. A. (2021). Challenges for assessing replicability in preclinical cancer biology. eLife, 10, e67995. https://doi.org/10.7554/eLife.67995
Feynman, R. P. (1974). Cargo cult science. Retrieved from https://calteches.library.caltech.edu/51/2/CargoCult.htm
Gabelica, M., Bojčić, R., & Puljak, L. (2022). Many researchers were not compliant with their published data sharing statement: A mixed-methods study. Journal of Clinical Epidemiology, 150, 33–41. https://doi.org/10.1016/j.jclinepi.2022.05.019
Gilmore, R. O., & Adolph, K. E. (2017). Video can make behavioural research more reproducible. Nature Human Behaviour, 1. https://doi.org/10.1038/s41562-017-0128
Gilmore, R. O., Hillary, F., Lazar, N., & Wham, B. (2023). Penn State open science survey. Retrieved from https://penn-state-open-science.github.io/survey-fall-2022/index.html
Gilmore, R. O., & Kohler, P. J. (n.d.). Symmetry-sorting: Behavioral studies associated with symmetry project. GitHub. Retrieved from https://github.com/gilmore-lab/symmetry-sorting
Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12–341ps12. https://doi.org/10.1126/scitranslmed.aaf5027
Kohler, P. J., Vedak, S., & Gilmore, R. O. (2022). Perceptual similarities among wallpaper group exemplars. Symmetry, 14(5), 857. https://doi.org/10.3390/sym14050857
Korbmacher, M., Azevedo, F., Pennington, C. R., Hartmann, H., Pownall, M., Schmidt, K., … Evans, T. (2023). The replication crisis has led to positive structural, procedural, and community changes. Communications Psychology, 1(1), 1–13. https://doi.org/10.1038/s44271-023-00003-2
Merton, R. K. (1973). The normative structure of science. In R. K. Merton & N. W. Storer (Eds.), The sociology of science: Theoretical and empirical investigations (pp. 267–278). The University of Chicago Press.
Mischel, W. (2011). Becoming a cumulative science. APS Observer, 22(1). Retrieved from https://www.psychologicalscience.org/observer/becoming-a-cumulative-science
Mitroff, I. I. (1974). Norms and counter-norms in a select group of the Apollo moon scientists: A case study of the ambivalence of scientists. American Sociological Review, 39(4), 579–595. https://doi.org/10.2307/2094423
Miyakawa, T. (2020). No raw data, no science: Another possible source of the reproducibility crisis. Molecular Brain, 13(1), 24. https://doi.org/10.1186/s13041-020-0552-2
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Sert, N. P. du, … Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021
Murray, S. O., Schallmo, M.-P., Kolodny, T., Millin, R., Kale, A., Thomas, P., … Tadin, D. (2018). Sex differences in visual motion processing. Current Biology: CB. https://doi.org/10.1016/j.cub.2018.06.014
Oransky, I., & Marcus, A. (2023). Science corrects itself, right? A scandal at Stanford says it doesn’t. Scientific American. Retrieved from https://www.scientificamerican.com/article/science-corrects-itself-right-a-scandal-at-stanford-says-it-doesnt/
Posit team. (2023). RStudio: Integrated development environment for R. Boston, MA: Posit Software, PBC. Retrieved from http://www.posit.co/
Qian, Y., Berenbaum, S. A., & Gilmore, R. O. (2022). Vision contributes to sex differences in spatial cognition and activity interests. Scientific Reports, 12(1), 17623. https://doi.org/10.1038/s41598-022-22269-y
Qian, Y., Seisler, A. R., & Gilmore, R. O. (2021). Children’s perceptual sensitivity to optic flow-like visual motion differs from adults. Developmental Psychology, 57(11), 1810–1821. https://doi.org/10.1037/dev0001227
Saul, S. (2023). Stanford president will resign after report found flaws in his research. The New York Times. Retrieved from https://www.nytimes.com/2023/07/19/us/stanford-president-resigns-tessier-lavigne.html
Shaqiri, A., Roinishvili, M., Grzeczkowski, L., Chkonia, E., Pilz, K., Mohr, C., … Herzog, M. H. (2018). Sex-related differences in vision are heterogeneous. Scientific Reports, 8(1), 7521. https://doi.org/10.1038/s41598-018-25298-8
Soska, K. C., Xu, M., Gonzalez, S. L., Herzberg, O., Tamis-LeMonda, C. S., Gilmore, R. O., & Adolph, K. E. (2021). (Hyper)active data curation: A video case study from behavioral science. Journal of eScience Librarianship, 10(3). https://doi.org/10.7191/jeslib.2021.1208
Szucs, D., & Ioannidis, J. P. A. (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15(3), e2000797. https://doi.org/10.1371/journal.pbio.2000797
Tedersoo, L., Küngas, R., Oras, E., Köster, K., Eenmaa, H., Leijen, Ä., … Sepp, T. (2021). Data sharing practices and data availability upon request differ across scientific disciplines. Scientific Data, 8(1), 192. https://doi.org/10.1038/s41597-021-00981-0
Vanpaemel, W., Vermorgen, M., Deriemaecker, L., & Storms, G. (2015). Are we wasting a good crisis? The availability of psychological research data after the storm. Collabra, 1(1). https://doi.org/10.1525/collabra.13
Watson, C. (2022, June). Many researchers say they’ll share data — but don’t. https://doi.org/10.1038/d41586-022-01692-1
Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. The American Psychologist, 61(7), 726–728. https://doi.org/10.1037/0003-066X.61.7.726