2021-10-19 08:55:00

Preliminaries

About me

  • Professor of Psychology
  • Founding Director of Human Imaging, PSU SLEIC
  • Co-Founder/Co-Director Databrary.org

  • B.A. in cognitive science, Brown University
  • Ph.D. in cognitive neuroscience, Carnegie Mellon University
  • vision, perception & action, brain development, open science
  • ham (K3ROG), actor/singer, banjo-picker, hiker/cyclist, coder

Overview

  • Reproducibility in science
  • Science and sin
  • BS: Beyond sin

Reproducibility in science

What proportion of studies in the published scientific literature are actually true?

  • 100%
  • 90%
  • 70%
  • 50%
  • 30%

How do we define what “actually true” means?

Is there a reproducibility crisis in science?

  • Yes, a significant crisis
  • Yes, a slight crisis
  • No crisis
  • Don’t know

What does “reproducibility” mean?

  • Are we all talking about the same thing?

Methods reproducibility

  • Enough detail about materials & methods is recorded (& reported) that others can repeat the procedures
  • The same results can be obtained with the same materials & methods

(Goodman, Fanelli, & Ioannidis, 2016)

Results reproducibility

  • Obtaining the same results from an independent study whose procedures match the original as closely as possible

(Goodman, Fanelli, & Ioannidis, 2016)

Inferential reproducibility

  • Drawing qualitatively similar conclusions from an independent replication or a reanalysis of the original study

(Goodman, Fanelli, & Ioannidis, 2016)

Factors contributing to irreproducible research

  • Selective reporting, pressure to publish, and low statistical power rank among the most-cited factors in Nature’s survey of 1,500 scientists

(Baker, 2016)

Reproducibility crisis

  • Not just psychology and related behavioral sciences
  • “Hard” sciences, too
  • Challenges span everything from data collection to statistical analysis to reporting and publishing

Robert Merton and The ‘Ethos’ of Science


  • universalism: scientific validity is independent of sociopolitical status/personal attributes of its participants
  • communalism: common ownership of scientific goods (intellectual property)
  • disinterestedness: scientific institutions benefit a common scientific enterprise, not specific individuals
  • organized skepticism: claims should be exposed to critical scrutiny before being accepted

So, is science ‘full of it?’

  • BS persuades, but (knowingly) disregards truth (Frankfurt, On Bullshit)
  • Science practices and communications attempt to persuade about the truth
  • But if truth value disregarded, overlooked, downplayed, exaggerated…

Science and sin

The sin of unreliability

  • There are many ways to be wrong…

Studies are underpowered

Assuming a realistic range of prior probabilities for null hypotheses, false report probability is likely to exceed 50% for the whole literature.

(Szucs & Ioannidis, 2017)

  • Too often our studies are too small…
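
How could most published findings be false? A worked sketch (mine, not the talk’s) of the false report probability makes the arithmetic concrete: the chance that a “significant” result is false depends on the prior probability that the tested effect is real and on the study’s power. The prior and power values below are illustrative assumptions, not estimates from Szucs & Ioannidis (2017).

```r
# False report probability: P(no true effect | p < alpha)
# Illustrative numbers, not figures from Szucs & Ioannidis (2017)
frp <- function(prior, power, alpha = 0.05) {
  false_pos <- alpha * (1 - prior)  # significant results when there is no effect
  true_pos  <- power * prior        # significant results when the effect is real
  false_pos / (false_pos + true_pos)
}

frp(prior = 0.1, power = 0.2)  # long-shot hypothesis, underpowered study: ~0.69
frp(prior = 0.5, power = 0.8)  # plausible hypothesis, well-powered study: ~0.06
```

With low power and long-shot hypotheses, well over half of “significant” findings can be false, which is how a literature-wide false report probability can exceed 50%.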

The sin of hoarding…

Psychologists asked to share data…

(Vanpaemel, Vermorgen, Deriemaecker, & Storms, 2015)

Response                        Percent
No reply                        41%
Refused/unable to share data    18%
No data despite promise          4%
Data shared after reminder      16%
Data shared after 1st request   22%

The sin of corruptibility…

  • e.g., LaCour & Green (2014): a high-profile Science study of canvassing and support for gay equality, retracted after evidence of data irregularities

The sin of bias…

Bem, D.J. (2011). Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407-425.

This article reports 9 experiments, involving more than 1,000 participants, that test for retroactive influence by ‘time-reversing’ well-established psychological effects so that the individual’s responses are obtained before the putatively causal stimulus events occur.

We argue that in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal…

(Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011)

We conclude that Bem’s p values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data.

(Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011)
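
To make the call for “conservative rather than liberal” tests concrete, here is a minimal sketch of a default Bayesian t test using the BayesFactor R package; this is not Wagenmakers and colleagues’ own analysis, and the data are simulated, not Bem’s.

```r
# Contrast a classical t test with a default Bayesian t test
# (BayesFactor package; simulated data, not Bem's)
# install.packages("BayesFactor")
library(BayesFactor)

set.seed(2011)
hits <- rnorm(100, mean = 51, sd = 10)  # illustrative "hit rates"; chance = 50

t.test(hits, mu = 50)   # classical test of the mean against chance
ttestBF(hits, mu = 50)  # default Bayes factor (BF10) for the same comparison;
                        # BF10 < 1 favors the null, values near 1 are indecisive
```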

The sin of hurrying…

The sin of narrowmindedness…

…psychologists tend to treat other people’s theories like toothbrushes; no self-respecting individual wants to use anyone else’s.

(Mischel, 2011)

The toothbrush culture undermines the building of a genuinely cumulative science, encouraging more parallel play and solo game playing, rather than building on each other’s directly relevant best work.

(Mischel, 2011)

The sin of pragmatism…

Reviewers and editors want novel, interesting results. Why would I waste my time doing careful direct replications?

Reviewing papers is hard, unpaid work. If I have to check someone’s stats, too, I’ll quit.

– Any number of researchers I’ve talked with

In our defense…

Behavior is multidimensional

Embedded in nested networks of causal factors

Humans are diverse

But much of the (lab-based) data collected comes from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) populations

(Henrich, Heine, & Norenzayan, 2010)

Data about humans are sensitive, hard(er) to share

  • Must protect participants’ identities
  • Must protect participants from harm/embarrassment
  • Must anonymize and/or get permission
  • Restrictions/controls in academic/biomedical research far exceed those in commercial settings

Psychology is harder than physics

BS: Beyond sin

Beyond sin

  • No physics envy, but we can learn from physics and other fields
  • Openness and transparency are the means, not the ends
  • How do we accelerate, broaden, and deepen discovery?

Reproducibility in psychological science

  • Bigger samples (a power-analysis sketch follows this list)
  • Multiple replications
  • Registration
  • Data, procedure, and materials sharing
  • “Data blinding”
  • Larg(er)-scale replication studies
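
On “bigger samples,” a minimal planning sketch using base R’s power.t.test(); the medium effect size and 80% power target are conventional choices, not values from the talk.

```r
# Participants per group needed to detect a "medium" standardized effect
# (Cohen's d = 0.5) with 80% power at two-sided alpha = .05
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
# n comes out near 64 per group; many classic studies ran with far fewer
```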

Reproducibility Project: Psychology (RPP)

We conducted replications of 100…studies published in three psychology journals using high-powered designs and original materials when available….The mean effect size (r) of the replication effects …was half the magnitude of the mean effect size of the original effects…

(Open Science Collaboration, 2015)

Ninety-seven percent of original studies had significant results (P < .05). Thirty-six percent of replications had significant results.

(Open Science Collaboration, 2015)

39% of effects were subjectively rated to have replicated the original result…

(Open Science Collaboration, 2015)

(Camerer et al., 2018)

If it’s too good to be true, it probably isn’t

Video as data and documentation

Improved statistical practices

  • Automated checking of in-text statistics (in American Psychological Association format) via statcheck (Nuijten, Hartgerink, van Assen, Epskamp, & Wicherts, 2015); a short sketch follows this list
  • Redefine “statistical significance” as \(p < .005\)? (Benjamin et al., 2017)
  • Or move away from NHST toward more robust and cumulative practices (Bayesian, CI/effect-size-driven) that better capture what we know, or think we know
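
As promised above, a minimal sketch of automated checking with the statcheck R package; the input strings are invented examples, not taken from any real paper.

```r
# statcheck() parses APA-style results, recomputes each p value from the
# reported statistic and degrees of freedom, and flags mismatches
# install.packages("statcheck")
library(statcheck)

txt <- c(
  "t(28) = 2.20, p = .036",   # internally consistent
  "F(2, 40) = 3.22, p = .01"  # recomputed p is about .05, not .01
)
statcheck(txt)  # returns one row per detected result with consistency flags
```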

Store data, materials, code in repositories

  • Data libraries
  • Funder, journal mandates for sharing increasing
  • But no long-term, stable funding sources for curation, archiving, and sharing

Stuff happens

But science is still a pretty good shovel

Resources

This talk was produced on 2021-10-19 in RStudio using R Markdown. The code and materials used to generate the slides may be found at https://github.com/gilmore-lab/2021-10-19-bs-in-science/. Information about the R Session that produced the code is as follows:

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 11.6
## 
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets 
## [6] methods   base     
## 
## other attached packages:
## [1] DiagrammeR_1.0.6.1
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.7         knitr_1.33        
##  [3] servr_0.23         magrittr_2.0.1    
##  [5] R6_2.5.0           jpeg_0.1-9        
##  [7] rlang_0.4.11       highr_0.9         
##  [9] stringr_1.4.0      visNetwork_2.1.0  
## [11] tools_4.1.0        websocket_1.4.1   
## [13] xfun_0.24          png_0.1-7         
## [15] jquerylib_0.1.4    htmltools_0.5.1.1 
## [17] yaml_2.2.1         digest_0.6.27     
## [19] processx_3.5.2     RColorBrewer_1.1-2
## [21] later_1.2.0        htmlwidgets_1.5.3 
## [23] sass_0.4.0         promises_1.2.0.1  
## [25] ps_1.6.0           mime_0.11         
## [27] glue_1.4.2         evaluate_0.14     
## [29] rmarkdown_2.9      stringi_1.7.3     
## [31] compiler_4.1.0     bslib_0.2.5.1     
## [33] jsonlite_1.7.2     pagedown_0.15     
## [35] httpuv_1.6.1

References

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature News, 533(7604), 452. https://doi.org/10.1038/533452a

Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., … Johnson, V. E. (2017). Redefine statistical significance. Nature Human Behaviour. https://doi.org/10.1038/s41562-017-0189-z

Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., … Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2, 637–644. https://doi.org/10.1038/s41562-018-0399-z

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716

Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12–341ps12. https://doi.org/10.1126/scitranslmed.aaf5027

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83. https://doi.org/10.1017/S0140525X0999152X

LaCour, M. J., & Green, D. P. (2014). When contact changes minds: An experiment on transmission of support for gay equality. Science, 346(6215), 1366–1369. https://doi.org/10.1126/science.1256151

Mischel, W. (2011). Becoming a cumulative science. APS Observer, 22(1). Retrieved from https://www.psychologicalscience.org/observer/becoming-a-cumulative-science

Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Sert, N. P. du, … Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021

Nuijten, M. B., Hartgerink, C. H. J., Assen, M. A. L. M. van, Epskamp, S., & Wicherts, J. M. (2015). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 1–22. https://doi.org/10.3758/s13428-015-0664-2

Szucs, D., & Ioannidis, J. P. A. (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15(3), e2000797. https://doi.org/10.1371/journal.pbio.2000797

Vanpaemel, W., Vermorgen, M., Deriemaecker, L., & Storms, G. (2015). Are we wasting a good crisis? The availability of psychological research data after the storm. Collabra, 1(1). https://doi.org/10.1525/collabra.13

Wagenmakers, E.-J., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100(3), 426–432. https://doi.org/10.1037/a0022790

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. https://doi.org/10.1037/0003-066X.61.7.726