Episode 17

P-Values: Are we using a flawed statistical tool?

Normal Curves: Sexy Science, Serious Statistics · Regina Nuzzo and Kristin Sainani

September 22, 2025 · 1h 14m


Show Notes

P-values show up in almost every scientific paper, yet they’re one of the most misunderstood ideas in statistics. In this episode, we break from our usual journal-club format to unpack what a p-value really is, why researchers have fought about it for a century, and how that famous 0.05 cutoff became enshrined in science. Along the way, we share stories from our own papers—from a Nature feature that helped reshape the debate to a statistical sleuthing project that uncovered a faulty method in sports science. The result: a behind-the-scenes look at how one statistical tool has shaped the culture of science itself.

Statistical topics

Bayesian statistics
Confidence intervals
Effect size vs. statistical significance
Fisher’s conception of p-values
Frequentist perspective
Magnitude-Based Inference (MBI)
Multiple testing / multiple comparisons
Neyman-Pearson hypothesis testing framework
P-hacking
Posterior probabilities
Preregistration and registered reports
Prior probabilities
P-values
Researcher degrees of freedom
Significance thresholds (p < 0.05)
Simulation-based inference
Statistical power
Statistical significance
Transparency in research
Type I error (false positive)
Type II error (false negative)
Winner’s Curse

Methodological morals

“If p-values tell us the probability the null is true, then octopuses are psychic.”
“Statistical tools don't fool us, blind faith in them does.”
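The first moral can be illustrated with a small calculation. The sketch below is not taken from the episode: it assumes Paul the octopus's widely reported run of 8 correct picks out of 8, treats each pick as a fair coin flip under the null hypothesis, and uses a purely hypothetical prior of one in a million that an octopus is psychic. The function names and the prior are illustrative assumptions.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) if each guess is right with probability p_null."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def posterior_null(prior_null: float, lik_null: float, lik_alt: float) -> float:
    """Bayes' rule: probability the null is true after seeing the data."""
    numerator = prior_null * lik_null
    return numerator / (numerator + (1 - prior_null) * lik_alt)


# Assumed data: 8 correct predictions out of 8.
p_value = binomial_p_value(8, 8)

# Hypothetical prior: a one-in-a-million chance that the octopus is psychic.
prior_null = 1 - 1e-6

# Likelihood of 8/8 under the null (pure luck) and under the alternative
# (assumed here: a psychic octopus is always right).
lik_null = 0.5**8
lik_alt = 1.0

prob_null_given_data = posterior_null(prior_null, lik_null, lik_alt)

print(f"p-value:                 {p_value:.4f}")
print(f"P(null is true | data):  {prob_null_given_data:.4f}")
```

Under these assumptions the p-value is small (1/256), yet the probability that the null is true stays very close to 1. The p-value answers "how surprising is this data if the octopus is just guessing?", not "how likely is it that the octopus is just guessing?"; reading it as the latter is the inverse fallacy.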

References

Nuzzo R. Scientific method: statistical errors. Nature. 2014 Feb 13;506(7487):150-2. doi: 10.1038/506150a.
Nuzzo, R., 2015. Scientists perturbed by loss of stat tools to sift research fudge from fact. Scientific American, pp.16-18.
Nuzzo RL. The inverse fallacy and interpreting P values. PM&R. 2015 Mar;7(3):311-4. doi: 10.1016/j.pmrj.2015.02.011. Epub 2015 Feb 25.
Nuzzo, R., 2015. Probability wars. New Scientist, 225(3012), pp.38-41.
Sainani KL. Putting P values in perspective. PM&R. 2009 Sep;1(9):873-7. doi: 10.1016/j.pmrj.2009.07.003.
Sainani KL. Clinical versus statistical significance. PM&R. 2012 Jun;4(6):442-5. doi: 10.1016/j.pmrj.2012.04.014.
McLaughlin MJ, Sainani KL. Bonferroni, Holm, and Hochberg corrections: fun names, serious changes to p values. PM&R. 2014 Jun;6(6):544-6. doi: 10.1016/j.pmrj.2014.04.006. Epub 2014 Apr 22.
Sainani KL. The Problem with "Magnitude-based Inference". Med Sci Sports Exerc. 2018 Oct;50(10):2166-2176. doi: 10.1249/MSS.0000000000001645.
Sainani KL, Lohse KR, Jones PR, Vickers A. Magnitude-based Inference is not Bayesian and is not a valid method of inference. Scand J Med Sci Sports. 2019 Sep;29(9):1428-1436. doi: 10.1111/sms.13491.
Lohse KR, Sainani KL, Taylor JA, Butson ML, Knight EJ, Vickers AJ. Systematic review of the use of "magnitude-based inference" in sports science and medicine. PLoS One. 2020 Jun 26;15(6):e0235318. doi: 10.1371/journal.pone.0235318.
Wasserstein, R.L. and Lazar, N.A., 2016. The ASA statement on p-values: context, process, and purpose. The American Statistician, 70(2), pp.129-133.

Kristin and Regina’s online courses:

Demystifying Data: A Modern Approach to Statistical Understanding

Clinical Trials: Design, Strategy, and Analysis

Medical Statistics Certificate Program

Writing in the Sciences

Epidemiology and Clinical Research Graduate Certificate Program

Programs that we teach in:

Epidemiology and Clinical Research Graduate Certificate Program

Find us on:

Kristin - LinkedIn & Twitter/X

Regina - LinkedIn & ReginaNuzzo.com

(00:00) - Intro & claim of the episode
(01:00) - Why p-values matter in science
(02:44) - What is a p-value? (ESP guessing game)
(06:47) - Big vs. small p-values (psychic octopus example)
(08:29) - Significance thresholds and the 0.05 rule
(09:00) - Regina’s Nature paper on p-values
(11:32) - Misconceptions about p-values
(13:18) - Fisher vs. Neyman-Pearson (history & feud)
(16:26) - Botox analogy and type I vs. type II errors
(19:41) - Dating app analogies for false positives/negatives
(22:02) - How the 0.05 cutoff got enshrined
(24:43) - Misinterpretations: statistical vs. practical significance
(26:19) - Effect size, sample size, and “statistically discernible”
(26:48) - P-hacking and researcher degrees of freedom
(29:49) - Transparency, preregistration, and open science
(30:55) - The 0.05 cutoff trap (p = 0.049 vs 0.051)
(31:21) - The biggest misinterpretation: what p-values actually mean
(33:32) - Paul the psychic octopus (worked example)
(36:02) - Why Bayesian statistics differ
(39:52) - Why aren’t we all Bayesian? (probability wars)
(41:08) - The ASA p-value statement (behind the scenes)
(43:19) - Key principles from the ASA white paper
(44:18) - Wrapping up Regina’s paper
(45:36) - Kristin’s paper on sports science (MBI)
(48:13) - What MBI is and how it spread
(50:46) - How Kristin got pulled in (Christie Aschwanden & FiveThirtyEight)
(54:08) - Critiques of MBI and “Bayesian monster” rebuttal
(56:17) - Spreadsheet autopsies (Welsh & Knight)
(58:08) - Cherry juice example (why MBI misleads)
(01:00:25) - Rebuttals and smoke & mirrors from MBI advocates
(01:02:58) - Winner’s Curse and small samples
(01:03:41) - Twitter fights & “establishment statistician”
(01:05:59) - Cult-like following & Matrix red pill analogy
(01:08:09) - Wrap-up

Topics

Normal Curves podcast, Regina Nuzzo, Kristin Sainani, Stanford, statistics, best statistics podcast, p-values, statistical significance, type I error, type II error, hypothesis testing, Fisher, Neyman-Pearson, Bayesian statistics, prior probabilities, posterior probabilities, statistical power, multiple testing, p-hacking, magnitude-based inference, confidence intervals, effect size, winner’s curse, preregistration, open science, research transparency, science culture, scientific methods, replication crisis, sports science, Nature paper, statistical sleuthing
