P-Values: Are we using a flawed statistical tool?
Episode 17

Normal Curves: Sexy Science, Serious Statistics · Regina Nuzzo and Kristin Sainani

September 22, 2025 · 1h 14m

Show Notes

P-values show up in almost every scientific paper, yet they’re one of the most misunderstood ideas in statistics. In this episode, we break from our usual journal-club format to unpack what a p-value really is, why researchers have fought about it for a century, and how that famous 0.05 cutoff became enshrined in science. Along the way, we share stories from our own papers—from a Nature feature that helped reshape the debate to a statistical sleuthing project that uncovered a faulty method in sports science. The result: a behind-the-scenes look at how one statistical tool has shaped the culture of science itself.


Statistical topics

  • Bayesian statistics
  • Confidence intervals 
  • Effect size vs. statistical significance
  • Fisher’s conception of p-values
  • Frequentist perspective
  • Magnitude-Based Inference (MBI)
  • Multiple testing / multiple comparisons
  • Neyman-Pearson hypothesis testing framework
  • P-hacking
  • Posterior probabilities
  • Preregistration and registered reports
  • Prior probabilities
  • P-values
  • Researcher degrees of freedom
  • Significance thresholds (p < 0.05)
  • Simulation-based inference
  • Statistical power 
  • Statistical significance
  • Transparency in research 
  • Type I error (false positive)
  • Type II error (false negative)
  • Winner’s Curse


Methodological morals

  • “If p-values tell us the probability the null is true, then octopuses are psychic.”
  • “Statistical tools don't fool us, blind faith in them does.”
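
The first moral can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the episode. It assumes Paul the octopus went 8 for 8 (the record commonly reported for the 2010 World Cup; the notes above do not state a number), that a "psychic" octopus would always be right, and that a skeptic gives psychic powers a one-in-a-million prior. The function names `binomial_p_value` and `posterior_null` are hypothetical helpers written for this example.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, chance: float = 0.5) -> float:
    """One-sided p-value: P(at least `successes` correct picks | pure guessing)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def posterior_null(
    successes: int,
    trials: int,
    prior_null: float,
    psychic_accuracy: float,
    chance: float = 0.5,
) -> float:
    """P(null is true | data) by Bayes' rule, comparing two simple hypotheses."""
    failures = trials - successes
    # Likelihood of the observed record under guessing and under "psychic".
    like_null = comb(trials, successes) * chance**successes * (1 - chance) ** failures
    like_alt = (
        comb(trials, successes)
        * psychic_accuracy**successes
        * (1 - psychic_accuracy) ** failures
    )
    weighted_null = prior_null * like_null
    return weighted_null / (weighted_null + (1 - prior_null) * like_alt)


# Assumed record: 8 correct picks out of 8 (not stated in these notes).
p_value = binomial_p_value(successes=8, trials=8)

# Assumed prior: one-in-a-million chance that the octopus is psychic.
posterior = posterior_null(
    successes=8, trials=8, prior_null=1 - 1e-6, psychic_accuracy=1.0
)

print(f"p-value: {p_value:.4f}")
print(f"P(null | data): {posterior:.4f}")
```

Under these assumptions the p-value falls well below 0.05, yet the posterior probability that Paul was only guessing stays close to 1. The p-value answers "how surprising is this record if the null is true?", not "how likely is the null?"; the second question also depends on the prior.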


References


Kristin and Regina’s online courses: 

Demystifying Data: A Modern Approach to Statistical Understanding  

Clinical Trials: Design, Strategy, and Analysis 

Medical Statistics Certificate Program  

Writing in the Sciences 

Epidemiology and Clinical Research Graduate Certificate Program 

Programs that we teach in:

Epidemiology and Clinical Research Graduate Certificate Program 


Find us on:

Kristin - LinkedIn & Twitter/X

Regina - LinkedIn & ReginaNuzzo.com

  • (00:00) - Intro & claim of the episode
  • (01:00) - Why p-values matter in science
  • (02:44) - What is a p-value? (ESP guessing game)
  • (06:47) - Big vs. small p-values (psychic octopus example)
  • (08:29) - Significance thresholds and the 0.05 rule
  • (09:00) - Regina’s Nature paper on p-values
  • (11:32) - Misconceptions about p-values
  • (13:18) - Fisher vs. Neyman-Pearson (history & feud)
  • (16:26) - Botox analogy and type I vs. type II errors
  • (19:41) - Dating app analogies for false positives/negatives
  • (22:02) - How the 0.05 cutoff got enshrined
  • (24:43) - Misinterpretations: statistical vs. practical significance
  • (26:19) - Effect size, sample size, and “statistically discernible”
  • (26:48) - P-hacking and researcher degrees of freedom
  • (29:49) - Transparency, preregistration, and open science
  • (30:55) - The 0.05 cutoff trap (p = 0.049 vs 0.051)
  • (31:21) - The biggest misinterpretation: what p-values actually mean
  • (33:32) - Paul the psychic octopus (worked example)
  • (36:02) - Why Bayesian statistics differ
  • (39:52) - Why aren’t we all Bayesian? (probability wars)
  • (41:08) - The ASA p-value statement (behind the scenes)
  • (43:19) - Key principles from the ASA white paper
  • (44:18) - Wrapping up Regina’s paper
  • (45:36) - Kristin’s paper on sports science (MBI)
  • (48:13) - What MBI is and how it spread
  • (50:46) - How Kristin got pulled in (Christie Aschwanden & FiveThirtyEight)
  • (54:08) - Critiques of MBI and “Bayesian monster” rebuttal
  • (56:17) - Spreadsheet autopsies (Welsh & Knight)
  • (58:08) - Cherry juice example (why MBI misleads)
  • (01:00:25) - Rebuttals and smoke & mirrors from MBI advocates
  • (01:02:58) - Winner’s Curse and small samples
  • (01:03:41) - Twitter fights & “establishment statistician”
  • (01:05:59) - Cult-like following & Matrix red pill analogy
  • (01:08:09) - Wrap-up


Topics

Normal Curves podcast, Regina Nuzzo, Kristin Sainani, Stanford, statistics, best statistics podcast, p-values, statistical significance, type I error, type II error, hypothesis testing, Fisher, Neyman-Pearson, Bayesian statistics, prior probabilities, posterior probabilities, statistical power, multiple testing, p-hacking, magnitude-based inference, confidence intervals, effect size, winner’s curse, preregistration, open science, research transparency, science culture, scientific methods, replication crisis, sports science, Nature paper, statistical sleuthing