
Season 2 · Episode 1752
Whisper Small Beats Whisper Large in Speed & Accuracy
A GPU benchmark on Ubuntu shows the 1.5B-parameter Whisper Large is slower and less accurate than the far smaller Whisper Small.
My Weird Prompts · Daniel Rosehill
March 29, 2026 · 26m 58s
Show Notes
A new benchmark on Ubuntu Linux using Handy and ONNX Runtime tested 13 speech-to-text models on a consumer AMD Radeon RX 7800 XT. The results reveal a surprising reality: OpenAI's massive Whisper Large model was nearly 3x slower and made 3 errors, while the far smaller Whisper Small finished in under 1 second with zero errors. This episode explores why bigger isn't always better in AI, the "Goldilocks zone" of latency, and why streaming models might be the wrong tool for push-to-talk workflows.