Season 2 · Episode 1752

Whisper Small Beats Whisper Large in Speed & Accuracy

A GPU benchmark on Ubuntu shows the 1.5B-parameter Whisper Large is slower and less accurate than the much smaller Whisper Small.

My Weird Prompts · Daniel Rosehill

March 29, 2026 · 26m 58s

Show Notes

A new benchmark on Ubuntu Linux using Handy and ONNX Runtime tested 13 speech-to-text models on a consumer AMD Radeon RX 7800 XT. The results reveal a surprising reality: OpenAI's massive Whisper Large model was nearly 3x slower and made 3 errors, while the tiny Whisper Small finished in under 1 second with zero errors. This episode explores why bigger isn't always better in AI, the "Goldilocks zone" of latency, and why streaming models might be the wrong tool for push-to-talk workflows.
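
For readers who want to try a comparison like this themselves, here is a minimal sketch of how latency and error count could be measured for one model on one clip. The show notes do not say how errors were counted, so word-level edit distance against a reference transcript is an assumption here, and `transcribe` is a hypothetical stand-in for whatever function calls the model; it is not Handy's or ONNX Runtime's API.

```python
import time
from typing import Callable, Tuple


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance: substitutions + insertions + deletions.

    Assumption: case is ignored and words are split on whitespace.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def benchmark(transcribe: Callable[[str], str],
              audio_path: str,
              reference: str) -> Tuple[float, int]:
    """Time one transcription call and count its word errors.

    `transcribe` is a hypothetical callable: audio path in, text out.
    """
    start = time.perf_counter()
    text = transcribe(audio_path)
    elapsed = time.perf_counter() - start
    return elapsed, word_errors(reference, text)
```

Running `benchmark` once per model on the same clip and reference text gives a (seconds, errors) pair for each, which is the kind of side-by-side the episode discusses. A single run is noisy, so repeating it a few times and discarding the first (warm-up) call would be a sensible refinement.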
