Season 2 · Episode 1548

Modal and the End of the Serverless GPU Cold Start

Stop waiting for containers to warm up. Discover how Modal is reinventing GPU infrastructure to eliminate friction in AI development.

My Weird Prompts · Daniel Rosehill

March 25, 2026 · 20m 56s

Show Notes

Serverless computing promised a frictionless experience, but the reality for many AI developers has been a cycle of waiting for containers to warm up and GPUs to initialize. In this episode, we dive deep into Modal, the platform challenging the cloud giants by building a custom container runtime and scheduler from the ground up specifically for high-performance AI workloads. We explore technical breakthroughs like GPU snapshots that slash cold starts from fifteen seconds to under three, and the financial "51% rule" that helps teams decide between serverless and bare-metal infrastructure. From massive concurrency in video generation to the hurdles of running architectural simulations in Linux-native environments, we examine how Modal is reshaping the way we think about compute. Discover why the next generation of AI applications requires a fundamental shift in how we manage infrastructure.
