Season 2 · Episode 1500

Why Google is Killing RAG and OpenAI Embraces Latency

The era of the chatbot is over. Discover how the "agentic substrate" of 2026 is redefining computing through GPT, Gemini, and Claude.

My Weird Prompts · Daniel Rosehill

March 24, 2026 · 22m 18s

Show Notes

The era of talking to a box on a screen is officially over. In this episode, we explore the transition into the "Multi-Surface Operating Layer," where AI serves as an invisible substrate for professional life rather than a standalone product. We dive deep into the technical divergence of late March 2026, comparing the architectural DNA of GPT-5.4, Gemini 3.1, and Claude 4.6. Why is Claude leading in real-world coding while Gemini dominates fluid intelligence benchmarks? We break down the trade-offs between OpenAI’s high-latency "Thinking" models and Google’s low-latency recursive memory. Beyond the software, we discuss the strategic move to AMD hardware and the legal clouds looming over training data. This episode provides a comprehensive roadmap for anyone building in the new AI stack, from the nuances of Mixture-of-Experts routing to the shift toward universal multimodal perception. Whether you are a developer, researcher, or tech enthusiast, this deep dive reveals how the choice of model now determines the very logic of your automated workflows.
