Linear Digressions

Explorations in Machine Learning and Data Science

Katie Malone

312 episodes · EN

Show overview

Linear Digressions has been publishing since 2014, and across the 12 years since has built a catalogue of 312 episodes. That works out to roughly 100 hours of audio in total. Releases follow a fortnightly cadence.

Episodes typically run about 19 minutes, with most landing between 15 and 24 minutes, though length varies meaningfully from one episode to the next. None of the episodes are flagged explicit by the publisher. It is catalogued as an English-language Technology show.

The show is actively publishing: the most recent episode landed 1 month ago, and 21 episodes are already out this year. The busiest year was 2016, with 62 episodes published. Published by Katie Malone.

Episodes: 312

Running: 2014–2026 · 12y

Median length: 19 min

Cadence: Fortnightly

From the publisher

Linear Digressions is a podcast about machine learning and data science. Machine learning is being used to solve a ton of interesting problems, and to accomplish goals that were out of reach even a few short years ago.

Latest Episodes

What's an AI Agent? And Why's That Hard to Define? (The Agents Season, Episode 1)

Apr 20, 2026 · 19 min

Unfaithful Chain of Thought

Apr 13, 2026 · 24 min

Benchmark Bank Heist

Apr 6, 2026 · 12 min

Benchmarking AI Models

How do you know if a new AI model is actually better than the last one? It turns out answering that question is a lot messier than it sounds. This week we dig into the world of LLM benchmarks — the standardized tests used to compare models — exploring two canonical examples: MMLU, a 14,000-question multiple choice gauntlet spanning medicine, law, and philosophy, and SWE-bench, which throws real GitHub bugs at models to see if they can fix them. Along the way: Goodhart's Law, data contamination, canary strings, and why acing a test isn't always the same as being smart.

Mar 30, 2026 · 29 min

The Hot Mess of AI (Mis-)Alignment

The paperclip maximizer — the classic AI doom scenario where a hyper-competent machine single-mindedly converts the universe into office supplies — might not be the AI risk we should actually lose sleep over. New research from Anthropic's AI safety division suggests misaligned AI looks less like an evil genius and more like a distracted wanderer who gets sidetracked reading French poetry instead of, say, managing a nuclear power plant. This week we dig into a fascinating paper reframing AI misalignment through the lens of bias-variance decomposition, and why longer reasoning chains might actually make things worse, not better. - "The Hot Mess Theory of AI Misalignment: How Misalignment Scales with Model Intelligence and Task Complexity" — Anthropic AI Safety. https://arxiv.org/abs/2503.08941

Mar 23, 2026 · 22 min

The Bitter Lesson

Every AI builder knows the anxiety: you spend months engineering prompts, tuning pipelines, and chaining calls together, then a new model drops and half your work evaporates overnight. It turns out researchers have been wrestling with this exact dynamic for 30 years, and they keep arriving at the same uncomfortable answer. That answer is called the Bitter Lesson, and understanding it might be the most important thing you can do for whatever you're building right now. From Deep Blue to AlexNet to modern LLMs, scale keeps beating sophistication, and knowing which side of that line your work falls on makes all the difference.

Links
- Richard Sutton, "The Bitter Lesson"
- Alon Halevy, Peter Norvig, and Fernando Pereira, "The Unreasonable Effectiveness of Data"
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, "ImageNet Classification with Deep Convolutional Neural Networks"

Mar 15, 2026 · 19 min

From Atari to ChatGPT: How AI Learned to Follow Instructions

Mar 9, 2026 · 25 min

It's RAG time: Retrieval-Augmented Generation

Today we are going to talk about the feature with the worst acronym in generative AI: RAG, or Retrieval-Augmented Generation. If you've ever used something like "Chat with My Docs," if you have an internal AI chatbot that has access to your company's documents, or you've created one yourself on some kind of personal project and uploaded a bunch of documents for the AI to use, you have encountered RAG, whether you know it or not. It's an extremely effective technique that works super well for taking general purpose models like ChatGPT or Claude and turning them into AIs that are aware of all the specific information that makes them truly useful in a huge variety of situations. RAG is pretty interesting under the hood, so I thought it would be fun to spend a little while talking about it. You are listening to Linear Digressions. RAG was first introduced in this paper from Facebook Research in 2020: https://arxiv.org/pdf/2005.11401

Mar 2, 2026 · 17 min

Chasing Away Repetitive LLM Responses with Verbalized Sampling

One of the things that LLMs can be really helpful with is brainstorming or generating new creative content. They are called Generative AI, after all—not just for summarization and question-and-answer tasks. But if you use LLMs for creative generation, you may find that their output starts to seem repetitive after a little while. Let's say you're asking it to create a poem, some dialogue, or a joke. If you ask once, it'll give you something that sounds pretty reasonable. But if you ask the same thing 10 times, it might give you 10 things that sound kind of the same. Today's episode is about a technique called verbalized sampling, and it's a way to mitigate this repetitiveness—this lack of diversity in LLM responses for creative tasks. But one of the things I really love about it is that in understanding why this repetitiveness happens and why verbalized sampling actually works as a mitigation technique, you start to get some pretty interesting insights and a deeper understanding of what's going on with LLMs under the surface. The paper discussed in this episode is Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity https://arxiv.org/abs/2510.01171

Feb 23, 2026 · 19 min

We're Back

It's been (*checks watch*) about five and a half years since we last talked. Fortunately nothing much has happened in the AI/data science world in that time. So let's just pick up where we left off, shall we?

Feb 16, 2026 · 2 min

A Key Concept in AI Alignment: Deep Reinforcement Learning from Human Preferences

Modern AI chatbots have a few different things that go into creating them. Today we're going to talk about a really important part of the process: the alignment training, where the chatbot goes from being just a pre-trained model—something that's kind of a fancy autocomplete—to something that really gives responses to human prompts that are more conversational, that are closer to the ones that we experience when we actually use a model like ChatGPT or Gemini or Claude. To go from the pre-trained model to one that's aligned, that's ready for a human to talk with, it uses reinforcement learning. And a really important step in figuring out the right way to frame the reinforcement learning problem happened in 2017 with a paper that we're going to talk about today: Deep Reinforcement Learning from Human Preferences. You are listening to Linear Digressions. The paper discussed in this episode is Deep Reinforcement Learning from Human Preferences https://arxiv.org/abs/1706.03741

Feb 14, 2026 · 19 min

Latest Episodes

Agent Economics (The Agents Season, Episode 10)

Agent Trust, Oversight and Control (The Agents Season, Episode 9)

Many Agents, Many Problems (The Agents Season, Episode 8)

How Do You Evaluate An AI Agent? (The Agents Season, Episode 7)

AI Agent Failure Modes (The Agents Season, Episode 6)

Agentic Planning (The Agents Season, Episode 5)

Memory Management for AI Agents (The Agents Season, Episode 4)

Lost in the Middle (The Agents Season, Episode 3)

ReAct and Tool Usage (The Agents Season, Episode 2)
