[Linkpost] “Evidence that Recent AI Gains are Mostly from Inference-Scaling” by Toby_Ord

EA Forum Podcast (Curated & popular) · EA Forum Team

February 2, 2026 · 10m 1s


Show Notes

This is a link post.

In the last year or two, the most important trend in modern AI came to an end. The scaling-up of computational resources used to train ever-larger AI models through next-token prediction (pre-training) stalled out. Since late 2024, we’ve seen a new trend of using reinforcement learning (RL) in the second stage of training (post-training). Through RL, the AI models learn to do superior chain-of-thought reasoning about the problem they are being asked to solve.

This new era involves scaling up two kinds of compute:

1. the amount of compute used in RL post-training
2. the amount of compute used every time the model answers a question

Industry insiders are excited about the first new kind of scaling, because the amount of compute needed for RL post-training started off being small compared to the tremendous amounts already used in next-token prediction pre-training. Thus, one could scale the RL post-training up by a factor of 10 or 100 before even doubling the total compute used to train the model.

But the second new kind of scaling is a problem. Major AI companies were already starting to spend more compute serving their models to customers than in the training [...]
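
The claim that RL post-training could be scaled up by a factor of 10 or 100 before doubling total training compute can be checked with a little arithmetic. The sketch below is illustrative only: the show notes give no figures for the starting share of RL compute, so the `rl_fraction` values (1% and 10% of pre-training compute) and both function names are assumptions made for this example, not numbers from the article.

```python
def total_growth(rl_fraction: float, scale: float) -> float:
    """Factor by which total training compute grows when RL compute is
    multiplied by `scale`.

    Pre-training compute is normalised to 1 and RL post-training starts at
    `rl_fraction` of it (an assumed value, not taken from the article).
    """
    if rl_fraction <= 0 or scale <= 0:
        raise ValueError("rl_fraction and scale must be positive")
    before = 1.0 + rl_fraction          # pre-training + original RL
    after = 1.0 + scale * rl_fraction   # pre-training + scaled-up RL
    return after / before


def rl_scale_factor_to_double_total(rl_fraction: float) -> float:
    """RL scale-up factor k at which total training compute exactly doubles.

    Solves 1 + k * f = 2 * (1 + f) for k, giving k = (1 + 2f) / f.
    """
    if rl_fraction <= 0:
        raise ValueError("rl_fraction must be positive")
    return (1.0 + 2.0 * rl_fraction) / rl_fraction


if __name__ == "__main__":
    # Hypothetical starting shares of RL compute relative to pre-training.
    for f in (0.01, 0.1):
        k = rl_scale_factor_to_double_total(f)
        print(f"RL share {f:.0%}: total doubles at about {k:.0f}x RL scale-up")
        for scale in (10, 100):
            print(f"  {scale}x RL -> total grows {total_growth(f, scale):.2f}x")
```

Under the assumed 1% starting share, a 10x RL scale-up raises total training compute by only about 9%, and a 100x scale-up still leaves it just under double. Under the assumed 10% share, the total doubles at a 12x scale-up.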
---

First published: February 2nd, 2026

Source: https://forum.effectivealtruism.org/posts/5zfubGrJnBuR5toiK/evidence-that-recent-ai-gains-are-mostly-from-inference

Linkpost URL: https://www.tobyord.com/writing/mostly-inference-scaling

---

Narrated by TYPE III AUDIO (https://type3.audio/).

---

Images from the article:

1. Graph comparing base model and RL post-training performance on "MATH – level 5" by tokens used.
2. Scatter plot comparing Base Model and RL Post-Training performance, titled "GPQA Diamond".
3. Graph showing performance versus tokens used for base model and RL post-training, titled "OTIS Mock AIME 2024-25".
4. Graph showing performance versus tokens used for "MATH - level 5" with base and RL post-training models.
5. Graph showing performance versus tokens used for "GPQA Diamond" with base and RL models.
6. Graph showing performance versus tokens used comparing base model and reinforcement learning post-training results.
7. Graph showing performance versus tokens used for base model and RL post-training versions.
