
(Voiceover) OpenAI's Reinforcement Finetuning and RL for the masses
December 11, 2024
12m 40s
Show Notes
Original post:
https://www.interconnects.ai/p/openais-reinforcement-finetuning
Chapters
00:00 Introduction
04:19 The impact of reinforcement finetuning’s existence
07:29 Hypotheses on reinforcement finetuning’s implementation
Figures
Fig. 1, Yann’s Cake
Fig. 2, Grader config
Fig. 3, RLVR learning curves
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe