AISN #66: Evaluating Frontier Models, New Gemini and Claude, Preemption is Back

AI Safety Newsletter · Center for AI Safety

December 2, 2025 · 12m 27s

Show Notes

<p> Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p> In this edition we discuss the new AI Dashboard, recent frontier models from Google and Anthropic, and a revived push to preempt state AI regulations.</p><p> Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.</p><p><strong> CAIS Releases the AI Dashboard for Frontier Performance</strong></p><p> CAIS launched its AI Dashboard, which evaluates frontier AI systems on capability and safety benchmarks. The dashboard also tracks the industry's overall progression toward broader milestones such as AGI, automation of remote labor, and full self-driving.</p><p> How the dashboard works. The AI Dashboard features three leaderboards—one for text, one for vision, and one for risks—where frontier models are ranked according to their average score across a battery of benchmarks. 
Because CAIS evaluates models directly across a wide range of tasks, the dashboard provides apples-to-apples comparisons of how different frontier models perform on the same set of evaluations and safety-relevant behaviors.</p><p> Ranking frontier models for [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:33) CAIS Releases the AI Dashboard for Frontier Performance</p><p>(04:05) Politicians Revive Push for Moratorium on State AI Laws</p><p>(06:39) Gemini 3 Pro and Claude Opus 4.5 Arrive</p><p>(09:17) In Other News</p><p>(09:20) Government</p><p>(10:15) Industry</p><p>(11:03) Civil Society</p><p>(12:00) Discussion about this post</p> <p>---</p> <p><b>First published:</b><br/> December 2nd, 2025 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-66-aisn-66-evaluating?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-66-aisn-66-evaluating</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p> <p>---</p><div style="max-width: 100%;"><p><strong>Images from the article:</strong></p><a href="https://substackcdn.com/image/fetch/$s_!f-UV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f08358-439a-4e39-a811-5d4f78ab870b_1786x958.png" target="_blank"><img src="https://substackcdn.com/image/fetch/$s_!f-UV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f08358-439a-4e39-a811-5d4f78ab870b_1786x958.png" alt="Graph showing AI model performance over time titled &quot;Average Score&quot; with &quot;Risk Index Lower is Better&quot; subtitle." style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/$s_!Y8M1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3db661b-4306-44ae-9a6c-3bb6a35a1929_1600x505.png" target="_blank"><img src="https://substackcdn.com/image/fetch/$s_!Y8M1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3db661b-4306-44ae-9a6c-3bb6a35a1929_1600x505.png" alt="Table showing AI model performance scores across reasoning, coding, and gaming benchmarks." 
style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/$s_!Yav1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087dee12-73ff-4f2d-8b0f-d5df932ccdb1_1922x870.png" target="_blank"><img src="https://substackcdn.com/image/fetch/$s_!Yav1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087dee12-73ff-4f2d-8b0f-d5df932ccdb1_1922x870.png" alt="Bar chart titled &quot;Risk Index&quot; showing risk scores for 10 AI models." style="max-width: 100%;" /></a><p><em>Apple Podcasts and Spotify do not show images in the episode description. Try <a href="https://pocketcasts.com/" target="_blank" rel="noreferrer">Pocket Casts</a>, or another podcast app.</em></p></div>