
AISN #45: Center for AI Safety 2024 Year in Review
AI Safety Newsletter · Center for AI Safety
December 19, 2024 · 11m 31s
Show Notes
<p>As 2024 draws to a close, we want to thank you for your continued support for AI safety and review what we’ve been able to accomplish. In this special-edition newsletter, we highlight some of our most important projects from the year.</p><p>The mission of the Center for AI Safety is to reduce societal-scale risks from AI. We focus on three pillars of work: research, field-building, and advocacy.</p><p><strong>Research</strong></p><p>CAIS conducts both technical and conceptual research on AI safety. Here are some highlights from our research in 2024:</p><p>Circuit Breakers. We published breakthrough research showing how circuit breakers can prevent AI models from behaving dangerously by interrupting crime-enabling outputs. In a jailbreaking competition with a prize pool of tens of thousands of dollars, it took twenty thousand attempts to jailbreak a model trained with circuit breakers. The paper was accepted to NeurIPS 2024.</p><p>The WMDP Benchmark. We developed the Weapons [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:34) Research</p><p>(04:25) Advocacy</p><p>(06:44) Field-Building</p><p>(10:38) Looking Ahead</p> <p>---</p>
<p><b>First published:</b><br/>
December 19th, 2024 </p>
<p><b>Source:</b><br/>
<a href="https://newsletter.safe.ai/p/aisn-45-center-for-ai-safety-2024?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/aisn-45-center-for-ai-safety-2024</a> </p>
<p>---</p>
<p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p>
<p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>
<p>---</p><div style="max-width: 100%;"><p><strong>Images from the article:</strong></p><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2925c7c6-ee18-4ab9-8405-fca897d63024_1546x1048.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2925c7c6-ee18-4ab9-8405-fca897d63024_1546x1048.png" alt="Image from the article" style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2461925-c1d1-49ae-bc5f-f5e8740a8079_1192x422.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2461925-c1d1-49ae-bc5f-f5e8740a8079_1192x422.png" alt="Image from the article" style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faca531b4-89a6-4cb3-b01f-4a90529147f1_1600x728.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faca531b4-89a6-4cb3-b01f-4a90529147f1_1600x728.png" alt="Image from the article" style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39af8cc5-f5b2-499d-9339-2cec4dba653b_1600x964.png" 
target="_blank"><img src="https://substackcdn.com/image/fetch/w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39af8cc5-f5b2-499d-9339-2cec4dba653b_1600x964.png" alt="Image from the article" style="max-width: 100%;" /></a><p><em>Apple Podcasts and Spotify do not show images in the episode description. Try <a href="https://pocketcasts.com/" target="_blank" rel="noreferrer">Pocket Casts</a>, or another podcast app.</em></p></div>