
AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs
AI Safety Newsletter · Center for AI Safety
March 7, 2024 · 17m 56s
Show Notes
<p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p>
<p><strong>Measuring and Reducing Hazardous Knowledge</strong></p>
<p>The recent White House Executive Order on Artificial Intelligence highlights the risk that LLMs could facilitate the development of bioweapons, chemical weapons, and cyberweapons.</p>
<p>To help measure these dangerous capabilities, CAIS has partnered with Scale AI to create WMDP: the Weapons of Mass Destruction Proxy, an open-source benchmark with more than 4,000 multiple-choice questions that serve as proxies for hazardous knowledge across biology, chemistry, and cyber.</p>
<p>This benchmark not only helps the world understand the relative dual-use capabilities of different LLMs, but also creates a path forward for model builders to remove harmful information from their models through machine unlearning techniques.</p>
<p>Measuring hazardous knowledge in bio, chem, and cyber. Current evaluations of dangerous AI capabilities have [...]</p>
<p>---</p>
<p><strong>Outline:</strong></p>
<p>(00:03) Measuring and Reducing Hazardous Knowledge</p>
<p>(04:35) Language models are getting better at forecasting</p>
<p>(07:51) Proposals for Private Regulatory Markets</p>
<p>(14:25) Links</p>
<p>---</p>
<p><b>First published:</b><br/>
March 7th, 2024 </p>
<p><b>Source:</b><br/>
<a href="https://newsletter.safe.ai/p/ai-safety-newsletter-32-measuring?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-32-measuring</a> </p>
<p>---</p>
<p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p>
<p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>