AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs

AI Safety Newsletter · Center for AI Safety

March 7, 202417m 56s

Audio is streamed directly from the publisher (dl.type3.audio) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Original episode page

Show Notes

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required. Measuring and Reducing Hazardous Knowledge The recent White House Executive Order on Artificial Intelligence highlights risks of LLMs in facilitating the development of bioweapons, chemical weapons, and cyberweapons. To help measure these dangerous capabilities, CAIS has partnered with Scale AI to create WMDP: the Weapons of Mass Destruction Proxy, an open source benchmark with more than 4,000 multiple choice questions that serve as proxies for hazardous knowledge across biology, chemistry, and cyber. This benchmark not only helps the world understand the relative dual-use capabilities of different LLMs, but it also creates a path forward for model builders to remove harmful information from their models through machine unlearning techniques. <picture></picture> Measuring hazardous knowledge in bio, chem, and cyber. Current evaluations of dangerous AI capabilities have [...] ---Outline:(00:03) Measuring and Reducing Hazardous Knowledge(04:35) Language models are getting better at forecasting(07:51) Proposals for Private Regulatory Markets(14:25) Links --- First published: March 7th, 2024 Source: <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-32-measuring?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-32-measuring</a> --- Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research. Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.

← All episodes of AI Safety Newsletter