
AISN #52: An Expert Virology Benchmark
AI Safety Newsletter · Center for AI Safety
April 22, 2025 · 10m 10s
Show Notes
<p>Plus, AI-Enabled Coups.</p> <p>In this edition: AI now outperforms human experts in specialized virology knowledge on a new benchmark; a new report explores the risk of AI-enabled coups.</p><p>Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.</p><p><strong>An Expert Virology Benchmark</strong></p><p>A team of researchers (primarily from SecureBio and CAIS) has developed the Virology Capabilities Test (VCT), a benchmark that measures an AI system's ability to troubleshoot complex virology laboratory protocols. Results on this benchmark suggest that AI has surpassed human experts in practical virology knowledge.</p><p>The VCT measures practical virology knowledge, which has high dual-use potential. While AI virologists could accelerate beneficial research in virology and infectious disease prevention, bad actors could misuse the same capabilities to develop dangerous pathogens. Like the WMDP benchmark, the VCT is designed to evaluate practical dual-use scientific knowledge, in this case virology.</p><p>The benchmark consists of 322 multimodal questions [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:29) An Expert Virology Benchmark</p><p>(04:04) AI-Enabled Coups</p><p>(07:58) Other news</p> <p>---</p>
<p><b>First published:</b><br/>
April 22nd, 2025 </p>
<p><b>Source:</b><br/>
<a href="https://newsletter.safe.ai/p/ai-safety-newsletter-52-an-expert?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-52-an-expert</a> </p>
<p>---</p>
<p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p>
<p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>
<p>---</p><div style="max-width: 100%;"><p><strong>Images from the article:</strong></p><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65c50d63-f694-4e50-b713-d11384af9822_1482x704.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65c50d63-f694-4e50-b713-d11384af9822_1482x704.png" alt="Flowchart showing risk factors and mitigations for AI-enabled coups." style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8a6551e-901f-4a72-9af1-2db6e168ce3b_1508x1156.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8a6551e-901f-4a72-9af1-2db6e168ce3b_1508x1156.png" alt="Flow diagram showing three key risk factors for AI-enabled coups." style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70e50853-3b57-4275-92b3-08c437938175_1600x1223.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70e50853-3b57-4275-92b3-08c437938175_1600x1223.png" alt="Flow chart showing scenarios for &quot;Example scenarios for AI-enabled military coups,&quot; with pre-coup and seizing power stages."
style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed14d08-eef1-46ce-9e37-f697fdf5932e_1600x963.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ed14d08-eef1-46ce-9e37-f697fdf5932e_1600x963.png" alt="Graph showing &quot;AI Progress on VCT&quot; comparing language models' performance over time.
The graph plots various AI models' capabilities against a median expert virologist benchmark (shown at 50th percentile), with dates from July 2023 to April 2025 on the x-axis. Models like GPT-4, Sonnet series, and Gemini show increasing performance scores." style="max-width: 100%;" /></a><hr style="margin-top: 24px; margin-bottom: 24px;" /><a href="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F570a4466-7195-40b5-bdae-0bf3853676fc_1600x1441.png" target="_blank"><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F570a4466-7195-40b5-bdae-0bf3853676fc_1600x1441.png" alt="&quot;The VCT benchmark&quot; diagram showing virology knowledge mapped by practicality and misuse potential.
The graph plots virology topics on two axes: vertical (conceptual to practical) and horizontal (low to high misuse potential), with a blue-outlined target area indicating benchmark focus." style="max-width: 100%;" /></a><p><em>Apple Podcasts and Spotify do not show images in the episode description. Try <a href="https://pocketcasts.com/" target="_blank" rel="noreferrer">Pocket Casts</a>, or another podcast app.</em></p></div>