AI Safety Newsletter

78 episodes

AISN #26: National Institutions for AI Safety

<p>Also, Results From the UK Summit, and New Releases From OpenAI and xAI.</p> <p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p>This week's key stories include: </p><ol><li><p>The UK, US, and Singapore have announced national AI safety institutions. </p></li><li><p>The UK AI Safety Summit concluded with a consensus statement, the creation of an expert panel to study AI risks, and a commitment to meet again in six months. </p></li><li><p>xAI, OpenAI, and a new Chinese startup released new models this week. </p></li></ol><p><strong>UK, US, and Singapore Establish National AI Safety Institutions</strong></p><p>Before regulating a new technology, governments often need time to gather information and consider their policy options. But during that time, the technology may diffuse through society, making it more difficult for governments to intervene. This process, termed the Collingridge Dilemma, is a fundamental challenge in technology policy.</p><p>But recently [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:36) UK, US, and Singapore Establish National AI Safety Institutions</p><p>(03:53) UK Summit Ends with Consensus Statement and Future Commitments</p><p>(05:39) New Models From xAI, OpenAI, and a New Chinese Startup</p><p>(09:28) Links</p> <p>---</p> <p><b>First published:</b><br/> November 15th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/national-institutions-for-ai-safety?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/national-institutions-for-ai-safety</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Nov 15, 2023 · 12 min

AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks.

<p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p><strong>White House Executive Order on AI</strong></p><p>While Congress has not voted on significant AI legislation this year, the White House has left its mark on AI policy. In July, it secured voluntary commitments on safety from leading AI companies. Now, the White House has released a new executive order on AI. It addresses a wide range of issues, and specifically targets catastrophic AI risks such as cyberattacks and biological weapons. </p><p>Companies must disclose large training runs. Under the executive order, companies that intend to train “dual-use foundation models” using significantly more computing power than GPT-4 must take several precautions. First, they must notify the White House before training begins. Then [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:13) White House Executive Order on AI</p><p>(03:56) Kicking Off The UK AI Safety Summit</p><p>(06:18) Progress on Voluntary Evaluations of AI Risks</p><p>(08:52) Links</p> <p>---</p> <p><b>First published:</b><br/> October 31st, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-25?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-25</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Oct 31, 2023 · 11 min

AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI.

<p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p><strong>China's New AI Law, US Export Controls, and Calls for Bilateral Cooperation</strong></p><p>China details how AI providers can fulfill their legal obligations. The Chinese government has passed several laws on AI. They’ve regulated recommendation algorithms and taken steps to mitigate the risk of deepfakes. Most recently, they issued a new law governing generative AI. It's less stringent than the earlier draft version, but the law remains more comprehensive in AI regulation than any laws passed in the US, UK, or European Union. </p><p>The law creates legal obligations for AI providers to respect intellectual property rights, avoid discrimination, and uphold socialist values. But as with many AI policy proposals, these are [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:15) China's New AI Law, US Export Controls, and Calls for Bilateral Cooperation</p><p>(04:58) Proposed International Institutions for AI</p><p>(08:15) Open Source AI: Risks and Opportunities</p><p>(11:25) Links</p> <p>---</p> <p><b>First published:</b><br/> October 18th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-24?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-24</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Oct 18, 2023 · 13 min

AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering.

<p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p><strong>OpenAI releases GPT-4 with Vision and DALL·E-3, announces Red Teaming Network</strong></p><p>GPT-4 with vision and voice. When GPT-4 was initially announced in March, OpenAI demonstrated its ability to process and discuss images such as diagrams or photographs. This feature has now been integrated into GPT-4V. Users can now input images in addition to text, and the model will respond to both. Users can also speak to GPT-4V, and the model will respond verbally.</p><p>GPT-4V may be more vulnerable to misuse via jailbreaks and adversarial attacks. Previous research has shown that multimodal models, which can process multiple forms of input such as both text and images, are more vulnerable to adversarial attacks than text-only models. GPT-4V's System Card includes some experiments [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) OpenAI releases GPT-4 with Vision and DALL·E-3, announces Red Teaming Network</p><p>(02:39) Writers Guild of America Receives Protections Against AI Automation</p><p>(03:42) Anthropic receives $1.25B investment from Amazon, and announces several new policies</p><p>(06:21) Representation Engineering: A Top-Down Approach to AI Transparency</p><p>(07:57) Links</p> <p>---</p> <p><b>First published:</b><br/> October 4th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-23?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-23</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Oct 4, 2023 · 9 min

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy.

<p>Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.</p><p><strong>Google DeepMind’s GPT-4 Competitor</strong></p><p>Computational power is a key driver of AI progress, and a new report suggests that Google’s upcoming GPT-4 competitor will be trained on unprecedented amounts of compute. </p><p>The model, currently named Gemini, may be trained by the end of this year with 5x more computational power than GPT-4. By the end of next year, the report projects that Google will have the ability to train a model with 20x more compute than GPT-4. </p><p>For reference, the compute difference between GPT-3 and GPT-4 was 100x. If these projections are true, Google’s new models could create a meaningful spike relative to current AI capabilities. </p><p>Google’s position [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:14) Google DeepMind’s GPT-4 Competitor</p><p>(02:41) US Military Invests in Thousands of Autonomous Drones</p><p>(04:37) United Kingdom Prepares for Global AI Safety Summit</p><p>(06:15) Case Studies in AI Policy</p><p>(08:55) Links</p> <p>---</p> <p><b>First published:</b><br/> September 5th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-21?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-21</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Sep 5, 2023 · 9 min

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities.

<p><strong>AI Deception: Examples, Risks, Solutions</strong></p><p>AI deception is the topic of a new paper from researchers at and affiliated with the Center for AI Safety. It surveys empirical examples of AI deception, then explores societal risks and potential solutions.</p><p>The paper defines deception as “the systematic production of false beliefs in others as a means to accomplish some outcome other than the truth.” Importantly, this definition doesn't necessarily imply that AIs have beliefs or intentions. Instead, it focuses on patterns of behavior that regularly cause false beliefs and would be considered deceptive if exhibited by humans.</p><p>Deception by Meta’s CICERO AI. Meta developed the AI system CICERO to play Diplomacy, a game where players build and betray alliances in [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) AI Deception: Examples, Risks, Solutions</p><p>(04:35) Proliferation of Large Language Models</p><p>(09:25) Continuing Drivers of AI Capabilities</p><p>(14:30) Links</p> <p>---</p> <p><b>First published:</b><br/> August 29th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-20?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-20</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 29, 2023 · 15 min

[Paper] “An Overview of Catastrophic AI Risks” by Dan Hendrycks, Mantas Mazeika and Thomas Woodside

<p class="c2">Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose catastrophic risks. Although numerous risks have been detailed separately, there is a pressing need for a systematic discussion and illustration of the potential dangers to better inform efforts to mitigate them. This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty [...]</p> <p>---</p> <p><b>First published:</b><br/> June 21st, 2023 </p> <p><b>Source:</b><br/> <a href="https://arxiv.org/abs/2306.12001?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://arxiv.org/abs/2306.12001</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 21, 2023 · 3h 3m

[Paper] “Unsolved Problems in ML Safety” by Dan Hendrycks, Nicholas Carlini, John Schulman and Jacob Steinhardt

<p class="c71 c80">Machine learning (ML) systems are rapidly increasing in size, are acquiring new capabilities, and are increasingly deployed in high-stakes settings. As with other powerful technologies, safety for ML should be a leading research priority. In response to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address. We present four problems ready for research, namely withstanding hazards (“Robustness”), identifying hazards (“Monitoring”), steering ML systems (“Alignment”), and reducing deployment hazards (“Systemic Safety”). Throughout, we clarify each problem’s motivation and provide concrete research directions.</p> <p>---</p> <p><b>First published:</b><br/> June 16th, 2022 </p> <p><b>Source:</b><br/> <a href="https://arxiv.org/abs/2109.13916?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://arxiv.org/abs/2109.13916</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 21, 2023 · 53 min

[Paper] “X-Risk Analysis for AI Research” by Dan Hendrycks and Mantas Mazeika

<p class="c23">Artificial intelligence (AI) has the potential to greatly improve society, but as with any powerful technology, it comes with heightened risks and responsibilities. Current AI research lacks a systematic discussion of how to manage long-tail risks from AI systems, including speculative long-term risks. Keeping in mind the potential benefits of AI, there is some concern that building ever more intelligent and powerful AI systems could eventually result in systems that are more powerful than us; some say this is like playing with fire and speculate that this could create existential risks (x-risks). To add precision and ground these discussions, we provide a guide for how to analyze AI x-risk, which consists of three parts: First, we review how systems can be made safer today, drawing on time-tested concepts from hazard analysis and systems safety that have been designed to steer large processes in safer directions. Next, we discuss strategies [...]</p> <p>---</p> <p><b>First published:</b><br/> October 22nd, 2022 </p> <p><b>Source:</b><br/> <a href="https://arxiv.org/abs/2206.05862?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://arxiv.org/abs/2206.05862</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 21, 2023 · 39 min

AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge.

<p><strong>US-China Competition on AI Chips</strong></p><p>Modern AI systems are trained on advanced computer chips which are designed and fabricated by only a handful of companies in the world. The US and China have been competing for access to these chips for years. Last October, the Biden administration partnered with international allies to severely limit China’s access to leading AI chips.</p><p>Recently, there have been several interesting developments on AI chips. China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry. Meanwhile, the United States has struggled [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:15) US-China Competition on AI Chips</p><p>(04:09) Measuring Language Agents Developments</p><p>(06:07) An Economic Analysis of Language Model Propaganda</p><p>(08:11) White House Competition Applying AI to Cybersecurity</p><p>(09:40) Links</p> <p>---</p> <p><b>First published:</b><br/> August 15th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-19?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-19</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 15, 2023 · 10 min

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety.

<p><strong>Challenges of Reinforcement Learning from Human Feedback</strong></p><p>If you’ve used ChatGPT, you might’ve noticed the “thumbs up” and “thumbs down” buttons next to each of its answers. Pressing these buttons provides data that OpenAI uses to improve their models through a technique called reinforcement learning from human feedback (RLHF).</p><p>RLHF is popular for teaching models about human preferences, but it faces fundamental limitations. Different people have different preferences, but instead of modeling the diversity of human values, RLHF trains models to earn the approval of whoever happens to give feedback. Furthermore, as AI systems become more capable, they can learn to deceive human evaluators into giving undue approval.</p><p>Here we discuss a new [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:13) Challenges of Reinforcement Learning from Human Feedback</p><p>(05:26) Microsoft’s Security Breach</p><p>(06:59) Conceptual Research on AI Safety</p><p>(09:25) Links</p> <p>---</p> <p><b>First published:</b><br/> August 8th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-18?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-18</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 8, 2023 · 11 min

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight.

<p><strong>Automatically Circumventing LLM Guardrails</strong></p><p>Large language models (LLMs) can generate hazardous information, such as step-by-step instructions on how to create a pandemic pathogen. To combat the risk of malicious use, companies typically build safety guardrails intended to prevent LLMs from misbehaving. </p><p>But these safety controls are almost useless against a new attack developed by researchers at Carnegie Mellon University and the Center for AI Safety. By studying the vulnerabilities in open source models such as Meta’s LLaMA 2, the researchers can automatically generate a nearly unlimited supply of “adversarial suffixes,” which are words and characters that cause any model’s safety controls to fail. </p><p>This discovery calls into question the fundamental limits of safety [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:12) Automatically Circumventing LLM Guardrails</p><p>(05:40) AI Labs Announce the Frontier Model Forum</p><p>(07:54) Senate Hearing on AI Oversight</p><p>(14:42) Links</p> <p>---</p> <p><b>First published:</b><br/> August 1st, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-17?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-17</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Aug 1, 2023 · 15 min

AISN #16: White House Secures Voluntary Commitments from Leading AI Labs, and Lessons from Oppenheimer.

<p><strong>White House Unveils Voluntary Commitments to AI Safety from Leading AI Labs</strong></p><p>Last Friday, the White House announced a series of voluntary commitments from seven of the world's premier AI labs. Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI pledged to uphold these commitments, which are non-binding and pertain only to forthcoming "frontier models" superior to currently available AI systems. The White House also notes that the Biden-Harris Administration is developing an executive order alongside these voluntary commitments.</p><p>The commitments are timely and technically well-informed, demonstrating the ability of federal policymakers to respond capably and quickly to AI risks. The Center for AI Safety supports these commitments as a precedent for cooperation on AI [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) White House Unveils Voluntary Commitments to AI Safety from Leading AI Labs</p><p>(05:05) Lessons from Oppenheimer</p><p>(10:38) Links</p> <p>---</p> <p><b>First published:</b><br/> July 25th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-16?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-16</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jul 25, 2023 · 12 min

AISN #15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2.

<p><strong>Both China and the US take action to regulate AI</strong></p><p>Last week, regulators in both China and the US took aim at generative AI services. These actions show that China and the US are both concerned with AI safety. Hopefully, this is a sign they can eventually coordinate.</p><p><strong>China’s new generative AI rules</strong></p><p>On Thursday, China’s government released new rules governing generative AI. China’s new rules, which are set to take effect on August 15th, regulate publicly-available generative AI services. The providers of such services will be criminally liable for the content their services generate. </p><p>The rules specify illegal [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:17) Both China and the US take action to regulate AI</p><p>(00:36) China’s new generative AI rules</p><p>(03:15) The FTC investigates OpenAI</p><p>(05:01) Results from a tournament forecasting AI risk</p><p>(08:18) Updates on xAI’s plan</p><p>(09:05) Meta releases Llama 2, open-source and commercially available</p> <p>---</p> <p><b>First published:</b><br/> July 19th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-15?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-15</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jul 19, 2023 · 12 min

AISN #14: OpenAI’s ‘Superalignment’ team, Musk’s xAI launches, and developments in military AI use.

<p><strong>OpenAI announces a ‘superalignment’ team</strong></p><p>On July 5th, OpenAI announced the ‘Superalignment’ team: a new research team given the goal of aligning superintelligence, and armed with 20% of OpenAI’s compute. In this story, we’ll explain and discuss the team’s strategy.</p><p>What is superintelligence? In their announcement, OpenAI distinguishes between ‘artificial general intelligence’ and ‘superintelligence.’ Briefly, ‘artificial general intelligence’ (AGI) is about breadth of performance. Generally intelligent systems perform well on a wide range of cognitive tasks. For example, humans are in many senses generally intelligent: we can learn how to drive a car, take a derivative, or play piano, even though evolution didn’t train us for those tasks. A superintelligent system would not only be [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) OpenAI announces a ‘superalignment’ team</p><p>(03:50) Musk launches xAI</p><p>(05:12) Developments in Military AI Use</p> <p>---</p> <p><b>First published:</b><br/> July 12th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-14?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-14</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jul 12, 2023 · 9 min

AISN #13: An interdisciplinary perspective on AI proxy failures, new competitors to ChatGPT, and prompting language models to misbehave.

<p><strong>Interdisciplinary Perspective on AI Proxy Failures</strong></p><p>In this story, we discuss a recent paper on why proxy goals fail. First, we introduce proxy gaming, and then summarize the paper’s findings. </p><p>Proxy gaming is a well-documented failure mode in AI safety. For example, social media platforms use AI systems to recommend content to users. These systems are sometimes built to maximize the amount of time a user spends on the platform. The idea is that the time the user spends on the platform approximates the quality of the content being recommended. However, a user might spend even more time on a platform because they’re responding to an enraging post or interacting [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:13) Interdisciplinary Perspective on AI Proxy Failures</p><p>(06:06) A Flurry of AI Fundraising and Model Releases</p><p>(12:53) Adversarial Inputs Make Chatbots Misbehave</p><p>(15:52) Links</p> <p>---</p> <p><b>First published:</b><br/> July 5th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-13?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-13</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jul 5, 2023 · 17 min

AISN #12: Policy Proposals from NTIA’s Request for Comment, and Reconsidering Instrumental Convergence.

<p><strong>Policy Proposals from NTIA’s Request for Comment</strong></p><p>The National Telecommunications and Information Administration publicly requested comments on the matter from academics, think tanks, industry leaders, and concerned citizens. They asked 34 questions and received more than 1,400 responses on how to govern AI for the public benefit. This week, we cover some of the most promising proposals found in the NTIA submissions. </p><picture></picture><p><strong>Technical Proposals for Evaluating AI Safety</strong></p><p>Several NTIA submissions focused on the technical question of how to evaluate the safety of an AI system. We review two areas of active research: red-teaming and transparency. </p><p><strong>Red Teaming: Acting like an Adversary</strong></p><p>Several submissions proposed government support for evaluating AIs via red teaming. In this evaluation method, a [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) Policy Proposals from NTIA’s Request for Comment</p><p>(00:48) Technical Proposals for Evaluating AI Safety</p><p>(01:04) Red Teaming: Acting like an Adversary</p><p>(02:24) Transparency: Understanding AIs From the Inside</p><p>(03:51) Governance Proposals for Improving Safety Processes</p><p>(04:25) Requiring a License for Frontier AI Systems</p><p>(06:29) Unifying Sector-Specific Expertise and General AI Oversight</p><p>(07:51) Does Antitrust Prevent Cooperation Between AI Labs?</p><p>(08:40) Reconsidering Instrumental Convergence</p><p>(10:39) Links</p> <p>---</p> <p><b>First published:</b><br/> June 27th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-12?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-12</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jun 27, 2023 · 13 min

AISN #11: An Overview of Catastrophic AI Risks.

<p><strong>An Overview of Catastrophic AI Risks</strong></p><p>Global leaders are concerned that artificial intelligence could pose catastrophic risks. 42% of CEOs polled at the Yale CEO Summit agree that AI could destroy humanity in five to ten years. The Secretary General of the United Nations said we “must take these warnings seriously.” Amid all these frightening polls and public statements, there’s a simple question that’s worth asking: why exactly is AI such a risk?</p><p>The Center for AI Safety has released a new paper to provide a clear and comprehensive answer to this question. We detail the precise risks posed by AI, the structural dynamics making these problems so difficult to solve, and the technical, social, and political responses required to overcome this [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:08) An Overview of Catastrophic AI Risks</p><p>(00:56) Malicious actors can use AIs to cause harm.</p><p>(02:18) Racing towards an AI disaster.</p><p>(04:05) Safety should be a goal, not a constraint.</p><p>(05:46) The challenge of AI control.</p><p>(07:53) Positive visions for the future of AI.</p><p>(09:02) Links</p> <p>---</p> <p><b>First published:</b><br/> June 22nd, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-11?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-11</a> </p> <p>---</p> <p>Want more? 
Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jun 22, 2023 · 11 min

AISN #10: How AI could enable bioterrorism, and policymakers continue to focus on AI.

<p><strong>How AI could enable bioterrorism</strong></p><p>Only a hundred years ago, no person could have single-handedly destroyed humanity. Nuclear weapons changed this situation, giving the power of global annihilation to a small handful of nations with powerful militaries. Now, thanks to advances in biotechnology and AI, a much larger group of people could have the power to create a global catastrophe. </p><p>This is the upshot of a new paper from MIT titled “Can large language models democratize access to dual-use biotechnology?” The authors demonstrate that today’s language models are capable of providing detailed instructions for non-expert users about how to create pathogens that could cause a global pandemic.</p><p>Language models can help users build dangerous [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:10) How AI could enable bioterrorism</p><p>(03:48) Policymakers continue to focus on AI</p><p>(05:27) Links</p> <p>---</p> <p><b>First published:</b><br/> June 13th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-10?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-10</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jun 13, 2023 · 6 min

AISN #9: Statement on Extinction Risks, Competitive Pressures, and When Will AI Reach Human-Level?

<p><strong>Top Scientists Warn of Extinction Risks from AI</strong></p><p>Last week, hundreds of AI scientists and notable public figures signed a public statement on AI risks written by the Center for AI Safety. The statement reads:</p><p>“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”</p><p>The statement was signed by a broad, diverse coalition. The statement represents a historic coalition of AI experts — along with philosophers, ethicists, legal scholars, economists, physicists, political scientists, pandemic scientists, nuclear scientists, and climate scientists — establishing the risk of extinction from advanced, future AI systems as one of the world’s most important problems. </p><p>The international community is [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:10) Top Scientists Warn of Extinction Risks from AI</p><p>(03:35) Competitive Pressures in AI Development</p><p>(07:22) When Will AI Reach Human Level?</p><p>(12:47) Links</p> <p>---</p> <p><b>First published:</b><br/> June 6th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-9?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-9</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Jun 6, 2023 · 14 min

AISN #8: Why AI could go rogue, how to screen for AI risks, and grants for research on democratic governance of AI.

<p><strong>Yoshua Bengio makes the case for rogue AI</strong></p><p>AI systems pose a variety of different risks. Renowned AI scientist Yoshua Bengio recently argued for one particularly concerning possibility: that advanced AI agents could pursue goals in conflict with human values. </p><p>Human intelligence has accomplished impressive feats, from flying to the moon to building nuclear weapons. But Bengio argues that across a range of important intellectual, economic, and social activities, human intelligence could be matched and even surpassed by AI. </p><p>How would advanced AIs change our world? Many technologies are tools, such as toasters and calculators, which humans use to accomplish our goals. AIs are different, Bengio says. [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) Yoshua Bengio makes the case for rogue AI</p><p>(05:11) How to screen AIs for extreme risks</p><p>(09:12) Funding for Work on Democratic Inputs to AI</p><p>(10:43) Links</p> <p>---</p> <p><b>First published:</b><br/> May 30th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-8?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-8</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

May 30, 2023 · 12 min

AISN #7: Disinformation, recommendations for AI labs, and Senate hearings on AI.

<p><strong>How AI enables disinformation</strong></p><p>Yesterday, a fake photo generated by an AI tool showed an explosion at the Pentagon. The photo was falsely attributed to Bloomberg News and circulated quickly online. Within minutes, the stock market declined sharply, only to recover after it was discovered that the picture was a hoax. </p><p>This story is part of a broader trend. AIs can now generate text, audio, and images that are unnervingly similar to their naturally occurring counterparts. How will this affect our world, and what kinds of solutions are available?</p><picture></picture>The fake image generated by an AI showed an explosion at the Pentagon.<p>AIs can generate personalized scams. When John Podesta was the chair of Hillary Clinton’s 2016 presidential campaign [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:10) How AI enables disinformation</p><p>(05:38) Governance recommendations on AI safety</p><p>(08:21) Senate hearings on AI regulation</p><p>(11:10) Links</p> <p>---</p> <p><b>First published:</b><br/> May 23rd, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-7?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-7</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

May 23, 2023 · 12 min

AISN #6: Examples of AI safety progress, Yoshua Bengio proposes a ban on AI agents, and lessons from nuclear arms control.

<p><strong>Examples of AI safety progress</strong></p><p>Training AIs to behave safely and beneficially is difficult. They might learn to game their reward function, deceive human oversight, or seek power. Some argue that researchers have not made much progress in addressing these problems, but here we offer a few examples of progress on AI safety. </p><p>Detecting lies in AI outputs. Language models often output false text, but a recent paper suggests they understand the truth in ways not reflected in their output. By analyzing a model’s internals, we can calculate the likelihood that a model believes a statement is true. The finding has been replicated in models that answer [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:13) Examples of AI safety progress</p><p>(03:56) Yoshua Bengio proposes a ban on AI agents</p><p>(07:19) Lessons from Nuclear Arms Control for Verifying AI Treaties</p><p>(10:02) Links</p> <p>---</p> <p><b>First published:</b><br/> May 16th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-6?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-6</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

May 16, 2023 · 11 min

AISN #5: Geoffrey Hinton speaks out on AI risk, the White House meets with AI labs, and Trojan attacks on language models.

<p><strong>Geoffrey Hinton is concerned about existential risks from AI</strong></p><p>Geoffrey Hinton won the Turing Award for his work on AI. Now he says that part of him regrets his life’s work, as he believes that AI poses an existential threat to humanity. As Hinton puts it, “it’s quite conceivable that humanity is just a passing phase in the evolution of intelligence.”</p><picture></picture><p>AI is developing more rapidly than Hinton expected. In 2015, Andrew Ng argued that worrying about AI risk is like worrying about overpopulation on Mars. Geoffrey Hinton also used to believe that advanced AI was decades away, but recent progress has changed his views. Now he says [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:12) Geoffrey Hinton is concerned about existential risks from AI</p><p>(02:32) White House meets with AI labs</p><p>(04:22) Trojan Attacks on Language Models</p><p>(06:51) Assorted Links</p> <p>---</p> <p><b>First published:</b><br/> May 9th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-5?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-5</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

May 9, 2023 · 8 min

AISN #4: AI and cybersecurity, persuasive AIs, weaponization, and Hinton talks AI risks.

<p><strong>Cybersecurity Challenges in AI Safety</strong></p><p>Meta accidentally leaks a language model to the public. Meta’s newest language model, LLaMa, was publicly leaked online against the intentions of its developers. Gradual rollout is a popular goal with new AI models, opening access to academic researchers and government officials before sharing models with anonymous internet users. Meta intended to use this strategy, but within a week of sharing the model with an approved list of researchers, an unknown person who had been given access to the model publicly posted it online. </p><p>How can AI developers selectively share their models? One inspiration could be the film industry, which places watermarks and tracking technology on “screener” copies of movies sent [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:11) Cybersecurity Challenges in AI Safety</p><p>(02:48) Artificial Influence: An Analysis Of AI-Driven Persuasion</p><p>(05:37) Building Weapons with AI</p><p>(07:47) Assorted Links</p> <p>---</p> <p><b>First published:</b><br/> May 2nd, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-4?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-4</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

May 2, 2023 · 9 min

AISN #3: AI policy proposals and a new challenger approaches.

<p><strong>Policy Proposals for AI Safety</strong></p><p>Critical industries rely on the government to protect consumer safety. The FAA approves new airplane designs, the FDA tests new drugs, and the SEC and CFPB regulate risky financial instruments. Currently, there is no analogous set of regulations for AI safety. </p><p>This could soon change. President Biden and members of Congress have recently been vocal about the risks of artificial intelligence and the need for policy solutions.</p><picture></picture><p>From guiding principles to enforceable laws. Previous work on AI policy such as the White House Blueprint for an AI Bill of Rights and the NIST AI Risk Management Framework has articulated guiding principles like interpretability, robustness, and privacy. But these recommendations are not enforceable – AI [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:09) Policy Proposals for AI Safety</p><p>(04:19) Competitive Pressures in AI Development</p> <p>---</p> <p><b>First published:</b><br/> April 25th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-3?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-3</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Apr 25, 2023 · 7 min

AISN #2: ChaosGPT and the rise of language model agents, evolutionary pressures and AI, AI safety in the media.

<p><strong>ChaosGPT and the Rise of Language Agents</strong></p><p>Chatbots like ChatGPT usually only respond to one prompt at a time, and a human user must provide a new prompt to get a new response. But an extremely popular new framework called AutoGPT automates that process. With AutoGPT, the user provides only a high-level goal, and the language model will create and execute a step-by-step plan to accomplish the goal.</p><p>AutoGPT and other language agents are still in their infancy. They struggle with long-term planning and repeat their own mistakes. Yet because they limit human oversight of AI actions, these agents are a step towards dangerous deployment of autonomous AI. </p><p>Individual bad actors [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:12) ChaosGPT and the Rise of Language Agents</p><p>(02:49) Natural Selection Favors AIs over Humans</p><p>(05:17) AI Safety in the Media</p> <p>---</p> <p><b>First published:</b><br/> April 18th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-2?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-2</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Apr 18, 2023 · 7 min

AISN #1: Public opinion on AI, plugging ChatGPT into the internet, and the economic impacts of language models.

<p><strong>Growing concerns about rapid AI progress</strong></p><p>Recent advancements in AI have thrust it into the center of attention. What do people think about the risks of AI?</p><p>The American public is worried. 46% of Americans are concerned that AI will cause “the end of the human race on Earth,” according to a recent poll by YouGov. Young people are more likely to express such concerns, while there are no significant differences in responses between people of different genders or political parties. Another poll by Monmouth University found broad support for AI regulation, with 55% supporting the creation of a federal agency that governs AI similar to how the FDA approves drugs and [...]</p> <p>---</p><p><strong>Outline:</strong></p><p>(00:12) Growing concerns about rapid AI progress</p><p>(02:53) Plugging ChatGPT into email, spreadsheets, the internet, and more</p><p>(05:35) Which jobs could be affected by language models?</p> <p>---</p> <p><b>First published:</b><br/> April 10th, 2023 </p> <p><b>Source:</b><br/> <a href="https://newsletter.safe.ai/p/ai-safety-newsletter-1?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Source+URL+in+episode+description&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">https://newsletter.safe.ai/p/ai-safety-newsletter-1</a> </p> <p>---</p> <p>Want more? Check out our <a href="https://newsletter.mlsafety.org/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Episode+description+footer" target="_blank" rel="noreferrer">ML Safety Newsletter</a> for technical safety research.</p> <p>Narrated by <a href="https://type3.audio/?utm_source=TYPE_III_AUDIO&utm_medium=Podcast&utm_content=Narrated+by+TYPE+III+AUDIO&utm_term=center_for_ai_safety&utm_campaign=ai_narration" rel="noopener noreferrer" target="_blank">TYPE III AUDIO</a>.</p>

Apr 10, 2023 · 8 min