
We Found AI's Preferences — What David Shapiro MISSED in this bombshell Center for AI Safety paper
Show Notes
The Center for AI Safety just dropped a fascinating paper — they discovered that today’s AIs like GPT-4 and Claude have preferences! As in, coherent utility functions. We knew this was inevitable, but we didn’t know it was already happening.
This episode has two parts:
In Part I (48 minutes), I react to David Shapiro’s coverage of the paper and push back on many of his points.
In Part II (60 minutes), I explain the paper myself.
00:00 Episode Introduction
05:25 PART I: REACTING TO DAVID SHAPIRO
10:06 Critique of David Shapiro's Analysis
19:19 Reproducing the Experiment
35:50 David's Definition of Coherence
37:14 Does AI have “Temporal Urgency”?
40:32 Universal Values and AI Alignment
49:13 PART II: EXPLAINING THE PAPER
51:37 How The Experiment Works
01:11:33 Instrumental Values and Coherence in AI
01:13:04 Exchange Rates and AI Biases
01:17:10 Temporal Discounting in AI Models
01:19:55 Power Seeking, Fitness Maximization, and Corrigibility
01:20:20 Utility Control and Bias Mitigation
01:21:17 Implicit Association Test
01:28:01 Emailing with the Paper’s Authors
01:43:23 My Takeaway
David’s source video: https://www.youtube.com/watch?v=XGu6ejtRz-0
The research paper: http://emergent-values.ai
Watch the Lethal Intelligence Guide, the ultimate introduction to AI x-risk! https://www.youtube.com/@lethal-intelligence
PauseAI, the volunteer organization I’m part of: https://pauseai.info
Join the PauseAI Discord — https://discord.gg/2XXWXvErfA — and say hi to me in the #doom-debates-podcast channel!
Doom Debates’ mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.
Support the mission by subscribing to my Substack at https://doomdebates.com and to https://youtube.com/@DoomDebates