
TYPE III AUDIO (All episodes)
167 episodes — Page 4 of 4
"Why I think strong general AI is coming soon" by Porby
---
narrator_time: 4h
narrator: pw
qa: km
client: lesswrong
feed_id: ai, ai_safety, ai_safety__forecasting
---

https://www.lesswrong.com/posts/K4urTDkBbtNuLivJx/why-i-think-strong-general-ai-is-coming-soon

I think there is little time left before someone builds AGI (median ~2030). Once upon a time, I didn't think this.

This post attempts to walk through some of the observations and insights that collapsed my estimates. The core ideas are as follows:

- We've already captured way too much of intelligence with way too little effort.
- Everything points towards us capturing way more of intelligence with very little additional effort.
- Trying to create a self-consistent worldview that handles all available evidence seems to force very weird conclusions.

Some notes up front:

- I wrote this post in response to the Future Fund's AI Worldview Prize. Financial incentives work, apparently! I wrote it with a slightly wider audience in mind and supply some background for people who aren't quite as familiar with the standard arguments.
- I make a few predictions in this post. Unless otherwise noted, the predictions and their associated probabilities should be assumed to be conditioned on "the world remains at least remotely normal for the term of the prediction; the gameboard remains unflipped."
- For the purposes of this post, when I use the term AGI, I mean the kind of AI with sufficient capability to make it a genuine threat to humanity's future or survival if it is misused or misaligned. This is slightly more strict than the definition in the Future Fund post, but I expect the difference between the two definitions to be small chronologically.
- For the purposes of this post, when I refer to "intelligence," I mean stuff like complex problem solving that's useful for achieving goals. Consciousness, emotions, and qualia are not required for me to call a system "intelligent" here; I am defining it only in terms of capability.

Share feedback on this narration.
LessWrong: "Consider your appetite for disagreements" by Adam Zerner
---
narrator_time: 45m
narrator: pw
qa: km
client: lesswrong
---

https://www.lesswrong.com/posts/8vesjeKybhRggaEpT/consider-your-appetite-for-disagreements

Poker

There was a time about five years ago when I was trying to get good at poker. If you want to get good at poker, one thing you have to do is review hands. Preferably with other people.

For example, suppose you have ace king offsuit on the button. Someone in the hijack opens to 3 big blinds preflop. You call. Everyone else folds. The flop is dealt. It's a rainbow Q75. You don't have any flush draws. You missed. Your opponent bets. You fold. They take the pot and you move to the next hand.

Once you finish your session, it'd be good to come back and review this hand. Again, preferably with another person. To do this, you would review each decision point in the hand. Here, there were two decision points.

The first was when you faced a 3BB open from the hijack preflop with AKo. In the hand, you decided to call. However, this of course wasn't your only option. You had two others: you could have folded, and you could have raised. Actually, you could have raised to various sizes. You could have raised small to 8BB, medium to 10BB, or big to 12BB. Or hell, you could have just shoved 200BB! But that's not really a realistic option, nor is folding. So in practice your decision was between calling and raising to various realistic sizes.

Share feedback on this narration.
LessWrong: "Introduction to abstract entropy" by Alex Altair
---
narrator_time: 3h30m
narrator: pw
qa: km
client: lesswrong
---

https://www.lesswrong.com/posts/REA49tL5jsh69X3aM/introduction-to-abstract-entropy

This post, and much of the following sequence, was greatly aided by feedback from the following people (among others): Lawrence Chan, Joanna Morningstar, John Wentworth, Samira Nedungadi, Aysja Johnson, Cody Wild, Jeremy Gillen, Ryan Kidd, Justis Mills and Jonathan Mustin. Illustrations by Anne Ore.

Introduction & motivation

In the course of researching optimization, I decided that I had to really understand what entropy is.[1] But there are a lot of other reasons why the concept is worth studying:

Information theory:
- Entropy tells you about the amount of information in something.
- It tells us how to design optimal communication protocols.
- It helps us understand strategies for (and limits on) file compression.

Statistical mechanics:
- Entropy tells us how macroscopic physical systems act in practice.
- It gives us the heat equation.
- We can use it to improve engine efficiency.
- It tells us how hot things glow, which led to the discovery of quantum mechanics.

Epistemics (an important application to me and many others on LessWrong):
- The concept of entropy yields the maximum entropy principle, which is extremely helpful for doing general Bayesian reasoning.
- Entropy tells us how "unlikely" something is and how much we would have to fight against nature to get that outcome (i.e. optimize).
- It can be used to explain the arrow of time.
- It is relevant to the fate of the universe.

And it's also a fun puzzle to figure out!

Share feedback on this narration.
LessWrong: "My resentful story of becoming a medical miracle" by Elizabeth
---
narrator_time: 1h20m
narrator: pw
qa: km
client: lesswrong
---

https://www.lesswrong.com/posts/fFY2HeC9i2Tx8FEnK/my-resentful-story-of-becoming-a-medical-miracle

This is a linkpost for https://acesounderglass.com/2022/10/13/my-resentful-story-of-becoming-a-medical-miracle/

You know those health books with “miracle cure” in the subtitle? The ones that always start with a preface about a particular patient who was completely hopeless until they tried the supplement/meditation technique/healing crystal that the book is based on? These people always start broken and miserable, unable to work or enjoy life, perhaps even suicidal from the sheer hopelessness of getting their body to stop betraying them. They’ve spent decades trying everything and nothing has worked until their friend makes them see the book’s author, who prescribes the same thing they always prescribe, and the patient immediately stands up and starts dancing because their problem is entirely fixed (more conservative books will say it took two sessions). You know how those are completely unbelievable, because anything that worked that well would go mainstream, so basically the book is starting you off with a shit test to make sure you don’t challenge its bullshit later?

Well 5 months ago I became one of those miraculous stories, except worse, because my doctor didn’t even do it on purpose. This finalized some already fermenting changes in how I view medical interventions and research. Namely: sometimes knowledge doesn’t work and then you have to optimize for luck.

I assure you I’m at least as unhappy about this as you are.

Share feedback on this narration.
LessWrong: "Lies, Damn Lies, and Fabricated Options" by Duncan Sabien
---
narrator_time: 1h30m
narrator: pw
qa: km
client: lesswrong
---

https://www.lesswrong.com/posts/gNodQGNoPDjztasbh/lies-damn-lies-and-fabricated-options

This is an essay about one of those "once you see it, you will see it everywhere" phenomena. It is a psychological and interpersonal dynamic roughly as common, and almost as destructive, as motte-and-bailey, and at least in my own personal experience it's been quite valuable to have it reified, so that I can quickly recognize the commonality between what I had previously thought of as completely unrelated situations.

The original quote referenced in the title is "There are three kinds of lies: lies, damned lies, and statistics."

Background 1: Gyroscopes

Gyroscopes are weird.

Except they're not. They're quite normal and mundane and straightforward. The weirdness of gyroscopes is a map-territory confusion—gyroscopes seem weird because my map is poorly made, and predicts that they will do something other than their normal, mundane, straightforward thing.

In large part, this is because I don't have the consequences of physical law engraved deeply enough into my soul that they make intuitive sense.

I can imagine a world that looks exactly like the world around me, in every way, except that in this imagined world, gyroscopes don't have any of their strange black-magic properties. It feels coherent to me. It feels like a world that could possibly exist.

"Everything's the same, except gyroscopes do nothing special." Sure, why not.

But in fact, this world is deeply, deeply incoherent. It is Not Possible with capital letters. And a physicist with sufficiently sharp intuitions would know this—would be able to see the implications of a world where gyroscopes "don't do anything weird," and tell me all of the ways in which reality falls apart.

Share feedback on this narration.
LessWrong: "How might we align transformative AI if it’s developed very soon?" by Holden Karnofsky
---
narrator_time: 4h30m
narrator: pw
qa: km
feed_id: ai, ai_safety, ai_safety__technical, ai_safety__governance
client: lesswrong
---

https://www.lesswrong.com/posts/rCJQAkPTEypGjSJ8X/how-might-we-align-transformative-ai-if-it-s-developed-very

This post is part of my AI strategy nearcasting series: trying to answer key strategic questions about transformative AI, under the assumption that key events will happen very soon, and/or in a world that is otherwise very similar to today's.

This post gives my understanding of what the set of available strategies for aligning transformative AI would be if it were developed very soon, and why they might or might not work. It is heavily based on conversations with Paul Christiano, Ajeya Cotra and Carl Shulman, and its background assumptions correspond to the arguments Ajeya makes in this piece (abbreviated as “Takeover Analysis”).

I premise this piece on a nearcast in which a major AI company (“Magma,” following Ajeya’s terminology) has good reason to think that it can develop transformative AI very soon (within a year), using what Ajeya calls “human feedback on diverse tasks” (HFDT) - and has some time (more than 6 months, but less than 2 years) to set up special measures to reduce the risks of misaligned AI before there’s much chance of someone else deploying transformative AI.

Share feedback on this narration.
LessWrong: "Quintin's alignment papers roundup - week 1" by Quintin Pope
---
narrator_time: 1h30m
narrator: pw
qa: km
client: lesswrong
---

https://www.lesswrong.com/posts/7cHgjJR2H5e4w4rxT/quintin-s-alignment-papers-roundup-week-1

Introduction

I've decided to start a weekly roundup of papers that seem relevant to alignment, focusing on papers or approaches that might be new to safety researchers. Unlike the Alignment Newsletter, I'll be spending relatively little effort on summarizing the papers. I'll just link them, copy their abstracts, and potentially describe some of my thoughts on how the paper relates to alignment. Hopefully, this will let me keep to a weekly schedule.

The purpose of this series isn't so much to share insights directly with the reader, but instead to make them aware of already existing research that may be relevant to the reader's own research.

Share feedback on this narration.
EA Forum Weekly Summaries – Episode 6 (Oct. 24 - 30, 2022)
---
client: ea_forum
project_id: summaries
narrator: cs
---

Original article:
https://forum.effectivealtruism.org/s/W4fhpuN26naxGCBbN/p/YxiXZcddn4kEqGdr9

This is part of a weekly series summarizing the top posts on the EA Forum — you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

Narrated by Coleman Jackson Snell. Summaries written by Zoe Williams (Rethink Priorities).

Published by TYPE III AUDIO on behalf of the Effective Altruism Forum.

Share feedback on this narration.
EA Forum Weekly Summaries – Episode 5 (Oct. 17 - 23, 2022)
---
client: ea_forum
project_id: summaries
narrator: cs
---

Original article:
https://forum.effectivealtruism.org/s/W4fhpuN26naxGCBbN/p/Hi5z6tm9d2keHALgv

This is part of a weekly series summarizing the top posts on the EA Forum — you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

Narrated by Coleman Jackson Snell. Summaries written by Zoe Williams (Rethink Priorities).

Published by TYPE III AUDIO on behalf of the Effective Altruism Forum.

Share feedback on this narration.
LessWrong: "MIRI announces new "Death With Dignity" strategy" by Eliezer Yudkowsky
---
narrator_time: 3h00m
narrator: pw
qa: km
client: lesswrong
feed_id: ai, ai_safety
---

Share feedback on this narration.
LessWrong: "What failure looks like" by Paul Christiano
---
narrator_time: 2h00m
narrator: pw
qa: km
client: lesswrong
feed_id: ai, ai_safety, ai_safety__technical, ai_safety__governance
---

Share feedback on this narration.
LessWrong: "A transparency and interpretability tech tree" by evhub
---
narrator_time: 4h30m
narrator: pw
qa: km
client: lesswrong
feed_id: ai, ai_safety, ai_safety__technical
---

https://www.lesswrong.com/posts/nbq2bWLcYmSGup9aF/a-transparency-and-interpretability-tech-tree

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Thanks to Chris Olah, Neel Nanda, Kate Woolverton, Richard Ngo, Buck Shlegeris, Daniel Kokotajlo, Kyle McDonell, Laria Reynolds, Eliezer Yudkowsky, Mark Xu, and James Lucassen for useful comments, conversations, and feedback that informed this post.

The more I have thought about AI safety over the years, the more I have gotten to the point where the only worlds I can imagine myself actually feeling good about humanity’s chances are ones in which we have powerful transparency and interpretability tools that lend us insight into what our models are doing as we are training them.[1] Fundamentally, that’s because if we don’t have the feedback loop of being able to directly observe how the internal structure of our models changes based on how we train them, we have to essentially get that structure right on the first try—and I’m very skeptical of humanity’s ability to get almost anything right on the first try, if only just because there are bound to be unknown unknowns that are very difficult to predict in advance.

Certainly, there are other things that I think are likely to be necessary for humanity to succeed as well—e.g. convincing leading actors to actually use such transparency techniques, having a clear training goal that we can use our transparency tools to enforce, etc.—but I currently feel that transparency is the least replaceable necessary condition and yet the one least likely to be solved by default.

Nevertheless, I do think that it is a tractable problem to get to the point where transparency and interpretability is reliably able to give us the sort of insight into our models that I think is necessary for humanity to be in a good spot. I think many people who encounter transparency and interpretability, however, have a hard time envisioning what it might look like to actually get from where we are right now to where we need to be. Having such a vision is important both for enabling us to better figure out how to make that vision into reality and also for helping us tell how far along we are at any point—and thus enabling us to identify at what point we’ve reached a level of transparency and interpretability that we can trust it to reliably solve different sorts of alignment problems.

The goal of this post, therefore, is to attempt to lay out such a vision by providing a “tech tree” of transparency and interpretability problems, with each successive problem tackling harder and harder parts of what I see as the core difficulties. This will only be my tech tree, in terms of the relative difficulties, dependencies, and orderings that I expect as we make transparency and interpretability progress—I could, and probably will, be wrong in various ways, and I’d encourage others to try to build their own tech trees to represent their pictures of progress as well.

Share feedback on this narration.
EA Forum Weekly Summaries – Episode 4 (Oct. 10 - 16, 2022)
---
client: ea_forum
project_id: summaries
narrator: cs
---

Original article:
https://forum.effectivealtruism.org/s/W4fhpuN26naxGCBbN/p/pmJRXG3cTgrt779Ep

This is part of a weekly series summarizing the top posts on the EA Forum — you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

Narrated by Coleman Jackson Snell. Summaries written by Zoe Williams (Rethink Priorities).

Published by TYPE III AUDIO on behalf of the Effective Altruism Forum.

Share feedback on this narration.
Preventing an AI-related catastrophe (full)
---
narrator_time: 12h00m
narrator: pw
qa: km
client: 80000_hours
---

By Benjamin Hilton.

Abstract: AI might bring huge benefits—if we avoid the risks.

Source URL:
https://80000hours.org/problem-profiles/artificial-intelligence/

Share feedback on this narration.
Preventing an AI-related catastrophe (fewer footnotes)
---
narrator_time: 15h00m
narrator: pw
feed_id: ai, ai_safety
qa: km
client: 80000_hours
---

By Benjamin Hilton.

Abstract: AI might bring huge benefits—if we avoid the risks.

Source URL:
https://80000hours.org/problem-profiles/artificial-intelligence/

Share feedback on this narration.
EA Forum Weekly Summaries – Episode 2 (Sept. 19 - 25, 2022)
---
client: ea_forum
project_id: summaries
narrator: cs
---

Original article:
https://forum.effectivealtruism.org/posts/tokGikSg3fSJun4Lw/ea-and-lw-forums-weekly-summary-19-25-sep-22

This is part of a weekly series summarizing the top posts on the EA Forum — you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

Narrated by Coleman Jackson Snell. Summaries written by Zoe Williams (Rethink Priorities).

Published by TYPE III AUDIO on behalf of the Effective Altruism Forum.

Share feedback on this narration.
EA Forum Weekly Summaries – Introduction & Episode 1 (Sept. 12 - 18, 2022)
---
client: ea_forum
project_id: summaries
narrator: cs
---

Note from Coleman Snell:

Thanks for listening to the very first episode of EA Forum Summaries Weekly! Please note that this podcast will only contain summaries of EA Forum posts, and not LessWrong posts. This is to keep the episodes short & sweet for a weekly series. Other options would include raising the karma threshold on both.

Original article:
https://forum.effectivealtruism.org/posts/5wzhWsHrZSLwXxc5q/ea-and-lw-forums-weekly-summary-12-18-sep-22

This is part of a weekly series summarizing the top posts on the EA Forum — you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.

Narrated by Coleman Jackson Snell. Summaries written by Zoe Williams (Rethink Priorities).

Published by TYPE III AUDIO on behalf of the Effective Altruism Forum.

Share feedback on this narration.