Interconnects

156 episodes

Interviewing Arvind Narayanan on making sense of AI hype

Arvind Narayanan is a leading voice disambiguating what AI does and does not do. His work, with Sayash Kapoor at AI Snake Oil, is one of the few beacons of reason in an AI media ecosystem with quite a few bad apples. Arvind is a professor of computer science at Princeton University and the director of the Center for Information Technology Policy. You can learn more about Arvind and his work on his website, X, or Google Scholar.

This episode is all in on figuring out what current LLMs do and don’t do. We cover AGI, agents, scaling laws, autonomous scientists, and past failings of AI (i.e. those that came before generative AI took off). We also briefly touch on how all of this informs AI policy, and how academics can decide what to work on to generate better outcomes for technology.

Transcript and full show notes: https://www.interconnects.ai/p/interviewing-arvind-narayanan

Chapters
* [00:00:00] Introduction
* [00:01:54] Balancing being an AI critic while recognizing AI's potential
* [00:04:57] Challenges in AI policy discussions
* [00:08:47] Open source foundation models and their risks
* [00:15:35] Personal use cases for generative AI
* [00:22:19] CORE-Bench and evaluating AI scientists
* [00:25:35] Agents and artificial general intelligence (AGI)
* [00:33:12] Scaling laws and AI progress
* [00:37:41] Applications of AI outside of tech
* [00:39:10] Career lessons in technology and AI research
* [00:41:33] Privacy concerns and AI
* [00:47:06] Legal threats and responsible research communication
* [00:50:01] Balancing scientific research and public distribution

Get Interconnects (https://www.interconnects.ai/podcast)...
... on YouTube: https://www.youtube.com/@interconnects
... on Twitter: https://x.com/interconnectsai
... on Linkedin: https://www.linkedin.com/company/interconnects-ai
... on Spotify: https://open.spotify.com/show/2UE6s7wZC4kiXYOnWRuxGv

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Oct 17, 2024 · 54 min

(Voiceover) Building on evaluation quicksand

Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

Chapters
00:00 Building on evaluation quicksand
01:26 The causes of closed evaluation silos
06:35 The challenge facing open evaluation tools
10:47 Frontiers in evaluation
11:32 New types of synthetic data contamination
13:57 Building harder evaluations

Figures
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp

Oct 16, 2024 · 16 min

Interviewing Andrew Trask on how language models should store (and access) information

Andrew Trask is one of the bright spots in engaging with AI policy for me in the last year. He is a passionate idealist, trying to create a future for AI that enables privacy, academic research, and government involvement in a rapidly transforming ecosystem. Trask is a leader of the OpenMined organization facilitating researcher access to non-public data and AIs, a senior research scientist at Google DeepMind, a PhD student at the University of Oxford, and an author and educator on deep learning.

You can find more about Trask on Twitter or Google Scholar. You may want to watch his recent talk at Cohere on the future of AI (and why data breakthroughs dominate) or his lecture at MIT on privacy preserving ML, or read his book on deep learning that has a substantial GitHub component.

The organization he helps run, OpenMined, has a few principles that say a lot about his ambitions and approaches to modern AI:

“We believe we can inspire all data owners to open their data for research by building open-source privacy software that empowers them to receive more benefits (co-authorships, citations, grants, etc.) while mitigating risks related to privacy, security, and IP.”

We cover privacy of LLMs, retrieval LLMs, secure enclaves, o1, Apple's new models, and many more topics.

More on Andrew: https://x.com/iamtrask
Transcript and more information: https://www.interconnects.ai/p/interviewing-andrew-trask

Interconnects (https://www.interconnects.ai/)...
... on YouTube: https://www.youtube.com/@interconnects
... on Twitter: https://x.com/interconnectsai
... on Linkedin: https://www.linkedin.com/company/interconnects-ai
... on Spotify: https://open.spotify.com/show/2UE6s7wZC4kiXYOnWRuxGv

We mention
* Claude 3.5 launch and “pre release testing with UK AISI” (and the US AI Safety Institute)
* OpenMined and PySyft
* CSET (Center for Security and Emerging Technology)
* NAIRR
* The “open data wall”
* Apple’s Secure Enclaves, Nvidia Secure Enclave
* Data-store language models literature
* RETRO: Retrieval-Enhanced Transformer from DeepMind (2021)
* SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore (2023)
* Scaling Retrieval-Based Language Models with a Trillion-Token Datastore (2024)

Chapters
[00:00:00] Introduction
[00:03:12] Secure enclaves and pre-release testing with Anthropic and UK Safety Institute
[00:16:31] Discussion on public AI and government involvement
[00:20:55] Data store language models and better approaches to “open training data”
[00:42:18] History and development of OpenMined
[00:48:57] Use of language models on air-gapped networks
[00:52:10] Near future of secure enclave technology and industry adoption
[00:58:01] Conclusions and future trajectory of AI development

Oct 10, 2024 · 1h 0m

How scaling changes model behavior

Some trends are reasonable to extrapolate, some are not. Even for the trends we are succeeding at extrapolating, it is not clear how that signal translates into different AI behaviors.

Read it here: https://www.interconnects.ai/p/how-scaling-changes-model-behavior

[00:00] How scaling changes model behavior
[05:03] Metaphors for what scaling may solve
[08:45] Short-term scaling is already de-risked

Fig. 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp
Fig. 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/scaling-laws.webp
Fig. 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/situational-awareness.webp

Oct 9, 2024 · 11 min

[Article Voiceover] AI Safety's Crux: Culture vs. Capitalism

SB1047's veto, OpenAI's turnover, and a constant treadmill pushing AI startups to be all too similar to big technology name brands.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/ai-safety-culture-vs-capitalism

00:00 AI Safety's Crux: Culture v Capitalism
06:03 SB1047 as a regulatory litmus test for AI safety
08:36 Capitalism at the helm

Oct 2, 2024 · 10 min

Interviewing Riley Goodside on the science of prompting

Riley Goodside is a staff prompting engineer at Scale AI. Previously working in data science, he is often seen as the default for the new role of a “prompt engineer.” He regularly posts incisive prompts that elicit notable behavior from the most popular AI models.

I really resonated with this saying from Anthropic’s recent podcast on prompt engineering: “now we write essays and treat them as code.” In order to be good at prompting, you need to understand that natural language operates as our code used to.

This episode is a masterclass on why you should care about prompting and how it impacts results. Of course, there’s a bunch of great discussion on recent models that reflect the need for different and/or better prompting. Enjoy it!

Listen on Apple Podcasts, Spotify, and wherever you get your podcasts. For other Interconnects interviews, go here.

We mention:
* Prompting to push the frontier of AI models,
* Post-training and prompting interaction,
* Prompting base models,
* o1, Reflection 70B, reasoning,
* Scale’s leaderboard, evaluation tricks, evaluation needs,
* PlanSearch paper
* Julius AI
* “The hottest programming language is English”
* “Think silently” instructions
* Scale Leaderboard and Humanity’s Last Exam
* ChatML formatting

Chapters
* [00:00:09] Introduction
* [00:02:40] Riley's path to LLMs
* [00:07:54] Impact of ChatGPT on prompt engineering
* [00:12:03] OpenAI's o1
* [00:18:21] Autoregressive inference and prompting sensitivities
* [00:24:48] Reflection 70B model and its implications
* [00:28:00] Impact of prompting on evaluation
* [00:32:43] Prompting vs. Google search
* [00:46:55] Prompting and RLHF/post-training
* [00:56:57] Prompting of AI agents
* [01:01:20] Importance of hands-on experience with language models
* [01:05:00] Importance and challenges of AI model evaluation

Transcript
Built with smol-podcaster.

Nathan L. [00:01:08]: Hey, Riley, welcome to the show.

Riley G.: Hey, Nathan, great to be here.

Nathan L. [00:01:14]: Yeah, so for the audience here, I mostly wanted to try to, as I work on post-training a lot and I see my own difficulty in taking prompting seriously and the things that I don't think that we are doing enough, and I don't see any reason why it can't be scientific in how we do prompting. So that's my biggest goal with this. I think there's a lot of podcasts where we could kind of say, like, what is the history of prompting? Where is it going? And that's easy to kind of redo. And I still find it interesting, but I just don't think there's enough people talking about the role of prompting in evaluation, how prompting changes with how you're post-training models, because we're trying to take that seriously and how we have a post-training setup, but we just like regularly run into these things like system prompts aren't handled well, how to release a model with a system prompt. So that's the tone that I'm trying to get to when I ask these questions. And also OpenAI's o1 model just came out, so I'm definitely going to get onto that pretty quickly because that's what everyone's excited about. I like to start with background just to kind of get to know people, because a lot of this is just, I want to talk to interesting people in AI, is like, how did you become interested in prompting? I think I've seen your background in data science and then you joined Scale around when ChatGPT came out, which is fun timing, but like, how did you become maybe obsessed with this, but like the focal point of your work?

Riley G. [00:02:40]: Yeah, I have sort of an unusual introduction to large language models. For most of my career, I've been a data scientist, mostly in the online dating industry. I was at OkCupid and Grindr. And after I left Grindr, I took sort of a sabbatical to educate myself, I guess, about the progress in large language models. It was around the time that GPT-3 Codex had just come out. And that was where I think I started to become really interested because I was following along with maybe, certainly when GPT-2 came out, the examples there wowed me as much as they wowed the rest of the world, I think, with the example of the news article about the unicorn and all that. And not long after that, we had AI Dungeon, and I played around with AI Dungeon a bit. But at that point, language models seemed to be mostly about language, that they were sort of very heavily focused on stylistic mimicry and creative writing and so on. And when Codex came out, it really started this thought of that text is a more universal interface than we were giving it credit for, that language models might be more broadly useful. And I just became very excited in a practical sense of what these models could do for what I kind of intuited was very boilerplate-like data science code, that I thought of like most of the Python and Julia and R and things that I've written over my career, this seemed like stuff that an LLM could handle. And that was sort of one of its early strong points. So I was playing around with, I think one of my fir

Sep 30, 2024 · 1h 8m

[Article Voiceover] Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem

Sorry this one was late! Thanks for bearing with me, and keep sending feedback my way. Still a year or two away from when I have time to record these, but I would love to.

Open-source tools, examples, limits, and the state of training multimodal models.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/molmo-and-llama-3-vision

00:00 Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem
02:47 Llama vision: Multimodality for the masses of developers
03:27 Molmo: a (mostly) open-source equivalent to Llama vision
08:45 How adding vision changes capabilities and reasoning
11:47 Multimodal language models: Earlier on the exponential

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_021.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_023.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_027.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_030.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_037.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_046.png
Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_048.png
Fig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_050.png
Fig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_052.png
Fig 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_054.png
Fig 13: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_058.png
Fig 14: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_065.png

Sep 27, 2024 · 14 min

[Article Voiceover] Reverse engineering OpenAI's o1

What productionizing test-time compute shows us about the future of AI. Exploration has landed in language model training.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/reverse-engineering-openai-o1

00:00 Reverse engineering OpenAI's o1
01:52 From Q-star to Strawberry to o1
05:13 Training o1 with reinforcement learning
09:24 What is o1 doing when given a prompt?
11:49 Questions to consider to understand o1's structure
11:56 1. How does an RL-trained language model act?
12:38 2. Is it an online / test-time search?
14:20 3. Is it one model at inference?
15:29 Open-source o1, the future of o1, and the future of AI

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_014.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_016.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_018.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_020.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_024.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_026.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_034.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_048.png

Sep 17, 2024 · 18 min

Futures of the data foundry business model

Scale AI's future versus further scaling of language model performance. How Nvidia may take all the margins from the data market, too.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/ai-data-foundry

00:00 Futures of the data foundry business model
02:57 What it is like to work with data vendors
06:06 Data foundries: Risks
08:18 Data foundries: Growth vectors
09:50 Realistic expectations

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/data-foundry/img_008.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/data-foundry/img_012.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/data-foundry/img_023.png

Sep 11, 2024 · 11 min

A post-training approach to AI regulation with Model Specs

And why the concept of mandating "model specs" could be a good start.

(Oops, forgot to upload this yesterday!)

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/a-post-training-approach-to-ai-regulation

0:00 A post-training approach to AI regulation with Model Specs
1:45 Expanded roles of Model Specifications
3:40 Near future of Model Specifications

Sep 10, 2024 · 5 min

OpenAI's Strawberry, LM self-talk, inference scaling laws, and spending more on inference

Whether or not scaling works, we should spend more on inference.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/openai-strawberry-and-inference-scaling-laws

00:00 OpenAI's Strawberry, LM self-talk, inference scaling laws, and spending more on inference
01:51 OpenAI's Strawberry
04:16 Self-talk in language models
07:45 Inference scaling laws

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/strawberry/img_006.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/strawberry/img_021.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/strawberry/img_023.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/strawberry/img_037.png

Sep 5, 2024 · 10 min

OLMoE and the hidden simplicity in training better foundation models

Ai2 released OLMoE, which is probably our "best" model yet relative to its peers, but not much has changed in the process.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/olmoe-and-building-better-llms

00:00 OLMoE and the hidden simplicity in training better foundation models
02:04 Frontier model team compute allocations
04:19 De-risking training complexity
06:40 On organizational complexity
09:05 Compounding improvements -- the key to building better language models

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_005.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_007.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_009.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_011.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_028.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_030.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmoe/img_032.png

Sep 4, 2024 · 10 min

On the current definitions of open-source AI and the state of the data commons

The Open Source Initiative is working towards a definition.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/defining-open-source-ai

0:00 On the current definitions of open-source AI and the state of the data commons
3:17 Reasons to not mandate fully released data
4:24 Sufficient but not exhaustive data docs
5:22 Frustration with the data commons
7:04 We need more examples to define the definition

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/defining-open-source/img_005.png

Aug 28, 2024 · 8 min

Nous Hermes 3 and exploiting underspecified evaluations

The latest model from one of the most popular fine-tuning labs makes us question how a model should be identified as a "frontier model."

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/nous-hermes-3

0:00 Nous Hermes 3 and exploiting underspecified evaluations
5:29 Parsing training lessons from Hermes 3

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_005.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_010.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_012.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_020.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_027.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_030.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_032.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/nous-hermes-3/img_036.png

Aug 16, 2024 · 8 min

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

I had the pleasure of talking with Ross Taylor, who has a great spectrum of unique experiences in the language modeling space: evaluation experience, Galactica lead author, Llama post training, etc. This is a really great conversation on the frontier of language model (LM) reasoning, LM deployments and demos, LMs for science, RLHF, and other topics. I’ve been trying to get Ross to come on for a bit. He’s one of those people in the LM space that doesn’t speak too much, but when he does, you listen.

Ross Taylor was previously an LLM lead at Meta AI, heading up the reasoning team. Previously he led the early work on LLM agents, and was the research lead on the Galactica project. Before that, he was a co-founder of Papers with Code, which was acquired by Meta in 2019. Before that, he worked as a quant in sports betting and finance, and before that a policy advisor for the UK Government. He is currently working on a new startup.

Listen on Apple Podcasts, Spotify, and wherever you get your podcasts. For other Interconnects interviews, go here.

Chapters
* [00:00:00] Introduction of Ross Taylor and his background
* [00:02:12] Papers with Code
* [00:09:58] Galactica, goals, controversy, legacy
* [00:18:12] Technical details of the Galactica model
* [00:23:18] Potential for language models to make scientific discoveries
* [00:25:21] Defining and improving reasoning in language models
* [00:32:38] Process-based reward models and their potential applications
* [00:35:00] Generating synthetic data for SFT
* [00:40:23] Evaluating the effectiveness of language models as judges for human preference data
* [00:42:43] Considerations for creating base models that are easy to fine-tune
* [00:46:45] Balancing SFT and RLHF
* [00:54:13] Characteristics of successful post-training teams
* [00:58:26] Future directions for language model development

We mention
* Galactica
* Papers with Code
* Rob Stojnic (co-founder of Papers with Code)
* DPO, PPO
* Armen Aghajanyan (Chameleon)
* Tom Scialom on Latent Space
* Soumith Chintala (PyTorch)
* Alex Graves
* Llama 3 paper
* Process Reward Models / Let’s Verify Step by Step

Transcript
Built with smol-podcaster and with love of Latent Space.

Nathan Lambert [00:01:07]: Today, we're here with Ross. This is a really exciting one. I've been trying to get Ross on the show for a while. Ross has done a lot of interesting work. And also the path to where you ended up with working on state-of-the-art LLaMA work at Meta is very interesting to me. So we're going to start with some of that, but then there are a few people that want to know more about reasoning and some of the RLHF stuff. We won't cover the secretive new start-up - I don't know what it is, but that's how it goes these days. I'm sure it'll be great. So welcome to the show!

Ross Taylor [00:01:41]: Thanks for having me.

Nathan Lambert [00:01:44]: So I wanted to start with Papers with Code. For people that don't know, Papers with Code is one of these platforms - I never was a heavy user of it - but it collates papers, people can upvote them, popular papers, attaching code and dataset and evaluations to papers, which is great - it was like sort of ahead of its time. It fits into a lot of these open ecosystem things. So I'm kind of curious, like, how you ended up there and why you all started this startup that ended up building this thing that got acquired by Meta?

Ross Taylor [00:02:12]: Yeah, that was a weird one. This was like back in 2018. So I was at an incubator, I just quit my previous job and I was like, okay, I want to do a startup. And I met Rob, my co-founder, who came along with me for the journey. We both came from different backgrounds. I was from a sports betting / quant finance kind of background, which is a whole other episode I guess. And Rob was in various startups, like applying ML to things like hate speech detection, that kind of stuff. And the cool thing was, we both resonated on similar kinds of problems within the ML space, even though we came from different domains. So we spent a lot of time doing various experiments, trying to make new kinds of ML tooling, thinking of these stupid questions like “what is the Git equivalent for ML?” - that kind of stuff. One of those experiments was hacking around on this little website to solve a really basic problem: I'm trying to reproduce this paper, but I can't find the code. That was the thing that really blew up beyond our expectations. It was weird because we thought it was fairly trivial at first.

Nathan Lambert [00:03:16]: What year was this? 2018?

Ross Taylor [00:03:18]: Yeah.

Nathan Lambert [00:03:19]: This makes sense. I think this was like, I was starting Deep RL then, but Deep RL was so hot, which was like the worst evaluation has ever been probably for ML. Like people complain about it today, but like Deep RL evaluation was like, every single person was just lying to make themselves look better.

Ross Taylor [00:03:38]: The interesting thing now is that the open ecosystem has shifted to focus

Aug 8, 2024 · 1h 2m

A recipe for frontier model post-training

Apple, Meta, and Nvidia all agree -- synthetic data, iterative training, human preference labels, and lots of filtering.

This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/frontier-model-post-training

00:00 Llama 3.1 post-training and the new normal for RLHF
01:18 A new standard pipeline
01:45 Human preference data
02:59 Scaling RLHF
05:03 Synthetic data
06:10 The new normal
06:51 Data quality is king
07:18 Apple confirms the new normal

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_018.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_020.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_031.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_033.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_035.png

Aug 7, 2024 · 10 min

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

This week, I had the pleasure of chatting with Sebastian Raschka. Sebastian is doing a ton of work on the open language model ecosystem and AI research broadly. He’s been writing the great Ahead of AI newsletter (that has the biggest audience overlap with Interconnects, at 26%, so a lot of you know him) and multiple educational books, all on top of being a full time machine learning engineer at Lightning.ai, where he maintains LitGPT, which he described as being like Karpathy’s NanoGPT, with slightly more abstractions.This conversation mostly surrounds keeping up with AI research, the state of the open LLM ecosystem post Llama 3.1, and many narrow topics in between. I learned that Sebastian used to be an Arxiv moderator, which gives some simple color on how Arxiv and sifting through thousands of papers works. We cover a lot of ground here, so I hope you enjoy it.Listen on Apple Podcasts, Spotify, and where ever you get your podcasts. For other interviews, go here.YouTubeChapters* [00:00:00] Introduction & Sebastian’s background* [00:04:28] The state of deep learning and language models in 2018* [00:08:02] Sebastian's work at Lightning AI and LitGPT* [00:12:23] Distillation and its potential in language model training* [00:14:14] Implementing language models and common pitfalls* [00:18:45] Modern architectures: Mixture of experts models, early v. 
late fusion multimodal* [00:24:23] Sebastian's book on building language models from scratch* [00:27:13] Comparing ChatGPT, Claude, and Google's Gemini for various tasks* [00:38:21] Vibing and checking new language models during implementation* [00:40:42] Selecting papers to read and moderating Arxiv* [00:45:36] Motivation for working on AI education* [00:52:46] Llama 3 fine-tuning* [00:57:26] The potential impact of AI on jobs in writing and education* [01:00:57] The future directions of AITranscriptBuilt with smol-podcaster and with love of Latent Space.Nathan Lambert [00:00:00]: Hey, Sebastian, welcome to this kind of interconnects, normally researcher interviews. You were a professor, so that definitely counts. You do a lot of different things these days. Let's get talking into language models. Welcome. Yeah.Sebastian Raschka [00:01:35]: Thanks so much for the invitation, Nathan. I'm a big fan actually of the interconnects newsletter, so I'm hoping we can have some fun chat about research, LLMs, and what's hot these days, basically. Yeah.Nathan Lambert [00:01:48]: I have a little section on the end, which is keeping up with AI research, writing about AI and process, because you do so many things, but I kind of want to jump into how you got to AI, because you have an interesting career path. So you were a professor at Wisconsin Madison for years. I saw in statistics, which ... I also went all the way back to find your PhD thesis, which was uncovering hidden patterns of molecular recognition. So this was a while ago, and is this kind of ... Can you explain your background and how you got into AI? I'm guessing it's through computational statistics or something like this.Sebastian Raschka [00:02:24]: Yeah. Close. So yeah, you did some research there. Interesting. So yeah, it's been a long time since my PhD thesis. This is maybe seven years now. And back then, it started even earlier when I got into AI, that was like, I would say 2012-ish. 
I was in grad school and I was taking a statistical pattern classification class. And in that class, yeah, the star of the show was basically naive Bayes classifiers, or in general, Bayesian methods for pattern recognition. And from there, I kind of really got into machine learning. So there was, I would say, more statistical-based, but it was all about classifying things. And then I think it was also right about the time where Coursera was launched, and I saw Andrew Ng's Coursera class. That was, I think, the first class in 2011-12 back then. And yeah, that's basically how I started from statistical pattern classification into machine learning. And I applied that for computational biology problems like molecule and drug discovery, like pharmaceutical drug discovery. And yeah, from there, I joined at some point after my graduation, the University of Wisconsin in Madison, where I was in the statistics department, but I did mostly deep learning research, essentially. I was the only one basically doing Python, deep learning, machine learning stuff. So yeah.

Nathan Lambert [00:03:48]: What year was this, and what did it look like at the time?

Sebastian Raschka [00:03:52]: That was around 2018, I think August 2018, when I joined the department. And yeah, I mean, so it's the statistics department, but my work was technically all machine learning and deep learning. I mean, a lot of students were really excited about learning machine learning. I think it was just around the time where it got really popular. And yeah, I was teaching machine learning and deep learning classes as well. They were always like, you know, full and crowded, like a lot o

Aug 1, 20241h 3m

GPT-4o-mini changed ChatBotArena

And how to understand Llama 3.1's results.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/gpt-4o-mini-changed-chatbotarena
0:00 GPT-4o-mini changed ChatBotArena
3:23 Llama 3 in the arena
5:13 Partial solutions and next steps
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_019.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_021.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_025.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_039.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/new-chatbotarena/img_043.png
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jul 31, 20247 min

Llama 3.1 405b, Meta's AI strategy, and the new open frontier model ecosystem

Defining the future of the AI economy and regulation. Is Meta's AI play equivalent to the Unix stack for open-source software?This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/llama-405b-open-frontier-model00:00 Llama 3.1 405b, Meta's AI strategy, and the new open frontier model ecosystem01:37 Meta's open frontier model03:51 Zuckerberg's vision for open-source AI (vs. reality)08:35 Does the Llama 3.1 license support open-source AI?12:55 Different futures for regulating frontier modelsFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-405/img_008.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-405/img_010.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-405/img_015.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-405/img_018.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-405/img_050.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jul 23, 202415 min

SB 1047, AI regulation, and unlikely allies for open models

The rallying of the open-source community against CA SB 1047 can represent a turning point for AI regulation.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/sb-1047-and-open-weights
00:00 Introduction
01:53 SB 1047 and targeting regulation
07:57 Unlikely allies of "open"
12:05 What would I regulate today?

Jul 17, 202414 min

Switched to Claude 3.5

I Switched to Claude 3.5Speculations on the role of RLHF and why I love the model for people who pay attention.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/switched-to-claude-from-chatgpt00:00 I Switched to Claude 3.503:57 Product priorities05:15 RLHF's peak?Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/claude/img_016.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/claude/img_018.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/claude/img_020.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/claude/img_022.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jul 3, 20246 min

Interviewing Dean Ball on AI policy: CA SB 1047, upcoming AI disaster response, Llama 3 405B, Chinese open-source AI, and scaling laws

I’m really excited to resume the Interconnects Interviews with Dean W. Ball from the Hyperdimensional Substack (you should subscribe). We cover the whole stack of recent happenings in AI policy, focusing of course on California’s bill SB 1047. We cover many, many more great topics here including:
* What will happen in the case of a minor AI disaster,
* If Meta will release the 405B model, and why,
* The status of Chinese open-source AI,
* Training on model outputs,
* Anthropic’s recent strategy,
* What scaling laws actually mean,
* Creating content and shifting the needle of the AI discourse.

Watch the video on YouTube below or listen on podcast players here.

Chapters
* 00:00 Intro and Welcome Dean Ball
* 02:44 The Origins of California Bill SB1047
* 08:56 The Evolution of Bill SB1047
* 13:00 How SB1047 Affects Fine-Tuning
* 20:00 The Future of Bill SB1047
* 21:58 The Impact of AI Disasters
* 29:02 Meta and its 400 billion Parameter Model
* 32:25 Open Source AI and the Chinese Market
* 37:37 The Future of Open Source AI
* 43:35 Synthetic Data, Licenses, and Future AI Development
* 45:18 Anthropic's Approach to AI Safety
* 50:46 Scaling Laws
* 53:01 The Role of Audience in Influencing AI Policy

Links
* Dean’s series on SB-1047: one, two, and three.
* Other AI policy Substacks: Jural Networks and Intersecting AI
* Senator Scott Wiener. CA SB 1047 itself.
* Another post on CA SB 1047 from Answer AI.
* Situational Awareness by Leopold Aschenbrenner.
* Lina Khan on her P(doom) and warnings in support of open-source.
* Ben Thompson’s Framework for Moderation in technology.

Transcript

Nathan Lambert (00:00:01): Hello, and welcome back to the Interconnects interview series. It's been a few months. I'm really excited for this one. We're here with Dean Ball, who is a research fellow at the Mercatus Center.
He works on AI policy right now, and he's the author of the Hyperdimensional Substack, which is kind of the AI policy substack that emerged when I was spamming into the void that we need to have some good AI policy newsletters out there. There are a couple more that I could add to the show notes of this that I'm aware of from friends that used to be at OpenAI, friends at AI2, so I'll add some of those as well.

But in this kind of summer slowdown of releases, I thought it would be a great time to kind of revisit some of the core themes on AI policy, open versus closed, kind of things that I'm wondering about in the future that I know are coming that are looming AI disasters, what some of these closed source companies are trying to do in the policy space. I think this is the sort of interview that we could probably do multiple times. I think we've started talking in DMs and it's clear that we're aligned on a whole bunch of things. We read each other's work. I think this should be kind of fun and I'm just happy to do this.

I think the core of this interview I'll give you a chance to introduce yourself if you want, if you want to add anything else that I missed, and then we're just going to go into this California bill SB 1047. Probably talk about this. I'll ask you about the story of how it happened and then where we're at now. And I think that'll kind of lead into a lot of interesting debates. So do you have any background you want to add that makes you an interesting person in the AI space? Or is it just that there's so many things that need to be done in AI that if you're focused, you can kind of have an impact in an area?

Dean W Ball (00:01:44): Yeah, I mean, I think basically, you know, I've mostly written on policy unrelated to tech for my career, state and local a lot. So the fact that a lot of the policy action on AI seems to be happening at the state level has been very relevant. But I've also just like always been paying attention to the AI literature.
I remember 2017, I think, when the Alec Radford Amazon product reviews paper came out, and I said to a colleague, this is gonna be a big deal, I think, one day. And, you know, I tried to use GPT-2 to do, like, social science research, like policy research, back in 2019. So I've been playing around with these for a while, and I try my best to write from a combination of a relatively technically informed person, but also someone who understands the policy side.

Nathan Lambert (00:02:43): Yeah, so I think we should jump right into it. What is the origin of the story of this California bill? My understanding is it just kind of showed up and everyone in the Bay Area was like, where did this come from? Having actually passed the state Senate. Does your story start there as well? Or did you kind of know this was coming?

Dean W Ball (00:03:03): So I saw Scott Wiener, the author of the bill, had telegraphed that he was working on something in AI policy, I think in maybe October or November of 2023. And then the actual bill text came out in early February. And I remember when it came out b

Jun 27, 202456 min

RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition

Things to be aware of if you work on language model fine-tuning.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/rlhf-roundup-202400:00 RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition04:32 How big is the impact of RLHF relative to pretraining?05:54 RewardBench retrospective after 100 models and 90% peak accuracy09:19 LMSYS's reward modeling competitionFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_009.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_012.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_017.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_026.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jun 26, 202411 min

Frontiers in synthetic data

Synthetic data is known to be a super powerful tool for every level of the language modeling stack. It's documented as being used for expanding vanilla pretraining data and creating large swaths of fine-tuning data. Many, many more rumors surround its use: Anthropic's pretraining-scale constitutional AI, Mistral AI's first models being pretrained on OpenAI outputs, Q-star's hopes as OpenAI's remaining moat, and much more. The diversity of use cases for synthetic data makes planning around the role of synthetic data in solving specific goals difficult.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/frontiers-in-synthetic-data
00:00 Frontiers in synthetic data
01:14 1. Direct distillation is still king
02:54 2. Are Gemini Flash and Claude Haiku distilled?
04:03 3. Filtering prevents collapse
06:30 4. Synthetic data strategy taxes
07:32 5. Pros and cons of training on multi-output-source synthetic datasets
08:54 6. Structured synthetic data
09:42 7. Weak-to-strong generalization is maybe real
10:27 8. Creating synthetic prompts is overlooked again

Jun 21, 202411 min

Text-to-video AI is already abundant

Signs point to a general-use Sora-like model coming very soon, maybe even with open-weights.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/text-to-video-ai-is-already-abundant0:00 Text-to-video AI is already abundant5:08 What's next for the text-to-video market?6:49 Are text-to-video models good for the world?Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_005.mp4Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_009.mp4Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_011.mp4Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_013.mp4Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_015.mp4Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/text-to-video/img_017.mp4 This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jun 18, 20248 min

AI for the rest of us

Apple Intelligence makes a lot of sense when you get out of the AI bubble.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/apple-intelligence00:00 AI for the rest of us02:46 Apple's technical approach03:32 Core models: What did Apple build?05:35 Alignment strategies: Some new things!10:00 Orchestrating adapters and on-device magic11:58 Light for other narratives around AIFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/apple-intelligence/img_005.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/apple-intelligence/img_015.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/apple-intelligence/img_039.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/apple-intelligence/img_041.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Jun 12, 202412 min

A realistic path to robotic foundation models

Not "agents" and not "AGI." Some thoughts and excitement after revisiting the industry thanks to Physical Intelligence founders Sergey Levine and Chelsea Finn.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/robotic-foundation-models
0:00 A realistic path to robotic foundation models
2:51 Key factors for the future of robotics
6:19 Everything is a token: The transformerification of robotics

Jun 5, 20247 min

We aren't running out of training data, we are running out of open training data

Data licensing deals, scaling, human inputs, and repeating trends in open vs. closed.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/the-data-wall0:00 We aren't running out of training data, we are running out of open training data2:51 Synthetic data: 1 trillion new tokens per day4:18 Data licensing deals: High costs per token6:33 Better tokens: Search and new frontiers This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

May 29, 20248 min

Name, image, and AI's likeness

Celebrity's power will only grow in the era of infinite content.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/name-image-and-ai-likeness0:00 Name, image, and AI's likeness1:11 OpenAI's second terrible, horrible, no good, very bad week4:36 The expansion of name and likeness7:46 Culture and AI development This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

May 22, 20249 min

OpenAI chases Her

ChatGPT leaves the textbox, and Google is building the same, and more, as practical tools.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/openai-and-her00:00 OpenAI chases Her02:10 Talking to ChatGPT03:53 GPT-4o: Toward omnimodal models08:25 Google's mirror with Gemini10:11 OpenAI's AI Safety: Have your cake and eat it tooFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/her/img_018.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/her/img_023.jpg This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

May 16, 202412 min

OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions

Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects -- shrinking the Overton window of RLHF bugs.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/openai-rlhf-model-spec00:00 OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions02:56 Reviewing the Model Spec08:26 Where RLHF can fail OpenAI12:23 From Model Spec's to personalizationFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_027.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_029.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_033.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_034.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_041.webpFig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_046.webp This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

May 13, 202414 min

RLHF: A thin line between useful and lobotomized

Many, many signs of life for preference fine-tuning beyond spoofing chat evaluation tools.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/how-rlhf-works-200:00 How RLHF works, part 2: A thin line between useful and lobotomized04:27 The chattiness paradox08:09 The mechanism for making models chattier10:42 Next steps for RLHF researchFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_012.webpFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_018.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_025.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

May 1, 202413 min

Phi 3 and Arctic: Outlier LMs are hints

Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/phi-3-and-arctic-llms0:00 Phi 3 and Arctic: Outlier LMs are hints1:01 Arctic & open mixture of expert trends6:10 Phi 3, synthetic data, and small modelsFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_004.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_008.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_018.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 30, 20249 min

AGI is what you want it to be

Certain definitions of AGI are backing people into a pseudo-religious corner.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/agi-is-what-you-want-it-to-be00:00 AGI is what you want it to be04:01 RL still rules the AGI discourse05:43 Modern AGI tests07:37 Agency and shifting goalpostsFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_018.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_020.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 24, 202410 min

Llama 3: Scaling open LLMs to AGI

Meta shows that scaling won't be a limit for open LLM players in the near future.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/llama-3-and-scaling-open-llms00:00 Llama 3; scaling open LLMs to AGI01:44 Pretraining, data, and basic evals06:06 Alignment and human evaluations10:08 Chatting with Meta AI and Llama 3 70B Instruct11:55 Same Llama license (mostly)12:52 The healthy open LLM ecosystemFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_011.jpegFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_013.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_015.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_020.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_036.pngFig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_040.pngFig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_046.jpegFig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_061.pngFig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_063.webpFig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_066.pngFig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_068.jpeg This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 21, 202415 min

Stop "reinventing" everything to "solve" alignment

Integrating some non computing science into reinforcement learning from human feedback can give us the models we want.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/reinventing-llm-alignment0:00 Stop "reinventing" everything to "solve" AI alignment2:19 Social Choice for AI Alignment: Dealing with Diverse Human Feedback7:03 OLMo 1.7 7B: A truly open model with actually good benchmarksFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_013.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_015.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_018.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_024.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_027.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 17, 20247 min

The end of the "best open LLM"

Modeling the compute versus performance tradeoff of many open LLMs.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/compute-efficient-open-llms0:00 The end of the "best open LLM"3:05 Compute efficient open LLMsFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_004.jpegFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_009.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_014.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_016.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_018.pngFig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_020.pngFig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_022.pngFig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_024.pngFig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_028.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 15, 20246 min

Why we disagree on what open-source AI should be

Last minute title change from: The tech industry can't agree on what open-source AI means. That's the process.How to read what multiple people mean by the word openness and see through the PR speak.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/flavors-of-open-source-ai0:00 The tech industry can't agree on what open-source AI means. That's the process.2:45 1. Effective Accelerationists, Techno-Optimists, capitalists, etc.3:39 2. Scientists, promoting understanding and transparency5:16 3. Inclusion, public interest, and fighting concentration of power6:19 4. Freedom advocates7:25 Dissecting "openness"Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/openness/img_004.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Apr 3, 20248 min

DBRX: The new best open LLM and Databricks' ML strategy

Databricks' new model is surpassing the performance of Mixtral and Llama 2 while still being in a size category that's reasonably accessible.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/databricks-dbrx-open-llm
00:00 DBRX: The new best open model and Databricks' ML strategy
03:36 The DBRX narrative
07:33 Databricks' open LLM (and AI) strategy
09:42 Playing with DBRX Instruct
14:54 Digging for details
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_007.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_012.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_023.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_045.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_047.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_059.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_066.jpeg
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_068.png

Mar 29, 202416 min

Evaluations: Trust, performance, and price (bonus, announcing RewardBench)

Evaluation is not only getting harder with modern LLMs, it's getting harder because it means something different.This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/evaluations-trust-performance-and-price00:00 Evaluations: Trust, performance, and price (bonus, announcing RewardBench)03:14 The rising price of evaluation05:40 Announcing RewardBench: The First reward model evaluation tool08:37 Updates to RLHF evaluation toolsYouTube code intro: https://youtu.be/CAaHAfCqrBAFigure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_026.pngFigure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_030.pngFigure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_034.pngFigure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_040.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Mar 21, 202412 min

Model commoditization and product moats

Where moats are tested now that so many people have trained GPT4 class models. Claude 3, Gemini 1.5, Inflection 2.5, and Mistral Large are here to party.This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/gpt4-commoditization-and-moats00:00 Building LLM moats despite the commoditization of GPT404:38 The Open's opportunities08:02 It's amazing people still think LLMs aren't going to be useful09:50 Things that are comingFigure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_004.pngFigure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_028.pngFigure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_032.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Mar 13, 2024, 10 min

The koan of an open-source LLM

A proposal for a new definition of an "open source" LLM and why no definition will ever just work.
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/an-open-source-llm
00:00 The koan of an open-source LLM
03:22 A new naming scheme for open LLMs
07:09 Pivot points and politics
08:16 Claude 3, arms race, commoditization, and national security
10:01 Doomers debunking bio risks of LLMs themselves
11:21 Mistral's perceived reversal and the EU
13:22 Messy points: Transparency, safety, and copyright
13:32 The muddling of transparency
15:22 The muddling of "safety"
16:30 The muddling of licenses and copyright
20:12 Vibes points and next steps
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_046.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_064.png

Mar 6, 2024, 23 min

Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between

This interview is available on podcast players and YouTube.
I’m excited to bring you another interview! This one is a deep dive right in my wheelhouse: all things RLHF. Louis Castricato is probably the hidden star of RLHF in the open. I’m not sure anyone who can speak freely knows as much as him. As I’ve said again and again on Interconnects:
Giving a voice to researchers is the best way to cut through the noise and understand what is happening with core developments of LLM technologies.
Louis has recently been founding a new startup focused on synthetic data for alignment, Synth Labs, and is a researcher at Eleuther AI. This interview should speak for itself, and it’ll need re-listens, even for myself. The list of topics we cover touches on pretty much every major and minor issue facing model fine-tuning. Please reach out or comment if there’s a paper we mention that I didn’t link before. Happy to dig it up for you. For more on Synth Labs, there was a profile in Bloomberg from Rachel Metz.
This post is very technical, more than usual. If you’re having a hard time with it, I suggest you listen to my RLHF 201 post on Latent Space first.
Chapters
These are generated with smol-podcaster with moderate edits.
High-level chapters
* 00:00:00: Introduction
* 00:01:24: Gemini News and RLHF’s Part in it
* 00:09:05: Long Context, In-Context, and Multimodal RLHF
* 00:21:20: What are people missing about RLHF these days?
* 00:30:30: OpenAI's Influence and the Need for Alternatives
* 00:39:20: Synth Labs and the Future of Alignment
* 00:55:00: Evaluation Talk p2: Open-ended Evaluation and Data Diversity
* 00:59:20: Algorithm Roundup: PPO, DPO, KTO, IPO
* 01:18:38: CarperAI, Early Days of RLHF, Reflecting on ChatGPT
Detailed chapters
* 00:00:00: Introduction and Overview of RLHF
* 00:02:02: Gemini News, Custom Demographics in Image Prompts, and Controllability Issues in AI Models
* 00:05:21: Fixing Biases in AI Models Post-Training, Representation in AI Data
* 00:09:00: Multimodal RLHF and Video RLHF
* 00:16:09: Evaluating Long Context Behavior in AI Models
* 00:17:05: The Potential of In-Context RLHF
* 00:21:24: Shift from PPO to DPO in RLHF
* 00:23:19: Generalization and Evaluation in RLHF, Human Evaluation
* 00:27:03: The Discrepancy Between Research and Company Needs in Alignment
* 00:29:20: Impact of ChatGPT and Language Model Outputs on Data Sets
* 00:31:39: The Concept of Uncensoring Models
* 00:34:05: Lack of Safety Data Sets in Instruction Tuning
* 00:35:23: LMSYS ChatBotArena, AlpacaEval, MT Bench p1
* 00:39:25: Introduction to Synth Labs and Alignment Platform
* 00:43:05: Developing OpenCAI Constitutional AI Data Set
* 00:49:41: The Need for Open-Ended Evaluation in RLHF, eval p2
* 00:54:13: The Importance of Releasing Models for RLHF Research
* 00:58:17: Self-Instruction and Self-Rewarding LMs
* 01:01:03: Working on RLHF at Carper AI
* 01:04:25: Scaling PPO in RLHF
* 01:08:01: The Impact of ChatGPT on Carper AI
* 01:10:56: The Potential of KTO (Kahneman-Tversky Optimization)
* 01:17:39: The Importance of Implementation Details in RLHF
* 01:20:14: The Initial Focus at Carper AI
* 01:23:36: The Future of RLHF and Open Science Collaboration
Papers & artifacts we discuss
* Recursively Summarizing Books with Human Feedback
* Needle in a haystack recent example repository.
* Urial paper: The unlocking spell on base LLMs: Rethinking alignment via in-context learning
* Misha paper from DeepMind: In-context Reinforcement Learning with Algorithm Distillation
* Muesli Optimizer: Muesli: Combining Improvements in Policy Optimization
* Unintended Impacts of LLM Alignment on Global Representation
* Pink Elephants Problem: Suppressing Pink Elephants with Direct Principle Feedback
* Cut the Carp: Cut the CARP: Fishing for zero-shot story evaluation
* MT Bench data for correlating human to GPT4 preferences
Full transcript
Note: this is generated by smol-podcaster and has minor bugs post human edits.
Nathan [00:00:01]: The ticker's going up. Welcome, Louis. You're the second guest on the Interconnects podcast, I think. It's an interesting one for me because everyone kind of points to me now as the person that is the face of RLHF and I get a lot of questions, and to me Louis has represented that person. I think Louis provided most of the information on the first RLHF blog post that I wrote for Hugging Face back in the day. If there's somebody that I want to ask questions about RLHF, it generally goes to him. So now you all are gonna know this in the open. We're gonna cover a lot of things. As always, I'm trying to talk with researchers on the ground and people actually doing things in these topics. I think we're gonna cover a lot of things today. We're in the Latent Space podcast. If you're watching on video, you may have noticed that we're in the Latent Space studio and they reminded us we've got to start off with covering the Gemini news and what that means for RLHF and then most of this is

Mar 4, 2024, 1h 26m

How to cultivate a high-signal AI feed

Basic tips on how to assess inbound ML content and cultivate your news feed.
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/making-a-ml-feed
00:00 How I assess all these AI releases
01:22 1. Model access and demos are king of credibility
02:31 2. Focus your feed on depth or breadth
03:09 3. Examples of using the model normally show it's usable, shockingly
04:10 4. Leaderboards as the single leading claim is often anti-signal
05:00 5. Basic deep learning conceptual checks will often save you
06:13 6. If it's not even remotely reproducible or verifiable, it's not science
07:10 7. Don't over-index on Twitter
08:32 8. Data sharing, licenses, communication clarity, and small things add up
08:58 9. Research papers, technical reports, blog posts, and Tweets all serve different purposes
09:49 10. Socialize your information and build relationships

Feb 28, 2024, 10 min

Google ships it: Gemma open LLMs and Gemini backlash

Google rejoins the open model party and gets some backlash for a frequent problem for generative AI.
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/gemma-google-ships-it
00:00 Google ships it: Gemma open LLMs and Gemini backlash
03:12 Getting to know Gemma
07:11 Alignment details
08:55 Aside: What is REINFORCE? Some history of RL
11:08 Implementation details and RLHF
12:18 Terms of use: RAIL Licenses history repeated
14:05 Is Google back on top? Gemini's woes
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_008.webp
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_014.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_035.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_055.png

Feb 22, 2024, 17 min

10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more

10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/sora-gemini-follow-up
00:00 10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more
00:46 1. Deepfake detection of Sora
01:59 2. Playing with long-context, problem settings, and prompting
03:39 3. Gemini paper snooping: contamination and citation games
05:42 4. Training data and token estimates of YouTube
07:42 5. Unlocking model-based RL and downstream research
08:52 6. Midjourney style matching, V-JEPA, replicating Sora in the open
10:09 7. Architectures and academic links
10:57 8. Pixel peeping from the arts
11:58 9. Inference costs
13:24 10. Pressure on Llama and Mistral
14:03 11. Sound effects, physics, and the complete picture
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_003.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_007.mp4
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_009.mp4
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_011.mp4
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_037.mp4
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_044.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_047.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_049.mp4

Feb 20, 2024, 14 min

Releases! OpenAI’s Sora for video, Gemini 1.5's infinite context, and a secret Mistral model

Emergency blog! Three things you need to know from the ML world that arrived yesterday.
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/sora-gemini-and-mistral-next
0:00 OpenAI's Sora for video, Gemini 1.5, and a secret Mistral model
0:53 Sora: OpenAI's text-to-video model
4:59 Gemini 1.5: Google's effectively infinite context length
8:01 Mistral-next: Another funny release method
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_015.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_023.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_026.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_036.png

Feb 16, 2024, 9 min

Why reward models are still key to understanding alignment

In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
Podcast figures:
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_004.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_009.png
0:00 Why reward models are still key to understanding alignment

Feb 14, 2024, 7 min

Alignment-as-a-Service: Scale AI vs. the new guys

Scale's making over $750 million per year selling data for RLHF. Who's coming to take it?
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/alignment-as-a-service
00:00 Alignment-as-a-Service upstarts taking on Scale AI
04:25 The competition with humans-in-the-loop
06:05 Scaling Alignment-as-a-Service via AI feedback
Podcast figures:
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/aaas/img_008.png

Feb 7, 2024, 10 min

Open Language Models (OLMos) and the LLM landscape

A small model at the beginning of big changes.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/olmo
0:00 Open Language Models (OLMos) and the LLM landscape
6:24 Thought experiments
7:51 The LLM landscape heading into 2024
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmo/img_010.png

Feb 1, 2024, 9 min
