
From Tokens to Vectors: The Efficiency Hack That Could Save AI (Ep. 294)
Show Notes
LLMs generate text painfully slowly, one low-information token at a time. Researchers just figured out how to compress 4 tokens into a single continuous vector & cut costs by 44%—with full code & proofs! Meanwhile OpenAI drops product ads, not papers.
We explore CALM (Continuous Autoregressive Language Models) & why open science matters. 🔥📊
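The core trick discussed in the episode—trading K discrete next-token steps for one continuous-vector step—can be sketched as a toy autoencoder: compress a chunk of 4 token embeddings into a single latent vector, then decode it back. This is a minimal illustrative sketch with made-up dimensions and plain linear maps, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not CALM's real hyperparameters):
K, d_tok, d_lat = 4, 8, 16  # chunk size, token-embedding dim, latent dim

# Toy "embeddings" for a chunk of K consecutive tokens.
chunk = rng.normal(size=(K, d_tok))

# Linear encoder: flatten the K token vectors, project to ONE latent vector,
# so the model only needs one generation step per 4 tokens.
W_enc = rng.normal(size=(K * d_tok, d_lat)) / np.sqrt(K * d_tok)
z = chunk.reshape(-1) @ W_enc

# Linear decoder: map the latent back to K per-token vectors.
W_dec = rng.normal(size=(d_lat, K * d_tok)) / np.sqrt(d_lat)
recon = (z @ W_dec).reshape(K, d_tok)

print(z.shape)      # one vector now stands in for a 4-token chunk
print(recon.shape)  # decoded back to per-token shape
```

With a trained (nonlinear) encoder/decoder pair, the language model predicts `z` autoregressively instead of each token—roughly where the claimed speed and cost savings come from.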
Sponsors
This episode is brought to you by Statistical Horizons
At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.
Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.
Get $200 off any seminar with code DATA25 at https://statisticalhorizons.com