
Season 2 · Episode 260
Digital Archeology: The Primitive Power of GPT-1
Revisit the 2018 model that started it all. Herman and Corn dive into GPT-1's romance-novel roots and its 117-million-parameter legacy.
My Weird Prompts · Daniel Rosehill
January 20, 2026 · 19m 18s
Show Notes
In this episode, Herman Poppleberry and Corn take a fascinating trip back to 2018 to perform some "digital archeology" on the model that started a revolution: GPT-1. While modern users in 2026 might find its 117-million-parameter capacity and tendency to output gibberish laughable, the hosts explain why this "primitive" tool was actually the Wright brothers' flyer of the artificial intelligence era. They dive deep into the technical limitations of the time, including the 512-token context window and the use of absolute positional embeddings that caused the model to frequently lose its train of thought. Beyond the specs, Herman and Corn discuss the shift from supervised learning to unsupervised pre-training and how a dataset of 11,000 unpublished romance novels shaped the early worldview of generative AI. By comparing the raw engine of GPT-1 to the "layered cakes" of 2026, this episode provides a crucial perspective on how far the industry has come and why the ghost of this original architecture still lives within the trillion-parameter giants of today.
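For listeners who want to see where the 117-million figure comes from, here is a rough back-of-the-envelope sketch in Python. The show notes only give the parameter count and the 512-token window; the other numbers (12 layers, a width of 768, a feed-forward size of 3072, a vocabulary of 40,478 entries, and output weights shared with the token embeddings) are assumptions based on the commonly cited GPT-1 configuration, so treat the result as an approximation rather than an official tally.

```python
def gpt1_param_count(vocab=40_478, n_ctx=512, d_model=768, n_layer=12, d_ff=3072):
    """Estimate the parameter count of a GPT-1-style decoder.

    All defaults except n_ctx are assumptions (not stated in the show notes).
    The output projection is assumed to reuse the token embedding matrix.
    """
    # Lookup tables: one row per vocabulary entry, one row per position.
    token_embeddings = vocab * d_model
    # Learned absolute positions: only n_ctx rows exist, so the model has
    # no representation for any position beyond the context window.
    position_embeddings = n_ctx * d_model

    # Self-attention: query, key, value, and output projections, each with a bias.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward network: expand to d_ff, then project back to d_model.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d_model

    blocks = n_layer * (attention + feed_forward + layer_norms)
    return {
        "token_embeddings": token_embeddings,
        "position_embeddings": position_embeddings,
        "transformer_blocks": blocks,
        "total": token_embeddings + position_embeddings + blocks,
    }


if __name__ == "__main__":
    for name, count in gpt1_param_count().items():
        print(f"{name:>20}: {count:,}")
```

Under these assumptions the total lands just under 117 million, with roughly a quarter of the weights sitting in the token embedding table. The position table is tiny by comparison, but its fixed 512 rows are what hard-cap the context window.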