
303 - How LLMs Work - the 20 minute explainer
Ever get asked "how do LLMs work?" at a party and freeze? We walk through the full pipeline: tokenization, embeddings, inference — so you understand it well enough to explain it. Walk away with a mental model that you can use for your next dinner party.
Fragmented - AI Developer Podcast · Kaushik Gopal, Iury Souza
Show Notes
Full show notes at fragmentedpodcast.com.
Words -> Tokens:
- OpenAI Tokenizer visualizer - visualize how text becomes tokens
Tokens -> Embeddings:
- RGB color model - Wikipedia
- Word2Vec technique - Wikipedia
- Efficient Estimation of Word Representations in Vector Space - original Word2Vec paper by Mikolov et al.
Embeddings -> Inference:
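To make the three stages above concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: the five-entry vocabulary, the made-up 3-dimensional embedding numbers, and the `next_token_probs` function, which stands in for inference by averaging the context vectors and scoring each vocabulary entry with a dot product. A real LLM uses a learned subword tokenizer, learned embeddings with hundreds or thousands of dimensions, and many transformer layers in place of the averaging step.

```python
import math

# Hypothetical toy vocabulary; real tokenizers use learned subword units.
VOCAB = ["how", "do", "llms", "work", "?"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Hypothetical 3-dimensional embeddings, one row per token id (made-up numbers).
EMBEDDINGS = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.9],
    [0.2, 0.7, 0.3],
    [0.5, 0.5, 0.5],
]


def tokenize(text):
    """Words -> tokens: lowercase, split off '?', map each piece to an id."""
    words = text.lower().replace("?", " ?").split()
    return [TOKEN_TO_ID[w] for w in words]


def embed(token_ids):
    """Tokens -> embeddings: look up one vector per token id."""
    return [EMBEDDINGS[i] for i in token_ids]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(vectors):
    """Embeddings -> inference (toy stand-in, not a transformer):
    average the context vectors, score every vocabulary entry by
    dot product against that average, then apply softmax."""
    dim = len(vectors[0])
    context = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    scores = [sum(c * e for c, e in zip(context, emb)) for emb in EMBEDDINGS]
    return softmax(scores)


ids = tokenize("How do LLMs work?")
probs = next_token_probs(embed(ids))
for token, p in zip(VOCAB, probs):
    print(f"{token!r}: {p:.3f}")
```

The output is one probability per vocabulary entry. Picking (or sampling) a token from that distribution, appending it to the input, and repeating is the loop that generates text.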
Get in touch
We'd love to hear from you. Email is the
best way to reach us or you can check our contact page for other
ways.
We want to hear all the feedback: what's working, what's not, topics you'd like
to hear more on. We want to make the show better for you so let us know!
Co-hosts:
[!fyi] We transitioned from Android development to AI starting with
Ep. #300. Listen to that episode for the full story behind
our new direction.