Episode 303

303 - How LLMs Work - the 20 minute explainer

Ever get asked "how do LLMs work?" at a party and freeze? We walk through the full pipeline: tokenization, embeddings, inference — so you understand it well enough to explain it. Walk away with a mental model that you can use for your next dinner party.

Fragmented - AI Developer Podcast · Kaushik Gopal, Iury Souza

February 2, 2026 · 25m 45s



Show Notes

Full shownotes at fragmentedpodcast.com.


Words -> Tokens:

OpenAI Tokenizer visualizer - visualize how text becomes tokens
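
To make the words-to-tokens step concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary and the `tokenize` helper are hypothetical and exist only for illustration; production tokenizers learn much larger vocabularies (typically with byte-pair encoding) and will split text differently.

```python
# Toy vocabulary: hypothetical pieces and IDs, chosen only for illustration.
VOCAB = {"un": 0, "break": 1, "able": 2, "token": 3, "s": 4, " ": 5}


def tokenize(text: str) -> list[int]:
    """Split text into vocabulary IDs using greedy longest-match."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = j
                break
        else:
            # No piece starting at position i is in the vocabulary.
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return ids


token_ids = tokenize("unbreakable tokens")
print(token_ids)
```

The point of the sketch is that the model never sees words, only a list of integer IDs for sub-word pieces.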

Tokens -> Embeddings:

RGB color model - Wikipedia
Word2Vec technique - Wikipedia
- Efficient Estimation of Word Representations in Vector Space - original Word2Vec paper by Mikolov et al.
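
The RGB link hints at a useful analogy: a color is just three numbers, and similar colors sit close together in that three-dimensional space. Embeddings apply the same idea to tokens, only with many more dimensions. The sketch below assumes that reading and uses RGB triples as stand-in "embeddings"; the `cosine` helper and the color constants are illustrative, not taken from the episode.

```python
import math

# RGB triples used as stand-in 3-dimensional "embeddings" (illustrative only).
RED = (255, 0, 0)
ORANGE = (255, 165, 0)
BLUE = (0, 0, 255)


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Red is closer to orange than to blue, just as related words
# end up with nearby vectors in a learned embedding space.
print(cosine(RED, ORANGE))
print(cosine(RED, BLUE))
```

Learned embeddings such as Word2Vec vectors are compared the same way, but their dimensions are learned from text rather than hand-assigned like color channels.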

Embeddings -> Inference:
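
As a rough sketch of the final step, inference can be thought of as scoring every candidate next token and turning those scores into probabilities. The candidate tokens and scores below are made up for illustration; a real model computes scores over its whole vocabulary and usually samples rather than always taking the top choice.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidates and scores for the prompt "The cat sat on the".
candidates = ["mat", "dog", "moon"]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)
# Greedy decoding: pick the single most likely next token.
next_token = candidates[probs.index(max(probs))]
print(next_token, probs)
```

The chosen token is appended to the input and the process repeats, one token at a time.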

Get in touch

We'd love to hear from you. Email is the best way to reach us, or you can
check our contact page for other ways.

We want to hear all the feedback: what's working, what's not, topics you'd like
to hear more on. We want to make the show better for you so let us know!

Co-hosts:

FYI: We transitioned from Android development to AI starting with
Ep. #300. Listen to that episode for the full story behind
our new direction.

Topics

inference · llm · nlp · tokens · word2vec · machine-learning · explainer · embeddings
