PLAY PODCASTS
Scaling Large ML Models to Small Devices with Atila Orhon

Scaling Large ML Models to Small Devices with Atila Orhon

Software Engineering Daily · softwareengineeringdaily.com

May 7, 202456m 44s

Audio is streamed directly from the publisher (traffic.megaphone.fm) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Show Notes

The size of ML models is growing into the many billions of parameters. This poses a challenge for running inference on non-dedicated hardware like phones and laptops.

Argmax is a startup focused on developing methods to run large models on commodity hardware. A key observation behind their strategy is that the largest models are getting larger, but the smallest models that are commercially relevant are getting smaller. The company was started in 2023 and has raised money from General Catalyst and other industry leaders.

Atila Orhon is the founder of Argmax and he previously worked at Apple and NVIDIA. He joins the show to talk about working in computer vision, building ML tooling at Apple, optimizing ML models, and more.