PLAY PODCASTS
Ep 7: Co-Creator of Databricks Dolly Mike Conover on Open-Source LLMs
Episode 7

Ep 7: Co-Creator of Databricks Dolly Mike Conover on Open-Source LLMs

Patrick and Jacob sit down with Mike Conover, Staff Software Engineer at Databricks and Co-Creator of Databricks Dolly, the world’s first truly open instruction-tuned LLM, to discuss the magic behind Dolly, Alpaca and other instruction-tuned LLMs, the unreasonable effectiveness of fine-tuning, how they got all Databricks employees to help them curate the Dolly dataset (hint: google forms), and more.

Unsupervised Learning with Jacob Effron · Mike Conover, Jacob Effron, Patrick Chase

May 11, 202348m 56s

Audio is streamed directly from the publisher (cdn.simplecast.com) as published in their RSS feed. Play Podcasts does not host this file. Rights-holders can request removal through the copyright & takedown page.

Show Notes

Patrick and Jacob sit down with Mike Conover, Staff Software Engineer at Databricks and Co-Creator of Databricks Dolly, the world’s first truly open instruction-tuned LLM, to discuss the magic behind Dolly, Alpaca and other instruction-tuned LLMs, the unreasonable effectiveness of fine-tuning, how they got all Databricks employees to help them curate the Dolly dataset (hint: google forms), and more.

 

(0:00) - Intro

(5:54) - The birth of Dolly

(12:03) - Data curation at Databricks

(15:34) - Advice for building LLMs

(24:10) - The future of instruction-tuning datasets

(30:43) - UI innovation

(38:16) - The future of machine learning infrastructure

(42:05) - How SkipFlag would be different with the tools we have today

(47:01) - What Mike has learned since Dolly

 

With your co-hosts:

@jasoncwarner

- Former CTO GitHub, VP Eng Heroku & Canonical

@ericabrescia

- Former COO Github, Founder Bitnami (acq’d by VMWare)

@patrickachase

- Partner at Redpoint, Former ML Engineer LinkedIn

@jacobeffron

- Partner at Redpoint, Former PM Flatiron Health