Episode 1358

DBT: Data Build Tool with Tristan Handy

Software Engineering Daily · softwareengineeringdaily.com

March 9, 2020 · 1h 2m

Show Notes

A data warehouse provides low-latency queries over high volumes of data. It is often one stage of a data pipeline, which moves data through different areas of infrastructure to feed applications such as machine learning models, dashboards, and reports.

Modern data pipelines are often associated with the term “ELT”, or Extract, Load, Transform. In the ELT workflow, data is extracted from a source such as a data lake, loaded into a data warehouse, and then transformed within the data warehouse to create materialized views on the data. Data warehouse queries are usually written in SQL, which has been the primary language for these kinds of queries for nearly 50 years.
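As a rough illustration of the ELT ordering described above, here is a minimal Python sketch. It assumes an in-memory SQLite database as a stand-in for the data warehouse and a plain table as a stand-in for a materialized view (SQLite has none). The table names, column names, and sample rows are hypothetical and do not come from the episode.

```python
import sqlite3

# Extract: hypothetical raw records standing in for data pulled from a source.
raw_orders = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "alice", 7.5),
]

# In-memory SQLite stands in for the data warehouse (an assumption for this sketch).
warehouse = sqlite3.connect(":memory:")

# Load: copy the raw rows into the warehouse without transforming them first.
warehouse.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount REAL)")
warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: build a derived table inside the warehouse using SQL.
# A real warehouse might use a materialized view here instead of a table.
warehouse.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT customer, SUM(amount) AS total_amount
    FROM raw_orders
    GROUP BY customer
    """
)

totals = warehouse.execute(
    "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)
```

The point of the ordering is that the transformation runs as SQL inside the warehouse, after the raw data has already landed there, rather than in a separate processing step before loading.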

DBT is a system for data modeling that lets the user write queries that mix SQL with a templating language called Jinja. Jinja allows the analyst to blend imperative constructs such as loops and conditionals into otherwise declarative SQL. Tristan Handy is the CEO of Fishtown Analytics, the company that created DBT, and he joins the show to discuss how DBT works and the role it plays in modern data infrastructure.
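To make the mix of SQL and Jinja concrete, the following sketch renders a templated query with the `jinja2` Python library directly. It is not DBT's own workflow or API. It only shows the general idea of a Jinja loop expanding into plain SQL. The `payments` table, its columns, and the list of payment methods are hypothetical.

```python
from jinja2 import Template

# A SQL query with an imperative Jinja loop embedded in it.
# Table and column names are hypothetical, chosen only for illustration.
MODEL_TEMPLATE = """
select
    order_id,
    {% for method in payment_methods -%}
    sum(case when payment_method = '{{ method }}' then amount else 0 end) as {{ method }}_amount
    {%- if not loop.last %},{% endif %}
    {% endfor -%}
from payments
group by order_id
"""

# Hypothetical values the loop iterates over.
payment_methods = ["credit_card", "bank_transfer"]

# Rendering expands the loop into one aggregate column per payment method,
# leaving plain SQL that a warehouse could execute.
rendered = Template(MODEL_TEMPLATE).render(payment_methods=payment_methods)
print(rendered)
```

Adding a third payment method to the list would add a third column to the query without any hand-written SQL, which is the kind of repetition the templating layer is meant to remove.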