183: Why Modern Data Quality Must Move Beyond Traditional Data Management Practices with Chad Sanderson of Gable.ai

This week on The Data Stack Show, Eric and Kostas chat with Chad Sanderson, CEO of Gable.ai. During the episode, Chad discusses the complexities of managing the data supply chain, emphasizing the importance of data quality, feedback loops, and aligned incentives within organizations. He shares his journey from analyst to data infrastructure leader at companies like Oracle, Sephora, and Microsoft, and introduces his company, Gable, which tackles upstream data quality issues. He critiques traditional data catalogs and advocates for a more dynamic, decentralized approach. The conversation also explores the role of metadata, integrating data quality checks into the software development lifecycle, the need for a cultural shift toward data responsibility, the significance of full lineage graphs and semantic metadata, treating data as a product with quality gates, and more.

The Data Stack Show

March 27, 2024 (1h 2m)

Show Notes

Highlights from this week’s conversation include:

  • Chad’s Background and Journey in Data (0:46)
  • Importance of Data Supply Chain (2:19)
  • Challenges with Modern Data Stack (3:28)
  • Comparing Data Supply Chain to Real-world Supply Chains (4:49)
  • Overview of Gable.ai (8:05)
  • Rethinking Data Catalogs (11:42)
  • New Ideas for Managing Data (15:16)
  • Data Discovery and Governance Challenges (18:51)
  • Static Code Analysis and AI Impact on Data (24:55)
  • Creating Contracts and Defining Data Lineage (27:31)
  • Data Quality Issues and Upstream Problems (32:32)
  • Challenges with Third-Party Vendors and External Data (34:29)
  • Incentivizing Engineers for Data Quality (40:28)
  • Feedback Loops and Actionability in Data Catalogs (45:30)
  • Missing Metadata (48:57)
  • Role of AI in Data Semantics (50:27)
  • Data as a Product (54:26)
  • Slowing Down to Go Faster (57:38)
  • Quantifying the Cost of Data Changes (1:01:24)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week, we’ll talk to data engineers, analysts, and data scientists about their experience building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

