The Data Stack Show

502 episodes — Page 5 of 11

168: Decoding Data Mesh: Principles, Practices, and Real-World Applications Featuring Paolo Platter, Zhamak Dehghani, and Melissa Logan

Highlights from this week’s conversation include:
Defining data mesh (6:37)
Addressing the scale of organizational complexity and usage (9:04)
The shift from monolithic to microservices (12:24)
The sociological structure in data mesh (13:59)
Data product generation and sharing in data mesh (17:27)
Data Mesh: Simplifying Data Work (24:09)
Getting Started with Data Mesh (29:14)
Building products for Data Mesh (36:42)
Building a customizable and extensible platform to shape data practice (39:28)
The characteristics of a data product (48:40)
Defining what a data product is not (50:45)
The origin of the term "mesh" in data mesh (53:32)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Dec 13, 2023 · 56 min

The PRQL: A Data Mesh Deep Dive with Paolo Platter, Zhamak Dehghani, and Melissa Logan

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation regarding Data Mesh with Paolo Platter, Zhamak Dehghani, and Melissa Logan.

Dec 11, 2023 · 3 min

167: Data-Driven Investing and Company Building with Ben Miller of Fundrise

Highlights from this week’s conversation include:
Ben’s background in real estate (3:27)
Why Fundrise was Started (4:37)
Democratizing Investment Opportunities (6:35)
Investment Thesis for Venture (11:55)
Challenges with Data and Technology (12:34)
Importance of Data Model Abstraction (20:03)
Data Infrastructure and Investments (23:22)
Evolution of Data Engineering (25:12)
Closing the Tooling Gap (34:23)
The user base segmentation (36:28)
The emotional reality of investment decisions (40:50)
Data inputs for real estate investment (47:07)
The work of data infrastructure (48:28)
The limitations of underwriting analysis (49:36)
Improving accuracy with data infrastructure (52:43)

Dec 6, 2023 · 57 min

The PRQL: Fundrise's Data-Driven Approach to Investment in Real Estate and Tech with Ben Miller

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Ben Miller of Fundrise.

Dec 4, 2023 · 3 min

166: Data Processing Fundamentals and Building a Unified Execution Engine Featuring Pedro Pedreira of Meta

Highlights from this week’s conversation include:
The concept of composable at a lower level of data infrastructure (1:28)
New architectures and components that allow developers to build databases (3:44)
Pedro's background and experience in data infrastructure (6:18)
The Spectrum of Latency and Analytics (12:59)
Different Query Engines for Different Use Cases (16:32)
Vectorized vs Code Gen Data Processing (19:33)
Vectorization and Code Generation (21:21)
Examples of Vectorized Engines (24:33)
Rewriting Execution Engine in C++ (27:22)
Different Organization of Presto and Spark (33:17)
Arrow and its Extensions (37:15)
The similarities between analytics and ML (44:33)
Offline feature engineering and data preprocessing for training (48:00)
Dialect and semantic differences in using Velox for different engines (50:01)
The convergence of dialects (52:23)
Challenges of substrate and semantics (53:18)
Future plans for Velox (58:09)
The discussion on evolving Parquet (1:03:38)
The integration of the relational model and the tensor model (1:07:29)

Nov 29, 2023 · 1h 12m

The PRQL: How Does Composability in Data Infrastructure Differ at Different Levels of Abstraction? Featuring Pedro Pedreira of Meta

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Pedro Pedreira of Meta.

Nov 27, 2023 · 6 min

165: SQL Queries, Data Modeling, and Data Visualization with Colin Zima of Omni

Highlights from this week’s conversation include:
Colin's Background and Starting Omni (1:48)
Defining “good” at Google search early in his career (4:42)
Looker's Unique Approach to Analytics (9:48)
The paradigm shift in analytics (10:52)
The architecture of Looker and its influence (12:04)
Combatting the challenge of unbundling in the data stack (14:26)
The evolution of analytics engineering (21:50)
Enhancing user flexibility in Omni (23:44)
The evolution of BI tools (32:53)
What does the future look like for BI tools? (35:14)
The role of Python and notebooks in BI (39:48)
The product experience of Omni and its vision (45:27)
Expectations for the future of Omni (47:52)
The relationship between algorithms and business logic (50:51)

Nov 22, 2023 · 54 min

The PRQL: Building a Data Product for Data People: Looker's Vision and Omni's Future with Colin Zima

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Colin Zima of Omni.

Nov 20, 2023 · 1 min

164: How The GTM and Data Teams at Snowflake Work Together with Travis Henry and Hillary Carpio

Highlights from this week’s conversation include:
The Unique Perspective of Practitioners (2:10)
Account-based Marketing (6:30)
Sales Development Representatives (SDR) (8:05)
Descriptive, People, and Engagement Data (11:38)
Data Overload and Actionable Data (14:20)
Working with Data Teams and Internal Data (17:52)
The relationship between business and data teams (22:27)
The importance of collaboration between marketing and data teams (24:17)
Travis and Hillary writing a book (25:33)
The taxonomy of personas (34:23)
Bucketing and grouping people in data systems (35:37)
Account-based marketing and sales alignment (39:00)
The data-driven approach and reliance on technology (44:25)
Managing complexity in data and account-based marketing (45:35)
Adapting to change and evolving data artifacts (51:58)
The importance of understanding the business (54:58)
Collaboration between data and go-to-market teams (55:56)

Nov 15, 2023 · 56 min

The PRQL: Navigating the World of Data Overload with Travis Henry and Hillary Carpio of Snowflake

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Travis Henry and Hillary Carpio of Snowflake.

Nov 13, 2023 · 4 min

163: Simplifying Real-Time Streaming with David Yaffe and Johnny Graettinger of Estuary

Highlights from this week’s conversation include:
Johnny and David’s background in working together (1:56)
The background story of Estuary (4:15)
The challenges of ad tech and the need for low latency (5:44)
Use cases for moving data at scale (10:35)
Real-time data replication methods (11:54)
Challenges with Kafka and the birth of Gazette (13:54)
Comparing Kafka and Gazette (20:22)
The importance of existing streaming tools (22:28)
Challenges of managing Kafka and the need for a different approach (23:40)
The role of compaction in streaming applications (26:54)
The challenge of relaxing state management (34:01)
Replication and the problem of data synchronization (36:48)
Incremental Back Fills and Risk-Free Production Database (46:03)
Estuary as a Platform and Connectors (47:45)
The challenges of real-time streaming (57:56)
Orchestration in real-time streaming (1:00:51)

Nov 8, 2023 · 1h 3m

The PRQL: The Shortcomings of Apache Kafka with David Yaffe and Johnny Graettinger of Estuary

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with David Yaffe and Johnny Graettinger of Estuary.

Nov 6, 2023 · 3 min

162: Accelerating Enterprise AI Transformation With Open Source LLMs Featuring Mark Huang of Gradient

Highlights from this week’s conversation include:
The potential of AI-driven applications (1:34)
The need for hardware infrastructure in AI experimentation (2:40)
Oligopoly on the closed side (11:50)
Advantages of private side vs. open source (13:18)
Leveraging valuable data within enterprises (16:00)
The urgency of adopting LLMs in the enterprise (24:02)
Expansion of LLMs into new business verticals (25:06)
The challenges of operationalizing LLMs (29:32)
Seamless experience with OpenAI (37:29)
Operationalizing with Gradient (38:36)
The early genesis of Gradient (48:53)
The democratization of AI through endpoints (51:44)
What is the future of language models? (54:07)

Nov 1, 2023 · 57 min

The PRQL: How LLMs are Transforming Enterprise Workflows with Mark Huang of Gradient

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Mark Huang of Gradient.

Oct 30, 2023 · 3 min

161: The Intersection of Generative AI and Data Infrastructure with Chang She of LanceDB

Highlights from this week’s conversation include:
Chang’s background and journey with Pandas (6:26)
The persisting challenges in data collection and preparation (10:37)
The resistance to change in using Python for data workflows (13:05)
AI hype and its impact (14:09)
The success and evolution of Pandas as a data framework (20:04)
The vision for a next-generation data infrastructure (26:48)
LanceDB's file and table format (34:35)
Trade-Offs in Lance Format (42:45)
Introducing the Vector Database (46:30)
The split between production and serving databases (51:14)
The importance of unstructured data and multimodal use cases (57:01)
The potential of generative AI and the balance between value and hype (1:01:34)
Changing expectations of interacting with information systems (1:13:53)
Final thoughts and takeaways (1:15:32)

Oct 25, 2023 · 1h 21m

The PRQL: How Did Pandas Become a Data Science Powerhouse? Featuring Chang She of Eto Labs

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Chang She of Eto Labs.

Oct 23, 2023 · 4 min

160: Closing the Gap Between Dev Teams and Data Teams with Santona Tuli of Upsolver

Highlights from this week’s conversation include:
Santona’s journey from nuclear physics to data science (4:59)
The appeal of startups and wearing multiple hats (8:12)
The challenge of pseudoscience in the news (10:24)
Approaching data with creativity and rigor (13:22)
Challenges and differences in data workflows (14:39)
Schema Evolution and Quality Problems (27:01)
Real-time Data Monitoring and Anomaly Detection (30:34)
The importance of data as a business differentiator (35:48)
The SQL job creation process (46:25)
Different options for creating solver jobs (47:20)
Adding column-level expectations (50:17)
Discussing the differences of working with data as a scientist and in a startup (1:00:18)
Final thoughts and takeaways (1:04:01)

Oct 18, 2023 · 1h 5m

The PRQL: The Intersection of Physics, Data Science, and Product Development with Santona Tuli of Upsolver

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Santona Tuli of Upsolver.

Oct 16, 2023 · 5 min

159: What Is a Vector Database? Featuring Bob van Luijt of Weaviate

Highlights from this week’s conversation include:
How music impacted Bob’s data journey (3:16)
Music’s relationship with creativity and innovation (11:38)
The genesis of Weaviate and the idea of vector databases (14:09)
The joy of creation (19:02)
OLAP Databases (22:21)
The progression of complexity in databases (24:31)
Vector database (29:23)
Scaling suboptimal algorithms (34:34)
The future of vector space representation (35:51)
Databases role in different industries (39:14)
The brute force approach to discovery (45:57)
Retrieval augmented generation (51:26)
How generative model interacts with the database (57:55)
Final thoughts and takeaways (1:03:20)

Oct 11, 2023 · 1h 8m

The PRQL: Enhancing Search and Recommendation Systems with Vector Databases with Bob van Luijt of Weaviate

bonus

In this bonus conversation, Eric and Kostas preview their upcoming conversation with Bob van Luijt of Weaviate.

Oct 9, 2023 · 5 min

158: The Orchestration Layer as the Data Platform Control Plane With Nick Schrock of Dagster Labs

Highlights from this week’s conversation include:
Nick’s background and journey in data (2:28)
Founding Dagster Labs (7:50)
The evolution of data engineering (12:32)
Fragmentation in data infrastructure (15:04)
The role of orchestration in data platforms (19:53)
The importance of operational tools for data pipelines (25:01)
Lessons learned from working with GraphQL (26:19)
The role of the orchestrator in data engineering (34:51)
The boundaries between data infrastructure and product engineering (37:33)
Different orchestrators in the data infrastructure landscape (42:03)
The role of MLOps in data engineering (46:04)
Data Quality and Orchestration (51:04)
Future of Data Teams and Orchestration (54:27)
Final thoughts and takeaways (58:01)

Oct 4, 2023 · 1h 2m

The PRQL: The Power of Data Orchestration: A Game-Changer for Data Infrastructure, Featuring Nick Schrock of Dagster Labs

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Nick Schrock of Dagster Labs.

Oct 2, 2023 · 3 min

157: From Search Engine to Answer Engine Using Grounded Generative AI, Featuring Amr Awadallah of Vectara

Highlights from this week’s conversation include:
Amr’s extensive background in data (3:23)
The evolution of neural networks (9:21)
The role of supervised learning in AI (11:17)
Explaining Vectara (13:07)
Papers that laid the foundation for AI (15:02)
Contextualized translation and personalization (20:07)
Ease of use and answer-based search (25:01)
AI and potential liabilities (35:54)
Minimizing difficulties in large language models (36:43)
The process of extracting documents in multidimensional space (44:47)
Summarization process (46:33)
The danger of humans misusing technology (54:59)
Final thoughts and takeaways (57:12)

Sep 27, 2023 · 1h 3m

The PRQL: How Can Large Language Models Revolutionize Decision-Making? Featuring Amr Awadallah of Vectara

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Amr Awadallah of Vectara.

Sep 25, 2023 · 5 min

156: Simple, Performant, Cost-effective Data Streaming with Alex Gallego of Redpanda Data

Highlights from this week’s conversation include:
Alex’s background in the data space and the creation of Redpanda (4:23)
The cost and complexity of streaming (11:07)
The evolution of storage with Kafka (12:04)
The distinction between streaming technologies (15:10)
Simplicity as a Core Design Principle (27:03)
Cost Efficiency in a Cloud Native Era (30:44)
Removing complexity with Redpanda (34:21)
Migrations and compatibility with Redpanda (40:35)
The Future of Redpanda (43:44)
The Story Behind Redpanda (46:45)
Final thoughts and takeaways (50:25)

Sep 20, 2023 · 54 min

The PRQL: Redpanda: Revolutionizing Streaming Systems and Challenging the Kafka Status Quo with Alex Gallego

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Alex Gallego of Redpanda.

Sep 18, 2023 · 3 min

155: Bringing Innovation to Enterprise Resource Planning with Emilie Schario of Turbine

Highlights from this week’s conversation include:
Emilie’s background and journey in data (3:42)
The problem of three-way match (8:56)
Operational workflows and how data stacks solve them (13:16)
Turbine’s solution as a lightweight ERP (14:05)
Workflows and analytics (14:59)
Consolidating information into helpful application (27:41)
Challenges in operational workflows (32:19)
Friction and hurdles in ERP usage (39:28)
A solution for purchase order management (40:47)
Turbine’s focus and limitations (45:26)
Building a software that gets out of the way (52:51)
Final thoughts and takeaways (54:25)

Sep 13, 2023 · 1h 1m

The PRQL: Making ERP Systems More User-Friendly with Emilie Schario of Turbine

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Emilie Schario of Turbine.

Sep 11, 2023 · 6 min

154: Making Cross-Company Data Exchange Easy with Pardis Noorzad of General Folders

Highlights from this week’s conversation include:
Pardis’ background and journey in data (3:24)
AI before the hype (8:37)
Founding General Folders (12:36)
Data collaboration challenges (15:31)
Examples of data sharing (17:40)
Data transfer in various industries (22:16)
Defining the transfer problem (28:30)
The demand for scalable solutions (32:06)
Data transfer and model exposition (41:02)
Data governance and API (43:23)
Final thoughts and takeaways (56:48)

Sep 6, 2023 · 1h 2m

The PRQL: Simplifying Data Collaboration with Pardis Noorzad of General Folders

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Pardis Noorzad of General Folders.

Sep 4, 2023 · 5 min

153: The Future of Data Science Notebooks with Jakub Jurových of Deepnote

Highlights from this week’s conversation include:
Jakub’s journey into data and working with notebooks (2:43)
Overview of Deepnote and its features (7:22)
Notebook 1.0 and 2.0 (14:04)
Notebook 3.0 and its potential impact (15:46)
The need for collaboration across organizations (17:16)
Real-time, asynchronous, and organizational collaboration (28:02)
Challenges to collaboration (32:03)
Notebooks as a universal computational medium (36:14)
The rise of exploratory programming (41:40)
The power of natural language interface (43:04)
The evolving grammar of using notebooks (47:02)
Final thoughts and takeaways (55:50)

Aug 30, 2023 · 59 min

The PRQL: Exploring the Evolution of Notebooks with Jakub Jurových of Deepnote

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Jakub Jurových of Deepnote.

Aug 28, 2023 · 4 min

152: Three Steps To Enhance Product Analytics with Ken Fine of Heap

Highlights from this week’s conversation include:
Ken’s background and journey to Heap (2:32)
Heap’s problem-solving approach (8:19)
Auto-capture and its significance in the marketplace (13:03)
Providing qualitative context: sessions and surveys (16:23)
Collection and storage of data (25:42)
Challenges of real-time data collection (26:40)
The true gap in the market today (37:39)
Consolidation and aggregation of data solutions (41:58)
Simplifying the data stack (47:32)
A different approach in engineering and software development (51:12)
Skills and Stages in Company Growth (55:58)
Final thoughts and takeaways (1:02:52)

Aug 23, 2023 · 1h 7m

The PRQL: Auto Tracking in Product Analytics with Ken Fine of Heap

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Ken Fine of Heap.

Aug 21, 2023 · 4 min

151: How To Unlock the Data Warehouse for Marketing with Chris Sell of GrowthLoop

Highlights from this week’s conversation include:
The need for reverse ETL in marketing (2:24)
Closing the gap between engineering, data, and marketing teams (8:37)
The analytics persona’s opportunity (11:53)
Interface layer (13:06)
Approach to messy warehouse data (15:57)
The need for a complicated infrastructure (28:43)
Challenges in data integration for marketers (29:26)
The evolution of the analytics stack (31:53)
Orchestration of the data warehouse (38:39)
The role of marketing tools (40:35)
Generating custom assets (46:27)
The shift towards making data processes easier (48:13)
Final thoughts and takeaways (49:23)

Aug 16, 2023 · 53 min

The PRQL: How Can Reverse ETL Revolutionize Marketing Data Management? Featuring Chris Sell of GrowthLoop

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Chris Sell of GrowthLoop.

Aug 14, 2023 · 3 min

150: How Salespeople Use Data, Salesforce vs. Snowflake, and How LLMs Are Transforming Sales with Brendan Short of Groundswell

Highlights from this week’s conversation include: Brendan’s background and journey to Groundswell (2:25); The impact of generative AI on sales reps and product building (5:38); Lead sourcing challenges (12:22); Salesforce as a data model (14:30); The need for guardrails in building applications around sales (24:37); The question of interfaces in the layers of Salesforce (26:11); A UI solution for sales and marketing (30:45); The future of logic and machine learning models (37:11); The battle for data ownership (39:36); Actioning data and the role of refineries (46:03); The potential for decentralized systems using generative AI (46:59); Product building for the future (57:47); Building trust in data tools (59:10); The era of innovation (1:09:20); Final thoughts and takeaways (1:10:43).

Aug 9, 2023 · 1h 12m

The PRQL: Generative AI Transforming the Sales Process Featuring Brendan Short of Groundswell

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Brendan Short of Groundswell.

Aug 7, 2023 · 5 min

149: Turning Tables Into APIs for Real-time Data Apps, Featuring Matteo Pelati and Vivek Gudapuri of Dozer

Highlights from this week’s conversation include: Building Dozer: Simplifying Data Sources into APIs (1:13); Bridging Data Engineering with Application Engineering (4:19); Turning Data Sources into APIs (7:46); The cost of caching (12:59); Challenges with legacy systems (14:30); Real-time data integration (19:31); YAML and SQL experience (25:37); Behind the scenes of Dozer (29:18); Heavy Workloads and Low Latency (42:00); Use Cases of Dozer (45:51); Reliability and storing data from different connectors (51:35); Importance of observability in serving data to customers (53:24); Final thoughts and takeaways (56:34).

Aug 2, 2023 · 1h 3m

The PRQL: Turning Data Into an API with Matteo Pelati and Vivek Gudapuri of Dozer

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Matteo Pelati and Vivek Gudapuri of Dozer.

Jul 31, 2023 · 6 min

148: Exploring the Intersection of DAGs, ML Code, and Complex Code Bases: An Elegant Solution Unveiled with Stefan Krawczyk of DAGWorks

Highlights from this week’s conversation include: Stefan’s background in data (2:39); What is DAGWorks? (3:55); How building point solutions influenced Stefan’s journey (5:03); Solving the tooling problems of self-service at an organization (11:44); Creating Hamilton (15:53); How Hamilton works with definitions and time-series data (19:34); What makes Hamilton an ML-oriented framework? (23:39); Navigating the differences between ML teams and other data teams (26:27); Understanding the fundamentals of Hamilton (28:25); Dealing with types and conflicts in programming (33:18); How Hamilton helps improve pipelines and maintain data (37:11); Why unit testing is important for a data scientist (44:54); The ups and downs of founding and building a data solution (46:32); Connecting with DAGWorks and trying out Hamilton (50:01); Final thoughts and takeaways (52:46).

Jul 26, 2023 · 57 min

The PRQL: A Methodology for Better DAGs with Stefan Krawczyk of DAGWorks

bonus

Jul 24, 2023 · 4 min

Shop Talk: Snowflake Summit Recap

bonus

Jul 21, 2023 · 20 min

147: Where Data and Infrastructure Converge Featuring Lars Kamp of Resoto

Highlights from this week’s conversation include: Lars’s work on Resoto in helping to cut cloud costs for organizations (2:02); The trend from large resources to micro resources (5:59); What are some of the typical resource drains in data infrastructure? (8:56); Managing cost on the backend with scale and experimentation (12:51); Solutions for resource management problems (17:38); How Resoto is solving pain points in resource management (26:17); Navigating the complexities of data infrastructure (29:01); Resoto’s solution for interpreting difficult cloud data products (36:35); Exploring relationships of data points and finding solutions (43:40); Querying in a graph database (47:46); How to go from graph to SQL (49:13); How can data teams plan for costs in the coming years? (50:53); Final thoughts and takeaways (53:49).

Jul 19, 2023 · 58 min

The PRQL: Cloud Resource Management Is a Data Problem Featuring Lars Kamp of Resoto

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Lars Kamp of Resoto.

Jul 17, 2023 · 3 min

146: What Is a Customer Data Platform? Featuring Soumyadeb Mitra of RudderStack

Highlights from this week’s conversation include: Soumyadeb’s background and journey in data (5:49); Defining customer data (8:10); The complexity of customer data collection (10:04); What is a CDP and how it is properly deployed (17:12); Bridging the gap between data collection and useful analytics for marketing (21:46); How RudderStack translates data and the new profile feature (25:30); The foundations of data in building a 360-degree customer profile (30:30); Solutions for the intersection between engineering and business users (34:35); How AI and other future technologies will impact data (41:14); Final thoughts and takeaways (46:30).

Jul 12, 2023 · 51 min

The PRQL: Building Data Products for Multiple Personas with Soumyadeb Mitra of RudderStack

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Soumyadeb Mitra of RudderStack.

Jul 10, 2023 · 4 min

145: What is Synthetic Data? Featuring Omar Maher of Parallel Domain

Jul 5, 2023 · 58 min

The PRQL: Synthetic Data and Self Driving Cars with Omar Maher of Parallel Domain

bonus

Jul 3, 2023 · 5 min

144: Explaining Features, Embeddings, and the Difference Between ML and AI with Simba Khadder of Featureform

Highlights from this week’s conversation include: Simba’s background in the data space (3:05); Subscription intelligence (6:41); ML and Distributed Systems (9:09); The Brutal Subscription Industry (12:31); Serendipity in Recommender Systems (16:31); Subscription as a Strategy (20:47); Customizing Content for Subscribers (22:19); Creating User Embeddings (25:53); Building Featureform (28:01); Embedding Projections (32:47); Spaces and similarity (35:53); User embeddings and transformer models (38:22); Vector Databases for AI/ML (45:05); Orchestrating Transformations in Featureform (51:00); Impact of new technologies on feature stores (56:17); Embeddings and the future of ML (59:20); The gap between ML and business logic (1:02:26); Final thoughts and takeaways (1:06:37).

Jun 28, 2023 · 1h 11m