The Data Stack Show

502 episodes — Page 4 of 11

The PRQL: From Programming Tic Tac Toe to Building an Operating System for Natural Language Programs With Binny Gill of Kognitos

bonus

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Jun 3, 2024 · 2 min

191: From Amazon to Consulting: Time Series Forecasting and How to Communicate Data Analytics Insights with David McCandless of McCandless Consulting

Highlights from this week’s conversation include:
David's Background and Journey in Data (0:30)
Transition to Time Series Forecasting (2:03)
Working on Time Series Forecasting at Amazon (2:55)
Challenges and Experience in Time Series Forecasting (4:32)
Transitioning to a New Role at Amazon (5:52)
Tools and Methods for Time Series Forecasting (8:17)
Forecasting Impact and Accuracy (15:30)
Explaining Variance and Lessons Learned (18:58)
Understanding Downstream Consumers and Empathy for Business Leaders (20:36)
Amazon's Culture and Decision-Making Process (24:27)
Assimilating into Amazon's Culture (26:04)
Interpreting Data for Business Stakeholders (28:34)
Consulting for Small Businesses (30:28)
Challenges in Automation and Maintenance (32:18)
Analyzing Financial Metrics for Small Businesses (34:51)
Tooling and Data Solutions for Small Businesses (39:52)
Empowering Small Businesses with Data (46:02)

May 29, 2024 · 49 min

The PRQL: Practical Applications for Time Series Forecasting with David McCandless of McCandless Consulting

bonus

May 28, 2024 · 2 min

190: Aligning Data Teams and Data Tools With Business Needs Featuring Ben Rogojan, the Seattle Data Guy

Highlights from this week’s conversation include:
Ben’s background and journey in data (0:18)
Relating data to business outcomes (2:33)
Facebook's approach to data-driven business outcomes (4:43)
Subjectivity and data-driven business outcomes (8:43)
Infrastructure and data collection at Facebook (12:04)
The importance of first-party data and the death of third-party cookies (16:27)
Facebook's Data and Attribution Challenges (20:08)
Facebook's Infrastructure and Tooling (23:41)
Differences in Data Approaches (28:26)
Challenges of Data Project Alignment with Business Outcomes (32:58)
Integration of Data into Tools and Partnerships (35:12)
Building Alliances with Embedded Data Analysts (38:08)
Budgeting for Data Teams (40:02)
Healthy Team Dynamics and Budgeting (44:18)
Data Team Reporting Structure (46:23)
Connecting with Ben and More Content (50:55)

May 22, 2024 · 52 min

The PRQL: Data Success From Mid-market to Enterprise with Ben Rogojan

bonus

May 20, 2024 · 2 min

189: Customer Data Modeling, The Data Warehouse, Reverse ETL, and Data Activation with Ryan McCrary of RudderStack

Highlights from this week’s conversation include:
Ryan's Background and Roles in Data (0:05)
Data Activation and Dashboard Staleness (1:27)
Profiles and Data Activation (2:54)
Customer-Facing Experience and Product Management (3:40)
Profiles Product Overview (5:10)
Use Cases for Profiles (6:44)
Challenges with Data Projects (9:19)
Entity Management and Account Views (15:33)
Handling Entities and Duplicates (17:55)
Challenges in Entity Management (22:18)
Product Management and Data Solutions (26:08)
Reverse ETL and Data Movement (31:58)
Accessibility of Data Warehouses (36:14)
Profiles and Entity Features (37:47)
Cohorts Creation and Use Cases (41:17)
Customer Data and Targeting (43:09)
Activations and Reverse ETL (45:57)
ML and AI Use Cases (55:53)
Data Activation and ML Predictions (57:02)
Spicy Take and Future Product Features (59:47)
ETL Evolution and Cloud Tools (1:00:50)
Unbundling and Future Trends (1:02:10)

May 16, 2024 · 1h 3m

The PRQL: How to Get Business Teams Closer to Customer Data (The Right Way)

bonus

May 13, 2024 · 3 min

188: How To Invest in Data Infrastructure and Data Projects That Create Business Value with Matthew Kelliher-Gibson of RudderStack

Highlights from this week’s conversation include:
Matt KG’s Background in Data (0:35)
Challenges in purchasing data tools (1:28)
Early experiences in data analysis (9:51)
Matt’s Transition to a subprime auto loan company (13:19)
Transition to RudderStack and software purchase decisions (17:36)
Tech Problems: People and Process (22:02)
Challenges in Purchasing Data Tools (22:55)
Budget Constraints and Purchasing Decisions (24:46)
Challenges with Platform Documentation (26:55)
Metrics and Cost Efficiency (30:11)
Risk and Conviction in Purchasing Decisions (32:53)
Justification and Value Creation (38:17)
Connecting Data to Business Value (42:03)
Navigating Business Relationships (46:25)
Empowering Analysts (49:54)
Relational Capital and Team Competency (51:29)
Final thoughts and takeaways (54:16)

May 8, 2024 · 56 min

The PRQL: Navigating the Procurement Process for Data Infrastructure Tooling With Matthew Kelliher-Gibson of RudderStack

bonus

May 6, 2024 · 2 min

187: Startup Lessons and Torch Passing with Kostas Pardalis

Highlights from this week’s conversation include:
Kostas Passes the Baton as Co-Host of the Podcast (0:24)
Reflecting on the Podcast (2:56)
New Co-Host John Wessel and His Background in Data (4:34)
Kostas' Journey in Data (10:55)
RudderStack's Explosive Growth (21:28)
The Podcast's Inception and Marketing Activities (24:19)
Evolution of the podcast (27:22)
Memorable guests and experiences (28:29)
Connecting with industry leaders and key innovators in the space (33:05)
Kostas' new venture (36:26)
Advice for the new co-host (42:17)
Final Thoughts and Takeaways (44:47)

May 1, 2024 · 46 min

The PRQL: Why Is Kostas a Guest on His Own Podcast?

bonus

Apr 29, 2024 · 3 min

186: Data Fusion and The Future Of Specialized Databases with Andrew Lamb of InfluxData

Highlights from this week’s conversation include:
The Evolution of Data Systems (0:47)
The Role of Open Source Software (2:39)
Challenges of Time Series Data (6:38)
Architecting InfluxDB (9:34)
High Cardinality Concepts (11:36)
Trade-Offs in Time Series Databases (15:35)
High Cardinality Data (18:24)
Evolution to InfluxDB 3.0 (21:06)
Modern Data Stack (23:04)
Evolution of Database Systems (29:48)
InfluxDB Re-Architecture (33:14)
Building an Analytic System with Data Fusion (37:33)
Challenges of Mapping Time Series Data into Relational Model (44:55)
Adoption and Future of Data Fusion (46:51)
Externalized Joins and Technical Challenges (51:11)
Exciting Opportunities in Data Tooling (55:20)
Emergence of New Architectures (56:35)
Final thoughts and takeaways (57:47)

Apr 24, 2024 · 58 min

The PRQL: Open Source and the Evolution of Data Systems with Andrew Lamb of InfluxData

bonus

Apr 22, 2024 · 3 min

Data Council Week: A Decade of Supporting the Data Community with Pete Soderling

bonus

Highlights from this week’s conversation include:
Pete’s background and the origin story of Data Council (1:04)
Reflecting on 10 years of Data Council (2:07)
Impact of the pandemic on conferences (5:25)
Rebuilding after the pandemic (7:42)
Evolution of Data Council (10:33)
Balancing content and sponsorship (16:17)
Selecting speakers and content at Data Council (19:39)
Highlights from the conference this year (21:58)
Realization of AI Future (22:45)
Embracing AI at Data Council (23:31)
Announcement of Prime Ventures (25:43)
Improving Data Council (27:45)
Trends and Technologies (29:46)
Cautions for Startups (31:47)
Connecting with Data Council and Final Takeaways (33:09)

Apr 18, 2024 · 36 min

Data Council Week: AI Isn’t Just Hype - How To Successfully Apply LLMs Today with Tristan Zajonc of Continual

bonus

Highlights from this week’s conversation include:
Tristan's Background and Journey into Data (1:14)
Evolution of Machine Learning and AI (3:13)
Impact of Generative AI (6:33)
MLOps and Challenges in Early Data Science (8:48)
Success and Applications of AI Today (11:34)
Continual AI Copilot Platform (18:04)
Challenges in building remarkable AI assistants (19:58)
Reliability and accuracy in AI responses (25:31)
Regulation and adoption of AI assistants (31:30)
Future of AI assistants and Continual AI (33:12)

Apr 17, 2024 · 35 min

Data Council Week: How To Do Self-Service Data Analytics and Business Intelligence Right with Ryan Dolley of GoodData

bonus

Highlights from this week’s conversation include:
Ryan’s background in data (0:58)
Transition from Performing Arts to Data (2:23)
Understanding End Users in Data Projects (6:08)
Learning from Failures in Data Projects (8:07)
The self-service era (19:50)
Struggles of self-service (21:23)
The disillusion with dashboards (26:23)
GoodData's approach (30:06)
Merging wisdom with modern approach (31:50)
User experience with GoodData (34:05)
Defining metrics and AI (36:35)
Connecting with Ryan and GoodData (39:26)
Final thoughts and takeaways (41:06)

Apr 15, 2024 · 42 min

185: The Evolution of Data Processing, Data Formats, and Data Sharing with Ryan Blue of Tabular

Highlights from this week’s conversation include:
The Evolution of Data Processing (2:36)
Ryan’s Background and Journey in Data (4:52)
Challenges in Transitioning to S3 (8:47)
Impact of Latency on Query Performance (11:43)
Challenges with Table Representation (15:26)
Designing a New Metadata Format (21:36)
Integration with Existing Tools and Open Source Project (24:07)
Initial Features of Iceberg (26:11)
Challenges of Manual Partitioning (31:49)
Designing the Iceberg Table Format (37:31)
Trade-offs in Writing Workloads (47:22)
Database Systems and File Systems (55:00)
Vendor Influence on Access Controls (1:01:58)
Restructuring Data Security (1:03:39)
Delegating Access Controls (1:07:22)
Column-level Access Controls (1:14:19)
Exciting Releases and Future Plans (1:17:47)
Centralization of Components in Data Infrastructure (1:25:37)
Fundamental Shift in Data Architecture (1:28:28)

Apr 10, 2024 · 1h 29m

The PRQL: The Two Parallel Tracks of Development In Data Processing with Ryan Blue of Tabular

bonus

Apr 8, 2024 · 4 min

184: Kafka Streams and Operationalizing Event Driven Applications with Apurva Mehta of Responsive

Highlights from this week’s conversation include:
Apurva’s background in streaming technology (0:48)
Developer experience and Kafka Streams (2:47)
Motivation to bootstrap a startup (4:09)
Meeting the Confluent founders and early work at Confluent (6:59)
Projects at Confluent and transition to engineering management (10:34)
Overview of Responsive and event-driven applications (12:55)
Defining event-driven applications (15:33)
Importance of latency and state in event-driven applications (18:54)
Low Latency and Stateful Processing (21:52)
In-Memory Storage and Evolution of Kafka (25:02)
Motivation for KSQL and Kafka Streams (29:46)
Category Creation and Database-like Interface (34:33)
Developer Experience with Kafka and Kafka Streams (38:50)
Kafka Streams Functionality and Operational Challenges (41:44)
Metrics and Tuning Configurations (43:33)
Architecture and Decoupling in Kafka Streams (45:39)
State Storage and Transition from RocksDB (47:48)
Future of Event-Driven Architectures (56:30)
Final thoughts and takeaways (57:36)

Apr 3, 2024 · 58 min

The PRQL: Event-Driven Applications: Where Low Latency Meets High Impact with Apurva Mehta of Responsive

bonus

Apr 1, 2024 · 3 min

183: Why Modern Data Quality Must Move Beyond Traditional Data Management Practices with Chad Sanderson of Gable.ai

Highlights from this week’s conversation include:
Chad’s background and journey in data (0:46)
Importance of Data Supply Chain (2:19)
Challenges with Modern Data Stack (3:28)
Comparing Data Supply Chain to Real-world Supply Chains (4:49)
Overview of Gable.ai (8:05)
Rethinking Data Catalogs (11:42)
New Ideas for Managing Data (15:16)
Data Discovery and Governance Challenges (18:51)
Static Code Analysis and AI Impact on Data (24:55)
Creating Contracts and Defining Data Lineage (27:31)
Data Quality Issues and Upstream Problems (32:32)
Challenges with Third-Party Vendors and External Data (34:29)
Incentivizing Engineers for Data Quality (40:28)
Feedback Loops and Actionability in Data Catalogs (45:30)
Missing metadata (48:57)
Role of AI in data semantics (50:27)
Data as a product (54:26)
Slowing down to go faster (57:38)
Quantifying the cost of data changes (1:01:24)

Mar 27, 2024 · 1h 2m

The PRQL: The Data Supply Chain with Chad Sanderson of Gable.ai

bonus

Mar 25, 2024 · 7 min

182: Building a Dynamic Data Infrastructure at Enterprise Scale Featuring Kevin Liu of Stripe

Highlights from this week’s conversation include:
Kevin’s background and work at Stripe (0:31)
Evolution of Data Infrastructure at Stripe (2:18)
Kevin's Interest in Data (5:29)
Software Engineer or Data Engineer? (8:27)
Speech Recognition Work at Amazon (11:06)
Efficiency and Cost Management (15:50)
Metadata and Query Analysis (18:38)
Surprising Discoveries in Metadata Analysis (21:43)
Optimizing Cost and Value (23:55)
Product Sizing Stripe Data (26:39)
Popular Tool for Data Interaction (30:08)
Enabling Data Infrastructure Integration (35:22)
Value of Data Pipelining for Stripe (39:32)
Next Generation Product and Technology (43:54)
Maximizing value in a decentralized environment (51:34)
Future of open source projects in data infrastructure (57:59)
Final thoughts and takeaways (59:02)

Mar 20, 2024 · 1h 0m

The PRQL: Exploring the Intersection of Software Engineering and Data Management with Kevin Liu of Stripe

bonus

Mar 18, 2024 · 6 min

181: OLAP Engines and the Next Generation of Business Intelligence with Mike Driscoll of Rill Data

Highlights from this week’s conversation include:
Michael’s background and journey in data (0:33)
The origin story of Druid (2:39)
Experiences and growth in Data (8:08)
Druid's evolution (21:46)
Druid's architectural decisions (26:32)
The user experience (30:06)
The developer experience (35:14)
The evolution of BI tools (40:55)
Data architecture and integration (47:53)
AI's impact on BI (52:26)
What would Mike be doing if he didn’t work in data? (56:27)
Final thoughts and takeaways (57:02)

Mar 13, 2024 · 59 min

The PRQL: Making the Data Stack Serverless in the Cloud with Mike Driscoll of Rill Data

bonus

Mar 11, 2024 · 5 min

180: Data Observability and AI for Data Operations Featuring Kunal Agarwal of Unravel Data

Highlights from this week’s conversation include:
The evolution of data operations (1:13)
Unravel's role in simplifying data operations (2:17)
Kunal’s journey from fashion to enterprise data management (5:23)
The Unravel platform and its components (10:08)
Challenges in data operations at scale (16:34)
Users of Unravel within an organization (22:32)
Calculating ROI on data products (25:55)
Understanding the cost of data operations (27:01)
Measuring productivity and reliability (30:59)
Diversity of technologies in data operations (34:52)
Efficiency in cost management (44:15)
Implementing observability in AI (47:55)
Challenges of AI Adoption (50:17)
Final thoughts and takeaways (51:36)

Mar 6, 2024 · 53 min

The PRQL: What’s Driving The Evolution of Data Operations? Featuring Kunal Agarwal of Unravel Data

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Kunal Agarwal of Unravel Data.

Mar 4, 2024 · 5 min

179: Time Series Data Management and Data Modeling with Tony Wang of Stanford University

Highlights from this week’s conversation include: Tony's background and research focus (3:35); Challenges in academia and industry (6:15); Ph.D. student's routine (10:47); Academic paper review process (15:26); Aha moments in research (20:05); Academic lab structure (23:09); The decision to move from hardware to data research (24:43); Research focus on time series data management (27:40); Data modeling in time series and OLAP systems (32:01); Issues and potential solutions for parquet format (37:32); Role of external indices in parquet files (42:19); Tony's open source project (47:11); Final thoughts and takeaways (49:30).

Feb 28, 2024 · 50 min

The PRQL: How is Academic Research Shaping the Future of Data Processing Systems? Featuring Tony Wang of Stanford University

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Tony Wang of Stanford University.

Feb 26, 2024 · 3 min

178: How to Build a Data Stack to Win PLG, Featuring Peter Chapman

Highlights from this week’s conversation include: Peter's background and journey in data (0:26); Introduction to PLG (4:18); Starting in data at Heroku (6:05); Building the data stack at Heroku (8:13); Data stack requirements for early-stage companies (12:00); Differentiating PLG companies from open source companies (19:26); Venture capital and open source as a lever for growth (22:56); Initial data modeling and analysis (25:38); Operationalizing Data (29:16); Sales and Marketing Operationalization (31:52); Identifying Signals (34:16); Challenges in Developing Signals (37:07); Account Management for Developer Tools (42:30); Challenges in Achieving Margins (45:02); Leveraging Infrastructure for Margins (47:35); Inference vs Training (54:55); Final thoughts and takeaways (57:02).

Feb 21, 2024 · 57 min

The PRQL: Building a Future-Proof Data Stack from Day Zero? Featuring Peter Chapman

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Peter Chapman, a GTM consultant.

Feb 19, 2024 · 5 min

177: AI-Based Data Cleaning, Data Labelling, and Data Enrichment with LLMs Featuring Rishabh Bhargava of refuel

Highlights from this week’s conversation include: The overview of refuel (0:33); The evolution of AI and LLMs (3:51); Types of LLM models (12:31); Implementing LLM use cases and cost considerations (15:52); User experience and fine-tuning LLM models (21:49); Categorizing search queries (22:44); Creating internal benchmark framework (29:50); Benchmarking and evaluation (35:35); Using refuel for documentation (44:18); The challenges of analytics (46:45); Using customer support ticket data (48:17); The tagging process (50:18); Understanding confidence scores (59:22); Training the model with human feedback (1:02:37); Final thoughts and takeaways (1:05:48).

Feb 14, 2024 · 1h 7m

The PRQL: Exploring the Evolution of AI and ML with Rishabh Bhargava of refuel

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Rishabh Bhargava of refuel.

Feb 12, 2024 · 4 min

176: The Fundamentals of Event-Driven Orchestration and How Generative AI Is Shaping Its Future with Viren Baraiya of orkes.io

Highlights from this week’s conversation include: Viren’s background in data (0:39); Evolution of Orchestration (1:52); AI Orchestration (3:00); Understanding Conductor and orkes (6:26); Event-Driven Orchestration (8:10); Viren’s Transition to Founder (12:27); Non-Technical Aspects of Being a Founder (15:50); Democratizing AI for Developers (18:16); The evolution of microservices orchestration (21:56); Challenges in appealing to the 99% developer group (24:32); Value of orchestration for developers (30:31); Role of orchestrators in managing faults (37:37); The intersection of AI and orchestration (40:27); Evolution of AI (44:04); Thriving in AI Environment (47:58); Final thoughts and takeaways (51:25).

Feb 7, 2024 · 53 min

The PRQL: The Evolution of Application Orchestration Featuring Viren Baraiya of orkes.io

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Viren Baraiya of orkes.io.

Feb 5, 2024 · 4 min

175: The Parts, Pieces, and Future of Composable Data Systems, Featuring Wes McKinney, Pedro Pedreira, Chris Riccomini, and Ryan Blue

Highlights from this week’s conversation include: Introduction of the panel (0:05); Defining composable data stack (5:22); Components of a composable data stack (7:49); Challenges and incentives for composable components (10:37); Specialization and modularity in data workloads (13:05); Organic evolution of composable systems (17:50); Efficiency and common layers in data management systems (22:09); The IR and Data Computation (23:00); Components of the Storage Layer (26:16); Decoupling Language and Execution (29:42); Apache Calcite and Modular Frontend (36:46); Data Types and Coercion (39:27); Describing Data Sets and Schema (42:00); Open Standards and Frontiers (46:22); Challenges of standardizing APIs (48:15); Trade-offs in building composable systems (54:04); Evolution of data system composability (56:32); Exciting new projects in data systems (1:01:57); Final thoughts and takeaways (1:17:25).

Jan 31, 2024 · 1h 18m

The PRQL: Exploring the Evolution, Challenges, and Benefits of Composable Data Stacks Featuring Wes McKinney, Pedro Pedreira, Chris Riccomini, and Ryan Blue

bonus

In this bonus episode, Eric and Kostas preview their upcoming discussion with a panel of experts: Wes McKinney (Co-Founder, Voltron), Pedro Pedreira (Software Engineer, Meta), Chris Riccomini (Seed Investor, various startups), and Ryan Blue (Co-Founder and CEO, Tabular).

Jan 29, 2024 · 4 min

174: Does Your Data Stack Need a Semantic Layer? Featuring Artyom Keydunov of Cube Dev

Highlights from this week’s conversation include: Artyom’s background in the data space (0:32); The growth and changes at Cube (5:58); Pain points of managing metrics definitions across different tools (9:39); Trade-offs between coupled and decoupled semantic layers (12:12); Making a case for implementing a semantic layer (14:17); The evolution of semantic layers (23:28); Challenges in designing a decoupled semantic layer (24:16); Different approaches to solving the interface problem (26:58); Implementing a SQL engine in Cube (35:58); Overhead and debugging in semantic layers (39:08); The semantic layer and its importance (46:26); The need for semantics in data products (47:34); What’s the future of semantic layers and user experience? (51:49); Final thoughts and takeaways (57:34).

Jan 24, 2024 · 58 min

The PRQL: Why is a Semantic Layer Important in the Modern Data Stack? Featuring Artyom Keydunov of Cube Dev

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Artyom Keydunov of Cube Dev.

Jan 22, 2024 · 3 min

173: Data Analytics Is a Team Sport, Featuring Jay Henderson of Alteryx

Highlights from this week’s conversation include: No Code Analytics (1:22); Analytics as a Team Sport (2:31); The workflow of someone without Alteryx (11:27); Alteryx's ability to handle diverse data sources (14:32); The balance between ease of use and complexity (23:06); Enabling casual end users with a no code interface (24:19); Taking analytics to the data (31:47); The boundaries between data engineers and end users (33:44); The importance of collaboration in analytics (34:12); The potential of every employee being a data worker (35:28); The human nature of the product and users in large enterprises (45:38); Final thoughts and takeaways (46:21).

Jan 17, 2024 · 46 min

The PRQL: Bridging the Gap Between Messy Data and Sophisticated Analytics with Jay Henderson of Alteryx

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Jay Henderson of Alteryx.

Jan 15, 2024 · 3 min

172: How WebAssembly is Enabling the Third Wave of Cloud Compute with Matt Butcher of Fermyon Technologies

Highlights from this week’s conversation include: Matt’s background and journey with Fermyon (2:32); WebAssembly and enhanced security models (3:43); The IoT Startup and Google Acquisition (10:49); Google's Early Containers (11:50); Scaling and anticipating requests (20:22); Introduction to WebAssembly and its importance (23:32); The Benefits of WebAssembly (30:57); Comparison of Virtual Machines, Containers, and Micro VMs (33:12); The Importance of Fast Startup Times in WebAssembly (37:39); Metaphysics and software development (42:12); The importance of effective communication in code development (43:18); The challenges and progress of WebAssembly (47:40); Requirements of different teams and different jobs (52:17); Final thoughts and takeaways (53:14).

Jan 10, 2024 · 56 min

The PRQL: WebAssembly: The Future of Cloud Workloads Made Simple with Matt Butcher of Fermyon Technologies

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Matt Butcher of Fermyon Technologies.

Jan 8, 2024 · 4 min

171: Machine Learning Pipelines Are Still Data Pipelines with Sandy Ryza of Dagster

Highlights from this week’s conversation include: The role of an orchestrator in the lifecycle of data (1:34); Relevance of orchestration in data pipelines (2:45); Changes around data ops and MLOps (3:37); Data Cleaning (11:42); Overview of Dagster (13:50); Assets vs Tasks in Data Pipeline (19:15); Building a Data Pipeline with Dagster (25:40); Difference between Data Asset and Materialized Dataset (28:28); Defining Lineage and Data Assets in Dagster (29:32); The boundaries of software and organizational structures (37:25); The benefits of a unified orchestration framework (39:56); Orchestration in the development phase (45:29); The emergence of analytics engineer role (51:53); Fluidity in data pipeline and infrastructure roles (52:40).

Jan 3, 2024 · 55 min

The PRQL: Does Machine Learning Need Its Own Orchestrator? Featuring Sandy Ryza of Dagster

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Sandy Ryza of Dagster.

Jan 2, 2024 · 3 min

170: Discussing Data Roles and Solving Data Problems with Katie Bauer of GlossGenius

Highlights from this week’s conversation include: The evolution of the data scientist role (1:03); Common problems in different companies (2:05); Measuring and curating content on Reddit (4:29); The challenges of working with unstructured content at Reddit and Twitter (11:03); Lessons learned from Reddit and applying them at Twitter (13:17); Data challenges and customer behavior analysis at GlossGenius (20:16); How the data scientist's role has changed over time (25:10); The essence of the data scientist/engineer role (29:00); Dynamics and overlaps between different data roles (32:09); The perfect data team for Twitter (34:19); Building a data team at a startup like GlossGenius (36:36); The right time to bring in a dedicated data person in a startup (38:52); The analytics engineer role (46:25); Challenges in implementing telemetry (50:31); Final thoughts and takeaways (52:24).

Dec 27, 2023 · 53 min

The PRQL: What is a Data Scientist? Featuring Katie Bauer of GlossGenius

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Katie Bauer of GlossGenius.

Dec 26, 2023 · 2 min

169: Data Models: From Warehouse to Business Impact with Tasso Argyros of ActionIQ

Highlights from this week’s conversation include: The Evolution of Databases and Data Systems (2:33); Abstracting Data for Business Users (4:31); Building a Database for Google-like Search (7:58); The Big Data Explosion (11:10); Selling Myspace as First Customer (13:14); Starting ActionIQ (16:57); The customer-centric organization (22:46); Transitioning to customer data focus (23:53); Understanding business users' needs (28:30); Supporting Arbitrary Queries and Data Models (34:42); Unique Technical Perspective of Clickstream Data (37:01); The value per terabyte of data (46:45); Building a product for multiple personas (50:45); Composability and Benefits (58:05); Evolution of Storage and Compute (1:00:09); Composability and Treasure Data (1:02:10).

Dec 20, 2023 · 1h 5m

The PRQL: From Databases to Customer Data Platforms with Tasso Argyros of ActionIQ

bonus

In this bonus episode, Eric and Kostas preview their upcoming conversation with Tasso Argyros of ActionIQ.

Dec 18, 2023 · 6 min