Big Data Analysis with SQL (gpn22)

Chaos Computer Club - archive feed · Julian

June 2, 2024 · 40m 37s

Show Notes

This talk explains how you can build your own scalable data processing system with just a few open source tools: DBT, Trino, Iceberg and MinIO. It also explains why SQL is still the best language for data analysis!

Have you ever used PostgreSQL to store *massive* amounts of data? Did your queries take *minutes* or even *hours* to compute? The field of data analysis is rather complex and a ton of solutions are available, so I will show how to compare systems with each other. You will learn why databases like PostgreSQL or MongoDB are not suited to running analytics queries on huge amounts of data. Then we will look at data analysis architectures that are capable of scaling to terabytes of data, and I will explain why they are better in those particular situations. At the end of the talk you will know which solution is best suited for your next large-scale data project!

About this event: https://cfp.gulas.ch/gpn22/talk/L3SXWL/

Topics

gpn22 · 471 · 2024 · Software & Infrastructure