108: PySpark - Jonathan Rioux
Episode 108

Test & Code

April 9, 2020 · 31m 4s

Show Notes

Apache Spark is a unified analytics engine for large-scale data processing.
PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task.

Jonathan Rioux, author of "PySpark in Action", joins the show and gives us a great introduction to Spark and PySpark to help us decide how to get started and whether Spark and PySpark are right for you.

Special Guest: Jonathan Rioux.


Topics

data science, PySpark, Python, data processing