Packaging Data Analyses & Using pandas GroupBy
Episode 217

The Real Python Podcast · Real Python

August 16, 2024 · 55m 22s

Show Notes

<p>What are the best practices for organizing data analysis projects in Python? What are the advantages of a more package-centric approach to data science? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder&rsquo;s Weekly articles and projects.</p> <p>We discuss Joshua Cook&rsquo;s recent article &ldquo;How I Use Python to Organize My Data Analyses.&rdquo; The article covers how his process for building data analysis projects has evolved and now incorporates modern Python packaging techniques. </p> <p>Christopher shares his recent video course on grouping real-world data with pandas. The course offers a quick refresher before digging into how to use pandas GroupBy to manipulate, transform, and summarize data.</p> <p>We also share several other articles and projects from the Python community, including a news roundup, working with JSON data in Python, running an Asyncio event loop in a separate thread, knowing the why behind a system&rsquo;s code, a retro game engine for Python, and a project for vendorizing packages from PyPI.</p> <p>This episode is sponsored by Mailtrap.</p> <div class="alert alert-primary" role="alert"> <p><strong>Course Spotlight:</strong> <a href="https://realpython.com/courses/pandas-groupby-real-world-data/">pandas GroupBy: Grouping Real World Data in Python</a></p> <p>In this course, you&rsquo;ll learn how to work adeptly with the pandas GroupBy while mastering ways to manipulate, transform, and summarize data. 
You&rsquo;ll work with real-world datasets and chain GroupBy methods together to get data into an output that suits your needs.</p> </div> <p>Topics:</p> <ul> <li>00:00:00 &ndash; Introduction</li> <li>00:02:18 &ndash; Setuptools Breaks Things, Then Fixes Them</li> <li>00:04:57 &ndash; PEP 751: A File Format to List Python Dependencies</li> <li>00:07:04 &ndash; Python 3.13.0 Release Candidate 1 Released</li> <li>00:07:15 &ndash; Python Insider: Python 3.12.5 released</li> <li>00:07:22 &ndash; Django 5.1 released - Django Weblog</li> <li>00:07:27 &ndash; Django security releases issued: 5.0.8 and 4.2.15</li> <li>00:07:49 &ndash; How I Use Python to Organize My Data Analyses</li> <li>00:13:45 &ndash; Sponsor: Mailtrap</li> <li>00:14:21 &ndash; pandas GroupBy: Grouping Real World Data in Python</li> <li>00:20:33 &ndash; Working With JSON Data in Python</li> <li>00:25:01 &ndash; Asyncio Event Loop in Separate Thread</li> <li>00:30:33 &ndash; Video Course Spotlight</li> <li>00:31:47 &ndash; Habits of great software engineers</li> <li>00:49:17 &ndash; pyxel: A Retro Game Engine for Python</li> <li>00:52:36 &ndash; python-vendorize: Vendorize Packages From PyPI</li> <li>00:54:18 &ndash; Thanks and goodbye</li> </ul> <p>News:</p> <ul> <li><a href="https://www.bitecode.dev/p/whats-up-python-setuptools-breaks">Setuptools Breaks Things, Then Fixes Them</a> &ndash; This post is Bite Code&rsquo;s monthly summary, but the lead story happened just days ago. In line with a 7-year-old deprecation, setuptools finally removed the ability to call its <code>test</code> command. Many packages promptly broke. 
The following day the change was undone.</li> <li><a href="https://peps.python.org/pep-0751/">PEP 751: A File Format to List Python Dependencies for Installation Reproducibility (New)</a> &ndash; This PEP proposes a new file format for dependency specification to enable reproducible installation in a Python environment.</li> <li><a href="https://pythoninsider.blogspot.com/2024/08/python-3130-release-candidate-1-released.html">Python 3.13.0 Release Candidate 1 Released</a></li> <li><a href="https://pythoninsider.blogspot.com/2024/08/python-3125-released.html">Python Insider: Python 3.12.5 released</a></li> <li><a href="https://www.djangoproject.com/weblog/2024/aug/07/django-51-released/">Django 5.1 released - Django Weblog</a></li> <li><a href="https://www.djangoproject.com/weblog/2024/aug/06/security-releases/">Django security releases issued: 5.0.8 and 4.2.15 - Django Weblog</a></li> </ul> <p>Show Links:</p> <ul> <li><a href="https://joshuacook.netlify.app/posts/2024-07-27_python-data-analysis-org/">How I Use Python to Organize My Data Analyses</a> &ndash; This is a description of how Joshua uses Python in a package-centric way to organize his approach to data analyses. This is a system he has evolved while working on his computational biology Ph.D. and working in industry.</li> <li><a href="https://realpython.com/courses/pandas-groupby-real-world-data/">pandas GroupBy: Grouping Real World Data in Python</a> &ndash; In this course, you&rsquo;ll learn how to work adeptly with the pandas GroupBy while mastering ways to manipulate, transform, and summarize data. You&rsquo;ll work with real-world datasets and chain GroupBy methods together to get data into an output that suits your needs.</li> <li><a href="https://realpython.com/python-json/">Working With JSON Data in Python</a> &ndash; In this tutorial, you&rsquo;ll learn how to read and write JSON-encoded data in Python. 
You&rsquo;ll begin with practical examples that show how to use Python&rsquo;s built-in &ldquo;json&rdquo; module and then move on to learn how to serialize and deserialize custom data.</li> <li><a href="https://superfastpython.com/asyncio-event-loop-separate-thread/">Asyncio Event Loop in Separate Thread</a> &ndash; Typically, the asyncio event loop runs in the main thread, but as that is the one used by the interpreter, sometimes you want the event loop to run in a separate thread. This article talks about why and how to do just that.</li> </ul> <p>Discussion:</p> <ul> <li><a href="https://vadimkravcenko.com/shorts/habits-of-great-software-engineers/">Habits of great software engineers</a></li> </ul> <p>Projects:</p> <ul> <li><a href="https://github.com/kitao/pyxel">pyxel: A Retro Game Engine for Python</a></li> <li><a href="https://github.com/mwilliamson/python-vendorize">python-vendorize: Vendorize Packages From PyPI</a></li> </ul> <p>Additional Links:</p> <ul> <li><a href="https://realpython.com/courses/packaging-with-pyproject-toml/">Everyday Project Packaging With pyproject.toml – Real Python</a></li> <li><a href="https://www.youtube.com/watch?v=v6tALyc4C10">Packaging Your Python Code With pyproject.toml - Complete Code Conversation - YouTube</a></li> <li><a href="https://realpython.com/podcasts/rpp/197/">Episode #197: Using Python in Bioinformatics and the Laboratory – The Real Python Podcast</a></li> </ul> <p>Level up your Python skills with our expert-led courses:</p> <ul> <li><a href="https://realpython.com/courses/packaging-with-pyproject-toml/">Everyday Project Packaging With pyproject.toml</a></li> <li><a href="https://realpython.com/courses/working-json-data-python/">Working With JSON in Python</a></li> <li><a href="https://realpython.com/courses/pandas-groupby-real-world-data/">pandas GroupBy: Grouping Real World Data in Python</a></li> </ul> <p><a rel="payment" href="https://realpython.com/join">Support the podcast &amp; join our community of 
Pythonistas</a></p>