The Python Podcast.init

389 episodes — Page 3 of 8

Ep 288: Making The Case For A (Semi) Formal Specification Of CPython

Summary
The CPython implementation has grown and evolved significantly over the past ~25 years. In that time there have been many other projects to create compatible runtimes for your Python code. One of the challenges for these other projects is the lack of a fully documented specification of how and why everything works the way that it does. In the most recent Python language summit Mark Shannon proposed implementing a formal specification for CPython, and in this episode he shares his reasoning for why that would be helpful and what is involved in making it a reality.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show.
Your host as usual is Tobias Macey and today I’m interviewing Mark Shannon about his efforts to create a formal specification for the CPython interpreter.

Interview
Introductions
How did you get introduced to Python?
Can you start by describing the current state of how the Python language and the CPython runtime are defined?
What is your motivation in advocating for a specification?
After ~25 years of the language, why is now the time to pursue this effort?
How does the history of the language and the scope of the ecosystem and community impact the effort required to make this a reality?
What is involved in creating the specification and where would it be located once complete?
What are some examples of languages that are formally specified?
What are the possible benefits of creating a specification for the CPython virtual machine?
What is the distinction between a specification for the VM as opposed to a specification for the language?
What are some potential downsides to having a (semi-)formal specification become part of the definition of the interpreter?
Can you describe the process of doing the work to create the specification?
How are you approaching the actual definition of the specification (e.g. prose vs programmatic)?
What are the tradeoffs of prose vs. an executable specification (e.g. TLA+, Alloy)?
How does this work tie into your goals of improving the speed of the CPython interpreter?
What are some of the most interesting, unexpected, or challenging aspects of your efforts to bring this specification to CPython?
How can the community contribute to this effort?

Keep In Touch
markshannon on GitHub
Website

Picks
Tobias: American Gods book and TV series
Mark: Roadside Picnic, In Death (VR game) – On Steam

Closing Announcements
Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links
CPython PyPy PEP 380 yield from Language Summit RustPython Jython C++ ML programming language Java Python Formal Semantics git repository CPython PEG Parser Episode with Pablo Galindo and Lysandros Nikolaou IETF RFCs

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Nov 10, 2020 · 36 min

Ep 287: Bringing Artificial Intelligence Projects From Idea To Production

Summary
Artificial intelligence applications can provide dramatic benefits to a business, but only if you can bring them from idea to production. Henrik Landgren was behind the original efforts at Spotify to leverage data for new product features, and in his current role he works on an AI system to evaluate new businesses to invest in. In this episode he shares advice on how to identify opportunities for leveraging AI to improve your business, the capabilities necessary to enable a successful project, and some of the pitfalls to watch out for. If you are curious about how to get started with AI, or what to consider as you build a project, then this is definitely worth a listen.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show.
Equalum’s end to end data ingestion platform is relied upon by enterprises across industries to seamlessly stream data to operational, real-time analytics and machine learning environments. Equalum combines streaming Change Data Capture, replication, complex transformations, batch processing and full data management using a no-code UI. Equalum also leverages open source data frameworks by orchestrating Apache Spark, Kafka and others under the hood. Tool consolidation and linear scalability without the legacy platform price tag. Go to pythonpodcast.com/equalum today to start a free 2 week test run of their platform, and don’t forget to tell them that we sent you.
Your host as usual is Tobias Macey and today I’m interviewing Henrik Landgren about his experiences building AI platforms to transform business capabilities.

Interview
Introductions
How did you get introduced to Python?
Can you start by sharing your thoughts on when, where, and how AI/ML are useful tools for a business?
What has been your experience in building AI platforms?
For organizations who are considering investing in AI capabilities, what are some alternative strategies that they might consider first?
What are the cases where AI is likely to be a wasted effort, or will fail to create a return on investment?
In order to be successful in bringing AI products to production, what are the foundational capabilities that are necessary?
What have you found to be a useful composition of roles and skills for building AI products?
There are various statistics that all point to a remarkably low success rate for bringing AI into production. What are some of the pitfalls that organizations and engineers should be aware of when undertaking such a project?
What is your strategy for identifying opportunities for a successful AI product?
Once you have determined the possible utility for such a project, how do you approach the work of making it a reality?
What are the common factors in what you built at Spotify and EQT Ventures?
Where do the two efforts diverge?
Your work on Motherbrain is interesting because of the fact that it is dealing in what seems to be intangible or unpredictable forces. What kinds of input are you relying on to generate useful predictions?
What are some of the most interesting, innovative, or unexpected uses of AI that you have seen?
What are some of the biggest failures of AI that you are aware of?
In your work at Spotify and EQT Ventures, what are the most interesting, unexpected, or challenging lessons that you have learned?
What advice or recommendations do you have for anyone who wants to learn more about the potential for AI and the work involved in bringing it to production?

Keep In Touch
LinkedIn
@hlandgren on Twitter

Picks
Tobias: Whale ba

Nov 3, 2020 · 47 min

Ep 286: Power Up Your Java Using Python With JPype

Summary
Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this episode the current lead developer, Karl Nelson, explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. If you have ever wanted to use a library or module from Java, but the rest of your project is already in Python, then this episode is definitely worth a listen.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
Your host as usual is Tobias Macey and today I’m interviewing Karl Nelson about JPype, a language bridge that lets you use Java classes in your Python programs.

Interview
Introductions
How did you get introduced to Python?
Can you start by giving an overview of what JPype is?
What was your motivation for becoming such a regular contributor to the project?
Why might someone want to be able to call into the Java ecosystem from a Python program?
There have been a number of other projects aiming to combine the capabilities of Java and Python, such as Jython and PyJNIus. What are the relative tradeoffs between the different options?
Many of those other projects have stalled or stopped altogether. What about JPype has allowed it to survive for so long?
Can you explain how JPype is implemented?
How has the design and implementation of the project evolved since it was first implemented?
How do the relative language versions influence the compatibility of programs on either side of the bridge?
What is involved in creating a project that uses JPype?
How are dependencies, packaging, distribution, etc. handled across the Java and Python portions of the code?
What are some of the ways that JPype can be used for Android applications?
What are some of the sharp edges or pitfalls that users of JPype should be aware of?
What are some of the most interesting, innovative, or unexpected ways that you have seen JPype used?
What have you found to be the most interesting or challenging aspects of building JPype?
When is JPype the wrong choice?
What is in store for the future of the project?

Keep In Touch
Thrameos on GitHub
LinkedIn

Picks
Tobias: Hiking, All Trails, The Hiking Project
Karl: Summoner’s Rift

Links
JPype Java Overview of Python to Java bridges Lawrence Livermore National Lab GTK– Gnome Perl C++ Matlab Java Native Interface (JNI) SciPy NumPy Matplotlib Jython PyJNIus Py4J Jep Ruby Reflection Ivy Maven JDBC Kivy Android Python Slots PyPy Java ASM Arrow Columnar Memory Format Protocol Buffers GraalVM

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Oct 26, 2020 · 48 min

Ep 285: The Journey To Replace Python's Parser And What It Means For The Future

Summary
The release of Python 3.9 introduced a new parser that paves the way for brand new features. Every programming language has its own specific syntax for representing the logic that you are trying to express. The way that the rules of the language are defined and validated is with a grammar definition, which in turn is processed by a parser. The parser that the Python language has relied on for the past 25 years has begun to show its age through mounting technical debt and a lack of flexibility in defining new syntax. In this episode Pablo Galindo and Lysandros Nikolaou explain how, together with Python’s creator Guido van Rossum, they replaced the original parser implementation with one that is more flexible and maintainable, why now was the time to make the change, and how it will influence the future evolution of the language.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
Your host as usual is Tobias Macey and today I’m interviewing Pablo Galindo and Lysandros Nikolaou about their work on replacing the parser in CPython and what that means for the language.

Interview
Introductions
How did you get introduced to Python?
Can you start by discussing the role of the parser in the lifecycle of a Python program?
What were the limitations of the previous parser, and how did that contribute to complexity and technical debt in the CPython runtime?
What are the options for styles of parsers, and what are the benefits of using a PEG style grammar?
How does the new parser impact the approachability of the CPython code for new contributors?
What was the process for reimplementing the parser and guarding against regressions in the syntax?
As developers switch to the 3.9 release, what potential edge cases/bugs might they see from introducing the new parser?
What new syntax options does this parser provide for the Python language?
Are there any specific features that are planned for implementation in the 3.10 release that are enabled by the new parser grammar?
As the language evolves due to new capabilities offered by the updated parser, how will that impact other implementations such as PyPy?
What were the most interesting, unexpected, or challenging aspects of this project?
What other aspects of the CPython code do you think should be reconsidered or reimplemented in light of the changes in computing and the usage of the language?

Keep In Touch
Pablo: pablogsal on GitHub, @pyblogsal on Twitter, LinkedIn
Lysandros: LinkedIn, lysnikolaou on GitHub, @lysnikolaou on Twitter

Picks
Tobias: Annual Python Developer Survey, Jessica Jones TV show
Pablo: Raised By Wolves TV Series
Lysandros: Afterlife TV show

Links
PEP 617 – New PEG Parser for CPython Podcast Episode About Parsers CPython Bloomberg PEG Parsers Seafair LL(1) Parsers Łukasz Langa Parser Generator Concrete Syntax Tree Abstract Syntax Tree PyPy RustPython Podcast Episode IronPython Structural Pattern Matching – PEP 622 Pylint ASTroid Podcast Episode Hy Podcast Episode Walrus Operator/Assignment Expressions C99 Reference Counting Cycle Hunting/Generational Garbage Collection

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Oct 19, 2020 · 1h 5m

Ep 284: Cloud Native Application Delivery Using GitOps

Summary
The way that applications are being built and delivered has changed dramatically in recent years with the growing trend toward cloud native software. As part of this movement toward the infrastructure and orchestration that powers your project being defined in software, a new approach to operations is gaining prominence. Commonly called GitOps, the main principle is that all of your automation code lives in version control and is executed automatically as changes are merged. In this episode Victor Farcic shares details on how that workflow brings together developers and operations engineers, the challenges that it poses, and how it influences the architecture of your software. This was an interesting look at an emerging pattern in the development and release cycle of modern applications.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Tree Schema is a data catalog that is making metadata management accessible to everyone. With Tree Schema you can create your data catalog and have it fully populated in under five minutes when using one of the many automated adapters that can connect directly to your data stores. Tree Schema includes essential cataloging features such as first class support for both tabular and unstructured data, data lineage, rich text documentation, asset tagging and more. Built from the ground up with a focus on the intersection of people and data, your entire team will find it easier to foster collaboration around your data. With the most transparent pricing in the industry – $99/mo for your entire company – and a money-back guarantee for excellent service, you’ll love Tree Schema as much as you love your data. Go to pythonpodcast.com/treeschema today to get your first month free, and mention this podcast to get 50% off your first three months after the trial.
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
Your host as usual is Tobias Macey and today I’m interviewing Victor Farcic about using GitOps practices to manage your application and your infrastructure in the same workflow.

Interview
Introductions
How did you get introduced to Python?
Can you start by giving an overview of what GitOps is?
What are the architectural or design elements that developers need to incorporate to make their applications work well in a GitOps workflow?
What are some of the tools that facilitate a GitOps approach to managing applications and their target environments?
What are some useful strategies for managing local developer environments to maintain parity with how production deployments are architected?
As developers acquire more responsibility for building the automation to provision the production environment for their applications, what are some of the operations principles that they need to understand?
What are some of the development principles that operators and systems administrators need to acquire to be effective in contributing to an environment that is managed by GitOps?
What are the areas for collaboration and dividing lines of responsibility between developers and platform engineers in a GitOps environment?
Beyond the application development and deployment, what are some of the additional concerns that need to be built into an application in order for it to be manageable and maintainable once it is in production?
What are some of the organizational principles that contribute to a successful implementation of GitOps?
What are some of the most interesting, innovative, or unexpected ways that you have seen GitOps employed?
What have you found to be the most challenging aspects of creating a scalable and maintainable GitOps practice?
When is GitOps the wrong choice, and what are the alternatives?
What resources do you recommend for anyone who wants to dig deeper into this subject?

Keep In Touch
LinkedIn
Blog
@vfarcic on Twitter

Picks
Tobias: Pulumi

Oct 12, 2020 · 53 min

Ep 283: Threading The Needle Of Interesting And Informative While You Learn To Code

Summary
Learning to code is a never-ending journey, which is why it’s important to find a way to stay motivated. A common refrain is to just find a project that you’re interested in building and use that goal to keep you on track. The problem with that advice is that as a new programmer, you don’t have the knowledge required to know which projects are reasonable, which are difficult, and which are effectively impossible. Steven Lott has been sharing his programming expertise as a consultant, author, and trainer for years. In this episode he shares his insights on how to keep readers, students, and colleagues interested enough to learn the fundamentals without losing sight of the long term gains. He also uses his own difficulties in learning to maintain, repair, and captain his sailboat as relatable examples of the learning process and how the lessons he has learned can be translated to the process of learning a new technology or skill. This was a great conversation about the various aspects of how to learn, how to stay motivated, and how to help newcomers bridge the gap between what they want to create and what is within their grasp.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt.
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
Your host as usual is Tobias Macey and today I’m interviewing Steven F. Lott about finding a project that you care about to aid in learning to program.

Interview
Introductions
How did you get introduced to Python?
Can you start by outlining your experiences working with and teaching Python?
Does your day-to-day experience at work suggest ways to help newcomers learn about Python?
How have your experiences as an author influenced your perspective on how to help newcomers become motivated to learn programming?
One of the common pieces of advice that I and others have given to people learning Python or other languages is to find a project that they want to build, but that’s not necessarily a practical approach. What are some of the difficulties that might come of that approach?
What are some strategies that you have tried for helping learners identify what kinds of project are possible and practical?
Beyond the difficulty of understanding what is possible and what is going to require a dedicated team of engineers to even attempt, there is the question of remaining motivated for long enough to follow through on a project in the face of syntax errors and design challenges. What can language developers and ecosystems do to improve the newcomer experience in exploring possibilities?
How can we make syntax errors educational and recoverable, rather than needing accrued knowledge, or hours of web searches?
As an author, there are complementary goals that may lead to conflict in the form of wanting to provide structured guidance and progression while allowing for creativity and experimentation. How have you approached those objectives in your books?
What are some of the projects that have motivated you to learn new skills?
What advice do you have for anyone who is working on or considering writing a book to teach a technical skill?
What advice do you have for anyone who

Oct 6, 2020 · 56 min

Ep 282: Solving Python Package Creation For End User Applications With PyOxidizer

Summary Python is a powerful and expressive programming language with a vast ecosystem of incredible applications. Unfortunately, it has always been challenging to share those applications with non-technical end users. Gregory Szorc set out to solve the problem of how to put your code on someone else’s computer and have it run without having to rely on extra systems such as virtualenvs or Docker. In this episode he shares his work on PyOxidizer and how it allows you to build a self-contained Python runtime along with statically linked dependencies and the software that you want to run. He also digs into some of the edge cases in the Python language and its ecosystem that make this a challenging problem to solve, and some of the lessons that he has learned in the process. PyOxidizer is an exciting step forward in the evolution of packaging and distribution for the Python language and community. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. 
You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Gregory Szorc about his work on PyOxidizer, a revolutionary new approach to building and distributing self-contained Python applications Interview Introductions How did you get introduced to Python? Can you start by giving an overview on the shortcomings of the current state of the art for distributing Python projects, both for deployment and end-user consumption? What is PyOxidizer and what motivated you to create it? How does PyOxidizer differ from projects such as CxFreeze, Py2Exe, or Shiv? What are the characteristics of CPython and the packaging ecosystem that make it so challenging to easily distribute self-contained applications? For someone using PyOxidizer, what is their workflow for building an executable that they can share with end users? What are some of the edge cases or special considerations that they need to be aware of? How is PyOxidizer implemented? How has the design or direction evolved since you first began working on it? 
From your experience in working on PyOxidizer, what changes would you like to see in the Python language or the CPython reference implementation? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on PyOxidizer? What do you have planned for the future of PyOxidizer? What are the ways that listeners can contribute to PyOxidizer? Keep In Touch Website indygreg on GitHub Picks Tobias Carlos Santana Gregory Home Air Quality Monitor Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PyOxidizer Mercurial Podcast Episode Mozilla Virtuale

Sep 29, 2020 49 min

Ep 281 Flexible Network Security Detection And Response With Grapl

Summary Servers and services that have any exposure to the public internet are under a constant barrage of attacks. Network security engineers are tasked with discovering and addressing any potential breaches to their systems, which is a never-ending task as attackers continually evolve their tactics. In order to gain better visibility into complex exploits Colin O’Brien built the Grapl platform, using graph database technology to more easily discover relationships between activities within and across servers. In this episode he shares his motivations for creating a new system to discover potential security breaches, how its design simplifies the work of identifying complex attacks without relying on brittle rules, and how you can start using it to monitor your own systems today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. 
Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Colin O’Brien about Grapl, an open source platform for detection and response of system security incidents Interview Introductions How did you get introduced to Python? Can you start by describing what Grapl is and the problem that you are trying to solve with it? What was your original motivation to create it? What were the existing options for security detection and response, and how is Grapl differentiated from them? Who is the target audience for the Grapl project? How is the Grapl system architected? How has the design of the system evolved since you first began working on it? How much effort would it be to separate the Grapl architecture from AWS to migrate it to other environments? What have you found to be the benefits of splitting the implementation of the system between Rust for the system and Python for the exploration? What challenges have you faced as a result of working across those languages? What data sources does Grapl use to build its graph of events within a system? Can you talk through the overall workflow for someone using Grapl? What are some examples of the types of exploits that you can identify with Grapl? 
What are some of the most interesting, unexpected, or innovative ways that you have seen Grapl used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building it? When is Grapl the wrong choice? What do you have planned for the future of Grapl? Keep In Touch insanitybit on GitHub LinkedIn @InsanityBit on Twitter Picks Tobias Artemis Fowl book series by Eoin Colfer Artemis Fowl Movie Colin PyO3 Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Grapl Grapl S

Sep 22, 2020 53 min

Ep 280 Simplified Data Extraction And Analysis For Current Events With Newspaper

Summary News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites, Lucas Ou-Yang built the Newspaper library, which aids in the automatic retrieval of articles and prepares them for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Lucas Ou-Yang about Newspaper, a framework for easily extracting and processing online articles. Interview Introductions How did you get introduced to Python? Can you start by describing what the Newspaper project is and your motivations for creating it? What are the main use cases that Newspaper is built for? What are some libraries or tools that Newspaper might replace? What are the common structures in news sites that allow you to abstract across them for content extraction? What are some ways of determining whether a site will be a good candidate for using with Newspaper? Can you talk through the developer workflow of someone using Newspaper? What are some of the other libraries or tools that are commonly used alongside Newspaper? How is Newspaper implemented? How has the design of the project evolved since you first began working on it? What are some of the most complex or challenging aspects of building an automated article extraction tool? What are some of the most interesting, unexpected, or innovative projects that you have seen built with Newspaper? What keeps you interested in the ongoing support and maintenance of the project? What do you have planned for the future of Newspaper? Keep In Touch LinkedIn @LucasOuYang on Twitter Website codelucas on GitHub Picks Tobias Million Bazillion Podcast Lucas Hackers and Painters: Big Ideas from the Computer Age by Paul Graham Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Newspaper Los Angeles Reddit Django NLP == Natural Language Processing Web Scraping Podcast Episode Requests Wintria Python Goose Diffbot Heuristics Stop Words RSS SpaCy Podcast Episode Gensim Podcast Episode PyTorch Podcast Episode NLTK LXML Beautiful Soup The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Sep 15, 2020 43 min

Ep 279 Digging Into Dagster: An Opinionated Open Source Framework For Data Orchestration

Summary Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? 
Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nick Schrock about Dagster, an open source data orchestrator for powering data engineering, analytics, and machine learning Interview Introductions How did you get introduced to Python? Can you start by describing what Dagster is and how it got started? What are the most common difficulties that organizations face when working with data projects? How does Dagster help in addressing those challenges? There are a number of workflow orchestration platforms, spanning a few generations of tooling. What do you see as the defining characteristics of the various options, and how does Dagster fit in that ecosystem? What are the assumptions that you made at the start of building Dagster and how have they been challenged, updated, or invalidated over the past year of working with end users? How are the internals of Dagster implemented? How has the design changed or evolved since you first began working on it? 
For someone who is building on top of Dagster, what is their workflow from first steps through to production? What are your guiding principles for designing the user-facing API? What are the available extension points for Dagster? What was your reason for implementing Dagster as a Python framework? With the benefit of hindsight, would you make the same decision today? What are some of the most interesting, innovative, or unexpected ways that you have seen Dagster used? What are the most interesting, unexpected, or challenging lessons that you have learned while building Dagster and working to grow its ecosystem? When is Dagster the wrong choice? As you continue to build Dagster, what is your vision for it and its ecosystem? What are the next steps that you are taking to achieve that vision? Keep In Touch @schrockn on Twitter schrockn on GitHub LinkedIn Picks Tobias Caddy web server Nick Black code formatter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the

Sep 7, 2020 59 min

Ep 278 When, Why, and How To Use Web Scraping In A Nutshell

Summary The internet is a rich source of information, but a majority of it isn’t accessible programmatically through APIs or databases. To address that shortcoming there are a variety of web scraping frameworks that aid in extracting structured data from web pages. In this episode Attila Tóth shares the challenges of web data extraction, the ways that you can use it, and how Scrapy and ScrapingHub can help you with your projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Attila Tóth about doing data extraction with web scraping. Interview Introductions How did you get introduced to Python? Can you start by explaining what web scraping is and when you might want to use it? How did you first get started with web scraping? There are a number of options for web scraping tools in Python, as well as other languages. What are the characteristics of the Scrapy project and community that have made it stand out and retain such widespread popularity? One of the perpetual questions with web scraping is that of copyright and content ownership. What should we all be aware of when scraping a given website? What are some of the most challenging aspects of crawling and scraping the web? What are some of the features of Scrapy that aid in those challenges? Once you have retrieved the content from a site, what are some of the considerations for storing and processing the data that we should be thinking about? How can we guard against a scraper breaking due to changes in the layout of a site, or simple updates that weren’t accounted for in the initial implementation? What are some of the most complicated aspects of scaling web scrapers? For someone who is interested in using Scrapy, what are some of the common pitfalls that they should be aware of? What are some of the most interesting, innovative, or unexpected projects that are built with Scrapy and ScrapingHub? 
What are the most interesting, unexpected, or challenging lessons that you have learned while working with web scrapers and ScrapingHub? What resources would you recommend to anyone who is looking to learn more about web scraping? Keep In Touch LinkedIn Picks Tobias Gov’t Mule Attila Awesome Web Scraping Awesome Scrapy Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Web Scraping ScrapingHub Java Android Scrapy JSoup HTMLUnit Selenium Pandas robots.txt Puppeteer Splash The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Sep 1, 2020 41 min

Ep 277 Working In The Code Mines: Mining Software Repositories With PyDriller

Summary A large portion of the software industry has standardized on Git as the version control system of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! 
Your host as usual is Tobias Macey and today I’m interviewing Davide Spadini about PyDriller, a framework for mining software repositories. Interview Introductions How did you get introduced to Python? Can you start by describing what PyDriller is and how the project got started? How is PyDriller different from other Git frameworks? What kinds of information can you discover by mining a software repository? Where and how might the collected information be used? What are the limitations of the capabilities offered by Git for investigating the repository? What are the additional metrics that you are able to extract using PyDriller? Can you describe how PyDriller itself is implemented? How has the project evolved since you first began working on it? I noticed that for testing PyDriller you crafted a set of repositories to serve as test cases. What has been the most complex or challenging aspect of writing meaningful tests to ensure a reasonable coverage of this problem domain? What would be required to add support for other version control systems? How have you used PyDriller in your own research? What are some of the most interesting, unexpected, or innovative ways that you have seen PyDriller used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with PyDriller? What do you have planned for the future of PyDriller? Keep In Touch Website ishepard on GitHub @DavideSpadini on Twitter Picks Tobias pre-commit Davide Fall Guys Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PyDriller Delft Git GitPython PyGit2 RepoDriller Mining Software Repositories Conference Lizard Hadoop Mercurial Podcast Episode Subversion CVS Neo4J GraphRepo The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Aug 25, 2020 40 min

Ep 276 Building The Open Data Ecosystem For Music And More At Metabrainz

Summary The Musicbrainz project was an early entry in the movement to build an open data ecosystem. In recent years, the Metabrainz Foundation has fostered a growing ecosystem of projects to support the contribution of, and access to, metadata, listening habits, and review of music. The majority of those projects are written in Python, and in this episode Param Singh explains how they are built, how they fit together, and how they support the goals of the Metabrainz Foundation. This was an interesting exploration of the work involved in building an ecosystem of open data, the challenges of making it sustainable, and the benefits of building for the long term rather than trying to achieve a quick win. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Before you put your code into production you need to make sure that it passes all of the tests, that it has been packaged with all of the dependencies, and that you haven’t introduced any security issues. Instead of running all of that on your laptop, let Codefresh handle it automatically with their continuous integration and continuous delivery platform. Built for the modern era of cloud-native computing, they make publishing to Kubernetes, serverless platforms, and virtual machines fast and seamless. 
With a growing library of pre-made steps, a flexible pipeline definition, and unlimited scale Codefresh lets you ship faster and safer than ever. Go to pythonpodcast.com/codefresh today to get unlimited builds on your free account. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Param Singh about the ways that Python is being used across the various Metabrainz projects Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what the Metabrainz organization is and the various projects that it encompasses? What are the motivations for creating those projects and some of the origin story for Metabrainz? The Musicbrainz server is the longest running project and is written in Perl. What was the reason for switching to Python for all of the other *brainz projects? How does the MetaBrainz Foundation sustain itself? Where do the funds come from? How do you determine where and how to allocate the funding that you receive? Which of the *brainz projects is the most complex or challenging to build, whether due to technical or sociological reasons? How do you source and manage the information that powers all of the Metabrainz projects? How is development of the various projects organized? How does that influence the amount of code sharing that is possible between them? Of the projects that you have been involved in, how are they architected? What are the main ways that the projects differ in how they are implemented? 
What are some of the ways that you are using Python in support of the various projects that you work on? What are some of the most interesting, innovative, or unexpected ways that you have seen the projects or data built by Metabrainz being used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working as a contributor and maintainer of the Metabrainz projects? What is in store for the future of the existing Metabrainz projects? What are the next domains that are being considered for building a Metabrainz platform for? Keep In Touch LinkedIn paramsingh on GitHub Website Picks Tobias Beets music library organizer Podcast Episode Param Prateek Kuhad Links Metabrainz Musicbrainz Listenbrainz Acousticbrainz Bookbrainz Critiquebrainz Picard Stripe The Himalayas Dublin Ireland XKCD Import Antigravity Antigravity Python Module Last.fm Google Summer of Code CDDB Perl Flask SQLAlchemy 3rd anniversary cake Redis PostgreSQL RabbitMQ Spark Music Technology Group Splunk Artist

Aug 17, 2020 48 min

Ep 275 Growing Dask To Make Scaling Python Data Science Easier At Coiled

Summary Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to make the use and maintenance of Dask in production easier. They share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. 
Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Matthew Rocklin and Hugo Bowne-Anderson about their work building a business around the Dask ecosystem at Coiled Interview Introductions How did you get introduced to Python? Can you give a quick overview of what Dask is and your motivations for creating it? How has Dask changed or evolved in the past 3 1/2 years since we last talked about it? How has the rest of the ecosystem changed in that time? After working on Dask for the past few years, what led you to the decision to build a business around it? What are the sharp edges of programming for Dask that users are looking for help on solving? What are the difficulties that users face in deploying and maintaining a production installation of Dask? What are the limitations of Dask when scaling both up and down? What are you building at Coiled to improve the user experience for users of Python and Dask? What are your thoughts on the pros and cons of orienting your messaging around the scalability of Python, as opposed to focusing on a specific industry or problem domain? What are the challenges that you are facing in managing the tensions between the open source and proprietary work that you are doing? 
How are you handling the ongoing governance of the Dask project? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and launching a company based on an open source project? What do you have planned for the future of both Coiled and Dask? Keep In Touch Matt Website @mrocklin on Twitter mrocklin on GitHub Hugo LinkedIn @hugobowne on Twitter Website Picks Tobias The Hobbit Audiobook Audible Free Trial (affiliate link) Matt Prefect Hugo Race After Technology by Ruha Benjamin Ruha Benjamin on deep learning: Computational depth without sociological depth is ‘superficial learning’ Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.

Aug 10, 2020 52 min

Ep 274 Supporting The Full Lifecycle Of Machine Learning Projects With Metaflow

Full

Summary Netflix uses machine learning to power every aspect of their business. To do this effectively they have had to build extensive expertise and tooling to support their engineers. In this episode Savin Goyal discusses the work that he and his team are doing on the open source machine learning operations platform Metaflow. He shares the inspiration for building an opinionated framework for the full lifecycle of machine learning projects, how it is implemented, and how they have designed it to be extensible to allow for easy adoption by users inside and outside of Netflix. This was a great conversation about the challenges of building machine learning projects and the work being done to make it more achievable. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Savin Goyal about Netflix’s infrastructure for machine learning Interview Introductions How did you get introduced to Python? Can you start by describing the work you are doing at Netflix to support their machine learning workloads? How are you addressing the impedance mismatch of machine learning/data science work between local experimentation and production deployment? What was the motivation for building Metaflow? How does Metaflow compare to other tools in the ecosystem such as MLFlow? What was missing in the other available tools that made Metaflow necessary? workflow for someone using Metaflow How do you approach the design of the developer interface to make it approachable to machine learning engineers? level of coupling with overall Netflix data stack How is Metaflow implemented? How has the architecture and design of the system evolved since you first began working on it? supporting infrastructure/integration points motivation/benefits of releasing it as open source What are some of the most interesting, unexpected, or challenging lessons that you have learned while building infrastructure and tooling for machine learning? When is Metaflow the wrong choice? 
What do you have planned for the future of Metaflow and Keep In Touch LinkedIn @savingoyal on Twitter savingoyal on GitHub Picks Tobias vdist Savin Repairing Vintage Watches Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Metaflow OCaml EC2 S3 Data Lake PyTorch Tensorflow Netflix Data Stack Spinnaker Chaos Engineering Chaos Toolkit Podcast Episode Chaos Monkey Netflix Simian Army Netflix Titus AWS Batch Netflix Meson Dataflow Programming DAG == Directed Acyclic Graph MLFlow DVC (Data Version Control) Podcast Episode CML (Continuous Machine Lea

Aug 4, 2020 44 min

Ep 273 Learning To Program By Building Tiny Python Projects

Full

Summary One of the best methods for learning programming is to just build a project and see how things work first-hand. With that in mind, Ken Youens-Clark wrote a whole book of Tiny Python Projects that you can use to get started on your journey. In this episode he shares his inspiration for the book, his thoughts on the benefits of teaching testing principles and the use of linting and formatting tools, as well as the benefits of trying variations on a working program to see how it behaves. This was a great conversation about useful strategies for supporting new programmers in their efforts to learn a valuable skill. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. 
If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Ken Youens-Clark about his book Tiny Python Projects Interview Introductions How did you get introduced to Python? What is your goal with your book of Tiny Python Projects? What motivated you to start writing it? Who is the target audience that you wrote the book for? One of the notable aspects of the book is the fact that you introduce linting and testing in the first chapter. Why is that a useful subject for the first steps of someone getting started in Python? What are some of the problems that users experience if they are introduced to these tools after they have already established a set of habits? How did you approach the structure of the book to be approachable by newcomers to Python? What was your process for deciding on the scope of the information to include in the book? What are some of the challenges that you faced in identifying self-contained projects that could fit into a single chapter? As a book that is intended to serve as a learning resource, what was your process for soliciting feedback to determine if your tone and structure is effective in teaching the reader? What elements of the Python language and ecosystem did you consciously leave out to avoid overwhelming the readers? What are some of the most interesting, unexpected, or challenging lessons that you learned while working on the book? 
What are your thoughts on useful resources and next steps for readers who are interested in progressing in their use of Python? Keep In Touch kyclark on GitHub Website @kycl4rk on Twitter Picks Tobias Marvel Cinematic Universe Ken Parks & Recreation TV Show Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Tiny Python Projects University of Arizona BioInformatics Perl BioPython Podcast Episode Seq Podcast Episode Pytest Podcast Episode Windows Subsystem for Linux Pylint Podcast Episode YAPF Black

Jul 28, 2020 55 min

Ep 272 Idiomatic Functional Programming With DRY Python

Full

Summary Python is an intuitive and flexible language, but that versatility can also lead to problematic designs if you’re not careful. Nikita Sobolev is the CTO of Wemake Services where he works on open source projects that encourage clean coding practices and maintainable architectures. In this episode he discusses his work on the DRY Python set of libraries and how they provide an accessible interface to functional programming patterns while maintaining an idiomatic Python interface. He also shares the story behind the wemake Python styleguide plugin for Flake8 and the benefits of strict linting rules to engender good development habits. This was a great conversation about useful practices to build software that will be easy and fun to work on. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nikita Sobolev about his work with DRY Python and Wemake Services Interview Introductions How did you get introduced to Python? Can you start by sharing your overarching philosophies or design aesthetics for writing maintainable software? What is your process for starting a new project, beginning at the design phase? What are some of the challenges or shortcomings that you see in the "default" way that most developers write Python? What is DRY Python and how does it help in addressing those concerns? What was your motivation for creating these projects? There are a number of different projects that are being built under the DRY Python umbrella. Can you list the ones that are currently active and outline how they fit together? What are some of the initial challenges that newcomers to the DRY Python libraries encounter? How do you approach the design of the API and developer experience to make these development approaches more accessible? What have you seen in terms of real world impact on the maintainability and extensibility of projects that you have built on top of the DRY Python components? In addition to DRY Python you are also involved with development of the wemake-python-styleguide. Can you describe that project’s goal and how it got started? 
If you make the linting too restrictive then developers are likely to just ignore or disable it. What have you found to be the right balance to which rules will fail a build and which are just informational? Why do you push the responsibility for things like formatting onto the developer, rather than an autoformatter such as YAPF or Black? What are some of the other supporting technologies that you rely on during your development workflow? What are some of the elements that you think are missing in the common toolbox for Python developers? What tools are we lacking entirely? What are the cases where DRY Python is the wrong choice? What are your goals and plans for the future of DRY Python and the various Wemake libraries? Keep In Touch Blog sobolevn on GitHub Picks Tobias The Map To Everywhere Nikita Russian Python Week Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for

Jul 21, 2020 47 min

Ep 271 The Past, Present, And Future Of The FLUFL: Barry Warsaw Shares His History With Python

Full

Summary Barry Warsaw has been a member of the Python community since the very beginning. His contributions to the growth of the language and its ecosystem are innumerable and diverse, earning him the title of Friendly Language Uncle For Life. In this episode he reminisces on his experiences as a core developer, a member of the Python Steering Council, and his roles at Canonical and LinkedIn supporting the use of Python at those companies. In order to know where you are going it is always important to understand where you have been and this was a great conversation to get a sense of the history of how Python has gotten to where it is today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This episode of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. 
If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Barry Warsaw about his role in the Python community, past, present, and future. Interview Introductions How did you get introduced to Python? For anyone who isn’t familiar with you, how would you characterize your role in the Python language and community? What have been your main areas of focus in your role as a core developer? What are some of the other forms that your contributions to the language and community have taken? What are the contributions to Python that you are most proud of? Looking back at the past 25 years of Python, what do you find most interesting/surprising/exciting? How has the focus of the community changed or evolved since you first began using it? What are you currently focused on in your role in the steering council? What are the aspects of the language and community that you think need greater attention? What are the core strengths of the language and community that you believe will carry it through the next 25 years? In your current and previous roles you acted as a guiding force for Python. What are the main use cases for Python at LinkedIn? What kinds of projects are you involved with to support the other engineers in their use of Python? How much of an impact has the invisible hand of the PSU had on the overall trajectory of Python? 
Outside of Python, what are the programming languages or communities that you look to for inspiration? What are your personal goals for the future of Python? Keep In Touch Website warsaw on GitHub warsaw on GitLab Blog @pumpichank on Twitter Picks Tobias Hanna TV Series Barry Midnight Gospel The Expanse TV Series Audio Books Free 30 Day Audible Trial (Affiliate Link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links FLUFL PEP 401 Python Steering Council The PEP Talk episode Usenet BBS == Bulletin Board System comp.lang.

Jul 13, 2020 51 min

Ep 270 Pure Python Configuration Management With PyInfra

Full

Summary Building and managing servers is a challenging task. Configuration management tools provide a framework for handling the various tasks involved, but many of them require learning a specific syntax and toolchain. PyInfra is a configuration management framework that embraces the familiarity of Pure Python, allowing you to build your own integrations easily and package it all up using the same tools that you rely on for your applications. In this episode Nick Barrett explains why he built it, how it is implemented, and the ways that you can start using it today. He also shares his vision for the future of the project and how you can get involved. If you are tired of writing mountains of YAML to set up your servers then give PyInfra a try today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Podcast.__init__ is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nick Barrett about PyInfra, a pure Python framework for agentless configuration management Interview Introductions How did you get introduced to Python? Can you start by describing what PyInfra is and its origin story? There are a number of options for configuration management of various levels of complexity and language options. What are the features of PyInfra that might lead someone to choose it over other systems? What do you see as the major pain points in dealing with infrastructure today? For someone who is using PyInfra to manage their servers, what is the workflow for building and testing deployments? How do you handle enforcement of idempotency in the operations being performed? Can you describe how PyInfra is implemented? How has its design or focus evolved since you first began working on it? What are some of the initial assumptions that you had at the outset which have been challenged or updated as it has grown? The library of available operations seems to have a good baseline for deploying and managing services. What is involved in extending or adding operations to PyInfra? 
With the focus of the project being on its use of pure Python and the easy integration of external libraries, how do you handle execution of Python functions on remote hosts that require external dependencies? What are some of the other options for interfacing with or extending PyInfra? What are some of the edge cases or points of confusion that users of PyInfra should be aware of? What has been the community response from developers who first encounter and trial PyInfra? What have you found to be the most interesting, unexpected, or challenging aspects of building and maintaining PyInfra? When is PyInfra the wrong choice for managing infrastructure? What do you have planned for the future of the project? Keep In Touch Fizzadar on GitHub Website @Fizzadar on Twitter LinkedIn Picks Tobias My Spy Nick Das Keyboard Ultimate Korean Short Ribs Kimchi Fried Rice Links PyInfra Oxygem WordPress Lua Garry’s Mod Java Ansible SaltStack Chef Puppet EC2 Boto 3 Hashicorp Vault Vagrant Docker Testinfra SaltStack Testinfra Plugin Dockerfile Idempotence Nginx POSIX gevent Jinja2 Click Zero Tier BSD AST Module RedBaron The intro and ou

Jul 6, 2020 43 min

Ep 269 Build Your Own Domain Specific Language in Python With textX

Full

Summary Programming languages are a powerful tool and can be used to create all manner of applications, however sometimes their syntax is more cumbersome than necessary. For some industries or subject areas there is already an agreed upon set of concepts that can be used to express your logic. For those cases you can create a Domain Specific Language, or DSL to make it easier to write programs that can express the necessary logic with a custom syntax. In this episode Igor Dejanović shares his work on textX and how you can use it to build your own DSLs with Python. He explains his motivations for creating it, how it compares to other tools in the Python ecosystem for building parsers, and how you can use it to build your own custom languages. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! 
Your host as usual is Tobias Macey and today I’m interviewing Igor Dejanović about textX, a meta-language for building domain specific languages in Python Interview Introductions How did you get introduced to Python? Can you start by describing what a domain specific language is and some examples of when you might need one? What is textX and what was your motivation for creating it? There are a number of other libraries in the Python ecosystem for building parsers, and for creating DSLs. What are the features of textX that might lead someone to choose it over the other options? What are some of the challenges that face language designers when constructing the syntax of their DSL? Beyond being able to parse and process an arbitrary syntax, there are other concerns for consumers of the definition in terms of tooling. How does textX provide support to those end users? How is textX implemented? How has the design or goals of textX changed since you first began working on it? What is the workflow for someone using textX to build their own DSL? Once they have defined the grammar, how do they distribute the generated interpreter for others to use? What are some of the common challenges that users of textX face when trying to define their DSL? What are some of the cases where a PEG parser is unable to unambiguously process a defined grammar? What are some of the most interesting/innovative/unexpected ways that you have seen textX used? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned while building and maintaining textX and its associated projects? While preparing for this interview I noticed that you have another parser library in the form of Parglare. How has your experience working with textX informed your designs of that project? What lessons have you taken back from Parglare into textX? 
When is textX the wrong choice, and someone might be better served by another DSL library, different style of parser, or just hand-crafting a simple parser with a regex? What do you have planned for the future of textX? Keep In Touch Website igordejanovic on GitHub @dejanovicigor on Twitter Picks Tobias wemake-python-styleguide Igor Interactive Fiction genre Awesome Interactive Fiction The Interactive Fiction Database TADS Inform 7 Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links textX U of Novi Sad Serbia DSL course Secondary Notation Django Xtext Eclipse PLY SLY PyParsin

Jun 30, 2020 · 54 min

Ep 268: Adding Observability To Your Python Applications With OpenTelemetry

Summary Once you release an application into production it can be difficult to understand all of the ways that it is interacting with the systems that it integrates with. The OpenTracing project and its accompanying ecosystem of technologies aims to make observability of your systems more accessible. In this episode Austin Parker and Alex Boten explain how the correlation of tracing and metrics collection improves visibility of how your software is behaving, how you can use the Python SDK to automatically instrument your applications, and their vision for the future of observability as the OpenTelemetry standard gains broader adoption. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Austin Parker and Alex Boten about the OpenTelemetry project and its efforts to standardize the collection and analysis of observability data for your applications Interview Introductions How did you get introduced to Python? Can you start by describing what OpenTelemetry is and some of the story behind it? How do you define observability and in what ways is it separate from the "traditional" approach to monitoring? What are the goals of the OpenTelemetry project? 
For someone who wants to begin using OpenTelemetry clients in their Python application, what is the process of integrating it into their application? How does the definition and adoption of a cross-language standard for telemetry data benefit the broader software community? How do you avoid the trap of limiting the whole ecosystem to the lowest common denominator? What types of information are you focused on collecting and analyzing to gain insights into the behavior of applications and systems? What are some of the challenges that are commonly faced in interpreting the collected data? With so many implementations of the specification, how are you addressing issues of feature parity? For the Python SDK, how is it implemented? What are some of the initial designs or assumptions that have had to be revised or reconsidered as it gains adoption? What is your approach to integration with the broader ecosystem of tools and frameworks in the Python community? What are some of the interesting or unexpected challenges that you have faced or lessons that you have learned while working on instrumentation of Python projects? Once an application is instrumented, what are the options for delivering and storing the collected data? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with the OpenTelemetry ecosystem? What are some of the most interesting, innovative, or unexpected ways that you have seen components in the OpenTelemetry ecosystem used? When is OpenTelemetry the wrong choice? What is in store for the future of the OpenTelemetry project? Keep In Touch Austin @austinlparker on Twitter austinlparker on GitHub Alex LinkedIn @codeboten on Twitter codeboten on GitHub Picks Tobias Pulumi Podcast Episode Austin Helm 3 Alex Algorithms To Live By: The Computer Science Of Everyday Decisions by Brian Christian and Tom Griffiths Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links OpenTelemetry Lightstep OpenTracing OpenCensus Distributed Tracing Jaeger Zipkin Observability Kubernetes Spring Flask gRPC Structlog Filebeat W3C Trace Context OpenTelemetry Python SDK OpenTelemetry Django OpenTelemetry Flask OpenTelemetry Collector OTLP == Open Telemetry Protocol The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Jun 23, 2020 · 53 min

Ep 267: Build A Personal Knowledge Store With Topic Modeling In Contextualize

Summary Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Brett Kromkamp about Contextualise, a topic modeling application that helps you build a mind map for information-heavy projects Interview Introductions How did you get introduced to Python? Can you start by describing what Contextualize is and some of the types of projects that it can be used for? What was your motivation for creating it? How do you use topic maps in your own work and creative endeavors? The space of personal note-taking and knowledge management is vast and varied. 
What does Contextualize do well that you have been unable to find or implement in other tools? For someone using Contextualize, what does that workflow look like? How are you approaching integration with different creative contexts (e.g. text editors, graphics editors, word processing, etc.)? Can you describe how Contextualize is implemented? How has the design evolved since you first began working on it? In the documentation for Contextualize it mentions that this is the latest in a string of topic mapping platforms that you have built. What are some of the lessons that you have learned from previous efforts that have influenced the design of this one? One of the challenges with many knowledge management tools is that they are prescriptive in how to work with them. In what ways has your own preference for how to interact with information influenced the direction of Contextualize? Being an open source application, how has its exposure to the public directed your software and user design? How do you approach the challenge of reducing friction in adding content and relations while allowing for flexibility and context management? What are some of the projects that you are using Contextualize for? What are your thoughts on the utility of something like Contextualize for capturing and organizing the collective knowledge of a team of collaborators, whether in a work or casual context? What have you found to be the most interesting, complex, or complicated aspects of building a topic mapping platform? When is Contextualize the wrong choice? What do you have planned for the future of the project? Keep In Touch Website @brettkromkamp on Twitter brettkromkamp on GitHub Picks Tobias Pydantic Podcast Episode MyPy Podcast Episode Brett Black Lives Matter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Contextualise GitHub Repository Norway IBM Rexx Java Semantic Web Topic Map ISO standard for topic maps RDF Spain Knowledge Management Graph Database Worldbuilding Roam Research TopicDB Twitter Bootstrap Hypergraph Digital Gardening Notion TiddlyWiki The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Jun 15, 2020 · 58 min

Ep 266: Open Source Product Analytics With PostHog

Summary You spend a lot of time and energy on building a great application, but do you know how it’s actually being used? Using a product analytics tool lets you gain visibility into what your users find helpful so that you can prioritize feature development and optimize customer experience. In this episode PostHog CTO Tim Glaser shares his experience building an open source product analytics platform to make it easier and more accessible to understand your product. He shares the story of how and why PostHog was created, how to incorporate it into your projects, the benefits of providing it as open source, and how it is implemented. If you are tired of fighting with your user analytics tools, or unwilling to entrust your data to a third party, then have a listen and then test out PostHog for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. 
In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Tim Glaser about PostHog, an open source platform for product analytics Interview Introductions How did you get introduced to Python? Can you start by describing what PostHog is and what motivated you to build it? What are the goals of PostHog and who are the target audience? In the description of PostHog it mentions being a product focused analytics platform, as opposed to session based. What are the meaningful differences between the two? Customer analytics is a rather crowded market, with a large number of both commercial and open source offerings (e.g. Google Analytics, Heap, Matomo, Snowplow, etc.). How does PostHog fit in that landscape and what are the differentiating factors that would lead someone to select it over the alternatives? For anyone interested in using PostHog, do you offer a migration path from other platforms? necessary features for a customer analytics tool privacy and security issues around analytics How is PostHog implemented and how has its design evolved since you first began building it? 
reason for choosing Python benefits of Django thoughts on introducing Channels option to include it as a pluggable Django app integration points data lake integration challenges of providing understandable statistics and exposing options for detailed analysis Having data about how users are interacting with your site or application is interesting, but how does it help in determining the useful actions to drive success? business model and project governance What are the most complex, complicated, or misunderstood aspects of building a product analytics platform? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned in the process of building PostHog? When is PostHog the wrong choice? What do you have planned for the future of PostHog? Keep In Touch timgl on GitHub LinkedIn @timgl on Twitter Picks Tobias Hitchhiker’s Guide To The Galaxy Tim Triumph Of The City by Edward Glaeser Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up

Jun 8, 2020 · 49 min

Ep 265: Extending The Life Of Python 2 Projects With Tauthon

Summary The divide between Python 2 and 3 lasted a long time, and in recent years all of the new features were added to version 3. To help bridge the gap and extend the viability of version 2 Naftali Harris created Tauthon, a fork of Python 2 that backports features from Python 3. In this episode he explains his motivation for creating it, the process of maintaining it and backporting features, and the ways that it is being used by developers who are unable to make the leap. This was an interesting look at how things might have been if the elusive Python 2.8 had been created as a more gentle transition. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. 
You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Naftali Harris about his work on Tauthon, a fork of Python 2 that backports features from Python 3 Interview Introductions How did you get introduced to Python? Can you start by describing what Tauthon is and your motivations for creating it? What’s the story behind the name? What types of applications and environments are you using Tauthon in? How much adoption of Tauthon have you seen? What are some of the different ways that your users are employing it? Is this the missing "2.8" release? In other words, is this intended to be a bridge for simplifying the migration of existing Python 2 code to Python 3, or as an extended support window for Python 2? What features have you backported from Python 3? What is your process for identifying and prioritizing features to bring into Tauthon? What is your workflow for implementing the backported functionality in Tauthon? What are some of the cases where you have had to compromise on the functionality or syntax of a feature that you have backported in order to fit into Python 2? What is your governing philosophy for how to manage syntax or behavior differences between Python 2 and 3? What have been the most challenging features to backport and maintain? What are some of the ways that Tauthon might break existing Python 2 code? 
What is the story for compatibility with libraries that are Python 3 only? What have you seen in terms of adoption of Tauthon? Do you have any sense of the commonalities among those users? What are some of the ecosystem challenges that face users of Tauthon? (e.g. Pip support, package compatibility, etc.) What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of creating and maintaining Tauthon? What are your long-term plans for Tauthon, and how have they changed since you first started working on it? Keep In Touch Website @naftaliharris on Twitter naftaliharris on GitHub Picks Tobias Dagster PyCon 2020 Online Naftali Sentilink Timsort Tim Peters Links Tauthon Function Annotations Tau Nick Coghlan MyPy Podcast Episode Matrix Multiplier Operator Python 3.9 PEG Parser lazysorted nonlocal keyword Valgrind The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Jun 2, 2020 · 33 min

Ep 264: Dependency Management Improvements In Pip's Resolver

Summary Dependency management in Python has taken a long and winding path, which has led to the current dominance of Pip. One of the remaining shortcomings is the lack of a robust mechanism for resolving the package and version constraints that are necessary to produce a working system. Thankfully, the Python Software Foundation has funded an effort to upgrade the dependency resolution algorithm and user experience of Pip. In this episode the engineers working on these improvements, Pradyun Gedam, Tzu-Ping Chung, and Paul Moore, discuss the history of Pip, the challenges of dependency management in Python, and the benefits that surrounding projects will gain from a more robust resolution algorithm. This is an exciting development for the Python ecosystem, so listen now and then provide feedback on how the new resolver is working for you. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. 
Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Tzu-ping Chung, Pradyun Gedam, and Paul Moore about their work to improve the dependency resolution capabilities of Pip and its user experience Interview Introductions How did you get introduced to Python? Can you start by describing the focus of the work that you are doing? What is the scope of the work, and what are the established criteria for when it is considered complete? What is your history with working on the Pip source code and what interests you most about this project? What are the main sources or manifestations of technical debt that exist in Pip as of today? How does it currently handle dependency resolution? What are some of the workarounds that developers have had to resort to in the absence of a robust dependency resolver in Pip? How is the new dependency resolver implemented? How has your initial design evolved or shifted as you have gotten further along in its implementation? 
What are the pieces of information that the resolver will rely on for determining which packages and versions to install? (e.g. will it install setuptools > 45.x in a Python 2 virtualenv?) What are the new capabilities in Pip that will be enabled by this upgrade to the dependency resolver? What projects or features in the encompassing ecosystem will be unblocked with the introduction of this upgrade? What are some of the changes that users will need to make to adopt the updated Pip? How do you anticipate the changes in Pip impacting the viability or adoption of Python and its ecosystem within different communities or industries? What are some of the additional changes or improvements that you would like to see in Pip or other core elements of the Python landscape? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on these updates to Pip? Keep In Touch Pradyun Website pradyunsg on GitHub @pradyunsg on Twitter Paul pfmoore on GitHub Tzu-Ping uranusjr on GitHub Website @uranusjr on Twitter Picks Tzu-ping Python Launcher Joe Abercrombie author The Shattered Sea Trilogy Anime PipX Standalone Paul pipx Black nox t

May 25, 2020 · 1h 16m

Ep 263: Easy Data Validation For Your Python Projects With Pydantic

Summary One of the most common causes of bugs is incorrect data being passed throughout your program. Pydantic is a library that provides runtime checking and validation of the information that you rely on in your code. In this episode Samuel Colvin explains why he created it, the interesting and useful ways that it can be used, and how to integrate it into your own projects. If you are tired of unhelpful errors due to bad data then listen now and try it out today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date. Machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. 
You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Samuel Colvin about Pydantic, a library for enforcing type hints at runtime Interview Introductions How did you get introduced to Python? Can you start by describing what Pydantic is and what motivated you to create it? What are the main use cases that benefit from Pydantic? There are a number of libraries in the Python ecosystem to handle various conventions or "best practices" for settings management. How does pydantic fit in that category and why might someone choose to use it over the other options? There are also a number of libraries for defining data schemas or validation such as Marshmallow and Cerberus. How does Pydantic compare to the available options for those cases? What are some of the challenges, whether technical or conceptual, that you face in building a library to address both of these areas? The 3.7 release of Python added built in support for dataclasses as a means of building containers for data with type validation. What are the tradeoffs of pydantic vs the built in dataclass functionality? How much overhead does pydantic add for doing runtime validation of the modelled data? In the documentation there is a nuanced point that you make about parsing vs validation and your choices as to what to support in pydantic. Why is that a necessary distinction to make? 
What are the limitations in terms of usage that you are accepting by choosing to allow for implicit conversion or potentially silent loss of precision in the parsed data? What are the benefits of punting on the strict validation of data out of the box? What has been your design philosophy for constructing the user facing API? How is Pydantic implemented and how has the overall architecture evolved since you first began working on it? What have you found to be the most challenging aspects of building a library for managing the consistency of data structures in a dynamic language? What are some of the strengths and weaknesses of Python’s type system? What is the workflow for a developer who is using Pydantic in their code? What are some of the pitfalls or edge cases that they might run into? What is involved in integrating with other libraries/frameworks such as Django for web development or Dagster for building data pipelines? What are some of the more advanced capabilities or use cases of Pydantic that are less obvious? What are some of the features or capabilities of Pydantic that are often overlooked which you think should be used more frequen

May 18, 2020 · 47 min

Ep 262: Managing Distributed Teams In The Age Of Remote Work

Summary More of us are working remotely than ever before, many with no prior experience with a remote work environment. In this episode Quinn Slack discusses his thoughts and experience of running Sourcegraph as a fully distributed company. He covers the lessons that he has learned in moving from partially to fully remote, the practices that have worked well in managing a distributed workforce, and the challenges that he has faced in the process. If you are struggling with your remote work situation then this conversation has some useful tips and references for further reading to help you be successful in the current environment. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required. 
Your host as usual is Tobias Macey and today I’m interviewing Quinn Slack about his experience managing a fully remote company and useful tips for remote work Interview Introductions How did you get introduced to Python? Can you start by giving an overview of the team structure at Sourcegraph? You recently moved to being fully remote. What was the motivating factor and how has it changed your personal workflow? What is your prior history with working remotely? team practices for visibility of progress impact of remote teams on how code is written and organized reducing review burden by writing clearer code structuring meetings when remote points of friction for remote developer teams benefits of being fully remote incentivizing documentation compensation structure Keep In Touch LinkedIn @sqs on Twitter sqs on GitHub Picks Tobias Joplin App Quinn Skunkworks by Ben Rich Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Sourcegraph Quinn’s Python Search Engine Sourcegraph Employee Handbook Gitlab Gitlab Handbook Zapier Zapier Guide To Remote Work Automattic Automattic Blog On Distributed Work Comments Showing Intent The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

May 11, 2020 · 48 min

Ep 261 Maintainable Infrastructure As Code In Pure Python With Pulumi

Full

Summary After you write your application, you need a way to make it available to your users. These days, that usually means deploying it to a cloud provider, whether that’s a virtual server, a serverless platform, or a Kubernetes cluster. To manage the increasingly dynamic and flexible options for running software in production, we have turned to building infrastructure as code. Pulumi is an open source framework that lets you use your favorite language to build scalable and maintainable systems out of cloud infrastructure. In this episode Luke Hoban, CTO of Pulumi, explains how it differs from other frameworks for interacting with infrastructure platforms, the benefits of using a full programming language for treating infrastructure as code, and how you can get started with it today. If you are getting frustrated with switching contexts when working between the application you are building and the systems that it runs on, then listen now and then give Pulumi a try. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. 
With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required. Your host as usual is Tobias Macey and today I’m interviewing Luke Hoban about building and maintaining infrastructure as code with Pulumi Interview Introductions How did you get introduced to Python? Can you start by describing the concept of "infrastructure as code"? What is Pulumi and what is the story behind it? Where does the name come from? How does Pulumi compare to other infrastructure as code frameworks, such as Terraform? What are some of the common challenges in managing infrastructure as code? How does use of a full programming language help in addressing those challenges? What are some of the dangers of using a full language to manage infrastructure? How does Pulumi work to avoid those dangers? Why is maintaining a record of the provisioned state of your infrastructure necessary, as opposed to relying on the state contained by the infrastructure provider? What are some of the design principles and constraints that developers should be considering as they architect their infrastructure with Pulumi? Can you describe how Pulumi is implemented? How does Pulumi manage support for multiple languages while maintaining feature parity across them? How do you manage testing and validation of the different providers? The strength of any tool is largely measured in the ecosystem that exists around it, which is one of the reasons that Terraform has been so successful. How are you approaching the problem of bootstrapping the community and prioritizing platform support? Can you talk through the workflow of working with Pulumi to build and maintain a proper infrastructure? What are some of the ways to approach testing of infrastructure code? 
What does the CI/CD lifecycle for infrastructure look like? What are the limitations of infrastructure as code? How do configuration management tools fit with frameworks such as Pulumi? The core framework of Pulumi is open source, and your business model is focused around a managed platform for tracking state. How are you approaching governance of the project to ensure its continued viability and growth? What are some of the most interesting, innovative, or unexpected design patterns that you have seen your users include in their infrastructure projects? When is Pulumi the wrong choice? What do you have planned for the future of Pulumi? Keep In Touch LinkedIn lukehoban on GitHub @lukehoban on Twitter Picks Tobias Bookshelf App Luke GoBinaries.com Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailin

May 4, 2020 · 1h 0m

Ep 260 Teaching Python Machine Learning

Full

Summary Python has become a major player in the machine learning industry, with a variety of widely used frameworks. In addition to the technical resources that make it easy to build powerful models, there is also a sizable library of educational resources to help you get up to speed. Sebastian Raschka’s contribution of the Python Machine Learning book has come to be widely regarded as one of the best references for newcomers to the field. In this episode he shares his experiences as an author, his views on why Python is the right language for building machine learning applications, and the insights that he has gained from teaching and contributing to the field. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Sebastian Raschka about his experiences writing the popular Python Machine Learning book Interview Introductions How did you get introduced to Python? How did you get started in machine learning? What were the concepts that you found most difficult in your career with statistics and machine learning? One of your notable contributions to the field is your book "Python Machine Learning". What inspired you to write the initial version? 
How did you approach the challenge of striking the right balance of depth, breadth, and accessibility for the content? What was your process for determining which aspects of machine learning to include? You have made 3 editions of the book from 2015 through December of 2019. In what ways has the book changed? What are the biggest changes to the ecosystem and approaches to ML in that timeframe? What are the fundamental challenges of developing machine learning projects that continue to present themselves? What new difficulties have arisen with the introduction of new technologies and the rise of deep learning? What are some of the ways that the Python language lends itself to analytical work? What are its shortcomings and how has the community worked around them? What do you see as the biggest risks to the popularity of Python in the data and analytics space? What are some of the common pitfalls that your readers and students face while learning about different aspects of machine learning? What are some of the industries that can benefit most from applications of machine learning? What are you most excited about in the applications or capabilities of machine learning? What are you most worried about? Keep In Touch Website @rasbt on Twitter rasbt on GitHub LinkedIn Picks Tobias Trolls World Tour Sebastian FFMPeg Normalize Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Python Machine Learning (Packt) Buy On Amazon (affiliate link) UW Madison Pascal Delphi R Perl Bioinformatics Seq Podcast Episode BioPython Podcast Episode CodeCademy Udacity CS101 Andrew Ng Coursera Support-Vector Machine Bayesian Statistics Matlab scikit-learn NumPy Pandas Podcast Episode Sebastian’s Blog Perceptron Heatmaps In R The Hundred Page Machine Learning Book by Andriy Burkov ImageNet Random Forest Logistic Regression XGBoost Theano Generative Adversarial Networks Is This Person Real / This Person Does Not Exist Reinforcement Learning AlphaGo AlphaStar Ray RLlib Open AI Google DeepMind Google Colab CUDA Julia Sebastian Raschka, Joshua Patterson, and Corey Nolet (2020). Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence. Information 2020, 11, 193 Swift Language Swift for TensorFlow Matplotlib Differential Privacy PrivacyNet YouTube recordings of Stat453: Introduction to Deep Learning and Generative Models (Spring 2020) ffm

Apr 28, 2020 · 49 min

Ep 259 Build The Next Generation Of Python Web Applications With FastAPI

Full

Summary Python has an embarrassment of riches when it comes to web frameworks, each with their own particular strengths. FastAPI is a new entrant that has been quickly gaining popularity as a performant and easy to use toolchain for building RESTful web services. In this episode Sebastián Ramirez shares the story of the frustrations that led him to create a new framework, how he put in the extra effort to make the developer experience as smooth and painless as possible, and how he embraces extensibility with lightweight dependency injection and a straightforward plugin interface. If you are starting a new web application today then FastAPI should be at the top of your list. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Sebastián Ramirez about FastAPI, a framework for building production ready APIs in Python 3 Interview Introductions How did you get introduced to Python? Can you start by describing what FastAPI is? What are the main frustrations that you ran into with other frameworks that motivated you to create an entirely new one? What are some of the main use cases that FastAPI is designed for? 
Many web frameworks focus on managing the end-to-end functionality of a website, including the UI. Why did you focus on just API capabilities? What are the benefits of building an API only framework? If you wanted to integrate a presentation layer, what would be involved in that effort? What API formats does FastAPI support? What would be involved in adding support for additional specifications such as GraphQL or JSON-LD? There are a huge number of web frameworks available just in the Python ecosystem. How does FastAPI fit into that landscape and why might someone choose it over the other options? Can you share your design philosophy for the project? What are your main sources of inspiration for the framework? You have also built the Typer CLI library which you refer to as the little sibling of FastAPI. How have your experiences building these two projects influenced their counterpart’s evolution? What are the benefits of incorporating type annotations into a web framework and in what ways do they manifest in its functionality? What is the workflow for a developer building a complex application in FastAPI? Can you describe how FastAPI itself is architected and how its design has evolved since you first began working on it? What are the extension points that are available for someone to build plugins for FastAPI? What are some of the challenges that you have faced in building an async framework that is leveraging the new ASGI specification? What are some sharp edges that users should keep an eye out for? What are some unique or underutilized features of FastAPI that users might not be aware of? What are some of the most interesting, unexpected, or innovative ways that you have seen FastAPI used? When is FastAPI the wrong choice? What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of building and maintaining FastAPI? What do you have planned for the future of the project? Keep In Touch @tiangolo on Twitter. 
@tiangolo on GitHub. Picks Tobias Once Upon A Time TV Show Sebastián Cloud Atlas Movie Isaac Asimov’s robot short stories Python devtools debug function async compatible requests with HTTPX RescueTime for automatic time tracking Joplin for Notes Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links FastAPI Typer Typer CLI FastAPI Alternatives, Inspiration and Comparisons Explosion’s spaCy Explosion’s Prodigy Starlette Pydantic Uvicorn Hypercorn fastapi-utils

Apr 20, 2020 · 58 min

Ep 258 Distributed Computing In Python Made Easy With Ray

Full

Summary Distributed computing is a powerful tool for increasing the speed and performance of your applications, but it is also a complex and difficult undertaking. While performing research for his PhD, Robert Nishihara ran up against this reality. Rather than cobbling together another single purpose system, he built what ultimately became Ray to make scaling Python projects to multiple cores and across machines easy. In this episode he explains how Ray allows you to scale your code easily, how to use it in your own projects, and his ambitions to power the next wave of distributed systems at Anyscale. If you are running into scaling limitations in your Python projects for machine learning, scientific computing, or anything else, then give this a listen and then try it out! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Robert Nishihara about Ray, a framework for building and running distributed applications and machine learning Interview Introductions How did you get introduced to Python? Can you start by describing what Ray is and how the project got started? How did the environment of the RISE lab factor into the early design and development of Ray? 
What are some of the main use cases that you were initially targeting with Ray? Now that it has been publicly available for some time, what are some of the ways that it is being used which you didn’t originally anticipate? What are the limitations for the types of workloads that can be run with Ray, or any edge cases that developers should be aware of? For someone who is building on top of Ray, what is involved in either converting an existing application to take advantage of Ray’s parallelism, or creating a greenfield project with it? Can you describe how Ray itself is implemented and how it has evolved since you first began working on it? How does the clustering and task distribution mechanism in Ray work? How does the increased parallelism that Ray offers help with machine learning workloads? Are there any types of ML/AI that are easier to do in this context? What are some of the additional layers or libraries that have been built on top of the functionality of Ray? What are some of the most interesting, challenging, or complex aspects of building and maintaining Ray? You and your co-founders recently announced the formation of Anyscale to support the future development of Ray. What is your business model and how are you approaching the governance of Ray and its ecosystem? What are some of the most interesting or unexpected projects that you have seen built with Ray? What are some cases where Ray is the wrong choice? What do you have planned for the future of Ray and Anyscale? Keep In Touch Website @robertnishihara on Twitter robertnishihara on GitHub Picks Tobias D&D Castle Ravenloft board game One Deck Dungeon Robert The Everything Store Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Ray Anyscale UC Berkeley RISELab MATLAB Deep Learning Theano Tensorflow PyTorch Podcast Episode Philip Moritz Reinforcement Learning Hyperparameter Tuning IPython Parallel AMPLab Apache Spark Data Engineering Podcast Episode Actor Model Horovod(?) Flink Data Engineering Podcast Episode Spark Streaming Dask Data Engineering Podcast Episode gRPC Tune Rust C++ C Apache Arrow Wes McKinney Podcast Interview DataBricks MongoDB Elastic Data Engineering Podcast Episode Confluent Embarrassingly Parallel Ant Financial Flame Graph The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Apr 14, 2020 · 41 min

Ep 257 Building The Seq Language For Bioinformatics

Full

Summary Bioinformatics is a complex and computationally demanding domain. The intuitive syntax of Python and extensive set of libraries make it a great language for bioinformatics projects, but it is hampered by the need for computational efficiency. Ariya Shajii created the Seq language to bridge the divide between the performance of languages like C and C++ and the ecosystem of Python with built-in support for commonly used genomics algorithms. In this episode he describes his motivation for creating a new language, how it is implemented, and how it is being used in the life sciences. If you are interested in experimenting with sequencing data then give this a listen and then give Seq a try! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. 
Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Ariya Shajii about Seq, a programming language built for bioinformatics and inspired by Python Interview Introductions How did you get introduced to Python? Can you start by describing what Seq is and your motivation for creating it? What was lacking in other languages or libraries for your use case that is made easier by creating a custom language? If someone is already working in Python, possibly using BioPython, what might motivate them to consider migrating their work to Seq? Can you give an impression of the scope and nature of the tasks or projects that a biologist or geneticist might build with Seq? What was your process for identifying and prioritizing features and algorithms that would be beneficial to the target audience? For someone using Seq can you describe their workflow and how it might differ from performing the same task in Python? How is Seq implemented? What are some of the features that are included to simplify the work of bioinformatics? What was your process of designing the language and runtime? How has the scope or direction of the project evolved since it was first conceived? What impact do you anticipate Seq having on the domain of bioinformatics and genomics? What have you found to be the most interesting, unexpected, and/or challenging aspects of building a language for this problem domain? What is in store for the future of Seq? Keep In Touch arshajii on GitHub Website Picks Tobias Board Games Labyrinth Boardgame Board Game Geek Ariya Breakthrough documentary Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Seq MIT CSAIL Bioinformatics LLVM Intermediate Representation MatLab Moore’s Law BioPython Smith Waterman Algorithm Hamming Distance Pattern Matching in Functional Programming SIMD == Single Instruction Multiple Data Computational Genomics Phylogenetics Sequence Read Archive public data set Google Cloud Life Sciences The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Apr 7, 2020 · 36 min

Ep 256 An Open Source Toolchain For Natural Language Processing From Explosion AI

Full

Summary The state of the art in natural language processing is a constantly moving target. With the rise of deep learning, previously cutting edge techniques have given way to robust language models. Through it all the team at Explosion AI have built a strong presence with the trifecta of SpaCy, Thinc, and Prodigy to support fast and flexible data labeling to feed deep learning models and performant and scalable text processing. In this episode founder and open source author Matthew Honnibal shares his experience growing a business around cutting edge open source libraries for the machine learning development process. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Matthew Honnibal about the Thinc and Prodigy tools and an update on SpaCy Interview Introductions How did you get introduced to Python? Can you start by giving an overview of your mission at Explosion? We spoke previously about your work on SpaCy. What has changed in the past 3 1/2 years? How have recent innovations in language models such as BERT and GPT-2 influenced the direction or implementation of the project? When I last looked SpaCy only supported English and German, but you have added several new languages. What are the most challenging aspects of building the additional models? What would be required for supporting symbolic or right-to-left languages? How has the ecosystem for language processing in Python shifted or evolved since you first introduced SpaCy? Another project that you have released is Prodigy to support labelling of datasets. Can you talk through the motivation for creating it and describe the workflow for someone using it? What was lacking in the other annotation tools that you have worked with that you are trying to solve for in Prodigy? What are some of the most challenging or problematic aspects of labelling data sets for use in machine learning projects? What is a typical scale of data that can be reasonably handled by an individual or small team working with Prodigy? At what point do you find that it makes sense to use a labeling service rather than generating the labels yourself? Your most recent project is Thinc for building and using deep learning models. What was the motivation for creating it and what problem does it solve in the ecosystem? How does its design and usage compare to other deep learning frameworks such as PyTorch and Tensorflow? 
How does it compare to projects such as Keras that abstract across those frameworks? How do the SpaCy, Prodigy, and Thinc libraries work together? What are some of the biggest challenges that you are facing in building open source tools to meet the needs of data scientists and machine learning engineers? What are some of the most interesting or impressive projects that you have seen built with the tools your team is creating? What do you have planned for the future of Explosion, SpaCy, Prodigy, and Thinc? Keep In Touch LinkedIn @honnibal on Twitter honnibal on GitHub Picks Tobias Onward movie Matthew Coronavirus Preparedness Ray Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To

Mar 30, 2020 · 51 min

Ep 255 A Flexible Open Source ERP Framework To Run Your Business

Full

Summary Running a successful business requires some method of organizing the information about all of the processes and activity that take place. Tryton is an open source, modular ERP framework that is built for the flexibility needed to fit your organization, rather than requiring you to model your workflows to match the software. In this episode core developers Nicolas Évrard and Cédric Krier are joined by avid user Jonathan Levy to discuss the history of the project, how it is being used, and the myriad ways that you can adapt it to suit your needs. If you are struggling to keep a consistent view of your business and ensure that all of the necessary workflows are being observed then listen now and give Tryton a try. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Nicolas Évrard, Cédric Krier, and Jonathan Levy about Tryton Interview Introductions How did you get introduced to Python? Can you start by describing what Tryton is and how it got started? What kinds of businesses is Tryton most suited to? What kinds of businesses is Tryton not a good fit for? Within a business, who are the primary users of Tryton? Can you talk through a typical workflow for interacting with Tryton? What are some of the most complex or challenging aspects of modeling a business while maintaining a high degree of customizability? Can you describe how Tryton is architected and how its design has evolved since it was first started? If you were to start over today, what would you do differently? There are a number of plugins for Tryton. What kinds of functionality can be customized using the available interfaces? What is the process for building a custom module for Tryton? How do you manage sustainability of the Tryton project? Given the criticality of the Tryton platform, how do you approach ongoing stability and security of the project? What is involved in deploying and maintaining an installation of Tryton? What are some of the most interesting, innovative, or unexpected ways that you have seen Tryton used? What is in store for the future of Tryton? 
Keep In Touch Nicolas nicoe on GitHub @nicoe on Twitter Cédric @cedrickrier on Twitter cedk on GitHub Jonathan LinkedIn Picks Tobias Audio Books Audible free trial (Affiliate Link) Overdrive – ebooks and audiobooks from your local library Public Domain Audiobooks Nicolas Civilization VI FreeCiv The 3 Body Problem Cédric Valérian and Laureline Jonathan Roil.com Links Tryton B2CK Tryton Foundation Advocate Consulting Legal Group Scheme Lisp Belgium EuroPython Conference Plone Zope VBA (Visual Basic for Applications) Django Odoo ERP == Enterprise Resource Planning Small/Medium Enterprise (SME) GTK (Gnome ToolKit) 3-Tier Application Cookiecutter Tryton Module Cookiecutter Tryton Repository Docker GNU Health Nereid The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Mar 23, 2020 1h 7m

Ep 254 Getting A Handle On Portable C Extensions With hpy

Full

Summary One of the driving factors of Python’s success is the ability for developers to integrate with performant languages such as C and C++. The challenge is that the interface for those extensions is specific to the main implementation of the language. This contributes to difficulties in building alternative runtimes that can support important packages such as NumPy. To address this situation a team of developers are working to create the hpy project, a new interface for extension developers that is standardized and provides a uniform target for multiple runtimes. In this episode Antonio Cuni discusses the motivations for creating hpy, how it benefits the whole ecosystem, and ways to contribute to the effort. This is an exciting development that has the potential to unlock a new wave of innovation in the ways that you can run your Python code. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. 
Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Antonio Cuni about hpy, a project aiming to reimagine the C API for Python Interview Introductions How did you get introduced to Python? Can you start by describing what the hpy project is and how it got started? What are the goals for the project? Who else is involved? How much engagement have you had with CPython core contributors or the steering council? Who are the consumers of the current C API for the CPython implementation? What are some of the pain points or shortcomings for those consumers? What impact does that have for users of a given library that leverages C extensions? Can you talk through the structure of the hpy project? What are some of the design challenges that you are facing for determining the external API? What is involved in integrating the hpy interface into alternate runtimes such as PyPy or RustPython? 
What is the potential or observed performance impact for libraries that currently rely on the existing C API? How has the vision and scope of this project been updated as you have gotten further along in the implementation? What are the downstream impacts that you anticipate in projects such as PyPy and Cython? What have you found to be the most challenging or contentious aspects of implementing hpy so far? What are some of the most interesting/unexpected/useful lessons that you have learned while working on hpy? What do you have planned for the near to medium term for hpy? Keep In Touch antocuni on GitHub Website @antocuni on Twitter Picks Tobias Poetry Antonio Collapse: How Societies Choose To Fail Or Succeed by Jared Diamond Links hpy PyPy Alex Martelli Podcast Interview Python C Extensions EuroPython Victor Stinner Cython Podcast Episode Armin Rigo NumPy ultrajson GIL == Global Interpreter Lock RustPython Podcast Episode GraalPython hpy-rust The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Mar 16, 2020 35 min

Ep 253 Open Source Machine Learning On Quantum Computers With Xanadu AI

Full

Summary Quantum computers promise the ability to execute calculations at speeds several orders of magnitude faster than what we are used to. Machine learning and artificial intelligence algorithms require fast computation to churn through complex data sets. At Xanadu AI they are building libraries to bring these two worlds together. In this episode Josh Izaac shares his work on the Strawberry Fields and Penny Lane projects that provide both high and low level interfaces to quantum hardware for machine learning and deep neural networks. If you are itching to get your hands on the coolest combination of technologies, then listen now and then try it out for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. 
Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Josh Izaac about the work that he is doing at Xanadu AI to make it easier to build applications for quantum processors. Interview Introductions How did you get introduced to Python? Can you start by describing what you are working on at Xanadu AI? How do the specifics of your quantum hardware influence the way in which developers need to build their algorithms? (e.g. as compared to D-Wave) What are some of the underlying principles that developers need to understand in order to take full advantage of the capabilities provided by quantum processors? Can you outline the different components and libraries that you are building to simplify the work of building machine learning/AI projects for quantum processors? What’s the story behind all of the Beatles references? How do the different libraries fit together? What are some of the workloads and use cases that you and your customers are focused on? What are some of the most challenging aspects of designing a library that is accessible to developers while being able to take advantage of the underlying hardware? 
How does the workflow for machine learning on quantum computers differ from what is being done in classical environments? Given the magnitude of computational power and data processing that can be achieved in a quantum processor it seems that there is a potential for small bugs to have disproportionately large impacts. How can developers identify and mitigate potential sources of error in their algorithms? For someone who is building an application or algorithm to be executed on a Xanadu processor, what does their workflow look like? What are some of the common errors or misconceptions that you have seen in customer code? Can you describe the design and implementation of the Penny Lane and Strawberry Fields libraries and how they have evolved since you first began working on them? What are some of the most ambitious or exciting use cases for quantum systems that you have seen? How are you using the computational capabilities of your platform to feed back into the research and design of successive generations of hardwar

Mar 10, 2020 57 min

Ep 252 The Advanced Python Task Scheduler

Full

Summary Most long-running programs have a need for executing periodic tasks. APScheduler is a mature and open source library that provides all of the features that you need in a task scheduler. In this episode the author, Alex Grönholm, explains how it works, why he created it, and how you can use it in your own applications. He also digs into his plans for the next major release and the forces that are shaping the improved feature set. Spare yourself the pain of triggering events at just the right time and let APScheduler do it for you. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Alex Grönholm about APScheduler, a library for scheduling tasks in your Python projects. Interview Introductions How did you get introduced to Python? Can you start by describing what APScheduler is and the main use cases that APScheduler is designed for? What was your motivation for creating it? What is the workflow for integrating APScheduler into an application? The documentation says not to run more than one instance of the scheduler. What are some strategies for scaling schedulers? What are some common architectures for applications that take advantage of APScheduler? What are some potential pitfalls that developers should be aware of? Can you describe how APScheduler is implemented and how its design has evolved since you first began working on it? What have you found to be the most complex or challenging aspects of building or using a scheduling framework? What are some of the most interesting/innovative/unexpected ways that you have seen APScheduler used? What are some of the features or capabilities that you have consciously left out? What design strategies or features of APScheduler are often overlooked or underappreciated? What are some of the most useful or interesting lessons that you have learned while building and maintaining APScheduler? When is APScheduler the wrong choice for managing task execution? What do you have planned for the future of the project? 
Keep In Touch agronholm on GitHub Picks Tobias The Data Exchange Podcast Alex Tenacity Links APScheduler PHP Java ECMAScript Celery ERP == Enterprise Resource Planning Cron Daemon RPyC Zookeeper Data Engineering Podcast Episode RethinkDB Daylight Saving Time Falsehoods Programmers Believe About Time PyTZ Celery Beats Asphalt Framework Podcast Episode AnyIO Twisted Podcast Episode Py2EXE PyInstaller The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Mar 2, 2020 33 min

Ep 251 Reducing The Friction Of Embedded Software Development With PlatformIO

Full

Summary Embedded software development is a challenging endeavor due to a fragmented ecosystem of tools. Ivan Kravets experienced the pain of programming for different hardware platforms when embroiled in a home automation project. As a result he built the PlatformIO ecosystem to reduce the friction encountered by engineers working with multiple microcontroller architectures. In this episode he describes the complexities associated with targeting multiple platforms, the tools that PlatformIO offers to simplify the workflow, and how it fits into the development process. If you are feeling the pain of working with different editing environments and build toolchains for various microcontroller vendors then give this interview a listen and then try it out for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include Strata Data in San Jose and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Ivan Kravets about PlatformIO, an open source ecosystem for IoT development including a cross-platform IDE, unified debugger, remote unit testing, and firmware updates. Interview Introductions How did you get introduced to Python? Can you start by describing what PlatformIO is? What was your motivation for creating it? What are the aspects of embedded development that keep you interested and engaged in this space? What are some of the types of projects that someone might use PlatformIO to build? What are some of the common challenges that a developer might encounter when working on embedded systems? What are the additional complexities that get introduced as more hardware targets get added to a project? What is the workflow for someone using PlatformIO for embedded systems development? What are the different elements of PlatformIO and how do they simplify the work of building embedded systems projects? How is PlatformIO implemented and how has the system design evolved since you first began working on it? What was your reason for selecting Python as the implementation language? If you were to start over today what would you do differently? How has the embedded hardware and software landscape changed since you first started work on PlatformIO? How has that impacted your product direction? How do developers handle testing and validation of their applications? How does PlatformIO help with updating deployed devices with new firmware? What have been some of the most interesting/unexpected/innovative projects that you have seen built with PlatformIO? What have been some of the most interesting/unexpected/challenging aspects of building and maintaining PlatformIO? 
How are you approaching sustainability of the project and business? What do you have planned for the future of PlatformIO? Keep In Touch LinkedIn Website ivankravets on GitHub @ikravets on Twitter Picks Tobias UMass Amherst Making Electricity From Thin Air Ivan Don’t focus on the money side of your project, just focus on building a great product. Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links PlatformIO Ukraine Home Automation Home

Feb 25, 2020 46 min

Ep 250 APIs, Sustainable Open Source and The Async Web With Tom Christie

Full

Summary Tom Christie is probably best known as the creator of Django REST Framework, but his contributions to the state of the web in Python extend well beyond that. In this episode he shares his story of getting involved in web development, his work on various projects to power the asynchronous web in Python, and his efforts to make his open source contributions sustainable. This was an excellent conversation about the state of asynchronous frameworks for Python and the challenges of making a career out of open source. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Tom Christie about the Encode organization and the work he is doing to drive the state of the art in async for Python Interview Introductions How did you get introduced to Python? Can you start by describing what the Encode organization is and how it came to be? What are some of the other approaches to funding and sustainability that you have tried in the past? What are the benefits to the developers provided by an organization which you were unable to achieve through those other means? What benefits are realized by your sponsors as compared to other funding arrangements? What projects are part of the Encode organization? How do you determine fund allocation for projects and participants in the organization? What is the process for becoming a member of the Encode organization and what benefits and responsibilities does that entail? A large number of the projects that are part of the organization are focused on various aspects of asynchronous programming in Python. Is that intentional, or just an accident of your own focus and network? For those who are familiar with Python web programming in the context of WSGI, what are some of the practices that they need to unlearn in an async world, and what are some new capabilities that they should be aware of? Beyond Encode and your recent work on projects such as Starlette you are also well known as the creator of Django Rest Framework. How has your experience building and growing that project influenced your current focus on a technical, community, and professional level? Now that Python 2 is officially unsupported and asynchronous capabilities are part of the core language, what future directions do you foresee for the community and ecosystem? 
What are some areas of potential focus that you think are worth more attention and energy? What do you have planned for the future of Encode, your own projects, and your overall engagement with the Python ecosystem? Keep In Touch Website tomchristie on GitHub @_tomchristie on Twitter Picks Tobias Maleficent: Mistress of Evil Abominable Tom The Lobster The Master And His Emissary by Iain McGilchrist Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Encode Django Rest Framework Starlette Zope Django Django Piston Django Tastypie Andrew Godwin ASGI Django Channels Podcast Episode Flask Pyramid Sentry Podcast Episode Tideli

Feb 18, 2020 43 min

Ep 249 Learning To Program Python By Building Video Games With Arcade

Full

Summary Video games have been a vehicle for learning to program since the early days of computing. Continuing in that tradition, Paul Craven created the Arcade library as a modern alternative to PyGame for use in his classroom. In this episode he explains his motivations for starting a new framework for video game development, his view on the benefits of games in computer education, and how his students and the broader community are using it to build interesting and creative projects. If you are looking for a way to get new programmers engaged, or just want to experiment with building your own games, then this is the conversation for you. Give it a listen and then give Arcade a try for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Paul Craven about Arcade, an easy-to-learn Python library for creating 2D video games. Interview Introductions How did you get introduced to Python? Can you start by describing what Arcade is? What inspired you to begin working on it? Who is your primary audience? As an educator, what have you found to be most effective about using games as a vehicle for teaching programming? What elements of programming or computer science do you have difficulty in addressing within the context of a video game? For someone who wants to move on from working on games to something like web development or data analytics, what elements of software design and structure are easily translated to other domains? Can you describe how Arcade is implemented and how the architecture has evolved since you first began working on it? If you were to start over today, what would you do differently? What have you found to be the most interesting/unexpected/challenging aspects of building and maintaining Arcade? What are some of the most interesting/innovative/unexpected ways that you have seen Arcade used? When is Arcade the wrong platform, or at what point does someone need to move on from Arcade? What do you have planned for the future of Arcade? Keep In Touch @professorcraven on Twitter pvcraven on GitHub Faculty Page Picks Tobias Ori And The Blind Forest Paul Fahrenheit 451 by Ray Bradbury “Mistakes can be profited by. Man, when I was young I showed my ignorance in people’s faces. They beat me with sticks. By the time I was forty my blunt instrument had been honed to a fine cutting point for me. 
If you hide your ignorance, no one will hit you and you’ll never learn.” Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Arcade Simpson College PyGame SDL OpenGL Unity Unreal Engine Godot Automate The Boring Stuff With Python Minesweeper Pyglet Spatial Hashing Tiled Map Editor Python Type Hints F Strings Data Classes PyMunk FFMPEG PyWeek Podcast Episode Python Discord Arcade Enhancement Requests The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Feb 11, 2020 41 min

Ep 248 Build Your Own Personal Data Repository With Nostalgia

Full

Summary The companies that we entrust our personal data to are using that information to gain extensive insights into our lives and habits while not always making those findings accessible to us. Pascal van Kooten decided that he wanted to have the same capabilities to mine his personal data, so he created the Nostalgia project to integrate his various data sources and query across them. In this episode he shares his motivation for creating the project, how he is using it in his day-to-day, and how he is planning to evolve it in the future. If you’re interested in learning more about yourself and your habits using the personal data that you share with the various services you use then listen now to learn more. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Pascal van Kooten about his Nostalgia project, a nascent framework for taking control of your personal data. Interview Introductions How did you get introduced to Python? Can you start by describing your mission with the Nostalgia project? How did the topic of personal data management come to be a focus for you? What other options exist for users to be able to collect and manage their own data? What capabilities were lacking in those options that made you feel the need to build Nostalgia? What is your target audience for this set of projects? How are you using Nostalgia in your own life? What are some of the insights that you have been able to gain as a result of integrating your data with Nostalgia? Can you describe the current architecture of the Nostalgia platform and how it has evolved since you began work on it? What are some of the assumptions that you are using to direct the focus of your development and interaction design? What is the minimum number of data sources needed to make this useful? What are some of the challenges that you are facing in collating and integrating different data sources? What are some of the drawbacks of using something like Nostalgia for managing your personal data? What are some of the most interesting/challenging/unexpected aspects of your work on Nostalgia so far? What do you have planned for the future of the project? Keep In Touch Website LinkedIn @kootenpv on Twitter kootenpv on GitHub Picks Tobias Jumanji: The Next Level Jumanji Pascal Bup Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links timeliner qs_ledger Nostalgia Shrynk Whereami R Language DuckDuckGo Caddy Perkeep Dark Programming Language Pandas Podcast Episode Neo4j Pandas Extension Arrays Podcast Episode Parquet Data Engineering Podcast Episode ElectronJS Zincbase The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Feb 4, 2020 · 32 min

Ep 247: Simplifying Social Login For Your Web Applications

Full

Summary A standard feature in most modern web applications is the ability to log in or register using accounts that you already own on other sites such as Google, Facebook, or Twitter. Building your own integrations for each service can be complex and time consuming, distracting you from the features that you and your users actually care about. Fortunately the Python social auth library makes it easy to support third party authentication with a large and growing number of services with minimal effort. In this episode Matías Aguirre discusses his motivation for creating the library, how he has designed it to allow for flexibility and ease of use, and the benefits of delegating identity and authentication to third parties rather than managing passwords yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Matías Aguirre about Python social auth and the complexities of third-party authentication Interview Introductions How did you get introduced to Python? Can you start by describing what the Python social auth project is and your motivation for starting it? Why might someone want to integrate with or rely on a third-party identity provider in their projects? What are some of the tradeoffs or drawbacks of implementing Can you describe the current architecture of the library and how it has evolved since you first began working on it? There are a number of pre-built integrations with different web frameworks in the social auth github organization, but Django is the only one that has seen any commits recently. What are the contributing factors for that state of affairs? There are a number of authentication protocols that you support. What are the common capabilities that they each support and what are some of the more challenging differences between them? How have you implemented the interface for plugging different authentication mechanisms to allow for the variation between them while keeping the library code maintainable? What is involved in adding support for a new authentication provider or protocol? Many times authorization and authentication are conflated or used interchangeably. How does Python social auth address those concerns and what are the limitations of different mechanisms for defining permissions? For someone who is using Python social auth, what is the workflow for integrating it with their application as a consumer? 
What are some of the most interesting/unexpected/innovative ways that you have seen Python social auth used? What are some of the most interesting/useful/unexpected lessons that you have learned in the process of building and maintaining Python social auth? When is Python social auth more effort than it’s worth? What do you have planned for the future of the project? Keep In Touch omab on GitHub Website @linuxaddict on Twitter LinkedIn Picks Tobias Joker movie Matías Sanic asynchronous web framework Star Trek Picard TV series Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace

Jan 27, 2020 · 34 min

Ep 246: Building A Business On Building Data Driven Businesses

Full

Summary In order for an organization to be data driven they need easy access to their data and a simple way of sharing it. Arik Fraimovich built Redash as a way to address that need by connecting to any data source and building attractive dashboards on top of them. In this episode he shares the origin story of the project, his experiences running a business based on open source, and the challenges of working with data effectively. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Arik Fraimovich about Redash, an open source business intelligence platform that helps you make sense of your data. Interview Introductions How did you get introduced to Python? Can you start by describing what Redash is and its origin story? What are the primary ways that it is used? The business intelligence market is quite mature and has many commercial and open source projects to choose from. What are the aspects of Redash that have allowed you to be successful? What would you consider to be your closest competitors? What was your background with data before starting on Redash? What are some of the most notable lessons that you have learned about business intelligence since starting the project? How has the landscape for business intelligence and data analysis changed since you began the project? Beyond just accessing data, Redash focuses on enabling visualization of the results. What types of visualizations do you support and how do you support users in choosing the most effective ways to represent the information? What are some of the common challenges that your users and customers encounter when communicating with data? One of the critical aspects of enabling data access in an organization is the ability to collaborate on asking and answering questions. How do you approach that challenge in Redash? How is Redash implemented and how has the overall design and architecture evolved since you first started working on it? How do you manage the complexity of supporting so many different data sources? If you were to start over today, what would you do differently? Beyond the code of Redash, you also have a business around providing it as a hosted service. What are some of the most interesting, challenging, or unexpected lessons that you have learned in the process of building and growing that service? 
How do you approach the direction and governance of the open source project and balance that against the wants and needs of the community? What are some of the most interesting, innovative, or unexpected ways that you have seen Redash used? When is Redash the wrong platform to use? What do you have planned for the future of the Redash business and project? Keep In Touch arikfr on GitHub Website @arikfr on Twitter Picks Tobias Data Engineering Podcast Arik Peewee ORM Amazon ECS Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Redash Google App Engine EverythingMe Redshift Metabase Data Engineering Podcast In

Jan 20, 2020 · 41 min

Ep 245: Using Deliberate Practice To Level Up Your Python

Full

Summary An effective strategy for teaching and learning is to rely on well structured exercises and collaboration for practicing the material. In this episode long time Python trainer Reuven Lerner reflects on the lessons that he has learned in the 5 years since his first appearance on the show, how his teaching has evolved, and the ways that he has incorporated more hands-on experiences into his lessons. This was a great conversation about the benefits of being deliberate in your approach to ongoing education in the field of technology, as well as having some helpful references for ways to keep your own skills sharp. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m pleased to welcome back Reuven Lerner to talk about the benefits of deliberate practice for learning and improving programming skills Interview Introductions How did you get introduced to Python? In your first appearance on the show back in episode 2 we talked about your experience as a Python trainer. How has your teaching style evolved in the past 5 years? How has the focus and scope of your training changed in that time period? What have you found to be some of the most helpful and effective tactics in your training? From the learner perspective, what are some strategies that you recommend for retaining information, particularly in the context of gaining technical knowledge? In-person training vs. real-time online training vs. recorded videos, advantages and disadvantages of each. Blended learning, in which we combine aspects of the above Beyond in-person training, what are your preferred methods for learning and maintaining new skills? What is deliberate practice and how does it differ from the habits that many of us might default to? What are some of the resources that you provide for students of your trainings for practicing? What are some of the outside resources which you have found most useful or effective? 
Keep In Touch Website Blog @reuvenmlerner on Twitter Picks Tobias The Manager’s Path by Camille Fournier Reuven Lab Rats: How Silicon Valley Made Work Miserable For The Rest Of Us by Dan Lyons Links Deliberate Practice Reuven On Episode 2 CGI == Common Gateway Interface Language Phrasebook Jupyter Notebook Walrus Operator PyCon 2019 Presentation Python Bytes List Comprehension Weekly Python Exercise Python Morsels PyBites Practice Your Python Python Workout book by Reuven Lerner PyTest Brian Okken The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Jan 13, 2020 · 48 min

Ep 244: Checking Up On Python's Role in DevOps

Full

Summary Python has been part of the standard toolkit for systems administrators since it was created. In recent years there has been a shift in how servers are deployed and managed, and how code gets released due to the rise of cloud computing and the accompanying DevOps movement. The increased need for automation and speed of iteration has been a perfect use case for Python, cementing its position as a powerful tool for operations. In this episode Moshe Zadka reflects on his experiences using Python in a DevOps context and the book that he wrote on the subject. He also discusses the difference in what aspects of the language are useful as an introduction for system operators and where they can continue their learning. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Moshe Zadka about his recent book DevOps In Python. Interview Introductions How did you get introduced to Python? How did you gain experience in managing systems with Python? What is DevOps? What makes Python a good fit for managing systems? What is unique to the DevOps/sysadmin domain in terms of what software is used and what aspects of the language are useful? What are the main ways that Python is used for managing servers and infrastructure? What are some of the most notable changes in the ways that Python is used for server administration over the past several years? How has Python 3 impacted the lives of operators? What was your motivation for writing a book about Python focused specifically on DevOps and server automation? What are some of the tools that have been replaced in your own workflow over the years? Keep In Touch Website LinkedIn @moshezadka on Twitter Picks Tobias SaltStack Podcast Episode Moshe Automat Podcast Episode Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links DevOps In Python SurveyMonkey Twisted Episode DevOps B=hive CI/CD Amoeba OS Python OS module Requests Canary Deployments Post Mortem Bash Shell Z Shell Linux Unix AWS Boto3 GitHub GitLab Debian Ubuntu CentOS Pip Poetry Pipenv pip-tools dh-virtualenv Docker Hynek Schlawack Presentation On Building Docker Images Ansible SaltStack Chef Puppet The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Jan 6, 2020 · 33 min

Ep 243: Python's Built In IDE Isn't Just Sitting IDLE

Full

Summary One of the first challenges that new programmers are faced with is figuring out what editing environment to use. For the past 20 years, Python has had an easy answer to that question in the form of IDLE. In this episode Tal Einat helps us explore its history, the ways it is used, how it was built, and what is in store for its future. Even if you have never used the IDLE editor yourself, it is still an important piece of Python’s strength and history, and this conversation helps to highlight why that is. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Tal Einat about the IDLE editor for Python, its history, and what is in store for its future. Interview Introductions How did you get introduced to Python? For anyone who hasn’t used it, can you start by explaining what IDLE is? IDLE has been part of the standard library for Python for quite some time now. What was the motivation for adding it to the core of Python? How has the evolution of our computing environment changed the motivation for maintaining IDLE and the use cases that it is most beneficial for? What are the benefits of including a basic editor in the default distribution of Python? What are some of the ways in which it is often used? What are the limiting factors that lead users to other IDEs or text editors? What role do you think IDLE has played in the growth of Python? What was your motivation for getting involved as a Python contributor and working on the implementation of IDLE? How is IDLE implemented and what are some of the ways that it has evolved since its initial introduction? How well has the code for IDLE aged as new features and capabilities are added to the language? What are some of the integration points available for extending IDLE? What are some of the most interesting or innovative ways that you have seen IDLE used and extended? What is planned for the future of the IDLE module? Keep In Touch LinkedIn @TalEinat on Twitter taleinat on GitHub Picks Tobias Mr. Robot Tal Captain Fantastic The Lesson To Unlearn article by Paul Graham Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links IDLE FullProof Israel Mandatory Military Service Eric Idle Monty Python Visual Studio IDLE-fork Vi Emacs Sublime Text Visual Studio Code REPL == Read Eval Print Loop Tcl/Tk Tkinter RPC == Remote Procedure Call IDLEx VPython Podcast Episode Python Turtle SVN (Subversion) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Dec 23, 2019 · 36 min

Ep 242: Riding The Rising Tides Of Python

Full

Summary The past two decades have seen massive growth in the language, community, and ecosystem of Python. The career of Pete Fein has occurred during that same period and his use of the language has paralleled some of the major shifts in focus that have occurred. In this episode he shares his experiences moving from a trader writing scripts, through the rise of the web, to the current renaissance in data. He also discusses how his engagement with the community has evolved, why he hasn’t needed to use any other languages in his career, and what he is keeping an eye on for the future. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Pete Fein about his voyage on the rising tide of Python Interview Introductions How did you get introduced to Python? I understand that you have used Python exclusively in your professional life. What other languages have you been exposed to and taken inspiration from? What are some of the projects that you have been involved with which you are most proud of? How has the community and your involvement with it changed over the years? In your experience, how has the growth in the size and breadth of the community impacted its accessibility to newcomers? You have been using Python and participating in the community for quite some time now, and there have been significant changes in both within that period. What are some of the most significant technological shifts that you have noticed and been a part of? How have those shifts influenced the direction of your career? As you have moved through the different phases of your career with different areas of focus, what are some of the aspects of the work which have remained constant? What have been the biggest differences across the different problem domains? What are some of the aspects of the language or its ecosystem which you feel are lacking or don’t get enough attention? What are some of the industry trends which you are keeping a close eye on and how do you anticipate them influencing the direction of the community and your career in the upcoming years? Keep In Touch Consulting Website Personal Website @wearpants on Twitter LinkedIn wearpants on GitHub Picks Tobias Matomo Analytics Pete FastAPI PyDantic Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Chicago Scheme Structure and Interpretation of Computer Programs David Beazley Podcast Episode Twiggy logging library Jesse Noller Log4J Debian Red Hat StructLog Elliot Podcast Episode Logbook Armin Ronacher Podcast Episode Pittsburgh Python Meetup Boltons library Elixir ChiPy Chicago Python user group Subversion Ruby On Rails Django Data Engineering Data Engineering Podcast Internet of Things Pittsburgh Artificial Pancreas Project Eric Holscher Read The Docs Podcast Episode Circuit Playground Express CircuitPython Podcast Episode Rust Language PyOhio PyGotham The intro and outro music is from Requiem for

Dec 16, 2019 · 44 min

Ep 241: Debugging Python Projects With PySnooper

Full

Summary
Debugging is a painful but necessary practice in software development. The tools that are available in Python range from the built-in debugger, to tools integrated with your coding environment, to the trusty print function. In this episode Ram Rachum describes his work on PySnooper and how it can be used to speed up your problem solving in complex or legacy applications.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, or running your build servers, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media and the Python Software Foundation. Upcoming events include the Software Architecture Conference in NYC and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
Your host as usual is Tobias Macey and today I’m interviewing Ram Rachum about PySnooper, an alternative approach to debugging your Python projects.

Interview
Introductions
How did you get introduced to Python?
How do developers normally debug their code, and what need does PySnooper address that isn’t addressed by the established methods?
What is the workflow for using PySnooper for investigating or debugging a project? (This will probably be answered in the answer to the question above)
What are some of the pieces of information that it surfaces and how do they aid the developer in directing their investigation?
What were some of the projects that you were testing it with and how did they influence the direction that you took PySnooper?
Can you describe how PySnooper is implemented and some of the ways that it has evolved since you first began working on it?
What are some of the initial goals that you had for the project which you have since abandoned as either not useful or too challenging to implement?
What are some of the edge cases or technical challenges that you have encountered while working on PySnooper, either in Python itself or in the tool?
There is another project called Snoop which builds on top of your work on PySnooper to add some extra functionality and developer ergonomics. What, if anything, was your reaction to it and how has it influenced your work on PySnooper?
One of the notable aspects of your work on PySnooper is the amount of attention that it garnered shortly after you published it. How has that visibility affected the long-term popularity and use of PySnooper?
What have been some of the most interesting, unexpected, or difficult aspects of creating, maintaining, and promoting PySnooper?
What do you have planned for the future of the project?

Keep In Touch
cool-RR on GitHub, Personal Website, Consulting Website

Picks
Tobias: PyCon US, Call for proposals, Registration
Ram: Nonviolent communication

Links
PySnooper Ram’s Python workshops The PyWeb-IL meetup BlueVine’s career page Submit your CV to Ram’s email [email protected] Tel Aviv Israel Paul Graham Y Combinator startup accelerator Wing IDE PyCharm sys.settrace Python f_trace coverage.py Podcast.init Interview PEP == Python Enhancement Proposal Podcast Episode snoop project Alex Hall pdb pudb pdb++
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
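
The links for this episode mention sys.settrace, the standard-library hook that line-by-line tracing tools can build on. The sketch below only illustrates that hook; it is not PySnooper's actual implementation, and the helper name trace_lines and the sample function total are hypothetical names chosen for this example.

```python
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and record (line offset, local variables) before each executed line.

    Hypothetical helper for illustration only; not part of PySnooper.
    """
    events = []
    target = func.__code__

    def tracer(frame, event, arg):
        # Ignore frames that belong to any other function.
        if frame.f_code is not target:
            return None
        if event == "line":
            offset = frame.f_lineno - target.co_firstlineno
            # Copy the locals so later mutations of the frame do not change the record.
            events.append((offset, dict(frame.f_locals)))
        # Returning the tracer keeps line events flowing for this frame.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever trace function was installed before.
        sys.settrace(previous)
    return result, events


def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc


result, events = trace_lines(total, [1, 2, 3])
for offset, local_vars in events:
    print(f"line +{offset}: {local_vars}")
```

Each recorded entry shows the local variables as they were just before the line ran, which is the kind of information a print-free debugging aid can surface.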

Dec 9, 2019 · 45 min

Ep 240: Making Complex Software Fun And Flexible With Plugin Oriented Programming


Summary
Starting a new project is always exciting because the scope is easy to understand and adding new features is fun and easy. As it grows, the rate of change slows down and the amount of communication necessary to introduce new engineers to the code increases along with the complexity. Thomas Hatch, CTO and creator of SaltStack, didn’t want to accept that as an inevitable fact of software, so he created a new paradigm and a proof-of-concept framework to experiment with it. In this episode he shares his thoughts and findings on the topic of plugin oriented programming as a way to build and scale complex projects while keeping them fun and flexible.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
Your host as usual is Tobias Macey and today I’m interviewing Thomas Hatch about his work on the POP library and how he is using plugin oriented programming in his work at SaltStack.

Interview
Introductions
How did you get introduced to Python?
Can you start by giving your definition of Plugin Oriented Programming and your thoughts on what benefits it provides?
You created the POP library as a framework for enabling developers to incorporate this pattern into their own projects. What capabilities does that framework provide and what was your motivation for creating it?
How has your work on Salt influenced your thinking on how to implement plugins for software projects?
How does POP fit into the future of the SaltStack project?
What are some of the advanced patterns or paradigms that the POP model allows for?
Can you describe how the POP library itself is implemented and some of the ways that its design has evolved since you first began experimenting with it?
What are some of the languages or libraries that you have looked at for inspiration in your design and philosophy around this development pattern?
For someone who is building a project on top of POP what does their workflow look like and what are some of the up-front design considerations they should be thinking of?
How do you define and validate the contract exposed by or expected from a plugin subsystem?
One of the interesting capabilities that you highlight in the documentation is the concept of merging applications. What are your thoughts on the challenges that an engineer might face when merging library or microservice applications built with POP into a single deployable artifact?
What would be involved in going the other direction to split a single application into independently runnable microservices?
When extracting common functionality from a group of existing applications, what are the relative merits of creating a plugin subsystem vs writing a library?
How does the system design of a POP application impact the available range of communication patterns for software and the teams building it?
What are some antipatterns that you anticipate for teams building their projects on top of POP?
In the documentation you mention that POP is just an example implementation of the broader pattern and that you hope to see other languages and developer communities adopt it. What are some of the barriers to adoption that you foresee?
What are some of the limitations of POP or cases where you would recommend against following this paradigm?
What are some of the most interesting, innovative, or unexpected ways that you have seen POP used?
What have been some of the most interesting, unexpected, or challenging aspects of building POP?
What do you have planned for the future of the PO

Dec 3, 2019 · 1h 2m

Ep 239: Faster And Safer Software Development With Feature Flags


Summary
Any software project that is worked on or used by multiple people will inevitably reach a point where certain capabilities need to be turned on or off. In this episode Pete Hodgson shares his experience and insight into when, how, and why to use feature flags in your projects as a way to enable that practice. In addition to the simple on and off controls for certain logic paths, feature toggles also allow for more advanced patterns such as canary releases and A/B testing. This episode has something useful for anyone who works on software in any language.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
Your host as usual is Tobias Macey and today I’m interviewing Pete Hodgson about the concept of feature flags and how they can benefit your development workflow.

Interview
Introductions
How did you get introduced to Python?
Can you start by describing what a feature flag is?
What was your first experience with feature flags and how did it affect your approach to software development?
What are some of the ways that feature flags are used?
What are some antipatterns that you have seen for teams using feature flags?
What are some of the alternative development practices that teams will employ to achieve the same or similar outcomes to what is possible with feature flags?
Can you describe some of the different approaches to implementing feature flags in an application?
What are some of the common pitfalls or edge cases that teams run into when building an in-house solution?
What are some useful considerations when making a build vs. buy decision for a feature toggling service?
What are some of the complexities that get introduced by feature flags for maintaining application code over the long run?
What have you found to be useful or effective strategies for cataloging and documenting feature toggles in an application, particularly if they are long lived or for open source applications where there is no institutional context?
Can you describe some of the lifecycle considerations for feature flags, and how the design, implementation, or use of them changes for short-lived vs long-lived use cases?
What are some cases where the overhead of implementing and maintaining a feature flag infrastructure outweighs the potential benefit?
What advice or references do you recommend for anyone who is interested in using feature flags for their own work?

Keep In Touch
Website, @ph1 on Twitter, moredip on GitHub

Picks
Tobias: Circuit Playground Express, CircuitPython Episode
Pete: Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim

Closing Announcements
Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links
Perl Ruby Django Feature Flag Pete’s Blog Post On Feature Flags Thoughtworks Continuous Delivery Continuous Delivery Book Trunk Based Development Branch By Abstraction Technical Debt Strategy Pattern Polymorphism
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
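
As a rough illustration of the on/off controls and canary releases mentioned in the summary, here is a minimal sketch of an in-process feature flag check. Everything in it is an assumption made for this example rather than something described in the episode: the FeatureFlags class, the flag names, and the choice to bucket users by hashing their ID into a percentage.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-memory flag store: maps a flag name to a rollout percentage (0-100)."""

    def __init__(self, rollout_percentages):
        self._rollout = dict(rollout_percentages)

    def is_enabled(self, name, user_id=""):
        # Unknown flags default to off, so a missing entry never exposes a feature.
        percent = self._rollout.get(name, 0)
        if percent <= 0:
            return False
        if percent >= 100:
            return True
        # Hash the flag name and user ID so each user lands in a stable bucket from 0 to 99.
        digest = hashlib.sha256(f"{name}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


flags = FeatureFlags({
    "new_checkout": 100,  # fully on
    "beta_search": 0,     # fully off
    "canary_cache": 10,   # canary: roughly 10% of users
})

if flags.is_enabled("new_checkout", user_id="alice"):
    print("serving the new checkout flow")

canary_users = [u for u in range(1000) if flags.is_enabled("canary_cache", user_id=str(u))]
print(f"{len(canary_users)} of 1000 users are in the canary group")
```

Hashing instead of random sampling keeps a given user in the same group on every request, which matters for canary releases and A/B tests where the experience should not flip between visits.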

Nov 26, 2019 · 1h 1m


