
Software Engineering Daily
2,188 episodes — Page 19 of 44
Ep 1456 Chronosphere: Scalable Metrics Database with Rob Skillington
M3 is a scalable metrics database originally built to host Uber’s rapidly growing data storage from Prometheus. When Rob Skillington was at Uber, he helped design, implement, and deploy M3. Since leaving Uber, he has co-founded a company around a hosted version of M3 called Chronosphere. If you have access to a scalable metrics database, you might as well start accumulating as much data as possible, right? Not exactly. If your company generates enough data, you probably want to turn down the dials on how frequently you save a metric. Downsampling will reduce the amount of money that you pay for these hosted metrics. In today’s show, Rob discusses the engineering and deployment of M3, and how that work led him to founding Chronosphere, as well as the product offering of the company.
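To make the downsampling idea concrete, here is a minimal sketch in plain Python. It is not M3's or Chronosphere's API; the function name `downsample`, the 30-second window, and the sample data are all assumptions chosen for illustration. It averages raw samples into fixed-width time windows, so fewer points need to be stored.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) samples into fixed-width time windows."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each sample to the start of the window it falls into.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    # Emit one averaged point per window, in time order.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical data: six samples taken every 10 seconds...
raw = [(0, 1.0), (10, 3.0), (20, 5.0), (30, 2.0), (40, 4.0), (50, 6.0)]
# ...reduced to one averaged point per 30-second window.
downsampled = downsample(raw, 30)
```

Averaging is only one possible aggregation; a real system might keep the max, the last value, or a percentile per window instead, depending on what the metric is used for.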
Ep 1455 Determined AI: Machine Learning Ops with Neil Conway
Developing machine learning models is not easy. From the perspective of the machine learning researcher, there is the iterative process of tuning hyperparameters and selecting relevant features. From the perspective of the operations engineer, there is a handoff from development to production, and the management of GPU clusters to parallelize model training. In the last five years, machine learning has become easier to use thanks to point solutions such as TensorFlow, cloud provider tools, Spark, and Jupyter Notebooks. But every company works differently, and there are few hard and fast rules for the workflows around machine learning operations. Determined AI is a platform that provides a means for collaborating around data prep, model development and training, and model deployment. Neil Conway is a co-founder of Determined, and he joins the show to discuss the challenges around machine learning operations, and what he has built with Determined.
Ep 1454 The Good Parts of AWS with Daniel Vassallo
AWS has over 150 different services. Databases, log management, edge computing, and lots of others. Instead of being overwhelmed by all of these products, an engineering team can simplify their workflow by focusing on a small subset of AWS services–the defaults. Daniel Vassallo is the author of The Good Parts of AWS. An excerpt from the book: “The cost of acquiring new information is high and the consequence of deviating from a default choice is low, so sticking with the default will likely be the optimal choice. A default choice is any option that gives you very high confidence that it will work.” Having confidence in your workflow–even if it is a simple workflow–has advantages. S3, EC2, Elastic Load Balancers: for simple web applications, this is really all you need to build your business. Daniel Vassallo worked at AWS for more than 8 years before leaving to become an entrepreneur and author. He joins the show to talk about what the good parts of AWS are, and his strategy for building applications with that subset of services.
Ep 1453 Pull Request Environments with Eric Silverman
The modern release workflow involves multiple stakeholders: engineers, management, designers, and product managers. It is a collaborative process that is often held together with brittle workflows. A developer deploys a new build to an ad hoc staging environment and pastes a link to that environment in Slack. Other stakeholders click on that link, then send messages to each other in Slack, or make comments on the pull request in GitHub. This workflow is far from ideal. Collaborating around pull requests can be made easier with a dedicated set of tools for sharing and discussing those pull requests. This is the goal of FeaturePeek, a system for spinning up dedicated pull request environments, creating screenshots and comments, and reimagining the lifecycle of the release workflow. Eric Silverman is a co-founder of FeaturePeek and he joins the show to discuss release management, the interactions between different stakeholders, and the development of his company. Much like the previous show about Postman, in which we explored how API management has become a ripe space for collaboration, the same is true of pull requests.
Ep 1452 Deepgram: End-to-End Speech Recognition with Scott Stephenson
Deepgram is an end-to-end deep learning platform for speech recognition. Unlike the general purpose APIs from Google or Amazon, Deepgram models are custom-trained for each customer. Whether the customer is a call center, a podcasting company, or a sales department, Deepgram can work with them to build something specific to their use case. Sound data is incredibly rich. Consider all the features in a voice recording: volume, intonation, inflection. And once the speech is transcribed, there are many more features that can be discovered from the text transcription. Scott Stephenson is the CEO of Deepgram, and he joins the show to talk through end-to-end deep learning for speech, as well as the dynamics of the business and the deployment strategy for working with customers.
Ep 1450 DynamoDB with Alex DeBrie
DynamoDB is a managed NoSQL database service from AWS. It is widely used as a transactional database to fulfill key-value and wide-column data models. In a previous show with Rick Houlihan, we explored how to build a data model and optimize the query patterns for a NoSQL database. Today’s show is about DynamoDB specifically: partitioning, indexing, query semantics, normalization, table design, and other subjects. We talk through how to be cost conscious, and how to integrate with event-based AWS Lambda triggers. Alex DeBrie is the author of The DynamoDB Book, a book whose title speaks for itself. Alex has comprehensive experience with DynamoDB, and he joins the show to share that experience through a detailed discussion of use cases and strategies related to DynamoDB.
Ep 1449 Snowplow Analytics: Data Collection Platform with Alex Dean
As a user browses a webpage, that browser session generates events that need to be recorded, validated, enriched, and stored. This data is sometimes called customer data infrastructure, or CDI. This data requires a full stack of different tools: a system on the frontend to collect the data, middleware to transport the data, and backend systems for storing and loading that data into data warehouses and other analytical systems. Snowplow Analytics is a data collection platform for storing events. In Snowplow, modules called Trackers send data to Collectors. The data can then be validated and enriched, and then put into the user’s data warehouse via ETL. Alex Dean is the CEO of Snowplow, and he joins the show to talk through the business model, management, and engineering of Snowplow Analytics, as well as the overall data engineering landscape.
Ep 1448 Postman: API Development with Abhinav Asthana
A software company manages and interacts with hundreds of APIs. These APIs require testing, performance analysis, authorization management, and release management. In a word, APIs require collaboration. Postman is a system for API collaboration. It allows users to test APIs with collections of requests, monitor the API responses, and visualize the query results. Users of Postman can collaborate with their team through Team Workspaces, sharing collections, environments, history, and more. Abhinav Asthana is the founder of Postman and he joins the show to talk about API management and collaboration. Abhinav started Postman as a hobby project, and it has grown into a large and successful business, far beyond the original product of API testing.
Ep 1447 Cresta: Speech ML for Calls with Zayd Enam
At a customer service center, thousands of hours of audio are generated. This audio provides a wealth of information to transcribe and analyze. With the additional data of the most successful customer service representatives, machine learning models can be trained to identify which speech patterns are associated with a successful worker. By identifying these speaking patterns, a customer service center can continuously improve, with the different representatives learning the different patterns. The same is true for other speech-based tasks, such as sales calls. Cresta is a company that builds systems to ingest high volumes of speech data in order to discover features that correlate with high performance human workers. Zayd Enam is a co-founder of Cresta, and joins the show to talk about the domain of speech data and what he and his team are building at Cresta.
Ep 1446 React Native Ecosystem with Nader Dabit (Summer Break Repeat)
Originally published July 6, 2017. We are taking a few weeks off. We’ll be back soon with new episodes. React Native allows developers to reuse components from one user interface on multiple platforms. React Native was introduced by Facebook to reduce the pain of teams who were rewriting their user interfaces for web, iOS, and Android. Nader Dabit hosts React Native Radio, a podcast about React Native. Nader also trains companies to use React Native through his company React Native Training. In this episode, we explore what a developer can and cannot do with React Native, when a developer needs to use native APIs, and some speculation on the future of React Native.
Ep 1445 Traces: Video Recognition with Veronica Yurchuk and Kostyantyn Shysh (Summer Break Repeat)
Originally published October 8, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Video surveillance impacts human lives every day. On most days, we do not feel the impact of video surveillance. But the effects of video surveillance have tremendous potential. It can be used to solve crimes and find missing children. It can be used to intimidate journalists and empower dictators. Like any piece of technology, video surveillance can be used for good or evil. Video recognition lets us make better use of video feeds. A stream of raw video doesn’t provide much utility if we can’t easily model its contents. Without video recognition, we must have a human sitting in front of the video to manually understand what is going on in that video. Veronica Yurchuk and Kosh Shysh are the founders of Traces.ai, a company building video recognition technology focused on safety, anonymity, and positive usage. They join the show to discuss the field of video analysis, and their vision for how video will shape our lives in the future.
Ep 1444 Envoy Mobile with Matt Klein (Summer Break Repeat)
Originally published July 25, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Envoy is an open source edge and service proxy that was originally developed at Lyft. Envoy is often deployed as a sidecar application that runs alongside a service and helps that service by providing features such as routing, rate limiting, telemetry, and security policy. Envoy has gained significant traction in the open source community, and has formed the backbone of popular service mesh projects such as Istio. Envoy has been mostly used as a backend technology, but the potential applications of Envoy include frontend client applications as well. The goal of Envoy is to make the network easier to work with–and the network includes client applications such as mobile apps running on a phone. Envoy Mobile is a network proxy for mobile applications. Envoy Mobile brings many of the benefits of Envoy to the mobile client ecosystem. It provides mobile developers with a library that can simplify or abstract away many of the modern advances that have been made in networking in recent years, such as HTTP2, gRPC, and QUIC. Matt Klein is the creator of Envoy, and he joins the show to discuss Envoy Mobile. Matt describes how the networking challenges of mobile applications are similar to those of backend systems and cloud infrastructure. We discuss the advances in networking technology that Envoy Mobile helps bring to the mobile ecosystem, and also touch on the scalability challenges that Matt is seeing at Lyft.
Ep 1443 Data Intensive Applications with Martin Kleppmann (Summer Break Repeat)
Originally published May 2, 2017. We are taking a few weeks off. We’ll be back soon with new episodes. A new programmer learns to build applications using data structures like a queue, a cache, or a database. Modern cloud applications are built using more sophisticated tools like Redis, Kafka, or Amazon S3. These tools do multiple things well, and often have overlapping functionality. Application architecture becomes less straightforward. The applications we are building today are data-intensive rather than compute-intensive. Netflix needs to know how to store and cache large video files, and stream them to users quickly. Twitter needs to update user news feeds with a fanout of the president’s latest tweet. These operations are simple with small amounts of data, but become complicated with a high volume of users. Martin Kleppmann is the author of Designing Data-Intensive Applications, an O’Reilly book about how to use modern data tools to solve modern data problems. His book includes high-level discussions about architectural strategy, and lower level discussions like how leader election algorithms can create problems for a data intensive application.
Ep 1442 freeCodeCamp with Quincy Larson (Summer Break Repeat)
Originally published December 20, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. freeCodeCamp was started five years ago with the goal of providing free coding education to anyone on the Internet. freeCodeCamp has become the best place to begin learning how to write software. There are many other places that a software engineer should visit on their educational journey, but freeCodeCamp is the best place to start, because it is free, and there are no advertisements. For most people learning to code, the price of that education is important, because they are learning to code to build a new career. It’s also important that a new programmer learns from an unbiased source of information, because an ad-supported environment will educate the new programmer towards products that they might not need. freeCodeCamp has not been easy to build. Building freeCodeCamp has required expertise in software engineering, business, media, and community development. The donation-based business model of freeCodeCamp doesn’t collect very much money. Why would somebody build a non-profit when they could spend their time building a highly profitable software company? Quincy Larson is the founder of freeCodeCamp, and he joins the show for a special episode about his backstory and the journey to building the best place on the Internet for a new programmer to begin.
Ep 1441 Facebook Open Source with Tom Occhino (Summer Break Repeat)
Originally published April 14, 2017. We are taking a few weeks off. We’ll be back soon with new episodes. Facebook’s open source projects include React, GraphQL, and Cassandra. These projects are key pieces of infrastructure used by thousands of developers–including engineers at Facebook itself. These projects are able to gain traction because Facebook takes time to decouple the projects from their internal infrastructure and clean up the code before releasing them into the wild. Facebook has high standards for what they are willing to release. Tom Occhino manages the React team at Facebook and works closely with engineers to determine what projects make sense to open source. In this episode, Preethi Kasireddy interviews Tom about how Facebook thinks about open source–what went right with React, why it makes sense for Facebook to continue to release new open source projects, and how full-time employees at Facebook interact with that open source codebase.
Ep 1439 Redis with Alvin Richards (Summer Break Repeat)
Originally published October 24, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Redis is an in-memory database that persists to disk. Redis is commonly used as an object cache for web applications. Applications are composed of caches and databases. A cache typically stores the data in memory, and a database typically stores the data on disk. Memory has significantly faster access times, but is more expensive and is volatile, meaning that if the computer that is holding that piece of data in memory goes offline, the data will be lost. When a user makes a request to load their personal information, the server will try to load that data from a cache. If the cache does not contain the user’s information, the server will go to the database to find that information. Alvin Richards is chief product officer with Redis Labs, and he joins the show to discuss how Redis works. We explore different design patterns for making Redis high availability, or using it as a volatile cache, and we talk through the read and write path for Redis data. Full disclosure: Redis Labs is a sponsor of Software Engineering Daily.
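The read path described here (check the cache first, fall back to the database on a miss) is often called the cache-aside pattern. Below is a minimal sketch of it in Python. Plain dictionaries stand in for both the in-memory cache and the on-disk database, and the `UserStore` class and its method names are hypothetical; a real deployment would call a Redis client and a database driver instead.

```python
class UserStore:
    """Cache-aside reads: try the cache, then fall back to the database."""

    def __init__(self, database):
        self.database = database  # stand-in for an on-disk database
        self.cache = {}           # stand-in for an in-memory cache such as Redis
        self.database_reads = 0   # counts how often the slow path is taken

    def get_user(self, user_id):
        # Fast path: the record is already in memory.
        if user_id in self.cache:
            return self.cache[user_id]
        # Slow path: read from the database and remember the result.
        self.database_reads += 1
        record = self.database.get(user_id)
        if record is not None:
            self.cache[user_id] = record
        return record


# Hypothetical data for illustration.
store = UserStore({"u1": {"name": "Ada"}})
first = store.get_user("u1")   # cache miss, served by the database
second = store.get_user("u1")  # cache hit, served from memory
```

Because the cache here is volatile, losing it only costs extra database reads; the database remains the source of truth. This sketch omits expiry and invalidation, which any real cache needs once records can change.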
Ep 1438 HTTP with Julia Evans (Summer Break Repeat)
Originally published November 21, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. HTTP is a protocol that allows browsers and web applications to communicate across the Internet. Everyone knows that HTTP is doing some important work, because “HTTP” is at the beginning of most URLs that you enter into your browser. You might be familiar with the request/response model, and HTTP request methods such as GET, PUT, and POST. But unless you have had a reason to learn more about the details of HTTP, you probably don’t know much more than that. Julia Evans is a software engineer and writer who creates Wizard Zines, a series of easy-to-read online magazines that explain technical software topics. Julia’s zines include “Linux Debugging Tools”, “Help! I Have A Manager!”, and recently “HTTP: Learn your browser’s language”. Her zines are a creative, innovative format for describing the world of software engineering while also exploring her own artistic pursuits in writing, design, and illustration. Julia was previously on the show to discuss Ruby profiling, and she returns to the show to discuss HTTP, as well as her creative process and goals with Wizard Zines.
Ep 1437 Stripe Machine Learning Infrastructure with Rob Story and Kelley Rivoire (Summer Break Repeat)
Originally published June 13, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Machine learning allows software to improve as that software consumes more data. Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make the tools more accessible to the developers across the organization. There are many steps that an engineer must go through to use machine learning, and each additional step inhibits the chances that the engineer will actually get their model into production. An engineer who wants to build machine learning into their application needs access to data sets. They need to join those data sets, and load them into a machine (or multiple machines) where their model can be trained. Once the model is trained, the model needs to test on additional data to ensure quality. If the initial model quality is insufficient, the engineer might need to tweak the training parameters. Once a model is accurate enough, the engineer needs to deploy that model. After deployment, the model might need to be updated with new data later on. If the model is processing sensitive or financially relevant data, a provenance process might be necessary to allow for an audit trail of decisions that have been made by the model. Rob Story and Kelley Rivoire are engineers working on machine learning infrastructure at Stripe. After recognizing the difficulties that engineers faced in creating and deploying machine learning models, Stripe engineers built out Railyard, an API for machine learning workloads within the company. Rob and Kelley join the show to discuss data engineering and machine learning at Stripe, and their work on Railyard.
Ep 1436 Architects of Intelligence with Martin Ford (Holiday Repeat)
Originally published January 31, 2019. Artificial intelligence is reshaping every aspect of our lives, from transportation to agriculture to dating. Someday, we may even create a superintelligence–a computer system that is demonstrably smarter than humans. But there is widespread disagreement on how soon we could build a superintelligence. There is not even a broad consensus on how we can define the term “intelligence”. Information technology is improving so rapidly we are losing the ability to forecast the near future. Even the most well-informed politicians and business people are constantly surprised by technological changes, and the downstream impact on society. Today, the most accurate guidance on the pace of technology comes from the scientists and the engineers who are building the tools of our future. Martin Ford is a computer engineer and the author of Architects of Intelligence, a new book of interviews with the top researchers in artificial intelligence. His interviewees include Jeff Dean, Andrew Ng, Demis Hassabis, Ian Goodfellow, and Ray Kurzweil. Architects of Intelligence is a privileged look at how AI is developing. Martin Ford surveys these different AI experts with similar questions. How will China’s adoption of AI differ from that of the US? What is the difference between the human brain and that of a computer? What are the low-hanging fruit applications of AI that we have yet to build? Martin joins the show to talk about his new book. In our conversation, Martin synthesizes ideas from these different researchers, and describes the key areas of disagreement from across the field.
Ep 1435 Cruise Simulation with Tom Boyd
Cruise is an autonomous car company with a development cycle that is highly dependent on testing its cars–both in the wild and in simulation. The testing cycle typically requires cars to drive around gathering data, and that data to subsequently be integrated into a simulated system called Matrix. With COVID-19, the ability to run tests in the wild has been severely limited. Cruise cannot put as many cars on the road, and thus has had to shift much of its testing procedures to rely more heavily on the simulations. Therefore, the simulated environments must be made very accurate, including the autonomous agents such as pedestrians and cars. Tom Boyd is VP of Simulation at Cruise. He joins the show to talk about the testing workflow at Cruise, how the company builds simulation-based infrastructure, and his work managing simulation at the company.
Ep 1434 Grafana with Torkel Ödegaard
Grafana is an open source visualization and monitoring tool that is used for creating dashboards and charting time series data. Grafana is used by thousands of companies to monitor their infrastructure. It is a popular component in monitoring stacks, and is often used together with Prometheus, Elasticsearch, MySQL, and other data sources. The engineering complexities around building Grafana involve the large number of integrations, the highly configurable ReactJS frontend, and the ability to query and display large data sets. Grafana also must be deployable to cloud and on-prem environments. Torkel Ödegaard is a co-founder of Grafana Labs, and joins the show to talk about his work on the open source project and the company he is building around it.
Ep 1433 Apache Airflow with Maxime Beauchemin, Vikram Koka, and Ash Berlin-Taylor
Apache Airflow was released in 2015, introducing the first popular open source solution to data pipeline orchestration. Since that time, Airflow has been widely adopted for dependency-based data workflows. A developer might orchestrate a pipeline with hundreds of tasks, with dependencies between jobs in Spark, Hadoop, and Snowflake. Since Airflow’s creation, it has powered the data infrastructure at companies like Airbnb, Netflix, and Lyft. It has also been at the center of Astronomer, a startup that helps enterprises build infrastructure around Airflow. Airflow is used to construct DAGs–directed acyclic graphs for managing data workflows. Maxime Beauchemin is the creator of Airflow. Vikram Koka and Ash Berlin-Taylor work at Astronomer. They join the show to talk about the state of Airflow–the purpose of the project, its use cases, and open source ecosystem.
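To illustrate what dependency-based scheduling over a DAG means, here is a small sketch that uses only Python's standard library (`graphlib`, available from Python 3.9). It deliberately does not use Airflow's own API; the task names and the `pipeline` mapping are hypothetical. Each task lists the tasks that must finish before it, and a topological sort yields an order in which every dependency runs first.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a task, and its value is the set of
# tasks that must complete before it can start.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "train_model": {"clean"},
    "publish_report": {"aggregate", "train_model"},
}

# static_order() yields the tasks so that dependencies always come first.
run_order = list(TopologicalSorter(pipeline).static_order())


def runs_before(first, second):
    """Return True if `first` is scheduled ahead of `second`."""
    return run_order.index(first) < run_order.index(second)
```

The graph must be acyclic for such an order to exist; `TopologicalSorter` raises `graphlib.CycleError` if two tasks depend on each other. Tasks with no path between them, such as `aggregate` and `train_model` here, may appear in either order, which is what allows an orchestrator to run them in parallel.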
Ep 1432 Human in the Loop Data Analytics with Aditya Parameswaran
The life cycle of data management includes data cleaning, extraction, integration, analysis and exploration, and machine learning models. It would be great if all of this data management could be handled with automation, but unfortunately that is not an option. For most applications, data management requires a human in the loop. A human in the loop might be responsible for working in a spreadsheet, or labeling data as a Mechanical Turk worker, or creating an algorithm for data labeling in Snorkel. Data scientists and data analysts are humans in the loop, studying large data sets. Aditya Parameswaran is an assistant professor at UC Berkeley. He studies human-in-the-loop data analytics, and he joins the show to talk about the work and the projects that he is focused on, including DataSpread, an alternative to Excel, and OrpheusDB, a relational database versioning system.
Ep 1431 Tilt: Kubernetes Tooling with Dan Bentley
Kubernetes continues to mature as a platform for infrastructure management. At this point, many companies have well-developed workflows and deployment patterns for working with applications built on Kubernetes. The complexity of some of these deployments may be daunting, and when a new employee joins a company, that employee needs to get quickly onboarded with the custom dev environment. Environment management is not the only issue with Kubernetes development. When a service gets updated, that update needs to be live and usable as fast as possible. When Kubernetes-related errors occur, those problems need to be easily accessible in a UI for triage. Dan Bentley is the CEO of Windmill Engineering, a company that makes a set of Kubernetes tools called Tilt. Dan joins the show to talk about the workflow for deploying Kubernetes infrastructure and the role of Tilt, the product he has been working on.
Ep 1430 Uber’s Data Visualization Tools with Ib Green
Uber needs to visualize data on a range of different surfaces. A smartphone user sees cars moving around on a map as they wait for their ride to arrive. Data scientists and operations researchers within Uber study the renderings of traffic moving throughout a city. Data visualization is core to Uber, and the company has developed a stack of technologies around visualization in order to build appealing, highly functional applications. DeckGL is a library for high-performance visualizations of large data sets. LumaGL is a set of components that targets high performance rendering. These and other tools make up VisGL, the data visualization technology that powers Uber. Uber’s visualization team included Ib Green, who left Uber to co-found Unfolded.ai, a company that builds geospatial analytics products. He joins the show to discuss his work on visualization products and libraries at Uber, as well as the process of taking that work to found Unfolded.ai. Full disclosure: I am an investor in Unfolded.ai.
Ep 1429 Prisma: Modern Database Tooling with Johannes Schickling
When a frontend developer queries a backend server, that query typically goes through an ORM or a raw database query. Prisma is an alternative to both of these data access patterns, allowing for easier database access through auto-generated, type-safe query building tailored to an existing database schema. By integrating with Prisma, the developer gets a database client that has query autocompletion, and an API server with less boilerplate code. Prisma also has a system called Prisma Migrate, which simplifies database and schema migrations. Johannes Schickling is CEO of Prisma, and he joins the show to talk about the developments of Prisma that have occurred since we last spoke, and where the company is headed.
Ep 1428 Tecton: Machine Learning Platform from Uber with Kevin Stumpf
Machine learning workflows have had a problem for a long time: taking a model from the prototyping step and putting it into production is not an easy task. A data scientist who is developing a model is often working with different tools, or a smaller data set, or different hardware than the environment which that model will be deployed to. This problem existed at Uber just as it does at many other companies. Models were difficult to release, iterations were complicated, and collaboration between engineers could never reach a point that resembled a harmonious “DevOps”-like workflow. To address these problems, Uber developed an internal system called Michelangelo. Some of the engineers working on Michelangelo within Uber realized that there was a business opportunity in taking the Michelangelo work and turning it into a product company. Thus, Tecton was born. Tecton is a machine learning platform focused on solving the same problems that existed within Uber. Kevin Stumpf is the CTO at Tecton, and he joins the show to talk about the machine learning problems of Uber, and his current work at Tecton.
Ep 1426 HoloClean: Data Quality Management with Theodoros Rekatsinas
Many data sources produce new data points at a very high rate. With so much data, the issue of data quality emerges. Low quality data can degrade the accuracy of machine learning models that are built around those data sources. Ideally, we would have completely clean data sources, but that’s not very realistic. One alternative is a data cleaning system, which can allow us to clean up the data after it has already been generated. HoloClean is a statistical inference engine that can impute, clean, and enrich data. HoloClean is centered around “The Probabilistic Unclean Database Model”, which allows for two systems–an “intension” and a “realizer” to work together to fill in missing fields and fix erroneous fields in data. HoloClean was created by Theo Rekatsinas, and he joins the show to talk about the problem of fast, unclean data, and his work with HoloClean. We also talk about other problems in machine learning and the engineering workflows around data.
Ep 1425 Disaggregated Servers with Yiying Zhang
Server infrastructure traditionally consists of monolithic servers containing all of the necessary hardware to run a computer. These different hardware components are located next to each other, and do not need to communicate over a network boundary to connect the CPU and memory. LegoOS is a model for disaggregated, network-attached hardware. LegoOS disseminates the traditional operating system functionalities into loosely-coupled hardware and software components. By disaggregating data center infrastructure, the overall resource usage and failure rate of server infrastructure can be improved. Yiying Zhang is an assistant professor of computer science at UCSD. Her research focuses on operating systems, distributed systems, and datacenter networking. She joins the show to discuss her work and its implications for data centers and infrastructure.
Ep 1424 Kubernetes vs. Serverless with Matt Ward
Kubernetes has become a highly usable platform for deploying and managing distributed systems. The user experience for Kubernetes is great, but is still not as simple as a full-on serverless implementation–at least, that has been a long-held assumption. Why would you manage your own infrastructure, even if it is Kubernetes? Why not use autoscaling Lambda functions and other infrastructure-as-a-service products? Matt Ward is a listener of the show and an engineer at Mux, a company that makes video streaming APIs. He sent me an email that said Mux has been having success with self-managed Kubernetes infrastructure, which they deliberately opted for over a serverless deployment. I wanted to know more about what shaped this decision to opt for self-managed infrastructure, and the costs and benefits that Mux has accrued as a result. Matt joins the show to talk through his work at Mux, and the architectural impact of opting for Kubernetes instead of fully managed serverless infrastructure.
Ep 1423Distributed Systems Research with Peter Alvaro
Every software company is a distributed system, and distributed systems fail in unexpected ways. This ever-present tendency for systems to fail has led to the rise of failure testing, otherwise known as chaos engineering. Chaos engineering involves the deliberate failure of subsystems within an overall system to ensure that the system itself can be resilient to these kinds of unexpected failures. Peter Alvaro is a distributed systems researcher who has published papers on a range of subjects, including debugging, failure testing, databases, and programming languages. He works with both academia and industry. Peter joins the show to discuss his research topics and goals.
Ep 1422Brex Engineering with Cosmin Nicolaescu
Brex is a credit card company that provides credit to startups, mostly companies which have raised money. Brex processes millions of transactions, and uses the data from those transactions to assess creditworthiness, prevent fraud, and surface insights for the users of their cards. Brex is full of interesting engineering problems. The high volume of transactions requires data infrastructure to support all those transactions coming through the platform. As a credit card company, Brex needs to integrate with credit card networks and banking systems. There are internal systems for applications such as dispute resolution. Cos Nicolaescu is the CTO at Brex. He joins the show to discuss engineering at Brex, the dynamics of a credit card company, and his strategies around management. It was an instructive look inside of a rapidly growing fintech company.
Ep 1421Edge Machine Learning with Zach Shelby
Devices on the edge are becoming more useful with improvements in the machine learning ecosystem. TensorFlow Lite allows machine learning models to run on microcontrollers and other devices with only kilobytes of memory. Microcontrollers are very low-cost, tiny computational devices. They are cheap, and they are everywhere. The low-energy embedded systems community and the machine learning community have come together with a collaborative effort called tinyML. tinyML represents the improvements of microcontrollers, lighter weight frameworks, better deployment mechanisms, and greater power efficiency. Zach Shelby is the CEO of EdgeImpulse, a company that makes a platform called Edge Impulse Studio. Edge Impulse Studio provides a UI for data collection, training, and device management. As someone creating a platform for edge machine learning usability, Zach was a great person to talk to about the state of edge machine learning and his work building a company in the space.
Ep 1420Software Daily
For the last five months, we have been working on a new version of Software Daily, the platform we built to host and present our content. We are creating a platform that integrates the podcast with a set of other features that make it easier to learn from the audio interviews. The world of software is large, and growing bigger every day. Software Daily is a place to explore this world of software companies and projects. If the podcast is a useful resource for you to learn about software, then Software Daily might also provide you with value. This post (and episode) is a brief description of the features that we have built into Software Daily. If you want to listen to Software Engineering Daily without ads, you can become a paid subscriber, paying $10/month or $100/year by going to softwaredaily.com/subscribe. We now have an RSS feed that paid customers can add to a podcast player like Overcast (on iOS) or Podcast Addict (on Android). You can also listen to the premium episodes using our apps for iOS or Android. Whether you are a listener who is fine with listening to ads, or you are a listener who pays to hear episodes without ads, we are happy to have you tuning in. Apple Podcasts limits the number of episodes in an RSS feed to 300. The feed with the last 300 episodes is available by searching for Software Daily. In total, we have more than 1200 episodes in our back catalog. Listeners often want to find all our episodes on React, or Kubernetes, or serverless, or self-driving cars. We have been covering these topics for years, and much of the old content has retained its value. Software Daily allows you to easily find all the episodes relating to a subject that you are interested in. You can also find our most popular episodes, ranked by how people interact with them. Additionally, episode transcripts have interactive features with highlighting, commenting, and discussions.
We want to create a Medium-like experience for the episodes. Software Daily is a place where listeners can write about the topics they are listening to. When you are listening to lots of episodes about a topic such as GraphQL, you may find it useful to write about that topic as a form of active learning. The topic pages also have a Q&A section. Post questions about a topic, or post an answer. Engage in the community dialogue surrounding a topic you are passionate or curious about. If there is a topic you want to write about, check out softwaredaily.com/write. We will be turning the best written content into short podcast episodes published on the weekends where we will read your contribution and mention your name. If you write something awesome, we want to turn it into audio for larger distribution. Every topic on Software Daily has a Q&A section. We have covered lots of niche software companies and open source projects, and on Software Daily we want to collect more information about the world of software with Q&A. If you want to write about a specific company or topic that you heard about on Software Daily, Q&A is also an option. Our goal with Q&A is to provide a companion experience to listening to the podcast. It is not always easy to retain what you hear in a podcast episode. Answering some questions after you listen to an episode can help with that retention. Are you looking to hire someone specific in the world of software? Post a job on the Software Daily jobs board. We will be announcing some of these jobs on the podcast, especially the more interesting postings, and ones that align with content we are producing. We appreciate you tuning into Software Daily. We would welcome your feedback, and hope you take the time to check out SoftwareDaily.com.
Ep 1419RedwoodJS with Tom Preston-Werner
Over the last 5 years, web development has matured considerably. React has become a standard for frontend component development. GraphQL has seen massive growth in adoption as a data fetching middleware layer. The hosting platforms have expanded beyond AWS and Heroku, to newer environments like Netlify and Vercel. These changes are collectively known as the JAMStack. The changes brought by the JAMStack raise the question: how should an app be built today? Can a framework offer guidance for how the different layers of a JAMStack app should fit together? RedwoodJS is a framework for building JAMStack applications. Tom Preston-Werner is one of the creators of RedwoodJS, as well as a co-founder of GitHub and Chatterbug, a language learning app. He joins the show to talk about the future of JAMStack development, and his goals for RedwoodJS.
Ep 1418ArcGIS: Geographic Information Software with Max Payson
Geospatial analytics tools are used to render visualizations for a vast array of applications. Data sources such as satellites and cellular data can gather location data, and that data can be superimposed over a map. A map-based visualization can allow the end user to make decisions based on what they see. ArcGIS is one of the most widely used geospatial analytics platforms. It is created by ESRI, the Environmental Systems Research Institute, which was started in 1969. Today, ESRI products have 40% of the global market share of geospatial analytics software. Max Payson is a solutions engineer at ESRI, and he joins the show to talk about applications of ArcGIS, and the landscape of GIS more broadly.
Ep 1417RudderStack: Open Source Customer Data Infrastructure with Soumyadeb Mitra
Customer data infrastructure is a type of tool for saving analytics and information about your customers. The company that is best known in this category is Segment, a very popular API company. This customer data is used for making all kinds of decisions around product roadmap, pricing, and design. RudderStack is a company built around open source customer data infrastructure. RudderStack can be self-hosted, allowing users to deploy it to their own servers and manage their data however they please. Soumyadeb Mitra is the creator of RudderStack, and he joins the show to talk about the space of customer data infrastructure, and his own company.
Ep 1416Matterport 3-D Imaging with Japjit Tulsi
Matterport is a company that builds 3-D imaging for the inside of buildings, construction sites, and other locations that require a “digital twin.” Generating digital images of the insides of buildings has a broad spectrum of applications, and there are considerable engineering challenges in building such a system. Matterport’s hardware stack involves a camera built in-house by the company. The camera can take 360 degree scans of a room, stitch the imagery together, and make the digital twin available on the cloud. Japjit Tulsi works at Matterport, and he joins the show to discuss 3-D imaging, and his role as CTO of the company.
Ep 1415Frontend Performance with Anycart’s Rafael Sanches
There are many bad recipe websites. Every time I navigate to a recipe website, it feels like my browser is filling up with spyware. The page loads slowly, everything seems broken, and I can feel the 25 different JavaScript adtech tags interrupting each other. Whether I am searching for banana bread or a spaghetti sauce recipe, recipe sites usually make me lose my appetite. Anycart is a recipe platform that allows users to buy all of the ingredients for the recipe and have those ingredients delivered. It’s a vertically integrated content site and delivery system. It is also beautifully designed and extremely performant. I learned about it from Zack Bloom, who works at Cloudflare and mentioned it as a case study in performance. Rafael Sanches is a founder of Anycart, and he joins the show to talk about building a recipe delivery service, and the innovations in performance that were necessary to build it.
Ep 1414Software Daily
For the last five months, we have been working on a new version of Software Daily, the platform we built to host and present our content. We are creating a platform that integrates the podcast with a set of other features that make it easier to learn from the audio interviews. The world of software is large, and growing bigger every day. Software Daily is a place to explore this world of software companies and projects. If the podcast is a useful resource for you to learn about software, then Software Daily might also provide you with value. This post (and episode) is a brief description of the features that we have built into Software Daily. If you want to listen to Software Engineering Daily without ads, you can become a paid subscriber, paying $10/month or $100/year by going to softwaredaily.com/subscribe. We now have an RSS feed that paid customers can add to a podcast player like Overcast (on iOS) or Podcast Addict (on Android). You can also listen to the premium episodes using our apps for iOS or Android. Whether you are a listener who is fine with listening to ads, or you are a listener who pays to hear episodes without ads, we are happy to have you tuning in. Apple Podcasts limits the number of episodes in an RSS feed to 300. The feed with the last 300 episodes is available by searching for Software Daily. In total, we have more than 1200 episodes in our back catalog. Listeners often want to find all our episodes on React, or Kubernetes, or serverless, or self-driving cars. We have been covering these topics for years, and much of the old content has retained its value. Software Daily allows you to easily find all the episodes relating to a subject that you are interested in. You can also find our most popular episodes, ranked by how people interact with them. Additionally, episode transcripts have interactive features with highlighting, commenting, and discussions.
We want to create a Medium-like experience for the episodes. Software Daily is a place where listeners can write about the topics they are listening to. When you are listening to lots of episodes about a topic such as GraphQL, you may find it useful to write about that topic as a form of active learning. The topic pages also have a Q&A section. Post questions about a topic, or post an answer. Engage in the community dialogue surrounding a topic you are passionate or curious about. If there is a topic you want to write about, check out softwaredaily.com/write. We will be turning the best written content into short podcast episodes published on the weekends where we will read your contribution and mention your name. If you write something awesome, we want to turn it into audio for larger distribution. Every topic on Software Daily has a Q&A section. We have covered lots of niche software companies and open source projects, and on Software Daily we want to collect more information about the world of software with Q&A. If you want to write about a specific company or topic that you heard about on Software Daily, Q&A is also an option. Our goal with Q&A is to provide a companion experience to listening to the podcast. It is not always easy to retain what you hear in a podcast episode. Answering some questions after you listen to an episode can help with that retention. Are you looking to hire someone specific in the world of software? Post a job on the Software Daily jobs board. We will be announcing some of these jobs on the podcast, especially the more interesting postings, and ones that align with content we are producing. We appreciate you tuning into Software Daily. We would welcome your feedback, and hope you take the time to check out SoftwareDaily.com.
Ep 1413AWS Virtualization with Anthony Liguori
Amazon’s virtual server instances have come a long way since the early days of EC2. There is now a wide variety of configuration options for spinning up an EC2 instance, and they can be chosen based on the workload that will be scheduled onto the virtual machine. There are also Fargate containers and AWS Lambda functions, creating even more options for someone who wants to deploy virtualized infrastructure. The high demand for virtual machines has led to Amazon moving down the stack, designing custom hardware such as the Nitro security chip, and low level software such as the Firecracker virtual machine monitor. AWS has also built Outposts, which allow for on-prem usage of AWS infrastructure. Anthony Liguori is an engineer at AWS who has worked on a range of virtualization infrastructure: software platforms, hypervisors, and hardware. Anthony joins the show to talk about virtualization at all levels of the stack.
Ep 1411International Consumer Credit Infrastructure with Brian Regan and Misha Esipov
A credit score is a rating that allows someone to qualify for a line of credit, which could be a loan such as a mortgage, or a credit card. We are assigned a credit score based on a credit history, which could be related to work history, rental payments, or loan repayments. One problem with the credit scoring system is that it is not internationalized. If I am coming from Brazil, I have a rental history of someone from Brazil. That information does not get naturally ported over to the United States. There needs to be a system for translating a foreign credit history to a US credit history. Nova Credit is a company that makes a credit passport–a system for allowing users in one geographic location to use the credit history that they have built up to have credit in another location, namely the United States. Brian Regan and Misha Esipov work at Nova Credit, and they join the show to talk about how the company works, and the problem it solves.
Ep 1410Grapl: Graph-Based Detection and Response with Colin O’Brien
A large software company such as Dropbox is at a constant risk of security breaches. These security breaches can take the form of social engineering attacks, network breaches, and other malicious adversarial behavior. This behavior can be surfaced by analyzing collections of log data. Log-based threat response is not a new technique. But how should those logs be analyzed? Grapl is a system for modeling log data as a graph, and analyzing that graph for threats based on how nodes in the graph have interacted. By building a graph from log data, Grapl can classify interaction patterns that correspond to threats. Colin O’Brien is the creator of Grapl, and he joins the show to discuss security, as well as threat detection and response.
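As a rough illustration of the idea of modeling log data as a graph, the sketch below builds an adjacency list from a few made-up log records and searches it for one interaction pattern. It is not Grapl's API or data model: the record format, the process names, and the functions `build_graph` and `find_suspicious_chains` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical log records: (source node, action, target node).
LOGS = [
    ("winword.exe", "spawned", "cmd.exe"),
    ("cmd.exe", "spawned", "powershell.exe"),
    ("powershell.exe", "connected", "203.0.113.7:443"),
    ("explorer.exe", "spawned", "chrome.exe"),
]


def build_graph(logs):
    """Turn flat log records into an adjacency list keyed by source node."""
    graph = defaultdict(list)
    for source, action, target in logs:
        graph[source].append((action, target))
    return graph


def find_suspicious_chains(graph, root, shells):
    """Return paths from `root` that pass through a shell and end in a network connection."""
    hits = []
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        for action, target in graph.get(node, []):
            if target in path:
                continue  # skip cycles
            new_path = path + [target]
            if action == "connected" and any(step in shells for step in path):
                hits.append(new_path)
            else:
                stack.append((target, new_path))
    return hits


graph = build_graph(LOGS)
shells = {"cmd.exe", "powershell.exe"}
chains = find_suspicious_chains(graph, "winword.exe", shells)
for chain in chains:
    print(" -> ".join(chain))
```

The point of the graph form is that the pattern (a document editor that eventually leads to a shell opening a network connection) spans several log lines, which is awkward to express as a filter over individual records.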
Ep 1409Static Analysis for Infrastructure with Guy Eisenkot
Infrastructure-as-code tools are used to define the architecture of software systems. Common infrastructure-as-code tools include Terraform and AWS CloudFormation. When infrastructure is defined as code, we can use static analysis tools to analyze that code for configuration mistakes, just as we could analyze a programming language with traditional static analysis tools. When a developer writes a program, that developer might use static analysis to parse a program for common mistakes–memory leaks, potential null pointers, and security holes. The concept of static analysis can be extended to infrastructure as code, allowing for the discovery of higher level problems such as insecure policies across cloud resources. Guy Eisenkot is an engineer with Bridgecrew, a company that makes static analysis tools for security and compliance. Guy joins the show to talk about cloud security and how static analysis can be used to improve the quality of infrastructure deployments.
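A minimal sketch of what static analysis over infrastructure definitions can look like is shown below. It assumes the infrastructure-as-code template has already been parsed into plain dictionaries; the resource types, field names, and rules are hypothetical and do not correspond to Terraform, CloudFormation, or any Bridgecrew tool.

```python
# Hypothetical, simplified resource definitions, standing in for a parsed
# infrastructure-as-code template.
RESOURCES = [
    {"type": "storage_bucket", "name": "logs", "public_read": True, "encrypted": False},
    {"type": "storage_bucket", "name": "backups", "public_read": False, "encrypted": True},
    {"type": "firewall_rule", "name": "ssh", "port": 22, "source": "0.0.0.0/0"},
]


def check_bucket(resource):
    """Yield findings for an (assumed) storage bucket resource."""
    if resource.get("public_read"):
        yield "bucket is publicly readable"
    if not resource.get("encrypted"):
        yield "bucket is not encrypted at rest"


def check_firewall(resource):
    """Yield findings for an (assumed) firewall rule resource."""
    if resource.get("source") == "0.0.0.0/0" and resource.get("port") == 22:
        yield "SSH is open to the entire internet"


# Map each resource type to the rule that inspects it.
CHECKS = {"storage_bucket": check_bucket, "firewall_rule": check_firewall}


def analyze(resources):
    """Run every applicable check without deploying anything."""
    findings = []
    for resource in resources:
        check = CHECKS.get(resource["type"])
        if check is None:
            continue  # no rule for this resource type
        for message in check(resource):
            findings.append((resource["name"], message))
    return findings


findings = analyze(RESOURCES)
for name, message in findings:
    print(f"{name}: {message}")
```

Because the checks only read the declared configuration, they can run in a pull request or CI job before any cloud resource is created.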
Ep 1408Social Distancing Data with Ryan Fox Squire
Social distancing has been imposed across the United States. We are running an experiment unlike anything before it in history, and it is likely to have a lasting impact on human behavior. By looking at location data of how people are moving around today, we can examine the real-world impacts of social distancing. SafeGraph is a company that provides geospatial location data to be used by developers and researchers. Much of their data is aggregated from cell phone GPS pings which identify where anonymized users are in the world. This data set provides the basis for SafeGraph’s social distancing metrics, which measure how frequently people are coming into contact with one another. Ryan Fox Squire works at SafeGraph, and he returns to the show to discuss social distancing metrics and the research that has come out of studying these metrics.
Ep 1407Dropbox Engineering with Andrew Fong
Dropbox is a consumer storage product with petabytes of data. Dropbox was originally started on the cloud, backed by S3. Once there was a high enough volume of data, Dropbox created its own data centers, designing hardware for the express purpose of storing user files. Over the last 13 years, Dropbox has developed the hardware, software, networking, data center infrastructure, and operational procedures that make its cloud storage product best in class. Andrew Fong has been an engineer at Dropbox for 8 years. He joins the show to talk about how the Dropbox engineering organization has changed over that period of time, and what he is doing at the company today.
Ep 1406Pravega: Storage for Streams with Flavio Junqueira
“Data stream” is a term that can be used in multiple ways. A stream can refer to data in motion or data at rest. When a stream is data in motion, an endpoint is receiving new pieces of data on a continual basis. Each new data point is sent over the wire and captured by the other end. Another way a stream can be represented is as a sequence of events that have been written to a storage medium. This is a stream at rest. Pravega is a system for storing large streams of data. Pravega can be used as an alternative to systems like Apache Kafka or Apache Pulsar. Flavio Junqueira is an engineer at Dell EMC who works on Pravega. He joins the show to talk about the history of stream processing and his work on Pravega.
Ep 1405Advanced Redis with Alvin Richards
Redis is an in-memory object storage system that is commonly used as a cache for web applications. This core primitive of in-memory object storage has created a larger ecosystem encompassing a broad set of tools. Redis is also used for creating objects such as queues, streams, and probabilistic data structures. Machine learning systems also need access to fast, in-memory object storage. RedisAI is a newer module for supporting machine learning tasks. For serverless computing, RedisGears allows for the execution of functions close to your Redis instance. RedisEdge allows for edge computing with Redis. Alvin Richards returns to the show to discuss the expansion of Redis to becoming a broad suite of in-memory tools, as well as the resiliency properties of Redis and usage patterns for the tool. RedisLabs is a sponsor of Software Engineering Daily, and RedisConf is a virtual conference around Redis that runs May 12-13. If you are interested in Redis, you can check out RedisConf for free by going to RedisConf.com.
Ep 1404Multicloud MySQL with Jiten Vaidya and Anthony Yeh
For many applications, a transactional MySQL database is the source of truth. To make a MySQL database scale, some developers deploy their database using Vitess, a sharding system for MySQL that is commonly deployed on Kubernetes. Jiten Vaidya and Anthony Yeh work at PlanetScale, a company that focuses on building and supporting MySQL databases sharded with Vitess. Their experience comes from working at YouTube, which has a massive, rapidly growing database for storing the information about videos on the site. Sharding is not the only database problem that YouTube faced. Availability was another issue. At YouTube, the database operators wanted YouTube’s MySQL cluster to be resilient to the failure of an entire data center. Similarly, a developer deploying an important MySQL database to the cloud wants their database to be resilient to the failure of an entire cloud provider. Jiten and Anthony join the show to talk about their work building multicloud support for MySQL, and their process of deploying a consistent MySQL database in Azure, GCP, and AWS.
Ep 1403Isolation with Courtland Allen and Anurag Goel
We are all living in social isolation due to the quarantine from COVID-19. Isolation is changing our habits and our moods, ravaging the economy, and changing how we work. One positive change is that more people have been reconnecting with their friends and family over frequent calls and video chats. Isolation is not a normal way for humans to live. We are social animals, and we need social interaction. We’ve changed how we use Internet products. There has been an evolution of trends in online shopping, social networking, and video communication software. Courtland Allen is the founder of Indie Hackers and Anurag Goel is the founder of Render, a new cloud provider. Both Courtland and Anurag are friends of mine, and join this episode to talk about how their lives are changing as a result of social isolation.