Software Engineering Daily

2,188 episodes — Page 25 of 44

Ep 1133 · Cloud Database Workloads with Jon Daniel

Relational databases such as Postgres are often used for critical workloads, such as user account data. Running a relational database service in the cloud requires the cloud provider to set up a highly durable, highly available system. Jon Daniel is an infrastructure engineer at Heroku. Jon joins the show to describe the engineering and operations required to build a managed relational database service. Full disclosure: Heroku is a sponsor of Software Engineering Daily. RECENT UPDATES: FindCollabs is a company I started recently. The FindCollabs Podcast is out! FindCollabs is hiring a React developer. FindCollabs Hackathon #1 has ended! Congrats to ARhythm, Kitspace, and Rivaly for winning 1st, 2nd, and 3rd place ($4,000, $1,000, and a set of SE Daily hoodies, respectively). The most valuable feedback award and the most helpful community member award both go to Vynce Montgomery, who will receive both the SE Daily Towel and the SE Daily Old School Bucket Hat. We are booking sponsorships for Q3 (details at https://softwareengineeringdaily.com/sponsor/). Podsheets is our open source set of tools for managing podcasts and podcast businesses. New version of Software Daily, our app and ad-free subscription service.

May 6, 2019 · 59 min

Ep 1132 · Satellite Data Platform with Tim Kelton

Satellite images contain vast quantities of data. By analyzing the contents of satellite images over time, we can identify trends in weather, soil, and agriculture. If we combine that data with ground-level sensors, we can gather a clearer understanding of how chemicals in the air or in the dirt map to how things look from above via satellite. Descartes Labs is a company that gathers high dimensional data about our planet and turns it into machine learning models to be used by customers. In order to do this, the company has built out a data pipeline involving queueing systems, machine learning frameworks, and internal tools that are used to aggregate, clean, model, and measure data. Tim Kelton is a co-founder of Descartes Labs and he joins the show to discuss the high volume of data and the distributed systems that make up the Descartes Labs infrastructure.

May 3, 2019 · 36 min

Ep 1130 · Security Monitoring with Jeff Williams

The modern software supply chain contains many different points of distribution: JavaScript frameworks, npm modules, Docker containers, open source repositories, cloud providers, on-prem firmware, IoT, networking proxies, and so much more. With so much attack surface, securing a large enterprise is an uphill battle. Jeff Williams is the CTO at Contrast Security, a company that makes infrastructure monitoring tools. Contrast Security works by intercepting network traffic at a low level and assessing whether that traffic maps to a common threat model. Jeff joins the show to talk about different approaches to monitoring and securing large infrastructure deployments. Contrast Community Edition

May 2, 2019 · 52 min

Ep 1129 · Software Growth with Greg Kogan

Growing a software business requires an understanding of engineering, sales, and marketing. As we learn software engineering, we also pick up some knowledge about how a business should operate. We know that there are customers, and that our product needs to be scalable to serve more customers. We know that some features are more important than others, and so we focus on building the features that matter the most. But unless we make a deliberate focus, engineers do not learn how to sell and market a software product. Learning how to sell and market software is an important skill to develop. It allows a software engineer to be self-sufficient. If you already know how to write software, sales and marketing are actually the only other pieces you need to be an “entrepreneur”. And the basics of sales and marketing are often easier and more fun to learn than the first painful days of learning basic programming. Greg Kogan is an engineer who has shifted his focus to working as a consultant for companies that are trying to go to market with a technical product. Greg has helped grow companies such as Netlify, Scalyr, and Domino Data Lab. Much of his work is around products targeted toward developers. Greg joins the show to describe his methodical approach to selling and marketing software.

May 1, 2019 · 44 min

Ep 1128 · Container Platform Security with Maya Kaczorowski

A Kubernetes instance occupies a wide footprint of multiple servers, creating an appealing target to an attacker, due to its access to a large pool of compute resources. A common attack against an exposed Kubernetes cluster is to take it over for the purposes of mining cryptocurrency. Thus it is important to keep a cluster secure. The importance of security is magnified for a cloud provider. A cloud provider runs a managed Kubernetes service, which might be running thousands of Kubernetes clusters. If the cloud provider’s chosen distribution of Kubernetes contains a vulnerability, or if the Kubernetes instances are misconfigured, all of these clusters could be exposed to the same vulnerability. Maya Kaczorowski works on the security of Google’s managed Kubernetes service GKE. In today’s show we discuss the attack surface of a managed Kubernetes service. Maya was previously on the show to talk about container security. This episode is a good companion to that one, as well as a previous show with Liz Rice about container security.

Apr 30, 2019 · 34 min

Ep 1127 · Lyft’s Data Platform with Li Gao

Lyft generates petabytes of data. Driver and rider behavior, pricing information, the movement of cars through space; all of this data is received by Lyft’s backend services, buffered into Kafka queues, and processed by various stream processing systems. Lyft moves the high volumes of data into a data lake for different users throughout the company to use offline. Machine learning jobs, batch jobs, streaming jobs and materialized databases can be created on top of that data lake. Druid and Superset are used for operational analytics and dashboarding. Li Gao is a data engineer at Lyft. He joins the show to explore the different aspects of Lyft’s data platform. We also talk about the tradeoffs of streaming frameworks, and how to manage machine learning infrastructure. This episode is a great companion to our show about Uber’s data platform, and illustrates some fundamental differences in how the two ridesharing companies operate.

Apr 29, 2019

Ep 1126 · Cloud with Eric Brewer

To the extent that I am a software engineering journalist, I feel inclined to scrutinize all of the cloud providers. But to the extent that I am an engineer and a business person, I feel only admiration and love for the cloud providers. Cloud computing has brought the cost of starting an Internet business down to zero. Cloud computing has opened up my eyes to a world of creative possibilities that knows no boundaries, and for that I will always be a fan of all of the rivaling cloud companies because they all have played a role in creating the current software landscape. Eric Brewer is a Google Fellow and VP Infrastructure. He is well-known for his work on the CAP theorem, a distributed systems concept that formalized the tradeoffs between consistency, availability, and partition tolerance in a distributed system. At Google, Eric is as much a strategist and product creator as he is a theoretician. He has worked on database systems such as Spanner, machine learning systems such as TensorFlow, and container orchestration systems such as Kubernetes and GKE. Eric joins the show to talk about Google’s philosophy as a cloud provider, and how his understanding of distributed systems has evolved since joining the company.

Apr 26, 2019 · 1h 4m

Ep 1125 · Intricately: Mapping the Internet with Fima Leshinsky

Intricately is a company that maps the breadth and depth of cloud infrastructure usage. Using a combination of clever algorithms, data engineering, and web crawlers, Intricately derives information about how different companies spend money on infrastructure. Fima Leshinsky is the CEO and co-founder at Intricately. In his previous job at Akamai, he began to study how a cloud provider such as Akamai could figure out how much its competitors were charging certain customers. Since CDN infrastructure is a commodity with reasonably low switching cost, a provider that can undercut its competitors significantly can have an edge in the marketplace. From his work at Akamai, Fima felt there was a market opportunity to provide this kind of service to the broader market of cloud providers. There are more cloud providers than ever before, and the kind of data that Intricately aggregates is highly useful to this competitive marketplace. Fima joins the show to talk about the modern landscape of cloud providers, and how to build a system that maps the Internet.

Apr 25, 2019 · 58 min

Ep 1124 · gVisor: Secure Container Sandbox with Yoshi Tamura

The Linux operating system includes user space and kernel space. In user space, the user can create and interact with a variety of applications directly. In kernel space, the Linux kernel provides a stable environment in which device drivers interact with hardware and manage low level resources. A Linux container is a virtualized environment that runs within user space. To perform an operation, a process in a container in user space makes a syscall (system call) into kernel space. This allows the container to have access to resources like memory and disk. Kernel space must be kept secure to ensure operating system integrity–but Linux includes hundreds of syscalls. Each syscall represents an interface between the user space and kernel space. Security vulnerabilities can emerge from this wide attack surface of different syscalls, and most applications only need a small number of syscalls to perform their required functionality. gVisor is a project to restrict the number of syscalls that the kernel and user space need to communicate. gVisor is a runtime layer between the user space container and the kernel space. gVisor reduces the number of syscalls that can be made into kernel space. The security properties of gVisor make it an exciting project today–but it is the portability features of gVisor that hint at a huge future opportunity. 
By inserting an interpreter interface between containers and the Linux kernel, gVisor presents the container world with the opportunity to run on operating systems other than Linux. There are many reasons why it might be appealing to run containers on an operating system other than Linux. Linux was built many years ago, before the explosion of small devices, smartphones, IoT hubs, voice assistants and smart cars. To be more speculative, Google is working on a secretive new operating system called Fuchsia. gVisor could be a layer that allows workloads to be ported from Linux servers to Fuchsia servers. Yoshi Tamura is a product manager at Google with a background in containers and virtualization. He joins the show to talk about gVisor and the different kinds of virtualization.

Apr 24, 2019 · 46 min

Ep 1123 · Observability Engineering with James Burns

Twilio is a communications infrastructure company with thousands of internal services and thousands of requests per second. Each request generates logs, metrics, and distributed traces which can be used to troubleshoot failures and improve latency. Since Twilio is used for 2-factor authentication and text message relaying, Twilio is critical infrastructure for most applications that implement it. The service must remain highly available even in times of peak application traffic, or outages at a particular cloud provider. When he was at Twilio, James Burns worked on platform infrastructure and observability. James was at Twilio from 2014 to 2017, a time in which the company experienced rapid growth. His work encompassed site reliability, monitoring, cost management and incident response. He also led chaos engineering exercises called “game days”, in which the company deliberately caused infrastructure to fail in order to ensure the reliability of failover systems and to discover problematic dependencies. James joins the show to talk about his time at Twilio and his perspectives on how to instrument and observe complex applications. Full disclosure: James now works at LightStep, which is a sponsor of Software Engineering Daily.

Apr 23, 2019 · 1h 5m

Ep 1122 · Serverless Runtimes with Steren Giannini

Google’s options for running serverless workloads started with App Engine. App Engine is a way to deploy an application in a fully managed environment. Since the early days of App Engine, managed infrastructure has matured and become more granular. We now have serverless databases, queueing systems, machine learning tools, and functions as a service. Developers can create fully managed, event-driven, highly scalable systems with less code and less operations. Different cloud providers are taking different approaches to offering serverless runtimes. Google’s approach involves the open source Knative project and a hosted platform for running Knative workloads called Cloud Run. Steren Giannini is a product manager at Google working on serverless tools. He joins the show to discuss Google’s serverless projects and the implementation details in building them.

Apr 22, 2019 · 51 min

Ep 1121 · Products with Ryan Hoover

RECENT UPDATES: Podsheets is our open source set of tools for managing podcasts and podcast businesses. New version of Software Daily, our app and ad-free subscription service. Software Daily is looking for help with Android engineering, QA, machine learning, and more. FindCollabs Hackathon has ended–winners will probably be announced by the time this episode airs; we will be announcing our next hackathon in a few weeks, so stay tuned. Products are an art form. As with any art, the world of products includes creators, patrons, fans, business people, and investors. Product Hunt is a place where those different people connect to build and discuss products. Products are different from other art forms in that they are measured not only through the lens of design and beauty–but also through utility. From software to books to couches to toiletries–we all have products that have improved our lives so much that we feel a deep sense of connection and hope for that product and the people behind it. Ryan Hoover is the founder of Product Hunt, a product I have found tremendous value and satisfaction from over the years. He is also a host of Product Hunt Radio, a weekly podcast with the people creating and exploring the future. Ryan joins the show to discuss products, the process of creating something useful, and his investing strategy. Ryan runs the Weekend Fund, an early stage investment fund.

Apr 19, 2019 · 57 min

Ep 1120 · Facebook OSS License Policy with Joel Marcey, Michael Cheng, and Kathy Kam

Open source policy has become a business issue as well as a political one. Businesses like Elastic, MongoDB (the company), and Redis Labs have started to view the open source licenses of the projects they work on as a means for business defensibility against cloud providers offering similar services. It remains to be seen how viable this strategy will be for the commercial open source vendors. Companies that do not directly sell commercial open source are also grappling with questions around open source licensing. Facebook has become a force in the open source world through projects like React and GraphQL. Facebook leads these projects, but Facebook is not monetizing them other than to the extent that they use the projects to build Facebook.com. Facebook’s incentives are aligned with the rest of the industry on the quality of the GraphQL and React projects. Proper licensing can help Facebook keep those incentives in alignment. Joel Marcey, Michael Cheng, and Kathy Kam from Facebook join me for a discussion of the state of open source licensing, and how that impacts Facebook.

Apr 18, 2019 · 45 min

Ep 1119 · Drishti: Deep Learning for Manufacturing with Krish Chaudhury

Drishti is a company focused on improving manufacturing workflows using computer vision. A manufacturing environment consists of assembly lines. A line is composed of sequential stations along that manufacturing line. At each station on the assembly line, a worker performs an operation on the item that is being manufactured. This type of workflow is used for the manufacturing of cars, laptops, stereo equipment, and many other technology products. With Drishti, the manufacturing process is augmented by adding a camera at each station. Camera footage is used to train a machine learning model for each station on the assembly line. That machine learning model is used to ensure the accuracy and performance of each task that is being conducted on the assembly line. Krish Chaudhury is the CTO at Drishti. From 2005 to 2015 he led image processing and computer vision projects at Google before joining Flipkart, where he worked on image science and deep learning for another four years. Krish had spent more than twenty years working on image and vision related problems when he co-founded Drishti. In today’s episode, we discuss the science and application of computer vision, as well as the future of manufacturing technology and the business strategy of Drishti.

Apr 17, 2019 · 54 min

Ep 1118 · Lyft Data Discovery with Tao Feng and Mark Grover

Lyft is a ridesharing company with petabytes of data. Within Lyft, many different employees can use those data sets to build useful applications. A business analyst creates a dashboard to see how driver satisfaction is changing over time. An economist studies the pricing data to ensure that Lyft’s prices are competitive. A data scientist creates a report of how the speed of a ride correlates with 5 star ratings. A machine learning engineer trains a model to detect fraud on the platform. All of these use cases make sense–and in each of them, the employee at Lyft needs to find the necessary data sets within the company to build their application. Amundsen is a tool for finding and discovering data sets within the company. Tao Feng and Mark Grover are engineers at Lyft and join the show to talk about the problem of data discovery and the tools they have built at Lyft.

Apr 16, 2019 · 54 min

Ep 1117 · Protein Structure Deep Learning with Mohammed Al Quraishi

Until Google DeepMind came into the field, protein structure prediction was dominated by academics. Protein structure prediction is the process of predicting how a protein will fold by looking at genetic code. Protein structure prediction is a perfect field to approach through the application of deep learning, because the inputs are high-dimensional and there is a plentiful array of different sets of labeled data. Protein structure deep learning is a field in which many different approaches are taken, often involving supervised learning and reinforcement learning. Mohammed Al Quraishi is a systems biologist at Harvard. His background spans computer engineering, statistics, and genetics. In his work, Mohammed explores the interplay between biology and computer systems. One area of Mohammed’s focus is protein structure prediction. In a blog post last year, Mohammed gave a brief history of protein structure prediction and described the significance of DeepMind entering the field. DeepMind’s AlphaFold technology surpassed all other competitors in the most recent CASP protein structure competition. Mohammed joins the show to discuss biology, academia, deep learning, and DeepMind.

Apr 15, 2019 · 56 min

Ep 1116 · Podsheets: Open Source Podcasting

Podsheets is a set of open source tools for podcast hosting, publishing, ad management, community engagement, and more. Podsheets is influenced by our experience managing Software Engineering Daily, a full-time podcast business. Software Engineering Daily is a podcast that airs 5 times per week. With 4 ads per show and 50 business weeks per year, we

Apr 14, 2019 · 56 min

Ep 1115 · Bubbles with Haseeb Qureshi

Haseeb Qureshi is an entrepreneur and investor. As a teenager, Haseeb played poker professionally through the online poker bubble. His path from poker to software entrepreneurship has been explored in previous episodes. In 2007, Haseeb and I met at an online poker table. As we battled each other for thousands of dollars, Haseeb and I realized we shared an affinity for obnoxious screen names, obnoxious online avatars, and the city of Austin, Texas. We were both living in the city, and met each other in the real world. In our earliest days, Haseeb and I were not friends. It was a strange time–we were disembodied minds, drifting on the Internet, attached mostly to the fluctuating balances of our Full Tilt Poker and Pokerstars accounts. This was not a time for friendship–it was a time for ruthless, modern competition. Haseeb grew tired of poker. He wrote a book about the game to memorialize his thoughts, then abandoned it. He studied philosophy and literature, searching for something new in the historical musings of humanity. He traveled Europe, working as a farmer to reconnect with the physical world. He discovered the Effective Altruism movement. Finding no solace in his poker spoils, Haseeb gave away most of his money and started from scratch. As he rebuilt himself, he found software engineering and charted a path to San Francisco, where we reconnected. In this episode, Haseeb joins me for a discussion of software, philosophy, poker, and the nature of bubbles. Indeed, Haseeb and I have now lived through four major bubbles: dot coms, poker, the 2008 financial crisis, and the crypto bubble. Throughout these bubbles, the mediums change but never does the message: human beings are deeply irrational, tribalistic, and emotional.

Apr 12, 2019 · 1h 34m

Ep 1112 · Consul Service Mesh with Paul Banks

RECENT UPDATES: FindCollabs $5,000 Hackathon ends Saturday April 15th, 2019. New version of Software Daily, our app and ad-free subscription service. Software Daily is looking for help with Android engineering, QA, machine learning, and more. Consul is a tool from HashiCorp that allows users to store and retrieve information from a highly available key/value data store. Consul is used for storage of critical cluster information, such as service IP locations and configuration data. A service interacts with Consul via a daemon process on the node of that service. The daemon process periodically shares information with the Consul server over a gossip UDP protocol and can share data on a more immediate basis using TCP. Consul’s functionality has increased recently to add secure service connectivity. Consul Connect allows services to establish mutual TLS encryption with each other. The addition of mutual TLS to the Consul feature set closely coincides with Consul gaining the title of “service mesh.” Service mesh is an increasingly popular pattern that can encompass a variety of features: load balancing, security policy management, service discovery, and routing. Tools which offer self-described “service mesh” functionality include Linkerd, Kong, AWS App Mesh, Solo.io Gloo, and Google’s Istio open source project. Paul Banks is the engineering lead of Consul at HashiCorp. He joins the show to talk about the service mesh category and the past, present, and future of Consul.

Apr 11, 2019 · 58 min

Ep 1111 · Machine Learning Joins with Arun Kumar

Data sets can be modeled in a row-wise, relational format. When two data sets share a common field, those data sets can be combined in a procedure called a join. A join combines the data of two data sets into one data set that often occupies more space than the two original data sets did separately. In fact, this new data set is often so much bigger that it creates problems for machine learning engineers. Arun Kumar is an assistant professor at UC San Diego. He joins the show to discuss the modern lifecycle of machine learning models, and the gaps in the tooling. Arun’s research into improving processing of joined data sets has been adopted by companies such as Google. Some of that research has been adapted into open source machine learning tools that improve the performance of machine learning jobs with minimal code required.
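
As a rough illustration of why a joined data set can outgrow its inputs, here is a minimal Python sketch. The table names, fields, and values are hypothetical and are not taken from the episode or from Arun's research; the point is only that each driver's attributes get copied onto every matching ride.

```python
# Hypothetical toy tables; all names and values are illustrative assumptions.
rides = [
    {"ride_id": 1, "driver_id": "d1"},
    {"ride_id": 2, "driver_id": "d1"},
    {"ride_id": 3, "driver_id": "d1"},
    {"ride_id": 4, "driver_id": "d2"},
    {"ride_id": 5, "driver_id": "d2"},
]
drivers = [
    {"driver_id": "d1", "city": "Austin", "rating": 4.9},
    {"driver_id": "d2", "city": "Boston", "rating": 4.7},
]


def inner_join(left, right, key):
    """Hash join: index the right table by key, then merge matching rows."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def cell_count(table):
    """Count stored values (rows times columns) as a crude size measure."""
    return sum(len(row) for row in table)


joined = inner_join(rides, drivers, "driver_id")
cells_before = cell_count(rides) + cell_count(drivers)
cells_after = cell_count(joined)
print(cells_before, cells_after)
```

In this toy case the joined table stores more values than the two inputs together, because the city and rating of each driver are repeated for every ride that driver gave.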

Apr 10, 20191h 2m

Ep 1108Streaming with Holden Karau

RECENT UPDATES: FindCollabs $5000 Hackathon Ends Saturday April 15th, 2019 New version of Software Daily, our app and ad-free subscription service Software Daily is looking for help with Android engineering, QA, machine learning, and more Distributed stream processing allows developers to build applications on top of large sets of data that are being rapidly created. Stream processing is often described as an alternative to batch processing. In batch processing, a single large computation is performed over a large, static data set. In stream processing, a computation is performed repeatedly and continuously over a data set that is being appended to. A stream is often stored in a distributed queue such as Kafka, Kinesis, Pulsar, or Google PubSub. A stream is often processed with a stream processing tool such as Spark, Flink, Storm, or Google Cloud Dataflow. Holden Karau is an engineer who works on open source projects at Google. She returns to the show to describe the state of stream processing and discuss modern best practices.
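The batch versus stream distinction can be sketched in a few lines of Python. This is an illustration only, with no Kafka, Spark, or Flink involved: `batch_mean`, `RunningMean`, and the sample events are hypothetical names chosen for this example. The batch function runs once over a complete list, while the streaming object updates a small piece of state each time a new value is appended.

```python
def batch_mean(values):
    """Batch: a single computation over a complete, static data set."""
    return sum(values) / len(values)

class RunningMean:
    """Streaming: keep small state and update it as each new event arrives."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        self.count += 1
        self.total += value
        return self.total / self.count

# Hypothetical events; in practice these would arrive from a queue over time.
events = [4.0, 8.0, 6.0, 2.0]

stream = RunningMean()
running = [stream.update(value) for value in events]

print(running)             # one up-to-date answer per event
print(batch_mean(events))  # one answer, available only after all data is present
```

Once the stream has consumed the same events, its latest value matches the batch result. The difference is that the streaming version had an answer after every event and never needed to re-read earlier data.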

Apr 9, 201948 min

Ep 1106AWS Storage with Kevin Miller

RECENT UPDATES: FindCollabs $5000 Hackathon Ends Saturday April 15th, 2019 New version of Software Daily, our app and ad-free subscription service Software Daily is looking for help with Android engineering, QA, machine learning, and more A software application requires compute and storage. Both compute and storage have been abstracted into cloud tools that can be used by developers to build highly available distributed systems. In our previous episode, we explored the compute side. In today’s episode we discuss storage. Application developers store data in a variety of abstractions. In-memory caches allow for fast lookups. Relational databases allow for efficient retrieval of well-structured tables. NoSQL databases allow for retrieval of documents that may have a less defined schema. File storage systems allow the access pattern of nested file systems, like on your laptop. Distributed object storage systems allow for highly durable storage of any data type. Amazon S3 is a distributed object storage system with a wide spectrum of use cases. S3 is used for media file storage, archiving of log files, and data lake applications. S3 functionality has increased over the years, developing different tiers of data retrieval latency and cost structure. AWS S3 Glacier allows for long-term storage of data at a large cost reduction, in exchange for increased latency of data access. Kevin Miller is the general manager of Amazon Glacier at Amazon Web Services. He joins the show to talk about the history of storage, the different options for storage in the cloud, and the design of S3 Glacier.

Apr 8, 201951 min

Ep 1105AWS Compute with Deepak Singh

Upcoming event: FindCollabs Hackathon at App Academy on April 6, 2019 On Amazon Web Services, there are many ways to run an application on a single node. The first compute option on AWS was the EC2 virtual server instance. But EC2 is a large abstraction compared to what many people need for their nodes–which is a container with a smaller set of resources to work with. Containers can be run within a managed cluster like ECS or EKS, or run on their own as AWS Fargate instances, or simply as Docker containers running without a container orchestration tool. Beyond the option of explicit container instances, users can run their application as a “serverless” function-as-a-service such as AWS Lambda. Functions-as-a-service abstract away the container and let the developer operate at a higher level, while also providing some cost savings. Developers use these different compute options for different reasons. Deepak Singh is the director of compute services at Amazon Web Services, and he joins the show to discuss the use cases and tradeoffs of these options. Deepak also discusses how these tools are useful internally to AWS. ECS and Lambda are high-level APIs that are used to build even higher level services such as AWS Batch, which is a service for performing batch processing over large data sets.

Apr 5, 201954 min

Ep 1103Data with Ben Lorica

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Ben Lorica is the chief data scientist at O’Reilly Media and the program director of the Strata Data Conference. In his work, Ben spends time with people across the software industry, giving him broad perspective. In the early days of the data engineering ecosystem, the Hadoop vendor wars were starting between Cloudera and Hortonworks. Strata was a neutral ground for practitioners and open source contributors to meet and share ideas about the Hadoop ecosystem. Since then, the conference has grown to encompass topics such as data science, distributed databases, streaming frameworks, and machine learning. There are many open questions in the data world right now. What is the best path that an enterprise can take to build out a data platform? How should a software team be arranged to efficiently build machine learning models? Which distributed streaming frameworks should I use for what purpose? Ben joins the show to discuss modern data engineering, data science, and infrastructure.

Apr 4, 201947 min

Ep 1102Stablecoins with Rune Christensen

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 A currency can fulfill numerous financial use cases. One use case is store of value: currency holders can reliably expect their currency to maintain some value, though that value may fluctuate over time. Another use case is speculation: currency holders are owning currency in the hope that the market price of the currency will increase over time. Bitcoin is a useful store of value and an instrument for speculation. However, Bitcoin still does not fulfill the financial use case that most people need from a currency: price stability. The price of Bitcoin fluctuates rapidly, making it difficult to use Bitcoin for small purchases such as coffee. Imagine you want to buy a cup of coffee with Bitcoin. The coffee shop owner needs to offer the option to sell you that cup of coffee using Bitcoin as the medium of exchange. This owner must denominate the price of that coffee as some number of Bitcoin. Since the price of Bitcoin fluctuates so rapidly, the coffee shop owner needs to adjust the price of that cup of coffee constantly in order to make sure that the coffee is cheap enough for the consumer to want to buy it, but expensive enough to make a profit. It is hard to assign prices to market goods in terms of Bitcoin because the currency is in constant flux. Even though many of us would like to use Bitcoin in our everyday lives, most marketplaces are denominated in US dollars or other currencies because a marketplace needs a stable currency in order to operate. Rune Christensen is the CEO of MakerDAO, a system that provides a price-stable cryptocurrency. MakerDAO is an elegant set of currencies, collateralized debt, smart contracts, and other incentive tools that result in the creation of several transparent, decentralized financial instruments. 
Rune joins the show to talk about the importance of stablecoins and how MakerDAO has engineered a decentralized currency that has maintained stability even through tumultuous market conditions.

Apr 3, 20191h 11m

Ep 1101Blitzscaling with Chris Yeh

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Chris Yeh is an entrepreneur, investor, and author. He co-wrote Blitzscaling with LinkedIn founder Reid Hoffman. Blitzscaling is a strategy for growing a company that has found product market fit. Blitzscaling prioritizes speed over efficiency, arguing that fast growth is necessary to achieve “first scaler advantage.” When a company is the first to scale successfully within a large market, that company gains access to a wealth of market opportunities that are not available to companies which are not at scale. Examples of successful Blitzscalers include Airbnb, LinkedIn, Amazon, and Facebook. In the hypergrowth phases of these companies, there were deliberate strategic tradeoffs that caused the company to suffer in the short term in exchange for the chance at market dominance in the long term. Blitzscaling is a broad strategic concept which manifests differently in different companies. When Airbnb was in its early stages of growth in 2011, the company was faced with the existential threat of a European competitor called Wimdu. Wimdu offered to sell to Airbnb, but this would have required the merger of two companies with distinctly different cultures. Instead, Airbnb chose to raise more money and rapidly expand into Europe. In contrast, Google’s rapid path to becoming a dominant information service involved acquisitions that we now see as key Google products, including Android, Google Maps, and Google Earth. Through numerous examples in recent business history, Blitzscaling explores the fundamental tradeoff between speed and efficiency, usually biasing speed as the preferable element. But Blitzscaling does not work for every company. In the food delivery sector, many companies who tried to blitzscale ended up going out of business because they had lowered their prices too much in order to try to earn customer loyalty. 
By lowering their prices too much, food delivery startups built businesses with fundamentally bad unit economics and a fickle customer base. In other cases, aggressive blitzscaling can work for a short period of time, but can cause a company’s culture to suffer in ways that are very hard to repair. Blitzscaling can also cause problems in a core software product. Growing too quickly can cause a product to have a bloated user interface. If the backend infrastructure layer expands too quickly, sensitive data could be left exposed due to a lack of proper software security policies. Chris Yeh joins the show to talk about the strategy of Blitzscaling and his wide-ranging career. Chris studied creative writing and product design at Stanford before joining DE Shaw, the famous quantitative hedge fund. Later, he became an investor and worked in several leadership roles in software companies. His wide range of experiences make Chris an excellent author and conversationalist. We explored the ideas of both Blitzscaling and his previous book The Alliance, which lays out a modern vision for the dynamic between employers and employees. We also talked about investing, Dungeons and Dragons, and podcasting.

Apr 2, 20191h 8m

Ep 1100Uber Infrastructure with Prashant Varanasi and Akshay Shah

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Uber’s infrastructure supports millions of riders and billions of dollars in transactions. Uber has high throughput and high availability requirements, because users depend on the service for their day-to-day transportation. When Uber was going through hypergrowth in 2015, the number of services was growing rapidly, as was the load across those services. Using a cloud provider was a risky option, because the costs could potentially grow out of control. Uber made a decision early on to invest in physical hardware in order to keep costs at a reasonable level. In the last 3 years, Uber’s infrastructure has stabilized. The platform engineering team has built systems for monitoring, deployment, and service proxying. Developing and maintaining microservices within Uber has become easier. Prashant Varanasi and Akshay Shah are engineers who have been with Uber for more than three years. They work on Uber’s platform engineering team, and their current focus is on the service proxy layer, a sidecar that runs alongside Uber services providing features such as load balancing, service discovery, and rate limiting. Prashant and Akshay join the show to talk about Uber infrastructure, microservices, and the architecture of a service proxy. We also talk in detail about the benefits of using Go for critical systems infrastructure, and some techniques for profiling and debugging in Go.

Apr 1, 20191h 5m

Ep 1099Workload Scheduling with Brian Grant

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Google has been building large-scale scheduling systems for more than fifteen years. Google Borg was started around 2003, giving engineers at Google a unified platform to issue long-lived service workloads as well as short-lived batch workloads onto a pool of servers. Since the early days of Borg, the scheduler systems built by Google have matured through several iterations. Omega was an effort to improve the internal Borg system, and Kubernetes is an open source container orchestrator built with the learnings of Borg and Omega. A scheduling system needs to be able to accept a wide variety of workload types and find compute resources within a cluster to schedule those workloads onto. There is a wide variety of potential workloads that could be scheduled–batch jobs, stateful services, stateless services, and daemon services. Different workloads can have different priority levels. A high priority workload should be able to find compute resources quickly, and a low priority workload can wait longer to find resources. Brian Grant is a principal engineer at Google. He joins the show to talk about his experience building workload schedulers and designing APIs for engineers to interface with those schedulers.
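As a rough illustration of priority-aware placement, the sketch below orders workloads by priority and assigns each to the first node with enough free CPU. It is a toy model under stated assumptions, not how Borg, Omega, or Kubernetes are implemented. The `schedule` function, the workload tuples, and the node names are all hypothetical.

```python
import heapq

def schedule(workloads, nodes):
    """Place workloads on nodes, highest priority first, first-fit by CPU.

    workloads: list of (name, priority, cpu_needed); a larger priority is more urgent.
    nodes: dict mapping node name to free CPU.
    Returns (placements, pending), where pending lists workloads that must wait.
    """
    free = dict(nodes)  # copy so the caller's dict is not modified
    # heapq is a min-heap, so negate the priority; the index keeps submission order on ties.
    heap = [(-priority, i, name, cpu) for i, (name, priority, cpu) in enumerate(workloads)]
    heapq.heapify(heap)

    placements, pending = {}, []
    while heap:
        _, _, name, cpu = heapq.heappop(heap)
        target = next((node for node, capacity in free.items() if capacity >= cpu), None)
        if target is None:
            pending.append(name)  # no node has room; the workload waits
        else:
            free[target] -= cpu
            placements[name] = target
    return placements, pending

# Hypothetical cluster and workloads.
workloads = [
    ("nightly-batch", 1, 4),
    ("checkout-service", 10, 4),
    ("metrics-daemon", 5, 1),
]
nodes = {"node-a": 4, "node-b": 2}

placements, pending = schedule(workloads, nodes)
print(placements, pending)
```

Even though the batch job was submitted first, the high priority service takes the large node and the batch job is left pending. Real schedulers add preemption, bin packing, and many more resource dimensions on top of this basic idea.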

Mar 29, 201946 min

Ep 1098Peloton: Uber’s Cluster Scheduler with Min Cai and Mayank Bansal

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Google’s Borg system is a cluster manager that powers the applications running across Google’s massive infrastructure. Borg provided inspiration for open source tools like Apache Mesos and Kubernetes. Over the last decade, some of the largest new technology companies have built their own systems that fulfill the roles of cluster management and resource scheduling. Netflix, Twitter, and Facebook have all spoken about their internal projects to make distributed systems resource allocation more economical. These companies find themselves continually reinventing scheduling and orchestration, with inspiration from Google Borg and their own internal experiences running large numbers of containers and virtual machines. Uber’s engineering team has built a cluster scheduler called Peloton. Peloton is based on Apache Mesos, and is architected to handle a wide range of workloads: data science jobs like Hadoop MapReduce; long running services such as a ridesharing marketplace service; monitoring daemons such as Uber’s M3 collector; and database services such as MySQL. Min Cai and Mayank Bansal are engineers at Uber who work on Peloton. When they set out to create Peloton, they looked at the existing schedulers in the ecosystem, including Kubernetes, Mesos, Hadoop’s YARN system, and Borg itself. Both Min and Mayank join the show today to give a brief history of distributed systems schedulers and discuss their work on Peloton. They have been working in the world of distributed systems schedulers for many years–including experiences building core Hadoop infrastructure and virtual machine schedulers at VMware.

Mar 28, 201949 min

Ep 1097Scaling Log Management with Renaud Boutet

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Log management requires the processing and indexing of high volumes of semi-structured data. A log management service takes log data and puts it in a cloud-hosted application so that application operators can access those logs to troubleshoot issues. A large tech company will produce terabytes of logs. Those logs are produced on the host where a service is running. A logging agent on that host will transfer the logs to the log management service in the cloud. Once the logs are in the cloud, they are parsed, indexed, and stored in a way that is easy to query. In 2014, Renaud Boutet co-founded Logmatic, a log management service that eventually became a leading provider. Logmatic was acquired by Datadog, and Renaud now works as a vice president at Datadog. In today’s episode, Renaud joins the show to talk about the architecture of a log management service. We talk about storage tiers, scalability requirements, failover strategies, and logging for serverless functions. Full disclosure: Datadog is a sponsor of Software Engineering Daily.
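The parse, index, and query steps described above can be sketched as a tiny in-memory index. This is an assumption-laden toy, not Logmatic's or Datadog's design. The log line format, the `LogIndex` class, and the field names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line format: "<timestamp> <LEVEL> <service> <message>"
LINE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)$")

def parse(line):
    """Turn a semi-structured log line into a dict, or None if it does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None

class LogIndex:
    """Stores parsed records and an inverted index from (field, value) to record positions."""

    def __init__(self):
        self.records = []
        self.by_field = defaultdict(set)

    def ingest(self, line):
        record = parse(line)
        if record is None:
            return False  # unparseable lines are skipped in this sketch
        position = len(self.records)
        self.records.append(record)
        for field in ("level", "service"):
            self.by_field[(field, record[field])].add(position)
        return True

    def query(self, **filters):
        """Return records matching every field=value filter, in ingestion order."""
        ids = None
        for field, value in filters.items():
            hits = self.by_field.get((field, value), set())
            ids = hits if ids is None else ids & hits
        if ids is None:
            ids = range(len(self.records))
        return [self.records[i] for i in sorted(ids)]

index = LogIndex()
lines = [
    "2019-03-27T10:00:00Z INFO checkout order placed",
    "2019-03-27T10:00:01Z ERROR checkout payment timeout",
    "2019-03-27T10:00:02Z ERROR search shard unavailable",
    "garbage",
]
accepted = [index.ingest(line) for line in lines]
errors = index.query(level="ERROR", service="checkout")
print(errors)
```

The query intersects two small sets of positions instead of scanning every record, which is the basic reason indexed logs are faster to search than raw files.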

Mar 27, 201949 min

Ep 1096Security Businesses with Steve Herrod

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Steve Herrod was the CTO at VMware and now works as a managing director at General Catalyst, where he focuses on investments relating to security. Large enterprises are difficult to secure. An enterprise has sprawling infrastructure, with both on-prem and cloud infrastructure. Identity management systems, vulnerability scanning, secure network infrastructure, and policy management tools are just a few example areas where enterprises spend billions of dollars on security software. Threats often make their way into an enterprise by way of social engineering. This can result in phishing attacks, corporate espionage, and ransomware. Protecting against social engineering is very difficult, as there are so many channels to communicate through–Facebook Messenger, LinkedIn, email, and ad networks can all be used to perform social engineering attacks. Enterprise security software is a very different business from other types of software companies. Unlike developer tools or cloud infrastructure, security software is usually not self-serve. Security solutions usually require a longer sales and integration process with a customer. Steve Herrod joins the show to talk about the enterprise security world, the go-to-market strategy for successful security companies, and his perspective on what makes for a viable venture capital investment.

Mar 26, 20191h 15m

Ep 1095CodeSandbox: Online Code Editor with Bas Buursma and Ives van Hoorne

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Coding in the browser has been attempted several times in the last decade. Building a development environment in the browser has numerous technical challenges. How does the code execute safely? How do you fit all of the requirements of a development environment into a browser window? How do you get users to switch from their normal IDE (integrated development environment)? CodeSandbox is an online code editor created by Ives van Hoorne and Bas Buursma. CodeSandbox allows users to program and run applications in the browser. It is a full developer platform that allows users to install npm modules, run their code, and share their applications with other users. The engineering problems within CodeSandbox are not easy–building a web-based IDE is complicated. But CodeSandbox is also an exciting project because it lowers the barrier to entry for many newer programmers. The development experience for a new programmer is still a difficult onramp. If you are an experienced developer, you have a workflow that you are comfortable with. It might involve vim, or emacs, or JetBrains IDEs, or Eclipse. But newer developers can find these environments confusing and hard to get started with. The development environments of today are integrated with build tools, GitHub repositories, and deployment platforms. This can be overwhelming for a newer developer. CodeSandbox is a very visual tool, which makes it especially useful for new developers who learn through seeing examples running live in the browser. CodeSandbox is also used by web developers who want a modern, shareable form of developing software. Ives and Bas join the show to talk about the motivation for CodeSandbox and the engineering challenges they have solved.

Mar 25, 201950 min

Ep 1094Apache Superset with Maxime Beauchemin

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Data engineering touches every area of an organization. Engineers need a data platform to build search indexes and microservices. Data scientists need data pipelines to build machine learning models. Business analysts need flexible dashboards to understand the trends and customer use for a product. Max Beauchemin is a data engineer who has worked at Airbnb, Lyft, and Facebook. He’s the creator of two successful open source projects: Apache Airflow and Apache Superset. In a previous show, Max discussed data engineering at Airbnb, and the usage of Airflow. In today’s show, Max discusses the engineering of Apache Superset. Superset is an open source business intelligence web application. Superset allows users to create visualizations, slice and dice their data, and query it. Superset integrates with Druid, a database that supports exploratory, OLAP-style workloads. One reason Superset is distinctive is that it is a full open source application. Many open source projects are tools like databases, command line tools, and web frameworks. Superset is an open source application that can be used by individuals who are not developers–so the audience is wider than the typical open source tool built for engineers. Max joins the show to talk about his experience as a data engineer at Airbnb and Lyft, and the open source projects he has started.

Mar 22, 20191h 2m

Ep 1093FaunaDB with Evan Weaver

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Twitter’s early engineers faced scalability problems that caused infrastructure failures on a regular basis. The infamous “fail whale” could happen as a result of problems in the application servers, the network, or the database layer. When Twitter was scaling in its early days, the cloud providers were still immature. Engineers did not have access to the autoscaling cloud infrastructure that is available today. The early Twitter architecture was a combination of open source tools and internally created infrastructure custom built for Twitter’s workloads. Evan Weaver was an early engineer at Twitter, and he saw the deficiencies of the data tools that the company had access to. Twitter engineers wanted access to a truly reusable data platform that would fit Twitter’s requirements: high availability, globally replicated, and transactionally consistent. By 2012, Evan had left Twitter and started consulting for other technology companies. He found that databases across the industry were lacking the same properties that Twitter wanted, and the ideas for FaunaDB began to percolate. Around this time, there were two relevant papers about distributed databases that had come out: the Spanner paper from Google and the Calvin paper, a distributed systems paper from Yale. With inspiration from the literature, his time at Twitter, and his knowledge from consulting, Evan started FaunaDB. Seven years later, FaunaDB is a fully fledged open source project as well as a database company with a cloud service offering. Fauna is an OLTP database used by companies like Nvidia, Nextdoor, and Capital One. Evan joins the show to talk about his time spent scaling Twitter and the architecture of FaunaDB.

Mar 21, 201952 min

Ep 1092ElasticSearch at Scale with Volkan Yazici

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Bol.com is the biggest e-commerce company in the Netherlands and Belgium. For 20 years, Bol has been developing its software architecture, which includes a variety of services and databases, and a mix of physical and cloud infrastructure. For an ecommerce company, the search engine is critical for allowing customers to find the products they are looking for. But search also has many applications for internal systems. A search engine is a database with a query engine, and internal application developers want to build on top of that database. Volkan Yazici is an engineer at Bol.com specializing in search and the author of the blog post Using ElasticSearch as the Primary Data Store. In his post, Volkan describes the process of scaling ElasticSearch to fit the use cases of both internal and external users at a large ecommerce company. Volkan joins the show to discuss how search infrastructure at scale can require a carefully architected data pipeline in order to propagate changes to a large data set to a search index.

Mar 20, 201953 min

Ep 1091Serverless GraphQL with Tanmai Gopal

Upcoming events: A Conversation with Haseeb Qureshi at Cloudflare on April 3, 2019 FindCollabs Hackathon at App Academy on April 6, 2019 Modern web development tools have given frontend developers more power. On the frontend, JavaScript frameworks like React and Vue have become easier to work with. For deployment, tools like Netlify and Zeit give developers a workflow that is tightly integrated with GitHub. At the database layer, autoscaling document storage systems like Firebase and hosted Mongo solutions make it easier to work with objects. There are also a multitude of APIs that give developers rich business functionality out of the box, making it easy to build applications around SMS, payments, and computer vision. If you are building a new application today, you have the option to build it around a completely “serverless” architecture. As the backend and frontend have changed, the middleware to communicate between those layers has also evolved. GraphQL is a modern way of fetching data from disparate data sources. In previous episodes, we have talked about how GraphQL works, and some common patterns for using GraphQL in mature applications. In today’s episode, Tanmai Gopal joins the show to describe how to use GraphQL in newer applications. Tanmai is the CEO of Hasura, a company building tools around GraphQL. He discusses the advantages of using serverless functions together with GraphQL, and how to architect an event-based serverless application.

Mar 19, 201955 min

Ep 1090OSS Businesses with Mike Volpi

In the world of commercial open source, there is plenty of room for both point solution providers and cloud providers. But they are competing for the same customers, and the competitive battlefield is expanding to the nuanced world of software licensing. By changing their licenses, open source projects like Kafka, MongoDB, and Redis can prohibit AWS from certain usage patterns. This might offer some protection for companies based around the point solutions–companies like Confluent and RedisLabs. Beyond the fracas of the battle between cloud providers and point solutions, there are newer open source companies with models that do not fit tightly into any historical business models. HashiCorp makes a suite of differentiated open source tools that have not been seriously contested or offered as a service by cloud providers. GitLab makes an open source platform that is built with monitoring, logging, CI, and code hosting out of the box. As the world of open source business models expands, more companies will find opportunity in open sourcing the code that runs their products. In many cases, they will find that it strengthens their advantage rather than weakens it. The defensibility of many businesses relies more on data and network effects than the contents of the codebase. We may see the default question gradually shift from “why should I open source my codebase?” to “why shouldn’t I open source my codebase?” Mike Volpi is a partner at Index Ventures and has invested in many open source businesses over the last decade. He is on the board of Confluent, Cockroach Labs, Kong, and Elastic. Mike joins the show to share his perspective on open source business models of the past, present, and future.

Mar 18, 20191h 4m

Ep 1089Crypto Bubble with Haseeb Qureshi

This is a post written and narrated by Haseeb Qureshi, a cryptocurrency investor and entrepreneur. Haseeb is speaking at an upcoming Software Engineering Daily Meetup. The ICO bubble had no single cause. Mono-causal explanations always fall short in explaining complex phenomena. But its effects are easier to pinpoint. There are now many world class teams well-capitalized to build, scale, and evolve blockchain technology, and tens of millions of people in the world who now understand decentralization, proof of work, and private keys. Looking back, it’s really quite amazing! It comes at a high cost, but Perez hints: it’s likely that bubbles like these are the only way to overcome technological inertia. At the same time, most people had their first interaction with crypto during its orgiastic adolescence. It’s not a great look. But this has been true for every technological revolution of the last 250 years. In that regard, crypto is in good company. I was too young to appreciate the dot com bubble when it happened. It’s strange to say, but I’m glad to have witnessed a speculative bubble from up close. I’ve now got war stories to share with future generations. It was a wild time, when anyone in the world could launch a coin and raise tens of millions of dollars to build a network that no one could control. I don’t think we’ll see anything like that again for a long time. So what happens now? If you believe that crypto has the stuff of a technological revolution, then as Perez puts it, the collapse will pave the way for a more fruitful deployment phase. At the end of the day, I’m an optimist about technology. So it won’t surprise you that I think this deployment phase is coming. But it will be slow, unglamorous, and probably won’t make for nearly as entertaining of headlines. Oh well.

Mar 17, 201947 min

Ep 1087GitLab with Sid Sijbrandij

GitLab is an open source platform for software development. GitLab started with the ability to manage git repositories and now has functionality for collaboration, issue tracking, continuous integration, logging, and tracing. GitLab’s core business is selling to enterprises who want a self-hosted git installation, such as banks or other companies who prefer not to use a git service in the cloud. The vision for GitLab is to provide a platform for managing the full software development lifecycle, from code hosting to deployment–as well as tools for observability and project management. Sid Sijbrandij is the CEO of GitLab and he joins the show to talk about the product, the business, and the company’s vision for the future. GitLab’s strategy is to offer a set of tools that work for developers out of the box, cutting down on time spent integrating each individual vendor.

Mar 15, 20191h 2m

Ep 1086Linux Kernel Development with Shuah Khan

An operating system kernel manages the system resources that are needed to run applications. The Linux kernel runs most of the smart devices that we interact with, and is the largest open source project in history. Shuah Khan has worked on operating systems for two decades, including 13 years at HP and 5 years at Samsung. She has worked on proprietary operating systems and a variety of Linux operating system environments, including mobile devices. Shuah joins the show to discuss her work within Linux and her experience contributing to open source. Shuah has made significant contributions to kselftest, a set of tests for Linux. Testing the Linux kernel is complicated. Because there is so much depth to the codebase, and such a variety of ways that Linux can be used, there is also a variety of ways that the operating system gets tested. There is smoke testing, performance testing, and regression testing. There are trees of tests, and as a developer you may only want to run a subset of the tests in that tree. The conversation with Shuah ranged from the low level practices of testing the kernel to a high level discussion of how the Linux kernel can reveal dynamics of human nature.

Mar 14, 2019 · 1h 3m

Ep 1085: Cryptojacking: Bitcoin Malware with Estaban Vargas

Malware is malicious software that makes money for the creator of that software. Malware can end up on a user’s computer if that user visits a malicious website or installs malicious software by accident. There are many types of malware. Spyware sits on your machine and logs your data in order to sell it. Ransomware can lock your computer and demand that you pay money to unlock it. Adware serves you popup ads that you don’t want to see. Cryptojacking is a newer form of malware. Cryptojacking software uses your computer to mine Bitcoin and other cryptocurrencies. Cryptojacking can occur when you visit a website whose JavaScript includes a mining script that executes along with the rest of the webpage. When you visit a website with a cryptojacker, your computer will become slower, because your CPU is being taken over to mine cryptocurrency. Cryptojacking can occur anywhere that code runs–and there is a lot of code running on cloud providers. Cloud providers themselves are very secure. But a cloud provider cannot force its customers to be secure. Users who host an insecure application on a cloud provider may get infected with a cryptojacker. If I host a large, complex website on a cloud provider, and I’m serving millions of users, I’m already paying a lot in cloud costs. But when my application gets infected with a cryptojacker, my costs could shoot up. And if I don’t know why my costs are increasing, I might leave the cloud provider. Estaban Vargas is the co-founder of SafeTalpa, a company that provides defense against cryptojackers. Estaban joins the show to explain how cryptojackers work and why cloud providers have trouble defending against them.

Mar 13, 2019 · 53 min

Ep 1084: Ad Fraud Engineering with Praneet Sharma and Shailin Dhar

Advertising fraud occurs when a brand pays for an advertisement online and that advertisement is shown to an automated bot account that has been created to view ads. Advertising fraud is rampant on the Internet. It’s not possible to know how much money is lost to ad fraud, but the costs are in the billions of dollars. Praneet Sharma and Shailin Dhar are the founders of Method Media Intelligence, a company that builds solutions around improving advertising quality. In previous shows, Praneet and Shailin have described the online advertising ecosystem in detail. They have told stories of bot farms, replay attacks, and adtech companies. In today’s episode, Praneet and Shailin return to the show to discuss how advertising fraud is getting worse–not better. Praneet and Shailin worked with BuzzFeed reporter Craig Silverman, who was a previous guest on the show to talk about his remarkable findings about mobile advertising fraud, which accounts for hundreds of millions of dollars in theft every year.

Mar 12, 2019 · 49 min

Ep 1083: Energy Market Machine Learning with Minh Dang and Corey Noone

The demand for electricity is based on the consumption of the electrical grid at a given time. The supply of electricity is based on how much energy is being produced or stored on the grid at a given time. Because these sources of supply and demand fluctuate rapidly but predictably, energy markets present profit opportunities for traders. Minh Dang and Corey Noone are engineers with Advanced Microgrid Solutions, a company that builds software to help traders capture better opportunities in the energy markets. Minh and Corey join the show to talk about how their company builds and deploys machine learning models for market prediction. We discussed data infrastructure, machine learning model deployments, and the dynamics of the energy markets.

Mar 11, 2019 · 45 min

Ep 1082: Netlify with Mathias Biilmann Christensen

Cloud computing started to become popular in 2006 with the release of Amazon EC2, a system for deploying applications to virtual machines sitting on remote data center infrastructure. With cloud computing, application developers no longer needed to purchase expensive server hardware. Creating an application for the Internet became easier, cheaper, and simpler. As the cloud has become popular, new ways of deploying applications have emerged. A developer with a web app has so many different options. You can host your app on an Amazon EC2 server, which will require you to manage cloud infrastructure in case your server crashes. You can deploy your app to Heroku, which gives your cloud deployment better uptime guarantees for a higher price than Amazon EC2. Or you can use Linode, or Microsoft Azure, or Google Cloud. There is such a large market for cloud computing that the world of cloud providers serves more niches every year. In past episodes we have explored a variety of different cloud providers, and the markets they target. Pivotal Cloud Foundry is for managing complex distributed systems applications, typically with large teams. Firebase is a cloud provider that simplifies the developer experience for applications with small teams. Spotinst is a cloud provider that emphasizes low cost. Zeit is a cloud provider that is built to manage applications through serverless “functions-as-a-service” like AWS Lambda. In today’s episode, Mathias Biilmann Christensen, CEO of Netlify, joins the show. Netlify is a cloud provider that was built for modern web projects. Netlify represents the convergence of several trends in software development: static site deployment, serverless functions, a desire to have a “no-ops” deployment with minimal management, and the rise of newer tools like GraphQL and Gatsby. Mathias explores these trends in detail, and discusses the technical challenges of building Netlify.
He was a great guest, capable of talking about difficult backend problems that require writing C++, as well as the frontend world of JavaScript frameworks. One announcement before we begin: we are having a $5000 hackathon. The $5000 hackathon is for a new product we’ve been working on: FindCollabs. FindCollabs is a platform for finding collaborators and building projects. Whether you are an engineer, a musician, a designer, a videographer, or an artist, FindCollabs lets you find people and collaborate. To try out FindCollabs, just go to FindCollabs.com, where you can make a project or join someone else’s project. And it’s very easy to make these projects–you don’t need to have anything built yet–you only need to have a vision for what you want to build. And to find out about the hackathon, go to findcollabs.com/hackathon. We are giving away $5000 in cash to the coolest projects that get built before Sunday April 14th. So I recommend getting started early, finding some people to collaborate with, and building some cool stuff!

Mar 8, 2019 · 58 min

Ep 1081: LinkedIn Monitoring Infrastructure with Alexander Pucher

Monitoring tools are used by every area of an organization. Business development teams use monitoring to understand the metrics for product performance. Finance teams need to understand how the costs of cloud computing resources are changing. Site reliability engineers use monitoring dashboards to verify that applications are up and running without problematic latency. Product managers evaluate the results of A/B tests based on monitoring data about how users are reacting to new features. A monitoring system needs to be able to handle large volumes of data that are being generated at a high velocity. The data needs to be queryable in an aggregated format, which might require an ETL system for getting data into columnar format. Alexander Pucher is an engineer at LinkedIn, where he works on a monitoring platform called ThirdEye. ThirdEye is built on top of Apache Pinot, a distributed columnar storage engine that ingests data and serves analytical queries at low latency. Pinot uses RocksDB, and is comparable to Apache Druid. Alexander joins the show to discuss ThirdEye, and explain why Pinot is a useful building block for monitoring infrastructure.
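
To make the idea of getting data into columnar format concrete, here is a minimal Python sketch. It is not how ThirdEye or Pinot work internally; the event fields (`service`, `minute`, `latency_ms`) and both function names are hypothetical, chosen only for illustration. It pivots row-oriented events into one list per field and then runs an aggregate that reads only the two columns it needs.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical row-oriented events, as an application might emit them.
rows = [
    {"service": "feed", "minute": 0, "latency_ms": 120},
    {"service": "feed", "minute": 0, "latency_ms": 80},
    {"service": "search", "minute": 0, "latency_ms": 200},
    {"service": "feed", "minute": 1, "latency_ms": 100},
]


def to_columns(rows):
    """Pivot row dicts into one list per field (assumes every row has the same fields)."""
    columns = defaultdict(list)
    for row in rows:
        for field, value in row.items():
            columns[field].append(value)
    return dict(columns)


def mean_latency_by_service(columns):
    """Aggregate using only the 'service' and 'latency_ms' columns."""
    grouped = defaultdict(list)
    for service, latency in zip(columns["service"], columns["latency_ms"]):
        grouped[service].append(latency)
    return {service: mean(values) for service, values in grouped.items()}


columns = to_columns(rows)
averages = mean_latency_by_service(columns)
print(averages)
```

The aggregate never touches the `minute` column, which is the main reason analytical systems tend to favor a columnar layout for this kind of query.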

Mar 7, 2019 · 59 min

Ep 1080: WebAssembly Execution with Syrus Akbary

WebAssembly is a runtime that lets languages beyond JavaScript execute in frontend web applications. WebAssembly is novel because most modern frontend applications are written entirely in JavaScript. WebAssembly lets us use languages like Rust and C++ after they have been compiled down to a WebAssembly binary module. Language interoperability is only one part of why WebAssembly is exciting. The execution environment for WebAssembly modules has benefits for security and software distribution and consumption as well. In previous shows, we’ve given an overview of WebAssembly and explored its future applications as well as its relationship to the Rust programming language. In today’s episode, we explore the packaging and execution path of a WebAssembly module, and some other applications of the technology. Syrus Akbary is the CEO and founder of Wasmer, a company focused on creating universal binaries powered by WebAssembly. Wasmer provides a way to execute WebAssembly files universally. He joins the show to talk about the state of WebAssembly, and what his company is building.
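
For a rough sense of what a compiled WebAssembly binary module looks like on disk, the Python sketch below walks the top-level sections of a tiny hand-assembled module that exports an `add` function. It is unrelated to Wasmer's tooling; the byte array and the helper names (`read_uleb128`, `list_sections`) are assumptions made for this illustration, and the parser only reads the header and section table rather than executing anything.

```python
# Hand-assembled bytes of a tiny module exporting add(i32, i32) -> i32.
WASM_ADD = bytes([
    0x00, 0x61, 0x73, 0x6D,  # magic number: "\0asm"
    0x01, 0x00, 0x00, 0x00,  # binary format version 1 (little-endian)
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,  # type section
    0x03, 0x02, 0x01, 0x00,  # function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  # export section ("add")
    0x0A, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,  # code section
])

# Section ids defined by the WebAssembly binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "datacount",
}


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def list_sections(data):
    """Return (version, [(section name, payload size), ...]) for a module."""
    if data[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    version = int.from_bytes(data[4:8], "little")
    pos = 8
    sections = []
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_uleb128(data, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip the payload without interpreting it
    return version, sections


version, sections = list_sections(WASM_ADD)
print(version, sections)
```

A real runtime goes much further (validation, compilation, sandboxed execution), but every one of those steps starts from this same header and section layout.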

Mar 6, 2019 · 59 min

Ep 1079: Ethereum Usability with Sean Li

Cryptocurrencies enable a large number of applications: trustless reputation systems, decentralized identity tools, micropayments, non-fungible Internet items, and borderless currencies, just to name a few. But cryptocurrencies have not yet impacted daily life for most of us. Why is that? One reason is that it is still very hard for developers to build within the cryptocurrency ecosystem. The programming languages, such as Solidity, are not widely used by software engineers. Building and deploying smart contracts is not as easy as deploying a simple Ruby on Rails webapp. The open source tooling is immature, as are the paid developer tools. Sean Li is the CEO of Fortmatic, a company that is building tools to improve the Ethereum developer experience. Fortmatic simplifies wallet creation, user identity management, security, and money transfer for Ethereum developers. Before starting Fortmatic, Sean was the founder of Kitematic, a company that made the Docker developer experience easier. Kitematic was acquired by Docker. Sean is one of the few people with significant experience in both the enterprise container ecosystem and the cryptocurrency ecosystem. Sean joins the show to discuss his time in the Docker ecosystem, his new company Fortmatic, and his perspective on how to build tools for developers. Someday there will be hundreds of thousands of developers building applications around cryptocurrencies, just like people use cloud computing today. The road to getting there is unclear, and Sean provides useful insights and predictions for the future.

Mar 5, 2019 · 50 min

Ep 1078: StarkWare: Transparent Computational Integrity with Eli Ben-Sasson

Computational integrity is a property that is required for financial transactions on the Internet. Computational integrity means that the output of a certain computation is correct. If I deposit money into my bank, my bank sends me a number that represents the new balance in my account. I assume that the number they have sent me is correct. The bank could be lying to me–maybe this bank is not actually trustworthy. But I use a bank with a good reputation. If the bank stole money from its users, it would quickly go out of business. Therefore, I feel safe trusting a bank with my money, because the bank needs to maintain its reputation. The problem with reputation-based systems is that they are opaque. It’s not easy for us to audit the bank and prove the bank actually has the money that it claims to have. Most of the time, the reputation-based systems work fine. But occasionally, we have catastrophic events–think of the 2008 financial crisis, or the Bernie Madoff financial scandal. These circumstances would have been avoided if the financial institutions could have been continuously audited for their solvency. With blockchains and cryptocurrencies, we now have tools that allow us to maintain computational integrity without the opaque systems of reputation. We no longer have to trust a central authority–we can verify computational integrity with math. Eli Ben-Sasson is a co-founder and chief scientist at StarkWare Industries, a company that is bringing zero-knowledge proof technology to market. Implementations of zero-knowledge proof technology include zk-STARKs, zk-SNARKs, and bulletproofs. StarkWare is focused on the application of zk-STARKs, which can be used to improve scalability and privacy. Eli joins the show to discuss the topic of computational integrity, and how STARKs can be used to provide scalable, secure infrastructure to blockchain applications.
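
To give a flavor of what “verify with math” can mean, here is a small Python sketch of a Merkle inclusion proof. This is not a zk-STARK and not StarkWare's method; it is a much simpler, swapped-in building block, and the account strings and function names are hypothetical. It shows only that a customer can check their own record against a published hash commitment. It does not prove solvency or that any computation was carried out correctly.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest, used for both leaves and inner nodes in this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1  # the neighbor that shares this node's parent
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf, proof, root):
    """Recompute the path from a leaf up to the root and compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical ledger entries; the bank would publish only `root`.
accounts = [b"alice:100", b"bob:250", b"carol:75"]
root, proof = merkle_root_and_proof(accounts, 1)

print(verify(b"bob:250", proof, root))
print(verify(b"bob:999", proof, root))
```

A production scheme would at least separate leaf and inner-node hashing and avoid duplicating the last node; those shortcuts are taken here only to keep the sketch short.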

Mar 4, 2019 · 52 min

Ep 1077: FindCollabs

Collaboration on the Internet creates innovation. New inventions, new art, and new products–built by people working together on the Internet. FindCollabs is a product we have been working on to enable people to find and collaborate with each other. If you want to try it out, you can go to FindCollabs.com. FindCollabs is for finding people to create your projects with, and getting those projects built. Whether you are a programmer, a writer, a musician, a game designer, an actor, a videographer, or a project manager–you can find people to collaborate with. I love to work on so many different types of projects, and I love to collaborate with other people. For me, the best way that I learn new skills is by building things. I like to write music, create software, make podcasts, and build businesses. I like to create all the time. For big projects, it’s easier to build your project with a team. Finding team members can be very hard. I value other people who are creative and reliable. FindCollabs lets you find and invite people to your projects, so that you can put together a team to build your project. FindCollabs also has a reputation system. When you work on a project, your collaborators rate you. As you make contributions to projects, you show the FindCollabs community that you are reliable and productive–and other people will want to work with you because of that. The only way to build huge, ambitious projects as a team is for people to trust each other. If you are unreliable, people will not want to work with you. When you join a project on FindCollabs, you are committing to doing work that will add value to that project. If you like to build projects and be creative, you might like FindCollabs. To get started, you can go to FindCollabs.com, log in, and post a project. Or join someone else’s project. If you are ever confused about anything, you can always send me an email. We are sponsoring a series of hackathons on FindCollabs. 
These hackathons are for anyone with a creative project–whether you want to make a music video, a virtual reality game, an acoustic guitar song, a cryptocurrency whitepaper, a mobile app, a commercial–anything creative. Our first hackathon starts today, March 3rd 2019, and ends at 11:59 PM PST on Saturday March 16th. On March 17 2019 we will announce the winners of the first hackathon, and send them emails. We will also announce the details of the second hackathon.

Prizes

In the first hackathon, the prizes are not very big. But they will get bigger over time. If you like the idea of FindCollabs, it might benefit you to get involved now, so that you can build your reputation and find better people to work with in the future.

1st place: $500 divided evenly among the winning team; SE Daily hoodies for each member of the team
2nd place: SE Daily hoodies for each member of the team
Most valuable feedback on the product: SE Daily Towel
Most helpful community member award: SE Daily Old School Bucket Hat

The FindCollabs hackathons will be judged by a panel of investors, entrepreneurs, podcasters, artists, and technologists. We will announce the judges of the first hackathon in the next few days. These judges will be voting based on which projects they like the most. Every project on the FindCollabs site before 11:59 PM PST on Saturday March 16th will be entered to win the contest. To find our detailed terms and conditions, go to findcollabs.com/terms. Thanks for taking the time to read through this post. If you get a chance, check out FindCollabs and feel free to send me feedback. I’d love to know what you think, and any suggestions you have.

Mar 3, 2019 · 8 min

Ep 1076: Internet History (and Future) with Brian McCullough

The Internet has transformed humanity. The Internet is the result of a long series of innovations from military, academia, business, and the open source community. In his book, How The Internet Happened: From Netscape to the iPhone, Brian McCullough tells the story of the last 25 years of Internet development through the lens of companies like eBay, Amazon, Google, and Apple. Whereas other books have focused on the trajectory of these individual companies, Brian explains how innovations in one company often lead to success in another. Without the lessons from Napster, we might not have Spotify. Without the trust model pioneered by eBay, we would not have marketplaces like Airbnb. Brian is also the host of The Internet History Podcast and the Techmeme Ride Home podcast. In The Internet History Podcast, Brian interviews entrepreneurs and engineers who were firsthand witnesses to the developments that led to our modern Internet, including early employees at Amazon, Tesla, and TheGlobe.com. In his other podcast, the Techmeme Ride Home, Brian gives a daily overview of the day’s Internet news. Through his podcasts about the Internet’s past and present, Brian has also accumulated an intuition about the future. He joins the show to discuss his book, the art of podcasting, and the historical lessons of technology.

Mar 1, 2019 · 1h 2m