
Software Engineering Daily
2,188 episodes — Page 32 of 44
Ep 705: Kubernetes Usability with Joe Beda
With the community centralizing on Kubernetes, developers are able to comfortably bet big on open source projects like Istio, Conduit, Rook, Fluentd, and Helm, each of which we will be covering in the next few weeks. The centralization on Kubernetes also makes it easier to build enterprise companies, which no longer have to think about which container orchestration system to support. There is a wide array of Kubernetes-as-a-service providers offering a highly available runtime–and a variety of companies offering observability tools to make it easier to debug distributed systems problems. Despite all of these advances, Kubernetes is less usable than it should be. It still feels like operating a distributed system. Hopefully someday, operating a Kubernetes cluster will be as easy as operating your laptop computer. To get there, we need improvements in Kubernetes usability. Today’s guest Joe Beda was one of the original creators of the Kubernetes project. He is a founder of Heptio, a company that provides Kubernetes tools and services for enterprises. I caught up with Joe at KubeCon 2017, and he told me about where Kubernetes is today, where it is going, and what he is building at Heptio. Full disclosure: Heptio is a sponsor of Software Engineering Daily. For the next two weeks, we are covering exclusively the world of Kubernetes. Kubernetes is a project that is likely to have as much impact as Linux. Whether you are an expert in Kubernetes or you are just starting out, we have lots of episodes to fit your learning curve. To find all of our old episodes about Kubernetes, download the Software Engineering Daily app for iOS or for Android. In other podcast players, only the 100 most recent episodes are available, but in our apps you can find all 650 episodes–and there is also plenty of content that is totally unrelated to Kubernetes! Transcript provided by We Edit Podcasts. 
Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 704: Cloud R&D with Onsi Fakhouri
In the first 10 years of cloud computing, a set of technologies emerged that every software enterprise needs: continuous delivery, version control, logging, monitoring, routing, data warehousing. These tools were built into the Cloud Foundry project, a platform for application deployment and management. As we enter the second decade of cloud computing, another new set of technologies is emerging as useful tools. Serverless functions allow for rapid scalability at a low cost. Kubernetes offers a control plane for containerized infrastructure. Reactive programming models and event sourcing make an application more responsive and simplify the interactions between teams who are sharing data sources. The job of a cloud provider is to see new patterns in software development and offer tools to developers to help them implement those new patterns. Of course, building these tools is a huge investment. If you’re a cloud provider, your customers are trusting you with the health of their application. The tool that you build has to work properly, and you have to help the customers figure out how to leverage the tool and resolve any breakages. Onsi Fakhouri is the senior VP of R&D for cloud at Pivotal, a company that provides software and support for Spring, Cloud Foundry and several other tools. I sat down with Onsi to discuss his strategy for determining which products Pivotal chooses to build. There are a multitude of engineering and business elements that Onsi has to consider when allocating resources to a project. Cloud Foundry is used by giant corporations like banks, telcos and automotive manufacturers. Spring is used by most enterprises that run Java, including most of the startups that I have worked at in the past. Cloud Foundry has to be able to run on premises and on cloud providers like AWS, Google and Microsoft. Pivotal also has its own cloud, Pivotal Web Services, and all of these stakeholders have different technologies that they would like to see built. 
Onsi’s job is to determine which ones have the highest net impact and allocate resources towards them. I interviewed Onsi at SpringOne Platform, a conference that is organized by Pivotal, which, full disclosure, is a sponsor of Software Engineering Daily. This week’s episodes are all conversations from that conference, and if there’s a conference that you think I should attend and do coverage at, let me know. Whether you like this format or not, I would love to get your feedback. We have some big developments coming for Software Engineering Daily in 2018 and we want to have a closer dialogue with the listeners. Please send me an email at [email protected], or join our Slack channel.
Ep 703: Spring Data with John Blum
In the 1980s and the 1990s, most applications used only a relational database for their data management. In the early 2000s, software projects started to use an ever increasing number of data sources. MongoDB popularized the document database, which allows storage of objects that do not have a consistent schema. The Hadoop distributed file system enabled the redundant storage and efficient querying of high volumes of data spread out across multiple commodity disks. The Cassandra database is a hybrid between key-value storage and column-oriented storage. The benefit of these different data systems is that you can choose a system that gives you the read and write performance that you need. The downside is that each of these databases has different querying semantics. If you’re a developer trying to access data from your application, you often need to know how to access that data from the specific data source, and whether that data needs to be queried with SQL, or with a document-style query, or with a MapReduce job. Spring Data is a project to standardize the programming model for data access within Spring. The vision for the project is to give Spring developers a consistent way to access their data from any database, while retaining the performance characteristics of those databases. Spring is a Java framework for writing web applications, but this conversation is useful even for people who are not building Spring applications. Whatever application you’re building, you are probably pulling from multiple data sources. The question of how to abstract away the complexity of those multiple data sources is also being tackled by projects such as GraphQL and Falcor. John Blum is a staff engineer who works on the Spring Data project at Pivotal. He joins the show to discuss how to design a data access layer. We discussed the API between a database and the Spring Data layer, and also talked about reactive programming. 
Reactive programming allows the application layer to respond to changes in the underlying data layer. I interviewed John at SpringOne Platform, a conference that is organized by Pivotal, which, full disclosure, is a sponsor of Software Engineering Daily. This week’s episodes are all conversations from that conference. If there’s a conference that you think I should attend and do some coverage at, please let me know. Whether you like this format or not, I would love to get your feedback. We have some big developments coming for Software Engineering Daily in 2018, and we want to have a closer dialogue with the listeners. Please send me an email at [email protected], or join our Slack channel.
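The core idea behind a uniform data-access layer can be sketched in a few lines. This is not Spring Data’s actual API (Spring Data is Java, and its repositories are far richer); the class and method names below are invented purely for illustration. Application code depends on one interface, and each backend hides its own query semantics:

```python
# A language-agnostic sketch of a unified data-access layer: one repository
# interface, multiple backend implementations. All names are hypothetical.
from abc import ABC, abstractmethod

class UserRepository(ABC):
    @abstractmethod
    def find_by_name(self, name):
        """Return all user records matching the given name."""

class SqlUserRepository(UserRepository):
    def __init__(self, rows):
        self._rows = rows  # stand-in for a relational table

    def find_by_name(self, name):
        # A real SQL backend would run: SELECT * FROM users WHERE name = ?
        return [r for r in self._rows if r["name"] == name]

class DocumentUserRepository(UserRepository):
    def __init__(self, docs):
        self._docs = docs  # stand-in for a document store

    def find_by_name(self, name):
        # A document store would run a query like find({"name": name})
        return [d for d in self._docs if d.get("name") == name]

# The caller never sees SQL, document queries, or MapReduce jobs.
repo: UserRepository = DocumentUserRepository([{"name": "Ada", "age": 36}])
assert repo.find_by_name("Ada")[0]["age"] == 36
```

Swapping the backend changes the performance characteristics but not the calling code, which is the trade-off the episode discusses.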
Ep 702: Cloud Foundry with Rupa Nandi
Cloud Foundry is an open-source platform as a service for deploying and managing web applications. Cloud Foundry is widely used by enterprises that are running applications built using Spring, a popular web framework for Java applications, but developers also use Cloud Foundry to manage apps built in Ruby, Node and any other programming language. Cloud Foundry includes routing, message brokering, service discovery, authentication and other application-level tooling for building and managing a distributed system. Some of the standard tooling in Cloud Foundry was adopted from Netflix open-source projects, such as Hystrix, which is the circuit breaker system; and Eureka, which is the service discovery server and client. When a developer deploys their application to Cloud Foundry, the details of what is going on are mostly abstracted away, which is by design. When you’re trying to ship code and iterate quickly for your organization, you don’t want to think about how your application image is being deployed to underlying infrastructure. You don’t want to think about whether you’re deploying a container or a VM, but if you use Cloud Foundry enough, you might have become curious about how Cloud Foundry schedules and runs application code. BOSH is a component of Cloud Foundry that sits between the infrastructure layer and the application layer. Cloud Foundry can be deployed to any cloud provider because of BOSH’s well-defined interface. BOSH has the abstraction of a stem cell, which is a versioned operating system image wrapped in packaging for whatever infrastructure as a service is running underneath. With BOSH, whenever a VM gets deployed on your underlying infrastructure, that VM gets a BOSH agent. The agent communicates with the centralized component of BOSH called the director. The director is the leader of the distributed system. Rupa Nandi is a director of engineering at Pivotal, where she works on Cloud Foundry. 
In this episode we talked about scheduling and infrastructure, the relationship between Spring and Cloud Foundry, and the impact of Kubernetes, which Cloud Foundry has integrated with so that users can run Kubernetes workloads on Cloud Foundry. I interviewed Rupa at SpringOne Platform, a conference that is organized by Pivotal, which, full disclosure, is a sponsor of Software Engineering Daily, and this week’s episodes are all conversations from that conference. Whether or not you like this format, I would love to get your feedback. We have some big developments coming for Software Engineering Daily in 2018 and we want to have a closer dialogue with the listeners. Please send me an email at [email protected], or join our Slack channel. We really want to know what you’re thinking and what your feedback is, what you would like to hear more about, what you’d like to hear less about, and who you are.
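The agent/director relationship described above can be sketched as follows. This is an illustration of the general pattern only, not BOSH’s actual protocol or API; every name here is hypothetical. Each VM runs an agent that reports its state, and a central director tracks the fleet:

```python
# Toy sketch of the agent/director pattern: agents on each VM send
# heartbeat reports, and the director (the leader of the distributed
# system) keeps the last known state of every VM.
import time

class Agent:
    def __init__(self, vm_id):
        self.vm_id = vm_id

    def heartbeat(self):
        """The report an agent would periodically send to the director."""
        return {"vm_id": self.vm_id, "state": "running", "ts": time.time()}

class Director:
    """Centralized component that tracks every agent's last report."""
    def __init__(self):
        self.fleet = {}

    def receive(self, report):
        self.fleet[report["vm_id"]] = report

    def running_vms(self):
        return sorted(vm for vm, r in self.fleet.items()
                      if r["state"] == "running")

director = Director()
for agent in (Agent("web-0"), Agent("worker-0")):
    director.receive(agent.heartbeat())
assert director.running_vms() == ["web-0", "worker-0"]
```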
Ep 701: Dwarf Fortress with Tarn Adams (Holiday Repeat)
Originally published October 22, 2015 Dwarf Fortress is a construction and management simulation computer game set in a procedurally generated fantasy world in which the player indirectly controls a group of dwarves, and attempts to construct a successful underground fortress. Tarn Adams works on Dwarf Fortress with his brother Zach.
Ep 700: Language Design with Brian Kernighan (Holiday Repeat)
Originally published January 6, 2016 Brian Kernighan is a professor of computer science at Princeton University and the author of several books, including The Go Programming Language and The C Programming Language, a book more commonly referred to as K&R. Professor Kernighan also worked at Bell Labs alongside Unix creators Ken Thompson and Dennis Ritchie and contributed to the development of Unix.
Ep 699: Software and Entrepreneurship with Seth Godin (Holiday Repeat)
Originally published November 18, 2015 Seth Godin is a writer, speaker, and entrepreneur. He is the author of many books, including most recently, What To Do When It’s Your Turn.
Ep 698: Knowledge-Based Programming with Stephen Wolfram (Holiday Repeat)
Originally published November 10, 2015 Wolfram Research makes computing software powered by the Wolfram language, a knowledge-based programming language that draws from symbolic and functional programming paradigms. Stephen Wolfram is the Founder and CEO of Wolfram Research, and also the author of A New Kind of Science.
Ep 697: Machine Learning and Technical Debt with D. Sculley (Holiday Repeat)
Originally published November 17, 2015 Technical debt, referring to the compounding cost of changes to software architecture, can be especially challenging in machine learning systems. D. Sculley is a software engineer at Google, focusing on machine learning, data mining, and information retrieval. He recently co-authored the paper Machine Learning: The High Interest Credit Card of Technical Debt.
Ep 696: Modern War with Peter Warren Singer
Military force is powered by software. The drones that are used to kill suspected terrorists can identify those terrorists using the same computer vision tools that are used to identify who is in an Instagram picture. Nuclear facilities in Iran were physically disabled by the military-sponsored Stuxnet virus. National intelligence data is collected and processed using the MapReduce algorithm. The military keeps up with technology more effectively than lawmakers. It is common to read a quote from a senator or a judge that shows a basic misunderstanding of cybersecurity. Many politicians do not even use email. There is a large and growing knowledge gap between military capability and the technological savvy of policymakers. On the whole, government is not prepared for modern warfare. Just like in social media information wars, the instigators of conflict have an advantage. And the ability to instigate such a conflict is democratized. Social media, open source software, and cloud computing give a technologist superpowers. Cryptocurrencies can anonymize the financial transactions that pay for such tools, and basic encryption can anonymize the terroristic acts that occur over a remote internet connection. Peter Warren Singer is a political scientist who formerly served on the United States Advisory Committee on International Communications and Information Policy. He is also an author, whose books include Wired for War; Cybersecurity and Cyberwar: What Everyone Needs to Know; and Ghost Fleet: A Novel of the Next World War. Peter writes about the circumstances that could lead to global warfare, and how military actors might behave in a third world war. In this episode, Peter shares a dark but realistic vision that we should all hope to avoid. If you like this episode, we have done many other shows on related topics–including drones, IoT security, and automotive cybersecurity. 
To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help.
Ep 695: React Components with Max Stoiber
Modern frontend development is about components. Whether we are building an application in React, Vue, or Angular, components are the abstractions that we build our user interfaces out of. Today, this seems obvious, but if you think back five years ago, frontend development was much more chaotic–partly because we had not settled on this terminology of the component. React has become the most popular frontend framework, and part of its growth is due to the ease and reusability of components across the community. It’s easy to find building blocks that you can use to piece together your frontend application. Do you need a video player component? Do you need a news feed component? A profile component? All of these things are easy to find. As you build a React application, you take some open source components off the shelf, and you build others yourself. To keep things looking nice and consistent, you need to style your components. If you are not careful with how you manage your stylesheets, you can end up with inconsistent stylings and namespace conflicts. Max Stoiber is the creator of styled-components, a project to help enforce best practices around styling components. He is also a founder of Spectrum, a system that allows people to build online communities. Spectrum has similar design and engineering challenges to Slack or Facebook, so it made for a great discussion of modern software architecture. In today’s episode, Max and I had a wide-ranging conversation about frontend frameworks, components, and the process of building a product. Max also describes the advantages of using GraphQL and the Apollo toolchain.
Ep 694: Managing Engineers with Ron Lichty
“Management is about human beings. Its task is to make people capable of joint performance. To make their strengths effective and their weaknesses irrelevant.” That quote is from Peter Drucker. It is one of the many useful quotes collected in Ron Lichty’s book “Managing the Unmanageable”—and it illustrates why we work in teams. When we collaborate with each other, we make each other’s strengths effective, and our weaknesses become irrelevant. To collaborate effectively, we need leaders. We need management. Ron Lichty spent 6 years managing engineers at Apple, and many more years in management and director roles elsewhere. In his book, Ron lays out the lessons he learned in 30 years of engineering management. Ron also describes concrete strategies for how to manage engineers productively. An engineer who becomes a manager needs to learn new skills. And the hardest skills to master have nothing to do with technology. Prioritizing the right projects, allocating engineering resources, making architectural decisions—all of those skills are important. But the art of relationships—of diplomacy and language—is harder to learn than any technical skill. How do you motivate an engineer to do something that is boring? How do you have a difficult conversation with an engineer who needs to improve? When a conflict between engineers comes up, do you confront the conflict head-on, or do you wait for those engineers to resolve it among themselves? These questions do not have easy answers. The best way to learn how to react to these situations is to live through them. The second best way to learn is to read and listen to people who have seen so much of the management dynamic that they can distill it into anecdotes and aphorisms. In today’s show, Ron shares several stories that changed how I think about management. Ron and I did not have time to discuss everything I wanted to, and I recommend checking out his podcast episode on Software Engineering Radio for more detail. 
And also check out his book—Managing the Unmanageable.
Ep 693: Hacker Noon with David Smooke
The New York Times makes most of its money off of subscriptions. Facebook makes its money off of native advertising. Hacker News is funded by Y Combinator. Each of these business models creates biases in the information that gets promoted on the respective platforms. This is why I like to know the origin story and the business models behind the publications that I read. Published content is shaped by the profit motive of the publication. And yet, last month, I repeatedly found myself reading high quality content on a Medium publication that I did not know the origin of: Hacker Noon. Hacker Noon is a popular Medium publication that syndicates curated content written about software. Let me explain “syndication.” Imagine that I just spent three days on a Medium post about functional programming, and I have zero followers on social media. How can I get people to read my awesome post? The answer is syndication. I can submit my Medium post to Hacker Noon. This gives me free distribution, and it gives Hacker Noon free content—a win-win relationship. But why was it worth it for Hacker Noon to spend time curating content? That syndication process takes time. You have to read through lots of submissions, and sometimes you have to send a post back to the author to have it edited. And this is all to build a following on Medium. I have not heard of Medium being a profitable platform to build a business on. It’s worth pointing out the difference between Medium and WordPress. On WordPress, this model of curated syndication has worked to massive success—for example, the Huffington Post and TechCrunch. These businesses make millions of dollars from advertising networks, because they are built on WordPress, and WordPress is an open model. A publisher on WordPress can install plugins that serve ads from third party providers like Outbrain and Taboola. A WordPress site can also install any kind of data collection scripts, to gather data on visitors, and sell it to the highest bidder. 
The lack of third party plugins is the blessing and the curse of Medium. Because there is no third party ecosystem, reading content on Medium is a beautiful experience. The page loads quickly and predictably. There are no random scripts that are blocking the page as they hog your browser’s resources. When you go to close the page, there is never a popup that asks you to subscribe to a newsletter. When I read content on Medium, I am not getting slapped across the face with ads for reverse mortgages and açaí berries. I am not being tagged for retargeting. It’s a beautiful experience. But Medium seems like an ecosystem that would not allow for a content syndication business like Hacker Noon. I wanted to know who was running Hacker Noon, how the business works, and what it says about Medium as a publishing platform. Hacker Noon turns out to be part of a network of Medium publications called AMI. AMI’s network includes sites like Art + Marketing, Future Travel, and Fit Yourself Club–all of which are distinct syndication platforms. David Smooke is the CEO of AMI, and he joins this episode to explain how his business works, how he has scaled the content syndication business, and why he is betting on Medium. It was a detailed look into the state of online publishing and where it might be headed. If you don’t read Hacker Noon already, one article to start with that shows off the quality of content is Learn Blockchains By Building One. I interviewed the author of that article, Daniel Van Flymen, and it has been one of the most popular episodes of Software Engineering Daily.
Ep 692: Protocol Buffers with Kenton Varda
When engineers are writing code, they are manipulating objects. You might have a user object represented on your computer, and that user object has several different fields—a name, a gender, and an age. When you want to send that object across the network to a different computer, the object needs to be turned into a sequence of 1s and 0s that will travel efficiently across the network. This is known as “serialization.” As the user object sits on your computer, it is represented in 1s and 0s. You could just send that same representation over the wire. But we use efficient serialization to send it over the network in a more compact format. We also have to make sure that when we send that object to another service, the other service knows how to deserialize it, and turn it back into a format that we can operate on at the application level. Protocol buffers are a serialization protocol that originated at Google. Protocol buffers created a standardized interface for efficiently passing data between services. When Kenton Varda worked at Google, he was the tech lead for protocol buffers, and he joins the show to explain how protobufs work—and a newer serialization protocol that Kenton led: Cap’n Proto. You can expect to walk away from this episode with an understanding of how serialization protocols work, and the design tradeoffs you can make when creating a serialization protocol. We also touched on a startup that Kenton founded, called Sandstorm, and how he eventually found himself at Cloudflare, where he works on Cloudflare workers. With these topics, we did not go as deep as I would have liked, and I look forward to having Kenton back on in the near future.
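The serialization idea described above can be sketched in a few lines. This is not the real protobuf wire format; it is a toy length-prefixed encoding that shows why a shared schema keeps the wire data compact: both sides agree on the schema up front, so only field values (never field names) travel over the network. All names here are invented for illustration:

```python
# Toy schema-based serialization: each field is written as a 2-byte
# big-endian length followed by its UTF-8 value. Both serializer and
# deserializer rely on the same agreed field order.
import struct

USER_SCHEMA = ("name", "gender", "age")  # hypothetical shared schema

def serialize_user(user: dict) -> bytes:
    """Pack the user's fields into a compact length-prefixed byte string."""
    out = b""
    for field in USER_SCHEMA:
        value = str(user[field]).encode("utf-8")
        out += struct.pack(">H", len(value)) + value
    return out

def deserialize_user(data: bytes) -> dict:
    """Reverse the packing using the shared schema."""
    user, offset = {}, 0
    for field in USER_SCHEMA:
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        user[field] = data[offset:offset + length].decode("utf-8")
        offset += length
    return user

wire = serialize_user({"name": "Ada", "gender": "F", "age": 36})
assert deserialize_user(wire)["name"] == "Ada"
```

Real protocol buffers go further (varint encoding, field tags so schemas can evolve), but the core trade-off is the same: a compact wire format in exchange for both sides agreeing on a schema.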
Ep 691: High Volume Logging with Steve Newman
Google Docs is used by millions of people to collaborate on documents together. With today’s technology, you could spend a weekend coding and build a basic version of a collaborative text editor. But in 2004 it was not so easy. In 2004 Steve Newman built a product called Writely, which allowed users to collaborate on documents together. Initially, Writely was hosted on a single server that Steve managed himself. All of the reads and writes to the documents went through that single server. Writely rapidly grew in popularity, and Steve went through a crash course in distributed systems as he tried to keep up with the user base. In 2006, Writely was acquired by Google, and Steve spent his next four years turning Writely into Google Docs. Eventually he moved on to other projects within Google—“Cosmo” and “Megastore Replication.” When Steve left the company in 2010, he took with him the lessons of logging and monitoring that keep Google’s infrastructure observable. Large organizations have terabytes of log data to manage. This data streams off the servers that are running our applications. That log data gets processed in a “metrics pipeline” and turned into monitoring data. Monitoring data aggregates log data in a more presentable format. Most of the log messages that get created will never be seen by human eyes. These logs get aggregated into metrics, then compressed, and (in many cases) eventually thrown away. Different companies have different sensitivity around their logs, so some companies may not garbage collect any of their logs! When a problem occurs in our infrastructure, we need to be able to dig into our terabytes of log data and quickly find the root cause of a problem. If our log data is compressed and stored on disk, it will take longer to access it. But if we keep all of our logs in memory, it could get expensive. 
To review: if I want to build a logging system from scratch today, I need to build a metrics pipeline for converting log data into monitoring data; a complicated caching system; a way to store and compress logs; a query engine that knows how to ask questions of the log storage system; and a user interface so I don’t have to inspect these logs via the command line… The list of requirements goes on and on—which is why there is a huge industry around log management. And logging keeps evolving! One example we covered recently is distributed tracing, which is used to diagnose requests that travel through multiple endpoints. After Steve Newman left Google, he started Scalyr, a product that allows developers to consume, store, and query log messages. I was looking forward to talking to Steve about data engineering and the query engine that Scalyr has architected, but we actually spent most of our conversation talking about the early days of Writely and his time at Google—particularly the operational challenges of Google’s infrastructure. Full disclosure: Scalyr is a sponsor of Software Engineering Daily.
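The first item on that list, the metrics pipeline that turns raw log data into monitoring data, can be sketched minimally. The log format and function names here are illustrative only, not Scalyr’s or Google’s; the point is that aggregation collapses many raw lines into a few numbers, after which the raw lines can be compressed or discarded:

```python
# Toy metrics pipeline: stream raw log lines in, aggregate them into
# per-level counts (the "monitoring data"), and let the raw lines go.
from collections import Counter

def aggregate(log_lines):
    """Collapse raw log lines into counts keyed by log level."""
    metrics = Counter()
    for line in log_lines:
        level = line.split(" ", 1)[0]  # assumes lines start with a level
        metrics[level] += 1
    return metrics

logs = [
    "INFO request handled in 12ms",
    "ERROR upstream timeout",
    "INFO request handled in 9ms",
]
metrics = aggregate(logs)
assert metrics["INFO"] == 2 and metrics["ERROR"] == 1
```

A production pipeline adds time windows, percentiles, and retention policies on top of this basic shape, but the log-to-metric reduction step is the same.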
Ep 690: Scala at Duolingo with Andre Kenji Horie
Duolingo is a language learning platform with over 200 million users. On a daily basis, millions of users receive customized language lessons targeted specifically to them. These lessons are generated by a system called the session generator. Andre Kenji Horie is a senior engineer at Duolingo. He wrote about the process of rewriting the session generator, moving from Python to Scala and changing architecture at the same time. In this episode, Adam Bell talks with him about the reasons for the rewrite, what drove them to move to Scala, and the experience of moving from one technology stack to another. Links: Rewriting Duolingo’s Engine in Scala; Jobs at Duolingo.
Ep 689: Engineering Values with Lynne Tye
The values system of a company guides the actions of the engineers who work at that company. Some companies value open communication and a flat organization where anybody can talk to anyone else. Other companies encourage hierarchy and secrecy, so that employees are focused on their specific section of the company. Some companies take themselves seriously, and have a work environment that is as stoic as the military. Other companies pride themselves on having good beer and a friendly, laid back atmosphere. When company values are properly defined, the values can be used as reference points when making decisions. At Amazon, one of the core company values is “bias for action.” As an engineer, you are often in a situation where you can wait for more information, or you can start a project with an incomplete picture for how you will finish it. The “bias for action” lets you know that you should usually start the project despite having an incomplete picture. Another use of a company values system is for hiring. When a company publishes their values, prospective employees can use those stated values as a way to know if they would be a good cultural fit. For example, “move fast and break things” was a value that allowed Facebook to ship new products faster than any other company before it. But the speed of movement is not for everyone. Some engineers like to have their code unit tested and free of all bugs before shipping to production. Every company has values that define it. And every engineer has values that define how they want to work. Lynne Tye started her company Key Values as a platform to index companies by their values systems. This allows engineers to find companies that are a good cultural fit for their values system. Lynne joins the show today to explain how engineers and companies define their values systems, and how that affects the outcomes of engineering organizations. 
Lynne also talks about her time at HomeJoy, one of the first companies in the “gig economy”. HomeJoy was an on-demand house cleaning service that grew extremely fast, but ultimately went under due to lawsuits. The challenges of HomeJoy were a predictor of the challenges later faced by Uber and Airbnb, and it was fascinating to hear Lynne reflect on her time spent managing operations at HomeJoy–which was about as operationally intensive a company as you can imagine! Thanks to Courtland Allen for the intro to Lynne, and if you haven’t checked out the Indie Hackers podcast, which is hosted by Courtland, you should subscribe to it. Indie Hackers breaks down the engineering and business models behind small software companies–it’s one of my favorite shows.
Ep 688Cloud Marketplace with Zack Bloom
Ten years ago, if you wanted to build software, you probably needed to know how to write code. Today, the line between “technical” and “non-technical” people is blurring. Website designers can make a living building sites for people on WordPress or Squarespace–without knowing how to write code. Salesforce integration experts can help a sales team set up complicated software–without knowing how to write code. Shopify experts can set up an ecommerce store to your exact specifications–without knowing how to write code. WordPress, Squarespace, Salesforce, and Shopify are all fantastic services–but they are not compatible with each other. I can’t install a WordPress plugin on Salesforce. Now imagine this from the point of view of plugin creators. Plugin creators make easy ways to integrate different pieces of software together. Take PayPal as an example. PayPal wants to make it easy for software builders to integrate with their API. One plugin that PayPal has is a button that says “Pay with PayPal.” If I am a developer at PayPal, and I am building a button that people should be able to easily put on their webpage so that their users can pay with PayPal, I have to create a button that is compatible with WordPress, and Squarespace, and Wix, and Weebly, and GoDaddy, and Blogger, and all the other website builders that I might want to integrate with. In 2014, Zack Bloom started a company called Eager. Eager was a cloud app marketplace which allowed app developers to make flexible plugins that non-technical users could drag and drop into their site without technical expertise. In order for these non-technical users to add any apps from the Eager marketplace to their webpage, they had to drop in a line of JavaScript–which is, unfortunately, a significant hurdle for a nontechnical user. Eager proved to be a useful distribution mechanism for plugin developers who could write a plugin once and get distributed to multiple plugin marketplaces. 
But Eager was not as widely used as a way to directly drag and drop plugins onto sites. The question was: how do you build a marketplace for non-technical users to add plugins to any website without forcing the non-technical user to write code? How do you make editing any website as easy as a WYSIWYG editor? The CDN turns out to be the perfect distribution platform for these kinds of apps. Users already integrate with a CDN, so the CDN can do the work of inserting the code that allows the plugins to be added to a user’s webpage. Because of the opportunity for integration between a plugin marketplace and a CDN, Eager was acquired by Cloudflare, and Eager became Cloudflare Apps. Zack Bloom joins the show today to discuss the motivations for his company, the engineering behind building a cloud app marketplace, and the acquisition process of his company Eager.
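The CDN-based distribution model described above can be sketched in a few lines: a CDN already sits between a site and its visitors, so it can rewrite the HTML response in flight to splice in a plugin’s script tag. This is a minimal, hypothetical sketch (the function name and the script URL are invented for illustration, not Cloudflare’s actual implementation):

```python
def inject_app(html: str, snippet: str) -> str:
    """Splice a plugin's script tag into a page just before </body>,
    the way a CDN-level app marketplace could rewrite responses in flight."""
    return html.replace("</body>", snippet + "</body>")

page = "<html><body><h1>My Store</h1></body></html>"
with_app = inject_app(page, '<script src="https://apps.example.com/paypal-button.js"></script>')
```

Because the rewrite happens at the CDN, the site owner never touches code; a production implementation would parse the HTML properly rather than use string replacement.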
Ep 687Scalable Multiplayer Games with Yan Cui
Remember when the best game you could play on your phone was Snake? In 1998, Snake was preloaded on Nokia phones, and it was massively popular. That same year, Half-Life won game of the year on PC. Metal Gear Solid came out for PlayStation. The first version of StarCraft also came out in 1998. In 1998, few people would have anticipated that games with as much interactivity as StarCraft would be played on mobile phones twenty years later. Today, mobile phones have the graphics and processing power of a desktop gaming PC from two decades ago. But one thing still separates desktop gaming from mobile gaming: the network. With desktop gaming, users have a reliable wired connection that keeps their packets moving over the network with speeds that let them compete with other users. With mobile gaming, the network can be flaky. How do we architect real-time strategy games that can be played over an intermittent network connection? Yan Cui is an engineer at Space Ape Games, a company that makes interactive multiplayer games for mobile devices. In a previous episode, Yan described his work re-architecting a social networking startup where the costs had gotten out of control. Yan has a talent for describing software architecture and explaining the tradeoffs. When architecting a multiplayer mobile game, there are many tradeoffs to consider. What do you build and what do you buy? Do you centralize your geographical deployment to make it easier to reconcile conflicts, or do you spread your server deployment out globally? What is the interaction between the mobile clients and the server? The question of interaction between client and server for a mobile game holds lessons for anyone building a highly interactive mobile application. For example, think about Uber. When I make a request for a car, I can look at my phone and see the car on the map, slowly approaching me. The driver can look at his phone and see if I move across the street.
This is accomplished by synchronizing the data from the driver’s phone and my phone in a centralized server, and sending the synchronized state of the world out to me and the driver. How much data does the centralized server need to get from the mobile phones? How often does it need to make those requests? The answers to these questions will vary based on bandwidth, device type, phone battery life, and other factors. There are similar problems in mobile game engineering, when different players are moving around a virtual map. They are fighting each other, trying to avoid enemies, trying to steal power-ups from each other. Mobile games can be even more interactive than a ridesharing app like Uber, so the questions of data synchronization can be even harder to answer. On Software Engineering Daily, we have explored the topic of real-time synchronization in our past shows about the infrastructure of Uber and Lyft. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help. Yan Cui’s new video course: AWS Lambda in Motion
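The synchronization pattern described above, where clients report their state to a central server and receive a merged view of the world back, can be sketched minimally. This is an illustrative toy, not Uber’s or Space Ape’s actual protocol:

```python
class SyncServer:
    """Central server that merges state reported by many clients."""

    def __init__(self):
        self.world = {}  # client_id -> last reported position

    def report(self, client_id, position):
        # Each client (driver, rider, player) periodically reports its state.
        self.world[client_id] = position

    def snapshot(self):
        # The merged "state of the world" that gets pushed back to every client.
        return dict(self.world)

server = SyncServer()
server.report("driver", (37.77, -122.42))
server.report("rider", (37.78, -122.41))
world = server.snapshot()
```

How often `report()` is called is precisely the bandwidth and battery tradeoff discussed above: more frequent reports give a smoother map, at higher cost.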
Ep 686Decentralized Objects with Martin Kleppmann
The Internet was designed as a decentralized system. Theoretically, if Alice wants to send an email to Bob, she can set up an email client on her computer and send that email to Bob’s email server on his computer. In reality, very few people run their own email servers. We all send our emails to centralized services like Gmail, and connect to those centralized services using our own client—a browser on our laptop or a mobile application on our smart phone. Gmail is popular because nobody wants to run their own email server—it’s too much work. With Gmail, our emails are centralized, but centralization comes with convenience. Similar centralization happened with online payments. Still, decentralization is a desirable feature of computer systems. So how do we make more of our applications decentralized? Martin Kleppmann is a distributed systems researcher and the author of Designing Data-Intensive Applications. Martin is concerned by the centralization of our computer networks, and he works on CRDT technology in order to make it easier for people to build peer-to-peer applications. Most of the people who know how to build systems with CRDTs are distributed systems PhDs, database experts, and people working at huge internet companies. How do you make developer-friendly CRDTs? How do you allow random hackers to build peer-to-peer applications that avoid conflicts? Start by making a CRDT out of the most widely used, generalizable data structure in modern application development: the JSON object. In today’s episode, Martin and I talk about conflict resolution, CRDTs, and decentralized applications. This is Martin’s second time on the show, and his first interview is the most popular episode to date. You can find a link to that episode in the show notes for this episode, or you can find it in the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes.
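The appeal of CRDTs comes from merge operations that commute, so replicas converge no matter what order updates arrive in. A grow-only counter is the classic minimal example; this toy sketch is far simpler than the JSON CRDT Martin works on, but shows the core property:

```python
def increment(counter, replica_id):
    """Each replica only ever bumps its own slot in the counter."""
    counter = dict(counter)
    counter[replica_id] = counter.get(replica_id, 0) + 1
    return counter

def merge(a, b):
    """G-counter merge: element-wise max. Commutative, associative, and
    idempotent, so any gossip order converges to the same state."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(counter):
    """The counter's value is the sum over all replicas' slots."""
    return sum(counter.values())

a = {"replica-1": 3, "replica-2": 1}
b = {"replica-1": 2, "replica-2": 4}
merged = merge(a, b)
```

Because merges never lose information, two peers can edit offline and reconcile later without coordination; generalizing this property from counters to arbitrary JSON is the hard part.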
Ep 685Serverless Applications with Randall Hunt
Developers can build networked applications today without having to deploy their code to a server. These “serverless” applications are constructed from managed services and functions-as-a-service. Managed services are cloud offerings like database-as-a-service, queueing-as-a-service, or search-as-a-service. These managed services are easy to use. They take care of operational burdens like scalability and outages. But managed services typically solve a narrow use case. You can’t build an application entirely out of managed services. Managed services are scalable and narrow. Functions-as-a-service are scalable and flexible. With managed services, you make remote calls to a service with a well-defined API. With functions-as-a-service, you can deploy your own code. But functions-as-a-service execute against transient, unreliable compute resources. They aren’t a good fit for low latency computation, and the code you run on them should be stateless. Managed services and functions-as-a-service are the perfect complements. Managed services provide you with well-defined server abstractions that every application needs—like databases, search indexes, and queues. Functions as a service offer flexible “glue code” that you can use to create custom interactions between the managed services. The term “serverless” is used to describe the applications that are built entirely with managed services and functions as a service. Serverless applications are dramatically simpler to build and easier to operate than cloud applications of the past. Managed services can get expensive, but functions as a service can cost a tenth of what it might take to run a server that handles your requests.
Whether the size of your bill will increase or decrease as your company becomes “serverless” is less of an issue than the fact that your employees will be more productive: serverless applications have less operational burden, so developers spend more time architecting and implementing software. It has been 5 years since the Netflix infrastructure team began talking about the aspirational goal of a “no-ops” software culture. Your software should be so well-defined that you do not need regular intervention from ops staff to reboot your servers and reconfigure your load balancers. Serverless is a newer way of moving capital expense into operational expense. Today’s guest Randall Hunt is a senior technical evangelist with Amazon Web Services. He travels around the world meeting developers and speaking at conferences about AWS Lambda, the functions-as-a-service platform from Amazon. Randall has given some excellent talks about how to architect and build serverless applications (which I will add to the show notes), and today we explore those application patterns further. Serverless Services – Randall Hunt Randall Hunt at AWS Summit Seoul Serverless, What is it Good For? Randall Hunt
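The “glue code” shape of a function-as-a-service is easy to picture: a stateless function that takes an event and returns a response, with the platform handling provisioning, scaling, and teardown. A minimal sketch in the style of an AWS Lambda Python handler (the event fields here are invented for illustration):

```python
def handler(event, context=None):
    """Stateless: no server to manage, no memory between invocations.
    The platform invokes this once per request and scales it automatically."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}

response = handler({"name": "SE Daily"})
```

The constraint that makes this cheap is visible in the signature: everything the function needs arrives in `event`, and nothing survives after it returns, which is why state belongs in the managed services around it.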
Ep 684Data Science Mindset with Zacharias Voulgaris
A company’s approach to data can make or break the business. In the past, data was static. There was not much of it; it sat in Excel and was interacted with on a nightly or monthly basis. Now, data is dynamic, real time, and huge. To tap into available data, many industries have oriented themselves toward becoming data intensive. With many new industry sectors becoming data driven, a new field called data science emerged. As a new field, data science has attracted a lot of attention from professionals with diverse backgrounds. Describing what data science is and who qualifies as a data scientist is not easy. As technologies surrounding the field continue to evolve and new verticals are added, the discourse has attracted different voices putting forward their own definitions of the field. In this episode, Zacharias Voulgaris joins guest host Sid Ramesh to discuss developments in the field. He is the author of several data science books, and in today’s conversation Zacharias explains what he means by the data science mindset–including trends and misconceptions that people have about the field.
Ep 683Secure Authentication with Praneet Sharma
When I log into my bank account from my laptop, I first enter my banking password. Then the bank sends a text message to my phone with a unique code, and I enter that code into my computer to finish the login. This login process is two-factor authentication. I am proving my identity by entering my banking password (the first factor) and validating that I am in control of my phone (the second factor) by receiving that text message. But in order to log in from my laptop, I need to be in control of my laptop. The laptop itself is a factor. With the laptop and my password, I have two factors. I might not actually need the phone as a factor. Praneet Sharma is the CEO of Keyless, a product that moves two-factor authentication into the browser. Praneet joins the show to discuss how all kinds of authentication work: multi-factor authentication, single sign-on, and YubiKey. We use this discussion of authentication methods to help explain why it could make sense for some people to do two-factor authentication without taking out their phone. We also explore recent security breaches like Target, Equifax and Yahoo–and the industry of security software sold to developers. I see giant banners for security software companies every time I go into the San Francisco airport, and Praneet explained to me some of the products that these kinds of companies are selling. Praneet joined the show in a previous episode to talk about advertising fraud. He also works with Shailin Dhar at Method Media Intelligence.
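The one-time codes behind many second factors are not magic: most follow the HOTP and TOTP standards (RFC 4226 and RFC 6238), where the client and server share a secret and independently derive a short code from it. A sketch using only the standard library:

```python
import hashlib
import hmac
import struct
import time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: pick 4 bytes by the last nibble
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, now: float = None, step: int = 30) -> str:
    """Time-based variant (RFC 6238): the counter is the current 30-second window."""
    now = time.time() if now is None else now
    return hotp(secret, int(now // step))
```

Both sides compute the same code from the shared secret, so no SMS round-trip is needed, which is why authenticator apps and hardware keys can replace text messages as a second factor.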
Ep 682Serverless Scheduling with Rodric Rabbah
Functions as a service are deployable functions that run without an addressable server. Functions as a service scale without any work by the developer. When you deploy a function as a service to a cloud provider, the cloud provider will take care of running that function whenever it is called. You don’t have to worry about spinning up a new machine, monitoring that machine, and spinning it down once it becomes idle. You just tell the cloud provider that you want to run a function, and the cloud provider executes it and returns the result. Functions as a service can be more cost effective than running virtual machines or containerized infrastructure, because you are letting the cloud provider decide where to schedule your function, and you are giving the cloud provider flexibility on when to schedule the function. The developer experience for deploying a serverless function can feel mysterious. You send a blob of code into the cloud. Later on, you send a request to call that code in the cloud. The result of the execution of that code gets sent back down to you. What is happening in between? Rodric Rabbah is the principal researcher and technical lead in serverless computing at IBM. He helped design Apache OpenWhisk, the open source functions-as-a-service platform that IBM has deployed and operationalized as IBM Cloud Functions. Rodric joins the show to explain how to build a platform for functions as a service. When a user deploys a function to IBM Cloud Functions, that function gets stored in a database as a blob of text, waiting to be called. When the user makes a call to the function, IBM Cloud Functions takes it from the database and queues the function in Kafka, and eventually schedules the function onto a container for execution. Once the function has executed, IBM Cloud Functions stores the result in a database and sends that result to the user.
When you execute a function, the time spent scheduling it and loading it onto a container is known as the “cold start problem”. The steps of executing a serverless function take time, but the resource savings are significant. Your code is just stored as a blob of text in a database, rather than sitting in memory on a server, waiting to execute. In his research for building IBM Cloud Functions, Rodric wrote about some of the tradeoffs for users who build applications with serverless functions. The tradeoffs exist along what Rodric calls “the serverless trilemma.” In today’s episode, we discuss why people are using functions-as-a-service, the architecture of IBM Cloud Functions, and the unsolved challenges of building a serverless platform. Full disclosure: IBM is a sponsor of Software Engineering Daily. OpenWhisk
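The store, queue, execute, store loop described above can be caricatured in a few lines. This toy keeps each function as a blob of text, queues invocations, and “cold starts” by exec-ing the source on demand; it is a sketch of the shape of the pipeline, not OpenWhisk’s actual code:

```python
import queue

class MiniFaaS:
    def __init__(self):
        self.functions = {}               # name -> source text (the database of blobs)
        self.invocations = queue.Queue()  # stands in for Kafka
        self.results = {}                 # call_id -> result (the result store)

    def deploy(self, name: str, source: str):
        self.functions[name] = source

    def invoke(self, call_id: str, name: str, arg):
        self.invocations.put((call_id, name, arg))

    def run_worker(self):
        # A worker drains the queue; exec-ing the source each time is the
        # "cold start": loading code into a fresh execution environment.
        while not self.invocations.empty():
            call_id, name, arg = self.invocations.get()
            scope = {}
            exec(self.functions[name], scope)
            self.results[call_id] = scope["main"](arg)

faas = MiniFaaS()
faas.deploy("double", "def main(x):\n    return x * 2")
faas.invoke("call-1", "double", 21)
faas.run_worker()
```

A real platform keeps warm containers around to amortize that cold start, which is exactly the tension the episode digs into.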
Ep 681Animating VueJS with Sarah Drasner
Most user interfaces that we interact with are not animated. We click on a button, and a form blinks into view. We click a link and the page abruptly changes. On the other hand, when we interact with an application that has animations, we can feel the difference. The animations are often subtle. If you aren’t sure what I’m talking about, pay attention the next time you use Slack or Facebook Messenger or iMessage. Airbnb values animation so much that they built Lottie, a library for animation. In an animated application, the user interface feels alive. When a software team takes the time to build animations into small interactions, the user perceives the animations as polish and attention to detail. Sarah Drasner has been evangelizing the value of animations for years, and she is an expert at implementing complex and beautiful animations on the web. She works at Microsoft as a developer advocate and joins the show to talk about how to build animations. If you are building a web application and want to create a unique UI, you might find this show useful. JavaScript supports detailed animations, often through the manipulation of SVG files. SVG stands for “scalable vector graphics,” a file format that represents an image with its own DOM. SVG is so flexible because of this DOM format, which defines the different parts of the SVG. This is in contrast to a bitmap, which is just a simple matrix of dots, without any rich metadata. You could manipulate SVG with raw JavaScript—but most people use a frontend JavaScript framework like React, Angular, or VueJS. Sarah has been implementing most of her recent web animations using Vue, and she is a member of the Vue core team. Vue has an entertaining story, because it gained popularity at a time when Google was supporting AngularJS and Facebook was supporting ReactJS. The first version of Vue was created from scratch by a single developer, Evan You.
If you are a Vue developer looking for an open source project to hack on, you can check out softwaredaily.com, which is an open source platform to consume content about software. In addition to the Vue web app, we also have the Software Engineering Daily app for iOS and for Android. All of these apps are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help.
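The point about SVG being a DOM rather than a matrix of dots is easy to demonstrate: an SVG is XML, so you can parse it and retarget individual elements, which is exactly what animation code does frame by frame. A small sketch using the standard library:

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
svg = f'<svg xmlns="{SVG_NS}"><circle cx="10" cy="10" r="5" fill="red"/></svg>'

root = ET.fromstring(svg)
circle = root.find(f"{{{SVG_NS}}}circle")  # address one element in the image's DOM

# One "frame" of an animation: nudge the circle and recolor it.
circle.set("cx", "20")
circle.set("fill", "blue")
```

In the browser you would mutate the live SVG DOM from JavaScript, or let Vue’s reactivity drive these attributes, but the structure being addressed is the same.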
Ep 680React and GraphQL at New York Times
Are we a media company or a technology company? Facebook and the New York Times are both asking themselves this question. Facebook originally intended to focus only on building technology–to be a neutral arbiter of information. This has turned out to be impossible. The Facebook newsfeed is defined by algorithms that are only as neutral as the input data. Even if we could agree on a neutral data set to build a neutral newsfeed, the algorithms that generate this news feed are not public, so we have no way to vet their neutrality. Facebook is such a powerful engine for distribution, it has allowed for a rise in the number of publishers who can get their voice heard. As a result, large media companies have lost market share because Facebook has replaced their distribution. The New York Times has always been a media company–but the standards for media consumption have shot up. Millions of people produce content for free, and that content is distributed through high quality experiences like Twitter, YouTube, Medium, and Facebook. When a page takes too long to load on NewYorkTimes.com, it doesn’t matter how good the content is–the user is going to navigate away before they read anything. Today, the New York Times has built out an experienced engineering team. In a previous episode, we reported how the Times uses Kafka to make its old content more accessible. In today’s show, we talk about how the Times uses React and GraphQL to improve the performance and the developer experience of engineers who are building software at the New York Times. Scott Taylor and James Lawrie are software engineers at the New York Times. In this episode, they explain how the New York Times looks at technology. The user experience on New York Times rivals that of a platform company like Facebook, and this is assisted by technologies originally built at Facebook: React, Relay, and GraphQL.
Ep 679How IBM Runs Its Cloud with Jason McGee
Functions as a service let developers deploy stateless application logic that is cheap and scalable. Functions as a service still have some problems to overcome in the areas of state management, function composition, usability, and developer education. Kubernetes is a tool for managing containerized infrastructure. Developers put their apps into containers on Kubernetes, and Kubernetes provides a control plane for deployment, scalability, load balancing, and monitoring. So–all of the things that you would want out of a managed service become much easier when you put applications into Kubernetes. This is why Kubernetes has become so popular–and it is why Kubernetes itself is being offered as a managed service by many cloud providers–including IBM. For the last decade, IBM has been building out its cloud offerings–and for two of those years, Jason McGee has been CTO of IBM Cloud Platform. In this episode, Jason discusses what it is like to build and manage a cloud, from operations to economics to engineering.
Ep 678Thumbtack Infrastructure with Nate Kupp
Thumbtack is a marketplace for real-world services. On Thumbtack, people get their house painted, their dog walked, and their furniture assembled. With 40,000 daily marketplace transactions, the company handles significant traffic. On yesterday’s episode, we explored how one aspect of Thumbtack’s marketplace recently changed, going from asynchronous matching to synchronous “instant” matching. In this episode, we zoom out to the larger architecture of Thumbtack, and how the company has grown through its adoption of managed services from both AWS and Google Cloud. The word “serverless” has a few definitions. In the context of today’s episode, serverless is all about managed services like Google BigQuery, Google Cloud PubSub, and Amazon ECS. The majority of infrastructure at Thumbtack is built using services that automatically scale up and down. Application deployment, data engineering, queueing, and databases are almost entirely handled by cloud providers. For the most part, Thumbtack is a “serverless” company. And it makes sense–if you are building a high-volume marketplace, you are not in the business of keeping servers running. You are in the business of improving your matching algorithms, your user experience, and your overall architecture. Paying for lots of managed services is more expensive than running virtual machines–but Thumbtack saves money from not having to hire site reliability engineers. Nate Kupp leads the technical infrastructure team, and we met at QCon in San Francisco to talk about how to architect a modern marketplace. This was my third time attending QCon and as always I was impressed by the quality of presentations and conversations I had there. They were also kind enough to set up some dedicated space for podcasters like myself. The most widely used cloud provider is AWS, but more and more companies that come on the show are starting to use some of the managed services from Google. 
The great news for developers is that integration between these managed services is pretty easy. At Thumbtack, the production infrastructure on AWS serves user requests. The log of transactions that occur gets pushed from AWS to Google Cloud, where the data engineering occurs. On Google Cloud, the transaction records are queued in Cloud PubSub, a message queueing service. Those transactions are pulled off the queue and stored in BigQuery, a system for storage and querying of high volumes of data. BigQuery is used as the data lake to pull from when orchestrating machine learning jobs. These machine learning jobs are run in Cloud Dataproc, a managed service that runs Apache Spark. After training a model in Google Cloud, that model is deployed on the AWS side, where it serves user traffic. On the Google Cloud side, the orchestration of these different managed services is done by Apache Airflow, an open source tool that is one of the few pieces of infrastructure that Thumbtack has to manage itself on Google Cloud. To find out more about the Thumbtack infrastructure, check out the video of the talk Nate gave at QCon San Francisco, or check out the Thumbtack Engineering Blog.
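Stripped of the managed services, the cross-cloud pipeline has a simple shape: transactions are published to a queue, drained into a warehouse, and periodically fed to a training job. A toy sketch with stand-ins for Pub/Sub, BigQuery, and the Spark job (all names and fields here are invented for illustration):

```python
from queue import Queue

pubsub = Queue()   # stands in for Cloud Pub/Sub
warehouse = []     # stands in for BigQuery

def publish(transaction):
    # AWS production side: push each transaction record onto the queue.
    pubsub.put(transaction)

def drain():
    # Google Cloud side: pull records off the queue into the warehouse.
    while not pubsub.empty():
        warehouse.append(pubsub.get())

def train(rows):
    # Stand-in for a Dataproc/Spark job: average price per category.
    totals = {}
    for row in rows:
        cat = row["category"]
        count, total = totals.get(cat, (0, 0.0))
        totals[cat] = (count + 1, total + row["price"])
    return {cat: total / count for cat, (count, total) in totals.items()}

publish({"category": "painting", "price": 200.0})
publish({"category": "painting", "price": 300.0})
drain()
model = train(warehouse)
```

In Thumbtack’s real pipeline, each of these three steps is a managed service, and Airflow plays the role of calling them in the right order on a schedule.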
Ep 677Marketplace Matching with Xing Chen
The labor market is moving online. Taxi drivers are joining Uber and Lyft. Digital freelancers are selling their services through Fiverr. Experienced software contractors are leaving contract agencies to join Gigster. Online labor marketplaces create market efficiency by improving the communications between buyers and sellers. Workers make their own hours, and their performance is judged by customers and algorithms, rather than the skewed perspective of a human manager. These marketplaces for human labor are in different verticals, but they share a common problem: how do you most efficiently match supply and demand? Perfect marketplace matching is an unsolved problem. Hundreds of computer science papers have been written about stable matching problems, many variants of which turn out to be NP-complete. The stock market has been attempting to automate marketplace matching for decades, and inefficiencies are discovered every year. Today’s show is about matching buyers and sellers on Thumbtack, a marketplace for local services. For the first seven years, Thumbtack was building liquidity in its 2-sided market. During those years, the model for job requests was as follows: let’s say I was on Thumbtack looking for someone to paint my house. I would post a job that would say I am looking for house painters. The workers on Thumbtack that paint houses could see my job and place a bid on it. Then I would choose from the bids and get my house painted. This was the “asynchronous” model. The actions of the buyer and seller were not synchronized. There was a significant delay between the time when the buyer posted a job and the time when a seller placed a bid, and then another delay before the buyer selected from the sellers. Thumbtack recently moved to an “instant matching” model. After gathering data about the people selling services on the platform, Thumbtack is now able to avoid the asynchronous bidding process.
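For the classic two-sided version of the problem, the Gale-Shapley algorithm finds a stable matching in polynomial time (it is the harder variants, such as preference ties, incomplete lists, and three-sided markets, that go NP-complete). A compact sketch, with buyers proposing to pros:

```python
def stable_match(proposer_prefs, reviewer_prefs):
    """Gale-Shapley: proposers propose in preference order; reviewers
    tentatively accept, trading up when a better proposer arrives."""
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in reviewer_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}  # reviewer -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])  # the previous match is bumped back to free
            engaged[r] = p
        else:
            free.append(p)  # rejected; p will try its next preference
    return engaged

buyers = {"b1": ["pro1", "pro2"], "b2": ["pro1", "pro2"]}
pros = {"pro1": ["b2", "b1"], "pro2": ["b1", "b2"]}
matching = stable_match(buyers, pros)
```

The resulting matching is stable in the textbook sense: no buyer and pro would both prefer each other over their assigned partners. Real marketplaces rarely have full preference lists, which is part of why the practical problem stays hard.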
In the new experience, a buyer goes on the platform, requests a house painter, and is instantly matched to someone who has a history of accepting house painting tasks that fit the parameters of the buyer. From the user’s perspective, this is a simple improvement. From Thumbtack’s perspective, there was significant architectural change required. In the asynchronous model, user requests lined up in a queue, and were matched with pros who placed bids on the items in that queue. In the instant matching model, a user request became more like a search query–the request hits an index of pros and returns a response immediately. Xing Chen is an engineer at Thumbtack, and joins the show to describe the rearchitecture process–how Thumbtack went from an asynchronous matching system to synchronous, instant matching. We also explore some of the other architectural themes of Thumbtack, which we dive into in further detail in tomorrow’s episode about scaling Thumbtack’s infrastructure, which uses both AWS and Google Cloud. On Software Engineering Daily, we have explored the software architecture and business models of different labor marketplaces–from Uber to Fiverr.
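The shift from a bid queue to a search query can be sketched as a filter over an index of pros. The fields here (category, acceptance history) are invented for illustration; Thumbtack’s real ranking is far richer:

```python
pros_index = [
    {"name": "Ana",  "categories": {"painting", "drywall"}, "accept_rate": 0.9},
    {"name": "Ben",  "categories": {"painting"},            "accept_rate": 0.3},
    {"name": "Carl", "categories": {"dog walking"},         "accept_rate": 0.8},
]

def instant_match(request, index, min_accept_rate=0.5):
    """Instead of queueing the request and waiting for bids, treat it as a
    query: filter the index to pros who historically accept jobs like it."""
    return [p["name"] for p in index
            if request["category"] in p["categories"]
            and p["accept_rate"] >= min_accept_rate]

matches = instant_match({"category": "painting"}, pros_index)
```

The architectural consequence is the one the episode describes: matching stops being a background job over a queue and becomes a synchronous read against an index, so it has to be fast.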
Ep 676Load Balancing at Scale with Vivek Panyam
Facebook serves interactive content to billions of users. Google serves query requests on the world’s biggest search engine. Uber handles a significant percentage of the transportation within the United States. These services handle radically different types of traffic, but many of the techniques they use to balance loads are similar. Vivek Panyam is an engineer with Uber, and he previously interned at Google and Facebook. In a popular blog post about load balancing at scale, he described how a large company scales up a popular service. The methods for scaling up load balancing are simple but effective–and they help to illustrate how load balancing works at different layers of the networking stack. Let’s say you have a simple service where a user makes a request, and your service sends them a response with a cat picture. Your service starts to get popular, and begins timing out and failing to send a response to users. When your service starts to get overwhelmed, you can scale by creating another service instance that is a copy of your cat picture service. Now you have two service instances, and you can use a layer 7 load balancer to route traffic evenly between them. You can keep adding service instances as the load scales and have the load distributed among those new instances. Eventually, your L7 load balancer is handling so much traffic itself that you can’t put any more service instances behind it. So you have to set up another L7 load balancer, and put an L4 load balancer in front of those L7 load balancers. You can scale up that tier of L7 load balancers, each of which is balancing traffic across a set of your service instances. But eventually, even your L4 load balancer gets overwhelmed with requests for cat pictures. You have to set up another tier, this time with L3 load balancing… In this episode, Vivek gives a clear description of how load balancing works.
We also review the 7 networking layers before discussing why there are different types of load balancers associated with the different networking layers. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
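The tiered scheme described above can be sketched with a toy round-robin balancer. The backend names are hypothetical, and a real L7 balancer would parse HTTP, track backend health, and run as its own process:

```python
import itertools

class RoundRobinBalancer:
    """Toy L7-style balancer: cycles requests across backend instances."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def route(self, request):
        backend = next(self._cycle)      # pick the next instance in rotation
        return backend, request

# Two copies of the cat-picture service behind one balancer.
lb = RoundRobinBalancer(["cat-svc-1", "cat-svc-2"])
routes = [lb.route(f"GET /cat/{i}")[0] for i in range(4)]
print(routes)  # alternates between the two instances
```

When one balancer saturates, the same pattern repeats a tier up: an L4 balancer spreads connections across several L7 balancers, each fronting its own pool of instances.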
Ep 675Incident Response with Emil Stolarsky
As a system becomes more complex, the chance of failure increases. At a large enough scale, failures are inevitable. Incident response is the practice of preparing for and effectively recovering from these failures. An engineering team can use checklists and runbooks to minimize failures. They can put a plan in place for responding to failures. And they can use the process of post mortems to reflect on a failure and take full advantage of its lessons. Emil Stolarsky is a production engineer at Shopify, where his role shares many similarities with that of Google’s site reliability engineers. In this episode, Emil argues that the academic study of emergency management and industries such as aerospace and transportation have a lot to teach software engineers about responding to production problems. In this interview with guest host Adam Bell, Emil argues that we need to move beyond tribal knowledge and incorporate practices such as an incident command system and rigorous use of checklists. Emil suggests that we need to move beyond a mindset of “move fast and break things” and toward a place of more deliberate preparation. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 674Run Less Software with Rich Archbold
There is a quote from Jeff Bezos: “70% of the work of building a business today is undifferentiated heavy lifting. Only 30% is creative work. Things will be more exciting when those numbers are inverted.” That quote is from 2006, before Amazon Web Services had built most of their managed services. In 2006, you had no choice but to manage your own database, data warehouse, and search cluster. If your server crashed in the middle of the night, you had to wake up and fix it. And you had to deal with these engineering problems in addition to building your business. Technology today evolves much faster than in 2006. That is partly because managed cloud services make operating a software company so much smoother. You can build faster, iterate faster, and there are fewer outages. If you are an insurance company or a t-shirt manufacturing company or an online education platform, software engineering is undifferentiated heavy lifting. Your customers are not paying you for your expertise in databases or your ability to configure load balancers. As a business, you should be focused on what the customers are paying you for, and spending the minimal amount of time on rebuilding software that is available as a commodity cloud service. Rich Archbold is the director of engineering at Intercom, a rapidly growing software company that allows for communication between customers and businesses. At Intercom, the engineering teams have adopted a philosophy called Run Less Software. Running less software means reducing choices among engineering teams, and standardizing on technologies wherever possible. When Intercom was in its early days, the systems were more heterogeneous. Different teams could choose whatever relational database they wanted–MySQL or Postgres. They could choose whatever key/value store they were most comfortable with. 
The downside of all this choice was that engineers who moved from one team to another might not know how to use the new team’s tools. After switching teams, you would have to figure out how to onboard with those new tools, and that onboarding was time not spent on work that impacted the business. By reducing the number of different choices that engineering teams have, and opting for managed services wherever possible, Intercom ships code at an extremely fast pace with very few outages. In our conversation, Rich contrasts his experience at Intercom with his experiences working at Amazon Web Services and Facebook. Amazon and Facebook were built at a time when there was not a wealth of managed services to choose from, and this discussion was a reminder of how much software engineering has changed because of cloud computing. To learn more about Intercom, you can check out the Inside Intercom podcast. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 673Training the Machines with Russell Smith
If I am building a mobile app to play podcast episodes, and I make a change to the user interface, I want to have manual quality assurance (QA) testers run through tests that I describe to them, to make sure my change did not break anything. QA tests describe high-level application functionality. Can the user register and log in? Can the user press the play button and listen to a podcast episode on my app? Unit tests are not good enough, because unit tests only verify the logic and the application state from the point of view of the computer itself. Manual QA tests ensure that the quality of the user experience was not impacted. With so many different device types, operating systems, and browsers, I need my QA tests to be executed in all of the different target QA environments. This requires lots of manual testers. If I want manual testing for every deployment I push, that manual testing can get expensive. RainforestQA is a platform for QA testing that turns manual testing into automated testing. The manual test procedures are recorded, processed by computer vision, and turned into automated tests. RainforestQA hires human workers from Amazon Mechanical Turk to execute the well-defined manual tests, and the recorded manual procedure is used to train the machines that can execute the same task in the future. Russell Smith is the CTO and co-founder of RainforestQA, and he joins the show to explain how RainforestQA works: the engineering infrastructure, the process of recruiting workers from Mechanical Turk, and the machine learning system for taking manual tasks and automating them. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 672High Volume Event Processing with John-Daniel Trask
A popular software application serves billions of user requests. These requests could be for many different things. They need to be routed to the correct destination, load balanced across different instances of a service, and queued for processing. Processing a request might require generating a detailed response to the user, making a write to a database, or creating a new file on a file system. As a software product grows in popularity, it will need to scale these different parts of infrastructure at different rates. You may not need to grow your database cluster at the same pace that you grow the number of load balancers at the front of your infrastructure. Your users might start making 70% of their requests to one specific part of your application, and you might need to scale up the services that power that portion of the infrastructure. Today’s episode is a case study of a high-volume application: a monitoring platform called Raygun. Raygun’s software runs on client applications and delivers monitoring data and crash reports back to Raygun’s servers. If I have a podcast player application on my iPhone that runs the Raygun software, and that application crashes, Raygun takes a snapshot of the system state and reports that information along with the exception, so that the developer of that podcast player application can see the full picture of what was going on in the user’s device, along with the exception that triggered the application crash. Throughout the day, applications all around the world are crashing and sending requests to Raygun’s servers. Even when crashes are not occurring, Raygun is receiving monitoring and health data from those applications. Raygun’s infrastructure routes those different types of requests to different services, queues them up, and writes the data to multiple storage layers–Elasticsearch, a relational SQL database, and a custom file server built on top of S3.
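The routing and queueing described above can be sketched as a dispatcher that fans incoming payloads out to per-type queues, so each type can be scaled and drained independently. The payload fields and type names here are hypothetical:

```python
from collections import defaultdict, deque

# One queue per request type: crash reports and routine health pings
# can then be consumed by different worker pools at different rates.
queues = defaultdict(deque)

def ingest(payload):
    kind = payload.get("type", "unknown")   # e.g. "crash" or "health"
    queues[kind].append(payload)

ingest({"type": "crash", "app": "podcast-player", "exception": "NullRef"})
ingest({"type": "health", "app": "podcast-player", "cpu": 0.12})
ingest({"type": "crash", "app": "other-app", "exception": "OOM"})

print(len(queues["crash"]), len(queues["health"]))  # 2 1
```

In production this pattern is usually backed by a durable message queue rather than in-memory deques, but the routing decision looks the same.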
John-Daniel Trask is the CEO of Raygun, and he joins the show to describe the end-to-end architecture of Raygun’s request processing and storage system. We also explore specific refactoring changes that were made to save costs at the worker layer of the architecture. This is a useful memory-management strategy for anyone working in a garbage-collected language. If you would like to see diagrams that explain the architecture and other technical decisions, the show notes have a video that explains what we talk about in this show. Full disclosure: Raygun is a sponsor of Software Engineering Daily. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 671Fiverr Engineering with Gil Scheinfeld
As the gig economy grows, that growth necessitates innovations in the online infrastructure powering these new labor markets. In our previous episodes about Uber, we explored the systems that balance server load and gather geospatial data. In our coverage of Lyft, we studied Envoy, the service proxy that standardizes communications and load balancing among services. In shows about Airbnb, we talked about the data engineering pipeline that powers economic calculations, user studies, and everything else that requires a MapReduce. In today’s episode, we explore the business and engineering behind another online labor platform: Fiverr. Fiverr is a marketplace for digital services. On Fiverr, I have purchased podcast editing, logo creation, music lyrics, videos, and sales leads. I have found people who will work for cheap, and quickly finish a job to my exact specification. I have discovered visual artists who worked with me to craft a music video for a song I wrote. Workers on Fiverr post “gigs”–jobs that they can perform. Most of the workers on Fiverr specialize in knowledge work, like proofreading or gathering sales leads. The workers are all over the world. I have worked with people from Germany, the Philippines, and Africa through Fiverr. Fiverr has become the leader in digital freelancing. The staggering growth of Fiverr’s marketplace has put the company in a position similar to an early Amazon. There is room for strategic expansion, but there is also an urgency to improve the infrastructure and secure the market lead. Gil Scheinfeld is the CTO at Fiverr, and he joins the show to explain how the teams at Fiverr are organized to fulfill the two goals of strategic, creative growth and continuous improvement to the platform. One engineering topic we discussed at length was event sourcing. Event sourcing is a pattern for modeling each change to your application as an event.
Each event is placed on a pub/sub messaging queue, and made available to the different systems within your company. Event sourcing creates a centralized place to listen to all of the changes that are occurring within your company. For example, you might be working on a service that allows a customer to make a payment to a worker. The payment becomes an event. Several different systems might want to listen for that event. Fiverr needs to call out to a credit card processing system. Fiverr also needs to send an email to the worker, to let them know they have been paid. Fiverr ALSO needs to update internal accounting records. Event sourcing is useful because the creator of the event is decoupled from all of the downstream consumers. As the platform engineering team works to build out event sourcing, communications between different service owners will become more efficient. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
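The payment example above can be sketched as a minimal in-process event bus. The topic name and handlers are hypothetical, and a production system would use a durable message broker rather than in-memory lists, but the decoupling is the same: the producer emits one event and never knows who consumes it.

```python
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, event):
    # The producer knows nothing about the consumers.
    for handler in subscribers[topic]:
        handler(event)

log = []
subscribe("payment.completed", lambda e: log.append(f"charge card for {e['worker']}"))
subscribe("payment.completed", lambda e: log.append(f"email {e['worker']}"))
subscribe("payment.completed", lambda e: log.append("update accounting ledger"))

# One event, three independent downstream reactions.
publish("payment.completed", {"worker": "alice", "amount": 5})
print(log)
```

Adding a fourth consumer later (say, fraud detection) requires no change to the payment service that publishes the event.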
Ep 670Serverless Event-Driven Architecture with Danilo Poccia
In an event-driven application, each component of application logic emits events, which other parts of the application respond to. We have examined this pattern in previous shows that focus on pub/sub messaging, event sourcing, and CQRS. In today’s show, we examine the intersection of event-driven architecture and serverless architecture. Serverless applications can be built by combining functions-as-a-service (like AWS Lambda) with backend-as-a-service tools like DynamoDB and Auth0. Functions-as-a-service give you cheap, flexible, scalable compute. Backend-as-a-service tools give you robust, fault-tolerant tools for managing state. By combining these sets of tools, we can build applications without thinking about specific servers that are managing large portions of our application logic. This is great–because managing servers and doing load balancing and scaling is painful. With this shift in architecture, we also have to change how data flows through our applications. Danilo Poccia is the author of AWS Lambda in Action, a book about building event-driven serverless applications. In today’s episode, Danilo and I discuss the connection between serverless architecture and event-driven architecture. We start by reviewing the evolution of the runtime unit–from physical machines to virtual machines to containers to functions-as-a-service. Then, we dive into what it means for an application to be “event-driven.” We explore how to architect and scale a serverless architecture, and we finish by discussing the future of serverless–how IoT, edge computing, and on-premise architectures will take advantage of this new technology. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
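The stateless-function model can be sketched as a Lambda-style handler: the function receives an event, does its work, and keeps no state of its own. The event shape and the dict standing in for a backend-as-a-service datastore are hypothetical:

```python
# Any state the function needs lives in a backend service; here a plain
# dict stands in for something like a DynamoDB table.
fake_table = {}

def handler(event, context=None):
    """Invoked once per event, e.g. when a file is uploaded."""
    key = event["object_key"]
    fake_table[key] = {"size": event["size"], "processed": True}
    return {"status": "ok", "key": key}

result = handler({"object_key": "photos/cat.jpg", "size": 1024})
print(result["status"])  # ok
```

Because the handler holds no state between invocations, the platform is free to run any number of copies in parallel, which is where the cheap scalability comes from.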
Ep 669BigQuery with Jordan Tigani
Large-scale data analysis was pioneered by Google, with the MapReduce paper. Since then, Google’s approach to analytics has evolved rapidly, marked by papers such as Dataflow and Dremel. Dremel combined a column-oriented, distributed file system with a novel way of processing queries. A single Dremel query is distributed into a tree of servers, starting with the root server, splitting into the intermediate servers, and ending with the leaf servers talking to the file system. Once the data is pulled from the file system into the leaves, the data propagates back to the root server, and is shuffled along the way so that the root server receives a sorted response. When Google started turning its internal services into customer-facing cloud products, the effort to productize Dremel began, and BigQuery was born. Jordan Tigani is an engineering lead who works on BigQuery, and he joins the show to discuss the evolution of the data warehouse. Large scale distributed queries still can take a long time–but queries get faster every year. Queries that required a nightly Hadoop job 10 years ago can be viewed in a frequently updated user-facing dashboard. Power users of BigQuery talk about the speed and the query interface as being two of its most valuable differentiating features. As the job of a large scale data analyst becomes less technically intensive, tools like BigQuery will continue to rise in popularity. We have done some great shows about Google papers like Spanner, Dremel, and Dataflow. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help. Shout out to today’s featured contributor Shreyans Sheth. 
Shreyans has worked on the Software Engineering Daily search API, and has also helped us understand open source best practices, which we are still learning. Thanks again Shreyans for your work. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
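The root/intermediate/leaf query tree described for Dremel can be sketched as recursive partial aggregation. The fan-out and sample data are hypothetical, and a real deployment distributes each level across many servers rather than running it in one process:

```python
# Dremel-style tree aggregation: leaves scan shards and compute partial
# results; intermediates merge groups of leaves; the root merges the
# intermediates into the final answer.
def leaf(shard):
    return sum(shard)                      # partial aggregate over one shard

def intermediate(shards):
    return sum(leaf(s) for s in shards)    # merge a group of leaf results

def root(shard_groups):
    return sum(intermediate(g) for g in shard_groups)

# Two intermediate servers, each fronting leaves that scan data shards.
shard_groups = [[[1, 2], [3]], [[4, 5], [6]]]
print(root(shard_groups))  # 21
```

The benefit of the tree is that no single server ever has to touch all the data; each level only merges a bounded number of partial results.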
Ep 668Legal Technology with Justin Kan
Justin Kan has been building startups for a decade, and in that time he has interacted with lots of lawyers. From incorporation to fundraising to selling his company Twitch, the interactions with lawyers consistently seemed less transparent and less efficient than would be optimal. For an engineer like Justin, the natural inclination here was to build software and sell it to lawyers. But there would be so much resistance–you would have to convince the lawyers to change their pricing model to fixed-pricing, which would give them the incentive to buy software and work more efficiently. Instead, Justin teamed up with a few entrepreneurial lawyers who were willing to start a new law firm from scratch, and use software on day 1. The software company is called Atrium Legal Technology Services (or Atrium LTS for short), and the law firm that uses the software is Atrium LLP. Both of these companies are very new, and were publicly announced a few months ago. The two companies work side-by-side in an undecorated office in downtown San Francisco. When I took the elevator up to see the company, the elevator doors opened and revealed two paper signs pointing to opposite ends of the office. On the Atrium LTS side of the office, engineers were writing software to extract the meaning from documents. Today, lawyers at old law firms are paid hundreds of dollars an hour to fill in document templates by editing a text document. As the Atrium LTS software gets better, document preparation will be done through web applications, with the variable names disambiguated from the parts of the document that never change from client to client. On the other side of the office sat Atrium LLP. The legal team was dressed a little more formally than their engineer counterparts, but there was nothing close to the formality of a traditional Silicon Valley law firm.
Far from the decor of a Menlo Park law firm, the office space was actually more spartan than most well-funded startups, signaling to the employees that this is an unproven business strategy, and there is a ton of work to be done to validate it. This sentiment was echoed in my conversation with Justin. It’s possible (even plausible) that Atrium LLP could become the biggest law firm in the world, but the road to getting there will take patience and steady execution. I enjoyed hearing Justin explain the motivation for starting Atrium LTS, and look forward to covering the company in the future. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 667Early Investments with Semil Shah
An engineer who wants to start a business using investment capital needs to understand the expectations of investors. The market for the business needs to be huge. The team needs to have a differentiated understanding of the market, or a differentiated product. The CEO needs to have the determination to continue operating the company even when it gets very difficult. And the price needs to be right for the investor. Even if you are just working at a startup, or considering joining a startup, you must understand how the investment market works. From a raw financial standpoint, it only makes sense to spend your time at a startup that has equity with a high expected value. Your equity will only have high expected value if the company continues to exist long enough to have an exit–the company must either go public or get acquired. In order to make it down the long and winding road to an exit, a technology company often needs to raise money on multiple occasions. That money is used to pay employees like you! If the company can’t earn enough revenue or raise money, you are going to get fired. Then, you may not have the spare cash to exercise your stock options, and you might lose the rights to the equity that you worked so hard for. The best way to avoid this is to learn to think like an investor–because as an engineer working for equity, you are an investor. Semil Shah is an early stage seed investor with Haystack, a fund that he started. He also works with GGV Capital, a venture firm investing out of the United States and China. Semil has been blogging about technology for many years, and eventually evolved from a commentator to an investor. In this episode, we explore the dynamics between investors and founders of early-stage technology companies. We also explore the strange market of podcasting. Semil worked at a company called Concept.io, which was acquired by Apple for $30M.
We have done some great shows with other engineering investors like Chris Dixon and Adrian Colyer. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 666OpenBazaar with Brian Hoffman
Cryptocurrencies give us a decentralized financial system. OpenBazaar is a decentralized commerce system. A merchant can log onto OpenBazaar and post a listing for an item–for example, a t-shirt that I want to sell for $15. My item listing will spread throughout the OpenBazaar P2P network. A shopper can download the OpenBazaar desktop application and see my listing for a t-shirt. The shopper can pay me $15 in bitcoin, and I will send the t-shirt to their address. If I were selling that shirt on Amazon, the corporation would take a cut of that transaction. OpenBazaar has no transaction costs–so users get to save some money. However, users also miss out on the benefits of a corporate marketplace. Amazon makes sure that the seller will send the item to the buyer, and makes sure that the buyer pays the seller. On OpenBazaar, an escrow system is needed to place money in the hands of a neutral third party until the goods are delivered. Amazon ensures that the distributor sends the item to the customer. On OpenBazaar, users need to figure out how to send the goods to each other. Brian Hoffman was the first developer to start working on OpenBazaar. The project has grown significantly since his initial commit, and OpenBazaar now has buyers, sellers, and open source committers. There is a clear desire for an open system of commerce. Brian is also the CEO of OB1, a company that provides services on top of OpenBazaar. OpenBazaar is a protocol–and other companies will undoubtedly emerge to build on top of it as well. In our conversation, Brian discussed how OpenBazaar works–the peer-to-peer protocol, the escrow system, the dispute resolution, and the open source community management. It is a fascinating, unique project, and I hope you learn something about it from this episode. To find all of our old episodes about decentralized technology and blockchains, you can download the Software Engineering Daily app for iOS and for Android. 
In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help. Shout out to today’s featured open source contributor Justin Lam. He has been working on improving the iOS codebase, and I know all the SE Daily mobile users appreciate his effort. Thanks Justin! Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
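The escrow flow described above can be sketched as a small state machine. This is an illustration only: OpenBazaar’s actual escrow is built on multisignature bitcoin transactions rather than application code like this, and the class and state names are hypothetical.

```python
class Escrow:
    """Toy escrow: funds are held until the buyer confirms delivery."""
    def __init__(self, buyer, seller, amount):
        self.buyer, self.seller, self.amount = buyer, seller, amount
        self.state = "FUNDED"        # buyer has paid into escrow

    def confirm_delivery(self):
        assert self.state == "FUNDED"
        self.state = "RELEASED"      # funds go to the seller
        return (self.seller, self.amount)

    def dispute(self):
        assert self.state == "FUNDED"
        self.state = "DISPUTED"      # a neutral third party now decides

deal = Escrow("shopper", "merchant", 15)
print(deal.confirm_delivery())  # ('merchant', 15)
```

The key property is that neither party can unilaterally move money out of the FUNDED state; the buyer must confirm, or the dispute path hands the decision to a neutral moderator.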
Ep 665Netflix Serverless-like Platform with Vasanth Asokan
The Netflix API is accessed by developers who build for over 1000 device types: TVs, smartphones, VR headsets, laptops. If it has a screen, it can probably run Netflix. On each of these different devices, the Netflix experience is different. Different screen sizes mean there is variable space to display the content. When you open up Netflix, you want to efficiently browse through movies. The frontend engineers who are building different experiences for different device types need to make different requests to the backend to fetch the right amount of data. This was the engineering problem that Vasanth Asokan and his team at Netflix were tasked with solving: how do you enable lots of different frontend engineers to get whatever they need from the backend? This problem led to the development of a “serverless-like platform” within Netflix, which Vasanth wrote about in a few popular articles on Medium. This platform enables frontend developers to write and deploy backend scripts to fetch data, decoupling the responsibilities of frontend engineers and backend engineers. The tight coupling of frontend and backend engineering had been a drag on Netflix’s development velocity. We have done many shows about Netflix engineering, covering topics like data engineering, user interface design, and performance monitoring. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
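The idea of frontend-owned backend scripts can be sketched as small per-device adapters over one shared response. The device names and fields below are hypothetical; the point is that each frontend team reshapes the data for its own screen without asking the backend team for changes:

```python
# A shared backend response that every device adapter starts from.
FULL_CATALOG = [{"title": "Show %d" % i, "synopsis": "...", "art": "hi-res"}
                for i in range(10)]

def tv_script(catalog):
    # Big screen: many rows, keep the artwork.
    return [{"title": c["title"], "art": c["art"]} for c in catalog]

def watch_script(catalog):
    # Tiny screen: titles only, first three items.
    return [c["title"] for c in catalog[:3]]

print(len(tv_script(FULL_CATALOG)), watch_script(FULL_CATALOG))
```

Each adapter is deployed and iterated on by the frontend team that owns that device, which is the decoupling the platform provides.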
Ep 664Serverless Authentication with Bobby Johnson
Serverless architecture is software that runs without an addressable server. Serverless is made possible by two types of technology: platform-as-a-service providers like Auth0, and functions-as-a-service like AWS Lambda. With both of these technologies, we can program logic that runs without being deployed to a server. Functions-as-a-service are cheap and scalable. Write your code for a serverless function, and the cloud provider will cheaply deploy and execute that function on some server somewhere. The difficult part is maintaining state. Since serverless compute instances are ephemeral, you aren’t dealing with a system that will keep track of your state—it is going to disappear eventually. The ephemeral nature of serverless code requires us to shift our thinking—but the dramatic cost savings and simplified scalability make it well worth the effort. Serverless functions can add complexity in exchange for a lower price. A serverless “platform as a service” often lowers complexity at a slightly higher price. A serverless database like Firebase handles database scaling and gives you a nice web interface. A serverless machine learning platform like Google Cloud ML gives your models scalability and controlled deployment. A serverless authentication service like Auth0 manages your authentication. In addition to authentication, Auth0 has built a set of tools to allow SaaS companies to extend their platforms into a sandboxed code execution environment. Bobby Johnson is an engineer at Auth0, and he joins the show to describe the toolbox that Auth0 has developed: authentication, webtasks, and extensibility–and how the world of “serverless” architecture is evolving. Full disclosure, Auth0 is a sponsor of Software Engineering Daily. Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. 
Ep 663: Parlaying Failure to Fortune with Paul Martino
In 2003, Paul Martino co-founded Tribe.net, one of the earliest social networking sites. Tribe had significant traction, with hundreds of thousands of users. In the early 2000s, hundreds of thousands of users was enough traffic to present a company with serious engineering challenges. Paul had studied computer science, and he was able to use his knowledge of high-performance computing to write an efficient graph database and solve the other technical puzzles the company faced, but the business did not ultimately work out. The failure of Tribe made the founders even hungrier for success, and it taught them lessons that they carried into subsequent businesses. Paul went on to start Aggregate Knowledge, a marketing technology company that sold for $119 million. His Tribe co-founder Mark Pincus went on to start Zynga, the multi-billion dollar gaming company. Another Tribe employee co-founded Yammer, which sold to Microsoft for a billion dollars. Since his exit from Aggregate Knowledge, Paul Martino has run Bullpen Capital, which makes post-seed investments. The Bullpen Capital portfolio is appealing to me, partly because of the number of Internet gambling companies. Paul and I talked about gambling and other taboo business sectors, as well as what makes a good investment in the "post-seed" category. I enjoyed speaking to Paul because he has a straightforward, no-nonsense way of talking about things; it's charismatic and uncommon. We have done some great shows with other engineering investors like Chris Dixon and Adrian Colyer. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. In other podcast players, you can only access the most recent 100 episodes. With these apps, we are building a new way to consume content about software engineering. They are open-sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to get involved with, we would love to get your help.
Shout out to today's featured contributor Kurian Vithayathil. He has made significant contributions to the Software Engineering Daily Android app. Thanks again Kurian for your work.
Ep 662: Bad Men with Bob Hoffman
Bob Hoffman's long career in advertising included CEO positions at three different agencies. He helped huge brands craft their messaging and grab consumer attention. In Bob's world of advertising, lots of money was spent on creativity. Were the campaigns successful? That depends on who you ask. In the old world of advertising, everyone acknowledged that success was subjective. If you could imagine the opposite of what Bob Hoffman built his career doing, it might look like search advertising. Bob's campaigns were about creating a brand's voice, with colorful art, subtlety, and ambient messaging. Advertising was about turning a brand into an entity you recognize: teaching the consumer to associate Nike with fitness, or Dove soap with clean hands, or Cheetos with cheesy, salty attitude. Search advertising, on the other hand, is just text. You enter a search query looking for black socks, and the top link that comes back is a line of text that says "cheap black socks." Search advertising catches people who have an intent to do something; they have stated their intent by typing into a box. With search advertising, a brand might not even need sexy, flashy creative. As money poured into adtech, user tracking, and Google, brands started to care more about metrics. When Bob met with a brand, the brand wouldn't be asking about the cool new advertising campaign featuring a young actress drinking a Coca-Cola. The brand would be asking about the click-through rate of a display advertising campaign. Brands moved their focus to statistics, and away from creativity. And technology companies were happy to provide them with statistics. Whether those statistics were true is another story altogether. The industry was moving from creative BS to outright lying, and Bob decided to leave. In today's episode, Bob explains how the state of advertising became so problematic, and the ways in which it harms us Internet users.
We have done lots of reporting about advertising fraud over the last year, and it is a popular topic because people are often shocked to find that online advertising is inextricably linked to organized crime, surveillance, and Twitter botnets. That's not to say that online advertising doesn't work; it certainly does! But understanding the dark underbelly of the Internet's cash cow is a necessary precondition to finding a solution.
Ep 661: Augmented Reality with Scott Montgomerie
Augmented reality applications are slowly making their way into the world of the consumer. Pokemon Go created the magical experience of seeing Pokemon superimposed upon the real world. IKEA's mobile app lets you see how a couch would fit into your living room, which significantly improves the furniture buying process. Augmented reality applications can have an even more dramatic impact on industrial enterprises. Have you ever set up a factory? You might need to build a conveyor belt. You might need to put together the parts of a giant machine that extrudes steel. You might need to fix a silicon wafer fabrication machine. It takes an expert to set up these heavy, complicated machines. ScopeAR is a company that builds augmented reality tools. One of the ScopeAR products allows users to telepresence with each other to collaborate on the construction and maintenance of heavy machinery. Imagine I am setting up my factory, and I have a complicated piece of machinery (let's say a conveyor belt) in front of me. I have never constructed a conveyor belt before. I put on a HoloLens and set up a VoIP call with an expert who has experience with that piece of machinery, and they point out what I need to do by superimposing 3-D arrows, text, and other instructions on my field of vision. They can share my experience and help guide me through the process. This is such a flexible tool that you can imagine augmented reality assistance being useful in medicine, construction, education, and other fields. Scott Montgomerie is the CEO of ScopeAR, and in today's episode we talk about the state of AR, how the AR tools from Apple and Google compare, and how the tools used to map the world in AR relate to the tools used by autonomous cars to do the same. Scott was a great guest, and I hope to have him back on in the future. We have done some great shows about how to build augmented reality and virtual reality applications.
To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android. Shout out to today's featured contributor Edgar Pino. He is working on a real-time chat application for Software Engineering Daily, so that we can have chat rooms for people to discuss the episodes easily. Innovative work!
Ep 660: Elastic Load Balancing with Ranga Rajagopalan
Computational load is the amount of demand being placed on a computer system. "Load" can take the form of memory, CPU, network bandwidth, disk space, and other finite resources. When we design systems, we need to prepare for high-load events. On a social network, people are much more active in the mornings. On an e-commerce site, Black Friday brings many more users online for discount shopping. Our distributed application must be able to scale in response to these spikes in traffic. Cloud computing has changed popular software architecture patterns, and load balancing has changed along with it. With on-demand, effectively unlimited infrastructure, we don't need to worry about ordering servers and provisioning. With infrastructure as code, it becomes simpler to manage lots of deployable units, so we can break up our monolith into microservices and run hundreds or thousands of virtual machines or containers. Enterprises that were started before cloud computing have large on-premise server deployments, but today many of them also use the cloud. The cloud can augment their classic on-prem deployments with cloud platform-as-a-service features, and it can serve as a reliable way to scale during high-load events. Today, a common architectural pattern is to break your application into services, each of which has multiple instances. When a particular service is under heavy demand, you create more instances to handle the increased load. How do you monitor the load on each service? How do you know when to spin up new instances of the service? Load analysis and load balancing across services can be implemented by placing "agents" throughout your infrastructure. These agents gather data about services and service instances, and route that data to a centralized place. This centralized "control plane" can then make decisions about load balancing and traffic routing.
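The agent/control-plane pattern described above can be sketched as follows. This is a hypothetical, in-memory control plane for illustration, not Avi Networks' actual API: agents report per-instance load, and the control plane routes to the least-loaded instance and flags when to scale out.

```python
# Sketch of a load-balancing control plane: agents call report(),
# the control plane picks the least-loaded instance for new traffic
# and decides when average load justifies spinning up more instances.
class ControlPlane:
    def __init__(self, scale_threshold=0.8):
        self.loads = {}                     # instance id -> load in [0, 1]
        self.scale_threshold = scale_threshold

    def report(self, instance, load):       # called by agents
        self.loads[instance] = load

    def route(self):                        # least-loaded instance wins
        return min(self.loads, key=self.loads.get)

    def needs_scale_out(self):              # is average load too high?
        avg = sum(self.loads.values()) / len(self.loads)
        return avg > self.scale_threshold

cp = ControlPlane()
cp.report("svc-a-1", 0.9)
cp.report("svc-a-2", 0.4)
print(cp.route())            # svc-a-2
print(cp.needs_scale_out())  # False (average 0.65 is under 0.8)
```

A real control plane would also weigh health checks, latency, and connection counts, but the division of labor is the same: agents observe, the control plane decides.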
Ranga Rajagopalan worked on networking at Cisco for a decade before co-founding Avi Networks as CTO. Avi Networks builds modern load balancing software, and in today's episode, Ranga describes the requirements of load balancing. We talked about the evolution of network infrastructure, the impact of the cloud, and the technical decisions that his team has made when architecting Avi Networks. Full disclosure: Avi Networks is a sponsor of Software Engineering Daily.
Ep 659: Kafka at NY Times with Boerge Svingen
The New York Times is a newspaper that evolved into a digital publication. Across its 166-year history, The Times has been known for longform journalistic quality, in addition to its ability to quickly churn out news stories. Some content on the New York Times site is old but timeless "evergreen" content. Readers of the New York Times website are not only looking for the most recent news; they want to know what the headlines were the day after Pearl Harbor. They want to read editorials about Martin Luther King. Over the last 30 years, The New York Times has moved itself online, bringing old material with it. Since the 90s, several different content management systems (CMS) have been used by journalists within The Times. These different sources of content store data in different formats. This is a data management problem. Users want to search over the entire history of articles published by The Times, which means that The Times needs to unify those articles in a single index: articles from the 1920s that were digitized using OCR, articles from 1998 that were written on a legacy CMS, and articles from 2017 that use the latest CMS. Boerge Svingen is the director of engineering at The Times, and he wrote about this problem and its solution on Medium. His story describes the flexibility of Kafka: in contrast to the common use of Kafka as a buffer for high volumes of data, the New York Times uses Kafka as a place to unify data, with specific materialized views built on top of the unified log. We have covered Kafka in the past with interviews of some of its creators, including Jay Kreps and Neha Narkhede. To find these old episodes, you can download the Software Engineering Daily app for iOS and for Android.
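The log-as-source-of-truth pattern can be sketched as follows. This is a simplified, in-memory stand-in for a Kafka topic and its consumers (not The Times' actual pipeline): articles from different CMS generations are normalized into one ordered log, and a materialized view, here a search index, is rebuilt by replaying the log from the beginning.

```python
# Sketch of the unified-log pattern: one append-only log of normalized
# articles, with materialized views derived by replaying it.
log = []   # stands in for a single Kafka topic

def publish(article):
    # Every source (OCR'd archives, legacy CMS, current CMS) is
    # normalized into the same record shape before appending.
    log.append(article)

def build_search_index():
    # A materialized view: consume the log from offset 0 and derive
    # whatever structure this consumer needs (here, id -> lowercased text).
    return {a["id"]: a["body"].lower() for a in log}

publish({"id": "1923-001", "source": "ocr", "body": "Market Rallies"})
publish({"id": "2017-042", "source": "cms-v3", "body": "Kafka at The Times"})
index = build_search_index()
print(index["1923-001"])   # market rallies
```

Because the log is the source of truth, a new view (a recommendation feed, a different search backend) can be added later just by replaying the same log with a new consumer.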
Ep 658: Cryptoeconomics with Vlad Zamfir
A cryptocurrency has a distributed ledger called a blockchain. The blockchain keeps track of every transaction that occurs across the cryptocurrency. This blockchain must stay up to date and verified, which requires someone in the network to do that validation. Bitcoin and Ethereum use the proof-of-work algorithm: miners do computational work to validate the legitimacy of transactions across the network, and in return they are given cryptocurrency as a reward for that computational work. In the future, cryptocurrencies could move towards a proof-of-stake model. If you own a significant amount of cryptocurrency, you have an incentive to keep the blockchain valid and up to date. Proof-of-stake algorithms can be significantly less energy intensive. Vlad Zamfir is a researcher for the Ethereum Foundation, and he joins Haseeb Qureshi for a conversation about cryptoeconomics. This is an in-depth conversation between two active blockchain developers. We hope you enjoy it. You can send us feedback on the show by emailing me [email protected] or joining us on the Slack channel at softwareengineeringdaily.com/slack.
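The mining described above can be illustrated with a toy proof-of-work (simplified for illustration; real Bitcoin mining hashes block headers with double SHA-256 at far higher difficulty): find a nonce so that the block's hash starts with a given number of zero hex digits. Finding the nonce is expensive; checking it is one cheap hash, which is what makes the work verifiable by the whole network.

```python
import hashlib

# Toy proof-of-work: search for a nonce that gives the block data a
# hash with `difficulty` leading zero hex digits.
def mine(block_data, difficulty=4):
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1

def verify(block_data, nonce, difficulty=4):
    # Verification costs a single hash, regardless of how long mining took.
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce, digest = mine("tx: alice -> bob 1 BTC", difficulty=4)
print(verify("tx: alice -> bob 1 BTC", nonce))   # True
```

Proof-of-stake replaces this hash search with validators putting up cryptocurrency as collateral, which is why it can use far less energy.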
Ep 657: Analyse Asia with Bernard Leong
In America, the tech companies we focus on are commonly known as FAANG: Facebook, Amazon, Apple, Netflix, Google. We all know what these companies do because they impact our daily lives. In Asia, there are three giant tech companies with similar scale: Baidu, Alibaba, and Tencent, otherwise known as BAT. Technology within a location is shaped by the pressures of that location. You might think we live in a global society, but tech in Asia is dramatically different than it is in America. Differences in culture lead to differences in product development. In China, a different political system contributed to more rapid adoption of online payments. Because there is more payment data, people can be given loans more efficiently, and less of the population is "unbanked." Online payments are mostly handled by WeChat, a social networking product from Tencent, and Alibaba, an ecommerce giant. If you live in the West, imagine that Facebook and Amazon handled most of your payments for everything. You would have a different relationship with those companies. Bernard Leong is the host of Analyse Asia, a podcast about Asian developments in technology and business. After studying materials science in Singapore and theoretical physics at Cambridge, he made his way into business and journalism, and developed an interest in the Singularity, a subject that few people took seriously until recently (one topic we explored in this show is Masayoshi Son, the Japanese tycoon who wants to invest nearly a trillion dollars into technology companies; Masayoshi believes firmly that the Singularity is coming). Resources mentioned in this episode:
- Shenzhen: The Silicon Valley of Hardware (Full Documentary) | Future Cities | WIRED
- In the Plex by Steven Levy
- The Hidden Forces Behind Toutiao: China's Content King
Ep 656: IFTTT Architecture with Nicky Leach
It's 9pm, and you are hungry. You order a pizza from Domino's. Your street is dark, so you have installed a smart lightbulb in front of your mailbox that lights up the address. When the pizza at Domino's is ready, you want the lightbulb on your mailbox to light up so that the delivery person can read your address when they arrive at your house. The Internet should make it possible to have this kind of event-driven, connected world. Anything that is connected to the Internet should be able to send signals to anything else on the Internet, so that our lives gradually become more automated. This is what IFTTT does. Users of IFTTT can easily create applets to wire different services together. You can use IFTTT to trigger an email whenever three of your friends retweet something on Twitter. You can use IFTTT to flash the lights in your house when Bitcoin hits new market highs. You can use IFTTT to order a pizza whenever Bitcoin crashes. IFTTT makes it easy to connect different services together, and a lot of work goes into the infrastructure that enables billions of events to be processed correctly. Nicky Leach from IFTTT's engineering team joins the show to describe how IFTTT allows for integrations between services that were not built to integrate, and he talks about the scheduling, data engineering, and monitoring of the company's software stack.
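The trigger-and-action "applet" model described above can be sketched as follows. This is a hypothetical, in-memory version for illustration; real IFTTT wires together third-party service APIs at a much larger scale.

```python
# Sketch of the applet model: each applet pairs a trigger predicate
# with an action; incoming events are dispatched to every applet
# whose trigger matches.
applets = []

def applet(trigger, action):
    applets.append((trigger, action))

def dispatch(event):
    # Run every matching applet's action and collect the results.
    return [action(event)
            for trigger, action in applets
            if trigger(event)]

# "If the pizza order is ready, then turn on the mailbox light."
applet(lambda e: e.get("type") == "order_ready",
       lambda e: f"light_on:{e['address']}")

results = dispatch({"type": "order_ready", "address": "12 Elm St"})
print(results)   # ['light_on:12 Elm St']
```

The engineering challenges the episode covers come from running this simple loop at billions of events: scheduling the polling of services that have no webhooks, retrying failed actions, and monitoring the whole pipeline.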