
Software Engineering Daily
Ep 655: Dremio with Tomer Shiran
In 2015, eleven years had passed since MapReduce was first published, and companies were still having data problems. Tomer started working on Dremio, a company that was in stealth for another two years. I interviewed Tomer two years ago, when he still could not say much about what Dremio was doing. We talked about Apache Drill, an open-source project related to what Dremio eventually built. Earlier this year, two of Tomer’s colleagues, Jacques Nadeau and Julien Le Dem, came on to discuss columnar data storage and interoperability. What I took away from that conversation was that today, data within an average enterprise is accessible, but the different formats are a problem. Some data is in MySQL, some is in Amazon S3, some is in Elasticsearch, some is on HDFS stored in Parquet files. Different teams will set up different BI tools and charts that read from a specific silo of data. At the lowest level, the different data formats are incompatible–you have to transform MySQL data in order to merge it with S3 data. On top of that, engineers doing data science work are using Spark, Pandas, and other tools that pull lots of data into memory–if the in-memory formats are not compatible, the data teams can’t get the most out of their work. And on top of that, at the highest level, data analysts are working with different data analysis tools, so there is even more siloing. Now I understand why Dremio took two years to bring to market. They are trying to solve data interoperability by making it easy to transform data sets between different formats. They are trying to solve data access speed by creating a sophisticated caching system. And they are trying to improve the effectiveness of the data analysts by providing the right abstractions for someone who is not a software engineer to study the different data sets across an organization. Dremio is an exciting project because it is rare to see a pure software company put so many years into up-front stealth product development.
After talking to Tomer in this conversation, I’m looking forward to seeing Dremio come to market. It was fascinating to hear him talk about how data engineering has evolved to today.
Ep 654: Keybase with Max Krohn
Public key encryption allows for encrypted, private messages. A message sent from Bob to Alice gets encrypted using Alice’s public key. Public key cryptography also allows for signed messages: when Alice signs a message, she uses her private key, and Bob can verify the signature if he has her public key. In both cases, Bob needs Alice’s public key! If Bob gets that public key from an email message, Bob is trusting that the email message is secure–and if Bob can’t ever verify that first message containing the key, he has no way to verify the messages that come after it. This is the problem of key distribution. The difficulty of key distribution undermines the usability of PGP encryption. Serious encryption advocates will sometimes meet in person to exchange pieces of paper containing public keys. Keybase is a company that attempts to solve the problem of key distribution by having users connect social media accounts and devices to Keybase, in order to collectively verify who they are, and then giving them the power to share their public keys. Max Krohn is a founder of Keybase, and was previously a founder of SparkNotes and OkCupid. Max was on the show a few years ago to discuss the basics of Keybase, and in this episode he explores some of the abstractions that Keybase has built on top of its core identity tool–Keybase File System, Keybase Teams, and Keybase Git. We do break down the basics of Keybase, but if you want a more thorough explanation, you might like to check out that older episode. You can download the Software Engineering Daily app on iOS or Android to find all of our old episodes.
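To make the signing half of that description concrete, here is a minimal sketch of textbook RSA signatures in Python. It is a toy under loud assumptions: the primes 61 and 53 are tiny illustrative values, there is no padding, and the helper names (`make_keypair`, `sign`, `verify`) are made up for this example. It is not how PGP or Keybase implement anything, and it is not secure.

```python
import hashlib


def make_keypair(p, q, e=17):
    """Build a toy RSA key pair from two small primes (illustrative, NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e mod phi; needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def digest(message, n):
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private_key):
    """Alice signs with her private key."""
    n, d = private_key
    return pow(digest(message, n), d, n)


def verify(message, signature, public_key):
    """Bob verifies with Alice's public key."""
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)


alice_public, alice_private = make_keypair(61, 53)  # hypothetical toy primes
message = b"meet at noon"
signature = sign(message, alice_private)
print(verify(message, signature, alice_public))  # True
```

Notice that `verify` only proves the message matches whatever public key Bob was handed. If an attacker swapped in their own key, verification would still succeed, which is exactly the key distribution problem described above.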
Ep 653: Quantum Computing Introduction with Zlatko Minev
Computer chips have physical limitations. When transistors get too small, electrons start to behave in ways that make the hardware modules less reliable. Our reliable technological progress has been enabled by Moore’s Law: the observation that the number of components we can fit on a chip doubles roughly every two years (a figure often quoted as 18 months). We can’t keep shrinking the size of these components, because physics is no longer complying. Quantum computing allows us to operate on qubits rather than bits, which for certain classes of problems offers a kind of parallelism that classical bits cannot, and with it a possible path to continued technological progress. Quantum computing is still mostly an area of research rather than production systems–but it is rapidly approaching usability, and Zlatko Minev joins the show to explain how quantum computing works, and why software engineers should care. Zlatko is a PhD candidate at the Yale Quantum Information Lab. Today he describes how qubits work, which algorithms quantum computing impacts, and which parts of modern computer architecture will work on a quantum computer. We may have to throw out the von Neumann architecture when it comes to quantum! Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript.
Ep 652: Smart Contract Security with Emin Gün Sirer
A smart contract is a program that runs on a blockchain and can hold and transfer funds according to its code. Smart contracts are usually associated with the Ethereum platform, which has a language called Solidity that makes it easy to program smart contracts. Someday, we will have smart contracts issuing insurance, processing legal claims, and executing accounting transactions. Smart contracts involve money, and they are likely to transact with cryptocurrencies. That makes them ripe targets for attackers. What are the vulnerabilities of smart contracts? What can we do to ensure the safety of a high-throughput, automated financial system? In today’s episode, Haseeb Qureshi talks to Emin Gün Sirer, a professor at Cornell University where he is co-director of the Initiative for Cryptocurrencies and Contracts. They discuss how smart contracts work and how to secure them. Haseeb and Emin are both working full-time on cryptocurrencies, which makes for a detailed technical discussion. In our previous episode about the DAO hack, Emin Gün Sirer was one of the protagonists of the story. You can find that episode as well as all of our old episodes by downloading the Software Engineering Daily app for iOS and for Android. We also have several other episodes with Haseeb.
Ep 651: Interviewing.io with Aline Lerner
Interviewing engineers is not a solved problem. Quite the opposite–everyone in the software industry will tell you their own personal issues with the hiring process. One reason that technical interviews have not evolved significantly is the lack of standardized tooling. Some companies give you one phone screen, some give you two. Some companies have you solve brain teasers (“how many golf balls fit in a school bus”) and some make you fix bugs in their production codebase. During the on-site interview, some companies use whiteboards, some let you use a laptop. Software companies do so much–they should be outsourcing the things that are not their core competency. Certainly they cannot outsource the entire hiring process–but they can outsource parts of it to a company like Interviewing.io. Engineers come to Interviewing.io to practice their interview skills, with engineers from top companies acting as their interviewers. When an engineer has practiced interviewing enough, they can use Interviewing.io to interview for real with real companies and find a job. Aline Lerner is the CEO of Interviewing.io, and she knows as much about the software interviewing and recruiting process as anyone. After working as an engineer, she started studying recruiting, consulting with top companies to help them improve their process. From her observations, she created Interviewing.io. In this episode, we dissect the workflow that she created for engineers to improve at interviewing and find jobs, and also explore the insights that led her to start Interviewing.io.
Ep 650: Model Training with Yufeng Guo
Machine learning models can be built by plotting points in space and optimizing a function based on those points. For example, I can plot every person in the United States in a three-dimensional space: age, geographic location, and yearly salary. Then I can draw a function that minimizes the distance between my function and each of those data points. Once I define that function, you can give me your age and a geographic location, and I can predict your salary. Plotting these points in space is called embedding. By embedding a rich data set, and then experimenting with different functions, we can build a model that makes predictions based on that data set. Yufeng Guo is a developer advocate at Google working on CloudML. In this show, we describe two separate examples for preparing data, embedding the data points, and iterating on the function in order to train the model. In a future episode, Yufeng will discuss CloudML and more advanced concepts of machine learning.
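The idea of drawing a function that minimizes the distance to the data points can be sketched with ordinary least squares. This is a simplified illustration, not anything from the episode: it uses a single input (age) instead of age plus location, the four data points are invented, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: (age, yearly salary) pairs.
ages = [25, 35, 45, 55]
salaries = [40_000, 60_000, 80_000, 100_000]

model = fit_line(ages, salaries)
print(predict(model, 30))  # 50000.0
```

Real training loops usually iterate toward the best function with gradient descent rather than solving for it in one step, but the objective is the same: make the total squared distance between the function and the points as small as possible.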
Ep 649: Internet Monitoring with Matt Kraning
How would you build a system for indexing and monitoring the entire Internet? Start by breaking the Internet up into IP address ranges. Give each of those address ranges to servers distributed around the world. On each of those servers, iterate through your list of IP addresses, sending packets to them. Depending on what sorts of packets those IP addresses respond to, and what those responses are, you can build a map of the devices on the Internet: what is running on those devices, and what they respond to. Qadium is a company that indexes and monitors devices on the Internet, to help organizations understand the devices that are within corporate networks. If you are a large corporation, Qadium can probably do a better job of figuring out your Internet footprint than you can. Matt Kraning is the CTO of Qadium, and in today’s show he describes the process by which Qadium maps the Internet. Matt used to work on data infrastructure at DARPA, and has deployed Hadoop in Afghanistan–so the infrastructure of Qadium seems relatively manageable by comparison. Our data conversations in this episode span from Storm and Hadoop to Google BigQuery, Bigtable, and Dataflow.
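The first two steps, breaking an address space into ranges and handing those ranges to servers, can be sketched with Python's standard `ipaddress` module. This is only an illustration of the partitioning idea, not Qadium's system: the worker names are hypothetical, the network is the reserved documentation range 192.0.2.0/24, and no packets are sent.

```python
import ipaddress


def partition(network, new_prefix, workers):
    """Split a network into subnets and assign them to workers round-robin."""
    subnets = ipaddress.ip_network(network).subnets(new_prefix=new_prefix)
    assignment = {worker: [] for worker in workers}
    for i, subnet in enumerate(subnets):
        assignment[workers[i % len(workers)]].append(subnet)
    return assignment


# Hypothetical worker names; 192.0.2.0/24 is reserved for documentation.
plan = partition("192.0.2.0/24", 26, ["scanner-a", "scanner-b"])
for worker, subnets in plan.items():
    print(worker, [str(s) for s in subnets])
```

Each worker would then iterate over the addresses in its subnets (for example with `subnet.hosts()`) and record which probes get a response. That probing step is deliberately left out here.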
Ep 648: Scala Native with Denys Shabalin
Scala is a functional and object-oriented programming language built on the JVM. Scala Native takes this language, loved by many, and brings it to bare metal. Scala Native is an optimizing ahead-of-time compiler and lightweight managed runtime designed specifically for Scala. Denys Shabalin (dennis shuh-blin) is a Research Assistant at the EPFL and the primary creator of Scala Native. In this episode, Adam Bell interviews Denys about the motivations behind the Scala Native project, how it was implemented, and future directions. He also briefly touches on how Scala Native made cold compilation times of Scala code twice as fast. If you are interested in functional programming, compiler design, or want to learn some interesting tidbits about garbage collector design and trade-offs, you will like this episode. The mobile apps are open sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to hack on, we would love to get your help! We are building a new way to consume software engineering content. We have the Android app, the iOS app, a recommendation system, and a web frontend–and more projects are coming soon. If you have ideas for how software engineering media content should be consumed, or if you are interested in contributing code, check out github.com/softwareengineeringdaily, or join our Slack channel (there’s a link on our website)–or send me an email: [email protected]
Ep 647: Gigster with Roger Dickey
You have heard the phrase: every company is becoming a software company. An insurance company is now supposed to turn into a software company that sells insurance. A clothing retailer needs to reinvent itself to be able to build software to manage the production and distribution of its clothing. Software applications provide so much leverage to an organization, it seems smart to develop in-house software teams to build those applications. But does it really make sense? Is there a better alternative? In the 90s outsourcing was a common solution to this problem. If you didn’t have software expertise at your company, you would hire a large consulting firm. These firms would often hire inexperienced offshore developers, and the resulting code quality was not so great. Because of the bad experiences of the first Internet boom, companies became more cautious about outsourcing their engineering work–which led to today, where the standard is to hire your own software team. The world has changed in ways that have made outsourcing a more viable solution. Programming best practices are more widely understood. There is an international community of software engineers that share information on places like Stack Overflow, Quora, and Twitter. Off-the-shelf collaboration tools make it much easier to communicate the requirements of a project to a team of developers. Gigster is a company that is working to optimize the engineering of software projects. Large enterprises come to Gigster to build new projects from scratch–whether that project is a marketplace, a mobile application, or a machine learning model. Roger Dickey is the CEO of Gigster, and he joins the show to describe how Gigster works, and why it often makes sense for companies to focus on their core competency and outsource software engineering. 
Some of our most popular episodes of Software Engineering Daily describe how leading software companies are being built–we have covered Giphy, Netflix, DigitalOcean, Stripe, and many others. Download the Software Engineering Daily app for iOS or Android to hear all of our old episodes. They are easily organized by category, and as you listen, the SE Daily app gets smarter, and recommends content to you based on the episodes you are hearing. If you don’t like this episode, you can easily find something more interesting by using the recommendation system.
Ep 646: Blockchain Building with Daniel van Flymen
A blockchain is a data structure, a chain of blocks linked by hashes, that is replicated across a decentralized, peer-to-peer network. Bitcoin is the best-known blockchain, but in the next decade we will see many more blockchains. Most listeners probably know that you could just fork the code of Bitcoin to start your own blockchain–but wouldn’t it be nice to know how to build a blockchain from scratch? Daniel van Flymen is the author of the Medium article “Learn Blockchains by Building One.” In his post, he walks you through how to write the code for a blockchain–just like any other web app. He starts with raw Python code, defines the data structures, and stands up his simple blockchain app on a web server to give a toy example of how nodes in a blockchain communicate. For me, this was a great article to read. I have reported on blockchains for over a year, but had not seen such a clear example with executable, simplified code. Stay tuned at the end of the episode for Jeff Meyerson’s tip about making the most of a new job: brought to you by Indeed Prime. To find all of our coverage of cryptocurrencies, download the Software Engineering Daily app for iOS or Android.
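To give a flavor of what "defining the data structures" for a blockchain can look like, here is a minimal sketch of a hash-linked chain in Python. It is not the code from Daniel's article: the `Blockchain` class, its field names, and the sample transaction are assumptions made for this example, and it leaves out proof of work, networking, and consensus entirely.

```python
import hashlib
import json
import time


def hash_block(block):
    """SHA-256 of a block's JSON form (keys sorted so the hash is stable)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class Blockchain:
    def __init__(self):
        self.chain = []
        # Genesis block: it has no predecessor, so use a placeholder hash.
        self.add_block(transactions=[], previous_hash="0" * 64)

    def add_block(self, transactions, previous_hash=None):
        block = {
            "index": len(self.chain),
            "timestamp": time.time(),
            "transactions": transactions,
            # Each block commits to the full contents of the one before it.
            "previous_hash": previous_hash or hash_block(self.chain[-1]),
        }
        self.chain.append(block)
        return block

    def is_valid(self):
        """Check that every block still points at the hash of its predecessor."""
        return all(
            self.chain[i]["previous_hash"] == hash_block(self.chain[i - 1])
            for i in range(1, len(self.chain))
        )


ledger = Blockchain()
ledger.add_block([{"sender": "alice", "recipient": "bob", "amount": 5}])
print(ledger.is_valid())  # True

ledger.chain[0]["transactions"] = ["forged"]  # tamper with history
print(ledger.is_valid())  # False
```

The tampering check is the point of the structure: changing any earlier block changes its hash, which breaks the link stored in the block after it.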
Ep 645: Ethereum Platform with Preethi Kasireddy
Ethereum is a decentralized transaction-based state machine. Ethereum was designed to make smart contracts more usable for developers. Smart contracts are decentralized programs that usually allow for a transaction between the owner of the contract and anyone who would want to purchase something from the contract owner. For example, I could set up a smart contract where a listener sends my smart contract some ether and I send the listener a podcast episode automatically. Smart contracts can also interact with each other, to network together complex transactions. In the same way that web development has been made easier by PaaS and SaaS, smart contracts will make building financial systems simple. Preethi Kasireddy is a blockchain developer who writes extensively about cryptocurrencies. She joins the show to describe how the Ethereum platform works, including the steps involved in a smart contract transaction. This episode covers some advanced topics of Ethereum, and if you are out of your comfort zone, don’t worry–you aren’t alone.
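The phrase "transaction-based state machine" can be illustrated with a toy state transition function in Python, using the podcast-for-ether example from above. Everything here is an assumption for illustration: the account names, the `EPISODE_PRICE` constant, and the `apply_transaction` helper are invented, and real Ethereum adds gas, signatures, nonces, and EVM bytecode that this sketch ignores.

```python
EPISODE_PRICE = 3  # hypothetical price of one episode, in toy ether units


def apply_transaction(state, tx):
    """Return the next state; the input state is left unchanged."""
    sender, recipient, value = tx["from"], tx["to"], tx["value"]
    if state["balances"].get(sender, 0) < value:
        raise ValueError("insufficient balance")

    balances = dict(state["balances"])
    balances[sender] -= value
    balances[recipient] = balances.get(recipient, 0) + value

    # Toy "contract" rule: paying the contract enough ether unlocks an episode.
    delivered = list(state["delivered"])
    if recipient == "podcast_contract" and value >= EPISODE_PRICE:
        delivered.append(sender)

    return {"balances": balances, "delivered": delivered}


state0 = {"balances": {"listener": 10}, "delivered": []}
state1 = apply_transaction(
    state0, {"from": "listener", "to": "podcast_contract", "value": 3}
)
print(state1["balances"])   # {'listener': 7, 'podcast_contract': 3}
print(state1["delivered"])  # ['listener']
```

The design choice worth noticing is that the function is pure: given the same starting state and the same transaction, every node computes the same next state, which is what lets a decentralized network agree on the result.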
Ep 644: Bitcoin Segwit with Jordan Clifford
Visa processes 1,600 transactions per second. PayPal processes 193 transactions per second. Bitcoin processes only 3-4 transactions per second. In order to fulfill the dreams of financial programming–in order to get decentralized, peer-to-peer micropayments–Bitcoin needs a much higher transaction throughput. Bitcoin’s scalability issues have led to debates within the community and changes in the software. In this episode, Jordan Clifford gives an overview of some of the scaling limitations of Bitcoin, and discusses SegWit, a change to the Bitcoin protocol that improves scalability. Jordan was previously on the show to discuss the basics of Ethereum and Bitcoin. This episode covers some advanced topics of Bitcoin, and if you are out of your comfort zone, don’t worry–you aren’t alone. Stay tuned at the end of the episode for Jeff Meyerson’s tip about assessing cultural fit at a company: brought to you by Indeed Prime.
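A back-of-the-envelope calculation shows roughly where a figure like 3-4 transactions per second comes from. The 1 MB block size limit (before SegWit) and the ten-minute target block interval are properties of the Bitcoin protocol; the 500-byte average transaction size is an assumption picked for illustration, and real averages vary.

```python
BLOCK_SIZE_BYTES = 1_000_000    # pre-SegWit block size limit (1 MB)
BLOCK_INTERVAL_SECONDS = 600    # target time between blocks (about 10 minutes)
ASSUMED_AVG_TX_BYTES = 500      # assumption: average transaction size


def transactions_per_second(block_bytes, avg_tx_bytes, interval_seconds):
    """Upper bound on throughput if every block is completely full."""
    transactions_per_block = block_bytes / avg_tx_bytes
    return transactions_per_block / interval_seconds


tps = transactions_per_second(
    BLOCK_SIZE_BYTES, ASSUMED_AVG_TX_BYTES, BLOCK_INTERVAL_SECONDS
)
print(round(tps, 2))  # 3.33
```

Under these assumptions, throughput only goes up if blocks hold more transactions or arrive more often, which is why the scaling debate centers on block capacity and on changes like SegWit.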
Ep 643: Tinder Engineering Management with Bryan Li
Tinder is a rapidly growing social network for meeting people and dating. In the past few years, Tinder’s userbase has grown rapidly, and the engineering team has scaled to meet the demands of increased popularity. On Tinder, you are presented with a queue of suggested people that you might match with, and you swipe left or right to indicate that you like or dislike them. Creating that queue of suggestions is a complex engineering problem. Many factors go into the suggestions that Tinder gives you: geotargeting, food preferences, your favorite band, your photos, and the people you have swiped on in the past. Bryan Li is an engineering manager at Tinder, and he joins the show to describe the interaction between the mobile client, backend servers, and the offline analytics and machine learning. We also talk about managing different teams and how to reorganize smoothly as a company grows. The mobile apps are open sourced at github.com/softwareengineeringdaily. If you are looking for an open source project to hack on, we would love to get your help! The Software Engineering Daily open source community is building a new way to consume software engineering content. We have the Android app, the iOS app, a recommendation system, and a web frontend. If you are interested in contributing, check out github.com/softwareengineeringdaily–or send me an email: [email protected]
Ep 642: Advertiser Trust with Marc Goldberg
Despite all the problems with online advertising, ads are not going away. Advertising is fundamental to the modern Internet economy. In previous episodes of Software Engineering Daily, we have mostly dissected the problems of adtech–bots, tracking, fraud, brand safety. We have talked about some solutions–for example, JavaScript tags that you can put on a page to identify a bot before you serve it an ad. But these solutions don’t get the job done completely, because it isn’t possible to reliably identify bots. Today we explore another solution for adtech: the whitelist. Marc Goldberg is the CEO of Trust Metrics, a company that provides whitelisting for advertisers. A whitelist is a list of domains that are acceptable to run your advertisements on. In order to build a whitelist, you need to review thousands of sites to judge which ones are reasonable places to publish an advertisement. Marc joins the show to describe how to build and scale a system for reviewing websites and judging whether they are safe to run ads against.
Ep 641: Ad Fraud Science with Augustine Fou
Advertising fraud continues to plague the Internet. We do not know the scope and scale of that fraud. How many ads on the Internet are viewed by bots? Estimates range from 2% to 99%. Advertisers are slowly becoming more educated about fraud, thanks in part to Dr. Augustine Fou. Dr. Fou is a full-time advertising fraud researcher. He looks at data sets of billions of ad impressions to figure out how fraud works and help victims of ad fraud make their case. Last year, Dr. Fou came on the show to give an overview of his perspective on the world of ad fraud. Today, we dive into the importance of Twitter in ad fraud schemes. We also talk about the severity of fraud on mobile apps. If you downloaded a flashlight app, or an alarm clock app, or a keyboard, that app could be displaying hidden ads that never actually show up. Stay tuned at the end of the episode for Jeff Meyerson’s tip about getting into a new job: brought to you by Indeed Prime.
Ep 640: User Management with Michel Feaster
A customer engages with a company across a variety of channels–email, Zendesk, Salesforce, online advertising. Unifying those data sources and getting a dashboard into the entire customer experience is the goal of Usermind, a customer engagement hub. If you can get all of that data unified in one place, it creates a tool that salespeople, customer service, and marketing can all look at to see how users are engaging with a company. Michel Feaster is the CEO of Usermind, and she joins the show to describe how Usermind works and the engineering behind the product. Connecting all the different APIs from all those other companies makes this a complicated integration problem, and hearing the Usermind strategy for managing integrations will be useful to anyone listening who is building a product with lots of external API integrations. Stay tuned at the end of the episode for Jeff Meyerson’s tip about launching a side business: brought to you by Indeed Prime. If you like listening to this podcast, download the Software Engineering Daily app for iOS and Android to hear all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that. If you don’t like this episode, you can easily find something more interesting by looking at the recommendations in the app.
Ep 639: 42 Coding School with Brittany Bir
42 is a tuition-free developer school for students aged 18 to 30. It was started by Xavier Niel, a French billionaire who wanted to encourage a new model of software education. 42 has campuses in France and Silicon Valley. 42 has very high standards for the students it admits, because the students that get in are not paying tuition, but they have 24/7 access to high quality computers and a beautiful campus. Unlike coding bootcamps, 42 lasts 3-5 years. Students who graduate are equipped with both computer science theory and the practical ability to see projects through on their own. Brittany Bir is the chief operating officer of 42 Coding School in Silicon Valley. She joins the show to talk about how 42 works and the future of programming education. At 42, the focus is on peer-to-peer learning. Students complete projects of their own choosing, and are also required to take on internships and get real world work experience. Stay tuned at the end of the episode for Jeff Meyerson’s tip about building your brand: brought to you by Indeed Prime.
Ep 638: ReactVR with Andrew Imm
React is a programming model for user interfaces. ReactJS is for building user interfaces for web applications. React Native is for building UI on Android or iOS. ReactVR is for building user interfaces in virtual reality. React Native was originally developed to make it easier to maintain parity between the web, iOS, and Android teams at Facebook. If I build an application for the web with ReactJS, I can rewrite that application for React Native on iOS or Android and reuse some of my code from the web application. It is not a 1-click level of portability between platforms, but it helps share user interface components between different platforms. ReactVR brings React development to virtual reality. Andrew Imm is a ReactVR developer at Facebook, and he joins the show to discuss how ReactVR works. We talk about the support for VR in the browser: WebGL, WebVR, and ThreeJS. We also explore some of the key React components that you might use to build an interface in ReactVR, and we wrap up the show by exploring VR more broadly–how consumers use VR today and how they might use it in the near future. The iOS app is the first project to come out of the Software Engineering Daily Open Source Project. There are more projects on the way, and we are looking for contributors–if you want to help build a better SE Daily experience, check out github.com/softwareengineeringdaily. We are working on an Android app, the iOS app, a recommendation system, and a web frontend. Help us build a new way to consume software engineering content at github.com/softwareengineeringdaily.
Ep 637 Sports Deep Learning with Yu-Han Chang and Jeff Su
A basketball game gives off endless amounts of data. Cameras from all angles capture the players making their way around the court, dribbling, passing, and shooting. With computer vision, a computer can build a well-defined understanding of what a sport looks like. With other machine learning techniques, the computer can make predictions by combining historical data with a game that is going on right now. Second Spectrum is a company that builds products for analyzing sports. At major basketball arenas, Second Spectrum cameras sit above the court, recording the game and feeding that information to the cloud. Second Spectrum’s servers crunch on the raw data, processing it through computer vision and putting it into deep learning models. The output can be utilized by teams, coaches, and fans. Yu-Han Chang and Jeff Su are co-founders of Second Spectrum. They join the show to describe the data pipeline of Second Spectrum from the cameras on the basketball court to the entertaining visualizations. After talking to them, I am convinced that machine learning will completely change how sports are played–and will probably open up a platform for new sports to be invented.
Ep 636 Alerting and Metrics with Clement Pang
An alert is a signal of problematic application behavior. When something unusual happens to your application, an alert can bring that anomaly to your attention. In order to detect unusual events, you need to define the norm. In order to define both normal and problematic behavior, you need metrics. Metrics are measurements of the behavior in your application. Metrics get created from logs and other data. These high volumes of data get aggregated and collected into easily digestible metrics. This aggregation process that reduces data to metrics is often called a “metrics pipeline.” Clement Pang is the chief architect of Wavefront, a company that builds metrics and alerting software for enterprises such as Box and Lyft. Clement joins the show to discuss how enterprises use alerting and metrics and how to build a company around metrics and alerting–including the data engineering involved in constructing a metrics pipeline.
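To make the idea of reducing raw data to metrics and alerting on deviations from the norm concrete, here is a minimal sketch in Python. It is not how Wavefront works; the event format, the function names `aggregate_per_minute` and `find_alerts`, and the three-sigma threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def aggregate_per_minute(log_events):
    """Reduce raw (timestamp_seconds, latency_ms) events to one mean latency per minute."""
    buckets = defaultdict(list)
    for timestamp, latency_ms in log_events:
        buckets[timestamp // 60].append(latency_ms)
    return {minute: mean(values) for minute, values in sorted(buckets.items())}


def find_alerts(metrics, baseline_minutes=3, threshold_sigmas=3.0):
    """Flag minutes whose metric strays too far from the norm.

    The norm is defined (hypothetically) by the first `baseline_minutes` minutes.
    """
    minutes = list(metrics)
    baseline = [metrics[m] for m in minutes[:baseline_minutes]]
    norm, spread = mean(baseline), stdev(baseline)
    return [
        m for m in minutes[baseline_minutes:]
        if abs(metrics[m] - norm) > threshold_sigmas * spread
    ]


# Made-up log events: (seconds since start, request latency in milliseconds).
events = [(0, 100), (30, 110), (60, 105), (90, 95), (120, 102), (150, 98),
          (180, 101), (210, 99), (240, 400), (270, 420)]
metrics = aggregate_per_minute(events)
alerts = find_alerts(metrics)
print(metrics)
print("alerting minutes:", alerts)
```

In this toy data set only the final minute, where latency jumps to around 400 ms, is far enough from the baseline to raise an alert. A production pipeline would likely use rolling baselines and streaming aggregation rather than a fixed window held in memory.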
Ep 635 Video Infrastructure with Matt McClure and Jon Dahl
Playing a video on the Internet seems simple. You press play, the video gets delivered, and boom–you are watching Game of Thrones, right? It’s a bit more complicated. Unless you have built an application that involves video, you probably have not dealt with the world of codecs, bitrates, and streaming. Depending on the bandwidth between the user and the server, you might want to use different compression rates. Think about all of the different use cases–different connection speeds, device types, operating systems, video players, cloud providers. As a developer, you just want videos in your application to play quickly and reliably. But it takes a lot of engineering, monitoring, and re-engineering to get it right. Matt McClure and Jon Dahl are the founders of Mux, a company that makes video infrastructure technologies. Previously they built Zencoder, a product for encoding and delivering video. This episode was a fascinating discussion of why building video products for the modern Internet is still so hard.
Ep 634 Dashboarding and Query Latency with Tom O’Neill
A dashboard is a data visualization that aggregates metrics in a way that we can quickly understand. In a modern software company, everyone uses dashboards–from salespeople to DevOps to HR. Each dashboard represents a query that must be updated frequently, so that anyone looking at it is getting up-to-date information. The data set being queried might be getting updated quickly in the case of time series or log data. Some queries require joins between disparate data sources. How do you keep dashboards accurate? How do you keep query latency down? Tom O’Neill is the CTO of Periscope Data, a company that makes popular dashboarding tools. In this episode, Tom explains the data engineering that underlies Periscope Data. We explore topics such as caching, columnar data, and Redshift.
Ep 633 Static Analysis with Paul Anderson
Static analysis is the process of evaluating code for errors, memory leaks, and security vulnerabilities. The “static” part refers to the fact that the code is not running. This differentiates it from unit tests and integration tests, which evaluate the runtime characteristics of code. If you use an IDE or a linter, you are using a basic form of static analysis all the time. More sophisticated static analysis tools can be used to analyze code in sensitive domains like healthcare or automobiles. During static analysis, we can discover problems in the code by evaluating the structure of a program. Buffer overruns can be identified before they turn into a vulnerability like Heartbleed. Null pointer exceptions can be fixed before they cause a segmentation fault. Concurrency issues can be serialized before they result in a problematic race condition. Today’s guest Paul Anderson is the VP of engineering at GrammaTech, where he works on CodeSonar, a static analysis tool. We discussed how static analysis works, why it is useful, and how it fits into a modern software delivery pipeline. Full disclosure: GrammaTech is a sponsor of Software Engineering Daily.
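As a small illustration of finding problems by evaluating the structure of a program without running it, the sketch below uses Python's standard `ast` module to flag two common mistakes. It is a toy linter and is not related to CodeSonar; the sample `SOURCE` snippet and the `analyze` function are hypothetical.

```python
import ast

# Sample code to inspect. It is parsed, never executed.
SOURCE = """
def lookup(table, key):
    value = table.get(key)
    if value == None:
        return 0
    try:
        return int(value)
    except:
        return -1
"""


def analyze(source):
    """Walk the syntax tree of unexecuted code and report suspicious patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # A bare `except:` catches every exception, which can hide real bugs.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append((node.lineno, "bare except"))
        # Comparing to None with `==` is usually meant to be `is None`.
        if isinstance(node, ast.Compare):
            for op, right in zip(node.ops, node.comparators):
                if (isinstance(op, ast.Eq)
                        and isinstance(right, ast.Constant)
                        and right.value is None):
                    findings.append((node.lineno, "comparison to None with =="))
    return sorted(findings)


findings = analyze(SOURCE)
for line, message in findings:
    print(f"line {line}: {message}")
```

Commercial tools go much further than this kind of pattern matching, for example by reasoning about data flow across functions, but the basic principle of inspecting code that is not running is the same.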
Ep 632 The Coding VC with Leo Polovets
The underlying cause of failure for many startups is that the founders are afraid of discomfort. An environment where everyone is comfortable is unlikely to be an environment where personal growth and value creation are occurring. When you are in a startup, calibrating the right amount of discomfort is often about calibrating risk. What are your risks? Can you quantify them? Can you enumerate them? Multiplying out the probability of surviving each of those risks, then multiplying that number times the sum of the discounted future cash flows of your business will give you the expected value of your business. Under the right circumstances, entrepreneurship has much higher expected value than a stable engineering job. The important difference is variance. Your business needs to be able to withstand the variance that bad luck can provide. And entrepreneurs themselves need to be able to withstand the variance implied by the fact that their business can completely fail and go to zero. Leo Polovets is a partner with Susa Ventures. He worked as an early engineer at LinkedIn, Google, and Factual, and he blogs at Coding VC. In this episode we talked about the proper mindset for founding a company–how to think about risk, mistakes, discomfort, and finance.
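The expected value calculation described above can be sketched in a few lines of Python. The numbers are made up, the function names are hypothetical, and multiplying the survival probabilities together assumes the risks are independent of one another, which real risks often are not.

```python
import math


def discounted_cash_flows(cash_flows, discount_rate):
    """Sum yearly cash flows, discounting the flow in year n by (1 + rate) ** n."""
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )


def expected_value(survival_probabilities, cash_flows, discount_rate):
    """Probability of surviving every risk times the discounted future cash flows."""
    p_survive_all = math.prod(survival_probabilities)  # assumes independent risks
    return p_survive_all * discounted_cash_flows(cash_flows, discount_rate)


# Illustrative inputs: two risks with a 50% chance of surviving each,
# two years of cash flows, and a 10% discount rate.
ev = expected_value([0.5, 0.5], [110, 121], 0.10)
print(round(ev, 2))
```

With these inputs the discounted cash flows come to about 200 and the chance of surviving both risks is 25%, so the expected value is roughly 50. The calculation says nothing about variance: most of that value comes from a small number of outcomes where the business survives.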
Ep 631 Tinder Growth Engineering with Alex Ross
Tinder is a popular dating app where each user swipes through a sequence of other users in order to find a match. Swiping left means you are not interested. Swiping right means you would like to connect with the person. The simple premise of Tinder has led to massive growth, and the app is now also used to discover new friends and create casual meetings. Every social network knows–if you are not growing, then you are dying. Growth is so important to Tinder, they have a large engineering organization devoted to five facets of growth: new users, activation, retention, dropoff, and anti-spam. These five segments cover the entire Tinder user lifecycle, and there is a sub-team in charge of each of the five areas. No matter what kind of Tinder user you are, there are growth engineers focused on your experience. Alex Ross is the director of engineering for the growth team at Tinder. His job requires a mix of data science, data engineering, psychology, and setting proper KPIs (key performance indicators). Each subteam has KPIs that determine how well they are doing with growth–and if the wrong KPI is set, it can create bad incentives. For example, a growth team that is focused only on getting users to spend more time engaging with Tinder would have an incentive to create so-called “dark patterns” that trigger addiction.
Ep 630 Brave Browser with Jonathan Sampson
Online advertising enables free content and services of the Internet. One of the free services that is powered by advertising is the browser. 60% of web browsing is done through Chrome, which is owned by Google, which is powered by advertising. The application that most of us use to explore the web is made by a company that relies on ads, so it is unsurprising that the default of that browser is to allow close tracking of user behavior. When you hit a website, a variety of trackers are logging your data for the purpose of serving you better ads. Some people don’t like ads, and they don’t like being tracked–but what is the alternative? How else can we get all the content we want? Since the 1990s, engineers have envisioned an Internet powered by micropayments. A micropayments system in your browser would allow users to pay for content with money instead of adtech. Brave is a web browser built with a modern view of advertising, privacy, and economics. Brave users can pay for content with their money OR by paying attention to ads. This system is formalized through the Basic Attention Token (BAT), a cryptocurrency that can be used to purchase user attention. Jonathan Sampson is a senior developer relations specialist with Brave Software. He joins the show to talk about the problems with the browsing experience and what Brave is doing to fix them.
Ep 629 Deep Learning Systems with Milena Marinova
The applications that demand deep learning range from self-driving cars to healthcare, but the way that models are developed and trained is similar. A model is trained in the cloud and deployed to a device. The device engages with the real world, gathering more data. That data is sent back to the cloud, where it can improve the model. From the processor level to the software frameworks at the top of the stack, the impact of deep learning is so significant that it is driving changes everywhere. At the hardware level, new chips are being designed to perform the matrix calculations at the heart of a neural net. At the software level, programmers are empowered by new frameworks like Neon and TensorFlow. In between the programmer and the hardware, middleware can transform software models into representations that can execute with better performance. Milena Marinova is the senior director of AI solutions at the Intel AI products group, and joins the show today to talk about modern applications of machine learning and how those translate into Intel’s business strategy around hardware, software, and cloud. Full disclosure: Intel is a sponsor of Software Engineering Daily. Question of the Week: What is your favorite continuous delivery or continuous integration tool? Email [email protected] and a winner will be chosen at random to receive a Software Engineering Daily hoodie. Data Skeptic podcast: Generative Adversarial Networks
Ep 628 Spotify Event Delivery with Igor Maravic
Spotify is a streaming music company with more than 50 million users. Whenever a user listens to a song, Spotify records that event and uses it as input to learn more about the user’s preferences. Listening to a song is one type of event–there are hundreds of others. Opening the Spotify app, skipping a song, sharing a playlist with a friend–all of these are events that provide valuable insights to Spotify. These are not the only types of events that Spotify cares about. There are also events that occur at the infrastructure level–for example a logging server that runs out of disk space. There are events that are relevant to all the users on Spotify–for example a new album release from Taylor Swift. An “event” is an object that needs to be registered within a system. Since there are so many events on a platform like Spotify, delivering and processing them reliably requires significant investment. Modern Internet companies are built by connecting cloud services, databases, and internal tools together. These different systems might respond to different events in different ways. Each system subscribes to the types of events that it wants to hear. Since there are so many events, and they might be received in uneven bursts, a modern architecture has a scalable queueing system to buffer events. To put an event on the queue, the event producer “publishes” that event to the queue. The event is then received by each “subscriber.” That’s why queueing is often known as pub/sub–publish/subscribe. Igor Maravic is an engineer with Spotify. In this episode, he explains why pub/sub is a key element of Spotify’s infrastructure–and he describes the migration that Spotify has made from Apache Kafka to Google Cloud Pub/Sub. If you like this episode, we have done many other shows about cloud infrastructure. 
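The publish/subscribe pattern from the Spotify episode above can be sketched with an in-memory queue. This is only an illustration of the pattern, not the API of Apache Kafka or Google Cloud Pub/Sub; the `Broker` class, its methods, and the event names are hypothetical.

```python
from collections import defaultdict, deque


class Broker:
    """A toy in-memory pub/sub broker that buffers events in a queue."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # event type -> handler functions
        self._queue = deque()                  # buffered (event type, payload) pairs

    def subscribe(self, event_type, handler):
        """Register a handler for the event types a system wants to hear."""
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Producers only enqueue; the queue absorbs uneven bursts of events."""
        self._queue.append((event_type, payload))

    def deliver(self):
        """Drain the queue, handing each event to every subscriber of its type."""
        delivered = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._subscribers[event_type]:
                handler(payload)
                delivered += 1
        return delivered


broker = Broker()
recommendations, play_counts = [], []
broker.subscribe("song_played", recommendations.append)
broker.subscribe("song_played", play_counts.append)
broker.subscribe("song_skipped", recommendations.append)

broker.publish("song_played", {"user": "u1", "track": "t9"})
broker.publish("song_skipped", {"user": "u1", "track": "t3"})
broker.publish("app_opened", {"user": "u2"})  # nobody subscribed to this type

delivered = broker.deliver()
print(delivered, len(recommendations), len(play_counts))
```

The producer never needs to know who is listening, which is what lets new systems subscribe later without changes on the publishing side. A real deployment adds what this sketch leaves out: durable storage, acknowledgements, retries, and scaling across machines.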
Ep 627 Advertising Analytics with Jonah Goodhart
Moat is one of the most successful advertising technology companies in history. After building a business from measurement of ad impressions, Moat was sold to Oracle for $850 million. Advertising powers the free content on the Internet. Measurement makes it easier for publishers to monetize their content. At Software Engineering Daily, we know this from firsthand experience. The podcast ecosystem has barely any ability to measure success–and that can make it hard to entice advertisers. In podcasting, it is very difficult to understand if an advertising campaign is a success. This illustrates why Moat is important. Improving the analytics on advertising helps publishers, brands, ad agencies, and adtech companies decide how to allocate their capital. Why is it hard to measure advertising success? Why is this a difficult engineering problem? Because there are so many players in the space with conflicting incentives. A brand wants to show ads to people who will buy a product. A publisher wants to display an ad that will maximize revenue. Adtech companies and ad agencies want to take the biggest cut possible from the transactions between brands and publishers. In the midst of all of this, fraudulent traffic providers offer cheap services that drain money from anyone who is not keeping a close eye on their deal flow. In this fog of war, Moat’s goal is to provide transparency where possible. Moat CEO Jonah Goodhart joins this episode to talk about advertising analytics, viewability, and fraud. If you like this episode, we have done many other shows about adtech and advertising fraud. You can check out our back catalog by going to softwareengineeringdaily.com or by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. 
Ep 626 Visual Search with Neel Vadoothker
If I have a picture of a dog, and I want to search the Internet for pictures that look like that dog, how can I do that? I need to make an algorithm to build an index of all the pictures on the Internet. That index can define the different features of my images. I can find mathematical features in each image that describe that image. The mathematical features can be represented by a matrix of numbers. Then I can run the same algorithm on the picture of my dog, which will make another matrix of numbers. I can compare the matrix representing my dog picture to the matrices of all the pictures on the internet. This is what Google and Facebook do–and we covered this topic in our episode about similarity search a few weeks ago. Today, we evaluate a similar problem: searching images within Squarespace. Squarespace is a platform where users can easily build their own website for blogging, e-commerce, or anything else. Neel Vadoothker is a machine learning engineer at Squarespace, and he joins the show to talk about how and why he built a visual similarity search engine. If you like this episode, we have done many other shows about machine learning. You can check out our back catalog by going to softwareengineeringdaily.com or by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that.
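The compare-the-numbers step described above can be sketched as follows. The three-element feature vectors are invented for illustration (a real system would derive much larger features from the image pixels, typically with a trained model), and cosine similarity is one common choice of comparison, not necessarily the one Squarespace uses.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, index, top_k=2):
    """Rank every indexed image by similarity to the query and keep the best."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical index: file name -> feature vector computed ahead of time.
index = {
    "beagle.jpg": [0.9, 0.1, 0.3],
    "poodle.jpg": [0.8, 0.2, 0.4],
    "sedan.jpg": [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.35]  # features of the picture of my dog
results = most_similar(query, index)
print(results)
```

Comparing the query against every indexed image works for a toy index like this one. At Internet scale, systems generally rely on approximate nearest-neighbor indexes so that each search does not have to touch every image.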
Ep 625 Doing Anything with George Anders
Software gives us new ways of communicating with each other. Engineers build scalable systems for e-commerce, helpdesk, and video sharing–and these systems do scale, to millions of people. But software alone cannot serve all of the demands of all the users and customers on these platforms. We need customer service representatives to address unexpected demands. We need design specialists to evaluate the interface that made sense to the engineers but not the users. We need sales people to connect our strange software to an impatient prospective customer. Engineers sometimes joke about firing all the non-engineers in the company. As engineers, it is easy to discount all of the work that non-engineers do–it can seem unscalable, or non-quantifiable, or mechanical. But most companies would fall over immediately without support, sales, design, operations, and the multitude of other non-engineering roles. More to the point–people in non-technical roles can drive the success of an organization. Some of the most influential leaders in tech came from a non-technical background: Stewart Butterfield of Slack; Brian Chesky of Airbnb; Sheryl Sandberg of Google and Facebook. A liberal arts education can foster the perfect set of skills to thrive in a technology company. George Anders is an author whose most recent book is called You Can Do Anything: The Surprising Power of a “Useless” Liberal Arts Education. George is one of my favorite business writers, and some of his past writing includes pieces about Sequoia Capital, Amazon, LinkedIn, and a ton of other topics on Quora. If you like this episode, we have done many other shows about business with guests like Seth Godin and Tyler Cowen–indeed many of the shows on Software Engineering Daily are not deeply technical. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. 
Ep 624 Word2Vec with Adrian Colyer
Machines understand the world through mathematical representations. In order to train a machine learning model, we need to describe everything in terms of numbers. Images, words, and sounds are too abstract for a computer. But a series of numbers is a representation that we can all agree on, whether we are a computer or a human. In recent shows, we have explored how to train machine learning models to understand images and video. Today, we explore words. You might be thinking–“isn’t a word easy to understand? Can’t you just take the dictionary definition?” A dictionary definition does not capture the richness of a word. Dictionaries do not give you a way to measure similarity between one word and all other words in a given language. Word2vec is a system for defining words in terms of the words that appear close to that word. For example, the sentence “Howard is sitting in a Starbucks cafe drinking a cup of coffee” gives an obvious indication that the words “cafe,” “cup,” and “coffee” are all related. With enough sentences like that, we can start to understand the entire language. Adrian Colyer is a venture capitalist with Accel, and blogs about technical topics such as word2vec. We talked about word2vec specifically, and the deep learning space more generally. We also explored how the rapidly improving tools around deep learning are changing the venture investment landscape. If you like this episode, we have done many other shows about machine learning with guests like Matt Zeiler, the founder of Clarifai, and Francois Chollet, the creator of Keras. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that. 
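The idea in the Word2Vec episode of describing a word by the words that appear near it can be sketched with a sliding window over the example sentence. This covers only the step of collecting (word, nearby word) pairs; word2vec itself goes on to train a shallow neural network on pairs like these to produce the vectors. The function name `context_pairs` and the window size of 2 are assumptions for illustration.

```python
from collections import Counter


def context_pairs(tokens, window=2):
    """Pair each word with every word at most `window` positions away from it."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


sentence = "howard is sitting in a starbucks cafe drinking a cup of coffee".split()
pairs = context_pairs(sentence, window=2)
cooccurrence = Counter(pairs)

print(len(pairs))
print(("cup", "coffee") in cooccurrence)
print(("howard", "coffee") in cooccurrence)
```

Within this one sentence, “cup” and “coffee” fall inside the same window while “howard” and “coffee” do not. Repeated over a large corpus, words that keep sharing contexts end up with similar vectors.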
Ep 623 DAO Hack with Matt Leising
The Decentralized Autonomous Organization (DAO) was a digital form of venture capital. It was an ambitious idea–to provide a new decentralized business model for organizing corporations on top of the Ethereum blockchain. Few people in the crypto community were opposed to this premise–but the timeline was short, the code requirements were tremendous, and in retrospect, a vulnerability was inevitable. The DAO launched in May 2016, setting the record for the largest crowdfunding event in history. The following month, the DAO was hacked, millions of dollars of Ether were stolen, and the reverberations of the event were a referendum on how the Ethereum community governs itself. Matt Leising is a reporter for Bloomberg who has chronicled the DAO in his article The Ether Thief. He continues to follow cryptocurrencies closely, as the Internet of money fractals increasingly into the public consciousness. If you like this episode, we have done many other shows about cryptocurrencies and their implications. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that. Errata: Coinbase now supports Bitcoin Cash.
Ep 622 Software Engineering Daily App with Keith and Craig Holliday
You have probably missed some of the best episodes of Software Engineering Daily. If you listen to just a few episodes a week, it can be difficult to identify the high quality shows. And if you are new to the podcast, you have no idea how to find episodes that might appeal to you. Software Engineering Daily has a discovery problem. We have 600 episodes, and much of the content is evergreen. The shows we did a year ago on Apache Spark, or Ethereum, or ReactJS are still relevant today, and they get plenty of listens. Keith and Craig Holliday built a recommendation system for Software Engineering Daily. Then they built a Software Engineering Daily iOS app to improve the experience of SE Daily listeners. You can use the SE Daily app to find the most popular episodes of this podcast, and to find episode recommendations based on what you have listened to. In this episode, Keith and Craig join the show to explain why they built an app for Software Engineering Daily. You can find all the code for the SE Daily app at github.com/softwareengineeringdaily in case you want to fork it for your own podcast–or if you want to contribute to it. Question of the Week: What is your favorite continuous delivery or continuous integration tool? Email [email protected] and a winner will be chosen at random to receive a Software Engineering Daily hoodie.
Ep 621 Attack Attribution with John Davis
When a cyber attack occurs, how do we identify who committed it? There is no straightforward answer to that question. Even if we know Chinese hackers have infiltrated our power grid with logic bombs, we might not be able to say with certainty whether those hackers were state actors or rogue Chinese hackers looking for an offensive asset to sell to their government. Even if we know someone in Russia launched an attack on the banking system in Ukraine, we might not know whether that attack came from the government or from aggressive non-governmental forces. Accurate cyberattack attribution is key to preventing diplomatic mistakes in the modern battleground of the Internet. Today’s guest John Davis is one of the authors of the report called “Stateless Attribution: Toward International Accountability in Cyberspace”. John is a senior information scientist with RAND Corporation, a non-profit institution that helps improve policy and decisionmaking through research and analysis. This report was commissioned by Microsoft, and it provides a deep assessment of our current ability to attribute a cyberattack to the perpetrator of that attack. If you like this episode, we have done many other shows about security, with guests like Bruce Schneier and Samy Kamkar. You can check out our back catalog by downloading the Software Engineering Daily app for iOS, where you can listen to all of our old episodes, and easily discover new topics that might interest you. You can upvote the episodes you like and get recommendations based on your listening history. With 600 episodes, it is hard to find the episodes that appeal to you, and we hope the app helps with that.
Ep 620 Car and IoT Security with Chris Craig
Ransomware and DDoS attacks happen all the time. Sometimes they affect large swaths of users. WannaCry ransomware froze the computer systems in hospitals. Mirai botnet DDoS attacks took down a DNS provider, making Netflix and Twitter inaccessible for a short period of time. These are innocent attacks compared to what we could face from a world where cars, heart rate monitors, and other safety critical machinery become connected to the Internet. This is not a new subject–we have covered it in previous episodes about security. But it’s a deep subject, and there is much ground to cover. Chris Craig joins the show for this episode–he is a security researcher at Oak Ridge National Lab. He studies network and cloud security, and in this episode he brings his broad expertise to subjects like IoT security, car security, and the question of standards–what do we need to standardize and certify as the internet becomes connected to physical infrastructure? Thanks to Jared Smith for the introduction. Links: When Safety and Security Become One; Standardisation and Certification of the ‘Internet of Things’
Ep 619 Artificial Intelligence APIs with Simon Chan
Software companies that have been around for a decade have a ton of data. Modern machine learning techniques are able to turn that data into extremely useful models. Salesforce users have been entering petabytes of data into the company’s CRM tool since 1999. With its Einstein suite of products, Salesforce is using that data to build new product features and APIs. Simon Chan is the senior director of product management with Einstein. He oversees the efforts to give longtime Salesforce customers new value, and the efforts to build brand new APIs for image recognition and recommendation systems, which can form the backbone of entirely new businesses. Companies spend billions of dollars on sales and marketing, and I wanted to understand where the best opportunities for Salesforce were. Simon and I spent much of our time exploring higher level applications, but we got to lower level engineering eventually. There are 600 episodes of Software Engineering Daily, and it can be hard to find the shows that will interest you. If you have an iPhone and you listen to a lot of Software Engineering Daily, check out the Software Engineering Daily mobile app in the iOS App Store. Every episode can be accessed through the app, and we give you recommendations based on the ones you have already heard.
Ep 618 Information Theory with Jimmy Soni and Rob Goodman
We write code in a language that looks like English. Whether it is JavaScript, Fortran, or assembly language, that code is an abstraction on top of layers of intermediate languages, binary, transistors, and physics. 100 years ago, this would have seemed like magic. Most of us know about Alan Turing, who described the vision of a multipurpose computer with the concept of the Turing machine. Less well known is the scientist Claude Shannon, who laid the groundwork of information theory. With information theory, we can compress data and communicate it efficiently. Jimmy Soni and Rob Goodman are the authors of “A Mind at Play,” a biography of Claude Shannon. Claude’s unique insights about information were made possible by his willingness to involve himself in lots of different areas–science, art, juggling, warfare. This interview gives insights for how we can think of new ideas by synthesizing disparate subjects. There are 600 episodes of Software Engineering Daily, and it can be hard to find the shows that will interest you. If you have an iPhone and you listen to a lot of Software Engineering Daily, check out the Software Engineering Daily mobile app in the iOS App Store. Every episode can be accessed through the app, and we give you recommendations based on the ones you have already heard.
Ep 617Healthcare AI with Cosima Gretton
Automation will make healthcare more efficient and less prone to error. Today, machine learning is already being used to diagnose diabetic retinopathy and improve radiology accuracy. Someday, an AI assistant will assist a doctor in working through a complicated differential diagnosis. Our hospitals look roughly the same today as they did ten years ago, because getting new technology into the hands of doctors and nurses is a slow process–just ask anyone who has tried to sell software in the healthcare space. But technological advancement in healthcare is inevitable. Cosima Gretton is a medical doctor and a product manager with KariusDX, a company that is building diagnostic tools for infectious diseases. She writes about the future of healthcare, exploring the ways that workflows will change and how human biases could impact the diagnostic process–even in the presence of sophisticated AI.
Ep 616Lending Machine Learning with Ofer Mendelevitch
Loans give people more financial security. If people know that they can receive a loan, they will be more willing to take intelligent risks. A loan can allow for a short-term investment that pays off enough to justify the interest rate on that loan. For the lender, a loan can be a fantastic return on capital–as long as the lendee does not default. When banks were the rulers of the financial infrastructure, most of them would err on the side of caution when it came to lending. They would adhere strictly to credit scores, and a wanting customer would be out of luck if they did not have a credit score, or if their credit score had gotten lower than acceptable. Newer fintech companies are taking advantage of data sources other than credit scores. They are using machine learning in conjunction with these new data sources to find viable lendees who would be overlooked by traditional institutions. Ofer Mendelevitch is the VP of data science at LendUp. He joins the show to explain why loans are important, how LendUp functions, and the machine learning systems that power an intelligent system of lending.
Ep 615Industrial IoT with Jayson Delancey
Sensors are being attached to trains, lightposts, and all kinds of factory equipment. Industrial machinery gives off high volumes of data that can be captured, stored, and processed with machine learning in order to improve workflows and ensure safety. Jayson Delancey works at GE, which is building tools and systems to manage large IoT deployments. The full stack for enterprise IoT involves tools for managing thousands of sensors; databases for storing all the data that is coming off of these devices; authentication and authorization systems for enforcing security. There is a lot to do. In this episode, Jayson surveys some of the technology GE is building with Predix, its industrial IoT platform. He also talked about some of the large scale IoT deployments he has seen.
Ep 614Sales Software with Jean-Baptiste Escoyez
Most products do not sell themselves. Salespeople bridge the gap between a product’s creation and a customer who purchases it. People can make a good living on the internet selling niche products–if they can find their customers. The process of taking a large group of potential customers and narrowing it down to only the subset of those customers who will buy your product is known as the sales funnel. The sales funnel consists of multiple stages–the first of which is known as “prospecting.” A salesperson doing prospecting is casting a wide net, sending emails to hundreds or thousands of people, looking for anyone who has some small probability of being interested. Without a tool for prospecting, the process can be very labor intensive. Jean-Baptiste Escoyez is the CTO at Prospect.io, a tool for sales prospecting. In this episode, we explored the process of building Prospect.io, from the high level product design to the engineering details of how it is implemented. I use Prospect.io to sell two different products so it was enjoyable to find out how one of my favorite tools works.
Ep 613Cloud-Native SQL with Alex Robinson
Applications built in the cloud are often serving requests from all around the world. A user in Hong Kong could have written to a database entry at the moment just before a user in San Francisco and a user in Germany simultaneously try to read from that database. If the user in San Francisco is allowed to see a different database entry than the user in Germany, that database is not strongly consistent. Strongly consistent databases work such that two users who read the same entry at the same time will receive the same result. Weakly consistent or “eventually consistent” databases are suitable for applications where transaction ordering is not important–photo sharing apps and ecommerce shopping carts, for example. Bank accounts, on the other hand, often need to be strongly consistent. CockroachDB is a scalable, survivable, strongly consistent database. Alex Robinson is an engineer at Cockroach Labs and he joins the show to explain the data model for CockroachDB and how it maintains strong consistency.
Ep 612Internet Extremism with Lochlan Bloom
Religious extremists use technology to recruit vulnerable individuals to a violent cause. Google is developing ways to combat this extremism through its platforms, namely YouTube. When a user looks for inflammatory religious or supremacist content, YouTube’s “Redirect Method” instead sends those users toward anti-terrorist videos. Google’s fight against extremism compelled writer Lochlan Bloom to write an article called “The Coming Battle: AI, Extremism, and the New War of Ideas.” Lochlan joins the show to discuss the societal implications of giant internet providers controlling our information flow. Lochlan is a science fiction writer, most recently of The Wave, a book that mixes existentialism with quantum physics. Our reality is defined by what we observe, and this theme courses through our conversation–from religion to Twitter to artificial intelligence.
Ep 611Advertiser Bidding with Praneet Sharma
Content websites are supported by advertising. Most of the advertisements around the internet are dynamic ad slots that change depending on the user who visits the site. Those dynamic ad slots are available to a variety of different bidders. For each ad slot, an auction occurs. The highest bidder gets to serve an ad for that slot. Praneet Sharma is the co-founder of Method Media Intelligence, which he founded with Shailin Dhar, who has been on the show several times to discuss his investigations into the world of ad fraud. I wanted to have his partner Praneet on the show to get his perspective on ad fraud and how to clean up the advertising ecosystem. One advance in dynamic advertising that we discussed is header bidding, and an open source library called PrebidJS. When an ad-supported website gets delivered to your web browser, the HTML begins to load and the JavaScript on the page begins to execute. Some of that JavaScript is calling out to advertising networks looking for the highest bidder. Until the page receives a callback for what to put in the ad slots on the page, the page will not finish loading. Sites that do not manage their ad requests appropriately suffer performance issues. Header bidding is a technique to wrap all of the requests to different advertising exchanges in a single serialized blob of code at the top of the page.
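The per-slot auction described above can be sketched in a few lines. This is a minimal illustration, not how PrebidJS or any real exchange is implemented: it assumes bids are requested concurrently and that late bids are dropped after a deadline so the page is not held up. The names `request_bid`, `run_auction`, the exchange names, prices, and the timeout value are all hypothetical.

```python
import asyncio


async def request_bid(bidder, price, delay):
    """Simulate one exchange answering a bid request after `delay` seconds."""
    await asyncio.sleep(delay)
    return bidder, price


async def run_auction(bidders, timeout=0.2):
    """Ask every bidder at once and keep only bids that arrive before the timeout."""
    tasks = [asyncio.create_task(request_bid(*b)) for b in bidders]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    # Slow bidders are dropped so they cannot delay the page.
    for task in pending:
        task.cancel()
    bids = [task.result() for task in done]
    if not bids:
        return None
    # The highest bid wins the ad slot.
    return max(bids, key=lambda bid: bid[1])


# Hypothetical bidders: (name, bid price, response delay in seconds).
bidders = [
    ("exchange_a", 1.20, 0.0),
    ("exchange_b", 2.50, 0.0),
    ("slow_exchange", 9.00, 5.0),
]
winner = asyncio.run(run_auction(bidders))
print(winner)
```

In this sketch the slow exchange offers the most money but loses because it misses the deadline, which is the tradeoff between revenue and page load time that the episode description points at.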
Ep 610Ad Fraud Overview with Shailin Dhar
The Internet runs on advertising. Advertising is subject to fraud–but then again, so is every system of online transactions. The amount of money lost in electronic payments fraud and ecommerce scamming is probably much greater than what is lost due to ad fraud. So why do we keep covering advertising fraud on Software Engineering Daily? More of our audience needs to know about ad fraud. Few people realize how much fraud there is in online advertising. In previous episodes of Software Engineering Daily, we have explained how advertising fraud works, why it is absurd and disgraceful, and why nobody talks about it. We also cover ad fraud because I personally find it interesting and sometimes hilarious. Those are the same reasons I invited Shailin Dhar to speak at the third Software Engineering Daily Meetup. Shailin has been on the show twice before and he will be on again in the future. He has made it a full time job to expose ad fraud, and he gives a great presentation on the topic in this episode.
Ep 609Similarity Search with Jeff Johnson
Querying a search index for objects similar to a given object is a common problem. A user who has just read a great news article might want to read articles similar to it. A user who has just taken a picture of a dog might want to search for dog photos similar to it. In both of these cases, the query object is turned into a vector and compared to the vectors representing the objects in the search index. Facebook contains a lot of news articles and a lot of dog pictures. How do you index and query all that information efficiently? Much of that data is unlabeled. How can you use deep learning to classify entities and add more richness to the vectors? Jeff Johnson is a research engineer at Facebook. He joins the show to discuss how similarity search works at scale, including how to represent that data and the tradeoffs of this kind of search engine across speed, memory usage, and accuracy. Notes: Jeff’s blog post about similarity search
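The idea of turning a query into a vector and comparing it against indexed vectors can be shown with a brute-force sketch. This is only an illustration under simple assumptions: it uses cosine similarity and scans every entry, which is not how a system at Facebook's scale would do it. The function names, the three-dimensional vectors, and the index contents are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, index, k=2):
    """Return the k keys whose vectors score highest against the query."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Hypothetical index: each object is already represented as a vector.
index = {
    "dog_photo_1": [0.9, 0.1, 0.0],
    "dog_photo_2": [0.8, 0.2, 0.1],
    "news_article": [0.0, 0.1, 0.9],
}

# A query vector that points in the "dog photo" direction.
print(most_similar([1.0, 0.0, 0.0], index, k=2))
```

A full scan like this costs time proportional to the size of the index, which is why the speed, memory, and accuracy tradeoffs mentioned in the description matter once the index holds a very large number of vectors.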
Ep 608Augmented Reality with Jesse Bounds and Siyu Song
Augmented reality is coming at us fast. Every large tech company is rumored to be building an AR product. Microsoft HoloLens is already available to developers. Pokemon Go, the most popular augmented reality product today, was made by a company that was spun out of Google. But Apple seems to be ahead of everyone. Apple’s ARKit is a set of tools for developers to build augmented reality applications. The applications people are building with ARKit are remarkable, and two of those early adopters join the show today for an interview. Jesse Bounds and Siyu Song work at Mapbox, a company that makes mapping, navigation, and location search SDKs. Location is a natural companion to augmented reality. If I am walking down the street with a pair of augmented reality glasses on, those glasses can augment the world with information based on my location. Because the fit between AR and mapping is so natural, Mapbox has been rapidly experimenting to build up expertise in AR. As a result, Jesse and Siyu make for great guests to talk about what engineers can build with ARKit today and what might be possible in the future.
Ep 607Error Diagnosis with James Smith
When a user experiences an error in an application, the engineers who are building that application need to find out why that error occurred. The root cause of that error may be on the user’s device, or within a piece of server-side logic, or hidden behind a black box API. To fix a complex error, we need a stack trace of contextual information so that we can correlate events across all layers of an application. James Smith is the CEO of Bugsnag, a company that makes crash reporting and error tracking software. In this episode, he describes how to diagnose errors in modern applications. He also explains how the company functions and how Bugsnag itself is built. The product consumes and stores millions of events which makes for a good discussion of software architecture. Full disclosure: Bugsnag is a sponsor of SE Daily.
Ep 606GatsbyJS with Kyle Mathews
GatsbyJS is a JavaScript framework for building web applications. Gatsby’s original goal was to allow users to create super fast static web sites that could be hosted and served efficiently at a low cost. Most web pages have components from a framework like React or Angular that need to render after the user requests them. This rendering can sometimes require additional requests to external data sources, causing the page to take longer to load. Gatsby uses GraphQL to pull in data at build time and pre-render as much of a site as possible using React’s server side rendering. When a page built with Gatsby is served to a user, as much of the page has been rendered as possible, so that the browser can quickly load everything on the page without additional network requests. Kyle Mathews is the creator of GatsbyJS. He joins the show to describe why he created Gatsby–the high level goals and low level engineering decisions. We also discuss how Kyle intends to take Gatsby beyond just an open source project and turn it into a business.