The InfoQ Podcast
377 episodes — Page 7 of 8

Mitchell Hashimoto on Consul since 1.2 and its Role as a Modern Service Mesh
In June of this year, Consul 1.2 was released. The release expanded Consul’s capability around service segmentation (controlling which services can connect east-west, and how). On this week’s podcast, Wes and Mitchell discuss Consul in detail. The two discuss Consul’s design decisions around focusing on user space networking, layer 4 routing, Go, Windows’ performance characteristics, the roadmap for eBPF on Linux, and an interesting feature that Consul implements called Network Tomography. The show wraps with Mitchell’s discussion of some of the research that HashiCorp is doing around machine learning and security with Consul.
Why listen to this podcast:
- Consul is first and foremost a centralized service registry that provides discovery. While it has a key-value store, that is Consul’s least important feature. With the June release (1.2), Consul moved further into the service mesh space with its focus on service segmentation (controlling how you connect and who can connect).
- HashiCorp attempts to limit language fragmentation in the company and has seen a lot of success leveraging Go across its platforms. Therefore, Consul is written in Go.
- Because Consul focused on layer 4 first, it is recommended to leverage the recent integration with Envoy for achieving high degrees of observability.
- All of the network routing with Consul happens in user space at this point; however, kernel space routing with eBPF is planned for the near term. The focus, at this point, is on safely cross-compiling to every platform and addressing as many use cases as possible. The focus isn’t on the high performance use cases (yet).
- For any two servers across the globe in different data centers, Consul can instantly give you the 99th percentile round-trip time between them, using a feature called Network Tomography.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2S3ZiSx You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics from professional software development. bit.ly/24x3IVq Subscribe: www.youtube.com/infoq Like InfoQ on Facebook: bit.ly/2jmlyG8 Follow on Twitter: twitter.com/InfoQ Follow on LinkedIn: www.linkedin.com/company/infoq Check the landing page on InfoQ: https://bit.ly/2S3ZiSx

Camille Fournier on Platform Engineering, Engineering Ladders, and her Book “The Manager’s Path”
On the podcast this week Charles Humble talks to Camille Fournier about running a platform team, how her current role differs from the CTO role she had at Rent the Runway, the skills developers need to acquire as they move from engineering to management positions, trends like Holacracy, and her book "The Manager's Path".
Why listen to this podcast:
- When looking for platform engineers Camille looks for people who understand what it takes to build and run distributed systems - network, availability, data - and who have customer empathy.
- The team needs to be focussed on taking the time to build robust software for operational excellence.
- The technical skills were different at Rent the Runway - these would tend to be more full-stack engineers who worked in a more iterative way.
- Much of what we do at work is really about human relationships. One thing about relationships is that they tend to be better when you have one-on-one conversations with people on a regular basis. A lot of the value of a one-on-one meeting is that you are reinforcing the social connection you have with the other person.
- One of the most important things we do as engineering managers is stay abreast of how to make teams effective in the context of delivering software.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2RJdwYR

Emmanuel Ameisen, Head of AI at Insight, on Building a Semantic Search System for Images
On this week’s podcast, Wes Reisz talks to Emmanuel Ameisen, head of AI for Insight Data Science, about building a semantic search system for images using convolutional neural networks and word embeddings, how you can build on the work done by companies like Google, and then explores where the gaps are and where you need to train your own models. The podcast wraps up with a discussion around how you get something like this into production.
Why listen to this podcast:
- A common use case is the ability to search for similar things. "I want to find another pair of sunglasses like these", "I want a cat that looks like this picture", or even a tool like Google’s Smart Reply can all be considered broadly the domain of semantic search.
- For image classification you generally want a convolutional neural network. You typically use a model pre-trained on a public data set like ImageNet to generate embeddings, running the pre-trained model up to the penultimate layer and storing the value of the activations.
- From here the idea is to mix image embeddings with word embeddings. The embeddings, whether for words or images, are just a vector that represents a thing. There are many approaches to getting vectors for words, but the one that started it is word2vec.
- For both image embeddings and word embeddings you can typically use pre-trained models, meaning that you only need to train the final step of bringing the two models together.
- Before deploying to production it is important that you validate the model against biases such as sexism, typically using outside people to carry out a thorough audit.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2RAEUrV
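The search step described in the takeaways above can be sketched as a nearest-neighbour lookup over stored embedding vectors. This is only a minimal illustration under stated assumptions: the three-dimensional vectors, the item names, and the `search` helper are made up for the example, and real embeddings from a pre-trained network would have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k items whose embeddings are closest to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical toy index: in practice these vectors would be the stored
# penultimate-layer activations of a pre-trained image model.
INDEX = {
    "cat_photo": [0.9, 0.1, 0.0],
    "kitten_photo": [0.8, 0.2, 0.1],
    "sunglasses_photo": [0.0, 0.1, 0.9],
}

# Hypothetical query embedding, e.g. the vector for the word "cat"
# after it has been mapped into the same space as the images.
QUERY = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    print(search(QUERY, INDEX))
```

A linear scan like this is fine for a handful of items; a production system would more likely use an approximate nearest-neighbour index, which the show notes do not go into.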

Ben Kehoe, Cloud Robotics Research Scientist, Discusses Serverless @iRobot
On this week’s podcast, Wes Reisz talks with Ben Kehoe of iRobot. Ben is a Cloud Robotics Research Scientist who works on using the Internet to allow robots to do more and better things. AWS and, in particular, Lambda are a core part of cloud-enabled robots. The two discuss iRobot’s cloud architecture. Some of the key lessons on the podcast include: thoughts on logging, deploying, unit/integration testing, service discovery, minimizing costs of service-to-service calls, and Conway’s Law.
Why listen to this podcast:
- The AWS Platform, including services such as Kinesis, Lambda, and IoT Gateway, was a key component in allowing iRobot to build out everything they needed for Internet-connected robots in 2015.
- Cloud-enabled Roombas talk to the cloud via the IoT Gateway (which is MQTT) and are able to perform large file uploads using mutually authenticated certificates signed via an iRobot Certificate Authority. The entire system is event-driven, with Lambda being used to perform actions based on the events that occur.
- When you’re using serverless, you are using managed infrastructure rather than building your own. That means you have to accept the limitations of the infrastructure when they exist. For example, until recently Lambda didn’t have an SQS integration. Because of that limitation, you have to find inventive ways to make things work as you want.
- Serverless is all about the total cost of ownership. It’s not just about development time, but about all the areas needed to support operating the environment.
- iRobot takes an approach of unit testing functions locally but does integration testing on a deployed set of functions. A library called Placebo helps engineers record events sent to the cloud and then replay them for local unit tests.
- For logging/tracing, iRobot packages up information that a function uses into a structured record that is sent to CloudWatch. They then pipe that into SumoLogic to be able to trace executions. Most of the difficulties that happen tend to happen closer to the edge.
- iRobot uses Red/Black deployments to have a completely separate stack when deploying. In addition, they flatten (or inline) their function calls on deployment. Both of these techniques are used as cost optimizations to prevent lambdas calling lambdas when not needed.
- Looking towards the future of serverless, there is still work to be done to offer the same feature set that more traditional applications can use with service meshes.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2NYsXNN

Vaughn Vernon on Developing a Domain Driven Design first Actor-Based Microservices Framework
Vaughn Vernon is a thought leader in the space of reactive software and Domain Driven Design (DDD). Vaughn has recently released a new open source project called vlingo. The platform is designed to support DDD at the framework and toolkit level. On today’s podcast, Vaughn discusses what the framework is all about, why he felt it was needed, and some of the design decisions made in developing the platform, including things like the architecture, actor model decisions, clustering algorithm, and how DDD is realized with the framework.
Why listen to this podcast:
- Vlingo is an open source system for building distributed, concurrent, event-driven, reactive microservices that supports (at the framework level) Domain Driven Design.
- The platform is in the early stages. It runs on the JVM. There is a port to C#. All code is pushed upstream.
- The platform uses the actor model and all messages are sent in a type-safe way.
- Vlingo supports clustering and uses a bully algorithm to achieve consensus.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2MwEWxb
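For readers unfamiliar with the bully algorithm mentioned in the last takeaway, the following is a rough sketch of the textbook version, which elects the live node with the highest ID as coordinator. It is not vlingo's implementation: the function name, the integer node IDs, and the in-process simulation of message passing are all assumptions made for illustration.

```python
def bully_election(initiator, alive):
    """Simulate a textbook bully election and return the elected coordinator.

    `alive` is the set of node IDs that currently respond to messages.
    A node that starts an election messages every node with a higher ID;
    if any of them answers, that node takes over the election. A node
    that hears no answer from above declares itself coordinator.
    """
    if initiator not in alive:
        raise ValueError("the initiating node must itself be alive")
    candidate = initiator
    while True:
        # Higher-ID nodes that would answer the ELECTION message.
        responders = [node for node in alive if node > candidate]
        if not responders:
            return candidate  # nobody higher answered: candidate wins
        # Any responder may take over; the final result is the same.
        candidate = min(responders)


if __name__ == "__main__":
    print(bully_election(1, {1, 2, 3, 5}))  # node 4 is down
    print(bully_election(2, {1, 2, 3}))     # node 5 has also failed
```

Whichever node starts the election, the outcome is the highest-numbered live node, which is what gives the algorithm its name.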

Justin Cormack on Decomposing the Modern Operating System
On today’s podcast, Justin Cormack discusses how the modern operating system is being decomposed with toolkits and libraries such as LinuxKit, eBPF, XDP, and what the kernel space service mesh Cilium is doing. Wes Reisz and Justin Cormack also discuss how Cilium differs from service meshes like Istio, Linkerd2 (previously Conduit), or Envoy. Justin is a systems engineer at Docker. He previously worked with unikernels at Unikernel Systems in Cambridge before the company was acquired by Docker.
Key Takeaways:
* LinuxKit is an appliance way of thinking about your operating system and is gaining adoption. There are contributions now from Oracle, Cloudflare, Intel, etc. Docker has seen interesting use cases such as customers running LinuxKit on large cloud providers directly on bare metal (more on this coming soon).
* The operating system of today is really unchanged since the Sun workstation of the ’90s. Yet everything else about software has really changed, such as automation, build pipelines, and delivery.
* XDP (eXpress Data Path) is a packet processing layer for Linux that lets you run fast, safe, compiled programs in the kernel using eBPF. It’s used for things like packet filtering and encapsulation/decapsulation.
* Cilium is an in-kernel, high performance service mesh that leverages eBPF. Cilium is very good at layer 4 processing, but doesn’t really do the layer 7 things that some of the other service meshes can offer (such as proxying HTTP/1 to HTTP/2).

Mike Lee Williams on Probabilistic Programming, Bayesian Inference, and Languages like PyMC3
Probabilistic Programming has been discussed as a programming paradigm that uses statistical approaches to dealing with uncertainty in data as a first class construct. On today’s podcast, Wes talks with Mike Lee Williams of Cloudera’s Fast Forward Labs about Probabilistic Programming. The two discuss how Bayesian Inference works, how it’s used in Probabilistic Programming, production-level languages in the space, and some of the implementations/libraries that we’re seeing.
Key Takeaways:
* Federated machine learning is an approach of developing models at an edge device and returning just the model to a centralized location. By taking the averages of the edge models, you can protect privacy and distribute the processing of building models.
* Probabilistic Programming is a family of programming languages that make statistical problems easier to describe and solve.
* It is heavily influenced by Bayesian Inference, an approach to experimentation that turns what you know before the experiment and the results of the experiment into concrete answers on what you should do next.
* The Bayesian approach to unsupervised learning comes with the ability to measure uncertainty (or the ability to quantify risk).
* Most of the tooling used for Probabilistic Programming today is highly declarative. “You simply describe the world and press go.”
* If you have a practical, real-world problem today for Probabilistic Programming, Stan and PyMC3 are two languages to consider. Both are relatively mature languages with great documentation.
* Prophet, a time-series forecasting library built at Facebook as a wrapper around Stan, is a particularly approachable place to use Bayesian Inference for general-purpose forecasting use cases.
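To make the "what you know before the experiment plus the results of the experiment" idea concrete, here is a small hand-worked sketch of a Bayesian update for a success rate, using the standard Beta-Binomial conjugate pair. It deliberately does not use Stan or PyMC3, which automate this kind of inference for models with no closed-form answer; the function names and the numbers are assumptions chosen for the example.

```python
def update_beta(prior_alpha, prior_beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability, the posterior
    after s successes and f failures is Beta(alpha + s, beta + f).
    """
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: the point estimate."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution: a measure of uncertainty."""
    total = alpha + beta
    return (alpha * beta) / (total * total * (total + 1))


if __name__ == "__main__":
    # Prior belief: Beta(1, 1), i.e. every success rate is equally plausible.
    # Experiment (made-up data): 7 successes and 3 failures.
    alpha, beta = update_beta(1, 1, successes=7, failures=3)
    print("posterior parameters:", alpha, beta)
    print("estimated success rate:", round(beta_mean(alpha, beta), 3))
    print("uncertainty (variance):", round(beta_variance(alpha, beta), 4))
```

The variance is the "ability to measure uncertainty" the takeaways refer to: collecting more data shrinks it, which a single point estimate would not show.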

Uncle Bob Martin on Clean Software, Craftsperson, Origins of SOLID, DDD, & Software Ethics
Wes Reisz sits down and chats with Uncle Bob about The Clean Architecture, the origins of the Software Craftsperson Movement, Livable Code, and even ethics in software. Uncle Bob discusses his thoughts on how The Clean Architecture is affected by things like functional programming, service meshes, and microservices.
Why listen to this podcast:
* Michael Feathers wrote to Bob and said if you rearrange the order of the design principles, it spells SOLID.
* Software Craftsperson should be used when you are talking about software craftsmanship in a gender-neutral way to steer clear of anything exclusionary.
* Clean Architecture is a way to develop software with low coupling and is independent of implementation details.
* Clean Architecture and Domain Driven Design (DDD) are compatible terms. You would find the ubiquitous language and bounded context of DDD at the innermost circles of a clean architecture.
* Services do not form an architecture. They form a deployment pattern that is a way of decoupling and therefore has no impact on the idea of clean architecture.
* There is room for “creature comforts” in a code base that make for more livable, convenient code.
* “We have no ethics that are defined [in software].” If we don’t find a way to police it ourselves, governments will. We have to come up with a code of ethics.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2Nebspj

Arun Gupta on Managed Container Control Planes on AWS
Arun Gupta discusses with Wes Reisz some of the container-focused services that AWS offers, including differentiating ECS and EKS. Arun goes into some detail on the role that Amazon Fargate plays and the goals behind EKS. Arun wraps up by discussing some of the open source work that AWS has recently been doing in the container space.
Why listen to this podcast:
- ECS & EKS are both managed control planes; Amazon Fargate is a technology used to provision clusters.
- ECR is the Amazon Container registry (similar to the Docker Registry).
- EKS is an opinionated way of running a Kubernetes cluster on AWS. It is a highly available managed control plane available on US East 1 and US West 2.
- EKS uses a split account. The control plane runs in an Amazon account and the workers run in the customer’s account.
- Upstream compatibility is a core tenet of EKS.
Check the landing page on InfoQ: https://bit.ly/2tWT8t9

Anastasiia Voitova on Cryptography and the Design of Cryptographic Libraries
In this podcast Wes Reisz is talking to Anastasiia Voitova, known as @vixentael in the security communities. She started her career as a mobile application developer, and in recent years has moved to focus mainly on designing and developing graphics software. We’re going to talk about cryptography, how to design libraries to be usable by developers, and designing cryptographic libraries. We’ll also discuss her talk from the recent QCon New York, called “Making Security Usable”.
Why listen to this podcast:
- Choosing a good encryption algorithm isn’t enough - the parameters need to be chosen carefully as well
- Algorithms like MD5 should not be used for hashing any more
- Security is not just the encryption layer - it is the design of the whole system
- Backups should be encrypted as well
- Logs may contain sensitive GDPR data and need to be processed accordingly
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2yRWdQc

Matt Klein on Lyft’s Envoy, Including Edge Proxy, Service Mesh, & Potential AI Use Cases
On today’s podcast, Wes Reisz talks to Matt Klein about Envoy. Envoy is a modern, high performance, small footprint edge and service proxy. While it was originally developed at Lyft (and still drives much of their architecture), it is a fully open source, community-driven project. On this podcast Matt addresses what he sees as the major design goals of Envoy, answers questions about a sidecar’s performance impact, discusses observability, and thinks out loud on the future of Envoy.
Why listen to this podcast:
- Envoy’s goal is to abstract the network from application programmers. It’s really about helping application developers focus on building business logic and not on the application plumbing.
- Envoy is a large community-driven project, not a cohesive product that does one thing. It can be used as a foundational building block to extend into a variety of use cases, including as an edge proxy, as a service mesh sidecar, and as a substrate for building new products.
- While there is a performance cost for using sidecar proxies, the rich feature set is often a worthwhile tradeoff. With that said, there is work being done that is greatly improving Envoy’s performance.
- Envoy is built to run Lyft. There were no features in Envoy when it was open sourced that were not used at Lyft.
- Envoy emits a rich set of logs and has a pluggable tracing system. Observability comes first and is one of the main project goals.
- Lyft deploys Envoy master twice per week.
- Envoy’s roadmap includes work on automating settings (rate limits and retries), a focus on ease of operation (such as where things got routed and what the internal timings were), and additional protocol support such as Kafka.
Check the landing page on InfoQ: https://bit.ly/2tmUKMl

Pam Selle on Serverless Observability
On this podcast, Pam Selle (an engineer for IOPipe who builds tooling for serverless observability) talks about the case for serverless and the challenges of developing observability solutions. Some of the things discussed on the podcast include tips for creating boundaries between serverless and non-serverless resources and how to think of distributed tracing in serverless environments.
Why listen to this podcast:
- Coca-Cola was able to see a productivity gain of 29% by adopting serverless (as measured by the amount of time spent on business productivity applications).
- Tooling for serverless is often a challenge because resources are ephemeral. To address the ephemeral nature of serverless, you need to think ahead of time about what information you will need to log.
- Monitoring should focus on events important to the business.
- Build barriers between serverless and flat scaling non-serverless resources to prevent issues. Queues are an example of ways to protect flat scaling resources.
- In-memory caches are a handy way to help serverless functions scale when fronting databases.
- There are limitations with tracing and profiling on serverless. Several external products are available to help.
- Serverless (and Microservices) are not for every solution. If you are choosing between two things, and one of them lets you ship and the other does not, choose the thing that lets you ship.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2Jc6FXc

Serverless and the Serverless Framework with David Wells
The Serverless Framework is quickly becoming one of the more popular frameworks used in managing serverless deployments. David Wells, an engineer working on the framework, talks with Wes Reisz about serverless adoption and the use of the open source Serverless Framework. On this week’s podcast, the two dive into what it looks like to use the tool, the development experience, and why a developer might want to consider a tool like the Serverless Framework, and finally wrap up with what the tool offers in areas like CI/CD, canaries, and blue/green deployment.
Why listen to this podcast:
- Serverless allows you to focus on the core business functionality and less on the infrastructure required to run your systems.
- Serverless Framework allows you to simplify the amount of configuration you need for each cloud provider (for example, you can automate much of the configuration required for CloudFormation with AWS).
- Serverless Framework is an open source CLI tool that supports all major cloud providers and several on-prem solutions for managing serverless functions.
- The serverless space has room to grow in offering a local development space. Much of the workflow today involves frequent deploys and scoping the deployment for different stages.
- Serverless Framework is open source and invites contributions from the community.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2IWayeE

Colin Eberhardt on WebAssembly
In this podcast Wes Reisz talks to Colin Eberhardt, the Technology Director at Scott Logic, about what WebAssembly (WASM) is, a bit of the history of JavaScript, and plans for WebAssembly 2.0 including the threading model and GC.
Why listen to this podcast:
- WebAssembly brings another kind of virtual machine to the browser that is a much more low-level language.
- One of the goals of WebAssembly is to make a new assembly language that is a compilation target for a wide range of other languages such as C++, Java, C# and Rust. C++ is highly mature, Rust is maturing rapidly. Java and C# are a little further behind because of the lack of garbage collection support in WebAssembly. At some point in the future WebAssembly will have its own garbage collection, perhaps by using the JavaScript garbage collector.
- At runtime you use JavaScript to invoke functions that are exported by your WebAssembly instance. It should be noted that at the moment there is quite a lot of complexity involved in interfacing between WebAssembly and JavaScript. A lot of this complexity comes from the type system.
- WebAssembly only supports four types - 2 integer types and 2 floating point types. To model strings you share the same piece of linear memory - memory that can be read from and written to by both WebAssembly and JavaScript.
- WebAssembly is still a very young technology. Future plans include threading support, garbage collection support, and multiple value returns.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2G8QtzB

Martin Thompson on Aeron, Binary vs Text for Message Encoding, and Raft
Martin Thompson discusses consensus in distributed systems, and how Aeron uses Raft for clustering in the upcoming release. Martin is a Java Champion with over two decades of experience building complex and high-performance computing systems. He is most recently known for his work on Aeron and Simple Binary Encoding (SBE). Previously at LMAX he was the co-founder and CTO when he created the Disruptor.
Why listen to this podcast:
* Aeron is a messaging system designed for modern multi-core hardware. It is highly performant with a first class design goal of making it easy to monitor and understand at runtime. The product is able to simultaneously achieve the lowest latency and highest throughput of any messaging system available today.
* Aeron uses a binary format on the wire rather than a text based protocol. This is largely done for performance reasons. Text is commonly used in messaging to make debugging simpler, but the debugging problem can be solved using tools like Wireshark and the dissectors that come with it.
* A forthcoming release of Aeron will support clustering. Raft was chosen over Paxos for this since it is more strict. This means that there are fewer potential states the system can be in, making it easier to reason about.
* Raft is an RPC-based protocol, expecting synchronous interactions. Aeron is asynchronous by its nature, but the underlying Aeron protocol was designed to support consensus, meaning that a lot of things which would typically need to be done synchronously can be done asynchronously and/or in parallel.
* Static clusters will be added first to Aeron, with dynamic clustering after that, and then cryptography, again with the intention of keeping the latency low and the throughput high.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2Ilewk5

Building a Data Science Capability with Stephanie Yee, Matei Zaharia, Sid Anand and Soups Ranjan
In this podcast, recorded live at QCon.ai, Principal Technical Advisor & QCon Chair Wes Reisz and InfoQ Editor-in-chief Charles Humble chair a panel discussion with Stephanie Yee, data scientist at Stitch Fix, Matei Zaharia, professor of computer science at Stanford and chief scientist at Databricks, Sid Anand, chief data engineer at PayPal, and Soups Ranjan, director of data science at Coinbase.
Why listen to this podcast:
- Before you start putting a data science team together make sure you have a business goal or question that you want to answer. If you have a specific question, like increasing lift on a metric, or understanding customer usage patterns, you know where you can get the data from, and you can then figure out how to organise that data.
- You need to make sure you have the right culture for the team - and find people who are excited about solving the business problems and interested in them. Also look at the environment you are going to provide.
- Your first hire shouldn’t be a data scientist (or quant). You need support to productionise the models - and if you don’t have a colleague to help productionise them then don’t hire the quant first.
- Given the scarcity of talent it is worth remembering that Data Scientists come from a variety of different backgrounds - some people have computer science backgrounds, some may be astrophysicists or neuroscientists who approach problems in different ways.
- There are two common ways to structure a data science team: one is a vertical team that does everything; the other, more common in large companies, is when you have a separate data science team and an infrastructure team.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2Jym1RI

Streaming: Danny Yuan on Real-Time, Time Series Forecasting @Uber
On this week’s podcast, Danny Yuan, Uber’s Real-time Streaming/Forecasting Lead, lays out a thorough recipe book for building a real-time streaming platform with a major focus on forecasting. Danny discusses everything from the scale Uber operates at to the major steps for training and deploying models in an iterative (almost Darwinistic) fashion, and wraps with his advice for software engineers who want to begin applying machine learning in their day-to-day job.
Why listen to this podcast:
* Uber processes 850,000 - 1.3 million messages per second in their streaming platform with about 12 TB of growth per day. The system’s queries scan 100 million to 4 billion documents per second.
* Uber’s frontend is mobile. The frontend talks to an API layer. All services generate events that are shuffled into Kafka. The real-time forecasting pipeline taps into Kafka to process events and stores the data in Elasticsearch.
* There is a federated query layer in front of Elasticsearch to provide OLAP query capabilities.
* Apache Flink’s advanced windowing features, programming model, and checkpointing convinced Uber to move away from the simplicity of Apache Samza.
* The forecasting system allows Uber to remove the notion of delay by using recent signals plus historical data to project what is happening now and what will happen in the future.
* Uber’s pipeline for deploying ML models: HDFS, feature engineering, organizing into data structures (similar to data frames), deploy mostly offline training models, train models, & store into a container-based model manager.
* A model serving layer is used to pick which model to use, forecasting results are stored in an OLAP data store, a validation layer compares real results against forecast results to verify the model is working as desired, and a rollback feature enables poorly performing models to be automatically replaced by the previous one.
* “Without output, you don’t have input.” If you want to start leveraging machine learning, developers just need to start doing. Start with intuition and practice. Over time ask questions and learn what you need, then apply a laser focus to gain that knowledge.
Check the landing page on InfoQ: https://bit.ly/2GJQbUo
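The validation and rollback step described in the notes above can be pictured with a small sketch. This is not Uber's implementation: the `ModelRegistry` class, the `validate` function, the error metric (mean absolute percentage error), and the 20% threshold are all hypothetical choices made for illustration.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual| over points with a non-zero actual."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return sum(errors) / len(errors) if errors else 0.0


class ModelRegistry:
    """Hypothetical stand-in for a model manager that remembers deployment order."""

    def __init__(self):
        self.versions = []

    def deploy(self, name):
        self.versions.append(name)

    @property
    def active(self):
        return self.versions[-1]

    def rollback(self):
        # Drop the newest model, but never remove the last remaining one.
        if len(self.versions) > 1:
            self.versions.pop()
        return self.active


def validate(registry, actual, forecast, max_error=0.2):
    """Compare real results with forecasts; roll back if the error is too high."""
    error = mean_absolute_percentage_error(actual, forecast)
    if error > max_error:
        registry.rollback()
    return error
```

Calling `validate` after each batch of real results keeps the decision automatic: a model whose forecasts drift too far from reality is replaced by its predecessor without anyone having to intervene.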

Sander Mak on the Java Module System
Sander Mak and Wes Reisz discuss the Java module system and how adoption is going. Topics discussed on this podcast include Java modularity steps / migrations, greenfield projects, some of the concerns that caused the EC to initially vote no on Java 9, and a new tool for building custom JREs called jlink. Additionally, as Java 10 was recently released, a short segment at the end covers some of the latest Java news.
Why listen to this podcast:
• People quickly moved to Java 8 because of features like Streams and Lambdas. Java 9 has a different story around modularity and application architecture. Adoption is slower and more intentional.
• Migrating large codebases to use modularity is hard. Many of the projects using modules are greenfield, and those large codebases that are moving now are most often using the classpath.
• jlink is a new command line tool released with Java 9. It allows developers to create their own lightweight, customized JRE for a module-based Java application.
• The Java version scheme has dropped the 1.* prefix. Future releases of the JDK will lead with the version number and follow the form *.0.1 (e.g. 9.0.1).
• While the module system will likely show its benefit mostly for new development, many third-party libraries are moving to adopt modularity and removing their dependencies on JDK internal APIs. This is improving the experience for teams adopting modularity.
• There are no known open JEPs regarding the enhancement of the Java module system.
• Java 10 has been released. The release features changes to the freely available Java versions, local variable type inference (var), the experimental Graal JIT compiler, application class data sharing, improved container support/awareness, and others.
More on this: Quick scan our curated show notes on InfoQ https://bit.ly/2DQ7ptx

Jendrik Joerdening and Anthony Navarro on Self-Racing Cars Using Deep Neural Networks
Jendrik Joerdening and Anthony Navarro describe how a team of 18 Udacity students entered a self-racing car event. They had very limited experience of building autonomous control systems for vehicles and had just 6 weeks to do it, with only 2 days with the physical car. They describe the architecture, how they co-ordinated a very diverse team, and how they trained the models.
Why listen to this podcast:
- Last year a team of 18 Udacity Self-Driving Cars students competed at the 2017 Self Racing Cars event held at Thunderhill Raceway in California.
- The students had all taken the first term of a three term program on Udacity which covers computer vision and deep learning techniques.
- The team was extremely diverse. They co-ordinated the work via Slack with a team in 9 timezones and 5 different countries.
- The team developed a neural network using Keras and TensorFlow which steered the car based on the input from just one front-facing camera in order to navigate all turns on the racetrack.
- They received a physical car two days before the start of the event.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2DykAiJ

Andrea Magnorsky on Paradigm Shifts and the Adoption of Programming Languages
On this podcast, we talk with Andrea Magnorsky, a tech lead on the engineering squads at Goodlord. She has a background in Scala and C#, and has organised conferences. Today we’ll be talking about paradigm shifts.
Why listen to this podcast:
* A programming paradigm has a loose definition. It’s just about finding a way of doing things.
* There are a number of different ways to think about problems - and different paradigms do this in different ways.
* To shift paradigms, you have to un-learn some of your instincts.
* When adopting a new paradigm, if people don’t want to learn anything, then they won’t.
* Multiple paradigms help you apply different ways of thinking about solutions to problems because solutions vary across languages.
* A quick way to start gaining knowledge and adoption for a new language is to use it as a test harness for your existing code.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2oPFG71

Anne Currie on Organizational Tech Ethics, including Scale, GDPR, Algorithmic Transparency
On this podcast, Anne Currie joins the tech ethics discussion started on the Theo Schlossnagle podcast from a few weeks ago. Wes Reisz and Anne discuss issues such as the implications (and responsibilities) of the massive amount of scale we have at our fingertips today, potential effects of GDPR (EU privacy legislation), how accessibility is an example of how we could approach tech ethics in software, and much more.
Why listen to this podcast:
- Ethics in software today is particularly important because of the scale we have available with cloud native architectures.
- Accessibility offers a good approach to how we can evolve the discussion on tech ethics with aspects that include both a carrot and a stick.
- Bitcoin mining power consumption is an example of something we never considered to have such negatives.
- The key to establishing what we all should and shouldn’t be doing with tech ethics is to start conversations and share our lessons with each other.
If you want to find out what every software developer, data scientist, or ops engineer should know about GDPR, download our free guide "Perspectives on GDPR": https://bit.ly/2FRvLnP
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2FtgdIy

Oliver Gould on Service Mesh for Microservices, LinkerD, and the Recently Released Conduit
This week on The InfoQ Podcast, Wes Reisz talks with Oliver Gould, the CTO of Buoyant. Buoyant is the maker of the LinkerD service mesh and the recently released Conduit. In the podcast, Oliver defines a service mesh, clarifies the meaning of the data and control planes, discusses what a service mesh can offer microservice application owners, and, finally, discusses some of the considerations they took into account when developing Conduit.
Why listen to this podcast:
- A service mesh is dedicated infrastructure that handles interservice communication.
- There are two components to a service mesh: the data plane handles communication and the control plane is about policy and config.
- LinkerD and Conduit are two open source service meshes made by Buoyant. Conduit has a small memory footprint and provides a convention over configuration approach to service mesh deployment.
- Adopting Rust (the language used for implementing the data plane in Conduit) requires thinking of memory differently, and the best way to adopt Rust is to read other people’s code.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2skWF61

Theo Schlossnagle on Software Ethics and the Presence of Doing Good
This week's podcast features a chat with Theo Schlossnagle. Theo is the CEO of Circonus and co-chairs the ACM Queue. In this podcast, Theo and Wes Reisz chat about the need for ethical software, and how we as technical leaders should be reasoning about the software we create. Theo says, "it's not about the absence of evil, it's about the presence of good." He challenges us to develop rigor around ethical decisions we make in software just as we do for areas like security. With the incredible implications of machine learning and AI in our future, this week's podcast touches on topics we should all consider in the systems we create.
Why listen to this podcast:
- The ubiquitous societal impact of computers is surfacing the need for deeper conversations on software ethics.
- Ethics are a set of constructs and constraints to help us reason about right and wrong.
- Algorithmic interpretability of models can be difficult to reason about; however, accountability for algorithms can be enforced in other ways.
- Questions to be considered when writing software should evolve into: What am I building, why am I building it, and who will it hurt?
- Ethics in software will take industry reform, deeper conversations, and developing a culture of questioning the software we’re building.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2BZAC4p

Chris Swan on DevOps and NoOps, plus Operations and Code Validation in a Serverless Environment
On this week’s podcast, Wes Reisz talks with Chris Swan. Chris is the CTO for the global delivery organisation at DXC Technology. Chris is well versed in DevOps, Infrastructure, Culture, and what it means to put all these together. Today’s topics include DevOps, NoOps, and what Chris calls LessOps; what Operations means in a world of Serverless; and where he sees Configuration Management, Provisioning, Monitoring and Logging heading. The podcast then wraps with how he sees code being validated in a serverless deployment, using techniques such as canaries and blue-green deployments.
Why listen to this podcast:
* Serverless still requires ops - even if the ops aren’t focused on the technology
* Even with minimal functions, the amount of configuration may exceed the amount of function code by a factor of three
* Disruptive services often move the decimal point
* ML is the ability to make inferences and AI is the ability to make decisions based on those inferences
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2Bff4jU

Architecting a Modern Financial Institution with Vitor Olivier, Thoughts on Immutability, CI/CD, FP
This week’s podcast features a chat with Vitor Olivier. Vitor is a partner at Nubank (a technology-centric bank in Brazil). This podcast hits on topics from several of Nubank’s recent QCon talks and includes things like: Nubank’s stack, functional programming, event sourcing, defining service boundaries, recommendations on reasoning about services, tips (or tweaks) on the second iteration of their initial architecture, and more.
Why listen to this podcast:
- Property-based testing and Schemas (or clojure.spec) are complementary.
- Clojure’s functional nature and Datomic’s features are a match for Nubank’s requirements.
- A (micro)service needs to be able to create the full representation of the core feature it’s handling.
- GraphQL is useful to abstract away the distributed system complexity from the mobile (or frontend) developers.
- Nubank uses a combination of monitoring and sanity checks in real time at various levels to keep systems consistent.
- Once an invariant is broken, the system will try to fix it automatically.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2mnqyfK

Charles Humble and Wes Reisz Take a Look Back at 2017 and Speculate on What 2018 Might Have in Store
In this podcast Charles Humble and Wes Reisz talk about Java 9 and beyond, Kotlin, .NET Core 2, the surge in interest in organisational culture, quantum computing and more.
Why listen to this podcast:
- Java had a big year with Java 9 shipping, Java EE going open-source and moving to Eclipse as EE4J, and IBM open-sourcing J9. From next year the platform will also be on a bi-annual release cycle with the next two versions (expected to be Java 10 and 11) both shipping during 2018.
- Kotlin joined Scala, Clojure, and Groovy as a strong alternative language for the JVM, particularly for mobile, where it was buoyed by Google’s official blessing of it as a language for Android development at Google IO.
- On InfoQ we also saw a big surge in interest around .NET linked to .NET Core 2, and at both InfoQ and QCon San Francisco we also saw an upsurge in interest around organizational culture, with one of the culture tracks (the Whole Engineer) moving to one of the larger rooms.
- We started to see quantum computers emerging from the labs, with IBM making a 16 qubit quantum processor available via their cloud for developers to play with, and the corresponding library available for Python on GitHub.
- Another major trend from the year was the availability of machine learning libraries for software developers to build and train models.
Check the landing page on InfoQ: http://bit.ly/2ljlBVH

Kolton Andrus on Gremlin’s Newly Announced SaaS Chaos Engineering Product and Running Game Days
Gremlin is a Software as a Service product, built by engineers with experience from Netflix, AWS, Dropbox and others, that lets you plan, control and undo chaos engineering experiments. In this podcast Wes talks to Kolton Andrus about the Gremlin product and architecture and related topics such as running Game Days.

Fast Data with Dean Wampler
In this podcast, Dean Wampler discusses fast data, streaming, microservices, and the paradox of choice when it comes to the options available today for building data pipelines.
Why listen to this podcast:
* Apache Beam is fast becoming the de-facto standard API for stream processing
* Spark is great for batch processing, but Flink is tackling the low-latency stream processing market
* Avoid running blocking REST calls from within a stream processing system - have them asynchronously launched and communicate over Kafka queues
* Visibility into telemetry of stream processing systems is still a new field and under active development
* The fast data platform is easily launched on an existing or new Mesosphere DC/OS runtime
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2BYTMbI

Changhoon Kim on Programmable Networking Switches with PISA and the P4 DSL
In this podcast, Werner Schuster talks to Changhoon Kim, who is a Director of System Architecture at Barefoot Networks and is active in the P4 Language Consortium. They talk about the new PISA (Protocol Independent Switch Architecture), which promises multi-terabit switching, and P4, a domain-specific programming language designed for networking.

Apache Beam Founder Tyler Akidau Discusses Streaming System and Their Complexities
In this podcast, we are talking to Tyler Akidau, a senior engineer at Google who leads the technical infrastructure and data processing teams in Seattle, a founding member of the Apache Beam PMC, and a passionate voice in the streaming space. This podcast covers data streaming and the 2015 DataFlow Model streaming paper [http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf] and many of the concepts it covers, such as why dealing with out-of-order data is important, event time versus processing time, and windowing approaches, and finally previews the track he is hosting at QCon SF next week.
Why listen to this podcast:
- Batch processing and streaming aren’t two incompatible things; they are a function of different windowing options.
- Event time and processing time are two different concepts, and may be out of step with each other.
- Completeness is knowing that you have processed all the events for a particular window.
- Windowing choice can be answered from the what, when, where, how questions.
- Unbounded versus bounded data is a better dimension than stream or batch processing.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2AyBTAb
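To make the event time versus processing time distinction concrete, here is a minimal sketch in plain Python. It does not use the Apache Beam API; the 60-second fixed window, the field names, and the sample events are assumptions chosen for illustration.

```python
from collections import Counter

WINDOW_SECONDS = 60  # assumed fixed (tumbling) window size


def window_start(timestamp):
    """Return the start of the fixed window that contains the timestamp."""
    return timestamp - (timestamp % WINDOW_SECONDS)


def count_per_window(events, time_field):
    """Count events per window, keyed by whichever timestamp field is chosen."""
    return dict(Counter(window_start(e[time_field]) for e in events))


# Events in arrival order. The third one happened at t=50 but only arrived at t=125.
EVENTS = [
    {"event_time": 5, "processing_time": 6},
    {"event_time": 65, "processing_time": 66},
    {"event_time": 50, "processing_time": 125},  # late, out-of-order arrival
    {"event_time": 70, "processing_time": 126},
]

by_event_time = count_per_window(EVENTS, "event_time")
by_processing_time = count_per_window(EVENTS, "processing_time")
```

Grouping by event time puts the late event back into the window in which it actually happened, while grouping by processing time attributes it to the window in which it arrived. The same four events therefore produce different per-window counts depending on which clock is used.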

Guy Podjarny on OSS Security, Serverless, and the Equifax Hack
In this podcast, Wes talks to Guy Podjarny (Founder/CEO of Snyk). The two discuss the space between open source software and third-party dependencies, including a discussion of the Equifax hack (and what we can learn from it), the role of serverless architectures today (and what it means for application surface area), and then finally they wrap with security hygiene best practices with OSS and serverless.
Why listen to this podcast:
- The majority of security vulnerabilities that exist in applications today come from vulnerable third-party libraries, rather than the application’s own code.
- An application shouldn’t permit total leak of all data because of a single vulnerability - defence in depth is important.
- Equifax couldn’t have failed more spectacularly in the way they handled it.
- The Equifax hack serves as a wake-up call to pay attention to vulnerabilities in dependencies.
- If your build system automatically breaks the build when a dependency vulnerability is found, the fix will be applied sooner.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2ziAIat

Julien Viet on the Newly Released Eclipse Vert.x 3.5.0 and Plans for Vert.x 4.0
In this podcast, QCon Chair Wesley Reisz talks to Julien Viet. Viet is the project lead for Vert.x and a principal engineer at Red Hat, having taken over as project lead from Tim Fox in January 2016. They talk about the newly released Vert.x 3.5.0, and the plans for Vert.x 4.0.
Why listen to this podcast:
* Vert.x adds RxJava2 support for streams and backpressure.
* Vert.x is a polyglot set of APIs, custom aligned for the specific language.
* It is unopinionated and can be used in any environment, since it doesn’t enforce a particular framework.
* Verticles communicate in-VM or through peer-to-peer networking for distributed applications.
* Vert.x 4.0 is on the roadmap for the future.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2z0BEQR

Incident Response Across Non-Software Industries with Emil Stolarsky
What can software learn from industries like aerospace, transportation, or even retail during national disasters? This week’s podcast is with Emil Stolarsky and was recorded live after his talk on the subject at Strange Loop 2017. Interesting points from the podcast include several stories from Emil’s research, including the origin of the checklist, how Walmart pushed decision making down to the store level in a national disaster, and where the formalized conversation structure onboard aircraft originated. The podcast mentions several resources you can turn to if you want to learn more and wraps with some of the ways this research is affecting incident response at Shopify.
Why listen to this podcast:
* Existing industries like aerospace have built a working history of how to resolve issues; it can be applicable to software issues as well.
* Crew Resource Management helps teams work together and take ownership of problems that they can solve, instead of a command-and-control mandated structure.
* Checklists are automation for the brain.
* Delegating authority to resolve system outages removes bottlenecks in processes that would otherwise need managerial sign off.
* When designing an alerting system, make sure it doesn’t flood with irrelevant alerts and that there’s clear observability into what is going wrong.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2zmCsfR

Charity Majors on Honeycomb.io, the Social Side of Debugging and Testing in Production
In this podcast, recorded live at Strange Loop 2017, Wes talks to Charity Majors, cofounder and CEO of honeycomb.io. They discuss the social side of debugging and her Strange Loop talk “Observability for Emerging Infra: What got you Here Won't get you There”. Other topics include advice for testing in production, shadowing and splitting traffic, and sampling and aggregation.
Why listen to this podcast:
- Statistical sampling allows for collecting more detailed information while storing less data, and can be tuned for different event types.
- Testing in production is possible with canaries, shadowing requests, and feature switches.
- Pulling data out of systems is just noise - it becomes valuable once someone has looked at it and indicates the meaning behind it.
- Instrumenting isn’t just about problem detection - it can be used to ask business questions later.
- You can get 80% of the benefit from 20% of the work in instrumenting the systems.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2y6OP1b
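The point about sampling tuned per event type can be sketched in a few lines of Python. This is not Honeycomb's implementation: the `SAMPLE_RATES` table, the event shape, and the function names are hypothetical. The idea shown is that each kept event records its sample rate, so original volumes can still be estimated from far fewer stored events.

```python
import random

# Hypothetical per-event-type sample rates: keep roughly 1 in N events.
SAMPLE_RATES = {"error": 1, "ok": 20}


def sample(events, rng):
    """Keep each event with probability 1/rate and record the rate on kept events."""
    kept = []
    for event in events:
        rate = SAMPLE_RATES.get(event["type"], 1)
        if rng.random() < 1.0 / rate:
            kept.append({**event, "sample_rate": rate})
    return kept


def estimated_counts(kept):
    """Estimate original counts: each kept event stands in for sample_rate events."""
    totals = {}
    for event in kept:
        totals[event["type"]] = totals.get(event["type"], 0) + event["sample_rate"]
    return totals


events = [{"type": "ok"}] * 10_000 + [{"type": "error"}] * 25
kept = sample(events, random.Random(42))  # seeded so the run is repeatable
estimates = estimated_counts(kept)
```

Rare, interesting events (errors here) are kept in full, while high-volume routine events are stored at a fraction of their volume. The estimate for the routine events is approximate, which is the trade-off the sampling makes.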

Nora Jones on Establishing, Growing, and Maturing a Chaos Engineering Practice
Nora Jones, a senior software engineer on Netflix’ Chaos Team, talks with Wesley Reisz about what Chaos Engineering means today. She covers what it takes to build a practice, how to establish a strategy, defines cost of impact, and covers key technical considerations when leveraging chaos engineering.
Why listen to this podcast:
- Chaos engineering is a discipline where you formulate hypotheses, perform experiments, and evaluate the results afterwards.
- Injecting a bit of failure over time is going to make your system more resilient in the end.
- Start with Tier 2 or non-critical services first, and build up success stories to grow chaos further.
- As systems become more and more distributed, the need for chaos engineering grows.
- If you’re running your first experiment, get your service owners in a war room and get them to monitor the results of the test as it is running.
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2vJoimw

Shubha Nabar Discusses Einstein, the Machine Learning System in Salesforce
Shubha Nabar is a senior director of data science for Salesforce Einstein. Prior to working for Salesforce, she was a data scientist at LinkedIn and Microsoft. In the podcast she discusses Salesforce Einstein and the problem space that they are trying to solve, explores the differences between enterprise and consumer for machine learning, and then talks about the Optimus Prime Scala library that they use in Salesforce.
Why listen to this podcast:
* The volume of data and advances in hardware have made it possible to do machine learning, and to do it a lot faster.
* AI is a science of building intelligent software, encompassing many aspects of intelligence that we tend to think of as human.
* If you can’t measure something, you can’t fix it.
* You have to think about what you can automate, rather than having a human try to engineer all those features.
* Get feedback on design.
Check the landing page on InfoQ: http://bit.ly/2xK7OxR

Simon Brown on the Role of the Software Architect in a Continuous Delivery Environment
This week's podcast features Simon Brown, well known for his work training software architects. Topics include the differences between a tech lead and an architect, how much documentation is enough, and what that looks like in a continuous delivery environment.
What you'll learn on this podcast:
• As an industry we seem to have lost our knowledge of how to do architecture well in the context of modern agile software teams.
• Architecture is about the expensive decisions; things that are costly to change later.
• Ideally architects should code in the production code base. If you are not able to do this, at least be involved in quality reviews and peer reviews of the production code so you can get feedback on your designs.
• It is often said that the code is the only documentation you need, but the code can’t tell you everything. You do need to document the things you can’t get from the code, such as the architectural drivers, the key quality attributes and so on, along with some high level diagrams and how you operate the system.
• As you step into the role of architect, go and find a mentor or a local meet-up. The major change is that you have to influence and lead people.
This podcast is sponsored by AppDynamics. Software architects play a critical role in designing, executing, and migrating large infrastructures to the cloud. Download AppDynamics’ free eBook “10 Tips for Enterprise Cloud Migration” and launch your migration project with a proven plan. Download the eBook now at http://infoq.link/web_sndcld_appdynamics
Check the landing page on InfoQ: http://bit.ly/2xvq7qM

Twitter's Yao Yue on Latency, Performance Monitoring, & Caching at Scale
This week's podcast features Yao Yue of Twitter. Yao spent the majority of her career working on caching systems at Twitter. She has since created a performance team that deals with edge performance outliers often exposed by the enormous scale of Twitter. In this podcast, she discusses standing up the performance team, thoughts on instrumenting applications, and interesting performance issues (and strategies for solving them) they’ve seen at Twitter.
Why listen to this podcast:
* Performance problems can be caused by a few machines running slowly, causing cascading failure
* Aggregating stats on a minute-by-minute basis can be an effective way of monitoring thousands of servers
* Second-by-second recordings are often too expensive to aggregate centrally, but can be stored locally
* Distinguishing between request timeouts and connection/network timeouts is important to prevent thundering herds
* With larger scale organisations, having dedicated performance teams helps centralise skills to solve performance problems
More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2wnBemB

Linda Rising on the Importance of Patterns, Her Journey, & Patterns for Driving Change/Innovation
On the InfoQ Podcast this week, Wes Reisz talks with the Queen of Patterns, Linda Rising. Linda discusses her thoughts on the importance of patterns, answers questions about what a pattern really is, and explains how she became involved in working with them. Throughout the podcast she discusses a variety of organizational and personal patterns and finally wraps with patterns to apply when driving change and innovation. Why listen to this podcast: - You have to realise that there’s nothing you can do about other people. The only person you can affect is yourself. - A pattern is not a band-aid that you use once. You use it in a context, in conjunction with other patterns. - Take baby steps when driving change in an organisation, and seek out a pocket of receptive people to drive it. - Slack is an important thing to have in life, so that if something comes along you can absorb it without having to stop doing something else. - Listen, Listen, Listen. More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2vLIsMC

Security Considerations and the State of Microservices with Sam Newman
Wesley Reisz talks with Sam Newman about microservices. They explore the current state of the art with regard to the architectural style and the corresponding tooling and deployment platforms. They then discuss how microservices increase the surface area where sensitive information can be read or manipulated, but also have the potential to create systems that are more secure. Why listen to this podcast: - Different organisations have different risk appetites for new technology, so technology choices that are appropriate for one organisation may not be appropriate for another. - If you are deploying microservices then you need to know why you are doing it and what benefits you expect to get from deploying them. - Microservices are defined by being independently deployable units rather than by their size. - Using a cryptographic token that is verifiable offline is a common pattern for passing authentication contexts around to different services. - Serverless architectures reduce the need to monitor server patching but do not diminish the need to monitor application runtimes or library dependencies for security patches. More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2v8NJg6

Jessica Kerr on Productivity, Slack Chatbots, Yak Shaving, & Why Diversity Matters for Innovation
Wesley Reisz talks with Jessica Kerr about her focus on developer productivity. Topics include her work at Atomist building Slack Chatbots, an approach to categorizing Yak Shaving (in an effort to prioritize and automate development dependencies), how an innovation culture drives diversity, and, finally, the role of 10x developers in the lifecycle of a company or product. Why listen to this podcast: - There are five kinds of Yak to shave - Atomist uses a Slack chatbot to automate and track commits, builds, push requests etc. - Agile retrospectives are a great way to encourage an innovation culture - Diverse teams flourish in innovation cultures - 10x developers are great for launching products, but teams are needed as products scale up More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2uO60PR

Martin Hadley on R and the modern R ecosystem
Werner Schuster talks to Martin Hadley, data scientist at University of Oxford. They discuss the state of the R language, the rich R ecosystem that covers development (RStudio), notebooks for publication (R Notebooks, RPubs), writing web apps (Shiny), and the pros/cons of the different data frame implementations. Why listen to this podcast: - R is the tool for working with rectangular data - Modern data frame implementations are Tibble and data.table (for large amounts of data) - RMarkdown and R Notebooks let you explore data and then publish the results and (interactive) visualizations - Use Shinyapps to publish server side R applications - Tidyverse is the place to look for modern R packages More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2twOXWJ

Pony Language Designer Sylvan Clebsch on Pony’s Design, Garbage Collection, and Formal Verification
In this podcast Charles Humble talks to Sylvan Clebsch, who is the designer of the actor-model programming language Pony and now works at Microsoft Research in Cambridge in the Programming Language Principles group. They talk about the inspirations behind Pony, how the garbage collector avoids stop-the-world pauses, the queuing systems, work scheduler, and formal verification. Why listen to this podcast: * Pony scales from a Raspberry Pi through a 64 core half terabyte machine to a 4096 core SGI beast * An actor has a 256-byte overhead, so creating hundreds of thousands of actors is possible * Actors have unbounded queues to prevent deadlock * Each actor garbage collects its own heap, so global stop-the-world pauses are not needed * Because the type system is data-race free, it’s impossible to have concurrency problems in Pony More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2tZXcKE

Kotlin Lead Language Designer Andrey Breslav on Android Support, Language Features and Future Plans
Why listen to this podcast: - Kotlin is an officially supported language on Google Android platforms - Kotlin Native and Kotlin JS will allow code reuse between server, client and mobile devices - Type safety means that references can be checked for nullability - Great tooling is a driver in what kind of language features are (and aren’t) adopted - Coroutines provide a way of creating maintainable asynchronous systems More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2sHyxqQ

Sid Anand on Building Agari’s Cloud-native Data Pipelines with AWS Kinesis and Serverless
Wesley Reisz talks to Sid Anand, a data architect at cybersecurity company Agari, about building cloud-native data pipelines. The focus of their discussion is around a solution Agari uses that is built from Amazon Kinesis Streams, serverless functions, and auto scaling groups. Sid Anand is an architect at Agari, and a former technical architect at eBay, Netflix, and LinkedIn. He has 15 years of data infrastructure experience at scale, is a PMC for Apache Airflow, and is also a program committee chair for QCon San Francisco and QCon London. Why listen to this podcast: - Real-time data pipeline processing is very latency sensitive - Micro-batching allows much smaller amounts of data to be processed - Use the appropriate data store (or stores) to support the use of the data - Ingesting data quickly into a clean database with minimal indexes can be fast - Communicate using a messaging system that supports schema evolution More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2rJU9nB

Sachin Kulkarni Describes the Architecture Behind Facebook Live
Wesley Reisz talks to Sachin Kulkarni, Director of Engineering at Facebook, about the engineering challenges for Facebook Live, and how it compares to the video upload platform at Facebook. Why listen to this podcast: - Facebook Infrastructure powers the broad family of apps including the Facebook app, Messenger and Instagram. It is largely a C++ shop. There is some Java and Python, and the business logic is all done in PHP. The iOS apps are written in Objective C and the Android apps are in Java. - The video infra team at Facebook builds the video infrastructure across the whole company. Projects include a distributed video encoding platform which results in low latency video encoding, video upload and ingest. - Facebook Live does encoding on both the client and the server. The trade-off between encoding on the client side and the server side is mostly around the quality of the video vs. latency and reliability. - Facebook gets around 10x speed-up by encoding data in parallel compared to serial. - They also have an AI-based encoding system which resulted in 20% smaller files than raw H.264. More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2qrseG5

Martijn Verburg on the JCP EC “No” Vote for the Java Modules
Wesley Reisz talks to Martijn Verburg, co-founder of the London Java Community and CEO of jClarity, about the JCP EC “no” vote on the Java Platform Module System (JPMS), which is due to be shipped as part of Java 9. They talk about what JPMS offers, how it works, what the “no” vote means and what happens next. Why listen to this podcast: - Jigsaw isn’t dead - The “no” vote was based on the submission being a bit early, and without expert group consensus that it should be submitted - Since the vote started, several amendments have been made which addressed some of the concerns listed by those who voted “no” - Daily calls with the expert group and interested parties will work to resolve the outstanding issues promptly - A resubmission is due within 30 days with a future vote expected to go through More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2q20esc

Daniel Bryant on Microservices and Domain Driven Design
Wesley Reisz talks to Daniel Bryant on moving from monoliths to micro-services, covering bounded contexts, when to break up micro-services, event storming, practices like observability and tracing, and more. Why listen to this podcast: - Migrating a monolith to micro-services is best done by breaking off a valuable but not critical part first. - Designing a greenfield application as micro-services requires a strong understanding of the domain. - When a request enters the system, it needs to be tagged with a correlation id that flows down to all fan-out service requests. - Observability and metrics are essential parts to include when moving micro-services to production. - A service mesh allows you to scale services and permit binary transports without losing observability. More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2pFYBiT
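The correlation id point from the Daniel Bryant episode can be sketched in a few lines of Python. This is an illustration under stated assumptions, not something described in the podcast: the header name `X-Correlation-ID` is a common convention rather than a standard, and `ensure_correlation_id` and `fan_out` are hypothetical helpers that only build the outgoing request descriptions instead of making network calls.

```python
import uuid

# Assumed header name; teams pick their own convention.
CORRELATION_HEADER = "X-Correlation-ID"


def ensure_correlation_id(headers: dict) -> dict:
    """Return a copy of the headers that is guaranteed to carry a correlation id.

    An id supplied by the caller is kept; otherwise a new one is generated
    at the point where the request enters the system.
    """
    tagged = dict(headers)
    if not tagged.get(CORRELATION_HEADER):
        tagged[CORRELATION_HEADER] = str(uuid.uuid4())
    return tagged


def fan_out(incoming_headers: dict, downstream_services: list) -> list:
    """Describe one downstream call per service, all sharing the same id."""
    correlation_id = ensure_correlation_id(incoming_headers)[CORRELATION_HEADER]
    return [
        {"service": name, "headers": {CORRELATION_HEADER: correlation_id}}
        for name in downstream_services
    ]
```

Because every fan-out call carries the same id, log lines and traces from different services can be joined back to the single request that triggered them.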

Rossen Stoyanchev on Reactive Programming with Spring 5 and Spring WebFlux
Rossen Stoyanchev talks to Wesley Reisz about blocking and non-blocking architectures, upcoming changes in Spring including Spring WebFlux, the reactive web stack in Spring Framework 5, due this summer. He also discusses the differences between RxJava and Reactor. Why listen to this podcast: - Spring Framework 5 is due to be released June 25 2017 - Spring WebFlux provides a web programming model designed for asynchronous APIs - Back-pressure is important in a server environment; less so within a UI environment - It’s possible to use a Spring WebFlux client within a Spring MVC application - Managing sets of thread pools is more complicated than having a scalable asynchronous system More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2pPgq0G

Richard Feldman Discusses Elm and How It Compares to React.js for Front-end Programming
Why listen to this podcast: - Using a compiler to catch errors at compile time instead of at runtime means much easier refactoring of code. - Incrementally replacing small parts of an existing JavaScript application with Elm is a safer strategy than trying to write an entirely new application in Elm. - Elm packages are semantically versioned and gated by the publishing process, so minor versions cannot remove functionality without bumping the major version. - The UI in an Elm application results in messages that transform the immutable state of the application; this allows a debugger to view the state transitions and the messages that triggered them, including record and replay of those messages. - Elm has been benchmarked as being faster than Angular and React whilst producing smaller code, which is attributed to the immutable state and pure functional elements. More on this: Quick scan our curated show notes on InfoQ http://bit.ly/2qmS2CT
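The message-driven state model described for Elm above can be approximated in Python to show why record and replay falls out naturally. This is a rough analogue, not Elm code and not Elm's actual debugger: `Model`, `update`, and `replay` are hypothetical names, and a frozen dataclass stands in for Elm's immutable records.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Model:
    """Immutable application state (a counter, for illustration)."""
    count: int = 0


def update(model: Model, msg: str) -> Model:
    """Pure function: given a state and a message, return the next state."""
    if msg == "increment":
        return replace(model, count=model.count + 1)
    if msg == "decrement":
        return replace(model, count=model.count - 1)
    return model  # unknown messages leave the state unchanged


def replay(messages, initial: Model = Model()) -> list:
    """Re-run recorded messages, keeping every intermediate state."""
    history = [initial]
    for msg in messages:
        history.append(update(history[-1], msg))
    return history
```

Since `update` has no side effects and states are never mutated, feeding the same recorded messages through `replay` always reproduces the same sequence of states, which is what makes stepping backwards and forwards through a session possible.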