
Show Notes
Chaos Engineering introduces failures across a system, which helps us evaluate how our system will perform when a failure occurs. Tammy Bütow, Principal Site Reliability Engineer at Gremlin, explained why Chaos Engineering emerged. We talked about the different types of chaos that can be introduced to a system: DNS-related attacks, black hole attacks, and database attacks. Tammy highlighted the importance of a Service Level Agreement and went over its components. The discussion continued with which metrics to collect for monitoring, incident management, being on call, and tracking down an issue.