
The Internet Report
135 episodes — Page 3 of 3

Ep 35 Major BGP Route Leak Disrupts Internet Traffic Globally (April 13-19, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we’re thrilled to be joined by Kemal Sanjta, ThousandEyes’ resident expert on BGP. This week, we’re going under the hood on the April 16th BGP route leak at Vodafone India, in which more than 30,000 prefixes were leaked, causing a major disruption of Internet traffic to some services. While some news outlets reported that the incident lasted approximately 10 minutes (starting around 1:50 PM UTC, or 9:50 AM ET), we found that it lasted quite a bit longer: more than an hour in the case of some prefixes. Watch this week’s show to see how it impacted a major CDN provider.

Ep 34 Facebook Outage Analysis; Plus, Why Cross-Layer Visibility Is a Must for App Experience | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re back from a short sabbatical to cover an interesting outage at Facebook, which appears to have been an application outage compounded by a series of routing issues. On April 8th, for roughly 40 minutes, the Facebook application was unavailable to users around the globe who were attempting to connect to the service. Although the outage itself was short-lived, we observed prolonged performance degradation even after the application came back online. Suboptimal page load and response times, both of which can impact the user experience, were observed alongside a series of routing changes. This outage is a reminder of the importance of visibility across the network and application layers when troubleshooting and prioritizing issues that impact user experience. Catch this week’s episode to hear about the outage from the ThousandEyes perspective.

Ep 33 What Happened With Verizon’s Recent Outage (Week of Jan. 25-Feb. 1, 2021) | Outage Deep Dive
On today’s episode, we discuss the recent outage on Verizon’s network that had widespread impacts on users in the US. ThousandEyes Broadband Agents detected an outage starting around 11:30 AM EST that manifested as packet loss across multiple locations concentrated along Verizon’s backbone on the US East Coast and in the Midwest. While the outage was resolved approximately an hour later, users connecting from the Verizon network across the US experienced varying degrees of impact, depending on the services they were connecting to. This serves as yet another reminder that the context around an outage directly affects the scope of the disruption. Watch this week’s episode to see what this outage looked like from ThousandEyes vantage points.

Ep 32 What Happened with Slack’s Outage; Plus, Talking Cloud Resiliency with Forrest Brazeal of Cloud Guru (Week of 12/28/20-01/04/21) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Despite a quiet last couple of weeks on the Internet, we started off our new year with quite the bang. As droves of mildly-caffeinated workers returned to their home offices on Monday after the holiday break, many were surprised to find that Slack was not available. On today’s episode, we go under the hood of Slack’s Monday outage to see what went wrong and how it was resolved. We’re also excited to be joined by Forrest Brazeal, a cloud architect, writer, speaker and cartoonist, to talk about everyone’s favorite subject: cloud resiliency. Watch this week’s episode to see the interview and hear our outage analysis. Show links: https://forrestbrazeal.com https://acloudguru.com https://cloudirregular.substack.com https://cloudirregular.substack.com/p/the-cold-reality-of-the-kinesis-incident

Ep 31 About Monday’s Google Outage; Plus, Talking Holiday Internet Traffic Trends with Fastly (Week of Dec. 7-14) | Outage Deep Dive
In this week's episode of #TheInternetReport...
00:00 Welcome
00:16 Headlines: About Monday’s Google Outage; Plus, Talking Holiday Internet Traffic Trends with Fastly
00:43 Under the Hood: This week, we go under the hood on a recent outage that took down the availability of several Google applications, including YouTube, Gmail and Google Calendar. Yesterday morning at approximately 6:50 AM EST, users around the world were unable to access several Google services for a span of around 40 minutes. While short-lived, the outage was notable in that it occurred during business hours in Europe and toward the beginning of the school day on the US east coast—so, people noticed, to put it bluntly. Catch this week’s episode to hear about the official RCA and what we saw from a network perspective.
10:18 Expert Spotlight: We’re thrilled to be joined by David Belson, Senior Director of Data Insights at Fastly, to talk about Internet traffic trends related to holiday online shopping and charitable giving.
Cyber Five: what we saw during ecommerce's big week: https://www.fastly.com/blog/cyber-five-what-we-saw-during-ecommerces-big-week
Decoding the digital divide: https://www.fastly.com/blog/digital-divide
19:14 Outro: We're taking a break for the rest of 2020, but join us on Jan. 05, 2021, when we kick off the New Year with Forrest Brazeal: https://forrestbrazeal.com https://cloudirregular.substack.com

Ep 30 Major AWS Outage Highlights Dependencies within Cloud Providers (Week of Nov. 23-30) | Outage Deep Dive
If you’re an AWS customer or rely on services that use AWS, you might have noticed the major, hours-long outage last week. On November 25th, at approximately 5:15 am PST, users of Kinesis, a real-time processor of streaming data, began to experience service interruptions. The issue was not network-related, and AWS later issued a detailed incident post-mortem analysis identifying an existing operating system configuration issue that was triggered by a maintenance event that involved adding server capacity. Over the course of the day, Amazon attempted several mitigation measures, but the outage was not completely resolved until approximately 10:23 pm PST. What was notable about this outage was its blast radius, which extended far beyond AWS’s direct customers. Several AWS services that use Kinesis, including Cognito and CloudWatch, were affected, as were users of applications consuming those services (e.g., Ring, iRobot, Adobe). This is a good reminder of the risk of hidden service dependencies, as well as the need for visibility to understand and communicate with customers when something’s gone wrong.

Ep 29 2020 Election—The Internet Held Strong With a Few App Performance Glitches (Week of Nov. 2-8)
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. This week, we’re pleasantly surprised to say that the network did not break, and there were no major election-night outages to report. However, that’s not to say we didn’t catch performance glitches in the days and weeks around the big night. Watch this week’s episode, as we cover performance issues at a Secretary of State website as well as why CNN’s election map website was so slow to load for many.

Ep 28 2020 Election Special: Going Under the Hood on State Election Websites (Week of Oct. 19-25)
We’ve got an election coming up here in the US, and over the last several weeks, we have been analyzing a dozen or so state election websites to take a closer look at how they’re hosted (e.g., do they use a CDN or are they self-hosted?) and to monitor them for outages. In this episode, we discuss the pros and cons of each hosting method and dive into some examples we’ve seen where election websites have had unexpected performance degradation. Catch this week’s episode to go under the hood on the websites powering the upcoming presidential election—and don’t forget to get out there and vote!

Ep 27 No, Twitter Wasn’t Hacked and Zayo Goes Bump in the Night (Week of Oct. 12-18) | Outage Deep Dive
In this week’s episode, we discuss two notable outages that happened last week. The first, at Twitter, took place on October 15 around 5:30 pm PST and impacted users’ ability to tweet or re-tweet. According to Twitter’s official statement, an internal system error was the culprit, putting to bed any theories of another hack. The second outage took place at the transit provider Zayo in the early morning hours of October 13. Although the outage seemed to mostly involve interfaces on the US west coast, in Denver and in the southwest (as well as a handful of other global locations), its impact was not very severe because it occurred outside of US business hours. Watch this week’s episode to hear more about these two outages.

Ep 26 The Case of an Overloaded Database and What Happens When a Bug Bites (Week of Oct. 4-11) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. In this week’s episode, we dive into a recent outage at Slack that caused intermittent issues for its enterprise users (including ourselves) for nearly a full day. The cause, as noted by Slack, was on the backend and related to an overloaded database. Next, we dig into another outage at Microsoft. According to their statement, a bug in an internal update seems to have revoked the routes to a number of devices that were believed to be unhealthy, thereby creating congestion in the rest of their network. This explanation jibes with the increased packet loss we observed during this time period. Don’t miss this week’s episode, where we walk through these outages in depth.

Ep 25 Microsoft's Monday Outage Is a Lesson in App Complexity; Plus, Digging into Telstra’s BGP Hijack (Week of Sept. 28-Oct. 4) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we dive into a recent Azure AD disruption that significantly impacted access to Microsoft cloud services and apps (as well as third-party apps) for nearly three hours. We then go under the hood on a recent BGP hijacking in which Telstra began announcing routes to services that didn’t belong to it, such as Quad9. Catch this episode to hear our take on these incidents, and see below for show links, some additional commentary on these outages, and a sneak preview of next week’s episode.

Ep 24 The TikTok Shutdown Showdown Continues, and WeChat Gets Muzzled (Week of Sept. 14-20) | Outage Deep Dive
On today’s episode, Angelique and I cover a couple of outages that occurred over the past week. First, we discuss an application outage at Instagram that occurred on September 17th and lasted around 30 minutes. We also discuss a network outage on September 14th on the AWS backbone near Columbus, Ohio. This outage was a little more widespread, affecting nearly 100 interfaces and lasting around 30 minutes. Next, we dive into the upcoming bans on WeChat and TikTok, which have now been temporarily extended by a Federal judge, and then we walk through some of the network architecture differences between these two applications and how a potential shutdown could be enforced.

Ep 23 You’ve Got Questions, We’ve Got Answers: Upstream Providers and the Reality of SLAs (Week of Sept. 7-13) | Outage Deep Dive
It was another quiet week on the Internet, so we wanted to spend some time answering your questions around some recent outages. Catch this episode as we discuss how you can understand the upstream relationships of the services you rely on to assess your risk profile. We also cover why SLAs fall short in protecting your business in the event of an outage, and why you need to proactively collaborate with your providers to solve issues faster.

Ep 22 Even the Internet Enjoys a Long Weekend; Plus, Digging Into a Recent CDN Outage (Week of Aug. 31-Sept. 6) | Outage Deep Dive
The Internet held up reasonably well over the past week, all things considered. There were no major outages to report, which is a welcome respite for those impacted by the major outages the week prior. While it’s not an outage that occurred this past week, we did want to spend some time covering the recent Verizon Edgecast outage that occurred on August 21st. Watch this episode as we dive into this application-level outage to understand exactly what happened and who might have been impacted.

Ep 21 Under the Hood on the CenturyLink / Level 3 Outage (Week of Aug. 24-30) | Outage Deep Dive
This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. It was a rough week on the Internet last week, with outages and incidents across multiple services and providers including Slack, Zoom, AWS, and Verizon. However, in today’s episode we’re going to focus exclusively on Sunday’s CenturyLink / Level 3 outage, which, according to Cloudflare, caused a 3.5% drop in global Internet traffic, making it one of the most significant Internet outages ever recorded.

Ep 20 An IXP and a Streaming Music Provider Walk Into an Outage Bar (Week of Aug. 17-23) | Outage Deep Dive
This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this week’s episode, Archana and I cover some recent outages that made headlines. This includes the Spotify outage, caused by an expired TLS certificate, that prevented users from accessing its platform. We also cover a widespread outage at Cogent during (what seems to be) a maintenance window. Then, we go “under the hood” on the prolonged outage at an IXP on August 18th to understand exactly what infrastructure was affected and which downstream providers were subsequently impacted. We’re also joined by our guest, Prabhnit Singh, who currently leads ThousandEyes’ Internet & WAN product line, to discuss why we’re seeing an increased number of outages caused by expired TLS certificates and to cover some examples of past high-profile outages.

Ep 19 Fortnite’s Epic Battle Against the “Apple Tax”; And, The Evolution of Cloud Connectivity (Week of Aug. 10-16)
On this week’s episode, Archana and I cover recent headlines concerning the social media platform TikTok and the gaming provider Epic Games. TikTok appears to have gained some additional time (now 90 days) before the US government will enforce its ban on the service. Epic Games recently made news when its game Fortnite was removed from Apple’s App Store and Google’s Play Store for violating their Terms of Service. Epic was quick to file a lawsuit claiming the tech giants were in violation of anti-competition laws. The outcome of this case will be one to watch, and could have far-reaching impacts for developers. Next up, we speak with William Collins, Lead Cloud Architect at a Fortune 100 company, about cloud connectivity, on-ramp services and the difference between the “Big 3” on-ramp services.

Ep 18 Time’s Running Out on TikTok; Plus, This Ain't Your Dad’s SatComms—But Does It Live Up to the Hype? (Week of Aug. 3-9)
This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this week’s episode, Mike sat down with our guest, Ray Hunter, the senior network consultant at Globis in the Netherlands, to talk about SatComms and the role they play in connecting users, and what effect the mass deployment of Low Earth Orbit (LEO) satellites will have on networks and service delivery. We also discuss a recent move by the US to ban financial transactions between TikTok’s parent company, ByteDance, and US citizens, effectively removing financial incentives to serve US citizens. While not an outright ban, it does raise questions about how an outright ban could even be enforced, and what that means for the broader conversation around Internet sovereignty.

Ep 17 Cogent's Midsummer's Night Outage and Telstra's Weekend DNS Mishap Prove Not All Outages Are Equal (Week of July 27-Aug. 2) | Outage Deep Dive
This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this week’s episode, Archana and I discuss a small number of outages that hit certain regions of the globe over the past week. This includes an outage that caused a midday disruption for people trying to connect to Reddit, a weekend DNS issue at Telstra, and a Cogent outage in EMEA and NA that had the signatures of a maintenance window. We also revisit Cloudflare’s root cause analysis concerning their recent DNS outage and answer some of the open-ended questions we had.

Ep 16 Ransomware Attack Leaves Garmin Users Stuck Without a Paddle (Week of July 20-26) | Outage Deep Dive
On this week’s episode, I am joined by Deepak Ravi from our Dublin technical sales engineering team to discuss a recent outage at Garmin. Garmin confirmed that it was a victim of a ransomware attack, which took down several of its services including its website functions, customer support, customer-facing applications, and company communications. In this episode, we walk through what we observed in the ThousandEyes platform during the time of the attack, and what the impacts were on users attempting to access Garmin services. We’re also joined by ThousandEyes’ CISO, Alexander Anoufriev, to talk about what ransomware attacks are, how they manifest and how organizations can protect themselves against future attacks.

Ep 15 Do Outages Come in 3’s? Diving Into Last Week’s Outages at GitHub, WhatsApp, and Cloudflare (Week of July 13-19) | Outage Deep Dive
On this week’s episode, we cover a couple of significant application-layer outages at GitHub and WhatsApp that occurred over the past week. Then, Archana and I do a deep dive into a network-related outage at Cloudflare that affected the availability of its popular DNS service for approximately 30 minutes. We’ll share what we saw through our vantage points in the ThousandEyes platform, and you can read Cloudflare’s full explanation of the incident on their blog.

Ep 14 India Swipes Left on TikTok, GCP Outage Hits Multiple AZs, & Cloud Networking 101 for Enterprises (Week of June 29-July 5) | Outage Deep Dive
On this week’s episode, we cover a recent move by the government of India to ban many Chinese-owned applications, including TikTok, which reportedly has more than 600,000,000 downloads in India. We also talk through a two-hour-long outage at Google Cloud Platform that affected multiple availability zones within a single region, highlighting that availability zones may be architected differently between providers, and briefly cover outages at Slack and Comcast, too. After our review of this week’s highlights, I sat down with Atif Khan, CTO of Alkira and former co-founder of Viptela, to talk enterprise cloud strategy.

Ep 13 Broadband Goes Bust, Again; Plus, Satellite Meets SD-WAN (Week of June 22-28) | Outage Deep Dive
This week’s episode is brought to you by the letter “O” for outages — in particular, there were a number of broadband providers, globally, that suffered localized outages this past week. After we run down our top headlines, including a satellite provider rolling out managed SD-WAN, we take a look at outages in Comcast and AT&T’s networks. Make sure you join us next week to hear from Atif Khan, CTO at Alkira, as we talk about multi-cloud networking.

Ep 12 Major T-Mobile Outage Caused By Fiber Cut, and Talking Cloud Architecture at Scale with Uber (Week of June 15-21) | Outage Deep Dive
This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On this week’s episode, we cover a widespread T-Mobile outage that took down its cellular network for several hours and elicited a rare condemnation from the FCC. The culprit, according to the carrier, was a fiber cut, highlighting the need for redundancy and resiliency in the nation’s cellular networks. We also cover an issue with WhatsApp’s privacy settings that sent users scrambling to Twitter, as well as a recent move by Russia to “un-ban” the messenger app, Telegram. Then, stay tuned as we go one-on-one with Jason Black, the Head of Global Network Infrastructure at Uber Technologies, to discuss how Uber approaches its cloud architecture.

Ep 11 Excuse Me, Your BGP Is Leaking (Week of June 8-15) | Outage Deep Dive
On this week’s episode, we discuss a recent BGP-related outage at a major public cloud provider, as well as a recent announcement that Cogent Networks has rolled out RPKI in an effort to strengthen its BGP route security. We’re also joined by Kemal Sanjta, principal engineer on our customer success team and our resident expert on Internet routing and security, to chat about these events. Catch this week’s episode here to dive into BGP with us.

Ep 10 It’s ALWAYS DNS! (Week of May 25-31) | Outage Deep Dive
On this week’s episode of the Internet Report, I’m joined by my colleague, Michael Batchelder (aka Binky), to discuss a DNS-related service disruption that affected users trying to access Amazon.com. We also talk about a recently discovered DNS vulnerability that could leave DNS providers susceptible to DNS amplification DDoS attacks. If you’re curious about what went wrong with Amazon’s service last week and want to know more about the role of DNS and why it’s so important, don’t miss this episode.

Ep 9 Outages Become a Night Owl’s Nuisance, and How COVID-19 Impacted the Submarine Cable Industry (Week of May 18-24) | Outage Deep Dive
Welcome back to the Internet Report! On this week’s episode, we cover our usual check-up of ISPs, cloud and collaboration app outages, and discuss several major middle-of-the-night outages that affected services from providers such as Google and Virgin Media. We’re also joined by TeleGeography’s Alan Mauldin to discuss submarine cables, terrestrial networks, international Internet infrastructure and more.

Ep 8 YouTube, Google, Slack, and How Do You Say ‘Performance Issues’ in French? (Week of May 11-17) | Outage Deep Dive
Never a dull minute on the Internet! In today’s episode, Archana and I dive into a YouTube service disruption and an (unrelated!) Google network issue in India. We also discuss Slack’s explanation of their service disruption last week, and even talk through a case out of France where an education site experienced performance issues in lockstep with time-of-day usage.

Ep 7 Facebook SDK Snafu Sidelines Spotify & Others, Plus, AWS Global Accelerator… Accelerates (Week of May 4-10) | Outage Deep Dive
On this week’s episode of The Internet Report, Archana and I cover some newsworthy updates that we’ve seen over the past week. We discuss a notable Facebook SDK outage that had ripple effects on other popular services that leverage its log-in functionality, including Spotify and TikTok. We also discuss a blog post from AWS sharing their thoughts on the JEDI contract. We’re also joined by Arash Molavi, the lead Internet researcher here at ThousandEyes. Arash shares his insight into the outages we’re seeing and discusses what constitutes an outage and why loss, latency and jitter can impact end-user experience in various ways depending on the context. Last, we cover our usual availability check of ISP, public cloud, and collaboration app provider networks.

Ep 6 Cloudflare Calls on ISPs to Take BGP Security Seriously; Plus Virgin Media Has Outage Deja Vu (Week of April 27-May 3) | Outage Deep Dive
On this week’s episode of The Internet Report, Archana and I are thrilled to be joined by Martin Levy, who is a distinguished engineer at Cloudflare focused on BGP route security and expanding Cloudflare's global network footprint. Check out this week’s episode to hear his thoughts on BGP security, best practices such as using RPKI and some recent routing incidents. After speaking with Levy and hearing his perspective, we jump over to a discussion around some notable outages this past week, particularly one at Virgin Media that affected connectivity for users in the UK for several hours. After going through these events, we cover off on our usual health check of ISP, cloud, and collaboration service providers.

Ep 5 CenturyLink/Level 3 Suffers Fiber Cut, FCC Cracks Down on China Telecom, and Public vs. Private Internet Exchanges—Which Is Best? (Week of April 20-26) | Outage Deep Dive
In this week’s episode, Archana and I are joined by a special guest, Christian Koch, who is the head of product, cloud and ecosystem at PacketFabric. Listen in as we cover the current state of global Internet health and dive into network outage numbers across ISPs, public cloud, and collaboration platforms. This week, we saw that the overall number of outages wasn’t particularly concerning, reflecting our “new normal,” but there were a few notable outages that had far-reaching impacts. In particular, an outage at Tata Communications and a fiber cut at Level 3/CenturyLink had significant end-user impacts, as did another outage that took down access to GitHub.

Ep 4 ISPs Back in the Spotlight, US Banks Stimulus Check Stumble, Is Netflix Breaking the Internet? (Week of April 13-19, 2020) | Outage Deep Dive
In this week’s episode, Archana and I discuss some pretty significant events that unfolded over the past week. First, we saw a notable increase in the number of ISP outages occurring across the global Internet—a trend that had seemed to plateau in previous weeks. After going through our usual health check of provider networks, we examined an interesting online banking outage that occurred as millions of Americans checked on the status of their stimulus checks. This caused understandable angst as consumers were unable to log in or access basic online banking information. Finally, we took a look at how popular streaming services like Netflix actually work, and our experts weigh in on whether or not Netflix is “breaking the Internet.”

Ep 3 Outages Hit Record Lows, Unemployment Sites Hit Network Snags, Plus, Meet The Internet Society (Week of April 6-12, 2020) | Outage Deep Dive
In this week’s episode, Archana and I welcomed David Belson (@dbelson) of the Internet Society. We got to discuss some rather good news -- overall outage events are down more than 40% globally, and more than 44% in the U.S. after a several-week-long spike in events. We very well may be looking at our ‘new normal’.

Ep 2 Collaboration App Networks Have Particularly Bad Week, Rostelecom BGP Bungle (Again) (Week of March 30-April 5, 2020) | Outage Deep Dive
It was yet another eventful week on the Internet, folks. In this week’s episode of The Internet Report, Archana and I discuss the latest figures around global Internet performance, noting that, despite an elevation in outages last month, the Internet is holding up well. ISP outages declined slightly in the U.S. and globally last week, but that wasn’t the case for UCaaS providers, who had a particularly rough time last week, especially in the United States. There was also a fairly large BGP route hijack on April 1 courtesy of Russian ISP, Rostelecom — the same ISP responsible for a route hijacking incident back in 2017. The prefixes involved belonged to Amazon, Cloudflare, and other services, and impacted the reachability of sites like Yelp.com.

Ep 1 ISP Outages On The Rise, Router Failure Takes Down Cloud Provider Services During COVID-19 (Week of March 23-29, 2020) | Outage Deep Dive
Over the past month, ThousandEyes has been flooded with questions about how the Internet is holding up given the extra strain it’s been under with the sudden influx of remote workers, remote schoolers, and overall increased use due to COVID-19 related self-isolating and shelter in place orders. We’ve put out blogs and have conducted executive, media and analyst briefings. Network World and the IDG family of publications have even started publishing our data on a weekly basis to keep their readers up to date, as things are changing so frequently.
Because of the continued interest in how the Internet is handling the current and, potentially, increasing traffic loads, we decided that now is the right time to kick off a show to answer this question each week. How is the Internet faring? What were some of the most interesting events we observed during the week? We're pleased to share the inaugural episode of The Internet Report.
Listen along and don’t forget to subscribe to our blog and our YouTube Channel to be the first to get these episodes moving forward. And feel free to leave a comment here, on YouTube, or on Twitter, tagging @ThousandEyes and using the hashtag #TheInternetReport. We hope you find this info useful, and we look forward to your feedback.
Show Links:
Review the interactive share link of the March 27 Google Outage here and here.
Vodafone reports a 50% rise in Internet use as more people work from home
Verizon sees almost 20% increase in web traffic in one week due to COVID-19
A large-scale Cogent Communications outage impacted the Northwest United States — see an interactive view of the outage here.
Another Cogent outage impacted the reachability of Verily’s projectbaseline.com for users in Northern California — see an interactive view of the outage here.