You probably noticed or heard that CenturyLink / Level 3 had a big network outage this past Sunday morning. Several popular sites and online services were down or unusable, including Amazon, Hulu, PlayStation Network, Xbox Live, Steam, and Twitter.
So what happened? Bigleaf’s founder and CEO took a few minutes to share some insights from our perspective running a network that peers with CenturyLink and many others.
Hi, I’m Joel Mulkey, founder and CEO of Bigleaf Networks. You may be wondering what happened with the CenturyLink / Level 3 outage yesterday, on Sunday. We definitely saw it going on. We own and operate our own backbone network, so we had a lot of visibility and were able to respond. I’ll talk you through a little bit of that now.
So we had a couple of responses to that. One was automated: our SD-WAN software, which manages traffic constantly, 10 times a second, detected the problem and responded by rerouting customer traffic wherever possible, keeping folks up and running, which was great. We also had a manual response. We have a skilled network engineering and operations team that was alerted, dug in, and found some optimizations they could make in how traffic was flowing.
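To make the automated response a little more concrete, here is a minimal sketch of loss-based path selection. It is not Bigleaf's actual software: the `PathStats` type, the `pick_path` function, the circuit names, and the 5% loss threshold are all hypothetical, chosen only to illustrate the idea of measuring each circuit frequently and steering traffic away from one that is dropping packets.

```python
from dataclasses import dataclass


@dataclass
class PathStats:
    """Probe results for one WAN circuit over a short window (hypothetical)."""
    name: str
    probes_sent: int
    probes_lost: int

    @property
    def loss(self) -> float:
        # Treat a circuit with no probe data as fully lossy.
        if self.probes_sent == 0:
            return 1.0
        return self.probes_lost / self.probes_sent


def pick_path(paths, max_loss=0.05):
    """Return the name of the healthy circuit with the lowest loss, or None."""
    healthy = [p for p in paths if p.loss <= max_loss]
    if not healthy:
        return None
    return min(healthy, key=lambda p: p.loss).name
```

With one circuit losing every probe and another losing none, `pick_path` returns the clean circuit. If a site has only a single circuit and it is failing, the function returns `None`, because there is nothing to reroute to.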
And through that, we saw some different customer experiences. So the nature of this issue that CenturyLink / Level 3 had (it was actually the Level 3 network, which is now owned by CenturyLink but not fully incorporated) is that they had this BGP issue. BGP is the routing protocol that runs the internet, and you can think of this as a sort of black hole issue. Picture an onramp on a freeway: Waze is sending all the traffic to that onramp, but there’s actually an accident there. This was a similar scenario, where CenturyLink was saying “hey, get to all these networks through me,” yet their network wasn’t functioning right.
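The black hole idea can be shown with a toy model. This is only an illustrative sketch, not real BGP: the `Carrier` type, the carrier names, and the example prefix are hypothetical. The point is that route selection trusts what a network announces, while delivery depends on whether that network is actually forwarding packets.

```python
from dataclasses import dataclass


@dataclass
class Carrier:
    """Toy model of a network: what it announces vs. whether it forwards."""
    name: str
    advertised: set        # prefixes the carrier announces (control plane)
    forwarding_ok: bool    # whether packets actually get through (data plane)


def route_lookup(carriers, prefix):
    """Pick the first carrier announcing the prefix, trusting the announcement."""
    for carrier in carriers:
        if prefix in carrier.advertised:
            return carrier
    return None


def send(carriers, prefix):
    """Report what happens to a packet sent toward the prefix."""
    carrier = route_lookup(carriers, prefix)
    if carrier is None:
        return "unreachable"
    return "delivered" if carrier.forwarding_ok else "black-holed"
```

In this model, a broken carrier that still announces a prefix attracts the traffic and drops it, even when a working carrier announces the same prefix. Taking the broken carrier out of the list lets the lookup fall through to the working one.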
And so, it was a very difficult time for network operators, and kind of unprecedented, with CenturyLink even telling other big carriers “hey, shut off your connections to us,” which was a pretty substantial move: disconnecting one of the world’s biggest networks from the internet. All told, they had about a 4-hour outage, from 4 am Pacific Time to about 8 am Pacific Time.
And we were able to respond to that. So what our customers saw and experienced: if they had multiple WAN circuits, maybe AT&T and something else, generally they stayed up and running, although because of that black-holing, not exclusively. Some customers had outages because their traffic was flowing through CenturyLink and CenturyLink was just dropping it. And then on the content side, if you were trying to reach content that was hosted by CenturyLink, even as one of the paths to that content, you may not have been able to reach it. So it was quite the dramatic moment. Thankfully, most of our customers stayed up and running, and were happy. If you only had one WAN circuit, with CenturyLink, obviously you were down. In that case, I certainly do recommend you take a look at diverse WAN connections, and take a look at Bigleaf as an intelligent SD-WAN platform that can automatically mitigate these kinds of issues. And let us know if you have any questions. We’re happy to share more about this, what we saw, and how our platform can help. Thanks.