CenturyLink Outage Causing Internet Wide Problems

Published: 2020-08-30
Last Updated: 2020-08-30 15:05:04 UTC
by Johannes Ullrich (Version: 1)
1 comment(s)

Update From Centrylink at approx 15:00 UTC / 11:00 EDST:
The IP NOC with the assistance of the Operations Engineering team confirmed a routing issue to be preventing BGP sessions from establishing correctly. A configuration adjustment was deployed at a high level, and sessions began to re-establish with stability. As the change propagates through the affected devices, service affecting alarms continue to clear.

Due to the nature of this outage, it may be necessary to reset your services locally at your equipment, or manually reset your BGP session. If after that action has been performed a service issue prevails, please contact the CenturyLink Repair Center for troubleshooting assistance.

Early this morning (US East Coast time), CenturyLink started having problems with routes passing AS3356. This network is central in routing a large part of internet traffic, and the outage is still causing problems for many services like for example OpenDNS, Duo Security, Cloudflare, Imperva (a service SANS, and isc.sans.edu uses). 

At this point, there is no indication that this is an attack. This looks so far like a misconfiguration or maybe a hardware failure.

If a network like AS3356 has problems handling traffic, a typical response is to route traffic via a different network. As a customer of CenturyLink, you would disconnect from CenturyLink, and instead, advertise your IP address space via a different backup ISP. It looks like this failed for two reasons in many cases:

  1. AS3356 itself did not withdraw these routes once the customer disconnected. So the rest of the internet still continued to believe CenturyLink, and is sending traffic to them vs sending it to the backup ISP
  2. Which backup ISP? AS3356 used to belong to Level 3. Level 3 was purchased by CenturyLink. CenturyLink also merged with other ISPs/NSPs like for example Qwest. This is another example of how the Internet has long been much less diverse than it should be.

What can you do about this as an end-user? Not much. Wait for CenturyLink to find a network engineer who is fluent enough in BGP to fix this. Some customers of CenturyLink report estimated times to resolution quoted at 1 pm ET. But there is no public acknowledgment of this time. I have seen some traffic come back to ISC/Imperva. For ISC, we also have dshield.org which does not appear to be affected (different ISP setup). 

You may want to disable affected services like OpenDNS as they may make things worse. Google DNS appears to be working. You could also decide to not require 2FA if you rely on a service like Duo. But I will live that risk decision up to you. And attackers could take advantage of widespread disabling of Duo.

Also: the companies I named here are just some notable once I ran across as affected. There are likely more.

---
Johannes B. Ullrich, Ph.D. , Dean of Research, SANS Technology Institute
Twitter|

1 comment(s)

Comments

Comcast appears to have been seriously impacted by this as well. We saw issues from 6:05 AM until 10:18 AM Eastern Daylight Savings time today.

Diary Archives