A Facebook, Instagram, WhatsApp, and Oculus outage knocked every corner of Mark Zuckerberg’s empire offline on Monday. It’s a social media blackout that can most charitably be described as “thorough” and seems likely to prove particularly tough to fix.
Facebook itself has not confirmed the root cause of its woes, but clues abound on the internet. The company’s family of apps effectively fell off the face of the internet at 11:40 am ET, which is when its Domain Name System records became unreachable. DNS is often referred to as the internet’s phone book; it’s what translates the host names you type into your browser’s address bar, like facebook.com, into IP addresses, which is where those sites live.
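For readers who want to see the phone-book step in action, here is a minimal Python sketch of a DNS lookup. The `resolve` helper and its `lookup` parameter are illustrative names, not anything Facebook or the experts quoted here use; the sketch simply wraps the standard library's `socket.getaddrinfo` and treats a failed lookup as "no addresses found."

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, unique IP addresses for hostname, or [] if the lookup fails.

    `lookup` is a hypothetical injection point so the function can be
    exercised without a network connection; by default it is the
    standard library's socket.getaddrinfo.
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # The resolver could not turn the name into an address.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({info[4][0] for info in results})
```

On a connected machine, `resolve("facebook.com")` should normally return one or more addresses. When the records are unreachable, as they were on Monday, a lookup like this would be expected to come back empty.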
DNS mishaps are common enough, and when in doubt, they’re the reason why a given site has gone down. They can happen for all kinds of wonky technical reasons, often related to configuration issues, and can be relatively straightforward to resolve. In this case, though, something more serious appears to be afoot.
“Facebook’s outage appears to be caused by DNS; however that’s just a symptom of the problem,” says Troy Mursch, chief research officer of cyberthreat intelligence company Bad Packets. The fundamental issue, Mursch says, and other experts agree, is that Facebook has withdrawn the so-called Border Gateway Protocol route that contains the IP addresses of its DNS nameservers. If DNS is the internet’s phone book, BGP is its navigation system; it decides what route data takes as it travels the information superhighway.
“You can think of it like a game of telephone,” but instead of people playing, it’s smaller networks letting each other know how to reach them, says Angelique Medina, director of product marketing at the network monitoring firm Cisco ThousandEyes. “They announce this route to their neighbor and their neighbor will propagate it out to their peers.”
It’s a lot of jargon, but easy to put plainly: Facebook has fallen off the internet’s map. If you try to ping those IP addresses right now? “The packets end up in a black hole,” Mursch says.
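The "black hole" can be illustrated with a toy routing table in Python. This is a deliberately simplified sketch, not how BGP is implemented on real routers: the `RoutingTable` class, the `peer-a` label, and the `192.0.2.0/24` prefix (a range reserved for documentation) are all assumptions made for illustration. It shows only the core idea that once a route is withdrawn, a router has nowhere to send packets bound for those addresses.

```python
import ipaddress


class RoutingTable:
    """A toy route table: maps announced prefixes to a next-hop label."""

    def __init__(self):
        self.routes = {}  # ip_network -> next hop (a plain string here)

    def announce(self, prefix, next_hop):
        # A neighbor tells us which addresses it can reach.
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # The neighbor retracts the announcement; the route disappears.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def next_hop(self, address):
        # Pick the most specific (longest) prefix that contains the address.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the packet has nowhere to go
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", "peer-a")       # hypothetical nameserver prefix
before = table.next_hop("192.0.2.53")          # route exists
table.withdraw("192.0.2.0/24")
after = table.next_hop("192.0.2.53")           # route is gone
```

Here `before` holds the next hop and `after` is `None`. If the withdrawn prefix is the one that holds a company's DNS nameservers, every lookup for its domains fails as well, which is why the outage first looked like a DNS problem.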
The obvious and still unresolved question is why those BGP routes disappeared in the first place. It’s not a common ailment, especially at this scale or for this duration. During the outage, Facebook said nothing beyond a tweet stating that it was “working to get things back to normal as quickly as possible.” After service came trickling back late Monday afternoon, it sent a statement that still lacked any technical detail. “To everyone who was affected by the outages on our platforms today: we’re sorry,” the company said. “We know billions of people and businesses around the world depend on our products and services to stay connected. We appreciate your patience as we come back online.”