A ‘cascading’ network issue took out all of the company’s services for billions of people across the world in its worst outage in years.
Facebook apps and services including Messenger, WhatsApp and Instagram were down for more than six hours yesterday (4 October).
The outage began at approximately 4:40pm IST and affected all of the company’s 3.5bn users across the world, with most services only beginning to reappear after 11pm. Full restoration of service took much longer for some users.
In a blog post, Facebook vice-president for infrastructure Santosh Janardhan said that the root cause of the issue was “configuration changes on the backbone routers that coordinate network traffic between our data centres”. He said this had a “cascading effect” on the company’s systems, “bringing our services to a halt”.
Network infrastructure company Cloudflare explained that the issue was with an update to Facebook’s Border Gateway Protocol (BGP) routes, which tell other networks on the internet how to reach the company’s network. In effect, the routing information that allows people’s computers to find Facebook, and Facebook’s data centres to talk to each other, suddenly disappeared.
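To illustrate why withdrawn routes make a network unreachable, here is a minimal sketch in Python of a toy routing table. It is not how BGP or Facebook’s routers are actually implemented. The `RoutingTable` class, the peer name and the address prefix (taken from a range reserved for documentation) are all assumptions made for the example.

```python
import ipaddress


class RoutingTable:
    """Toy routing table mapping announced prefixes to a next hop (illustrative only)."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, next_hop):
        # A network advertises that it can deliver traffic for this prefix.
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # The advertisement is removed; other networks forget the path.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Pick the most specific (longest) matching prefix, if any exists.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination cannot be reached
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
# Hypothetical prefix and peer name, chosen only for this sketch.
table.announce("203.0.113.0/24", "hypothetical-peer-a")
before = table.lookup("203.0.113.10")

table.withdraw("203.0.113.0/24")
after = table.lookup("203.0.113.10")

print(before, after)
```

Once the prefix is withdrawn, the lookup finds no path at all, even though the servers behind that address may still be running.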
Because the outage affected the fundamental structure of the company’s network, Facebook’s internal email system, communications platform and tools were also not working. The New York Times reported that company staff turned to LinkedIn and Discord to communicate with each other, and were unable to enter buildings with their ID badges.
These internal failures delayed diagnosis of the problem and complicated the response to it. The company dispatched a team to one of its key data centres in California to attempt to reset its servers.
This is the worst outage the social networking giant has experienced since 2019, when its services disappeared for 24 hours. Since then, the company has only become more fundamental to how the internet functions, adding more than a billion users across its platforms.
Janardhan’s blog post said that Facebook “understand[s] the impact outages like these have on people’s lives, and our responsibility to keep people informed about disruptions to our services”. He said the company was seeking to understand the issue and make its infrastructure “more resilient”.
As well as denying billions of people their ordinary means of communication, the outage is likely to have an effect on Facebook’s business outlook. Mike Proulx, research director and vice-president at Forrester, noted: “It’s a sure tell that something must be pretty bad at Facebook when the company is forced to turn to its competitor, Twitter, to communicate with its users.
“This outage has widespread implications to the advertising ecosystem given the fact that ads weren’t being served for over six hours across Facebook and Instagram, which command the lion’s share of social media ad revenue. This not only affects Facebook’s revenue (and stock price) but also brands’ bottom lines.”
Proulx also pointed out that the severe damage caused by this outage was a consequence of Facebook’s consolidation of all its apps onto one set of infrastructure, making it vulnerable to a single point of failure.
The outage compounded what was already a bad day for the company, coming after whistleblower Frances Haugen gave an interview to CBS’s 60 Minutes in which she alleged that the company is aware of severe damage to society caused by its services but has consistently chosen “profit over safety”.
Haugen is due to testify before a US Senate committee today.
Facebook’s share price dropped by 4.9pc yesterday following these events, though it has since begun to recover.