Meta and OpenAI face massive outages: What went wrong?

As social media platforms become more essential for communication, the impact of outages on users worldwide is significant

Meta logo is seen in this illustration taken August 22, 2022. (Reuters)

Meta acknowledged the disruption and assured users that the issue was nearly resolved

OpenAI's ChatGPT and Sora also went offline shortly after Meta's outage

Growing concern over private companies controlling these essential communication platforms

If you found yourself frustrated last night because your Instagram story wouldn’t upload, you might have assumed it was another government crackdown on social media in Pakistan.

Surprisingly, this time, the government wasn’t to blame. Instead, Meta’s platforms, including Instagram, Facebook, and WhatsApp, experienced a massive global outage in the early hours of Thursday, December 12 (Pakistan Standard Time - PKT). Adding to the chaos, OpenAI’s ChatGPT and Sora also went offline shortly after Meta's outage.

Downdetector, a platform that tracks service disruptions, reported sudden and significant spikes for Facebook, WhatsApp, Messenger, and Instagram. Instagram alone saw over 70,000 reported issues at its peak, while Facebook had more than 100,000 outage reports.

Meta acknowledged the disruption on its X (formerly Twitter) account shortly before midnight PKT. A few hours later, the company told users that the issue was nearly resolved.

“We’re 99% of the way there — just doing some last checks. We apologize to those who’ve been affected by the outage,” Meta tweeted at 3:26 AM PKT.

Meanwhile, OpenAI reported a major outage for ChatGPT, Sora, and its API services starting around 4 AM PKT, according to its status page. By 10 AM PKT, most services were back online.

Edwin Arbus, OpenAI’s developer community lead, explained the issue in a post on X, attributing the outage to a “configuration change that caused many servers to become unavailable.”

As of now, Meta has not issued an official statement regarding the cause of its global outage.

A history of outages

Recurring outages are not new to the tech world. Meta has faced several significant outages over the years, each affecting millions of users worldwide. Here’s a brief timeline:

  • March 2024: A global outage impacted over 500,000 users of Facebook, Instagram, and Threads. The issues started around 3 PM GMT, with no official cause disclosed.
  • October 2021: A six-hour outage took down all Meta platforms, including Facebook, Messenger, Instagram, and WhatsApp. The issue was traced to a faulty configuration change that disconnected Meta's backbone routers; early speculation had pointed to BGP and DNS errors.
  • March 2019: A partial outage lasting over 14 hours disrupted Facebook, Instagram, Messenger, and WhatsApp. It was caused by a bug triggered during routine maintenance, making it difficult to send or receive media files.
  • January 2015: A 50-minute outage affected Facebook, Instagram, and other services reliant on Facebook logins, such as Tinder and Hipchat. The issue stemmed from a configuration change within Facebook's systems.

What's behind the outage?

“There isn’t usually a consistent reason why a website or app goes down,” said Aadil Ayub, a web developer and system administrator at Autonomic Co-operative, a worker-owned co-op providing digital infrastructure to NGOs, art collectives, activist groups, and individuals.

Ayub listed several common triggers. "Hardware faults, such as a hosting provider updating equipment or replacing a hard disk, can lead to disruptions. Similarly, failed software updates or poorly optimized websites overwhelmed by heavy traffic are other common triggers," he said.

He added that configuration errors and faulty code updates further contribute to the unpredictable nature of these disruptions.

“The more complex the system becomes, the harder it is to figure out the cause of the outage,” Ayub said, referencing recent incidents involving Meta and OpenAI.

OpenAI’s outage occurred just days after the debut of its video generator, Sora. The tool can produce video at up to 1080p resolution and features a 'Storyboard' option, which lets users create cohesive video sequences by combining multiple prompts.

“AI applications are extremely performance-intensive. ChatGPT is used by billions of people and performance costs then scale exponentially,” Ayub said.

He noted the substantial computing demands of rendering videos, emphasizing that combining AI with video generation can dramatically amplify these costs.

“It’s quite likely that the high performance and compute costs for AI video generation were so significant that OpenAI couldn’t handle them. It’s likely this is not a coincidence,” Ayub added.

Outages: The new power cuts?

"In layman terms, an outage is just when an application or service stops working," stated Ayub.

"The use of this word is interesting because it’s used in the context of internet-connected services like WhatsApp, Facebook, Instagram, search engines, and websites. But it’s also used in the context of public infrastructure like an electricity outage or a gas outage," he added.

Ayub emphasized the growing dependence of the general public on social media platforms, likening it to reliance on essential public utilities.

"The parallel is interesting because it shows we have started treating these third-party corporate services almost the same way as we treat electricity. We need it to be always available," he said.

According to DataReportal, there were 5.22 billion social media users worldwide as of October 2024, accounting for 63.8% of the global population. Meanwhile, research from GWI revealed that the average social media user visits 6.8 platforms monthly and spends about two hours and 19 minutes daily on social media.
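For readers who want to check the scale these figures imply, the short Python sketch below works through the arithmetic. The input numbers are the ones quoted above; the derived values (the implied global population and the yearly total of hours) are our own back-of-the-envelope calculations, not figures published by DataReportal or GWI, and the yearly total assumes the daily average holds every day of the year.

```python
# Figures quoted in the article (DataReportal and GWI).
social_media_users = 5.22e9      # users worldwide, October 2024
share_of_population = 0.638      # 63.8% of the global population
daily_minutes = 2 * 60 + 19      # two hours and 19 minutes per day

# Derived values: our own arithmetic, not numbers from the sources.
implied_population = social_media_users / share_of_population
hours_per_year = daily_minutes * 365 / 60  # assumes the average holds daily

print(f"Implied global population: {implied_population / 1e9:.2f} billion")
print(f"Time on social media per year: {hours_per_year:.0f} hours")
```

The result, a global population of a little over 8 billion and several hundred hours of use per person each year, gives a rough sense of how many people a multi-hour outage can reach.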

Zuha Siddiqui, a freelance journalist covering technology and climate, noted that social media outages now affect a growing number of people.

"One of the reasons why this outage has affected so many people all over the world is because platforms like Instagram and WhatsApp have become ubiquitous for communication," she said.

"WhatsApp, particularly in countries like Pakistan, has replaced traditional text messaging. When the app doesn’t work, it automatically leads to a [communication] crisis."

As reliance on social media grows, the broader implications of private companies controlling these platforms have become harder to ignore.

"It’s very problematic that such vital infrastructure is controlled by third-party corporations in the U.S. We have no transparency or control over them," Ayub said. "Ideally, these should be public services collectively controlled by citizens."
