AWS outage: Company working to restore service as users report a resurgence in issues

Amazon Web Services outage hits major websites

Amazon^[1] Web Services, a leader in the cloud infrastructure market, reported a major outage on Monday, taking down numerous major websites.

Many sites came back online within a few hours, although Downdetector^[2] showed another spike around noon ET in user reports of outages at Amazon, AWS and Alexa.

Amazon said in a blog post ^[3] Monday afternoon that AWS customers were still experiencing “increased error rates” when trying to launch new instances in EC2, its popular cloud service that provides virtual server capacity.

Around 1:30 p.m. ET, AWS said ^[4] it was starting to see “early signs” of EC2 recovery in some regions and that it was applying fixes to remaining areas “at which point we expect launch errors and network connectivity issues to subside.”

The company’s latest update at 4 p.m. ET said it was still working to recover services.

Amazon also confirmed that the outage impacted Amazon.com, some of its subsidiaries and AWS customer support operations.

“We are working to fully restore service as quickly as possible,” the company wrote.

The outage was first reported at 3:11 a.m. ET in AWS’ main US-East-1 region hosted in northern Virginia. A notice on AWS’ status page said it was experiencing DNS problems with DynamoDB, its database service that underpins many other AWS applications.

DNS, or Domain Name System, translates website names into IP addresses so browsers and other applications can find and load them.
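As a rough illustration of why a DNS failure can take down an otherwise healthy service, the Python sketch below resolves a hostname before connecting and gives up when the lookup fails. The function names and the `db.example.internal` hostname are hypothetical, chosen only for this example; this is not a description of how AWS or DynamoDB clients are built.

```python
import socket
from typing import Callable, List


def lookup_addresses(hostname: str) -> List[str]:
    """Ask the system resolver for the IP addresses behind a hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({info[4][0] for info in results})


def connect_target(
    hostname: str,
    resolver: Callable[[str], List[str]] = lookup_addresses,
) -> str:
    """Return an IP address to connect to, or raise if the name cannot be resolved.

    The resolver is injectable so the failure path can be exercised
    without touching the network (an assumption made for this sketch).
    """
    try:
        addresses = resolver(hostname)
    except socket.gaierror as exc:
        # The server may be running fine, but without a name-to-address
        # mapping the client has nowhere to send its request.
        raise ConnectionError(f"cannot resolve {hostname}: {exc}") from exc
    if not addresses:
        raise ConnectionError(f"no addresses returned for {hostname}")
    return addresses[0]


if __name__ == "__main__":
    def broken_resolver(hostname: str) -> List[str]:
        # Simulates a DNS outage for a hypothetical internal service name.
        raise socket.gaierror("name resolution failed")

    try:
        connect_target("db.example.internal", broken_resolver)
    except ConnectionError as err:
        print(err)
```

In this sketch the data behind the hostname is never touched; the request fails purely because the name cannot be turned into an address.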

AWS cited an “operational issue” affecting “multiple services” and said it was “working on multiple parallel paths to accelerate recovery,” in an update at 5:01 a.m. ET. More than 70 of its own services were affected.

Shortly afterward, AWS said it was seeing “significant signs of recovery.”

AWS said in an update at 6:35 a.m. ET that the DNS issue had been “fully mitigated” and that AWS service operations were “succeeding normally.”

AWS is the leading provider of cloud infrastructure technology, accounting for around a third of the market, ahead of Microsoft and Google, according to Synergy Research Group ^[5]. Millions of companies and organizations rely on AWS for cloud computing services, such as servers and storage.

Major companies hit

Downdetector showed user reports indicating problems at sites including Disney+^[6], Lyft^[7], the McDonald’s^[8] app, The New York Times^[9], Reddit^[10], Ring doorbells, Robinhood^[11], Snapchat^[12], United Airlines^[13], T-Mobile and Venmo.

British government websites Gov.uk and HM Revenue and Customs were also experiencing issues, per Downdetector.

A government spokesperson told CNBC: “We are aware of an incident affecting Amazon Web Services, and several online services which rely on their infrastructure. Through our established incident response arrangements, we are in contact with the company, who are working to restore services as quickly as possible.”

Lloyds Banking Group confirmed that some of its services were affected and asked customers “to bear with us” while it worked to restore them. Some 20 minutes later, it added that services were coming back online.

The outage also brought down critical tools inside Amazon. Warehouse and delivery employees, along with drivers for Amazon’s Flex service, reported^[14] on Reddit^[15] that internal systems were offline at many sites. Some warehouse workers were instructed to stand by in break rooms and loading areas during their shifts, and they were unable to load Amazon’s Anytime Pay app, which lets employees access a portion of their paycheck immediately.

Seller Central, the hub used by Amazon’s third-party sellers to manage their businesses, was also knocked offline by the outage.

Reddit, too, is “working on scaling Reddit back to 100 percent as we speak,” a spokesperson told CNBC.

Some United and Delta Air Lines^[16] customers reported on social media that they couldn’t find their reservations online, check in or drop bags ^[17].

A T-Mobile spokesperson said its customers had issues when trying to use other sites or services due to the AWS disruption, but that there “was no outage or service disruption” at the carrier.

Canvas, an online teaching platform used to host course information and submit assignments, said ^[18] it was also hit by the “ongoing AWS incident.”

Other social media users cited disruption across cloud-based games, including Roblox and Fortnite, while crypto exchange Coinbase said many users were unable to access the service due to the outage.

Graphic design tool Canva said it was “experiencing significantly increased error rates which are impacting functionality on Canva. There is a major issue with our underlying cloud provider.”

Generative artificial intelligence search tool Perplexity was also affected. “The root cause is an AWS issue. We’re working on resolving it,” CEO Aravind Srinivas said in a post on X.

Centralized software

It’s not the first time in recent history that major companies have been affected by a technical issue. In July 2024, a faulty software upgrade^[19] by cybersecurity firm CrowdStrike revealed the fragility of global technology infrastructure when it caused Microsoft Windows systems to go dark, creating millions of dollars’ worth of chaos and grounding thousands of flights in the process. It also affected hospitals and banks.

AWS has also experienced other outages in recent years. A disruption in 2023 ^[20] knocked many websites offline for several hours, while a more severe outage in 2021 ^[21] affected websites and services across the globe, including some of Amazon’s own delivery operations, which were briefly brought to a standstill.

Amazon, Microsoft and Google have long jockeyed to claim enterprise customers. After an outage ^[22] of Microsoft’s suite of productivity software earlier this month, Google sought to capitalize ^[23] on the service lapse by pitching its own tools and a business continuity plan that runs its Workspace service in parallel with Microsoft 365.

In a blog post last week, Google wrote, “Just because Microsoft 365 goes down — and it’s a question of when and for how long, not if — doesn’t mean that your teams need to go back to using pen and paper.”

Google’s cloud services went down ^[24] for an extended period in June, disrupting several major service providers like OpenAI and Shopify. The company said ^[25] the outage was caused by multiple layers of flawed recent updates.

Monday’s AWS outage doesn’t appear to have been caused by a cyberattack, but is more likely a “technical fault affecting one of Amazon’s main data centres,” Rob Jardin, chief digital officer at cybersecurity company NymVPN, said in a statement.

“These issues can happen when systems become overloaded or a key part of the network goes down, and because so many websites and apps rely on AWS, the impact spreads quickly,” he added.

An Amazon spokesperson pointed to AWS’ service health dashboard when reached for comment.

“DynamoDB isn’t a term that most consumers know,” Mike Chapple, IT professor at the University of Notre Dame’s Mendoza College of Business and former computer scientist with the National Security Agency, said in a statement. However, it “is one of the record-keepers of the modern Internet.”

“We’ll learn more in the hours and days ahead but early reports indicate that this wasn’t actually a problem with the database itself. The data appears to be safe. Instead, something went wrong with the records that tell other systems where to find their data,” he added.

“This episode serves as a reminder of how dependent the world is on a handful of major cloud service providers: Amazon, Microsoft, and Google. When a major cloud provider sneezes, the Internet catches a cold.”

— CNBC’s Leslie Josephs and Jennifer Elias contributed to this report.

Clarification: This article has been updated to clarify that there was no service disruption at T-Mobile.

References

^[1] Amazon (www.cnbc.com)
^[2] Downdetector (downdetector.com)
^[3] said in a blog post (www.aboutamazon.com)
^[4] said (health.aws.amazon.com)
^[5] Synergy Research Group (www.srgresearch.com)
^[6] Disney+ (www.cnbc.com)
^[7] Lyft (www.cnbc.com)
^[8] McDonald’s (www.cnbc.com)
^[9] The New York Times (www.cnbc.com)
^[10] Reddit (www.cnbc.com)
^[11] Robinhood (www.cnbc.com)
^[12] Snapchat (www.cnbc.com)
^[13] United Airlines (www.cnbc.com)
^[14] reported (www.reddit.com)
^[15] on Reddit (www.reddit.com)
^[16] Delta Air Lines (www.cnbc.com)
^[17] check in or drop bags (www.cnbc.com)
^[18] said (status.instructure.com)
^[19] faulty software upgrade (www.cnbc.com)
^[20] in 2023 (www.reuters.com)
^[21] in 2021 (www.cnbc.com)
^[22] an outage (x.com)
^[23] sought to capitalize (workspace.google.com)
^[24] went down (www.cnbc.com)
^[25] said (www.cnbc.com)
