AWS admits more services broke as it recovered from outage [theregister.com]:
Amazon Web Services has revealed that its efforts to recover from the massive mess at its US-EAST-1 region caused other services to fail.
The most recent update to the cloud giant’s service health [amazon.com] page opens by recounting how a DNS mess [theregister.com] meant services could not reach a DynamoDB API, which led to widespread outages [theregister.com].
AWS got that sorted at 02:24 AM PDT on October 20th.
But then things went pear-shaped in other ways.
“After resolving the DynamoDB DNS issue, services began recovering but we had a subsequent impairment in the internal subsystem of EC2 that is responsible for launching EC2 instances due to its dependency on DynamoDB,” the status page explains. Not being able to launch EC2 instances meant Amazon’s foundational rent-a-server offering was degraded, a significant issue because many users rely on the ability to automatically create servers as and when needed.
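For readers who don't live in the AWS console, "automatically create servers as and when needed" usually means programmatic calls into the very EC2 control plane that was impaired. A minimal sketch, assuming Python and boto3 (the AMI ID, instance type, and region are placeholders, not details from AWS's status page), of the kind of launch request that would have failed or stalled during the incident:

```python
# Sketch only (assumption: Python + boto3) of a programmatic EC2 launch.
# The AMI ID, instance type, and region are placeholders for illustration.
import boto3
from botocore.exceptions import ClientError

ec2 = boto3.client("ec2", region_name="us-east-1")

try:
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="t3.micro",           # placeholder instance type
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print(f"Launched {instance_id}")
except ClientError as err:
    # During the impairment, callers would have seen errors or timeouts
    # here instead of a freshly launched instance.
    print(f"Launch failed: {err.response['Error']['Code']}")
```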
While Amazonian engineers tried to get EC2 working properly again, “Network Load Balancer health checks also became impaired, resulting in network connectivity issues in multiple services such as Lambda, DynamoDB, and CloudWatch.”
AWS recovered Network Load Balancer health checks at 9:38 AM, but “temporarily throttled some operations such as EC2 instance launches, processing of SQS queues via Lambda Event Source Mappings, and asynchronous Lambda invocations.”
The cloud colossus said it throttled those services to help with its recovery efforts, which, The Register expects, means it declined to honor every request for resources rather than let a flood of pent-up jobs overwhelm its recovering systems.
“Over time we reduced throttling of operations and worked in parallel to resolve network connectivity issues until the services fully recovered,” the post states.
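For customers, that server-side throttling typically shows up as throttled API calls rather than silence. A minimal sketch, assuming Python and boto3, of the usual client-side response: lean on the SDK's built-in adaptive retries, or back off explicitly on throttling error codes. The retry counts and error codes shown are general SDK conventions, not figures from AWS's status page.

```python
# Sketch (assumption: Python + boto3) of coping with server-side throttling,
# e.g. while AWS was rationing EC2 launches and Lambda work during recovery.
import time
import boto3
from botocore.config import Config
from botocore.exceptions import ClientError

# Option 1: let the SDK retry with adaptive client-side rate limiting.
ec2 = boto3.client(
    "ec2",
    region_name="us-east-1",
    config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
)

# Option 2: back off explicitly when the API says it is throttling.
def launch_with_backoff(client, max_tries=5, **kwargs):
    for attempt in range(max_tries):
        try:
            return client.run_instances(**kwargs)
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code in ("RequestLimitExceeded", "ThrottlingException"):
                time.sleep(2 ** attempt)  # exponential backoff, then retry
                continue
            raise
    raise RuntimeError("still throttled after retries")
```

Either approach just spreads demand out over time, which is roughly what AWS itself was doing from the other side of the API.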
By 3:01 PM, all AWS services had returned to normal operations, meaning problems persisted for over a dozen hours after the resolution of the DynamoDB debacle.
AWS also warned that the incident is not completely over, as “Some services such as AWS Config, Redshift, and Connect continue to have a backlog of messages that they will finish processing over the next few hours.”
The post ends with a promise to “share a detailed AWS post-event summary.”
Grab some popcorn. Unless you have an internet-connected popcorn machine, which recent history tells us may be one of a horrifyingly large number of devices that stops working when major clouds go down. ®