SoylentNews is people

Microsoft Outage Caused by Overloaded Azure DNS Servers

Accepted submission by Fnord666 at 2021-04-04 06:26:57
Software

Microsoft outage caused by overloaded Azure DNS servers [bleepingcomputer.com]

Microsoft has revealed that Thursday's worldwide outage was caused by a code defect that allowed the Azure DNS service to become overwhelmed and not respond to DNS queries.

At approximately 5:21 PM EST on Thursday, Microsoft experienced a global outage that prevented users from accessing or signing in to numerous services, including Xbox Live, Microsoft Office, SharePoint Online, Microsoft Intune, Dynamics 365, Microsoft Teams, Skype, Exchange Online, OneDrive, Yammer, Power BI, Power Apps, OneNote, Microsoft Managed Desktop, and Microsoft Stream.

The outage was so widespread within Microsoft's infrastructure that even the Azure status page, which is used to provide outage information, was inaccessible.

Microsoft eventually resolved the outage at approximately 6:30 PM EST, with some services taking a bit longer to function properly again. At the time, Microsoft stated that the outage was caused by a DNS issue but did not provide further information.

Last night, Microsoft published a root cause analysis (RCA) for this week's outage and explained that it was caused by their Azure DNS service becoming overloaded.

========== Extended Copy ==================

[...] Microsoft states that their DNS service could typically handle a large number of requests through DNS caches and traffic shaping. However, a code defect prevented their DNS Edge caches from working correctly. "Azure DNS servers experienced an anomalous surge in DNS queries from across the globe targeting a set of domains hosted on Azure. Normally, Azure's layers of caches and traffic shaping would mitigate this surge. In this incident, one specific sequence of events exposed a code defect in our DNS service that reduced the efficiency of our DNS Edge caches."
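
To illustrate why a less effective edge cache matters during a query surge, here is a minimal Python sketch. It is not Microsoft's code and does not model the actual defect, which the RCA excerpt does not describe in detail. The names TTLCache, EdgeResolver, simulate_surge and fake_origin are hypothetical, and the "broken" case simply assumes the cache is bypassed entirely, which is a cruder failure than the reduced efficiency Microsoft reported.

```python
import time


class TTLCache:
    """Tiny time-to-live cache standing in for a DNS edge cache (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # name -> (answer, expiry time)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None
        answer, expires_at = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it so the caller goes back to the origin.
            del self._store[name]
            return None
        return answer

    def put(self, name, answer):
        self._store[name] = (answer, self.clock() + self.ttl)


class EdgeResolver:
    """Answers from the cache when possible and counts queries sent to the origin."""

    def __init__(self, cache, origin_lookup):
        self.cache = cache  # None models an edge whose cache is not helping at all
        self.origin_lookup = origin_lookup
        self.origin_queries = 0

    def resolve(self, name):
        answer = self.cache.get(name) if self.cache is not None else None
        if answer is None:
            self.origin_queries += 1
            answer = self.origin_lookup(name)
            if self.cache is not None:
                self.cache.put(name, answer)
        return answer


def simulate_surge(resolver, names, repeats):
    """Send repeats rounds of queries for names; return the load seen by the origin."""
    for _ in range(repeats):
        for name in names:
            resolver.resolve(name)
    return resolver.origin_queries


def fake_origin(name):
    # Hypothetical authoritative lookup; 192.0.2.1 is a documentation-only address.
    return "192.0.2.1"


names = ["a.example", "b.example", "c.example"]
healthy = EdgeResolver(TTLCache(ttl_seconds=300), fake_origin)
broken = EdgeResolver(None, fake_origin)
print(simulate_surge(healthy, names, 1000))  # 3 origin queries
print(simulate_surge(broken, names, 1000))   # 3000 origin queries
```

With a working cache, the origin sees one query per name for the whole surge. Without it, every query reaches the origin, which is the kind of amplification that can overwhelm a backend service.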

[...] As almost all Microsoft domains are resolved through Azure DNS, it was no longer possible to resolve hostnames on these domains and access associated services when the DNS service became overloaded.

[...] To prevent this type of outage in the future, Microsoft states that they are repairing the code defect in Azure DNS so that the DNS cache can adequately handle large numbers of requests. They also plan to improve the monitoring and mitigation of anomalous traffic.

