I got a warning in my Azure subscription with the heading "Network Infrastructure in Multiple Regions and Impacted Dependent Services". The detailed message is:
Azure SQL Database services were able to handle the large number
of requests from our customers quickly enough to process them seamlessly, and
Azure SQL Database customers would have seen recovery in every region except
Central US.
Unfortunately, Azure SQL Databases in Central US region were overwhelmed by
requests that came in at a higher rate than expected, which resulted in an
availability impact to the Azure SQL Database service in Central US region. The
Azure SQL Database service team was engaged promptly and identified that the
high rate of incoming requests was preventing Azure SQL Database services from
recovering. The team limited the volume of requests to the Azure SQL Database
service so that it could handle them seamlessly, and confirmed that all
requests were being processed normally by 17:15 UTC. Affected HDInsight and
Media Services in Central US region were fully recovered shortly after.

CUSTOMER / SLA IMPACT: Customers may have
experienced degraded service availability for multiple Azure services listed
in “Impacted Services” above when connecting to resources or services that
have a dependency on the recursive DNS services. We estimated that the
availability of Azure SQL Database and DW, and HDInsight and Media Services
that are dependent on these was reduced by approximately 60% due to the
impact of the recursive DNS issue. After the recursive DNS issue was
mitigated, a subset of our customers using Azure SQL Database and DW
resources in Central US region, and services that have a dependency on Azure
SQL Database and DW in Central US region, may have continued experiencing the
impact.

WORKAROUND: No workaround was available during the initial impact
period from 11:18 UTC to 13:00 UTC. For customers who were impacted by the
subsequent outage on Azure SQL Database and DW in Central US region, if
active geo-replication had been configured, the downtime could have been
minimized by performing a failover to a geo-secondary, at the cost of losing
less than 5 seconds of transactions. Please visit https://azure.microsoft.com/en-us/documentation/articles/sql-database-business-continuity/
for more information on these capabilities.

AFFECTED SUB REGIONS: All Regions

ROOT CAUSE: The root cause of the initial impact was a software bug in a
class of network device used in multiple regions which incorrectly handled a
spike in network traffic. This resulted in incorrect identification of
legitimate DNS requests as malformed, including requests from Azure services
to resolve the DNS names of any internal endpoint or external endpoint to
Azure from within Azure. The subsequent Azure SQL Database issue in Central US
region was triggered by a large number of requests arriving before the
Azure SQL Database service had recovered enough to process them,
which resulted in an availability impact to the Azure SQL Database service in
Central US region. Azure SQL Database and DW and their customers make extensive
use of DNS. This is because the connection path to Azure SQL Database and DW
requires 2 DNS lookups. All Azure SQL Database and DW connection requests are
initially handled by an Azure hosted service called the control ring. This is
the IP address referenced by the DNS record
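
Since the notification says every connection depends on DNS, a quick client-side check is to see whether the server name resolves at all and how long the lookup takes. The following is only a minimal sketch of that idea: the function name `resolve_with_timing`, the placeholder server name in the usage comment, and the use of port 1433 are my own assumptions, not something taken from the notification, and it checks only the first lookup of the server name.

```python
import socket
import time


def resolve_with_timing(hostname, port=1433, resolver=socket.getaddrinfo):
    """Resolve hostname and report how long the lookup took.

    Returns (addresses, elapsed_seconds). addresses is an empty list
    when the lookup fails. The resolver argument is a hypothetical hook
    so the function can be exercised without real network access.
    """
    start = time.monotonic()
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address. De-duplicate and sort them.
        addresses = sorted({entry[4][0] for entry in results})
    except socket.gaierror:
        addresses = []
    return addresses, time.monotonic() - start


# Example usage ("myserver" is a placeholder, not a real server name):
#   addrs, elapsed = resolve_with_timing("myserver.database.windows.net")
#   print(addrs or "lookup failed", f"({elapsed:.3f}s)")
```

An empty address list or an unusually slow lookup during an incident like this one would point at name resolution rather than at the database itself.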
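
The notification also says that Central US could not recover because requests arrived faster than the service could handle them. On the client side, one common way to avoid adding to that kind of pile-up is to retry with exponential backoff and random jitter instead of retrying immediately. This is a rough sketch under my own assumptions: `call_with_backoff` and its parameters are hypothetical names, and it is not what Microsoft did on the service side.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call operation(), retrying failures with exponential backoff and jitter.

    sleep and rng are injectable (an assumption of this sketch) so the
    retry schedule can be checked without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Real code should catch only errors known to be transient.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The wait window doubles on each failure up to max_delay.
            # Sleeping a random fraction of it spreads clients out so
            # they do not all retry at the same instant.
            window = min(max_delay, base_delay * (2 ** attempt))
            sleep(window * rng())
```

Wrapping the code that opens the database connection in `call_with_backoff` means a recovering service sees retries spread out over time rather than a burst of simultaneous reconnects.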