Microsoft faced another test of its cloud infrastructure resilience this week after reporting two Azure service disruptions within two days, with each incident highlighting the tightly integrated nature of modern cloud systems. These disruptions were related to core management and identity services, which had a ripple effect on a large number of downstream products.
The latest outage hit Managed Identity for Azure resources. Customers in the East US and West US regions felt it most. Microsoft said the issue began at around 0015 UTC and lasted until approximately 0605 UTC, stretching close to six hours. During that window, users experienced failures when attempting to create, update, delete, or request tokens through the service. Managed Identity plays a critical role in Azure environments by allowing applications to authenticate without manually handling secrets or certificates, which makes any interruption particularly disruptive.
As a result, several Azure services that rely on managed identities also experienced problems. These included Azure Synapse Analytics, Azure Databricks, Azure Stream Analytics, Azure Kubernetes Service, Microsoft Copilot Studio, Azure Container Apps, and Azure AI Video Indexer, among others. Microsoft confirmed that it mitigated the issue, although it did not immediately provide a detailed breakdown of the underlying cause.
The identity outage followed closely after a separate incident involving virtual machine management operations. In that case, customers reported errors when performing routine actions such as creating, starting, stopping, scaling, or deleting virtual machines. Microsoft acknowledged the issue at 1946 UTC and later traced it to a configuration change that restricted public access to certain Microsoft managed storage accounts used for hosting virtual machine extension packages.
Because many services depend on virtual machine management operations, the impact extended beyond compute alone. Azure Arc Enabled Servers, Azure DevOps, Azure Kubernetes Service, Azure Backup, Azure Firewall, Azure Virtual Machine Scale Sets, and even GitHub Actions experienced degraded performance during the incident. GitHub reported issues shortly after 1900 UTC and did not fully resolve them until early the next day.
Taken together, these two outages highlight the ripple effects that exist when cloud infrastructure is highly interconnected. A misconfiguration or problem in one region can easily manifest in unrelated products, making it difficult to resolve. For customers, these disruptions serve as a reminder that even the largest cloud providers cannot fully shield their platforms from internal changes and errors, particularly when core services depend on shared components.
