Microsoft reveals how it’s scaling Azure during the pandemic

17 Jun 2020

Microsoft has described its efforts to manage the increased demand for cloud platform Azure during the Covid-19 pandemic.

On Tuesday (16 June), Microsoft Stories writer Jennifer Langston detailed how the company has responded to the surge in usage of its Azure cloud business.

“Cloud providers like Microsoft Azure are by their nature designed to expand and scale quickly and meet elastic demand,” she wrote. “With more than 60 data centre regions around the globe – including three new regions announced this past May in Italy, New Zealand and Poland – Microsoft can shift traffic if a natural disaster or power outage affects capacity in one part of the world.”

In April, Azure published a blogpost about business continuity, saying it planned to scale up and meet demand for its cloud services as schools moved online, workers left offices and individuals turned to video calls and video games to socialise with friends.

It also expanded a cloud-based Hospital Emergency Response Solution run on Azure to help US hospitals cope with the health crisis.

Although the company may not have predicted that the crisis would last this long worldwide, Langston suggested that it was prepared for demand spikes because it keeps hardware stockpiled in warehouses, “ready to be deployed” when needed.

Scaling Azure

In the latest blogpost, Langston mentioned some of the Microsoft staff who helped the company scale its technology in recent months.

For instance, Aarthi Natarajan, partner director of engineering for Microsoft Teams, helped spread Teams’ capacity across additional data centre regions to share the load from the unprecedented demand.

In March, Microsoft Teams saw record daily meeting minutes, with 2.7bn minutes logged in a single day, up from 900m minutes two weeks earlier. In April, that figure climbed to 4.1bn meeting minutes in a single day.

Elsewhere, a team run by software engineer Sneha Shankar had to quickly scale Windows Virtual Desktop, which saw usage triple.

And Mahesh Nayak, principal programme manager at Azure’s Wide Area Network (WAN), worked with his team to ensure that the company’s fibre optic network that carries data around the globe was up to the task of meeting demand. The WAN team added 110 terabits of capacity within two months.

Through the company’s Azure Active Directory team, Microsoft also helped companies offer employees working from home secure remote access to on-premises corporate apps through the Azure AD Application Proxy.

Langston outlined how engineer John Sheehan led efforts to find efficiencies in Microsoft services running on Azure to free up capacity for existing customers and organisations working on the frontline of the Covid-19 response.

Working around the clock

Microsoft Teams used early data from China and Italy to plan for expected growth as the pandemic spread into other regions. Every Sunday night, Microsoft employees gathered remotely to look for bottlenecks and troubleshoot as the situation evolved in Europe and the US.

Mark Simms, a partner software architect managing the Covid-19 response across Azure, commented: “The scope and scale of the response to Covid-19 was completely unprecedented, in terms of how much of the world went digital inside a month.

“So the work that we had to do to get through the initial surge in demand and free up capacity for our customers to run critical health and safety workloads was also unprecedented.”

Simms said that “pretty profound” changes were made in a “very short time frame” at Azure. Data centre employees began working around-the-clock shifts to install new servers while staying at least six feet apart.

Microsoft product teams worked to find any further efficiencies to free up Azure resources for other customers. The company doubled capacity on one of its own undersea cables carrying data across the Atlantic and said it also negotiated with owners of another subsea cable to open up additional capacity.

Langston wrote: “Network engineers installed new hardware and tripled the deployed capacity on the America Europe Connect cable in just two weeks.”

The company described how Xbox, which runs on Azure, is accustomed to ramping up for large surges in usage that typically occur around holidays or new video game releases. With distancing restrictions put in place around the world, there was a 50pc increase in multiplayer gameplay and a 30pc increase in peak concurrent usage.

The company worked to move gaming workloads out of high-demand data centres in the UK and Asia, to free up capacity for other Azure customers without negatively impacting the experiences of its own users.

Casey Jacobs, who manages reliability for Xbox operations, said: “There’s no question in those regions the people who were on the frontlines of Covid-19 efforts really needed that capacity more than us. And our telemetry gave us confidence that we could make these trade-offs while protecting our customer experience.”

Kelly Earley was a journalist with Silicon Republic