Data center outages: More money, more problems

If companies have invested more in redundancy efforts, why are data center outages happening more often, lasting longer and costing more to resolve? Uptime Institute’s latest report points to the rise of complicated, interconnected IT ecosystems, as well as a host of non-IT issues.

The share of data centers reporting an outage within the past three years has reached an all-time high, and the time and money spent on fixing outages have risen as well.

Figures from the 2021 edition of the Uptime Institute’s annual outage study were presented during a recent webinar led by Rhonda Ascierto, vice-president of research at the institute, and Chris Brown, the institute’s chief technical officer.

While last year was a tough one for data centers in many respects, 2022 could be another challenging period due to non-IT factors such as war, inflation and global supply chain hiccups.

Key numbers

“I think it’s fair to say, looking across all our data, that despite strong investment and attention, outage (numbers) per site and per company are certainly not falling,” Ascierto said.

Although 48 per cent of data centers say they invested more money last year in power redundancy and 39 per cent put more money into cooling systems, the number of outages actually went up. In 2021, 80 per cent of data centers reported suffering an outage within the last three years, a record high for Uptime’s annual survey.

In addition, outages lasted longer in 2021. The share of outages lasting 48 hours or longer hit 16 per cent, four times the four per cent reported in 2017. Outages also got more expensive: 47 per cent of them cost between $100,000 and $1 million, up from 40 per cent in 2020.

More money, more problems

If companies invested more in redundancy efforts last year, why did outages happen more often, last longer and cost more to resolve? Ascierto believes the rise of complicated, interconnected IT ecosystems plays a role.

“It’s fair to say that the scale of IT systems, particularly for commercial service providers, is phenomenally high. Complexity is increasing. And we are seeing interdependencies amongst IT systems, which means there are more secondary and cascading failures. All of this contributes to these lengthy recovery times, so of course, longer outages are also likely to cost more,” she explained.

Brown put some of the blame on aging, both of the facilities themselves and of the people who run them.

“We have data centers that are just aging. A lot of these older data centers are getting to stages where very expensive equipment would start to fail. Once it fails, it’s going to take a lot of money to replace things such as an engine generator,” he said. “The other reason that we’ve been talking about for years is that in the data center industry, our staffing and our skill sets are aging. People are getting too close to retirement. Some are retiring and we’re having trouble replacing them.”

Why outages happen

These were the top causes of “significant” outages in 2021 (second, third and fourth place were a three-way tie):

  • power (43%)
  • software or IT systems error (14%)
  • network issue (14%)
  • cooling issue (14%)
  • third-party service provider (5%)

Brown noted that it “just hasn’t proven to be feasible yet” to fully power data center operations via renewable energy sources like wind or solar.

“Some data centers are using solar arrays to trim their demand during the high-cost times of electrical power. They can’t power the whole data center with it, but they can reduce the demand at the most expensive times, at the peak times that it’s really needed.”

Where outages happen

Here are the three vertical sectors that suffered the most data center outages in 2021:

  • cloud provider/Internet giant (26% of all outages)
  • digital services (25%)
  • financial services (15%)

The high number of outages in the top two sectors illustrates the risk that comes with the “interdependencies amongst IT systems” described earlier by Ascierto. When cloud providers, Internet giants or digital service providers suffer outages, the domino effect can be felt far and wide among all the smaller players who rely on them.

“These are core parts of the Internet that are increasingly owned by fewer organizations,” Ascierto said. “There’s been a real growth in our dependency on these digital providers for basic infrastructure that we need for the Internet to be resilient and to enable our business continuity. So for me, the concentration risk is something that I think is a growing trend and one that’s pretty troubling.”

Brown’s advice for dealing with this risk?

“People have to be masters of their own domain. It doesn’t matter if you’ve outsourced or contracted to another company for a service or decided to use a cloud provider. You have to make the effort and remember that when there’s a problem, the only one who’s going to be looking out for you is you,” he said.

Future forecast

Although the pandemic eased up somewhat at the start of 2022, non-IT factors may cast a shadow over the outage forecast during the rest of this year.

“We have inflation, we have supply chain issues that seem to be getting better in some areas but worse in others. We’ve had the whole world sort of upset by war,” Brown pointed out.

He added that supply chain problems are forcing many data centers to move away from using standardized IT and equipment, which could escalate the risk of outages.

“Any time you move away from standardization or what your organization is accustomed to, you run risks because you don’t know all the issues that could arise,” he warned. “I don’t know if it’s going to result in more outages but I think it’s going to result in more problems.”

Images: PeopleImages/iStock; Svitlana Hulko/iStock
