How Nokia is Modernising Data Centre Networks to Cut Outages

A data centre migration switch inside a rack (Credit: Nokia)
Nokia is automating its data centre network fabric and integrating SR Linux, cutting outages that once lasted up to nine hours and stabilising services

Nokia is restructuring the networks inside its own data centres after persistent outages exposed the cost of running outdated infrastructure.

At one point, every hour of network outage inside Nokia’s factories cost about US$100,000, creating pressure on the company to rethink how its internal networks operate.

Those outages occurred in data centre environments where failures could last up to nine hours, which placed losses close to the million-dollar mark during a single event.
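As a quick check on those figures, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the hourly cost and the nine-hour maximum come from the article, while the function and constant names are made up for the example and a flat hourly rate is assumed.

```python
# Figures quoted in the article; a flat hourly rate is an assumption.
HOURLY_OUTAGE_COST_USD = 100_000
MAX_OUTAGE_HOURS = 9


def outage_cost(hours, hourly_cost=HOURLY_OUTAGE_COST_USD):
    """Estimate the loss from one outage, assuming cost grows linearly with time."""
    return hours * hourly_cost


worst_case = outage_cost(MAX_OUTAGE_HOURS)
print(f"Worst-case single-event loss: US${worst_case:,.0f}")
```

Under these assumptions a nine-hour outage comes to US$900,000, which is the "close to the million-dollar mark" figure.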

A joint report from Futurum Group and Nokia explained the technical weakness of the earlier setup. Previously, if the network skipped a beat for just two seconds, the database collapsed. Recovery then took more than two hours, placing strain on employees and business operations.

This was an unsustainable situation for a company that designs telecoms networks for operators around the world, so Nokia responded by consolidating its legacy infrastructure into a modern automated data centre fabric designed around its own networking technology.

Nokia's data centre fabric is a network of switches that work together to provide infrastructure for connecting traditional and AI-based applications installed on servers (Credit: Nokia)

Replacing legacy infrastructure with automated fabric

Nokia rebuilt its internal network around its own data centre switches running the SR Linux network operating system. These switches form the foundation of a new data centre fabric that links servers and applications within a facility.

Management and automation are handled by the Nokia Event-Driven Automation (EDA) platform, delivered through a Software-as-a-Service (SaaS) model. In contrast to traditional, routine operations, the system reacts automatically to network conditions or policy triggers rather than relying on manual intervention from engineers.

Moving to SaaS relieves engineers of manually upgrading servers and managing the underlying infrastructure. Instead of maintaining the platform, teams can concentrate on the network design and automation workflows that shape how services behave across the infrastructure.

Mitch Ashley, VP and Practice Lead of Software Lifecycle Engineering at Futurum, explains the goal behind the redesign on Nokia’s blog:

“Nokia wanted to create an environment where network changes could be made without hesitation, validated before deployment and carried out by teams who trusted both the system and the process.”

Mitch Ashley, VP and Practice Lead of Software Lifecycle Engineering at Futurum Group (Credit: Futurum)

He continues: “That meant shifting to an architecture built around reliable automation, intent-based configuration and integrated change validation.

“And it meant designing for scale from day one – so that lessons learned in a single site could be carried forward globally.”

Phased deployment across global data centres

Rather than replacing every system at once, Nokia IT rolled out the new architecture through phased migration so engineers could test the fabric's behaviour in live environments.

The first sites to see the changes were dual data centre sites in Europe that had been operating on legacy vendor infrastructure. This initial deployment led to an 80% reduction in network-related incident reports, which means engineers and IT staff spend less time troubleshooting and the company's costs fall.

Nokia then expanded the new approach to its US data centres, with trained staff overseeing the process.

According to the Futurum and Nokia report, the migration took place without any unplanned outages caused by the new network. One reason is the automation platform's digital twin capability: validation inside the digital twin helps prevent the configuration errors that often trigger outages in complex networks.
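The general idea of validating a change against a copy of the network before touching the live one can be sketched as follows. This is a minimal illustration and not Nokia's EDA API: the data model, the function names `validate` and `apply_change`, the interface names and the single duplicate-address check are all hypothetical.

```python
from copy import deepcopy


def validate(config):
    """Return a list of problems in a candidate config (empty list means valid).

    Hypothetical check: no two interfaces may share the same address.
    """
    errors = []
    seen = {}
    for iface, addr in config["interfaces"].items():
        if addr in seen:
            errors.append(f"{iface} and {seen[addr]} share address {addr}")
        seen[addr] = iface
    return errors


def apply_change(live, change):
    """Try a change on a copy (the 'twin') and promote it only if it validates."""
    twin = deepcopy(live)              # the live config is never modified in place
    twin["interfaces"].update(change)
    errors = validate(twin)
    if errors:
        return live, errors            # reject: keep the live config as it was
    return twin, []                    # accept: the validated copy becomes live


# Illustrative starting state (names and addresses are made up).
live = {"interfaces": {"ethernet-1/1": "10.0.0.1/31", "ethernet-1/2": "10.0.0.3/31"}}

# A change that reuses an existing address is caught before deployment.
live, errs = apply_change(live, {"ethernet-1/3": "10.0.0.1/31"})
print(errs)

# A clean change passes validation and is applied.
live, errs = apply_change(live, {"ethernet-1/3": "10.0.0.5/31"})
print(sorted(live["interfaces"]))
```

In this sketch a faulty change never reaches the live configuration, which is the property the report attributes to validation in the digital twin.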


Operational gains and reliability improvements

After Nokia deployed the new fabric in its data centre networks in Europe and the US, recurring outages linked to older manufacturing applications were eliminated.

One application, which had run for about 20 years, failed almost every month on the legacy network. After migration to the SR Linux and EDA-based fabric, those disruptions disappeared, according to the report from Nokia and Futurum.

Routine tasks now run through high-level intent definitions inside the automation platform, allowing engineers to focus on specialised network work rather than repetitive manual configuration.

Nokia’s IT organisation expects the model to scale globally. Once deployment completes across all facilities, the company anticipates that a relatively small team will manage data centre networking operations worldwide through the automated platform, which would not have been feasible before the changes.

For a telecoms company whose technologies underpin operator networks, stabilising the infrastructure behind its own operations is a necessary step in maintaining efficient internal systems and reducing costly downtime.
