Eliminate single points of failure in your critical power system



All products deteriorate over time – it’s one of those unavoidable facts of life. What’s particularly annoying is that any system is only ever as reliable as the least reliable of its components. One famous example is the story of how the failure of a 46-cent computer chip raised a false alarm of an incoming missile attack on the United States. Thankfully, this false alarm was quickly identified for what it was, but the incident teaches us a lesson: unless your system is designed to cope with internal component malfunctions, you risk system failure.

This is precisely why critical power systems exist: If grid power fails, they provide backup power for your mission-critical applications. But the 46-cent chip lesson applies to your critical power system, too. What happens if a component fails when you need it? If you cannot afford to risk the consequences of critical power system failure, you need to design the system in such a way that the failure of any single component cannot bring it down.

Single points of failure and redundancy

Any component that would cause the entire system to stop operating if it failed is called a single point of failure (SPOF). The 46-cent chip was an SPOF, and in a typical critical power system, the list of components that could become SPOFs is long and includes breakers, controllers, transformers, and communication lines.
Eliminating SPOFs is the only effective way of ensuring resilience in power systems, and the only effective way of eliminating them is to build redundancy into the power system design: in other words, to add backup or duplicate components that can take over if a component fails.


In 1980, a faulty 46-cent computer chip raised a false alarm of a missile attack on the United States. Unless your system is designed to cope with internal component malfunctions, you risk system failure.

A design approach

Achieving redundancy is a design approach in which you methodically review every critical part of the system, asking of each one, ‘What would happen if this component failed?’ If the consequences of that failure are not acceptable, providing redundancy for the component is the answer.
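
The review described above can be sketched in a few lines of Python. This is only an illustration, not a DEIF tool or method: the system model, the component names, and the helper functions (`Function`, `system_operates`, `find_spofs`) are all hypothetical. It assumes the system can be described as a list of required functions, each served by one or more interchangeable components, and it fails one component at a time to see whether the system would still operate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """A function the system needs, served by interchangeable components."""
    name: str
    providers: frozenset[str]


def system_operates(functions, failed):
    """The system runs only if every function still has a working provider."""
    return all(f.providers - failed for f in functions)


def find_spofs(functions):
    """Return the components whose individual failure stops the system."""
    components = set().union(*(f.providers for f in functions))
    return sorted(
        c for c in components
        if not system_operates(functions, frozenset({c}))
    )


# Hypothetical design: all names are illustrative assumptions.
design = [
    Function("engine start", frozenset({"starter_motor_1", "starter_motor_2"})),
    Function("power management", frozenset({"master_controller"})),
    Function("bus connection", frozenset({"breaker_A"})),
]

print(find_spofs(design))  # ['breaker_A', 'master_controller']
```

In this made-up design, the two starter motors back each other up, so neither is reported, while the single master controller and the single breaker are. A real review would also need to consider common-cause failures and dependencies between components, which this simple model ignores.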

According to results presented at the 2018 Data Center World conference, 48% of all critical data centre failures were caused by equipment failure or inadequate system design. Providing redundancy for critical components allows you to eliminate both of these risks: even if component failure does occur, the system design provides resilience and security of supply through redundancy.

More details in our FREE whitepaper

Which components to focus on depends on how your critical power system was designed, and which level of resilience you require; as a consequence, it is not possible to define a universal guideline for achieving redundancy in critical power systems.

Having said that, we have noticed that particular types of error tend to recur. Based on this, we have prepared a whitepaper listing common causes of critical power system errors and explaining how you can achieve redundancy for components such as genset starter systems, master controllers, and breakers. To achieve redundancy for genset starter systems, for example, you can equip your gensets with additional starter motors. If your genset controllers offer a double starter feature, you can also use this to ensure that the gensets start when you need them to.


Genset starter motors. Ensuring redundancy in genset starter systems is an example of how to eliminate single points of failure in the backup power system.


The whitepaper is FREE and provides a good overview of how to achieve redundancy in critical power systems by eliminating SPOFs. We hope that it will serve as a source of inspiration for you to identify and resolve any design issues in your own installations – whether they were caused by 46-cent computer chips or otherwise.

Download our FREE whitepaper on designing for redundancy

Read our blog post on the benefits of multimaster PMSes in ensuring uninterrupted critical power 

See how a leading Danish colocation provider built an N+1 critical power system with DEIF controllers 

 

About author

Rene Kristensen

Global Business Development Manager
