Disaster recovery for OpenShift on Google Cloud

Disaster recovery (DR) is essential for maintaining the continuity of your applications that are deployed on the OpenShift Container Platform on Google Cloud. This document provides an overview of the architectural options for DR with OpenShift on Google Cloud, helping your organization achieve minimal downtime and rapid recovery in the event of a disaster.

This document is intended for system administrators, cloud architects, and application developers who are responsible for maintaining the availability and resilience of applications on the OpenShift Container Platform deployed on Google Cloud.

This document is part of a series that focuses on the application-level strategies that ensure your workloads remain highly available and quickly recoverable in the face of failures. The documents in this series are as follows:

DR planning

Planning for DR is a critical component of running production workloads in the cloud. Although OpenShift and Google Cloud offer robust infrastructure-level redundancy, you must also design and configure your applications to quickly recover from catastrophic failures.

Effective DR planning involves a layered approach. You begin by defining clear recovery time objectives (RTO) and recovery point objectives (RPO) for your application and system, so that both can be redeployed rapidly.
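As a rough illustration of how RTO and RPO targets can be checked against a recovery plan, the following sketch compares a worst-case data-loss window and an estimated recovery time with the stated objectives. All names, step labels, and durations here are hypothetical and chosen only for illustration. The sketch assumes that recovery steps run one after another and that the worst case for data loss is a failure just before the next backup completes.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets for one application (illustrative values only)."""
    rto: timedelta  # maximum tolerated downtime
    rpo: timedelta  # maximum tolerated window of lost data


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta) -> timedelta:
    """Assume the failure occurs just before the next backup completes."""
    return backup_interval + backup_duration


def estimated_recovery_time(steps: dict[str, timedelta]) -> timedelta:
    """Sum the recovery steps, assuming they run sequentially."""
    return sum(steps.values(), timedelta())


def meets_objectives(objectives: RecoveryObjectives,
                     backup_interval: timedelta,
                     backup_duration: timedelta,
                     steps: dict[str, timedelta]) -> bool:
    """Return True only if both the RPO and the RTO are satisfied."""
    loss = worst_case_data_loss(backup_interval, backup_duration)
    downtime = estimated_recovery_time(steps)
    return loss <= objectives.rpo and downtime <= objectives.rto


# Hypothetical plan: the step names and durations are assumptions.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
recovery_steps = {
    "create cluster": timedelta(minutes=45),
    "restore data": timedelta(minutes=60),
    "redeploy applications": timedelta(minutes=20),
}
plan_ok = meets_objectives(
    objectives,
    backup_interval=timedelta(minutes=30),
    backup_duration=timedelta(minutes=10),
    steps=recovery_steps,
)
```

In this example the worst-case data loss is 40 minutes and the estimated recovery time is 125 minutes, so the plan fits within both objectives. Lengthening the backup interval to 60 minutes would push the worst-case loss past the one-hour RPO.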

Finally, your secrets and credentials must also be recoverable and securely managed. By considering all of these factors, you can achieve a DR posture that lets you quickly create a new OpenShift cluster in a different region or fail over to an inactive secondary cluster. This secondary cluster remains offline until a failure occurs, at which point it is started and brought online to take over operations with minimal downtime.

Architectures for DR

There are different options for deployment architectures that you can use for DR with OpenShift on Google Cloud. Each of these options has different implications for cost, complexity, and availability. The following table provides an overview of these architectures:

| Architecture | Description | Use case | Advantages | Disadvantages |
|---|---|---|---|---|
| Active-passive | One cluster is active, handling all traffic, and the other is passive and ready to take over. Data is replicated to the passive cluster. | Suitable for applications with moderate RTO and RPO requirements. | Simpler to implement, lower cost for standby cluster. | Higher RTO due to failover time, potential data sync delays. |
| Active-inactive | Similar to active-passive, but the inactive cluster is not used until a DR event. Data is regularly backed up. | Ideal for cost-sensitive environments that allow for higher RTO and RPO. | Lower operational cost when inactive, suitable for DR where a secondary system is not actively running (cold DR). | Higher RTO due to activation and sync time, and restored data might be out of date. |
| Active-active | Both clusters are active, handling traffic with load balancing and data replication between regions. | Critical applications requiring minimal downtime and high availability. | Lowest RTO and RPO, continuous availability. | Highest complexity and cost, requires robust network and data syncs. |
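The trade-offs between these three architectures can be sketched as a simple decision helper that maps RTO and RPO targets to a candidate architecture. The cut-off values below are hypothetical and are not recommendations from this document. In practice the choice also depends on cost and operational complexity, which this sketch ignores.

```python
from datetime import timedelta
from enum import Enum


class Architecture(Enum):
    ACTIVE_ACTIVE = "active-active"
    ACTIVE_PASSIVE = "active-passive"
    ACTIVE_INACTIVE = "active-inactive"


# Hypothetical cut-offs, chosen only to illustrate the trade-off.
NEAR_ZERO = timedelta(minutes=5)
MODERATE = timedelta(hours=4)


def suggest_architecture(rto: timedelta, rpo: timedelta) -> Architecture:
    """Suggest a candidate architecture from the stricter of the two targets."""
    strictest = min(rto, rpo)
    if strictest <= NEAR_ZERO:
        # Minimal downtime and data loss: both clusters serve traffic.
        return Architecture.ACTIVE_ACTIVE
    if strictest <= MODERATE:
        # Moderate targets: a replicated standby cluster takes over.
        return Architecture.ACTIVE_PASSIVE
    # Relaxed targets: restore from backups into an offline (cold) cluster.
    return Architecture.ACTIVE_INACTIVE


critical = suggest_architecture(timedelta(minutes=1), timedelta(minutes=1))
standard = suggest_architecture(timedelta(hours=1), timedelta(minutes=15))
batch = suggest_architecture(timedelta(hours=24), timedelta(hours=12))
```

With these assumed thresholds, `critical` resolves to active-active, `standard` to active-passive, and `batch` to active-inactive.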


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.