High availability and replicas

This page explains how Memorystore for Valkey's architecture provides high availability (HA). It also describes recommended configurations that improve instance performance and stability.

Note: For more information about region-specific considerations, see Geography and regions.

High availability

Memorystore for Valkey is built on a highly available architecture where your clients access managed Memorystore for Valkey nodes directly. Your clients do this by connecting to individual endpoints, as described in Connect to a Memorystore for Valkey instance.

Connecting to shards directly provides the following benefits:

  • Direct connection avoids intermediate hops, which minimizes the round-trip time (client latency) between your client and the Valkey node.

  • In Cluster Mode Enabled, direct connection avoids any single point of failure because each shard is designed to fail independently. For example, if traffic from multiple clients overloads a slot (keyspace chunk), shard failure limits the impact to the shard responsible for serving the slot.
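To make the slot concept concrete, here is a minimal, stdlib-only sketch of how cluster clients map keys to the 16384 slots (CRC16/XMODEM over the key, or over its `{hash tag}` if one is present). This is illustrative client-side logic, not Memorystore-specific code:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the checksum Valkey uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 cluster slots, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # non-empty tag between the first { and the next }
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384

# Keys sharing a hash tag land in the same slot, hence on the same shard.
print(hash_slot(b"{user1000}.followers") == hash_slot(b"{user1000}.following"))  # True
```

Because each slot belongs to exactly one shard, overload or failure triggered by traffic to one slot stays confined to the shard serving that slot.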

Recommended configurations

We recommend creating highly available multi-zone instances as opposed to single-zone instances because of the better reliability they provide. However, if you choose to provision an instance without replicas, we recommend choosing a single-zone instance. For more information, see When to use a single-zone instance.

To enable high availability for your instance, you must provision at least 1 replica node for every shard. You can do this when creating the instance, or you can scale the replica count to at least 1 replica per shard. Replicas provide automatic failover during planned maintenance and unexpected shard failure.

You should configure your client according to the guidance in Client best practices. Using recommended best practices allows your client to handle the following changes for your instance automatically and without any downtime:

  • Role changes (automatic failovers)

  • Endpoint changes (node replacement)

  • Slot assignment changes in Cluster Mode Enabled (cluster scale out and in)

Replicas

A highly available Memorystore for Valkey instance is a regional resource. Memorystore for Valkey distributes the primary and replica VMs of shards across multiple zones to safeguard against a zonal outage. Memorystore for Valkey supports instances with 0-5 replicas per node.

You can use replicas to increase read throughput at the cost of potential data staleness.

  • Cluster Mode Enabled: Use the READONLY command to establish a connection that allows your client to read from replicas.
  • Cluster Mode Disabled: Connect to the reader endpoint to connect to any of the available replicas.
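At the protocol level, READONLY is an ordinary command that a client sends once on a replica connection before issuing reads. As a sketch, the stdlib-only helper below shows the RESP encoding a client would put on the wire (no connection is made; this is illustrative, not Memorystore-specific code):

```python
def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings, as a Valkey client would."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)

# What a cluster client sends on a replica connection before issuing reads:
print(encode_resp("READONLY"))  # b'*1\r\n$8\r\nREADONLY\r\n'
```

In practice you rarely encode RESP by hand; cluster-aware client libraries issue READONLY for you when configured to read from replicas.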

Cluster Mode Enabled instance shapes

The following diagrams illustrate shapes for Cluster Mode Enabled instances:

Instance shape with three shards and zero replicas per node

A Memorystore for Valkey Cluster Mode Enabled instance with no replicas that has nodes divided evenly across three zones.

Instance shape with three shards and one replica per node

A Memorystore for Valkey Cluster Mode Enabled instance with one replica per node, and nodes divided evenly across three zones.

Instance shape with three shards and multiple replicas per node

A Memorystore for Valkey Cluster Mode Enabled instance with multiple replicas per node, and nodes divided evenly across three zones.

Cluster Mode Disabled instance shapes

The following diagram illustrates a shape for Cluster Mode Disabled instances:

Instance shape with multiple replicas

A Memorystore for Valkey Cluster Mode Disabled instance with multiple replicas and nodes divided evenly across three zones.

Automatic failover

Automatic failovers within a shard can occur due to maintenance or an unexpected failure of the primary node. During a failover, a replica is promoted to be the primary. You can configure replicas explicitly. The service can also temporarily provision extra replicas during internal maintenance to avoid any downtime.

Automatic failovers prevent data loss during maintenance updates. For details about automatic failover behavior during maintenance, see Automatic failover behavior during maintenance.

Failover and node repair duration

Automatic failovers can take on the order of tens of seconds for unplanned events such as a primary node process crash or a hardware failure. During this time, the system detects the failure and elects a replica to be the new primary.

Node repair can take on the order of minutes for the service to replace the failed node. This is true for all primary and replica nodes. For instances that aren't highly available (no replicas provisioned), repairing a failed primary node also takes on the order of minutes.

Client behavior during an unplanned failover

Client connections are likely to be reset depending on the nature of the failure. After automatic recovery, connections should be retried with exponential backoff to avoid overloading primary and replica nodes.

Clients using replicas for read throughput should be prepared for a temporary degradation in capacity until the failed node is automatically replaced.
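As a sketch of that retry guidance, the helper below generates exponential backoff delays with full jitter; the base and cap values are illustrative assumptions, not Memorystore recommendations:

```python
import random

def backoff_delays(base: float = 0.1, cap: float = 30.0, attempts: int = 6):
    """Exponential backoff with full jitter: the ceiling doubles per attempt,
    capped, and the actual delay is drawn uniformly below the ceiling."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield random.uniform(0, ceiling)  # jitter spreads reconnects across clients

# Example: reconnect loop after a failover reset the connection.
for delay in backoff_delays():
    print(f"retrying in {delay:.2f}s")
    # time.sleep(delay); attempt to reconnect; break on success
```

The jitter matters as much as the exponential growth: without it, many clients whose connections were reset at the same moment would retry in lockstep and overload the newly promoted primary.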

Lost writes

During a failover resulting from an unexpected failure, acknowledged writes may be lost due to the asynchronous nature of Valkey's replication protocol.

Client applications can use the Valkey WAIT command to improve real-world data safety. WAIT blocks the client until a specified number of replicas have acknowledged the preceding writes, or until a timeout elapses.
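As a sketch of that pattern, the helper below issues a write and then blocks on WAIT until a given number of replicas acknowledge it. The client object is duck-typed here, and `FakeClient` is an in-memory stand-in used only to show the call shape; with redis-py, `Redis.set` and `Redis.wait` have matching signatures:

```python
def set_durably(client, key: str, value: str, min_replicas: int = 1,
                timeout_ms: int = 100) -> bool:
    """SET, then WAIT for replica acks. Returns False if too few replicas
    acknowledged before the timeout; the write may still be lost on failover."""
    client.set(key, value)
    acked = client.wait(min_replicas, timeout_ms)
    return acked >= min_replicas

# Minimal in-memory stand-in for a client, just to show the call shape.
class FakeClient:
    def __init__(self, replicas_that_ack: int):
        self.store, self._acks = {}, replicas_that_ack
    def set(self, key, value):
        self.store[key] = value
    def wait(self, num_replicas, timeout_ms):
        return min(self._acks, num_replicas)

print(set_durably(FakeClient(replicas_that_ack=1), "k", "v"))  # True
print(set_durably(FakeClient(replicas_that_ack=0), "k", "v"))  # False
```

Note that WAIT narrows the window for lost writes but does not eliminate it; replication remains asynchronous.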

Keyspace impact of a single zone outage

This section describes the impact of a single-zone outage on a Memorystore for Valkey instance.

Multi-zone instances

  • HA instances: If a zone has an outage, the entire keyspace remains available for reads and writes, but because some read replicas are unavailable, read capacity is reduced. We strongly recommend over-provisioning cluster capacity so that the instance has enough read capacity in the rare event of a single-zone outage. Once the outage is over, replicas in the affected zone are restored and the read capacity of the cluster returns to its configured value. For more information, see Patterns for scalable and reliable apps.

  • Non-HA instances (no replicas): If a zone has an outage, the portion of the keyspace that is provisioned in the affected zone undergoes a data flush and is unavailable for writes or reads for the duration of the outage. Once the outage is over, primaries in the affected zone are restored and the capacity of the cluster returns to its configured value.

Single-zone instances

  • Both HA and Non-HA instances: If the zone that the instance is provisioned in has an outage, the cluster is unavailable and data is flushed. If a different zone has an outage, the cluster continues to serve read and write requests.

Best practices

This section describes best practices for high availability and replicas.

Add a replica

Adding a replica requires an RDB snapshot. RDB snapshots use a process fork and a copy-on-write mechanism to take a snapshot of node data. Depending on the pattern of writes to nodes, the used memory of the nodes grows as pages touched by the writes are copied. The memory footprint can be up to double the size of the data in the node.

To ensure that nodes have sufficient memory to complete the snapshot, keep or set maxmemory at 80% of the node capacity so that 20% is reserved for overhead. This memory overhead, in addition to monitoring snapshots, helps you manage your workload to have successful snapshots. Also, when you add replicas, lower write traffic as much as possible. For more information, see Monitor memory usage for an instance.
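As a rough illustration of the 80% guideline, the helper below computes a maxmemory value from a node's capacity; the 10 GiB figure is an example, not a specific Memorystore node size:

```python
def recommended_maxmemory_bytes(node_capacity_gib: float) -> int:
    """Keep maxmemory at 80% of node capacity, reserving 20% for
    copy-on-write overhead during RDB snapshots."""
    gib = 1024 ** 3
    return int(node_capacity_gib * 0.8 * gib)

# For an example 10 GiB node: 8 GiB usable, 2 GiB reserved for snapshot overhead.
print(recommended_maxmemory_bytes(10))  # 8589934592
```

The 20% headroom is what absorbs the copy-on-write page duplication described above while the fork-based snapshot is in progress.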

Note: If you add a replica to an instance that uses more than 80% of the instance's maximum memory, then the operation fails and you receive an error message.

To resolve this issue, reduce your instance's memory usage. After your instance's memory usage is below the 80% threshold, add the replica again.


Last updated 2026-02-19 UTC.