Best practices for Memorystore for Valkey

This page provides guidance on using Memorystore for Valkey optimally. This page also points out potential issues to avoid.

Memory management best practices

This section describes strategies for managing instance memory so that Memorystore for Valkey works efficiently for your application.

Memory management concepts

  • Memory usage: the amount of memory that your instance uses. You have a fixed memory capacity. You can use metrics to monitor how much memory you're using.

  • Eviction policy: Memorystore for Valkey uses the volatile-lru eviction policy. You can use Valkey commands like the EXPIRE command to set evictions for keys.

Monitor memory usage for an instance

To monitor the memory usage for a Memorystore for Valkey instance, we recommend that you view the /instance/memory/maximum_utilization metric. If the memory usage of the instance approaches 80% and you expect data usage to grow, then scale up the size of the instance to make room for new data.
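The scale-up rule above can be expressed as a small check. This is only a sketch: the function name is hypothetical, and it assumes that the metric value is read as a fraction between 0 and 1 (so 80% is 0.80).

```python
def should_scale_up(max_utilization: float, growth_expected: bool,
                    threshold: float = 0.80) -> bool:
    """Return True when memory utilization has reached the threshold
    and the data set is expected to keep growing.

    Assumption: max_utilization is a fraction between 0 and 1.
    """
    if not 0.0 <= max_utilization <= 1.0:
        raise ValueError("max_utilization must be a fraction between 0 and 1")
    return growth_expected and max_utilization >= threshold
```

In practice, you would probably alert somewhat below the threshold so that there is time to scale before the instance is under memory pressure.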

If the instance has high memory usage, then do the following to improve performance:

If you run into issues, then contact Google Cloud Customer Care.

Scale shards in Cluster Mode Enabled

When you scale the number of shards in an instance, we recommend that you scale during periods of low writes. Scaling during periods of high usage can put memory pressure on your instance because of memory overhead that's caused by replication or slot migration.

If your Valkey use case uses key evictions, then scaling to a smaller instance size can reduce your cache hit ratio. In this circumstance, however, you don't need to worry about losing data, since key eviction is expected.

For Valkey use cases where you don't want to lose keys, you should only scale down to a smaller instance that still has enough room for your data. Your new target shard count should provide at least 1.5 times the memory that your data uses. You can use the /instance/memory/total_used_memory metric to see how much data is stored in your instance.
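As a worked example of the 1.5 times rule, the following sketch computes the smallest shard count that keeps that headroom. The function name and the per-shard capacity are assumptions for illustration. Use the usable capacity of your actual node type.

```python
import math


def min_shards_for_data(used_bytes: int, shard_capacity_bytes: int,
                        headroom: float = 1.5) -> int:
    """Smallest shard count whose total capacity is at least
    headroom times the memory currently used by data."""
    if used_bytes < 0 or shard_capacity_bytes <= 0:
        raise ValueError("used_bytes must be >= 0 and capacity must be > 0")
    return max(1, math.ceil(used_bytes * headroom / shard_capacity_bytes))


GIB = 1024 ** 3
# Hypothetical numbers: 40 GiB of data, 13 GiB usable per shard.
print(min_shards_for_data(40 * GIB, 13 * GIB))
```

With these hypothetical numbers, 40 GiB of data needs 60 GiB of capacity, which rounds up to five shards of 13 GiB each.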

CPU usage best practices

An unexpected zonal outage reduces the CPU resources that are available to your instance, because capacity is lost from nodes in the unavailable zone. We recommend using highly available instances. Using multiple replicas per shard (as opposed to one replica per shard) provides additional CPU resources during an outage. You can have up to five replicas per shard.

Additionally, we recommend managing node CPU usage so that nodes have enough CPU overhead to handle additional traffic from lost capacity if an unexpected zonal outage happens. You should monitor CPU usage for primaries and replicas using the Main Thread CPU Seconds (/instance/cpu/maximum_utilization) metric.

Depending on the number of replicas you provision per node, we recommend the following /instance/cpu/maximum_utilization CPU usage targets:

  • For instances with one replica per node, target a /instance/cpu/maximum_utilization value of 0.5 seconds for the primary and 0.5 seconds for the replica.
  • For instances with two replicas per node or greater, target a /instance/cpu/maximum_utilization value of 0.9 seconds for the primary and 0.5 seconds for each replica.

If values for the metric exceed these recommendations, then we recommend scaling up the number of shards in your instance. If you have fewer than five replicas for your instance, then you can also scale up the number of replicas, up to a maximum of five replicas.
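The targets above can be encoded as a lookup for use in an alerting script. This is a minimal sketch: the function names are hypothetical, and the observed value is assumed to be the metric reading in the same units as the targets.

```python
from typing import Dict


def cpu_targets(replica_count: int) -> Dict[str, float]:
    """Recommended maximum_utilization targets by role, keyed on
    how many replicas are provisioned (values from the list above)."""
    if replica_count < 1:
        raise ValueError("targets are defined for one or more replicas")
    if replica_count == 1:
        return {"primary": 0.5, "replica": 0.5}
    return {"primary": 0.9, "replica": 0.5}


def exceeds_target(role: str, observed: float, replica_count: int) -> bool:
    """True when the observed value is above the target for the role."""
    return observed > cpu_targets(replica_count)[role]
```

For example, `exceeds_target("primary", 0.7, replica_count=1)` flags the primary, while the same reading with two replicas stays within the target.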

Resource-intensive Valkey commands

We strongly recommend that you avoid using Valkey commands that are resource-intensive. Using these commands might result in the following performance issues:

  • High latency and client timeouts
  • Memory pressure caused by commands that increase memory usage
  • Data loss during node replication and synchronization because the Valkey main thread is blocked
  • Starved health checks, observability, and replication

The following table lists examples of Valkey commands that are resource-intensive and provides you with alternatives that are resource-efficient.

Tip: These commands are resource-intensive, and the cost of using them increases as your total data size increases.

To find which long-running commands you use, use SLOWLOG. This tool provides you with a list of commands that take the longest to run. As a result, you know the commands that cause latency issues. To determine resource-efficient alternatives for these commands, refer to the following table.

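| Category | Resource-intensive command | Resource-efficient alternative |
|---|---|---|
| Run for the entire keyspace | KEYS | SCAN |
| Run for a variable-length keyset | LRANGE | Limit the size of the range that you use for a query. |
| Run for a variable-length keyset | ZRANGE | Limit the size of the range that you use for a query. |
| Run for a variable-length keyset | HGETALL | HSCAN |
| Run for a variable-length keyset | SMEMBERS | SSCAN |
| Block the running of a script | EVAL | Ensure that your script doesn't run indefinitely. |
| Block the running of a script | EVALSHA | Ensure that your script doesn't run indefinitely. |
| Remove files and links | DEL | UNLINK |
| Publish and subscribe | PUBLISH | SPUBLISH |
| Publish and subscribe | SUBSCRIBE | SSUBSCRIBE |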
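As a rough illustration of replacing KEYS with SCAN, the following sketch walks the keyspace in small batches by following the SCAN cursor until the server returns 0. It assumes a client object that exposes a `scan(cursor=..., match=..., count=...)` method returning a `(next_cursor, keys)` pair, which is the shape that common Python clients use. `FakeClient` is a hypothetical in-memory stand-in so that the example runs without a server.

```python
from fnmatch import fnmatchcase
from typing import Iterator, List, Optional, Tuple


def iter_keys(client, match: str = "*", count: int = 100) -> Iterator[str]:
    """Yield matching keys batch by batch instead of in one blocking call."""
    cursor = 0
    while True:
        cursor, batch = client.scan(cursor=cursor, match=match, count=count)
        yield from batch
        if cursor == 0:  # a cursor of 0 means the iteration is complete
            break


class FakeClient:
    """Hypothetical stand-in that mimics cursor-based SCAN over a list."""

    def __init__(self, keys: List[str]) -> None:
        self._keys = list(keys)

    def scan(self, cursor: int = 0, match: Optional[str] = None,
             count: int = 10) -> Tuple[int, List[str]]:
        end = cursor + count
        window = self._keys[cursor:end]
        batch = [k for k in window if match is None or fnmatchcase(k, match)]
        return (end if end < len(self._keys) else 0, batch)


client = FakeClient(["user:1", "user:2", "order:1", "user:3"])
print(list(iter_keys(client, match="user:*", count=2)))
```

Each SCAN call does a bounded amount of work, so other commands can run between batches. The same cursor pattern applies to HSCAN and SSCAN.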

Valkey client best practices

Avoid connection overload on Valkey

To mitigate the impact caused by a sudden influx of connections, we recommend the following:

  • Determine the client connection pool size that's best for you. A good starting size for each client is one connection per Valkey node. You can then benchmark to see if more connections help without saturating the maximum allowed connection count.

  • When the client disconnects from the server because the server times out, retry with exponential backoff with jitter. This helps to avoid multiple clients overloading the server simultaneously.
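The retry recommendation above can be sketched as a delay generator. This version uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function name and the default values are assumptions for illustration, not recommended settings.

```python
import random
from typing import Iterator, Optional


def backoff_delays(base: float = 0.1, cap: float = 10.0, attempts: int = 6,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield one randomized delay (in seconds) per retry attempt.

    The ceiling doubles on every attempt and is limited by cap.
    The random draw spreads out clients that disconnected together.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0.0, ceiling)


for delay in backoff_delays(attempts=3):
    # A real client would sleep for the delay and then try to reconnect.
    print(f"retry after {delay:.3f}s")
```

Without the random component, clients that lost their connections at the same moment would all retry at the same moments too.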

For Cluster Mode Enabled instances

Your application must use a cluster-aware Valkey client when connecting to a Memorystore for Valkey Cluster Mode Enabled instance. For examples of cluster-aware clients and sample configurations, see Client library code samples. Your client must maintain a map of hash slots to the corresponding nodes in the instance to send requests to the correct nodes. This prevents performance overhead that's caused by redirections.
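To make the slot map concrete, here is a minimal sketch of what a cluster-aware client does for every command. Valkey cluster assigns each key to one of 16384 hash slots by taking the CRC16 of the key (or of its hash tag, the text between the first `{` and the next `}`). The client then looks up which node owns that slot. The `SlotMap` class, the three slot ranges, and the node addresses are hypothetical. A real client library handles this for you.

```python
import binascii
from bisect import bisect_right
from typing import List, Tuple

SLOT_COUNT = 16384


def key_slot(key: str) -> int:
    """Hash slot for a key: CRC16 (XMODEM) modulo 16384, honoring hash tags."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag replaces the full key
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


class SlotMap:
    """Maps inclusive slot ranges to node addresses."""

    def __init__(self, ranges: List[Tuple[int, int, str]]) -> None:
        self._ranges = sorted(ranges)
        self._starts = [first for first, _, _ in self._ranges]

    def node_for_key(self, key: str) -> str:
        slot = key_slot(key)
        index = bisect_right(self._starts, slot) - 1
        if index >= 0:
            _, last, node = self._ranges[index]
            if slot <= last:
                return node
        raise KeyError(f"no node owns slot {slot}")


# Hypothetical three-shard layout with made-up addresses.
slot_map = SlotMap([
    (0, 5460, "10.0.0.1:6379"),
    (5461, 10922, "10.0.0.2:6379"),
    (10923, 16383, "10.0.0.3:6379"),
])
print(key_slot("foo"), slot_map.node_for_key("foo"))
```

Because the lookup is local, a client with an up-to-date map sends each command straight to the owning node instead of being redirected.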

Client mapping

Clients must obtain a complete list of slots and the mapped nodes in the following situations:

  • When the client is initialized, it must populate the initial slot to nodes mapping.

  • When a MOVED redirection is received from the server, such as in the situation of a failover when all slots served by the former primary node are taken over by the replica, or re-sharding when slots are being moved from the source primary to the target primary node.

  • When a CLUSTERDOWN error is received from the server or connections to a particular server run into timeouts persistently.

  • When a READONLY error is received from the server. This can happen when a primary is demoted to a replica.

  • Additionally, clients should periodically refresh the topology to keep the clients warmed up for any changes and learn about changes that may not result in redirections or errors from the server, such as when new replica nodes are added. Note that any stale connections should also be closed as part of the topology refresh to reduce the need to handle failed connections during command runtime.
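The error-driven cases in the list above can be sketched as a single predicate. This is an illustration only: the function name and the timeout limit are assumptions, and it inspects the raw error text that the server returns, whereas a client library might surface these conditions as dedicated exception types. Initialization and periodic refreshes are scheduled separately and aren't covered here.

```python
REFRESH_ERRORS = ("MOVED", "CLUSTERDOWN", "READONLY")


def needs_topology_refresh(error_message: str,
                           consecutive_timeouts: int = 0,
                           timeout_limit: int = 3) -> bool:
    """Decide whether a server error or repeated timeouts should
    trigger a new slot-to-node discovery."""
    if consecutive_timeouts >= timeout_limit:
        return True
    words = error_message.split()
    # The first word of a server error reply identifies the error kind.
    return bool(words) and words[0] in REFRESH_ERRORS


print(needs_topology_refresh("MOVED 3999 127.0.0.1:6381"))
```

A refresh that this predicate triggers should still go through whatever rate limiting the client applies to discovery requests.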

Client discovery

Client discovery is usually done by issuing a CLUSTER SLOTS, CLUSTER NODES, or CLUSTER SHARDS command to the Valkey server. We recommend using the CLUSTER SHARDS command. CLUSTER SHARDS replaces the deprecated CLUSTER SLOTS command by providing a more efficient and extensible representation of the instance.

The size of the response for the client discovery commands can vary based on the instance size and topology. Larger instances with more nodes produce a larger response. As a result, it's important to ensure that the number of clients doing the node topology discovery doesn't grow unbounded.

These node topology refreshes are expensive on the Valkey server but are also important for application availability. Therefore, it's important to ensure that each client makes only a single discovery request at any given time (and caches the result in memory), and that the number of clients making requests is kept bounded to avoid overloading the server.

For example, when the client application starts up or loses its connection to the server and must perform node discovery, one common mistake is that the client application makes several reconnection and discovery requests without adding exponential backoff upon retry. This can cause very high CPU utilization and render the Valkey server unresponsive for a prolonged period of time.
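One way to keep each client to a single in-flight discovery request with a cached result is to guard the fetch with a lock. The following is a minimal sketch under stated assumptions: `TopologyCache` is a hypothetical name, `fetch` stands for whatever function issues the discovery command, and the 30-second maximum age is an arbitrary illustration.

```python
import threading
import time
from typing import Any, Callable, Optional


class TopologyCache:
    """Caches the discovery result and lets only one thread fetch at a time."""

    def __init__(self, fetch: Callable[[], Any],
                 max_age_seconds: float = 30.0) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._lock = threading.Lock()
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self, force: bool = False) -> Any:
        requested_at = time.monotonic()
        with self._lock:
            if self._fetched_at is not None:
                # Another thread finished a fetch while this one waited.
                refreshed_while_waiting = self._fetched_at > requested_at
                fresh = time.monotonic() - self._fetched_at < self._max_age
                if refreshed_while_waiting or (fresh and not force):
                    return self._value
            self._value = self._fetch()
            self._fetched_at = time.monotonic()
            return self._value
```

Threads that ask for the topology while a fetch is running wait on the lock and then reuse the fresh result, so a burst of errors in one process produces one discovery request rather than one per thread. Pass `force=True` after an error such as MOVED to bypass the age check.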

Use a discovery endpoint for node discovery

Use the Memorystore for Valkey discovery endpoint to perform node discovery. The discovery endpoint is highly available and is load balanced across all the nodes in the instance. Moreover, the discovery endpoint attempts to route the node discovery requests to nodes with the most up-to-date topology view.

For Cluster Mode Disabled instances

When connecting to a Cluster Mode Disabled instance, your application must connect to the primary endpoint to write to the instance and to retrieve the most recent writes. Your application can also connect to the reader endpoint to read from replicas and to isolate traffic from the primary node.

If you use the create-before-destroy strategy when you perform maintenance on your instance, then you might receive the following error message:

READONLY You can't write against a read only replica.

To resolve this issue, close the connection to your instance. Then, recreate the connection.
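The close-and-recreate step can be automated around a write. This sketch is client-agnostic and rests on assumptions: `connect` is a hypothetical factory that opens a connection to the primary endpoint, and the READONLY condition is detected from the error text. A client library may instead raise a dedicated exception type, in which case you would catch that type.

```python
from typing import Any, Callable


def run_with_reconnect(connect: Callable[[], Any],
                       command: Callable[[Any], Any],
                       max_reconnects: int = 1) -> Any:
    """Run command(conn). On a READONLY error, close the connection,
    open a new one, and try again, up to max_reconnects times."""
    conn = connect()
    for attempt in range(max_reconnects + 1):
        try:
            return command(conn)
        except Exception as exc:
            # Assumption: the error text starts with the server's READONLY reply.
            if not str(exc).startswith("READONLY") or attempt == max_reconnects:
                raise
            close = getattr(conn, "close", None)
            if callable(close):
                close()
            conn = connect()
```

For example, `run_with_reconnect(open_primary, lambda c: c.set("k", "v"))` retries the write once on a fresh connection, where `open_primary` is your own connection factory.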

Persistence best practices

This section explains best practices for persistence.

RDB persistence and adding replicas

For best results when you back up your instance with RDB snapshots or add replicas to your instance, use the following best practices:

Memory management

RDB snapshots use a process fork and a copy-on-write mechanism to take a snapshot of node data. Depending on the pattern of writes to nodes, the used memory of the nodes grows as pages touched by the writes are copied. The memory footprint can be up to double the size of the data in the node.

To ensure that nodes have sufficient memory to complete the snapshot, keep or set maxmemory at 80% of the node capacity so that 20% is reserved for overhead. This memory overhead, in addition to monitoring snapshots, helps you manage your workload to have successful snapshots. Also, when you add replicas, lower write traffic as much as possible. For more information, see Monitor memory usage for an instance.
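As a quick worked example of the 80% guidance, the following sketch derives a maxmemory value from a node's capacity. The function name and the 10 GiB capacity are hypothetical. Integer arithmetic is used so that the result is exact.

```python
def maxmemory_bytes(node_capacity_bytes: int, reserve_percent: int = 20) -> int:
    """maxmemory value that leaves reserve_percent of the node for overhead."""
    if node_capacity_bytes <= 0 or not 0 <= reserve_percent < 100:
        raise ValueError("capacity must be > 0 and reserve must be in [0, 100)")
    return node_capacity_bytes * (100 - reserve_percent) // 100


GIB = 1024 ** 3
# Hypothetical 10 GiB node: 8 GiB for data, 2 GiB reserved for overhead.
print(maxmemory_bytes(10 * GIB) // GIB)
```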

Note: If you add a replica to an instance that uses more than 80% of the instance's maximum memory, then the operation fails and you receive an error message.

To resolve this issue, reduce your instance's memory usage in one of the following ways:

After your instance's memory usage is below the 80% threshold, add the replica again.

Stale snapshots

Recovering nodes from a stale snapshot can cause performance issues for your application as it tries to reconcile a significant number of stale keys or other changes to your database, such as a schema change. If you are concerned about recovering from a stale snapshot, you can disable the RDB persistence feature. Once you re-enable persistence, a snapshot is taken at the next scheduled snapshot interval.

Performance impact of RDB snapshots

Depending on your workload pattern, RDB snapshots can impact the performance of the instance and increase latency for your applications. You can minimize the performance impact of RDB snapshots by scheduling them to run during periods of low instance traffic if you are comfortable with less frequent snapshots.

For example, if your instance has low traffic from 1 AM to 4 AM, you can set the start time to 3 AM and set the interval to 24 hours.
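To check that a schedule keeps landing inside the low-traffic window, you can compute upcoming snapshot times. This sketch assumes that snapshots occur at the start time plus whole multiples of the interval. The function name and the dates are made up, and naive datetimes are used for brevity.

```python
from datetime import datetime, timedelta


def next_snapshot(now: datetime, start: datetime,
                  interval: timedelta) -> datetime:
    """First scheduled snapshot time at or after now."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if now <= start:
        return start
    # Ceiling division: number of whole intervals needed to reach now.
    periods = -(-(now - start) // interval)
    return start + periods * interval


start = datetime(2026, 1, 1, 3, 0)   # 3 AM start time
interval = timedelta(hours=24)       # one snapshot per day
print(next_snapshot(datetime(2026, 1, 5, 12, 0), start, interval))
```

With a 24-hour interval, every snapshot falls at 3 AM. An interval that doesn't divide 24 hours evenly would drift out of the window on later days.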

If your system has a constant load and requires frequent snapshots, then we recommend that you carefully evaluate the performance impact and weigh the benefits of using RDB snapshots for the workload.

Add a replica

Adding a replica requires an RDB snapshot. For more information about RDB snapshots, see Memory management.

When to use a single-zone instance

If you configure an instance so that it doesn't use replicas, then we recommend that you use a single-zone instance. Here's why:

Cost and performance

If your primary drivers are minimizing cost and getting peak performance for clients that are located in the same region, then we recommend that you choose a single-zone instance.

Minimize your outage impact

When you choose a single-zone instance, zonal outages are less likely to impact your instance. By placing all nodes within a single zone, the chance of a zonal outage affecting your server drops from 100% to 33%. There's a 33% chance that the zone where your instance is located goes down, as opposed to a 100% chance that the nodes located in the unavailable zone are impacted.

Rapid recovery

If a zonal outage occurs for a single-zone instance, then Memorystore for Valkey streamlines the recovery of your data. You can provision a new instance in a functioning zone quickly and redirect your application for minimally interrupted operations.

Enable Transport Layer Security (TLS)

This section explains the security benefits and performance implications of using Transport Layer Security (TLS), along with recommendations for its enablement.

Security benefits

By using TLS, you get the following security benefits:

  • Identity and Access Management (IAM) authentication: TLS uses this type of authentication to protect against server spoofing attacks, such as person-in-the-middle attacks.
  • In-transit encryption: Google Cloud's built-in encryption protects traffic within Google's network at an infrastructure level. However, this involves trusting both Google's host and network stacks. Although this encryption is transparent and enabled by default, it's not end-to-end. On the other hand, TLS uses in-transit encryption at the application layer. This end-to-end encryption gives you more control over your encryption keys and processes.
  • Authentication token protection: If you use IAM authentication, then enabling TLS minimizes the risk of exposing and leaking your authentication tokens.

Performance implications

TLS impacts performance in the following ways:

  • Establish connections: A client and server that have established a TLS session can resume the session without repeating the resource-intensive process of establishing the connection between the client and the server. By enabling TLS resumption, you reduce the overhead of establishing a connection between the client and the server.

    If you don't establish TLS resumption, then establishing connections is resource-intensive. For both new and existing connections, many connections between the client and the server might lead to connection timeouts. This can cause a snowball effect because Memorystore for Valkey attempts to re-establish timed-out connections, which increases the resources it uses to establish connections.

  • Encrypt and decrypt data: Data encryption and decryption involve CPU-intensive operations that impact both the client and the server. This can reduce your instance's capacity and increase the instance's latency.

Recommendations

When considering whether to enable TLS, we recommend that you evaluate your security policies while considering the benefits and drawbacks of TLS. If you choose to enable TLS, then keep the following considerations in mind:

  • Enabling TLS resumption mitigates overhead for establishing connections. A full handshake between the client and the server is required only for the initial connection. However, a sudden expansion of the client's instance size might result in a brief disruption that's caused by each new client host's initial full handshake.
  • Although some client libraries might not offer built-in controls to enable TLS, you can use custom code to integrate this functionality into your instances.


Last updated 2026-02-19 UTC.