Python 2.7 has reached end of support and will be deprecated on January 31, 2026. After deprecation, you won't be able to deploy Python 2.7 applications, even if your organization previously used an organization policy to re-enable deployments of legacy runtimes. Your existing Python 2.7 applications will continue to run and receive traffic after their deprecation date. We recommend that you migrate to the latest supported version of Python.

Memcache API for legacy bundled services

This page provides an overview of the App Engine memcache service. High-performance scalable web applications often use a distributed in-memory data cache in front of or in place of robust persistent storage for some tasks. App Engine includes a memory cache service for this purpose. To learn how to configure, monitor, and use the memcache service, read Using Memcache.

Note: The cache is global and is shared across the application's frontend, backend, and all of its services and versions. This API is supported for first-generation runtimes and can be used when upgrading to corresponding second-generation runtimes. If you are updating to the App Engine Python 3 runtime, refer to the migration guide to learn about your migration options for legacy bundled services.

When to use a memory cache

One use of a memory cache is to speed up common datastore queries. If many requests make the same query with the same parameters, and changes to the results do not need to appear on the web site right away, the application can cache the results in the memcache. Subsequent requests can check the memcache, and only perform the datastore query if the results are absent or expired. Session data, user preferences, and other data returned by queries for web pages are good candidates for caching.

Memcache can be useful for other temporary values. However, when considering whether to store a value solely in the memcache and not backed by other persistent storage, be sure that your application behaves acceptably when the value is suddenly not available. Values can expire from the memcache at any time, and can be expired prior to the expiration deadline set for the value. For example, if the sudden absence of a user's session data would cause the session to malfunction, that data should probably be stored in the datastore in addition to the memcache.
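The read-through pattern described above looks roughly like the following sketch: check memcache first, query the datastore only on a miss, and tolerate a failed cache write. The `get_user_prefs` function, the `UserPrefs` model, and the key format are hypothetical stand-ins for your own code.

```python
from google.appengine.api import memcache
from google.appengine.ext import ndb


class UserPrefs(ndb.Model):
    # Hypothetical model, used here only for illustration.
    theme = ndb.StringProperty()


def get_user_prefs(user_id):
    key = 'prefs:%s' % user_id
    prefs = memcache.get(key)
    if prefs is None:
        # Cache miss: fall back to the datastore.
        prefs = UserPrefs.get_by_id(user_id)
        # set() returns False if the write fails; the app works either way.
        memcache.set(key, prefs, time=600)
    return prefs
```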

Service levels

App Engine supports two levels of the memcache service:

  • Shared memcache is the free default for App Engine applications. It provides cache capacity on a best-effort basis and is subject to the overall demand of all the App Engine applications using the shared memcache service.

  • Dedicated memcache provides a fixed cache capacity assigned exclusively to your application. It's billed by the GB-hour of cache size and requires billing to be enabled. Having control over cache size means your app can perform more predictably and with fewer reads from more costly durable storage.

Both memcache service levels use the same API. To configure the memcache service for your application, see Using Memcache.

Note: Whether shared or dedicated, memcache is not durable storage. Keys can be evicted when the cache fills up, according to the cache's LRU policy. Changes in the cache configuration or datacenter maintenance events can also flush some or all of the cache.

The following table summarizes the differences between the two classes of memcache service:

Feature | Dedicated Memcache | Shared Memcache
Price | $0.06 per GB per hour | Free
Capacity | us-central: 1 to 100 GB; asia-northeast1, europe-west, europe-west3, and us-east1: 1 to 20 GB; other regions: 1 to 2 GB | No guaranteed capacity
Performance | Up to 10k reads or 5k writes (exclusive) per second per GB (items < 1 KB). For more details, see Cache statistics. | Not guaranteed
Durable store | No | No
SLA | None | None

Dedicated memcache billing is charged in 15-minute increments. If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

If your app needs more memcache capacity, contact our Sales team.

Limits

The following limits apply to the use of the memcache service:

  • The maximum size of a cached data value is 1 MB (10^6 bytes).
  • A key cannot be larger than 250 bytes. In the Python runtime, keys that are strings longer than 250 bytes will be hashed. (Other runtimes behave differently.)
  • The "multi" batch operations can have any number of elements. The total size of the call and the total size of the data fetched must not exceed 32 megabytes (see the sketch after this list).
  • A memcache key cannot contain a null byte.
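As a brief illustration of the batch ("multi") operations mentioned above, the following sketch writes and reads several small values in single calls; the key names, values, and prefix are hypothetical.

```python
from google.appengine.api import memcache

# Write several small items in one call. set_multi() returns the list of
# keys that were NOT stored, which is empty on full success.
failed = memcache.set_multi({'k1': 'v1', 'k2': 'v2', 'k3': 'v3'},
                            key_prefix='demo:', time=300)

# Read them back in one call. get_multi() returns a dict containing only
# the keys that were found in the cache.
values = memcache.get_multi(['k1', 'k2', 'k3'], key_prefix='demo:')
```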

Recommendations and best practices

When using Memcache, we recommend that you design your applications to:

  • Handle the case where a cached value is not always available.

    • Memcache is not durable storage. According to the eviction policy, keys are evicted when the cache fills up. Changes in the cache configuration or datacenter maintenance events can also flush some or all of the cache.
    • Memcache may experience temporary unavailability. Memcache operations can fail for various reasons including changes in cache configuration or datacenter maintenance events. Applications should be designed to catch failed operations without exposing these errors to end users. This guidance applies especially to Set operations.
  • Use the batching capability of the API when possible.

    • Doing so increases the performance and efficiency of your app, especially for small items.
  • Distribute load across your memcache keyspace.

    • Having a single or small set of memcache items represent a disproportionate amount of traffic will hinder your app from scaling. This guidance applies to both operations/sec and bandwidth. You can often alleviate this problem by explicitly sharding your data.

      For example, you can split a frequently updated counter among several keys, reading them back and summing only when you need a total (see the sketch after this list). Likewise, you can split a 500K piece of data that must be read on every HTTP request across multiple keys and read them back using a single batch API call. (Even better would be to cache the value in instance memory.) For dedicated memcache, the peak access rate on a single key should be 1-2 orders of magnitude less than the per-GB rating.

  • Retain your own keys in order to retrieve values from the cache.

    • Memcache does not provide a method to list keys. Due to the nature of the cache, it is not possible to list keys without disrupting the cache. Additionally, some languages, like Python, hash long keys, and the original keys are only known to the application.
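The sketch below illustrates the counter-sharding idea from the list above: increments are spread across several keys so that no single key becomes hot, and the shards are summed in one batch read. The shard count, key format, and helper names are hypothetical.

```python
import random

from google.appengine.api import memcache

NUM_SHARDS = 20  # Hypothetical; size this to your traffic.


def increment_counter(name):
    # Spread increments across shards so no single key takes all the writes.
    shard_key = '%s:%d' % (name, random.randint(0, NUM_SHARDS - 1))
    memcache.incr(shard_key, initial_value=0)


def get_counter(name):
    # Read all shards in one batch call and sum them only when needed.
    keys = ['%s:%d' % (name, i) for i in range(NUM_SHARDS)]
    return sum(memcache.get_multi(keys).values())
```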

How cached data expires

Memcache contains key/value pairs. The pairs in memory at any time change as items are written and retrieved from the cache.

By default, values stored in memcache are retained as long as possible. Values can be evicted from the cache when a new value is added to the cache and the cache is low on memory. When values are evicted due to memory pressure, the least recently used values are evicted first.

The app can provide an expiration time when a value is stored, as either a number of seconds relative to when the value is added, or as an absolute Unix epoch time in the future (a number of seconds from midnight January 1, 1970). The value is evicted no later than this time, though it can be evicted earlier for other reasons. Incrementing the value stored for an existing key does not update its expiration time.
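For example, an expiration can be passed through the time parameter when a value is stored; the key and value here are placeholders.

```python
import time

from google.appengine.api import memcache

# Expire roughly one hour after the value is written (relative seconds).
memcache.set('weather:nyc', 'sunny', time=3600)

# Or expire at an absolute Unix epoch time in the future.
memcache.set('weather:nyc', 'sunny', time=int(time.time()) + 3600)
```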

Under rare circumstances, values can also disappear from the cache prior to expiration for reasons other than memory pressure. While memcache is resilient to server failures, memcache values are not saved to disk, so a service failure can cause values to become unavailable.

In general, an application should not expect a cached value to always be available.

You can erase an application's entire cache via the API or in the memcache section of the Google Cloud console.
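Programmatically, that is a single call, as in the sketch below; use it sparingly because it clears every key for the application.

```python
from google.appengine.api import memcache

# Flush all keys in the application's cache; returns True on success.
memcache.flush_all()
```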

Note: The actual removal of expired cache data is handled lazily. An expired item is removed when someone unsuccessfully tries to retrieve it. Alternatively, the expired cache data falls out of the cache according to LRU cache behavior, which applies to all items, both live and expired. This means when the cache size is reported in statistics, the number can include live and expired items.

Cache statistics

Operations per second by item size

Note: This information applies to dedicated memcache only.

Dedicated memcache is rated in operations per second per GB, where an operation is defined as an individual cache item access, such as a get, set, or delete. The operation rate varies by item size approximately according to the following table. Exceeding these ratings might result in increased API latency or errors.

The following tables provide the maximum number of sustained, exclusive get-hit or set operations per GB of cache. Note that a get-hit operation is a get call that finds that there is a value stored with the specified key, and returns that value.

Item size (KB) | Maximum get-hit ops/s | Maximum set ops/s
≤1 | 10,000 | 5,000
100 | 2,000 | 1,000
512 | 500 | 250

An app configured for multiple GB of cache can in theory achieve an aggregate operation rate computed as the number of GB multiplied by the per-GB rate. For example, an app configured for 5 GB of cache could reach 50,000 memcache operations/sec on 1 KB items. Achieving this level requires a good distribution of load across the memcache keyspace.

For each IO pattern, the limits listed above are for reads or writes. For simultaneous reads and writes, the limits are on a sliding scale. The more reads being performed, the fewer writes can be performed, and vice versa. Each of the following are example IOPs limits for simultaneous reads and writes of 1 KB values per 1 GB of cache:

Read IOPs | Write IOPs
10,000 | 0
8,000 | 1,000
5,000 | 2,500
1,000 | 4,500
0 | 5,000

Memcache compute units (MCU)

Note: This information applies to dedicated memcache only.

Memcache throughput can vary depending on the size of the item you are accessing and the operation you want to perform on the item. You can roughly associate a cost with operations and estimate the traffic capacity that you can expect from dedicated memcache using a unit called Memcache Compute Unit (MCU). MCU is defined such that you can expect 10,000 MCU per second per GB of dedicated memcache. The Google Cloud console shows how much MCU your app is currently using.

Note that MCU is a rough statistical estimation and also it's not a linear unit. Each cache operation that reads or writes a value has a corresponding MCU cost that depends on the size of the value. The MCU for a set depends on the value size: it is 2 times the cost of a successful get-hit operation.

Note: The way that Memcache Compute Units (MCU) are computed is subject to change.
Value item size (KB) | MCU cost for get-hit | MCU cost for set
≤1 | 1.0 | 2.0
2 | 1.3 | 2.6
10 | 1.7 | 3.4
100 | 5.0 | 10.0
512 | 20.0 | 40.0
1024 | 50.0 | 100.0

Operations that do not read or write a value have a fixed MCU cost:

Operation | MCU
get-miss | 1.0
delete | 2.0
increment | 2.0
flush | 100.0
stats | 100.0

Note that a get-miss operation is a get that finds that there is no value stored with the specified key.
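To make the cost model concrete, here is a hypothetical back-of-the-envelope estimate based on the tables above; the traffic mix is invented for illustration.

```python
# Hypothetical traffic mix: 1,000 get-hits/s and 500 sets/s on 10 KB values.
GET_HIT_MCU = 1.7   # cost per get-hit from the 10 KB row above
SET_MCU = 3.4       # cost per set from the 10 KB row above

mcu_per_second = 1000 * GET_HIT_MCU + 500 * SET_MCU
print(mcu_per_second)  # 3400.0, within the 10,000 MCU/s rating of 1 GB
```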

Compare and set

Compare and set is a feature that allows multiple requests that are being handled concurrently to update the value of the same memcache key atomically, avoiding race conditions.

Note: For a complete discussion of the compare and set feature for Python, see Guido van Rossum's blog post Compare-And-Set in Memcache.

Key logical components of compare and set

If you're updating the value of a memcache key that might receive other concurrent write requests, you must use the memcache Client object, which stores certain state information that's used by the methods that support compare and set. You cannot use the memcache functions get() or set(), because they are stateless. The Client class itself is not thread-safe, so you should not use the same Client object in more than one thread.

When you retrieve keys, you must use the memcache Client methods that support compare and set: gets() or get_multi() with the for_cas parameter set to True.

When you update a key, you must use the memcache Client methods that support compare and set: cas() or cas_multi().

The other key logical component is the App Engine memcache service and its behavior with regard to compare and set. The App Engine memcache service itself behaves atomically. That is, when two concurrent requests (for the same app id) use memcache, they will go to the same memcache service instance, and the memcache service has enough internal locking so that concurrent requests for the same key are properly serialized. In particular this means that two cas() requests for the same key do not actually run in parallel -- the service handles the first request that came in until completion (that is, updating the value and timestamp) before it starts handling the second request.
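Putting these pieces together, the usual pattern is a retry loop along the following lines (a minimal sketch; the key name and the increment-by-one update are placeholders for your own logic):

```python
from google.appengine.api import memcache


def bump_counter(key):
    client = memcache.Client()  # The Client keeps the timestamp state that cas() needs.
    while True:
        counter = client.gets(key)          # Read the value and remember its cas timestamp.
        if counter is None:
            # Key doesn't exist yet: add() creates it only if it is still absent.
            client.add(key, 0)
            continue                        # Re-read so gets() records a timestamp.
        if client.cas(key, counter + 1):    # Write succeeds only if nobody changed it meanwhile.
            return
        # cas() returned False: another request updated the key first; retry.
```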

To learn how to use compare and set in Python, read Handling concurrent writes.

What's next

