Use metrics to diagnose latency

This page describes the latency metrics that Spanner provides. If your application experiences high latency, use these metrics to help you diagnose and resolve the issue.

You can view these metrics in the Google Cloud console and in the Cloud Monitoring console.

Overview of latency metrics

The latency metrics for Spanner measure how long it takes for the Spanner service to process a request. The metrics capture the actual amount of time that elapses, not the amount of CPU time that Spanner uses.

These latency metrics do not include latency that occurs outside of Spanner, such as network latency or latency within your application layer. To measure other types of latency, you can use Cloud Monitoring to instrument your application with custom metrics.

You can view charts of latency metrics in the Google Cloud console and in the Cloud Monitoring console. You can view combined latency metrics that include both reads and writes, or you can view separate metrics for reads and writes.

Based on the latency of each request, Spanner groups the requests into percentiles. You can view latency metrics for 50th percentile and 99th percentile latency:

  • 50th percentile latency: The maximum latency, in seconds, for the fastest 50% of all requests. For example, if the 50th percentile latency is 0.5 seconds, then Spanner processed 50% of requests in less than 0.5 seconds.

    This metric is sometimes called the median latency.

  • 99th percentile latency: The maximum latency, in seconds, for the fastest 99% of requests. For example, if the 99th percentile latency is 2 seconds, then Spanner processed 99% of requests in less than 2 seconds.

Latency and operations per second

When an instance processes a small number of requests during a period of time, the 50th and 99th percentile latencies during that time are not meaningful indicators of the instance's overall performance. Under these conditions, a very small number of outliers can drastically change the latency metrics.

For example, suppose that an instance processes 100 requests during an hour. In this case, the 99th percentile latency for the instance during that hour is the amount of time it took to process the slowest request. A latency measurement based on a single request is not meaningful.
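This sensitivity to outliers is easy to demonstrate. In the following sketch (with made-up numbers), a single slow request among 100 otherwise identical fast requests dominates the 99th percentile:

```python
import statistics

# 100 hypothetical requests: all fast, versus 99 fast plus one slow outlier.
fast_only = [0.05] * 100
with_outlier = [0.05] * 99 + [4.0]

p99_fast = statistics.quantiles(fast_only, n=100)[98]
p99_outlier = statistics.quantiles(with_outlier, n=100)[98]

# One request out of 100 moves p99 from 0.05s to nearly 4s.
print(f"p99 without outlier: {p99_fast:.2f}s")
print(f"p99 with outlier:    {p99_outlier:.2f}s")
```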

How to diagnose latency issues

The following sections describe how to diagnose several common issues that could cause your application to experience high end-to-end latency.

For a quick look at an instance's latency metrics, use the Google Cloud console. To examine the metrics more closely and find correlations between latency and other metrics, use the Cloud Monitoring console.

High total latency, low Spanner latency

If your application experiences latency that is higher than expected, but the latency metrics for Spanner are significantly lower than the total end-to-end latency, there might be an issue in your application code. If your application has a performance issue that causes some code paths to be slow, the total end-to-end latency for each request might increase.

To check for this issue, benchmark your application to identify code paths that are slower than expected.

You can also comment out the code that communicates with Spanner, then measure the total latency again. If the total latency doesn't change very much, then Spanner is unlikely to be the cause of the high latency.
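The comparison can be sketched as follows. This is illustrative only: query_spanner is a hypothetical stand-in for your real database call (its latency is simulated with a short sleep), and a real benchmark would exercise your actual request handler over many more runs.

```python
import time

def query_spanner():
    # Hypothetical stand-in for the real Spanner call; the sleep
    # simulates its latency for illustration.
    time.sleep(0.01)

def handle_request(use_spanner=True):
    # Application-side work that runs on every request.
    payload = sum(range(10_000))
    if use_spanner:
        query_spanner()
    return payload

def mean_latency(runs=10, **kwargs):
    # Average wall-clock time per request over several runs.
    start = time.perf_counter()
    for _ in range(runs):
        handle_request(**kwargs)
    return (time.perf_counter() - start) / runs

with_db = mean_latency(use_spanner=True)
without_db = mean_latency(use_spanner=False)

# If with_db and without_db are close, the database call is unlikely
# to be the main source of the end-to-end latency.
print(f"with Spanner call:    {with_db * 1000:.1f} ms")
print(f"without Spanner call: {without_db * 1000:.1f} ms")
```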

High total latency, high Spanner latency

If your application experiences latency that is higher than expected, and the Spanner latency metrics are also high, there are a few likely causes:

  • Your instance needs more compute capacity. If your instance does not have enough CPU resources, and its CPU utilization exceeds the recommended maximum, then Spanner might not be able to process your requests quickly and efficiently.

  • Some of your queries cause high CPU utilization. If your queries do not take advantage of Spanner features that improve efficiency, such as query parameters and secondary indexes, or if they include a large number of joins or other CPU-intensive operations, the queries can use a large portion of the CPU resources for your instance.

To check for these issues, use the Cloud Monitoring console to look for a correlation between high CPU utilization and high latency. Also, check the query statistics for your instance to identify any CPU-intensive queries during the same time period.
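As a starting point for that correlation check, the sketch below builds Cloud Monitoring filter strings of the kind you could pass to the monitoring_v3 MetricServiceClient.list_time_series method. The metric type names and the instance ID my-instance are assumptions for illustration; verify them against the metric descriptors in your own project.

```python
def spanner_metric_filter(metric_type: str, instance_id: str) -> str:
    # Cloud Monitoring filter syntax: match one metric type and restrict
    # it to a single Spanner instance by its resource label.
    return (
        f'metric.type = "{metric_type}" AND '
        f'resource.labels.instance_id = "{instance_id}"'
    )

# Assumed Spanner metric types -- check them in your project's metrics list.
cpu_filter = spanner_metric_filter(
    "spanner.googleapis.com/instance/cpu/utilization", "my-instance")
latency_filter = spanner_metric_filter(
    "spanner.googleapis.com/api/request_latencies", "my-instance")

print(cpu_filter)
print(latency_filter)
```

Fetching both time series over the same interval lets you plot CPU utilization against latency and see whether spikes line up.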

If you find that CPU utilization and latency are both high at the same time, take action to address the issue. For example, add compute capacity to the instance, or optimize the queries that cause high CPU utilization.


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.