Use metrics to diagnose latency
This page describes the latency metrics that Spanner provides. If your application experiences high latency, use these metrics to help you diagnose and resolve the issue.
You can view these metrics in the Google Cloud console and in the Cloud Monitoring console.
Overview of latency metrics
The latency metrics for Spanner measure how long it takes for the Spanner service to process a request. The metrics capture the actual amount of time that elapses, not the amount of CPU time that Spanner uses.
These latency metrics do not include latency that occurs outside of Spanner, such as network latency or latency within your application layer. To measure other types of latency, you can use Cloud Monitoring to instrument your application with custom metrics.
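As a sketch of the kind of client-side measurement you might export as a custom metric, the following Python snippet times a code block with a hypothetical `measure_latency` helper. The helper and label names are illustrative, not a Spanner or Cloud Monitoring API; the sleep stands in for a real database call:

```python
import time
from contextlib import contextmanager

# Hypothetical helper (not a Spanner or Cloud Monitoring API): records the
# wall-clock latency of any code block under a label, so you can compare the
# end-to-end latency your application sees against Spanner's latency metrics.
@contextmanager
def measure_latency(samples, label):
    start = time.perf_counter()
    try:
        yield
    finally:
        samples.setdefault(label, []).append(time.perf_counter() - start)

samples = {}
with measure_latency(samples, "spanner_query"):
    time.sleep(0.01)  # stand-in for a real Spanner call

print(samples["spanner_query"])  # one elapsed-time sample, in seconds
```

Samples gathered this way can be exported to Cloud Monitoring as custom metrics so that application-side latency and Spanner-side latency appear on the same charts.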
You can view charts of latency metrics in the Google Cloud console and in the Cloud Monitoring console. You can view combined latency metrics that include both reads and writes, or you can view separate metrics for reads and writes.
Based on the latency of each request, Spanner groups the requests into percentiles. You can view latency metrics for 50th percentile and 99th percentile latency:
50th percentile latency: The maximum latency, in seconds, for the fastest 50% of all requests. For example, if the 50th percentile latency is 0.5 seconds, then Spanner processed 50% of requests in less than 0.5 seconds.
This metric is sometimes called the median latency.
99th percentile latency: The maximum latency, in seconds, for the fastest 99% of requests. For example, if the 99th percentile latency is 2 seconds, then Spanner processed 99% of requests in less than 2 seconds.
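To make the definition concrete, here is a small Python sketch that computes these percentiles from a list of client-side latency samples, following the definition above (maximum latency among the fastest N% of requests). `percentile_latency` is an illustrative helper, not a Spanner API:

```python
import math

def percentile_latency(latencies, pct):
    """Maximum latency among the fastest pct% of requests (the definition above)."""
    ordered = sorted(latencies)
    # Index of the slowest request that still falls within the fastest pct%.
    k = math.ceil(pct / 100 * len(ordered)) - 1
    return ordered[k]

latencies = [0.1, 0.2, 0.3, 2.0]  # seconds
print(percentile_latency(latencies, 50))  # 0.2 -- half the requests took 0.2 s or less
print(percentile_latency(latencies, 99))  # 2.0 -- dominated by the slowest request
```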
Latency and operations per second
When an instance processes a small number of requests during a period of time, the 50th and 99th percentile latencies during that time are not meaningful indicators of the instance's overall performance. Under these conditions, a very small number of outliers can drastically change the latency metrics.
For example, suppose that an instance processes 100 requests during an hour. In this case, the 99th percentile latency for the instance during that hour is the amount of time it took to process the slowest request. A latency measurement based on a single request is not meaningful.
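The effect of a single outlier on a small sample can be seen with Python's standard `statistics` module (the latency numbers below are illustrative):

```python
import statistics

# 100 requests in an hour: identical fast requests, then the same workload
# with a single 10-second outlier.
fast_only = [0.05] * 100
with_outlier = [0.05] * 99 + [10.0]

def p99(samples):
    # statistics.quantiles with n=100 returns 99 cut points; index 98 is the
    # 99th percentile.
    return statistics.quantiles(samples, n=100)[98]

print(p99(fast_only))     # 0.05 -- every request is fast
print(p99(with_outlier))  # ~9.9 -- one slow request dominates the metric
```

With only 100 requests, a single slow one moves the 99th percentile from 0.05 seconds to nearly 10 seconds, which is why percentile latencies over low-traffic windows should be treated with caution.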
How to diagnose latency issues
The following sections describe how to diagnose several common issues that could cause your application to experience high end-to-end latency.
For a quick look at an instance's latency metrics, use the Google Cloud console. To examine the metrics more closely and find correlations between latency and other metrics, use the Cloud Monitoring console.
High total latency, low Spanner latency
If your application experiences latency that is higher than expected, but the latency metrics for Spanner are significantly lower than the total end-to-end latency, there might be an issue in your application code. If your application has a performance issue that causes some code paths to be slow, the total end-to-end latency for each request might increase.
To check for this issue, benchmark your application to identify code paths that are slower than expected.
You can also comment out the code that communicates with Spanner, then measure the total latency again. If the total latency doesn't change very much, then Spanner is unlikely to be the cause of the high latency.
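A minimal sketch of this A/B measurement, with the Spanner call and the rest of the request path replaced by hypothetical stand-in functions:

```python
import time

def call_spanner():
    # Stand-in for the code that communicates with Spanner; in your
    # application, comment out or stub the real call instead.
    time.sleep(0.05)

def app_logic():
    # Stand-in for the rest of the request's code path.
    time.sleep(0.01)

def handle_request(include_spanner=True):
    start = time.perf_counter()
    app_logic()
    if include_spanner:
        call_spanner()
    return time.perf_counter() - start

with_spanner = handle_request(include_spanner=True)
without_spanner = handle_request(include_spanner=False)
# If these two numbers are close, Spanner is unlikely to be the bottleneck.
print(f"with={with_spanner:.3f}s without={without_spanner:.3f}s")
```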
High total latency, high Spanner latency
If your application experiences latency that is higher than expected, and the Spanner latency metrics are also high, there are a few likely causes:
Your instance needs more compute capacity. If your instance does not have enough CPU resources, and its CPU utilization exceeds the recommended maximum, then Spanner might not be able to process your requests quickly and efficiently.
Some of your queries cause high CPU utilization. If your queries do not take advantage of Spanner features that improve efficiency, such as query parameters and secondary indexes, or if they include a large number of joins or other CPU-intensive operations, the queries can use a large portion of the CPU resources for your instance.
To check for these issues, use the Cloud Monitoring console to look for a correlation between high CPU utilization and high latency. Also, check the query statistics for your instance to identify any CPU-intensive queries during the same time period.
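One way to check the query statistics is with a SQL query against the built-in statistics tables. The sketch below lists the most CPU-intensive queries from the ten-minute statistics table; the `SPANNER_SYS.QUERY_STATS_TOP_10MINUTE` table and its columns are described in the query statistics documentation, and per-minute and per-hour windows are also available:

```sql
-- Most CPU-intensive query shapes, from the ten-minute statistics window.
SELECT text,
       execution_count,
       avg_latency_seconds,
       avg_cpu_seconds
FROM SPANNER_SYS.QUERY_STATS_TOP_10MINUTE
ORDER BY avg_cpu_seconds DESC
LIMIT 10;
```

Run this against your database (for example, in Spanner Studio in the Google Cloud console) and compare the time period of CPU-heavy queries against your latency charts.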
If you find that CPU utilization and latency are both high at the same time,take action to address the issue:
If you did not find many CPU-intensive queries, add compute capacity to the instance.
Adding compute capacity provides more CPU resources and enables Spanner to handle a larger workload.
If you found CPU-intensive queries, review the query execution plans to learn why the queries are slow, then update your queries to follow the SQL best practices for Spanner.
You might also need to review the schema design for the database and update the schema to allow for more efficient queries.
What's next
- Monitor your instance with the Google Cloud console or the Cloud Monitoring console.
- Understand how to reduce read latency by following SQL best practices and using timestamp bounds.
- Find out about latency metrics in query statistics tables, which you can retrieve using SQL statements.
- Understand how instance configuration affects latency.
Last updated 2025-12-15 UTC.