Read statistics Stay organized with collections Save and categorize content based on your preferences.
Spanner provides built-in tables that store statistics about reads. You canretrieve statistics from theseSPANNER_SYS.READ_STATS* tables using SQLstatements.
When to use read statistics
Read statistics provide insight into how an application is using the database, andare useful when investigating performance issues. For example, you can checkwhat read shapes are running against a database, how frequently theyrun and explain the performance characteristics of these read shapes. You canuse the read statistics for your database to identify read shapes that result inhigh CPU usage. At a high level, read statistics will help you to understand thebehavior of the traffic going into a database in terms of resource usage.
Limitations
This tool is best suited for analyzing streams of similar reads that account formost of the CPU usage. It is not good for searching for reads that were run onlyone time.
The CPU usage tracked in these statistics represents Spannerserver side CPU usage, excluding prefetch CPU usage and some other overheads.
Statistics are collected on a best-effort basis. As a result, it ispossible for statistics to be missed if there are issues with underlyingsystems. For example, if there is internal networking issues, it is possiblefor some statistics to be missed.
Access read statistics
Note: Spanner Studio (formerly labeledQuery in the Google Cloud console)supports SQL, DML, and DDL operations in a single editor. For more information,seeManage your data using the Google Cloud console.Spanner provides the read statistics in theSPANNER_SYS schema. You can use the following ways to accessSPANNER_SYS data:
A database's Spanner Studio page in the Google Cloud console.
The
gcloud spanner databases execute-sqlcommand.The
executeSqlor theexecuteStreamingSqlmethod.
The following single read methods that Spanner providesdon't supportSPANNER_SYS:
- Performing a strong read from a single row or multiple rows in a table.
- Performing a stale read from a single row or multiple rows in a table.
- Reading from a single row or multiple rows in a secondary index.
CPU usage grouped by read shape
The following tables track the read shapes with the highest CPU usage during aspecific time period:
SPANNER_SYS.READ_STATS_TOP_MINUTE: Read shape statistics aggregated across 1minute intervals.SPANNER_SYS.READ_STATS_TOP_10MINUTE: Read shape statistics aggregated across10 minute intervals.SPANNER_SYS.READ_STATS_TOP_HOUR: Read shape statistics aggregated across 1hour intervals.
These tables have the following properties:
Each table contains data for non-overlapping time intervals of the length thetable name specifies.
Intervals are based on clock times. 1 minute intervals end on the minute, 10minute intervals end every 10 minutes starting on the hour, and 1 hour intervalsend on the hour. After each interval, Spanner collects data from allservers and then makes the data available in the SPANNER_SYS tables shortlythereafter.
For example, at 11:59:30 AM, the most recent intervals available to SQLqueries are:
- 1 minute: 11:58:00–11:58:59 AM
- 10 minute: 11:40:00–11:49:59 AM
- 1 hour: 10:00:00–10:59:59 AM
Spanner groups the statistics by read shape. If a tag is present,FPRINT is the hash of the tag. Otherwise, it is the hash of the
READ_COLUMNSvalue.Each row contains statistics for all executions of a particular read shapethat Spanner captures statistics for during the specified interval.
If Spanner is unable to store information about every distinct readshape run during the interval, the system prioritizes read shapes with thehighest CPU usage during the specified interval.
Table schema
| Column name | Type | Description |
|---|---|---|
INTERVAL_END | TIMESTAMP | End of the time interval that the included read executions occurred in. |
REQUEST_TAG | STRING | The optional request tag for this read operation. For more information about using tags, see Troubleshooting with request tags. Statistics for multiple reads that have the same tag string are grouped in a single row with the `REQUEST_TAG` matching that tag string. |
READ_TYPE | STRING | Indicates if a read is aPARTITIONED_READ orREAD. A read with a partitionToken obtained from the PartitionRead API is represented by thePARTITIONED_READ read type and the other read APIs byREAD. |
READ_COLUMNS | ARRAY<STRING> | The set of columns that were read. These are in alphabetical order. |
FPRINT | INT64 | The hash of theREQUEST_TAG value if present; Otherwise, the hash of theREAD_COLUMNS value. |
EXECUTION_COUNT | INT64 | Number of times Spanner executed the read shape during the interval. |
AVG_ROWS | FLOAT64 | Average number of rows that the read returned. |
AVG_BYTES | FLOAT64 | Average number of data bytes that the read returned, excluding transmission encoding overhead. |
AVG_CPU_SECONDS | FLOAT64 | Average number of Spanner server side CPU seconds executing the read, excluding prefetch CPU and other overhead. |
AVG_LOCKING_DELAY_SECONDS | FLOAT64 | Average number of seconds spent waiting due to locking. |
AVG_CLIENT_WAIT_SECONDS | FLOAT64 | Average number of seconds spent waiting due to the client not consuming data as fast as Spanner could generate it. |
AVG_LEADER_REFRESH_DELAY_SECONDS | FLOAT64 | Average number of seconds spent waiting to confirm with the Paxos leader that all writes have been observed. |
RUN_IN_RW_TRANSACTION_EXECUTION_COUNT | INT64 | The number of times the read was run as part of a read-write transaction. This column helps you determine if you can avoid lock contentions by moving the read to a read-only transaction. |
AVG_DISK_IO_COST | FLOAT64 | The average cost of this query in terms of Spanner HDDdisk load. Use this value to make relative HDD I/O cost comparisons between reads that you run in the database. Querying data on HDD storage incurs a charge against the HDD disk load capacity of the instance. A higher value indicates that you are using more HDD disk load and your query might be slower than if it was running on SSD. Furthermore, if your HDD disk load is at capacity, the performance of your queries might be further impacted. You can monitor the totalHDD disk load capacity of the instance as a percentage. To add more HDD disk load capacity, you can add more processing units or nodes to your instance. For more information, seeChange the compute capacity. To improve query performance, also consider moving some data to SSD. For workloads that consume a lot of disk I/O, we recommend that you store frequently accessed data on SSD storage. Data accessed from SSD doesn't consume HDD disk load capacity. You can store selective tables, columns, or secondary indexes on SSD storage as needed, while keeping infrequently accessed data on HDD storage. For more information, seeTiered storage overview. |
Example queries
This section includes several example SQL statements that retrieve readstatistics. You can run these SQL statements using theclient libraries, thegcloud spanner, or theGoogle Cloud console.
List the basic statistics for each read shape in a given time period
The following query returns the raw data for the top read shapes in the mostrecent 1-minute time intervals.
SELECTfprint,read_columns,execution_count,avg_cpu_seconds,avg_rows,avg_bytes,avg_locking_delay_seconds,avg_client_wait_secondsFROMspanner_sys.read_stats_top_minuteORDERBYinterval_endDESCLIMIT3;Query output
| fprint | read_columns | execution_count | avg_cpu_seconds | avg_rows | avg_bytes | avg_locking_delay_seconds | avg_client_wait_seconds |
|---|---|---|---|---|---|---|---|
125062082139 | ["Singers.id", "Singers.name"] | 8514387 | 0.000661355290396507 | 310.79 | 205 | 8.3232564943763752e-06 | 0 |
151238888745 | ["Singers.singerinfo"] | 3341542 | 6.5992827184280315e-05 | 12784 | 54 | 4.6859741349028595e-07 | 0 |
14105484 | ["Albums.id", "Albums.title"] | 9306619 | 0.00017855774721667873 | 1165.4 | 2964.71875 | 1.4328191393074178e-06 | 0 |
List the read shapes, ordered by highest total CPU usage
The following query returns the read shapes with the highest CPU usage in themost recent hour:
SELECTread_columns,execution_count,avg_cpu_seconds,execution_count*avg_cpu_secondsAStotal_cpuFROMspanner_sys.read_stats_top_hourWHEREinterval_end=(SELECTMAX(interval_end)FROMspanner_sys.read_stats_top_hour)ORDERBYtotal_cpuDESCLIMIT3;Query output
| read_columns | execution_count | avg_cpu_seconds | total_cpu |
|---|---|---|---|
["Singers.id", "Singers.name"] | 1647 | 0.00023380297430622681 | 0.2579 |
["Albums.id", "Albums.title"] | 720 | 0.00016738889440282034 | 0.221314999999999 |
["Singers.singerinfo""] | 3223 | 0.00037764625882302246 | 0.188053 |
Aggregate statistics
SPANNER_SYS also contains tables to store aggregate read statistics capturedby Spanner in a specific time period:
SPANNER_SYS.READ_STATS_TOTAL_MINUTE: Aggregate statistics for all readshapes during 1 minute intervals.SPANNER_SYS.READ_STATS_TOTAL_10MINUTE: Aggregate statistics for all readshapes during 10 minute intervals.SPANNER_SYS.READ_STATS_TOTAL_HOUR: Aggregate statistics for all read shapesduring 1 hour intervals.
Aggregate statistics tables have the following properties:
Each table contains data for non-overlapping time intervals of the length thetable name specifies.
Intervals are based on clock times. 1 minute intervals end on the minute, 10minute intervals end every 10 minutes starting on the hour, and 1 hourintervals end on the hour.
For example, at 11:59:30 AM, the most recent intervals available to SQLqueries on aggregate read statistics are:
- 1 minute: 11:58:00–11:58:59 AM
- 10 minute: 11:40:00–11:49:59 AM
- 1 hour: 10:00:00–10:59:59 AM
Each row contains statistics forall read shapes executed over thedatabase during the specified interval, aggregated together. There is only onerow per time interval.
The statistics captured in the
SPANNER_SYS.READ_STATS_TOTAL_*tables mightinclude read shapes that Spanner did not capture in theSPANNER_SYS.READ_STATS_TOP_*tables.Some columns in these tables are exposed as metrics in Cloud Monitoring.The exposed metrics are:
- Rows returned count
- Read execution count
- Read CPU time
- Locking delays
- Client wait time
- Leader refresh delay
- Bytes returned count
For more information, seeSpanner metrics.
Table schema
| Column name | Type | Description |
|---|---|---|
INTERVAL_END | TIMESTAMP | End of the time interval that the included read shape executions occurred in. |
EXECUTION_COUNT | INT64 | Number of times Spanner executed the read shape during the interval. |
AVG_ROWS | FLOAT64 | Average number of rows that the reads returned. |
AVG_BYTES | FLOAT64 | Average number of data bytes that the reads returned, excluding transmission encoding overhead. |
AVG_CPU_SECONDS | FLOAT64 | Average number of Spanner server side CPU seconds executing the read, excluding prefetch CPU and other overhead. |
AVG_LOCKING_DELAY_SECONDS | FLOAT64 | Average number of seconds spent waiting due to locking. |
AVG_CLIENT_WAIT_SECONDS | FLOAT64 | Average number of seconds spent waiting due to throttling. |
AVG_LEADER_REFRESH_DELAY_SECONDS | FLOAT64 | Average number of seconds spent coordinating the reads across instances inmulti-region configurations. |
RUN_IN_RW_TRANSACTION_EXECUTION_COUNT | INT64 | The number of times that reads were run as part of read-write transactions. This column helps you determine if you can avoid lock contentions by moving some reads to read-only transactions. |
Example queries
This section includes several example SQL statements that retrieve aggregate readstatistics. You can run these SQL statements using theclient libraries, thegcloud spanner, or theGoogle Cloud console.
Find the total CPU usage across all read shapes
The following query returns the number of CPU hours consumed by read shapes inthe most recent hour:
SELECT(avg_cpu_seconds*execution_count/60/60)AStotal_cpu_hoursFROMspanner_sys.read_stats_total_hourWHEREinterval_end=(SELECTMAX(interval_end)FROMspanner_sys.read_stats_total_hour);Query output
| total_cpu_hours |
|---|
0.00026186111111111115 |
Find the total execution count in a given time period
The following query returns the total number of read shapes executed in the mostrecent complete 1-minute interval:
SELECTinterval_end,execution_countFROMspanner_sys.read_stats_total_minuteWHEREinterval_end=(SELECTMAX(interval_end)FROMspanner_sys.read_stats_total_minute);Query output
| interval_end | execution_count |
|---|---|
2020-05-28 11:02:00-07:00 | 12861966 |
Data retention
At a minimum, Spanner keeps data for each table for the following timeperiods:
SPANNER_SYS.READ_STATS_TOP_MINUTEandSPANNER_SYS.READ_STATS_TOTAL_MINUTE: Intervals covering the previous 6hours.SPANNER_SYS.READ_STATS_TOP_10MINUTEandSPANNER_SYS.READ_STATS_TOTAL_10MINUTE: Intervals covering the previous 4days.SPANNER_SYS.READ_STATS_TOP_HOURandSPANNER_SYS.READ_STATS_TOTAL_HOUR: Intervals covering the previous 30 days.
Troubleshoot high CPU usage with read statistics
Spanner read statistics come in handy in cases where you need toinvestigate high CPU usage on your Spanner database or when you arejust trying to understand the CPU-heavy read shapes on your database. Inspectionof read shapes that use significant amounts of database resources gives Spanner users a potential way to reduce operational costs and possiblyimprove general system latencies. Using the following steps, we'll show you howto use read statistics to investigate high CPU usage in your database.
Select a time period to investigate
Start your investigation by looking for a time when your application began toexperience high CPU usage. For example, in the following scenario, the issuestarted occurring around 5:20pm on May 28th 2020.
Gather read statistics for the selected time period
Having selected a time period to start our investigation, we'll look atstatistics gathered in theREAD_STATS_TOTAL_10MINUTE table around that time.The results of this query might give us clues about how CPU and other readstatistics changed over that period of time. The following query returns theaggregated read statistics from4:30 pm to7:30 pm (inclusive).
SELECTinterval_end,ROUND(avg_cpu_seconds,4)asavg_cpu_seconds,execution_count,avg_locking_delay_secondsFROMSPANNER_SYS.READ_STATS_TOTAL_10MINUTEWHEREinterval_end>="2020-05-28T16:30:00"ANDinterval_end<="2020-05-28T19:30:00"ORDERBYinterval_end;The following data is an example of the result we get back from our query.
| interval_end | avg_cpu_seconds | execution_count | avg_locking_delay_seconds |
|---|---|---|---|
| 2020-05-28 16:40:00-07:00 | 0.0004 | 11111421 | 8.3232564943763752e-06 |
| 2020-05-28 16:50:00-07:00 | 0.0002 | 8815637 | 8.98734051776406e-05 |
| 2020-05-28 17:00:00-07:00 | 0.0001 | 8260215 | 6.039129247846453e-06 |
| 2020-05-28 17:10:00-07:00 | 0.0001 | 8514387 | 9.0535466616680686e-07 |
| 2020-05-28 17:20:00-07:00 | 0.0006 | 13715466 | 2.6801485272173765e-06 |
| 2020-05-28 17:30:00-07:00 | 0.0007 | 12861966 | 4.6859741349028595e-07 |
| 2020-05-28 17:40:00-07:00 | 0.0007 | 3755954 | 2.7131391918005383e-06 |
| 2020-05-28 17:50:00-07:00 | 0.0006 | 4248137 | 1.4328191393074178e-06 |
| 2020-05-28 18:00:00-07:00 | 0.0006 | 3986198 | 2.6973481999639748e-06 |
| 2020-05-28 18:10:00-07:00 | 0.0006 | 3510249 | 3.7577083563017905e-06 |
| 2020-05-28 18:20:00-07:00 | 0.0004 | 3341542 | 4.0940589703795433e-07 |
| 2020-05-28 18:30:00-07:00 | 0.0002 | 8695147 | 1.9914494947583975e-05 |
| 2020-05-28 18:40:00-07:00 | 0.0003 | 11679702 | 1.8331461539001595e-05 |
| 2020-05-28 18:50:00-07:00 | 0.0003 | 9306619 | 1.2527332321222135e-05 |
| 2020-05-28 19:00:00-07:00 | 0.0002 | 8520508 | 6.2268448078447915e-06 |
| 2020-05-28 19:10:00-07:00 | 0.0006 | 13715466 | 2.6801485272173765e-06 |
| 2020-05-28 19:20:00-07:00 | 0.0005 | 11947323 | 3.3029114639321295e-05 |
| 2020-05-28 19:30:00-07:00 | 0.0002 | 8514387 | 9.0535466616680686e-07 |
Here we see that average CPU time,avg_cpu_seconds, is higher in thehighlighted intervals. Theinterval_end with the value2020-05-28 19:20:00 has a higher CPU time, so we'll choose that interval toinvestigate further in the next step.
Find which read shapes are causing high CPU usage
Digging a little deeper, we now query theREAD_STATS_TOP_10MINUTE table forthe interval which was picked in the preceding step. The results of this querycan help indicate which read shapes cause high CPU usage.
SELECTread_columns,ROUND(avg_cpu_seconds,4)asavg_cpu_seconds,execution_count,avg_rowsFROMSPANNER_SYS.READ_STATS_TOP_10MINUTEWHEREinterval_end="2020-05-28T19:20:00"ORDERBYavg_cpu_secondsDESCLIMIT3;The following data as an example of the result we get back from our query,returning information about the top three read shapes ranked byavg_cpu_seconds. Note the use ofROUND in our query to restrictthe output ofavg_cpu_seconds to 4 decimal places.
| read_columns | avg_cpu_seconds | execution_count | avg_rows |
|---|---|---|---|
[TestHigherLatency._exists,TestHigherLatency.lang_status,TestHigherLatency.score,globalTagAffinity.shares]1 | 0.4192 | 1182 | 11650.42216582 |
[TestHigherLatency._exists,TestHigherLatency.lang_status,TestHigherLatency.likes,globalTagAffinity.score] | 0.0852 | 4 | 12784 |
[TestHigherLatency._exists,TestHigherLatency.lang_status,TestHigherLatency.score,globalTagAffinity.ugcCount] | 0.0697 | 1140 | 310.7921052631 |
1_exists is an internal field that is used to check whether acertain row exists or not.
One reason for high CPU usage might be that you start to execute a few readshapes more frequently (execution_count). Perhaps the average number of rowsthat the read returned has increased (avg_rows). If none of those propertiesof the read shape reveal anything interesting, you can examine other propertiessuch asavg_locking_delay_seconds,avg_client_wait_seconds oravg_bytes.
Apply best practices to reduce high CPU usage
When you have gone through the preceding steps, consider whether supplying anyof these best practices will help your situation.
The number of times Spanner executed read shapes during the intervalis a good example of a metric that needs a baseline to tell you if a measurementis reasonable or a sign of a problem. Having established a baseline for themetric, you'll be able to detect and investigate the cause of any unexpecteddeviations from normal behavior.
If CPU Usage is relatively constant most of the time, but suddenly shows aspike that can be correlated with a similar sudden spike in user requests orapplication behavior, it might be an indication that everything is working asexpected.
Try the following query to find the top read shapes ranked by the number oftimes Spanner executed for each read shape:
SELECTinterval_end,read_columns,execution_countFROMSPANNER_SYS.READ_STATS_TOP_MINUTEORDERBYexecution_countDESCLIMIT10;If you are looking for the lowest possible read latencies, especially whenusing multi-region instance configurations, usestale reads instead of strongreads to reduce or remove the
Note: Stale reads don't provide any latency benefits in regionalconfigurations, so you should almost always use strong reads when yourinstance does not have a multi-region configuration.AVG_LEADER_REFRESH_DELAY_SECONDScomponent of read latency.If you are only doing reads, and you can express your read using asingleread method, you should use that single read method. Singlereads don't lock, unlike read-write transactions, therefore you should useread-only transactions over more expensive read-write transactions when you arenot writing data.
What's next
- Learn about otherIntrospection tools.
- Learn about other information Spanner stores for each database in thedatabase'sinformation schema tables.
- Learn more aboutSQL best practices forSpanner.
- Learn more aboutInvestigating high CPUutilization.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.