Monitoring
You can monitor Bigtable visually, using charts that are available in the Google Cloud console, or you can programmatically call the Cloud Monitoring API.
Note: If you use a Java client library to access Bigtable, you can enable client-side metrics.
In the Google Cloud console, monitoring data is available in the following places:
- Bigtable system insights
- Bigtable instance overview
- Bigtable cluster overview
- Bigtable table overview
- Cloud Monitoring
- Key Visualizer
The system insights and overview pages provide a high-level view of your Bigtable usage. You can use Key Visualizer to drill down into your access patterns by row key and troubleshoot specific performance issues.
Understand CPU and disk usage
No matter what tools you use to monitor your instance, it's essential to monitor the CPU and disk usage for each cluster in the instance. If a cluster's CPU or disk usage exceeds certain thresholds, the cluster won't perform well, and it might return errors when you try to read or write data.
CPU usage
The nodes in your clusters use CPU resources to handle reads, writes, and administrative tasks. We recommend that you enable autoscaling, which lets Bigtable automatically add and remove nodes to a cluster based on workload. To learn more about how the number of nodes affects a cluster's performance, see Performance for typical workloads.
Bigtable reports the following metrics for CPU usage:
| Metric | Description |
|---|---|
| Average CPU utilization | The average CPU utilization across all nodes in the cluster. Includes change stream activity if a change stream is enabled for a table in the instance. In app profile charts, <system> indicates system background activities such as replication and compaction. System background activities are not client-driven. The recommended maximum values provide headroom for brief spikes in usage. |
| CPU utilization of hottest node | CPU utilization for the busiest node in the cluster. This metric continues to be provided for continuity, but in most cases you should use the more accurate metric High-granularity CPU utilization of hottest node. |
| High-granularity CPU utilization of hottest node | A fine-grained measurement of CPU utilization for the busiest node in the cluster. The hottest node is not necessarily the same node over time and can change rapidly, especially during large batch jobs or table scans. If the hottest node is frequently above the recommended value, even when your average CPU utilization is reasonable, you might be accessing a small part of your data much more frequently than the rest of your data. |
| Change stream CPU utilization | The average CPU utilization caused by change stream activity across all nodes in the cluster. |
| CPU utilization by app profile, method, and table | CPU utilization by app profile, method, and table. If you observe higher than expected CPU usage for a cluster, use this metric to determine if the CPU usage of a particular app profile, API method, or table is driving the CPU load. |
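The hottest-node guidance in the table can be expressed as a small check over metric samples. The following is a minimal sketch, not an official tool: the function name, the `min_fraction` cutoff for "frequently", and both limits are hypothetical parameters, because this page does not state the recommended maximum values. Samples are assumed to be fractions between 0 and 1.

```python
from typing import Sequence


def looks_like_hotspot(
    avg_cpu: Sequence[float],
    hottest_cpu: Sequence[float],
    avg_limit: float,
    hottest_limit: float,
    min_fraction: float = 0.5,
) -> bool:
    """Flag a likely hotspot from paired CPU samples.

    Returns True when the hottest node exceeds `hottest_limit` in at least
    `min_fraction` of the samples while the mean of the average-CPU samples
    stays at or below `avg_limit`. All thresholds are caller-supplied
    assumptions, not values published on this page.
    """
    if not avg_cpu or len(avg_cpu) != len(hottest_cpu):
        raise ValueError("need two non-empty sample lists of equal length")

    # "Average CPU utilization is reasonable": mean stays under the limit.
    mean_avg = sum(avg_cpu) / len(avg_cpu)
    if mean_avg > avg_limit:
        return False

    # "Frequently above the recommended value": share of samples over the limit.
    hot_samples = sum(1 for value in hottest_cpu if value > hottest_limit)
    return hot_samples / len(hottest_cpu) >= min_fraction
```

A True result only suggests uneven access; Key Visualizer is the tool this page points to for confirming which row keys are involved.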
Disk usage
For each cluster in your instance, Bigtable stores a separate copy of all of the tables in that instance.
Bigtable tracks disk usage in binary units, such as binary gigabytes (GB), where 1 GB is 2^30 bytes. This unit of measurement is also known as a gibibyte (GiB).
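As a quick illustration of the binary unit, the following sketch converts a raw byte count into binary gigabytes. The helper name is hypothetical and exists only for this example.

```python
BYTES_PER_BINARY_GB = 2 ** 30  # 1 binary GB (GiB) = 1,073,741,824 bytes


def bytes_to_binary_gb(num_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (GiB)."""
    return num_bytes / BYTES_PER_BINARY_GB


print(bytes_to_binary_gb(5 * 2 ** 30))  # 5.0

# A decimal gigabyte (10**9 bytes) is slightly smaller than a binary one.
print(bytes_to_binary_gb(10 ** 9) < 1.0)  # True
```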
Bigtable reports the following metrics for disk usage:
| Metric | Description |
|---|---|
| Storage utilization (bytes) | The amount of data stored in the cluster. Change stream usage is not included for this metric. This value affects your costs. Also, as described below, you might need to add nodes to each cluster as the amount of data increases. |
| Storage utilization (% max) | The percentage of the cluster's storage capacity that is being used. The capacity is based on the number of nodes in your cluster. Change stream usage is not included for this metric. In general, do not use more than 70% of the hard limit on total storage, so you have room to add more data. If you do not plan to add significant amounts of data to your instance, you can use up to 100% of the hard limit. Important: If any cluster in an instance exceeds the hard limit on the amount of storage per node, writes to all clusters in that instance will fail until you add nodes to each cluster that is over the limit. Also, if you try to remove nodes from a cluster, and the change would cause the cluster to exceed the hard limit on storage, Bigtable will deny the request. If you are using more than the recommended percentage of the storage limit, add nodes to the cluster. You can also delete existing data, but deleted data takes up more space, not less, until a compaction occurs. For details about how this value is calculated, see Storage utilization per node. |
| Change stream storage utilization (bytes) | The amount of storage consumed by change stream records for tables in the instance. This storage does not count toward the total storage utilization. You are charged for change stream storage, but it is not included in the calculation of storage utilization (% max). |
| Disk load | The percentage your cluster is using of the maximum possible bandwidth for HDD reads. Available only for HDD clusters. If this value is frequently at 100%, you might experience increased latency. Add nodes to the cluster to reduce the disk load percentage. |
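The 70% guidance can be turned into a rough node estimate. The sketch below is illustrative only: the per-node hard limit is not given on this page, so it is passed in as a parameter, and the figure used in the usage note is a made-up value rather than a real Bigtable limit. Both function names are hypothetical.

```python
import math


def min_nodes_for_storage(
    stored_bytes: int,
    per_node_limit_bytes: int,
    target_fraction: float = 0.70,
) -> int:
    """Smallest node count that keeps storage utilization at or below the target.

    `per_node_limit_bytes` is the hard storage limit per node; look it up for
    your storage type, because this sketch does not assume a value.
    """
    if per_node_limit_bytes <= 0:
        raise ValueError("per_node_limit_bytes must be positive")
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    usable_per_node = per_node_limit_bytes * target_fraction
    return max(1, math.ceil(stored_bytes / usable_per_node))


def storage_utilization_pct(
    stored_bytes: int, nodes: int, per_node_limit_bytes: int
) -> float:
    """Storage utilization as a percentage of the cluster's hard limit."""
    return 100.0 * stored_bytes / (nodes * per_node_limit_bytes)
```

For example, with a hypothetical limit of 4 binary terabytes per node and 10 binary terabytes stored, `min_nodes_for_storage(10 * 2**40, 4 * 2**40)` returns 4, and `storage_utilization_pct(10 * 2**40, 4, 4 * 2**40)` returns 62.5.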
Compaction and replicated instances
Storage metrics reflect the data size on disk as of the last compaction. Because compaction happens on a rolling basis over the course of a week, storage usage metrics for a cluster might sometimes temporarily be different from metrics for other clusters in the instance. Observable impacts of this include the following:
- A new cluster that has recently been added to an instance might temporarily show 0 bytes of storage even though all data has successfully been replicated to the new cluster.
- A table might be a different size in each cluster, even when replication is working properly.
- Storage usage metrics might be different in each cluster, even after replication has finished and no writes have been sent for a few days. The internal storage implementation, including how data is divided and stored in a distributed manner, can be different for each cluster, causing the actual usage of storage to differ.
Instance overview
The instance overview page shows the current values of several key metrics for each cluster:
| Metric | Description |
|---|---|
| CPU utilization average | The average CPU utilization across all nodes in the cluster. Includes change stream activity if a change stream is enabled for a table in the instance. In app profile charts, <system> indicates system background activities such as replication and compaction. System background activities are not client-driven. |
| CPU utilization of hottest node | CPU utilization for the busiest node in the cluster. This metric continues to be provided for continuity, but in most cases you should use the more accurate metric High-granularity CPU utilization of hottest node. |
| High-granularity CPU utilization of hottest node | A fine-grained measurement of CPU utilization for the busiest node in the cluster. The hottest node is not necessarily the same node over time and can change rapidly, especially during large batch jobs or table scans. Exceeding the recommended maximum for the busiest node can cause latency and other issues for the cluster. |
| Rows read | The number of rows read per second. |
| Rows written | The number of rows written per second. |
| Read throughput | The number of bytes per second of response data sent. This metric refers to the full amount of data that is returned after filters are applied. |
| Write throughput | The number of bytes per second that were received when data was written. |
| System error rate | The percentage of all requests that failed on the Bigtable server side. |
| Replication latency for input | The highest amount of time at the 99th percentile, in seconds, for a write to another cluster to be replicated to this cluster. |
| Replication latency for output | The highest amount of time at the 99th percentile, in seconds, for a write to this cluster to be replicated to another cluster. |
To see an overview of these key metrics:
Open the list of Bigtable instances in the Google Cloud console.
Click the instance whose metrics you want to view. The Google Cloud console displays the current metrics for your instance's clusters.
Cluster overview
Use the cluster overview page to understand the current and past status of an individual cluster.
The cluster overview page displays charts showing the following metrics for each cluster:
| Metric | Description |
|---|---|
| Number of nodes | The number of nodes in use for the cluster at a given time. |
| Maximum node count target | The maximum number of nodes that Bigtable will scale the cluster up to when autoscaling is enabled. This metric is visible only when autoscaling is enabled for the cluster. You can change this value on the Edit cluster page. |
| Minimum node count target | The minimum number of nodes that Bigtable will scale the cluster down to when autoscaling is enabled. This metric is visible only when autoscaling is enabled for the cluster. You can change this value on the Edit cluster page. |
| Recommended number of nodes for CPU target | The number of nodes that Bigtable recommends for the cluster based on the CPU utilization target that you set. This metric is visible only when autoscaling is enabled for the cluster. If this number is higher than the maximum node count target, consider raising your CPU utilization target or increasing the maximum number of nodes for the cluster. If this number is lower than the minimum number of nodes, the cluster might be overprovisioned for your usage, and you should consider lowering the minimum. |
| Recommended number of nodes for storage target | The number of nodes that Bigtable recommends for the cluster based on the built-in storage utilization target. This metric is visible only when autoscaling is enabled for the cluster. If this number is higher than the maximum node count target, consider increasing the maximum number of nodes for the cluster. |
| CPU utilization | The average CPU utilization across all nodes in the cluster. Includes change stream activity if a change stream is enabled for a table in the instance. In app profile charts, <system> indicates system background activities such as replication and compaction. System background activities are not client-driven. |
| Storage utilization | The amount of data stored in the cluster. Change stream usage is not included for this metric. This metric reflects the fact that Bigtable compresses your data when it is stored. |
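The guidance for the "Recommended number of nodes for CPU target" chart amounts to a comparison against the autoscaling bounds. The following is a minimal sketch of that comparison. The function name and the returned labels are hypothetical, and the values would come from reading the charts, not from any API call shown here.

```python
def autoscaling_advice(recommended_nodes: int, min_nodes: int, max_nodes: int) -> str:
    """Summarize how the CPU-based recommendation relates to the autoscaling bounds."""
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    if recommended_nodes > max_nodes:
        # The cluster wants more nodes than autoscaling is allowed to add.
        return "raise the CPU utilization target or the maximum node count"
    if recommended_nodes < min_nodes:
        # The cluster might be overprovisioned for the current usage.
        return "consider lowering the minimum node count"
    return "recommendation is within the configured bounds"


print(autoscaling_advice(recommended_nodes=12, min_nodes=3, max_nodes=10))
```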
To view a cluster's overview page, do the following:
Open the list of Bigtable instances in the Google Cloud console.
Click the instance whose metrics you want to view.
Go to the section that follows the section that shows the current status of some of the cluster's metrics.
Click the cluster ID to open the cluster's Cluster overview page.
Logs
The Logs chart displays system event log entries for the cluster. System event logs are generated only for clusters that use autoscaling. To learn additional ways to view Bigtable audit logs, see Audit logging.
Table overview
Use the table overview page to understand the current and past status of an individual table.
The table overview page displays charts showing the following metrics for the table. Each chart shows a separate line for each cluster that the table is in.
| Metric | Description |
|---|---|
| Storage utilization (bytes) | The percentage of the cluster's storage capacity that is being used by the table. The capacity is based on the number of nodes in the cluster. For details about how this value is calculated, see Storage utilization per node. |
| CPU utilization | The average CPU utilization across all nodes in the cluster. Includes change stream activity if a change stream is enabled for a table in the instance. In app profile charts, <system> indicates system background activities such as replication and compaction. System background activities are not client-driven. |
| Read latency | The time for a read request to return a response. Measurement of read latency begins when Bigtable receives the request and ends when the last byte of data is sent to the client. For requests for large amounts of data, read latency can be affected by the client's ability to consume the response. |
| Write latency | The time for a write request to return a response. |
| Rows read | The number of rows read per second. This metric provides a more useful view of Bigtable's overall throughput than the number of read requests, because a single request can read a large number of rows. |
| Rows written | The number of rows written per second. This metric provides a more useful view of Bigtable's overall throughput than the number of write requests, because a single request can write a large number of rows. |
| Read requests | The number of random reads and scan requests per second. |
| Write requests | The number of write requests per second. |
| Read throughput | The number of bytes per second of response data sent. This metric refers to the full amount of data that is returned after filters are applied. |
| Write throughput | The number of bytes per second that were received when data was written. |
| Automatic failovers | The number of requests that were automatically rerouted from one cluster to another due to a failover scenario, such as a brief outage or delay. Automatic rerouting can occur if an app profile uses multi-cluster routing. This chart does not include manually rerouted requests. |
The table overview page also shows the table's replication state in each cluster in the instance. For each cluster, the page displays the following:
- Status
- Cluster ID
- Zone
- The amount of cluster storage used by the table
- Encryption key and key status
- Date of the latest backup of the selected table
- A link to the Edit cluster page.
To view a table's overview page, do the following:
Open the list of Bigtable instances in the Google Cloud console.
Click the instance whose metrics you want to view.
In the left pane, click Tables. The Google Cloud console displays a list of all the tables in the instance.
Click a table ID to open the table's Table overview page.
Monitor performance over time
Use your Bigtable instance's system insights page to understand the past performance of your instance. You can analyze the performance of each cluster, and you can break down the metrics for different types of Bigtable resources. Charts can display a period ranging from the past 1 hour to the past 6 weeks.
System insights charts for Bigtable resources
The Bigtable system insights page provides charts for the following types of Bigtable resources:
- Instances
- Tables
- Application profiles
- Replication
Charts on the system insights page show the following metrics:
| Metric | Available for | Description |
|---|---|---|
| CPU utilization | Instances Tables App profiles | The average CPU utilization across all nodes in the cluster. Includes change stream activity if a change stream is enabled for a table in the instance. In app profile charts, <system> indicates system background activities such as replication and compaction. System background activities are not client-driven. |
| High-granularity CPU utilization (hottest node) | Instances | A fine-grained measurement of CPU utilization for the busiest node in the cluster. The hottest node is not necessarily the same node over time and can change rapidly, especially during large batch jobs or table scans. Exceeding the recommended maximum for the busiest node can cause latency and other issues for the cluster. |
| Data Boost serverless processing units (SPUs) | Instances | Billable Data Boost compute usage measured in SPU-seconds. |
| Read latency | Instances Tables App profiles | The time for a read request to return a response. Measurement of read latency begins when Bigtable receives the request and ends when the last byte of data is sent to the client. For requests for large amounts of data, read latency can be affected by the client's ability to consume the response. |
| SQL read latency | Instances App profiles | The time for a SQL read request to return a response. Measurement of SQL read latency begins when Bigtable receives the request and ends when the last byte of data is sent to the client. For requests for large amounts of data, SQL read latency can be affected by the client's ability to consume the response. |
| Write latency | Instances Tables App profiles | The time for a write request to return a response. |
| Client-side read latency | Instances Tables App profiles | The total end-to-end latency across all RPC attempts associated with a Bigtable operation. Measures the operation's round trip from the client to Bigtable and back to the client and includes all retries. |
| Client-side SQL read latency | Instances Tables App profiles | The total end-to-end latency across all RPC attempts associated with a Bigtable operation. Measures the operation's round trip from the client to Bigtable and back to the client and includes all retries. For |
| Client-side write latency | Instances Tables App profiles | The total end-to-end latency across all RPC attempts associated with a Bigtable operation. Measures the operation's round trip from the client to Bigtable and back to the client and includes all retries. |
| Client-side read attempt latency | Instances Tables App profiles | The latencies of a client read RPC attempt. Under normal circumstances, this value is identical to |
| Client-side SQL read attempt latency | Instances Tables App profiles | The latencies of a client SQL read RPC attempt. Under normal circumstances, this value is identical to |
| Client-side write attempt latency | Instances Tables App profiles | The latencies of a client write RPC attempt. Under normal circumstances, this value is identical to |
| User error rate | Instances | The rate of errors caused by the content of a request, as opposed to errors on the Bigtable server side. The user error rate includes the following status codes:
User errors are typically caused by a configuration issue, such as a request that specifies the wrong cluster, table, or app profile. Note: To view this chart, you must group the system insights data by instance. In the View metrics for drop-down list, select Instance. Then, under Group by, click Instance. |
| System error rate | Instances | The percentage of all requests that failed on the Bigtable server side. The system error rate includes the following status codes:
|
| Automatic failovers | Instances Tables App profiles | The number of requests that were automatically rerouted from one cluster to another due to a failover scenario, such as a brief outage or delay. Automatic rerouting can occur if an app profile uses multi-cluster routing. This chart does not include manually rerouted requests. |
| SQL automatic failovers | Instances Tables App profiles | The number of SQL requests that were automatically rerouted from one cluster to another due to a failover scenario, such as a brief outage or delay. Automatic rerouting can occur if an app profile uses multi-cluster routing. This chart does not include manually rerouted requests. |
| Storage utilization (bytes) | Instances Tables | The amount of data stored in the cluster. Change stream usage is not included for this metric. This metric reflects the fact that Bigtable compresses your data when it is stored. |
| Storage utilization (% max) | Instances | The percentage of the cluster's storage capacity that is being used. The capacity is based on the number of nodes in your cluster. Change stream usage is not included for this metric. For details about how this value is calculated, see Storage utilization per node. |
| Disk load | Instances | The percentage your cluster is using of the maximum possible bandwidth for HDD reads. Available only for HDD clusters. |
| Rows read | Instances Tables App profiles | The number of rows read per second. This metric provides a more useful view of Bigtable's overall throughput than the number of read requests, because a single request can read a large number of rows. |
| Rows written | Instances Tables App profiles | The number of rows written per second. This metric provides a more useful view of Bigtable's overall throughput than the number of write requests, because a single request can write a large number of rows. |
| Read requests | Instances Tables App profiles | The number of random reads and scan requests per second. |
| Write requests | Instances Tables App profiles | The number of write requests per second. |
| Read throughput | Instances Tables App profiles | The number of bytes per second of response data sent. This metric refers to the full amount of data that is returned after filters are applied. |
| Write throughput | Instances Tables App profiles | The number of bytes per second that were received when data was written. |
| Node count | Instances | The number of nodes in the cluster. |
| Data Boost traffic eligibility | App profiles | Current Bigtable requests that are eligible and ineligible for Data Boost. |
| Data Boost traffic ineligible reasons | App profiles | Reasons that current traffic is ineligible for Data Boost. |
To view metrics for these resources:
Open the list of Bigtable instances in the Google Cloud console.
Click the instance whose metrics you want to view.
In the left pane, click System insights. The Google Cloud console displays a series of charts for the instance, as well as a tabular view of the instance's metrics. By default, the Google Cloud console shows metrics for the past hour, and it shows separate metrics for each cluster in the instance.
To view all of the charts, scroll through the pane where the charts are displayed.
To view metrics at the table level, click Tables.
To view metrics for individual app profiles, click Application Profiles.
To view combined metrics for the instance as a whole, find the Group by section above the charts, then click Instance.
To view metrics for a longer period of time, click the arrow next to 1 Hour. Choose a pre-set time range or enter a custom time range, then click Apply.
Charts for replication
The system insights page provides a chart that shows replication latency over time. You can view the average latency for replicating writes at the 50th, 99th, and 100th percentiles.
To view the replication latency over time:
Open the list of Bigtable instances in the Google Cloud console.
Click the instance whose metrics you want to view.
In the left pane, click System insights. The page opens with the Instance tab selected.
Click the Replication tab. The Google Cloud console displays replication latency over time. By default, the Google Cloud console shows replication latency for the past hour.
To toggle between latency charts grouped by table or by cluster, use the Group by menu.
To change which percentile to view, use the Percentile menu.
To view metrics for a longer period of time, click the arrow next to 1 Hour. Choose a pre-set time range or enter a custom time range, then click Apply.
Monitor with Cloud Monitoring
Bigtable exports usage metrics to Cloud Monitoring. You can use these metrics in a variety of ways:
- Monitor programmatically using the Cloud Monitoring API.
- Monitor visually in the Metrics Explorer.
- Set up alerting policies.
- Add Bigtable usage metrics to a custom dashboard.
- Use a graphing library, such as Matplotlib for Python, to plot and analyze the usage metrics for Bigtable.
To view usage metrics in the Metrics Explorer:
Open the Monitoring page in the Google Cloud console.
If you are prompted to choose an account, choose the account that you use to access Google Cloud.
Click Resources, then click Metrics Explorer.
Under Find resource type and metric, type bigtable. A list of Bigtable resources and metrics appears. Click a metric to view a chart for that metric.
For additional information about using Cloud Monitoring, see the Cloud Monitoring documentation.
For a complete list of Bigtable metrics, see Metrics.
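For the programmatic route, the following sketch reads recent points for the storage utilization metric through the Cloud Monitoring API. It assumes the `google-cloud-monitoring` Python package is installed and that application default credentials are configured. The project and cluster IDs are placeholders, the helper names are hypothetical, and the filter assumes that the cluster ID is exposed as the `cluster` resource label and that the metric reports double values.

```python
import time

# Metric type taken from the alerting section of this page.
METRIC_TYPE = "bigtable.googleapis.com/cluster/storage_utilization"


def build_filter(cluster_id: str, metric_type: str = METRIC_TYPE) -> str:
    """Build a Cloud Monitoring filter for one metric on one cluster."""
    return (
        f'metric.type = "{metric_type}" '
        f'AND resource.labels.cluster = "{cluster_id}"'
    )


def fetch_points(project_id: str, cluster_id: str, lookback_seconds: int = 3600):
    """Return (end_time, value) pairs for the past `lookback_seconds`."""
    # Imported here so that build_filter works without the client library.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {
            "end_time": {"seconds": now},
            "start_time": {"seconds": now - lookback_seconds},
        }
    )
    results = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": build_filter(cluster_id),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    return [
        (point.interval.end_time, point.value.double_value)
        for series in results
        for point in series.points
    ]
```

The returned pairs can be passed to a graphing library such as Matplotlib, with the timestamps on the x-axis and the values on the y-axis.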
Create a storage utilization alert
You can set up an alert to notify you when your Bigtable cluster exceeds a specified threshold. For more information about determining your target storage utilization, see Disk usage.
To create an alerting policy that triggers when the storage utilization for your Bigtable cluster is above a recommended threshold, such as 70%, use the following settings.
To create an alerting policy, do the following:
- In the Google Cloud console, go to the Alerting page.
  If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- If you haven't created your notification channels and if you want to be notified, then click Edit Notification Channels and add your notification channels. Return to the Alerting page after you add your channels.
- From the Alerting page, select Create policy.
- To select the resource, metric, and filters, expand the Select a metric menu and then use the values in the New condition table:
  - Optional: To limit the menu to relevant entries, enter the resource or metric name in the filter bar.
  - Select a Resource type. For example, select VM instance.
  - Select a Metric category. For example, select instance.
  - Select a Metric. For example, select CPU Utilization.
  - Select Apply.
- Click Next and then configure the alerting policy trigger. To complete these fields, use the values in the Configure alert trigger table.
- Click Next.
- Optional: To add notifications to your alerting policy, click Notification channels. In the dialog, select one or more notification channels from the menu, and then click OK.
  To be notified when incidents are opened and closed, check Notify on incident closure. By default, notifications are sent only when incidents are opened.
- Optional: Update the Incident autoclose duration. This field determines when Monitoring closes incidents in the absence of metric data.
- Optional: Click Documentation, and then add any information that you want included in a notification message.
- Click Alert name and enter a name for the alerting policy.
- Click Create Policy.
| New condition Field | Value |
|---|---|
| Resource and Metric | In the Resources menu, select Cloud Bigtable Cluster. In the Metric categories menu, select Cluster. In the Metrics menu, select Storage utilization. (The metric.type is bigtable.googleapis.com/cluster/storage_utilization). |
| Filter | cluster = YOUR_CLUSTER_ID |
| Configure alert trigger Field | Value |
|---|---|
| Condition type | Threshold |
| Condition triggers if | Any time series violates |
| Threshold position | Above threshold |
| Threshold value | 70 |
| Retest window | 10 minutes |
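The two tables above can also be written down as a policy document. The sketch below builds a dictionary whose field names follow the Cloud Monitoring AlertPolicy REST resource as best understood here. Treat the mapping as an assumption: "Retest window" is mapped to `duration`, "Any time series violates" to a trigger count of 1, and the `bigtable_cluster` resource type and `cluster` label are assumed. The function name and display names are hypothetical. The threshold is copied from the table as-is, so check the metric's unit in Metrics Explorer before relying on it.

```python
import json


def storage_alert_policy(
    cluster_id: str, threshold: float = 70, retest_seconds: int = 600
) -> dict:
    """Build an AlertPolicy-shaped dict for the storage utilization alert."""
    metric_filter = (
        'resource.type = "bigtable_cluster" AND '
        'metric.type = "bigtable.googleapis.com/cluster/storage_utilization" AND '
        f'resource.labels.cluster = "{cluster_id}"'
    )
    return {
        "displayName": f"Bigtable storage utilization: {cluster_id}",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Storage utilization above threshold",
                "conditionThreshold": {
                    "filter": metric_filter,
                    "comparison": "COMPARISON_GT",  # Above threshold
                    "thresholdValue": threshold,  # Threshold value
                    "duration": f"{retest_seconds}s",  # Retest window
                    "trigger": {"count": 1},  # Any time series violates
                },
            }
        ],
    }


# Placeholder cluster ID; replace it with your own.
print(json.dumps(storage_alert_policy("YOUR_CLUSTER_ID"), indent=2))
```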
What's next
- Find out how to troubleshoot issues with Key Visualizer.
- Read about client-side metrics.
- Try the Cloud Monitoring quickstart.
- Learn about creating alerts based on Bigtable metrics.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.