Monitoring API usage
This page describes how to use API metrics to track and understand your usage of Google APIs and Google Cloud APIs.
Google APIs produce detailed usage metrics that can help you:
- Track and understand your usage of Google APIs.
- Monitor performance of your applications and Google APIs.
- Discover problems between your applications and Google APIs.
These metrics can dramatically speed up resolution times when you troubleshoot problems or need technical support from Google.
The metrics that Google APIs produce are the standard signals that Google's own Site Reliability Engineers use to assess the health of a service. These metrics cover request counts, error rates, total latencies, backend latencies, request sizes, and response sizes. For the API metric definitions, see the Cloud Monitoring documentation.
You can view API metrics in two places: the API Dashboard and Cloud Monitoring. The metrics you see are specific to your project, and they don't reflect the overall service status.
Using the API Dashboard
The simplest way to view your API metrics is to use the Google Cloud console's API Dashboard. You can see an overview of all your API usage, or you can drill down to your usage of a specific API.
To see an overview of your API usage:
Visit the Cloud console's APIs and Services section. The main API Dashboard is displayed by default. On this page you can see all the APIs you currently have enabled for your project, as well as overview charts for the following metrics:
- Traffic: the number of requests per second made by or about your project to enabled APIs
- Errors: the percentage of requests to enabled APIs that resulted in errors
- Median latency: the median latency for requests to enabled APIs, if available
To view usage details for a specific API:
- Select the API you want to view in the main API Dashboard list of APIs. The API's Overview page shows a more detailed traffic chart with a breakdown by response code.
For even more detailed usage information, select View metrics. By default, the following pre-built charts are displayed, though more are available:
- Traffic by response code
- Errors by API method
- Overall latency at the 50th, 95th, and 99th percentile
- Latency by API method (median)
If you want to add more charts, you can select additional pre-built charts from the Select Graphs drop-down menu.
Using Cloud Monitoring
If you use Cloud Monitoring, you can dive deeper into available metrics data using the Metrics Explorer to give you greater insight into your API usage. Cloud Monitoring supports a wide variety of metrics, which you can combine with filters and aggregations for new and insightful views into your application performance. For example, you can combine a request count metric with a filter on the HTTP Response Code class to build a dashboard that shows error rates over time, or you can look at the 95th percentile latency of requests to the Cloud Pub/Sub API.
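The error-rate example above can be sketched in a few lines of Python. This is a minimal illustration, not a Cloud Monitoring API call: it assumes you have already exported per-interval request counts grouped by the response_code_class label. The function name, the input shape, and the choice to count both 4xx and 5xx responses as errors are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple


def error_rates(
    intervals: Iterable[Dict[str, int]],
    error_classes: Tuple[str, ...] = ("4xx", "5xx"),
) -> List[float]:
    """Return the fraction of requests that failed in each interval.

    Each interval maps a response_code_class value (for example "2xx")
    to the request count observed in that interval.
    """
    rates = []
    for counts in intervals:
        total = sum(counts.values())
        errors = sum(counts.get(cls, 0) for cls in error_classes)
        # An interval with no traffic has no meaningful error rate; report 0.0.
        rates.append(errors / total if total else 0.0)
    return rates


# Hypothetical per-minute counts for one API (not real data).
sample = [
    {"2xx": 95, "4xx": 3, "5xx": 2},
    {"2xx": 50, "5xx": 50},
    {},
]
print(error_rates(sample))
```

If you only care about server-side failures, passing error_classes=("5xx",) would leave client errors out of the rate.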
Available metrics
The following table lists the available serviceruntime metrics. The API-usage metrics are those that include consumed_api as a monitored resource.
The "metric type" strings in this table must be prefixed with serviceruntime.googleapis.com/. That prefix has been omitted from the entries in the table. When querying a label, use the metric.labels. prefix; for example, metric.labels.LABEL="VALUE".
| Metric type | Launch stage (Resource hierarchy levels) | Display name | Kind, Type, Unit | Monitored resources | Description | Labels |
|---|---|---|---|---|---|---|
| api/request_count | GA (project) | Request count | DELTA, INT64, 1 | api, consumed_api, produced_api | The count of completed requests. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | protocol: The protocol of the request, e.g. "http", "grpc". response_code: The HTTP response code for HTTP requests, or HTTP equivalent code for gRPC requests. See code mapping in https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto. response_code_class: The response code class for HTTP requests, or HTTP equivalent class for gRPC requests, e.g. "2xx", "4xx". grpc_status_code: The numeric gRPC response code for gRPC requests, or gRPC equivalent code for HTTP requests. See code mapping in https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto. |
| api/request_latencies | GA (project) | Request latencies | DELTA, DISTRIBUTION, s | api, consumed_api, produced_api | Distribution of latencies in seconds for non-streaming requests. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | |
| api/request_latencies_backend | GA (project) | Request backend latencies | DELTA, DISTRIBUTION, s | api, produced_api | Distribution of backend latencies in seconds for non-streaming requests. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | |
| api/request_latencies_overhead | GA (project) | Request overhead latencies | DELTA, DISTRIBUTION, s | api, produced_api | Distribution of request latencies in seconds for non-streaming requests excluding the backend. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | |
| api/request_sizes | GA (project) | Request sizes | DELTA, DISTRIBUTION, By | api, consumed_api, produced_api | Distribution of request sizes in bytes recorded at request completion. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | |
| api/response_sizes | GA (project) | Response sizes | DELTA, DISTRIBUTION, By | api, consumed_api, produced_api | Distribution of response sizes in bytes recorded at request completion. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. | |
| mcp/request_count | BETA (project) | MCP Request Count | DELTA, INT64, 1 | consumed_mcp_api | The count of MCP requests. | response_code: The HTTP response code for HTTP requests, or HTTP equivalent code for MCP requests. response_code_class: The response code class for HTTP requests, or HTTP equivalent class for gRPC requests, e.g. '2xx', '4xx'. |
| mcp/request_durations | BETA (project) | MCP Request Duration | DELTA, DISTRIBUTION, s | consumed_mcp_api | The duration of the MCP request from the time it was sent until the response or ack is received. | |
| quota/allocation/usage | GA (project, folder, organization) | Allocation quota usage | GAUGE, INT64, 1 | consumer_quota, producer_quota | The total consumed allocation quota. Values reported more than 1/min are dropped. If no changes are received in quota usage, the last value is repeated at least every 24 hours. Sampled every 60 seconds. | quota_metric: The name of quota metric or quota group. |
| quota/concurrent/exceeded | ALPHA (project, folder, organization) | Concurrent Quota Exceeded | DELTA, INT64, 1 | consumer_quota | The number of times exceeding the concurrent quota was attempted. Sampled every 86400 seconds. After sampling, data is not visible for up to 180 seconds. | limit_name: The quota limit name, such as "Requests per day" or "In-use IP addresses". quota_metric: The name of quota metric or quota group. time_window: The window size for concurrent operation limits. |
| quota/concurrent/limit | ALPHA (project, folder, organization) | Concurrent Quota limit | GAUGE, INT64, 1 | consumer_quota, producer_quota | The concurrent limit for the quota. Sampled every 86400 seconds. After sampling, data is not visible for up to 180 seconds. | limit_name: The quota limit name, such as "Requests per day" or "In-use IP addresses". quota_metric: The name of quota metric or quota group. time_window: The window size for concurrent operation limits. |
| quota/concurrent/usage | ALPHA (project, folder, organization) | Concurrent Quota usage | GAUGE, INT64, 1 | consumer_quota, producer_quota | The concurrent usage of the quota. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. | limit_name: The quota limit name, such as "Requests per day" or "In-use IP addresses". quota_metric: The name of quota metric or quota group. time_window: The window size for concurrent operation limits. |
| quota/exceeded | GA (project, folder, organization) | Quota exceeded error | GAUGE, BOOL, 1 | consumer_quota | The error happened when the quota limit was exceeded. Sampled every 60 seconds. | limit_name: The quota limit name, such as "Requests per day" or "In-use IP addresses". quota_metric: The name of quota metric or quota group. |
| quota/limit | GA (project, folder, organization) | Quota limit | GAUGE, INT64, 1 | consumer_quota, producer_quota | The limit for the quota. Sampled every 86400 seconds. | limit_name: The quota limit name, such as "Requests per day" or "In-use IP addresses". quota_metric: The name of quota metric or quota group. |
| quota/rate/net_usage | GA (project, folder, organization) | Rate quota usage | DELTA, INT64, 1 | consumer_quota, producer_quota | The total consumed rate quota. Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds. | method: The API method name, such as "disks.list". quota_metric: The name of quota metric or quota group. |
| reserved/metric1 | EARLY_ACCESS (project) | Deprecated | DELTA, INT64, 1 | deprecated_resource | Deprecated. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. | quota_name: Deprecated. credential_id: Deprecated. quota_location: Deprecated. |
Table generated at 2025-12-11 14:22:04 UTC.
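The sampling and visibility figures in the table limit how fresh a query result can be. The following is a minimal sketch of how you might pick a query end time that avoids data that may not be visible yet. The helper name is hypothetical, and aligning to multiples of the sampling period is an assumption for illustration; the table does not say how samples are aligned.

```python
from datetime import datetime, timedelta, timezone


def latest_visible_time(now, sample_period_s, max_delay_s):
    """Return a conservative end time for querying a metric.

    now must be a timezone-aware datetime. The numeric arguments come
    from the "Sampled every N seconds" and "not visible for up to
    M seconds" figures in the table.
    """
    cutoff = now - timedelta(seconds=max_delay_s)
    # Round down to a sampling boundary so a partial sample is excluded
    # (assumes boundaries fall on multiples of the period).
    epoch_s = int(cutoff.timestamp())
    aligned = epoch_s - (epoch_s % sample_period_s)
    return datetime.fromtimestamp(aligned, tz=timezone.utc)


# api/request_count: sampled every 60 s, not visible for up to 1800 s.
now = datetime(2025, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
print(latest_visible_time(now, 60, 1800))
```

An alert evaluated on api/request_count could therefore lag real traffic by up to half an hour, which is worth keeping in mind when choosing alert windows.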
To see API metrics in Metrics Explorer, select Consumed API as the resource type, then select one of the serviceruntime metrics. Then use the filter and aggregation options to refine your data. After you've found the API usage information you want, you can use Cloud Monitoring to create custom dashboards and alerts that will help you continue to monitor and maintain a robust application. You can find out how to do this in the following pages:
For more information, seeMetrics Explorer.
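As a rough sketch of how the prefix and label rules fit together, the helper below assembles a filter string for one of the serviceruntime metrics. The function name and its parameters are hypothetical. The output is intended to follow Cloud Monitoring's filter syntax, but verify it against the Cloud Monitoring documentation before relying on it.

```python
def build_filter(metric_suffix, resource_type="consumed_api", labels=None):
    """Build a filter string for a serviceruntime metric.

    metric_suffix is a "metric type" entry from the table, such as
    "api/request_count"; labels maps metric label names to values.
    """
    prefix = "serviceruntime.googleapis.com/"
    parts = [
        f'metric.type="{prefix}{metric_suffix}"',
        f'resource.type="{resource_type}"',
    ]
    # Metric labels are addressed with the metric.labels. prefix.
    for name, value in sorted((labels or {}).items()):
        parts.append(f'metric.labels.{name}="{value}"')
    return " AND ".join(parts)


print(build_filter("api/request_count", labels={"response_code_class": "5xx"}))
```

Sorting the labels keeps the generated string stable, which helps when the filter is stored in a dashboard or alert definition under version control.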
Troubleshooting with API metrics
API metrics can be particularly useful if you need to contact Google when something goes wrong, and may even show you that you don't need to contact support at all. For example:
- If all of your calls to a service are failing for a single credential ID, but not for any other, chances are there is something wrong with that account that you can easily fix yourself without opening a ticket.
- You’re troubleshooting a problem with your app, and notice a correlation between your application’s degraded performance and a sustained increase in the 50th percentile latency of a critical GCP service. Definitely call us and point us to this data so we can start working on the problem as quickly as possible.
- The latencies for a GCP service report look good and unchanged from before, but your in-app metrics report that the latency on calls to the service is abnormally high. That tells you that there is some trouble in the network. Call your network provider (in some cases, Google) to get the debugging process started.
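The two latency scenarios above amount to comparing the latency the service reports with the latency your application observes. The sketch below encodes that comparison. The function name, the returned labels, and the 1.5x tolerance are all illustrative assumptions, not values recommended by Google.

```python
def triage(service_p50_ms, service_baseline_ms, app_p50_ms, app_baseline_ms,
           tolerance=1.5):
    """Suggest where to look first, given median latencies and baselines.

    A value counts as elevated when it exceeds its baseline by more
    than the tolerance factor.
    """
    service_high = service_p50_ms > service_baseline_ms * tolerance
    app_high = app_p50_ms > app_baseline_ms * tolerance
    if service_high and app_high:
        return "service"   # service-side latency rose along with the app's
    if app_high:
        return "network"   # service looks normal, so suspect the path to it
    if service_high:
        return "monitor"   # service is slower but the app is not affected yet
    return "none"


print(triage(service_p50_ms=100, service_baseline_ms=100,
             app_p50_ms=900, app_baseline_ms=300))
```

In practice you would feed this sustained values, not single data points, for the reasons given under Best practices.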
Best practices
While API metrics are an extremely useful tool, there are issues you need to consider to make sure they provide useful information, particularly when setting up alerts based on metric values. The following best practices will help you get the most from API metrics data.
Is latency causing a problem?
While some services are quite latency-sensitive, for others scale and reliability matter more. Some APIs, Cloud Storage or BigQuery for example, can have a couple of seconds of high latency without customers noticing. With data from API metrics, you can learn what your users need from a given service.
Look for changes from the norm
Before you decide to alert on a particular metric value, consider what actually counts as unusual behavior. Looking at your API metrics can show you that latency results for most services fall within a normal distribution: a big hump in the middle, and outliers on either side. The metrics will help you understand the normal distribution so that you can engineer your app to work well within the distribution curve. Metrics can also help you correlate distribution changes with times where your app is not working as intended, to help you find the root cause of an issue. We expect the 99th percentile to look very different from the median; what we don’t expect are dramatic changes in those percentiles over time.
Also, you may see that some kinds of requests take longer than others. If the median size of a photo uploaded to Google Photos is 4 MB, but you normally upload 20 MB RAW files, your average time to upload 20 photos is likely to be substantially worse than that of most users, but is still your normal behavior.
All this means that it's not particularly useful to alert the first time a second-long RPC or 5xx HTTP call is detected. Instead, when investigating a Google service as a possible cause for an issue your application is experiencing, compare the return codes and latency rates over time and look for sustained changes from the norm that are correlated with observed issues in your application.
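One way to express "sustained change from the norm" is to require several consecutive data points above a baseline before treating the change as significant. The sketch below is only an illustration of that idea. The function name, the 1.5x tolerance, and the three-point run length are assumptions you would tune to your own traffic.

```python
def sustained_breach(values, baseline, tolerance=1.5, min_consecutive=3):
    """Return True if values stay above baseline * tolerance for at
    least min_consecutive consecutive points."""
    run = 0
    for value in values:
        if value > baseline * tolerance:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            # A single normal point resets the run, so isolated spikes
            # never trigger on their own.
            run = 0
    return False


# Hypothetical p95 latencies in milliseconds, one per minute.
print(sustained_breach([100, 400, 100, 110, 90], baseline=100))
print(sustained_breach([100, 160, 170, 180, 100], baseline=100))
```

The first series contains one large spike and is ignored; the second stays elevated for three points in a row and is flagged.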
Traffic rate
API metrics are most useful where you have a high volume of traffic going to the API. If you call a service only intermittently, your API metrics won’t be statistically valid and won’t give you meaningful triage information.
For example, if you want to track the 99.5th percentile latency for a service, and you only do 100 calls an hour, watching the measurement over a two-hour period would only give you one data point to represent the 99.5th percentile, which won't tell you much about the normal behavior of the API or your application. Make sure the traffic rate, the percentile you are tracking, and the time window you are considering generate many data points of interest or the monitoring data will not be helpful to you.
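The arithmetic behind that example is simple enough to check before you set up a percentile alert. In this small sketch (the function name is hypothetical), the number of calls expected to fall above a percentile is the total call count multiplied by the fraction of calls above that percentile.

```python
def expected_tail_samples(calls_per_hour, window_hours, percentile):
    """Expected number of calls at or above the given percentile."""
    total_calls = calls_per_hour * window_hours
    return total_calls * (100 - percentile) / 100


# The example from the text: 100 calls an hour over two hours at p99.5.
print(expected_tail_samples(100, 2, 99.5))

# The same window at a much higher traffic rate.
print(expected_tail_samples(10_000, 2, 99.5))
```

If the result is only a handful of samples, consider a lower percentile, a longer window, or both.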
Supported APIs
All Google APIs and Google Cloud APIs, as well as APIs built on top of Cloud Endpoints and API Gateway, support API metrics. If you are an API consumer, you can view the Consumed API metrics in the API Dashboard. If you are an API producer, you can view the Produced API metrics in the Endpoints Dashboard.
Last updated 2025-12-15 UTC.