Billing settings for services
This page describes billing settings assuming the use of the default Cloud Run autoscaling behavior. See Billing behavior using manual scaling for additional considerations if you use manual scaling.
There are two billing settings in Cloud Run services:
- Request-based billing (default): Cloud Run instances are only charged when they process requests, when they start, and when they shut down. See instance lifecycle for more details. This setting was previously called CPU only allocated during request processing.
- Instance-based billing: Cloud Run instances are charged for the entire lifecycle of instances, even when there are no incoming requests. Instance-based billing can be useful for running short-lived background tasks and other asynchronous processing tasks. This setting was previously called CPU always allocated.
If you choose request-based billing, you are charged per request and only when the instance processes a request. If you choose instance-based billing, you are charged for the entire lifecycle of the instance. See the Cloud Run pricing tables for details.
Recommender automatically looks at traffic received by your Cloud Run service over the past month, and recommends switching from request-based billing to instance-based billing if this is cheaper.
Note: Unlike Cloud Run services, all Cloud Run jobs have instance-based billing.
CPU allocation impact
Selecting a billing setting impacts how CPU is allocated.
- With request-based billing, CPU is only allocated during request processing.
- With instance-based billing, CPU is allocated for the entire container instance lifecycle.
How to choose the appropriate billing setting
Choosing the appropriate billing setting for your use case depends on several factors, such as traffic patterns, background execution, and cost, each of which is described in the following sections.
Note: There is no container-based way to tell whether an instance changes from idle to actively serving. If detecting this kind of change is an issue for you, you should choose instance-based billing.
Traffic patterns considerations
- Request-based billing is recommended when incoming traffic is sporadic, bursty, or spiky.
- Instance-based billing is recommended when incoming traffic is steady or slowly varying.
Background execution considerations
Selecting instance-based billing allocates CPU even outside of request processing, letting you execute short-lived background tasks and other asynchronous processing work after returning responses. For example:
- Running monitoring agents, such as OpenTelemetry, that may assume they can run in the background.
- Using Go's goroutines, Node.js async, Java threads, and Kotlin coroutines.
- Using application frameworks that rely on built-in scheduling or background functionality.
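As an illustration of the background-work pattern in the list above, here is a minimal Python sketch. The `handle_request` function and its `results` list are hypothetical stand-ins, not part of any Cloud Run API. The handler returns right away and a thread finishes the remaining work. With request-based billing, CPU is only allocated during request processing, so a thread like this may not make progress once the response has been returned.

```python
import threading


def handle_request(payload, results):
    """Return a response immediately and finish slow work in the background.

    `results` is a hypothetical sink (for example, a list) that stands in
    for whatever the background work would really write to.
    """

    def background_work():
        # Stand-in for post-response work such as flushing telemetry.
        results.append(payload.upper())

    worker = threading.Thread(target=background_work, daemon=True)
    worker.start()
    # The caller gets its response without waiting for the worker.
    return "accepted", worker
```

Returning the thread object lets the caller (or a shutdown hook) join it later if the work must finish before the instance stops.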
Idle instances, including those kept warm using minimum instances, can be shut down at any time. If you need to finish outstanding tasks before the container is terminated, you can trap SIGTERM to give an instance a 10-second grace period before it is stopped.
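Trapping SIGTERM could look roughly like the following Python sketch. The names `shutdown_requested`, `install_sigterm_handler`, and `drain` are illustrative assumptions, and the default of 10 seconds mirrors the grace time mentioned above. The handler only records that shutdown was requested; the main flow then uses the remaining time to finish queued work.

```python
import signal
import threading
import time

# Set when the platform (or any process manager) sends SIGTERM.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    # Keep the handler tiny: record the request and let the main flow clean up.
    shutdown_requested.set()


def install_sigterm_handler():
    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGTERM, handle_sigterm)


def drain(pending, grace_seconds=10.0):
    """Run queued callables until the queue is empty or the grace period ends."""
    deadline = time.monotonic() + grace_seconds
    completed = []
    while pending and time.monotonic() < deadline:
        task = pending.pop(0)
        completed.append(task())
    return completed
```

A serving loop would typically call `install_sigterm_handler()` at startup, poll `shutdown_requested`, and call `drain` on its outstanding tasks once the event is set.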
Consider using Cloud Tasks for executing asynchronous tasks. Cloud Tasks automatically retries failed tasks and supports running times up to 30 minutes.
Cost considerations
If you are using request-based billing, instance-based billing can be more economical if:
- Your Cloud Run service is currently processing a high number of requests at a rather steady rate.
- You don't see a lot of "idle" instances when looking at the instance count metric.
You can use the pricing calculator to estimate cost differences.
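The comparison behind these cost considerations can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Workload` fields, the rate parameters, and the simplification that request-based billing charges only for request-processing time plus a per-request fee (ignoring startup and shutdown time), while instance-based billing charges for the whole instance lifetime with no per-request fee. The rates are placeholders, not Cloud Run prices; use the pricing tables or the pricing calculator for real numbers.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    instance_seconds: float      # total time instances were alive
    request_busy_seconds: float  # part of that time spent processing requests
    request_count: int


def request_based_cost(w, busy_rate, per_request_fee):
    # Charged while processing requests, plus a fee per request (assumed model).
    return w.request_busy_seconds * busy_rate + w.request_count * per_request_fee


def instance_based_cost(w, lifetime_rate):
    # Charged for the entire instance lifecycle (assumed: no per-request fee).
    return w.instance_seconds * lifetime_rate


def cheaper_setting(w, busy_rate, per_request_fee, lifetime_rate):
    request_cost = request_based_cost(w, busy_rate, per_request_fee)
    instance_cost = instance_based_cost(w, lifetime_rate)
    return "instance-based" if instance_cost < request_cost else "request-based"
```

Under this model, instance-based billing only wins when instances spend most of their lifetime processing requests, which matches the "few idle instances" guidance above.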
Autoscaling considerations
By default, Cloud Run autoscales the number of container instances.
For a service set to request-based billing, Cloud Run autoscales the number of instances based on CPU utilization only during request processing.
For a service set to instance-based billing, Cloud Run autoscales the number of instances based on CPU utilization for the entire lifecycle of the container instance, except when scaling to and from zero, where it only uses requests.
See manual scaling for additional considerations if you use manual scaling instead of the Cloud Run autoscaling feature.
Instance-based billing considerations
Even if the billing setting is set to instance-based billing, Cloud Run autoscaling is still in effect, and may terminate instances if they aren't needed to handle incoming traffic or current CPU utilization outside of requests. An instance will never stay idle for more than 15 minutes after processing a request unless it is kept active using minimum instances.
Combining instance-based billing with a number of minimum instances results in a number of instances up and running with full access to CPU resources, enabling background processing use cases. When using this pattern, Cloud Run applies instance autoscaling even if a service is using CPU outside of any requests.
If you use healthcheck probes, you must use instance-based billing for every probe. See container healthcheck probes for billing details.
Required roles
To get the permissions that you need to configure and deploy Cloud Run services, ask your administrator to grant you the following IAM roles:
- Cloud Run Developer (roles/run.developer) on the Cloud Run service
- Service Account User (roles/iam.serviceAccountUser) on the service identity
If you are deploying a service or function from source code, you must also have additional roles granted to you on your project and Cloud Build service account.
For a list of IAM roles and permissions that are associated with Cloud Run, see Cloud Run IAM roles and Cloud Run IAM permissions. If your Cloud Run service interfaces with Google Cloud APIs, such as Cloud Client Libraries, see the service identity configuration guide. For more information about granting roles, see deployment permissions and manage access.
Set and update billing
Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.
If you select instance-based billing, you must specify at least 512 MiB of memory.
You can change the billing setting using the Google Cloud console, the gcloud CLI, or a YAML file when you create a new service or deploy a new revision:
Console
In the Google Cloud console, go to the Cloud Run Services page.
Click Deploy container to configure a new service. If you are configuring an existing service, click the service, then click Edit and deploy new revision.
If you are configuring a new service, fill out the initial service settings page.
Select a billing setting under Billing. Select request-based billing for your instances to be charged only during request processing. Select instance-based billing for your instances to be charged for the entire lifetime of instances.
Click Create or Deploy.
gcloud
You can update the billing setting. To set instance-based billing for a given service:
    gcloud run services update SERVICE --no-cpu-throttling
Replace SERVICE with the name of your service.
To set request-based billing:
    gcloud run services update SERVICE --cpu-throttling
You can also set your billing setting during deployment. To set your billing setting to instance-based billing:
    gcloud run deploy --image IMAGE_URL --no-cpu-throttling
To set your billing setting to request-based billing:
    gcloud run deploy --image IMAGE_URL --cpu-throttling
Replace IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL follows the format of LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG.
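If you script these commands, the mapping from billing setting to flag can be kept in one place. The following Python sketch only assembles the argument list for the update command shown above; the helper names `billing_flag` and `update_command` are hypothetical, and nothing here calls gcloud.

```python
import shlex


def billing_flag(instance_based: bool) -> str:
    # --no-cpu-throttling selects instance-based billing;
    # --cpu-throttling selects request-based billing.
    return "--no-cpu-throttling" if instance_based else "--cpu-throttling"


def update_command(service: str, instance_based: bool) -> list:
    """Build the argv for updating the billing setting of an existing service."""
    return ["gcloud", "run", "services", "update", service, billing_flag(instance_based)]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(update_command("my-service", instance_based=True)))
```

Building an argument list rather than a single string avoids shell-quoting issues if the list is later passed to `subprocess.run`.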
YAML
If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
Update the run.googleapis.com/cpu-throttling annotation:
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
          annotations:
            run.googleapis.com/cpu-throttling: 'BOOLEAN'
          name: REVISION
Replace the following:
- SERVICE: the name of your Cloud Run service
- BOOLEAN: true to set request-based billing, or false to set instance-based billing.
- REVISION: a new revision name, or delete it (if present). If you supply a new revision name, it must meet the following criteria:
  - Starts with SERVICE-
  - Contains only lowercase letters, numbers, and -
  - Does not end with a -
  - Does not exceed 63 characters
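The naming criteria for REVISION can be checked before you run the replace command. This is a small Python sketch of those four rules as listed above; the function name `is_valid_revision_name` is hypothetical, and Cloud Run may enforce further constraints that are not covered here.

```python
import re


def is_valid_revision_name(name: str, service: str) -> bool:
    """Check a revision name against the four criteria listed above."""
    return (
        name.startswith(service + "-")                     # starts with SERVICE-
        and re.fullmatch(r"[a-z0-9-]+", name) is not None  # lowercase, digits, -
        and not name.endswith("-")                         # does not end with -
        and len(name) <= 63                                # at most 63 characters
    )
```

For example, with a service named `my-svc`, the name `my-svc-v2` passes, while `my-svc-` (trailing hyphen) and `my-svc-V2` (uppercase letter) do not.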
Create or update the service using the following command:
    gcloud run services replace service.yaml
Terraform
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
Add the following to a google_cloud_run_v2_service resource in your Terraform configuration:
    resource "google_cloud_run_v2_service" "default" {
      name     = "cloudrun-service-cpu-allocation"
      location = "us-central1"

      deletion_protection = false # set to "true" in production

      template {
        containers {
          image = "us-docker.pkg.dev/cloudrun/container/hello"
          resources {
            # If true, garbage-collect CPU once a request finishes
            cpu_idle = false
          }
        }
      }
    }
View billing settings
To view the current billing settings for your Cloud Run service:
Console
In the Google Cloud console, go to the Cloud Run Services page.
Click the service to open the Service details page.
Click the Revisions tab.
In the details panel, the Billing setting is listed under the General tab.
gcloud
Run the following command to view the billing configuration:
    gcloud run services describe SERVICE --format=yaml
In the YAML output, find the run.googleapis.com/cpu-throttling setting. A value of false indicates instance-based billing, and if this setting is missing, it indicates request-based billing.
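The same check can be scripted. The sketch below assumes you ran the describe command with `--format=json` instead of YAML and that the JSON mirrors the structure of the service YAML shown earlier (`spec.template.metadata.annotations`). The function name `billing_setting` is hypothetical. It applies the rule above: a value of false means instance-based billing, and a missing (or true) value means request-based billing.

```python
import json

ANNOTATION = "run.googleapis.com/cpu-throttling"


def billing_setting(service: dict) -> str:
    """Infer the billing setting from a parsed service description."""
    metadata = service.get("spec", {}).get("template", {}).get("metadata", {})
    annotations = metadata.get("annotations") or {}
    value = annotations.get(ANNOTATION)
    # Only an explicit "false" indicates instance-based billing.
    if value is not None and str(value).lower() == "false":
        return "instance-based"
    return "request-based"


if __name__ == "__main__":
    # Hypothetical, trimmed-down describe output for illustration only.
    sample = json.loads(
        '{"spec": {"template": {"metadata": {"annotations": '
        '{"run.googleapis.com/cpu-throttling": "false"}}}}}'
    )
    print(billing_setting(sample))
```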
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-18 UTC.