Error code 429

If the number of your requests exceeds the capacity allocated to process requests, then error code 429 is returned. The following table displays the error message generated by each type of quota framework:

Quota framework	Message
Pay-as-you-go	`Resource exhausted, please try again later.`
Provisioned Throughput	`Too many requests. Exceeded the Provisioned Throughput.`
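
As a minimal sketch of how a client might use the table above, the following Python helper matches the two message strings to their quota framework. The function name and the returned labels are illustrative assumptions, not part of any Vertex AI SDK; only the message strings come from the table.

```python
# Message strings are copied from the table above. Everything else
# (names, return labels) is hypothetical and for illustration only.
PAYG_MESSAGE = "Resource exhausted, please try again later."
PT_MESSAGE = "Too many requests. Exceeded the Provisioned Throughput."


def quota_framework_for_429(message: str) -> str:
    """Guess which quota framework produced a 429 from its message text."""
    if PT_MESSAGE in message:
        return "provisioned-throughput"
    if PAYG_MESSAGE in message:
        return "pay-as-you-go"
    # The message did not match either row of the table.
    return "unknown"
```

Substring matching is used because client libraries often wrap the server message in additional text.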

With a Provisioned Throughput subscription, you can reserve an amount of throughput for specific generative AI models. If you don't have a Provisioned Throughput subscription and resources aren't available to your application, then an error code 429 is returned. Although you don't have reserved capacity, you can try your request again. However, the request isn't counted against your error rate as described in your service level agreement (SLA).

For projects that have purchased Provisioned Throughput, Vertex AI measures a project's throughput and reserves the purchased amount of throughput for the project's actual usage.

For standard Provisioned Throughput, when you use less than your purchased amount, errors that might otherwise be 429 are returned as 5XX and count toward the SLA error rate. For Single Zone Provisioned Throughput, when you use less than your purchased amount, capacity-related 429 errors are treated as 5XX but don't count toward the SLA error rate. When you exceed your purchased amount, the additional requests are processed on-demand as pay-as-you-go.
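
The rules in the preceding paragraph can be summarized as a small decision function. This is only a sketch of the stated behavior, not how Vertex AI implements it: the `Treatment` type, the function name, and the `"standard"` and `"single-zone"` labels are hypothetical, and treating usage exactly equal to the purchased amount as overage is an assumption, because the text only covers "less than" and "exceed".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Treatment:
    surfaced_as: str                   # "5XX" or "pay-as-you-go"
    counts_toward_sla: Optional[bool]  # None where the text above doesn't say


def capacity_error_treatment(pt_type: str, used: float, purchased: float) -> Treatment:
    """Describe how a capacity-related error is treated for a subscription."""
    if pt_type not in ("standard", "single-zone"):
        raise ValueError(f"unknown Provisioned Throughput type: {pt_type!r}")
    if used < purchased:
        # Within the purchased amount: would-be 429s are surfaced as 5XX.
        # Only standard Provisioned Throughput counts them toward the SLA.
        return Treatment("5XX", counts_toward_sla=(pt_type == "standard"))
    # Beyond the purchased amount: processed on-demand as pay-as-you-go.
    return Treatment("pay-as-you-go", counts_toward_sla=None)
```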

Pay-as-you-go

On the pay-as-you-go quota framework, you have the following options for resolving 429 errors:

Use the global endpoint instead of a regional endpoint whenever possible.
Implement a retry strategy by using truncated exponential backoff.
If your model uses quotas, you can submit a Quota Increase Request (QIR). If your model uses Standard pay-as-you-go, smoothing traffic and reducing large spikes can help.
Subscribe to Provisioned Throughput for a more consistent level of service. For more information, see Provisioned Throughput.
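
The retry option in the list above can be sketched in Python as follows. This is a minimal illustration of truncated exponential backoff with jitter, not an official client: `TooManyRequests` is a hypothetical stand-in for whatever exception your client library raises on a 429, and the defaults (five retries, a 1-second base, a 32-second cap) are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class TooManyRequests(Exception):
    """Hypothetical stand-in for the 429 error raised by your client."""


def backoff_delays(
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 32.0,
    rng: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield waits of min(base * 2**n + jitter, cap) for n = 0, 1, ..."""
    for attempt in range(max_retries):
        yield min(base * 2 ** attempt + rng(), cap)


def call_with_backoff(
    send: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call send(), retrying on TooManyRequests until retries run out."""
    delays = backoff_delays(max_retries, base, cap, rng)
    while True:
        try:
            return send()
        except TooManyRequests:
            delay = next(delays, None)
            if delay is None:
                raise  # Retries exhausted: surface the last 429.
            sleep(delay)
```

The random jitter spreads out retries from many clients so that they don't all hit the service again at the same moment, and the cap keeps the wait from growing without bound.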

Provisioned Throughput

To correct the 429 error generated by Provisioned Throughput, do the following:

Use the Default behavior example, which doesn't set a header in prediction requests. Any overages are processed on-demand and billed as pay-as-you-go.
Increase the number of GSUs in your Provisioned Throughput subscription.
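
As a rough sketch of the first option, the following Python helper returns a copy of a request's headers without the request-type header, so that the request falls back to the default behavior. This page doesn't name the header: `X-Vertex-AI-LLM-Request-Type` is an assumption recalled from the Provisioned Throughput documentation and should be verified there, and the helper name is hypothetical.

```python
from typing import Dict, Mapping

# Assumption: this header name is not stated on this page. Verify it against
# the Provisioned Throughput documentation before relying on it.
REQUEST_TYPE_HEADER = "X-Vertex-AI-LLM-Request-Type"


def with_default_behavior(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of headers with the request-type header removed."""
    target = REQUEST_TYPE_HEADER.lower()
    # HTTP header names are case-insensitive, so compare in lowercase.
    return {name: value for name, value in headers.items() if name.lower() != target}
```

With the header absent, overages beyond the purchased amount are processed on-demand and billed as pay-as-you-go, as described above.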

What's next

To learn more about Standard pay-as-you-go, see Standard pay-as-you-go.
To learn more about Provisioned Throughput, see Provisioned Throughput.
To learn about quotas and limits for Vertex AI, see Vertex AI quotas and limits.
To learn more about Google Cloud quotas and system limits, see the Cloud Quotas documentation.
To learn more about API errors, see API errors.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.
