Help with 503 upstream timeouts on Kubernetes + FastAPI (HPA not scaling)#14417

Unanswered
gcarrascoro asked this question in Questions

First Check

  • I added a very descriptive title here.
  • I used the GitHub search to find a similar question and didn't find it.
  • I searched the FastAPI documentation, with the integrated search.
  • I already searched in Google "How to X in FastAPI" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to FastAPI but to Pydantic.
  • I already checked if it is not related to FastAPI but to Swagger UI.
  • I already checked if it is not related to FastAPI but to ReDoc.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

.

Description

Hey all!
I’m running a FastAPI application on Kubernetes and encountering 503 upstream timeout errors under load. I suspect my issue is related to a mismatch between sync endpoints, worker capacity, and autoscaling not triggering (HPA).

My setup:

  • FastAPI app
  • Endpoints are currently sync (def)

Running with:

  • autoscaling/v2
  • Ingress: projectcontour.io/v1 (Contour Ingress)

The application does not handle bursts of requests well, yet resource usage never rises high enough to trigger HPA scaling, which leads to the 503s.

What’s the recommended approach for handling sync endpoints that may block workers but don’t consume enough CPU/memory to trigger the HPA? Are there recommended Gunicorn/Uvicorn worker configurations? I understand from the documentation that when running in pods, the application should use the fewest workers possible and be scaled up or down based on workload.

Are there known best practices for HPA when applications are I/O bound but not CPU bound (so autoscaling doesn’t trigger)?
Moreover, what's the best approach to mark an overloaded pod so that no more requests are sent to it, with excess requests failing fast with a 503?

Any help or tips are more than welcome. Thanks!

Operating System

Windows

Operating System Details

No response

FastAPI Version

0.79.0

Pydantic Version

1.10.24

Python Version

3.11.12

Additional Context

No response


Replies: 0 comments

Labels
question (Question or problem)

1 participant
@gcarrascoro
