Crawlee increasing concurrency until it dies #1535

Open
Assignees: Pijukatel
Labels: t-tooling (Issues with this label are in the ownership of the tooling team.)

Description

@ericvg97

I am deploying Crawlee in a Kubernetes pod. It is repeatedly OOMKilled because Crawlee keeps increasing the desired concurrency. I don't want to lower max_concurrency because some of the domains I crawl are very lightweight while others aren't, and I'd like Crawlee to maximize throughput. I could also increase the pod's RAM, but I think there is an underlying issue that would resurface later (or I would just be underusing my resources).

I am seeing this log, which makes me suspect Crawlee doesn't actually know how much memory and CPU it is using:
current_concurrency = 21; desired_concurrency = 21; cpu = 0.0; mem = 0.0; event_loop = 0.148; client_info = 0.0

CPU is not a big problem because the kernel only throttles CPU for this pod, but memory is a hard limit, and Kubernetes kills the pod when it is exceeded.
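One way to check what the container itself reports, independently of Crawlee, is to read the cgroup files directly and compare them with the `mem` value in the log. The following is a minimal sketch, assuming cgroup v2 mounted at `/sys/fs/cgroup` (cgroup v1 uses different file names). The function names are illustrative and not part of Crawlee.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_cgroup_v2_memory(cgroup_dir: str = "/sys/fs/cgroup") -> Tuple[Optional[int], Optional[int]]:
    """Return (current_bytes, limit_bytes); each is None if it cannot be determined."""
    root = Path(cgroup_dir)

    def read_value(name: str) -> Optional[int]:
        try:
            raw = (root / name).read_text().strip()
        except OSError:
            # File missing or unreadable (e.g. not cgroup v2, or not in a container).
            return None
        # memory.max holds the literal string "max" when no limit is set.
        return None if raw == "max" else int(raw)

    return read_value("memory.current"), read_value("memory.max")


def memory_usage_ratio(current: Optional[int], limit: Optional[int]) -> Optional[float]:
    """Fraction of the memory limit in use, or None if either value is unknown."""
    if current is None or not limit:
        return None
    return current / limit


if __name__ == "__main__":
    current, limit = read_cgroup_v2_memory()
    print(f"current={current} limit={limit} ratio={memory_usage_ratio(current, limit)}")
```

If this prints a non-trivial ratio inside the pod while the log still shows `mem = 0.0`, that would support the suspicion that the reported value does not reflect the container's real usage.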

For more context, I am using the adaptive Playwright crawler with BeautifulSoup and headless Firefox, and my concurrency settings are:
concurrency_settings = ConcurrencySettings(max_concurrency=45, desired_concurrency=10, min_concurrency=10)

and I am giving the pod

resources:
  limits:
    cpu: "12"
    memory: "12Gi"
  requests:
    cpu: "8"
    memory: "8Gi"
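As a rough sanity check on whether `max_concurrency=45` can fit under the 12Gi limit at all, the arithmetic can be sketched as below. The per-task footprint (300 MiB) and the reserved overhead (1 GiB) are hypothetical placeholders, not measurements from this deployment, and the helper is not a Crawlee API.

```python
def concurrency_cap_from_memory(
    limit_mib: int,
    reserved_mib: int,
    per_task_mib: int,
    min_concurrency: int,
    max_concurrency: int,
) -> int:
    """Estimate how many concurrent tasks fit under a memory limit.

    The result is clamped to [min_concurrency, max_concurrency], so if the
    budget is smaller than min_concurrency tasks, the returned value can
    still exceed what the budget allows.
    """
    if per_task_mib <= 0:
        raise ValueError("per_task_mib must be positive")
    # Memory left for crawling after the assumed fixed overhead.
    budget_mib = max(limit_mib - reserved_mib, 0)
    cap = budget_mib // per_task_mib
    return max(min_concurrency, min(max_concurrency, cap))


if __name__ == "__main__":
    # 12Gi limit from the pod spec; 1 GiB reserve and 300 MiB per task are assumptions.
    print(concurrency_cap_from_memory(12 * 1024, 1024, 300, 10, 45))
```

With those assumed numbers the estimate comes out below 45, which would be consistent with the pod being OOMKilled once the concurrency approaches the configured maximum on heavy pages. Real per-page memory for headless Firefox varies widely by site, so the figure is only useful once it is replaced with measured values.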
