Improve the throughput of SocketsHttpHandler's HTTP/1.1 connection pool #99364


Merged
MihaZupan merged 5 commits into dotnet:main from MihaZupan:http-h1-contention on Mar 22, 2024

Conversation

@MihaZupan (Member) commented on Mar 6, 2024 (edited)

Closes #70098

The connection pool currently manages the list of available connections and the request queue under a single lock.
As the number of cores and the RPS rise, the speed at which the pool can manage connections becomes a bottleneck.

This PR brings the fast path (there are enough connections available to process all requests) down to a ConcurrentStack.Push + ConcurrentStack.TryPop.
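For intuition, here is a minimal illustrative sketch of that idea (not the actual SocketsHttpHandler code; the type and member names below are made up): on the fast path, renting and returning an idle connection only touch a lock-free ConcurrentStack, and the pool lock is needed only on the slow path (no idle connection, so the request must be queued or a new connection created).

```csharp
using System.Collections.Concurrent;

// Hypothetical stand-in for the pooled HTTP/1.1 connection type.
public sealed class PooledConnection { }

public sealed class Http11PoolSketch
{
    // Idle connections live in a lock-free stack; Push/TryPop form the fast path.
    private readonly ConcurrentStack<PooledConnection> _idle = new();

    public bool TryRentIdleConnection(out PooledConnection? connection) =>
        _idle.TryPop(out connection); // no lock taken when an idle connection exists

    public void ReturnConnection(PooledConnection connection) =>
        _idle.Push(connection); // no lock taken when returning a connection

    // The slow path (queueing the request, injecting a new connection, enforcing
    // MaxConnectionsPerServer) would still run under the pool lock; omitted here.
}
```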


Numbers for ConcurrentQueue

Numbers from #70098 (comment):

crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/httpclient.benchmarks.yml --scenario httpclient-kestrel-get --profile aspnet-citrine-lin --variable useHttpMessageInvoker=true --variable concurrencyPerHttpClient=256 --variable numberOfHttpClients=1 --server.framework net9.0 --client.framework net9.0 --json 1x256.json
crank --config https://raw.githubusercontent.com/aspnet/Benchmarks/main/scenarios/httpclient.benchmarks.yml --scenario httpclient-kestrel-get --profile aspnet-citrine-lin --variable useHttpMessageInvoker=true --variable concurrencyPerHttpClient=32 --variable numberOfHttpClients=8 --server.framework net9.0 --client.framework net9.0 --json 8x32.json
crank compare .\1x256.json .\8x32.json
| client | 1x256 | 8x32 | |
| --- | --- | --- | --- |
| RPS | 693,873 | 875,814 | +26.22% |
| Patched RPS | 873,571 | 876,394 | +0.32% |

This shows that before this PR, manually splitting the load between multiple HttpClient instances could have a significant impact.
After the change there is no longer any benefit to doing so, as a single pool can efficiently handle the higher load.

YARP's http-http 100 byte scenario:

| load | yarp-base | yarp-patched | |
| --- | --- | --- | --- |
| Latency 50th (ms) | 0.73 | 0.68 | -6.97% |
| Latency 75th (ms) | 0.82 | 0.74 | -9.82% |
| Latency 90th (ms) | 1.03 | 0.89 | -13.39% |
| Latency 95th (ms) | 1.41 | 1.18 | -16.41% |
| Latency 99th (ms) | 2.87 | 2.68 | -6.63% |
| Mean latency (ms) | 0.83 | 0.76 | -8.74% |
| Requests/sec | 306,699 | 335,921 | +9.53% |

In-memory loopback benchmark that stresses the connection pool contention: https://gist.github.com/MihaZupan/27f01d78c71da7b9024b321e743e3d88

Rough RPS numbers with 1-6 threads:

| RPS (1000s) | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| main | 2060 | 1900 | 1760 | 1670 | 1570 | 1500 |
| patched | 2150 | 2600 | 3400 | 3700 | 4100 | 4260 |

Breaking change consideration - this is no longer relevant after switching to ConcurrentStack

While I was careful to keep the observable behavior of the pool as close as possible to what we have today, there is one important change I made intentionally:

  • The order in which we dequeue idle connections changes from LIFO to FIFO (from a stack to a queue), because the backing store for available connections is now a ConcurrentQueue (illustrated in the sketch below).
  • Where this distinction may matter is when load drops for a longer period such that we no longer need as many connections. We would previously leave the surplus connections completely idle and eventually remove them via the idle timeout. With this change, we would keep cycling through all connections, potentially keeping more of them alive.
  • A slight benefit of that behavior is that it makes the idle close race condition (the server closing an idle connection just after we've started using it again) less likely.

See #99364 (comment) for ConcurrentStack results (current PR).
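To make the LIFO-vs-FIFO distinction concrete, here is a tiny, hedged illustration (the connection names are invented): with a stack, the most recently returned connection is handed out again first, so surplus connections can sit idle long enough for the idle timeout to reap them; with a queue, the longest-idle connection is reused first, so all connections keep rotating.

```csharp
using System.Collections.Concurrent;

var stack = new ConcurrentStack<string>();
var queue = new ConcurrentQueue<string>();

// Three connections are returned to the pool in the same order.
foreach (var conn in new[] { "conn1", "conn2", "conn3" })
{
    stack.Push(conn);
    queue.Enqueue(conn);
}

// LIFO: the most recently idled connection is reused; conn1/conn2 stay idle.
stack.TryPop(out var fromStack);      // "conn3"

// FIFO: the longest-idle connection is reused, so every connection keeps getting touched.
queue.TryDequeue(out var fromQueue);  // "conn1"

Console.WriteLine($"stack reuses {fromStack}, queue reuses {fromQueue}");
```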

@MihaZupan added this to the 9.0.0 milestone on Mar 6, 2024
@MihaZupan requested a review from a team on Mar 6, 2024 at 16:45
@MihaZupan self-assigned this on Mar 6, 2024
@ghost commented:

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Author: MihaZupan
Assignees: MihaZupan
Labels: area-System.Net.Http
Milestone: 9.0.0

@MihaZupan (Member, Author) commented:

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines (Azure Pipelines) commented:

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan (Member, Author) commented:

/azp run runtime-libraries stress-http

@azure-pipelines (Azure Pipelines) commented:

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan (Member, Author) commented:

Using a stack here is close enough (in benchmarks the collection is going to be close to empty all the time, so contention between the stack and the queue is similar). I'll switch the PR to use that to avoid the behavioral change.
We can revisit this in the future with more idle-eviction heuristics to get the last few % with a queue if needed.

It does mean an extra 32-byte allocation for each enqueue operation, sadly (+1 for #31911).
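A rough, hedged illustration of where that extra allocation comes from (a standalone micro-measurement, not the PR's benchmark): ConcurrentStack.Push wraps every item in a freshly allocated linked-list node, while ConcurrentQueue writes into preallocated ring-buffer segments, so steady-state enqueue/dequeue pairs on the queue typically allocate nothing.

```csharp
using System.Collections.Concurrent;

var stack = new ConcurrentStack<object>();
var queue = new ConcurrentQueue<object>();
var item = new object();

// Warm the queue up so its initial segment is already allocated.
for (int i = 0; i < 64; i++) { queue.Enqueue(item); queue.TryDequeue(out _); }

long MeasureAllocations(Action action)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    action();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

const int Iterations = 100_000;

// Each Push allocates a new node object (~32 bytes for a reference-type payload on 64-bit).
long stackBytes = MeasureAllocations(() =>
{
    for (int i = 0; i < Iterations; i++) { stack.Push(item); stack.TryPop(out _); }
});

// Enqueue reuses slots in the existing segment, so steady-state pairs allocate ~nothing.
long queueBytes = MeasureAllocations(() =>
{
    for (int i = 0; i < Iterations; i++) { queue.Enqueue(item); queue.TryDequeue(out _); }
});

Console.WriteLine($"stack: ~{stackBytes / Iterations} B/op, queue: ~{queueBytes / Iterations} B/op");
```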

| load | yarp-main | yarp-stack | | yarp-queue | |
| --- | --- | --- | --- | --- | --- |
| Latency 50th (ms) | 0.73 | 0.69 | -4.95% | 0.68 | -5.91% |
| Latency 75th (ms) | 0.81 | 0.75 | -7.49% | 0.74 | -8.97% |
| Latency 90th (ms) | 0.99 | 0.88 | -10.98% | 0.85 | -14.40% |
| Latency 95th (ms) | 1.31 | 1.13 | -13.99% | 1.05 | -20.00% |
| Latency 99th (ms) | 2.83 | 2.60 | -7.95% | 2.45 | -13.29% |
| Mean latency (ms) | 0.82 | 0.76 | -6.78% | 0.75 | -8.59% |
| Requests/sec | 312,857 | 335,444 | +7.22% | 342,141 | +9.36% |

| client | client-main | client-stack | | client-queue | |
| --- | --- | --- | --- | --- | --- |
| Requests | 80,028,791 | 107,128,778 | +33.86% | 107,868,124 | +34.79% |
| Mean RPS | 666,886 | 892,749 | +33.87% | 898,902 | +34.79% |

| Method | Toolchain | Mean | Error | Ratio | Allocated | Alloc Ratio |
| --- | --- | --- | --- | --- | --- | --- |
| SendAsync | main | 517.0 ns | 4.27 ns | 1.00 | 552 B | 1.00 |
| SendAsync | stack | 482.0 ns | 2.87 ns | 0.93 | 584 B | 1.06 |
| SendAsync | queue | 471.1 ns | 1.37 ns | 0.91 | 552 B | 1.00 |

@MihaZupan (Member, Author) commented:

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines (Azure Pipelines) commented:

Azure Pipelines successfully started running 1 pipeline(s).

@MihaZupan (Member, Author) commented:

/azp run runtime-libraries stress-http

@azure-pipelines (Azure Pipelines) commented:

Azure Pipelines successfully started running 1 pipeline(s).

Reviewers

@stephentoub approved these changes

Assignees

@MihaZupan

Projects
None yet
Milestone
9.0.0
Development

Successfully merging this pull request may close these issues.

HttpConnectionPool contention for HTTP/1.1
2 participants
@MihaZupan, @stephentoub
