[Proposal] use Timer in MemoryCache #45842


Closed

Conversation

@adamsitnik (Member) commented Dec 9, 2020 (edited):

This is my last proposal for improving MemoryCache, at least for a while (I've run out of ideas).

After #45281, #45280, #45410, #45563, aspnet/Benchmarks#1603, and aspnet/Benchmarks#1607, we got to the place where the RPS for the TE Caching benchmark has increased by 15% (from ~230k to ~270k).

Now, most of the time related to caching (most of the time, in general, is spent in JSON serialization) and not related to the usage of ConcurrentDictionary is spent in getting the current time (to tell if the given entry has expired or not):

[profiler screenshot]

In the case of the TE Caching benchmark, all cache entries have no expiration date. So this work is redundant and could be detected before getting the current time.

The problem is that MemoryCache uses every public method to check if it has to scan existing cache items for expired ones. To be able to do that, it needs the current time.
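
The scan-on-access pattern described above can be sketched roughly like this (a simplified, hypothetical rendering; names like `_lastExpirationScan` and `ScanForExpiredItems` are illustrative, not the actual MemoryCache internals):

```csharp
// Simplified sketch of the scan-on-access pattern described above.
// Field and method names here are illustrative only.
public object? Get(object key)
{
    DateTimeOffset utcNow = _options.Clock.UtcNow; // fetched on every public call

    object? value = null;
    if (_entries.TryGetValue(key, out CacheEntry entry) && !entry.CheckExpired(utcNow))
    {
        entry.LastAccessed = utcNow;
        value = entry.Value;
    }

    // Every public method also decides whether an expiration scan is due,
    // which is why the current time is needed even for non-expiring entries.
    if (utcNow - _lastExpirationScan > _options.ExpirationScanFrequency)
    {
        _lastExpirationScan = utcNow;
        Task.Run(() => ScanForExpiredItems(utcNow));
    }

    return value;
}
```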

We could store the information that all items in the given cache can't expire, but that would rather be a hack.

Instead of this, we can introduce a Timer to MemoryCache and ask it to perform the scan in the background at a configured time interval. And this is what this proposal is all about.

I assume that most of our users have a single instance of the cache and they use the default expiration scan frequency (1 minute).

This change gives us a 10% boost for the TE caching benchmark (from 274k to 307k RPS). After it, the ConcurrentDictionary becomes the bottleneck. But even if we replace the cache with an array, the RPS is around 330k, so it puts us very close to the maximum limit.

FWIW the microbenchmark results:

| Method                           | Toolchain            | Mean         | Ratio |
|----------------------------------|----------------------|-------------:|------:|
| GetHit                           | \after\CoreRun.exe   | 22.46 ns     | 0.21  |
| GetHit                           | \before\CoreRun.exe  | 108.11 ns    | 1.00  |
| TryGetValueHit                   | \after\CoreRun.exe   | 21.71 ns     | 0.20  |
| TryGetValueHit                   | \before\CoreRun.exe  | 111.43 ns    | 1.00  |
| GetMiss                          | \after\CoreRun.exe   | 14.41 ns     | 0.15  |
| GetMiss                          | \before\CoreRun.exe  | 92.94 ns     | 1.00  |
| TryGetValueMiss                  | \after\CoreRun.exe   | 14.35 ns     | 0.15  |
| TryGetValueMiss                  | \before\CoreRun.exe  | 93.17 ns     | 1.00  |
| SetOverride                      | \after\CoreRun.exe   | 296.49 ns    | 0.99  |
| SetOverride                      | \before\CoreRun.exe  | 297.80 ns    | 1.00  |
| AddThenRemove_NoExpiration       | \after\CoreRun.exe   | 46,419.16 ns | 0.93  |
| AddThenRemove_NoExpiration       | \before\CoreRun.exe  | 50,281.66 ns | 1.00  |
| AddThenRemove_AbsoluteExpiration | \after\CoreRun.exe   | 51,470.07 ns | 0.97  |
| AddThenRemove_AbsoluteExpiration | \before\CoreRun.exe  | 52,867.99 ns | 1.00  |
| AddThenRemove_RelativeExpiration | \after\CoreRun.exe   | 51,951.99 ns | 0.95  |
| AddThenRemove_RelativeExpiration | \before\CoreRun.exe  | 54,729.02 ns | 1.00  |
| AddThenRemove_SlidingExpiration  | \after\CoreRun.exe   | 44,456.62 ns | 0.86  |
| AddThenRemove_SlidingExpiration  | \before\CoreRun.exe  | 51,965.55 ns | 1.00  |

@eerhardt @davidfowl @maryamariyan @Tratcher

@adamsitnik added the NO-MERGE (The PR is not ready for merge yet; see discussion for detailed reasons), tenet-performance (Performance related issue), discussion, and area-Extensions-Caching labels on Dec 9, 2020
@ghost commented:

Tagging subscribers to this area: @eerhardt, @maryamariyan
See info in area-owners.md if you want to be subscribed.


Comment on lines +64 to +66:

```csharp
_timer = new System.Timers.Timer(Math.Max(1, _options.ExpirationScanFrequency.TotalMilliseconds));
_timer.Elapsed += OnTimerElapsed;
_timer.Start();
```
Member commented:

This is bad for testability. Any timing mechanism needs to be abstracted for the unit tests.

Member (Author) replied:

I agree, but it's still doable.

@Tratcher (Member):

> Now, most of the time related to caching (most of the time, in general, is spent in JSON serialization) and not related to the usage of ConcurrentDictionary is spent in getting the current time (to tell if the given entry has expired or not):

Have you tried making the current time check more efficient?

@adamsitnik (Member, Author):

> Have you tried making the current time check more efficient?

I did: #45281. Now it's just a syscall and I can't see a way to optimize it further.

@Clockwork-Muse (Contributor):

...because these should always be absolute stamps, in theory you could do something like get the current process/system time on cache initialization, via something like Environment.TickCount64 or similar (obviously you'd need an abstraction for testing purposes, and TickCount64 is in milliseconds, which might not be sufficient), then use that for tracking expiration. This should remove the verification when getting the current time, although I don't know how much that would help.
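
As a rough sketch of that idea (hypothetical names; Environment.TickCount64 is a monotonic millisecond counter, so no wall-clock read is needed on the hot path):

```csharp
// Sketch: track absolute expiration as an Environment.TickCount64 deadline
// instead of a wall-clock DateTimeOffset. All names here are illustrative.
internal sealed class TickBasedEntry
{
    private readonly long _expirationDeadline; // TickCount64 value at which the entry expires

    public TickBasedEntry(TimeSpan timeToLive)
        => _expirationDeadline = Environment.TickCount64 + (long)timeToLive.TotalMilliseconds;

    // Reading the tick counter avoids the DateTime.UtcNow path entirely.
    public bool IsExpired => Environment.TickCount64 >= _expirationDeadline;
}
```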

@Tratcher (Member) left a review:

Embedding a timer in the implementation makes the component impossible to test reliably. Tests must be in full control of any timing. This is a blocking design point to resolve.

@drieseng (Contributor):

@adamsitnik From having a quick look at MemoryCache, there are a few (small) quick wins left:

Don't hesitate to correct me!


```csharp
private readonly MemoryCacheOptions _options;
private readonly ConcurrentDictionary<object, CacheEntry> _entries;
private readonly System.Timers.Timer _timer;
```
Member commented:

Besides the abstraction thing, why not use System.Threading.Timer? It should have less overhead, as System.Timers.Timer also relies on Threading.Timer.
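
Assuming the same scan callback as in the PR, the System.Threading.Timer variant might look roughly like this (a sketch, not the final code; `ScanForExpiredItems` stands in for the PR's existing scan method):

```csharp
// Sketch: replacing System.Timers.Timer with the lighter System.Threading.Timer.
TimeSpan period = TimeSpan.FromMilliseconds(
    Math.Max(1, _options.ExpirationScanFrequency.TotalMilliseconds));

_timer = new System.Threading.Timer(
    callback: state => ((MemoryCache)state!).ScanForExpiredItems(),
    state: this,
    dueTime: period,
    period: period);
```

Passing the cache as the timer state keeps the callback free of captured references; a full implementation would also dispose the timer in Dispose().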

Member replied:

That's right, nothing should be using System.Timers.Timer.

Member (Author) replied:

I've used this particular timer because it was the first one VS suggested ;)

I wanted to have a proof of concept to get some numbers and get feedback before I invest more in the implementation.

@adamsitnik (Member, Author):

@drieseng you are right, but the places that you have pointed out are not on the hot path that I am optimizing. So even if I optimize them, it won't have any effect.

@adamsitnik (Member, Author):

> Embedding a timer in the implementation makes the component impossible to test reliably. Tests must be in full control of any timing. This is a blocking design point to resolve.

It's harder, but not impossible to mock.

The question is whether the testing of MemoryCache should be exposed to the users. This would require me to introduce an official public abstraction for Timer, similar to what you did with ISystemClock.
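
If such an abstraction were introduced, it might look something like the following (a hypothetical shape for illustration only, not a proposed API):

```csharp
// Hypothetical timer abstraction, analogous in spirit to ISystemClock,
// that would let unit tests fire the expiration scan deterministically.
public interface ICacheTimer : IDisposable
{
    void Start(TimeSpan period, Action callback);
}

// Test double: the test triggers the scan explicitly instead of waiting.
internal sealed class ManualCacheTimer : ICacheTimer
{
    private Action? _callback;
    public void Start(TimeSpan period, Action callback) => _callback = callback;
    public void Fire() => _callback?.Invoke();
    public void Dispose() => _callback = null;
}
```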

@adamsitnik (Member, Author):

> This should remove the verification when getting the current time

I've already removed the redundant verification in #45281. Whatever Environment.TickCount64 uses, it's most probably a syscall and would not solve our problem.

@drieseng (Contributor):

> @drieseng you are right, but the places that you have pointed out are not on the hot path that I am optimizing. So even if I optimize them, it won't have any effect.

It should affect the AddThenRemove and SetOverride benchmark results you've mentioned in the description... BUT... I'm saying this without having actually seen the benchmark code and without having tested the proposed changes :p

If you can guarantee that I can have the dotnet runtime built on my Windows box in 30 minutes, I definitely want to try to prove you wrong (or admit my defeat).

@edevoogd:

I may have missed some crucial implementation details, but could this work instead to achieve the objective?

  • Change the signature of CacheEntry.CheckExpired() to take a Lazy<DateTimeOffset> instead of a DateTimeOffset. The implementation of CacheEntry.CheckForExpiredTime() would only invoke the factory method (now would be changed to now.Value) if AbsoluteExpiration.HasValue is true or _slidingExpiration.HasValue is true.
  • Set CacheEntry.LastAccessed only if _slidingExpiration.HasValue is true.
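
Sketched out, that suggestion might look like this (illustrative only; the actual CacheEntry fields and method names differ):

```csharp
// Sketch of the Lazy<DateTimeOffset> idea: the clock is only read (now.Value)
// when the entry actually has an absolute or sliding expiration to check.
internal bool CheckExpired(Lazy<DateTimeOffset> now)
{
    if (_absoluteExpiration.HasValue && now.Value >= _absoluteExpiration.Value)
        return true;

    if (_slidingExpiration.HasValue)
    {
        if (now.Value - LastAccessed >= _slidingExpiration.Value)
            return true;
        LastAccessed = now.Value; // updated only when sliding expiration is in play
    }

    return false; // no expiration configured => the factory method never runs
}
```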

@filipnavara (Member) commented Dec 10, 2020 (edited):

> Whatever Environment.TickCount64 uses, it's most probably a syscall

Not necessarily: Windows and some versions of Linux have a special read-only memory page mapped into all processes, and this value is directly readable from it without a syscall.

It is also monotonically increasing, unlike the real clock (which may be an attack vector for cache performance degradation).

@Clockwork-Muse (Contributor):

> I've already removed the redundant verification in #45281

... and, at least on Windows, there's a whole bunch of stuff that happens when calling DateTime.UtcNow, like leap second handling.

@adamsitnik (Member, Author):

> at least on Windows, there's a whole bunch of stuff that happens when calling DateTime.UtcNow, like leap second handling

I've optimized everything except leap seconds in the past there: dotnet/coreclr#26046

And now @GrabYourPitchforks is working on leap second handling perf: #44771

@gfoidl mentioned this pull request Dec 14, 2020
@edevoogd:

@adamsitnik In a previous comment, I did not take into account CacheEntry.LastAccessed being updated on a lot of the public cache operations. You may want to have a look at a suggested direction for further optimization that I outlined here.

@adamsitnik (Member, Author):

@edevoogd big thanks for your proposal! I really like the idea described in https://github.com/edevoogd/ClockQuantization!

@adamsitnik (Member, Author):

For reasons described by @DamianEdwards on Twitter, I've decided to not pursue this idea. If we ever decide that we want to be in the Top 5 for this benchmark, we should reconsider it.

@davidfowl (Member):

Yes, we do want to do this.

@filipnavara (Member):

I think you should still re-evaluate using Environment.TickCount64 instead of DateTime.UtcNow. The performance characteristics ought to be slightly different.

@edevoogd commented Jan 27, 2021 (edited):

FWIW, local experimentation with changes in edevoogd/ClockQuantization#1 shows the additional benefit of using Environment.TickCount64.

@filipnavara @adamsitnik

@ghost locked as resolved and limited conversation to collaborators Feb 26, 2021

Reviewers

@davidfowl left review comments
@Tratcher requested changes
@gfoidl left review comments

Assignees

No one assigned

Labels

area-Extensions-Caching, discussion, NO-MERGE (The PR is not ready for merge yet; see discussion for detailed reasons), tenet-performance (Performance related issue)

Projects

None yet

Milestone

No milestone


8 participants

@adamsitnik @Tratcher @Clockwork-Muse @drieseng @edevoogd @filipnavara @davidfowl @gfoidl
