Moved JIT stage from EngineFactory to a proper EngineJitStage.
- JIT stage now attempts to push the benchmarked method through all JIT tiers.
Moved heuristic from EngineFactory to a new pilot stage (JIT stage, according to its name, now only focuses on jitting).
- Fixed the heuristic to never include the first invocation.
Cleanup around IEngine (breaking changes).
Improved check for LegacyJit.

timcassell added the Area:Engine label

Jul 9, 2025

timcassell commented

Jul 9, 2025


src/BenchmarkDotNet/Engines/EngineJitStage.cs

		yield return GetOverheadNoUnrollIterationData();
		yield return GetDummyIterationData(dummy2Action);
		yield return GetWorkloadNoUnrollIterationData();
		yield return GetDummyIterationData(dummy3Action);


@AndreyAkinshin You added dummy actions in 2017. I don't know what they are for. Do we still need them?

timcassell added the breaking change label

Jul 9, 2025

timcassell force-pushed the jit-stage branch 4 times, most recently from d5e6cd4 to 8efb670

July 10, 2025 18:45

timcassell mentioned this pull request

Jul 10, 2025

Improve memory diagnoser accuracy #2562

Merged

timcassell (Collaborator, Author) commented Jul 11, 2025

cc @AndyAyersMS @EgorBo

timcassell force-pushed the jit-stage branch from a5c1dc0 to fee6992

July 13, 2025 19:05

Refactored engine JIT stage.

cf37148

timcassell force-pushed the jit-stage branch from fee6992 to cf37148

July 13, 2025 19:07

EgorBo (Member) commented Jul 13, 2025

> JIT stage now attempts to push the benchmarked method through all JIT tiers.
> Set environment variable for the runtime to enable aggressive tiering by default.

Honestly, I think you shouldn't use TC_AggressiveTiering, just 1 iteration to promote to Tier1 is mostly just for internal testing. I think CallCountingDelayMs=0 should be enough.

timcassell (Collaborator, Author) commented Jul 13, 2025

> Honestly, I think you shouldn't use TC_AggressiveTiering, just 1 iteration to promote to Tier1 is mostly just for internal testing. I think CallCountingDelayMs=0 should be enough.

Can you elaborate on that? Why would we need more than 1 invocation per tier for throughput benchmarks? 30 invocations is too much for the stage to complete in a timely manner for long-running benchmarks.

Also, I tried CallCountingDelayMs=0, but it breaks the disassembler (dotnet/runtime#117339).

EgorBo (Member) commented Jul 13, 2025 (edited)

> Can you elaborate on that?

I think the profile will not be representative (a benchmark may invoke the same method from different places and we don't have context-sensitive profiling yet). We also have optimizations where we intentionally make the call-counting threshold smaller for some methods so their callers are guaranteed to be promoted later (it's for some internal calls, so we can bake the final addresses of their Tier1 code versions directly instead of having indirect calls). That said, I am mostly concerned about PGO quality.

timcassell (Collaborator, Author) commented Jul 13, 2025

Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1, we can allow the pilot/warmup stages to handle it later (#1210).

Can you also verify the logic in JitInfo.cs?

EgorBo (Member) commented Jul 13, 2025

> Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1

How do you check that? I don't think there is a way to check whether a benchmark and all of its callees are fully warmed up.

timcassell (Collaborator, Author) commented Jul 13, 2025

> > Thanks, that makes sense. I guess I can remove that env var and just run the jit stage with a timeout, and if it doesn't fully reach tier1
>
> How do you check that? I don't think there is a way to check whether a benchmark and all of its callees are fully warmed up

We don't. We just run a number of invocations based on the configured values retrieved from JitInfo and hope for the best. The pilot/warmup stages will have to work with some sort of heuristic to try to determine if tiering caused the measured time to significantly drop.
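
A minimal sketch of the "fixed number of invocations with a timeout" idea described above. This is not the PR's actual EngineJitStage code: `JitWarmupSketch`, `Run`, and its parameters are hypothetical names introduced only for illustration, and the sketch makes no attempt to detect which tier the method has reached.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of BenchmarkDotNet.
static class JitWarmupSketch
{
    // Invokes the workload up to maxInvocations times, stopping early
    // once the timeout has elapsed. Returns the number of invocations
    // actually performed. No tier detection is attempted.
    public static int Run(Action workload, int maxInvocations, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        int invocations = 0;
        while (invocations < maxInvocations)
        {
            workload();
            invocations++;
            // The check comes after the call, so a slow workload still runs at least once.
            if (clock.Elapsed >= timeout)
                break;
        }
        return invocations;
    }
}
```

A caller might use `JitWarmupSketch.Run(benchmarkAction, 30, TimeSpan.FromSeconds(10))`, where 30 is the invocation count mentioned earlier in this thread and the 10-second timeout is an arbitrary value chosen for the example. If the timeout ends the loop early, later stages would have to cope with a method that may not have reached Tier1 yet.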

timcassell force-pushed the jit-stage branch 2 times, most recently from 8a143d6 to f02de9b

July 13, 2025 21:42

PR feedback.

52cabdd

timcassell force-pushed the jit-stage branch from f02de9b to 52cabdd

July 13, 2025 21:43

IsRyuJit field instead of property.

ca3d7b9

timcassell force-pushed the jit-stage branch from 9ade172 to ca3d7b9

July 13, 2025 22:45

timcassell requested a review from AndreyAkinshin

July 13, 2025 22:54

Fix call counts.

778a1a7

timcassell mentioned this pull request

Jul 17, 2025

Random.Next Tier1 slower than Tier0 dotnet/runtime#117787

Open

Added an extra invocation to the end of jit stage.

1251e63

timcassell (Collaborator, Author) commented Jul 17, 2025

dotnet/runtime#117787 (comment)

> The "third tier" you see may be OSR, since your method loops a lot and isn't called often.

@AndyAyersMS (to not derail that issue), how can we account for OSR in the jit stage here?

EgorBo (Member) commented Jul 17, 2025 (edited)

> dotnet/runtime#117787 (comment)
>
> > The "third tier" you see may be OSR, since your method loops a lot and isn't called often.
>
> @AndyAyersMS (to not derail that issue), how can we account for OSR in the jit stage here?

I think for BDN specifically OSR is just some intermediate tier it doesn't have to care about, it shouldn't impact the Tier0->Tier1 promotion velocity. Since the method is too slow, I guess BDN decided not to call it too many times?

timcassell (Collaborator, Author) commented Jul 17, 2025

> Since the method is too slow, I guess BDN decided not to call it too many times?

This is purely for the jit stage, where the number of invocations is fixed (in an attempt to push it through all tiers). I'm not sure what the jit thinks is not called enough times. Perhaps because of how the stages work, it only invokes once per iteration, and the jit can't see that the iterations are being run multiple times? If we called it through the WorkloadUnroll method (with unrollFactor = 16), would the jit skip the OSR?

timcassell (Collaborator, Author) commented Jul 17, 2025 (edited)

> I think for BDN specifically OSR is just some intermediate tier it doesn't have to care about, it shouldn't impact the Tier0->Tier1 promotion velocity.

That's what I thought, but the evidence shows otherwise. It took 60 invocations to fully reach tier1, instead of 30 (DPGO disabled).

AndyAyersMS (Member) commented Jul 17, 2025

Did you try profiling the example from dotnet/runtime#117787? If not, I can do it soonish.

timcassell (Collaborator, Author) commented Jul 17, 2025

> Did you try profiling the example from dotnet/runtime#117787? If not, I can do it soonish.

Nope, I don't have enough experience to know what to look for. If you're going to do it from this branch, add +2 to remainingTiers in the jit stage to see results of all tiers. Appreciate it.

Labels

Area:Engine breaking change

3 participants


Refactor engine JIT stage #2806
