Increase max loops optimized by RyuJIT from 16 to 64. #55614

Merged

BruceForstall merged 1 commit into dotnet:main from BruceForstall:IncreaseMaxLoops

Jul 15, 2021

Contributor

BruceForstall commented Jul 14, 2021

16 seems remarkably small. 64 was not scientifically chosen, but does cover more
cases.

Note that this number is used to allocate the statically-sized loop table, as well as for memory allocation for value
numbering, so there is some overhead to increasing it.
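As a rough illustration of the trade-off described above, here is a toy model of a fixed-capacity loop table. It is only a sketch: the real JIT is written in C++, and the names `MAX_LOOP_NUM`, `LoopTable`, and `record` are hypothetical stand-ins chosen for this example, not the JIT's actual API.

```python
# Hypothetical cap for this sketch; the PR raises the JIT's limit from 16 to 64.
MAX_LOOP_NUM = 64


class LoopTable:
    """Toy fixed-capacity loop table: loops past the cap are not recorded."""

    def __init__(self, capacity=MAX_LOOP_NUM):
        self.capacity = capacity
        # Preallocated up front, so the memory cost is paid even by
        # methods that contain only one or two loops.
        self.slots = [None] * capacity
        self.count = 0
        self.overflowed = False

    def record(self, loop):
        """Return True if the loop was recorded, False if the table is full."""
        if self.count >= self.capacity:
            self.overflowed = True
            return False
        self.slots[self.count] = loop
        self.count += 1
        return True


# A method with 20 loops: a 16-entry table drops 4, a 64-entry table drops none.
small, large = LoopTable(16), LoopTable(64)
dropped_small = sum(not small.record(i) for i in range(20))
dropped_large = sum(not large.record(i) for i in range(20))
print(dropped_small, dropped_large)
```

In this model a larger cap means more loops are seen by later phases, at the price of a larger preallocated table for every method.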

A few microbenchmarks that have diffs show benefit, including 9% for MulMatrix:

| Method | Job | Toolchain | Mean | Error | StdDev | Median | Min | Max | Ratio |
|------- |----------- |---------------------------------------------------------------------------------- |---------:|--------:|--------:|---------:|---------:|---------:|------:|
| LLoops | Job-JXEMSM | \runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 559.9 ms | 8.20 ms | 7.67 ms | 556.4 ms | 550.6 ms | 576.0 ms | 1.00 |
| LLoops | Job-MUOLTV | \runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 552.3 ms | 5.84 ms | 5.46 ms | 552.0 ms | 542.4 ms | 561.1 ms | 0.99 |
| MulMatrix | Job-JXEMSM | \runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 369.7 ms | 4.01 ms | 3.56 ms | 369.9 ms | 364.6 ms | 376.9 ms | 1.00 |
| MulMatrix | Job-MUOLTV | \runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 338.1 ms | 2.69 ms | 2.51 ms | 337.7 ms | 332.2 ms | 341.9 ms | 0.91 |
| Puzzle | Job-JXEMSM | \runtime2\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 403.4 ms | 6.93 ms | 6.48 ms | 402.3 ms | 394.8 ms | 412.5 ms | 1.00 |
| Puzzle | Job-MUOLTV | \runtime\artifacts\tests\coreclr\windows.x64.Checked\Tests\Core_Root\CoreRun.exe | 394.9 ms | 4.16 ms | 3.68 ms | 395.5 ms | 388.2 ms | 401.8 ms | 0.98 |
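The Ratio column can be reproduced from the Mean column. The sketch below assumes the `Job-JXEMSM` rows (Ratio 1.00) are the baseline and the `Job-MUOLTV` rows are the build with the change; the function name `ratio` is just for this example.

```python
# Mean times in ms, copied from the table above: (baseline, changed).
# Assumption: the Ratio 1.00 rows are the baseline build.
means_ms = {
    "LLoops": (559.9, 552.3),
    "MulMatrix": (369.7, 338.1),
    "Puzzle": (403.4, 394.9),
}


def ratio(baseline_ms, changed_ms):
    """Changed mean divided by baseline mean; below 1.0 means faster."""
    return changed_ms / baseline_ms


for name, (base, changed) in means_ms.items():
    r = ratio(base, changed)
    print(f"{name}: ratio {r:.2f}, {100 * (1 - r):.1f}% less time")
```

Rounded to two places, the MulMatrix ratio is 0.91, which is where the "9%" figure comes from; computed from the unrounded means, the reduction is closer to 8.5%.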

spmi diffs:

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 6264
    Total bytes of diff: 6285
    Total bytes of delta: 21 (0.34% of base)
    Total relative delta: 0.00
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
              21 : 42451.dasm (0.34% of base)

    1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.

    Top method regressions (bytes):
              21 ( 0.34% of base) : 42451.dasm - RelationalModel:Create(IModel,IRelationalAnnotationProvider):IRelationalModel

    Top method regressions (percentages):
              21 ( 0.34% of base) : 42451.dasm - RelationalModel:Create(IModel,IRelationalAnnotationProvider):IRelationalModel

    1 total methods with Code Size differences (0 improved, 1 regressed), 0 unchanged.

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 39243
    Total bytes of diff: 39940
    Total bytes of delta: 697 (1.78% of base)
    Total relative delta: 0.16
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
             536 : 25723.dasm (18.29% of base)
             185 : 26877.dasm (2.21% of base)
              66 : 16212.dasm (1.39% of base)
              16 : 16196.dasm (0.36% of base)
              13 : 13322.dasm (0.27% of base)
               9 : 15596.dasm (0.16% of base)

    Top file improvements (bytes):
            -125 : 27270.dasm (-6.93% of base)
              -3 : 13993.dasm (-0.05% of base)

    8 total files with Code Size differences (2 improved, 6 regressed), 0 unchanged.

    Top method regressions (bytes):
             536 (18.29% of base) : 25723.dasm - Benchstone.BenchI.MulMatrix:Inner(System.Int32[][],System.Int32[][],System.Int32[][])
             185 ( 2.21% of base) : 26877.dasm - Benchstone.BenchF.LLoops:Main1(int):this
              66 ( 1.39% of base) : 16212.dasm - Jil.Deserialize.Methods:SkipWithLeadChar(System.IO.TextReader,int)
              16 ( 0.36% of base) : 16196.dasm - DynamicClass:_DynamicMethod9(System.IO.TextReader,int):MicroBenchmarks.Serializers.MyEventsListerViewModel
              13 ( 0.27% of base) : 13322.dasm - DynamicClass:Regex1_Go(System.Text.RegularExpressions.RegexRunner)
               9 ( 0.16% of base) : 15596.dasm - DynamicClass:_DynamicMethod9(byref,int):MicroBenchmarks.Serializers.MyEventsListerViewModel

    Top method improvements (bytes):
            -125 (-6.93% of base) : 27270.dasm - Benchstone.BenchI.Puzzle:DoIt():bool:this
              -3 (-0.05% of base) : 13993.dasm - Jil.Deserialize.Methods:SkipWithLeadCharThunkReader(byref,int)

    Top method regressions (percentages):
             536 (18.29% of base) : 25723.dasm - Benchstone.BenchI.MulMatrix:Inner(System.Int32[][],System.Int32[][],System.Int32[][])
             185 ( 2.21% of base) : 26877.dasm - Benchstone.BenchF.LLoops:Main1(int):this
              66 ( 1.39% of base) : 16212.dasm - Jil.Deserialize.Methods:SkipWithLeadChar(System.IO.TextReader,int)
              16 ( 0.36% of base) : 16196.dasm - DynamicClass:_DynamicMethod9(System.IO.TextReader,int):MicroBenchmarks.Serializers.MyEventsListerViewModel
              13 ( 0.27% of base) : 13322.dasm - DynamicClass:Regex1_Go(System.Text.RegularExpressions.RegexRunner)
               9 ( 0.16% of base) : 15596.dasm - DynamicClass:_DynamicMethod9(byref,int):MicroBenchmarks.Serializers.MyEventsListerViewModel

    Top method improvements (percentages):
            -125 (-6.93% of base) : 27270.dasm - Benchstone.BenchI.Puzzle:DoIt():bool:this
              -3 (-0.05% of base) : 13993.dasm - Jil.Deserialize.Methods:SkipWithLeadCharThunkReader(byref,int)

    8 total methods with Code Size differences (2 improved, 6 regressed), 0 unchanged.

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 142908
    Total bytes of diff: 143387
    Total bytes of delta: 479 (0.34% of base)
    Total relative delta: 0.42
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
             536 : 248723.dasm (18.29% of base)
             390 : 220441.dasm (14.45% of base)
             390 : 220391.dasm (14.61% of base)
             150 : 239253.dasm (1.78% of base)
              71 : 234803.dasm (1.72% of base)
              16 : 225588.dasm (1.00% of base)
              16 : 225590.dasm (1.00% of base)
               5 : 225285.dasm (0.26% of base)

    Top file improvements (bytes):
            -359 : 215690.dasm (-0.99% of base)
            -320 : 215701.dasm (-1.16% of base)
            -128 : 215723.dasm (-0.73% of base)
            -128 : 215666.dasm (-0.62% of base)
            -125 : 239280.dasm (-6.93% of base)
             -29 : 216754.dasm (-0.33% of base)
              -3 : 225316.dasm (-0.15% of base)
              -3 : 225313.dasm (-0.15% of base)

    16 total files with Code Size differences (8 improved, 8 regressed), 0 unchanged.

    Top method regressions (bytes):
             536 (18.29% of base) : 248723.dasm - Benchstone.BenchI.MulMatrix:Inner(System.Int32[][],System.Int32[][],System.Int32[][])
             390 (14.45% of base) : 220441.dasm - VectorTest:Main():int
             390 (14.61% of base) : 220391.dasm - VectorTest:Main():int
             150 ( 1.78% of base) : 239253.dasm - Benchstone.BenchF.LLoops:Main1(int):this
              71 ( 1.72% of base) : 234803.dasm - SmallLoop1:Main():int
              16 ( 1.00% of base) : 225588.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              16 ( 1.00% of base) : 225590.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
               5 ( 0.26% of base) : 225285.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int

    Top method improvements (bytes):
            -359 (-0.99% of base) : 215690.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -320 (-1.16% of base) : 215701.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -128 (-0.73% of base) : 215723.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -128 (-0.62% of base) : 215666.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -125 (-6.93% of base) : 239280.dasm - Benchstone.BenchI.Puzzle:DoIt():bool:this
             -29 (-0.33% of base) : 216754.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              -3 (-0.15% of base) : 225316.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              -3 (-0.15% of base) : 225313.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int

    Top method regressions (percentages):
             536 (18.29% of base) : 248723.dasm - Benchstone.BenchI.MulMatrix:Inner(System.Int32[][],System.Int32[][],System.Int32[][])
             390 (14.61% of base) : 220391.dasm - VectorTest:Main():int
             390 (14.45% of base) : 220441.dasm - VectorTest:Main():int
             150 ( 1.78% of base) : 239253.dasm - Benchstone.BenchF.LLoops:Main1(int):this
              71 ( 1.72% of base) : 234803.dasm - SmallLoop1:Main():int
              16 ( 1.00% of base) : 225588.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              16 ( 1.00% of base) : 225590.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
               5 ( 0.26% of base) : 225285.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int

    Top method improvements (percentages):
            -125 (-6.93% of base) : 239280.dasm - Benchstone.BenchI.Puzzle:DoIt():bool:this
            -320 (-1.16% of base) : 215701.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -359 (-0.99% of base) : 215690.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -128 (-0.73% of base) : 215723.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
            -128 (-0.62% of base) : 215666.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
             -29 (-0.33% of base) : 216754.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              -3 (-0.15% of base) : 225316.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int
              -3 (-0.15% of base) : 225313.dasm - IntelHardwareIntrinsicTest.Program:Main(System.String[]):int

    16 total methods with Code Size differences (8 improved, 8 regressed), 0 unchanged.

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 15351
    Total bytes of diff: 15448
    Total bytes of delta: 97 (0.63% of base)
    Total relative delta: 0.08
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
              71 : 63973.dasm (7.63% of base)
              26 : 106039.dasm (0.19% of base)

    2 total files with Code Size differences (0 improved, 2 regressed), 1 unchanged.

    Top method regressions (bytes):
              71 ( 7.63% of base) : 63973.dasm - Microsoft.Diagnostics.Tracing.Parsers.Symbol.FileVersionTraceData:ToXml(System.Text.StringBuilder):System.Text.StringBuilder:this
              26 ( 0.19% of base) : 106039.dasm - Microsoft.VisualBasic.CompilerServices.VBBinder:BindToMethod(int,System.Reflection.MethodBase[],byref,System.Reflection.ParameterModifier[],System.Globalization.CultureInfo,System.String[],byref):System.Reflection.MethodBase:this

    Top method regressions (percentages):
              71 ( 7.63% of base) : 63973.dasm - Microsoft.Diagnostics.Tracing.Parsers.Symbol.FileVersionTraceData:ToXml(System.Text.StringBuilder):System.Text.StringBuilder:this
              26 ( 0.19% of base) : 106039.dasm - Microsoft.VisualBasic.CompilerServices.VBBinder:BindToMethod(int,System.Reflection.MethodBase[],byref,System.Reflection.ParameterModifier[],System.Globalization.CultureInfo,System.String[],byref):System.Reflection.MethodBase:this

    2 total methods with Code Size differences (0 improved, 2 regressed), 1 unchanged.

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 22540
    Total bytes of diff: 22576
    Total bytes of delta: 36 (0.16% of base)
    Total relative delta: 0.01
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
              23 : 104600.dasm (0.15% of base)
              14 : 43143.dasm (0.47% of base)

    Top file improvements (bytes):
              -1 : 35267.dasm (-0.03% of base)

    3 total files with Code Size differences (1 improved, 2 regressed), 0 unchanged.

    Top method regressions (bytes):
              23 ( 0.15% of base) : 104600.dasm - Microsoft.VisualBasic.CompilerServices.VBBinder:BindToMethod(int,System.Reflection.MethodBase[],byref,System.Reflection.ParameterModifier[],System.Globalization.CultureInfo,System.String[],byref):System.Reflection.MethodBase:this
              14 ( 0.47% of base) : 43143.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:ForceComplete(Microsoft.CodeAnalysis.SourceLocation,System.Threading.CancellationToken):this

    Top method improvements (bytes):
              -1 (-0.03% of base) : 35267.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseNamespaceBody(byref,byref,byref,ushort):this

    Top method regressions (percentages):
              14 ( 0.47% of base) : 43143.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceMemberContainerTypeSymbol:ForceComplete(Microsoft.CodeAnalysis.SourceLocation,System.Threading.CancellationToken):this
              23 ( 0.15% of base) : 104600.dasm - Microsoft.VisualBasic.CompilerServices.VBBinder:BindToMethod(int,System.Reflection.MethodBase[],byref,System.Reflection.ParameterModifier[],System.Globalization.CultureInfo,System.String[],byref):System.Reflection.MethodBase:this

    Top method improvements (percentages):
              -1 (-0.03% of base) : 35267.dasm - Microsoft.CodeAnalysis.CSharp.Syntax.InternalSyntax.LanguageParser:ParseNamespaceBody(byref,byref,byref,ushort):this

    3 total methods with Code Size differences (1 improved, 2 regressed), 0 unchanged.

    Summary of Code Size diffs:
    (Lower is better)

    Total bytes of base: 53782
    Total bytes of diff: 53837
    Total bytes of delta: 55 (0.10% of base)
    Total relative delta: 0.00
        diff is a regression.
        relative diff is a regression.

Detail diffs

    Top file regressions (bytes):
              28 : 28889.dasm (0.16% of base)
              27 : 28887.dasm (0.15% of base)

    2 total files with Code Size differences (0 improved, 2 regressed), 2 unchanged.

    Top method regressions (bytes):
              28 ( 0.16% of base) : 28889.dasm - ManagedTests.DynamicCSharp.Conformance.dynamic.statements.freach.freach003.freach003.Test:MainMethod():int
              27 ( 0.15% of base) : 28887.dasm - ManagedTests.DynamicCSharp.Conformance.dynamic.statements.freach.freach004.freach004.Test:MainMethod():int

    Top method regressions (percentages):
              28 ( 0.16% of base) : 28889.dasm - ManagedTests.DynamicCSharp.Conformance.dynamic.statements.freach.freach003.freach003.Test:MainMethod():int
              27 ( 0.15% of base) : 28887.dasm - ManagedTests.DynamicCSharp.Conformance.dynamic.statements.freach.freach004.freach004.Test:MainMethod():int

    2 total methods with Code Size differences (0 improved, 2 regressed), 2 unchanged.
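The "Total bytes of delta" lines in the six summaries above follow directly from the base and diff totals. A minimal sketch of that arithmetic, with the hypothetical helper name `code_size_delta`; it does not attempt to reproduce the "Total relative delta" figure, whose formula is not given here.

```python
# (base bytes, diff bytes) for the six summaries, in the order they appear above.
summaries = [
    (6264, 6285),
    (39243, 39940),
    (142908, 143387),
    (15351, 15448),
    (22540, 22576),
    (53782, 53837),
]


def code_size_delta(base_bytes, diff_bytes):
    """Return (delta in bytes, delta as a percentage of base)."""
    delta = diff_bytes - base_bytes
    return delta, 100.0 * delta / base_bytes


for base, diff in summaries:
    delta, pct = code_size_delta(base, diff)
    print(f"base {base}, diff {diff}: delta {delta} ({pct:.2f}% of base)")
```

A positive delta is reported as a regression because the tool treats smaller code as better, even where the larger code runs faster (as with MulMatrix).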

Increase max loops optimized by RyuJIT from 16 to 64.

a39c1db


ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)

Jul 14, 2021

Contributor, Author

BruceForstall commented Jul 14, 2021

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress, runtime-coreclr outerloop

BruceForstall requested a review from AndyAyersMS

July 14, 2021 01:09

azure-pipelines bot commented Jul 14, 2021

Azure Pipelines successfully started running 3 pipeline(s).

Contributor, Author

BruceForstall commented Jul 14, 2021

@AndyAyersMS @dotnet/jit-contrib PTAL

Contributor, Author

BruceForstall commented Jul 14, 2021 (edited)

If anyone has an opinion on how the "best" number should be chosen, I'd be happy to hear it. Making it dynamic perhaps would be an option as well: not have a maximum. Not sure if there are algorithms that would need to be reconsidered to avoid bad behavior for large numbers.

Contributor

kunalspathak commented Jul 14, 2021

Curious: does choosing 32 not show enough wins? Can we expose a `COMPlus_MaxLoops` or something that will help us experiment more? Is there a TP (throughput) impact?
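To make the suggestion concrete, here is a sketch of how an environment-variable override for the limit could behave. `COMPlus_MaxLoops` is only the name proposed in this comment, not an existing knob, and `max_loops_from_env` and `DEFAULT_MAX_LOOPS` are hypothetical names for this example; the real JIT would read its configuration in C++ through its own config mechanism.

```python
import os

# Hypothetical default for this sketch, matching the value chosen in the PR.
DEFAULT_MAX_LOOPS = 64


def max_loops_from_env(environ=os.environ, name="COMPlus_MaxLoops",
                       default=DEFAULT_MAX_LOOPS):
    """Return the loop limit from the environment, or the default.

    Missing, non-numeric, or non-positive values fall back to the default
    so that a bad setting cannot disable loop optimization by accident.
    """
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


# Experimenting with 32 versus the default, using plain dicts as environments.
print(max_loops_from_env({"COMPlus_MaxLoops": "32"}))
print(max_loops_from_env({}))
```

Note that a runtime-configurable limit only helps if the table is sized from that value rather than from a compile-time constant.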

Member

AndyAyersMS commented Jul 14, 2021

Does 64 cover all the SPMI cases...?

This might be a place where a more extensive SPMI collection would prove valuable.

Contributor

kunalspathak commented Jul 14, 2021

> This might be a place where a more extensive SPMI collection would prove valuable.

Could you elaborate on that?

Member

AndyAyersMS commented Jul 14, 2021

> > This might be a place where a more extensive SPMI collection would prove valuable.
>
> Could you elaborate on that?

I'm referring to some of the internal collections we had at one point, over much larger amounts of code.

Contributor, Author

BruceForstall commented Jul 14, 2021

I enabled `COUNT_LOOPS` and ran it across a merged spmi mega-collection (and bumped the number of histogram buckets), and got:

    ---------------------------------------------------
    Loop stats
    ---------------------------------------------------
    Total number of methods with loops is 64366
    Total number of              loops is 92730
    Maximum number of loops per method is   192
    # of methods overflowing nat loop table is    49
    Total number of 'unnatural' loops is 135600
    # of methods overflowing unnat loop limit is     0
    Total number of loops with an         iterator is 32376
    Total number of loops with a simple   iterator is 32376
    Total number of loops with a constant iterator is  5515
    --------------------------------------------------
    Loop count frequency table:
    --------------------------------------------------
         <=          0 ===>   11043 count ( 14% of total)
          1 ..       1 ===>   48707 count ( 79% of total)
          2 ..       2 ===>   10489 count ( 93% of total)
          3 ..       3 ===>    2954 count ( 97% of total)
          4 ..       4 ===>     932 count ( 98% of total)
          5 ..       5 ===>     496 count ( 98% of total)
          6 ..       6 ===>     268 count ( 99% of total)
          7 ..       7 ===>     118 count ( 99% of total)
          8 ..       8 ===>      95 count ( 99% of total)
          9 ..       9 ===>      63 count ( 99% of total)
         10 ..      10 ===>      50 count ( 99% of total)
         11 ..      11 ===>      44 count ( 99% of total)
         12 ..      12 ===>      19 count ( 99% of total)
         13 ..      13 ===>      23 count ( 99% of total)
         14 ..      14 ===>      34 count ( 99% of total)
         15 ..      15 ===>      12 count ( 99% of total)
         16 ..      16 ===>      13 count ( 99% of total)
         17 ..      17 ===>       2 count ( 99% of total)
         18 ..      18 ===>       9 count ( 99% of total)
         19 ..      19 ===>       1 count ( 99% of total)
         20 ..      20 ===>       6 count ( 99% of total)
         21 ..      21 ===>       1 count ( 99% of total)
         22 ..      22 ===>       1 count ( 99% of total)
         23 ..      23 ===>       0 count ( 99% of total)
         24 ..      24 ===>       1 count ( 99% of total)
         25 ..      25 ===>       4 count ( 99% of total)
         26 ..      26 ===>       1 count ( 99% of total)
         27 ..      27 ===>       0 count ( 99% of total)
         28 ..      27 ===>       0 count ( 99% of total)
         28 ..      29 ===>       6 count ( 99% of total)
         30 ..      30 ===>       2 count ( 99% of total)
         31 ..      31 ===>       0 count ( 99% of total)
         32 ..      32 ===>       0 count ( 99% of total)
         33 ..      33 ===>       0 count ( 99% of total)
         34 ..      34 ===>       0 count ( 99% of total)
         35 ..      35 ===>       1 count ( 99% of total)
         36 ..      36 ===>       2 count ( 99% of total)
         37 ..      37 ===>       0 count ( 99% of total)
         38 ..      38 ===>       0 count ( 99% of total)
         39 ..      39 ===>       0 count ( 99% of total)
         40 ..      40 ===>       0 count ( 99% of total)
         41 ..      41 ===>       0 count ( 99% of total)
         42 ..      42 ===>       0 count ( 99% of total)
         43 ..      43 ===>       0 count ( 99% of total)
         44 ..      44 ===>       0 count ( 99% of total)
         45 ..      45 ===>       2 count ( 99% of total)
         46 ..      46 ===>       2 count ( 99% of total)
         47 ..      47 ===>       0 count ( 99% of total)
         48 ..      48 ===>       1 count ( 99% of total)
         49 ..      49 ===>       0 count ( 99% of total)
         50 ..      50 ===>       0 count ( 99% of total)
         51 ..      51 ===>       0 count ( 99% of total)
         52 ..      52 ===>       0 count ( 99% of total)
         53 ..      53 ===>       0 count ( 99% of total)
         54 ..      54 ===>       0 count ( 99% of total)
         55 ..      55 ===>       0 count ( 99% of total)
         56 ..      56 ===>       0 count ( 99% of total)
         57 ..      57 ===>       0 count ( 99% of total)
         58 ..      58 ===>       0 count ( 99% of total)
         59 ..      59 ===>       1 count (100% of total)
         60 ..      60 ===>       0 count (100% of total)
          >         60 ===>       6 count (100% of total)
    --------------------------------------------------
    Loop exit count frequency table:
    --------------------------------------------------
         <=          0 ===>     126 count (  0% of total)
          1 ..       1 ===>   57237 count ( 64% of total)
          2 ..       2 ===>   21195 count ( 87% of total)
          3 ..       3 ===>    5175 count ( 93% of total)
          4 ..       4 ===>    2758 count ( 96% of total)
          5 ..       5 ===>    2164 count ( 99% of total)
          6 ..       6 ===>     892 count (100% of total)
          >          6 ===>    3183 count (103% of total)
    --------------------------------------------------

So, sticking with powers of 2, a limit of 32 misses only ~15 functions with > 32 loops (9 in the 33..60 buckets plus 6 in the > 60 bucket). Even 16 max loops per function covers 99% of all functions (no surprise). Of course there's some crazy outlier: 192 loops in (at least) one function.
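
As a cross-check on those tail counts, here is a minimal sketch that sums the non-empty histogram buckets above 16 loops from the mega-collection table. The bucket list is copied from the output above; the `TAIL_BUCKETS` and `methods_over` names are made up for illustration and are not part of the JIT. Because the last bucket is open-ended (`> 60`), the function returns a lower and an upper bound rather than a single number.

```python
# Non-empty buckets above 16 loops, copied from the mega-collection
# "Loop count frequency table": (low, high, method_count).
# high=None marks the open-ended "> 60" bucket.
TAIL_BUCKETS = [
    (17, 17, 2), (18, 18, 9), (19, 19, 1), (20, 20, 6), (21, 21, 1),
    (22, 22, 1), (24, 24, 1), (25, 25, 4), (26, 26, 1), (28, 29, 6),
    (30, 30, 2), (35, 35, 1), (36, 36, 2), (45, 45, 2), (46, 46, 2),
    (48, 48, 1), (59, 59, 1), (61, None, 6),
]


def methods_over(limit):
    """Return (at_least, at_most) methods with more than `limit` loops."""
    at_least = at_most = 0
    for low, high, count in TAIL_BUCKETS:
        if low > limit:
            # The whole bucket lies above the limit.
            at_least += count
            at_most += count
        elif high is None or high > limit:
            # The bucket straddles the limit, so the histogram alone
            # cannot say how many of its methods exceed it.
            at_most += count
    return at_least, at_most


if __name__ == "__main__":
    for limit in (16, 32, 64):
        print(limit, methods_over(limit))
```

For a limit of 16 this gives exactly 49, which matches the "# of methods overflowing nat loop table" line. For 32 it gives 15 when the `> 60` bucket is included (9 without it). For 64 the histogram only bounds the answer at somewhere between 0 and 6 methods.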

The stats for just the benchmarks are:

    ---------------------------------------------------
    Loop stats
    ---------------------------------------------------
    Total number of methods with loops is  2679
    Total number of              loops is  4326
    Maximum number of loops per method is    61
    # of methods overflowing nat loop table is    10
    Total number of 'unnatural' loops is  4822
    # of methods overflowing unnat loop limit is     0
    Total number of loops with an         iterator is  1534
    Total number of loops with a simple   iterator is  1534
    Total number of loops with a constant iterator is   398
    --------------------------------------------------
    Loop count frequency table:
    --------------------------------------------------
         <=          0 ===>     213 count (  7% of total)
          1 ..       1 ===>    1984 count ( 75% of total)
          2 ..       2 ===>     415 count ( 90% of total)
          3 ..       3 ===>     144 count ( 95% of total)
          4 ..       4 ===>      59 count ( 97% of total)
          5 ..       5 ===>      29 count ( 98% of total)
          6 ..       6 ===>      13 count ( 98% of total)
          7 ..       7 ===>       5 count ( 98% of total)
          8 ..       8 ===>       1 count ( 99% of total)
          9 ..       9 ===>       2 count ( 99% of total)
         10 ..      10 ===>       1 count ( 99% of total)
         11 ..      11 ===>       1 count ( 99% of total)
         12 ..      12 ===>       2 count ( 99% of total)
         13 ..      13 ===>       5 count ( 99% of total)
         14 ..      14 ===>       5 count ( 99% of total)
         15 ..      15 ===>       3 count ( 99% of total)
         16 ..      16 ===>       0 count ( 99% of total)
         17 ..      17 ===>       0 count ( 99% of total)
         18 ..      18 ===>       2 count ( 99% of total)
         19 ..      19 ===>       1 count ( 99% of total)
         20 ..      20 ===>       1 count ( 99% of total)
         21 ..      21 ===>       0 count ( 99% of total)
         22 ..      22 ===>       0 count ( 99% of total)
         23 ..      23 ===>       0 count ( 99% of total)
         24 ..      24 ===>       0 count ( 99% of total)
         25 ..      25 ===>       0 count ( 99% of total)
         26 ..      26 ===>       0 count ( 99% of total)
         27 ..      27 ===>       0 count ( 99% of total)
         28 ..      27 ===>       0 count ( 99% of total)
         28 ..      29 ===>       1 count ( 99% of total)
         30 ..      30 ===>       1 count ( 99% of total)
         31 ..      31 ===>       0 count ( 99% of total)
         32 ..      32 ===>       0 count ( 99% of total)
         33 ..      33 ===>       0 count ( 99% of total)
         34 ..      34 ===>       0 count ( 99% of total)
         35 ..      35 ===>       0 count ( 99% of total)
         36 ..      36 ===>       1 count ( 99% of total)
         37 ..      37 ===>       0 count ( 99% of total)
         38 ..      38 ===>       0 count ( 99% of total)
         39 ..      39 ===>       0 count ( 99% of total)
         40 ..      40 ===>       0 count ( 99% of total)
         41 ..      41 ===>       0 count ( 99% of total)
         42 ..      42 ===>       0 count ( 99% of total)
         43 ..      43 ===>       0 count ( 99% of total)
         44 ..      44 ===>       0 count ( 99% of total)
         45 ..      45 ===>       1 count ( 99% of total)
         46 ..      46 ===>       0 count ( 99% of total)
         47 ..      47 ===>       0 count ( 99% of total)
         48 ..      48 ===>       0 count ( 99% of total)
         49 ..      49 ===>       0 count ( 99% of total)
         50 ..      50 ===>       0 count ( 99% of total)
         51 ..      51 ===>       0 count ( 99% of total)
         52 ..      52 ===>       0 count ( 99% of total)
         53 ..      53 ===>       0 count ( 99% of total)
         54 ..      54 ===>       0 count ( 99% of total)
         55 ..      55 ===>       0 count ( 99% of total)
         56 ..      56 ===>       0 count ( 99% of total)
         57 ..      57 ===>       0 count ( 99% of total)
         58 ..      58 ===>       0 count ( 99% of total)
         59 ..      59 ===>       1 count (100% of total)
         60 ..      60 ===>       0 count (100% of total)
          >         60 ===>       1 count (100% of total)
    --------------------------------------------------
    Loop exit count frequency table:
    --------------------------------------------------
         <=          0 ===>       0 count (  0% of total)
          1 ..       1 ===>    2426 count ( 58% of total)
          2 ..       2 ===>     995 count ( 82% of total)
          3 ..       3 ===>     330 count ( 90% of total)
          4 ..       4 ===>     237 count ( 96% of total)
          5 ..       5 ===>      96 count ( 98% of total)
          6 ..       6 ===>      68 count (100% of total)
          >          6 ===>     174 count (104% of total)
    --------------------------------------------------

Unfortunately, the spmi collections, especially the benchmarks collection, have a lot of "MISSING" data currently, so it's not clear how that's skewing the data.