Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Enable CSE for floating-point constants#44419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
briansull merged 2 commits intodotnet:masterfrombriansull:cse-float-constants
Feb 27, 2021

Conversation

@briansull
Copy link
Contributor

No description provided.

@Dotnet-GitSync-BotDotnet-GitSync-Bot added the area-CodeGen-coreclrCLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labelNov 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Unrelated to your change, but this isn't accurate anymore.fldz andfld1 are from the x87 stack and we almost exclusively use SSE/SSE2 now (modulo some ABI things for 32-bit).

Now its that we usexorps for0 and load values for everything else.

@tannergooding
Copy link
Member

This is just CSE of constants right? What is the behavior for floating-point values that compare equal but whose bitwise values are different (-0.0 vs+0.0) or values who are bitwise equivalent but compare inequal (double.NaN or any of theNaN representations in general)?

@briansull
Copy link
ContributorAuthor

briansull commentedNov 10, 2020
edited
Loading

The ValueNumbers of the two constants have to be the same, We don't use floating point compare to determine if two constant are equal. Value numbering already has to properly handle floating point constants properly.

Tracing through the value number implementation, It eventually uses this function in jithashtable.h

template <typename T>struct JitLargePrimitiveKeyFuncs : public JitKeyFuncsDefEquals<T>{    static unsigned GetHashCode(const T val)    {        // A static cast when T is a float or a double converts the value (i.e. 0.25 converts to 0)        //        // Instead we want to use all of the bits of a float to create the hash value        // So we cast the address of val to a pointer to an equivalent sized unsigned int        // This allows us to read the actual bit representation of a float type        //
tannergooding reacted with thumbs up emoji

@ViktorHofer
Copy link
Member

// Auto-generated message

69e114c which was merged 12/7 removed the intermediate src/coreclr/src/ folder. This PR needs to be updated as it touches files in that directory which causes conflicts.

To update your commits you can use this bash script:https://gist.github.com/ViktorHofer/6d24f62abdcddb518b4966ead5ef3783. Feel free to use the comment section of the gist to improve the script for others.

@AndyAyersMS
Copy link
Member

@briansull anything holding this up other than a review?

@JulieLeeMSFTJulieLeeMSFT added this to the6.0.0 milestoneFeb 8, 2021
@JulieLeeMSFTJulieLeeMSFT linked an issueFeb 8, 2021 that may beclosed by this pull request
@briansull
Copy link
ContributorAuthor

This is ready for checkin again:
@dotnet/jit-contrib PTAL
@tannergooding

@briansull
Copy link
ContributorAuthor

ARM32:

Top file regressions (bytes):         294 : System.Data.Common.dasm (0.03% of base)Top file improvements (bytes):       -3200 : System.Private.CoreLib.dasm (-0.10% of base)        -496 : Microsoft.VisualBasic.Core.dasm (-0.12% of base)        -344 : System.Runtime.Numerics.dasm (-0.49% of base)        -170 : System.Drawing.Common.dasm (-0.05% of base)         -98 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.00% of base)         -28 : FSharp.Core.dasm (-0.00% of base)         -26 : System.Net.Http.dasm (-0.00% of base)         -20 : System.Private.Xml.dasm (-0.00% of base)         -18 : System.Drawing.Primitives.dasm (-0.05% of base)         -14 : Microsoft.CSharp.dasm (-0.00% of base)         -12 : System.Linq.Parallel.dasm (-0.00% of base)         -10 : System.Speech.dasm (-0.00% of base)         -10 : System.Net.Http.WinHttpHandler.dasm (-0.02% of base)          -6 : Newtonsoft.Json.dasm (-0.00% of base)

ARM64:

Top file regressions (bytes):          16 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (0.00% of base)          12 : System.Runtime.Numerics.dasm (0.01% of base)           8 : Microsoft.VisualBasic.Core.dasm (0.00% of base)           8 : System.Speech.dasm (0.00% of base)Top file improvements (bytes):        -412 : System.Private.CoreLib.dasm (-0.01% of base)         -20 : System.Data.Common.dasm (-0.00% of base)         -12 : System.Drawing.Common.dasm (-0.00% of base)          -4 : Microsoft.CSharp.dasm (-0.00% of base)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Should there be an issue tracking fixing this so thatLGN2DBL properly setsTYP_DOUBLE?

Copy link
ContributorAuthor

@briansullbriansullFeb 26, 2021
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No, I will change the comment to be more clear here.

For this case the tree's type is TYP_FLOAT and the call tofgMorphCastIntoHelper just leaves the return type as is. It assumes that the caller knows what he is doing. Since this is the code that is deciding to use this helper call and with my change we are now just fixing the return type to be correct.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think we use the FPU stack anywhere except for returns on 32-bit...

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Fixed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Do we need diffs for x86/x64 as well?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

X64 Asm Diffs:

Top file regressions (bytes):         239 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (0.01% of base)         123 : Microsoft.VisualBasic.Core.dasm (0.03% of base)          99 : System.Speech.dasm (0.03% of base)          75 : System.Runtime.Numerics.dasm (0.10% of base)          46 : System.Drawing.Common.dasm (0.01% of base)           7 : System.Net.Http.WinHttpHandler.dasm (0.01% of base)           4 : System.Private.Xml.dasm (0.00% of base)Top file improvements (bytes):         -61 : System.Private.CoreLib.dasm (-0.00% of base)          -4 : System.Data.Common.dasm (-0.00% of base)          -4 : Microsoft.CSharp.dasm (-0.00% of base)          -1 : System.Net.Http.dasm (-0.00% of base)

X86 Asm Diffs:

Top file regressions (bytes):          31 : System.Drawing.Primitives.dasm (0.10% of base)          24 : Microsoft.VisualBasic.Core.dasm (0.01% of base)          18 : System.ComponentModel.TypeConverter.dasm (0.01% of base)          11 : System.Drawing.Common.dasm (0.00% of base)          11 : System.Linq.Expressions.dasm (0.00% of base)          10 : Microsoft.CodeAnalysis.CSharp.dasm (0.00% of base)           4 : System.Private.DataContractSerialization.dasm (0.00% of base)Top file improvements (bytes):         -41 : System.Speech.dasm (-0.01% of base)         -27 : System.Private.CoreLib.dasm (-0.00% of base)         -13 : System.Data.Common.dasm (-0.00% of base)          -9 : FSharp.Core.dasm (-0.00% of base)          -3 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.00% of base)          -2 : System.Linq.Parallel.dasm (-0.00% of base)

Copy link
Member

@tannergoodingtannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM

Update the gtCostEx and gtCostSz values of  float and double constants for Arm and Arm64Fix the gtCosts for 16-bit ARM32 immediate values
@briansull
Copy link
ContributorAuthor

/azp run runtime-coreclr jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@briansull
Copy link
ContributorAuthor

All JitStress failures are due to
#48727

@briansull
Copy link
ContributorAuthor

This is ready for checkin
@dotnet/jit-contrib PTAL

@briansull
Copy link
ContributorAuthor

Sample Diff with a CSE

image

Copy link
Member

@AndyAyersMSAndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you run SPMI diffs (on benchmarks, in particular)?

else if (((tree->gtFlags & GTF_UNSIGNED) == 0) && (srcType == TYP_LONG) && varTypeIsFloating(dstType))
{
return fgMorphCastIntoHelper(tree, CORINFO_HELP_LNG2DBL, oper);
oper = fgMorphCastIntoHelper(tree, CORINFO_HELP_LNG2DBL, oper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is this part of this change or some other fix?

Copy link
ContributorAuthor

@briansullbriansullFeb 26, 2021
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It is required. CSE will assert because without this we can make a CSE with mismatched types (TYP_FLOAT and TYP_DOUBLE)


// We don't shared small offset constants when we require a reloc
// We don't share small offset constants when we require a reloc
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Maybe explain why not? Is it a legality thing or a profitability thing?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It would be incorrect to CSE,,
You can't eliminate a reloc using a CSE and adding an offset to a different constant.

@briansull
Copy link
ContributorAuthor

briansull commentedFeb 26, 2021
edited
Loading

Here is a diff from the SPMI Benchmarks where we hoist a Load of a double constant out of a loop using a CSE

image

@AndyAyersMS
Copy link
Member

How many diffs are there in benchmarks?

Can you share out the details so we can cross-correlate with perf improvements we might see in the lab?

tannergooding reacted with thumbs up emoji

Copy link
Member

@AndyAyersMSAndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Changes look good.

Would still like to see a benchmarks diff summary. Presumably you have the data lying around already?

@briansull
Copy link
ContributorAuthor

PerfScore Benchmark Improvements:

Top method improvements (percentages):       -4.30 (-16.20% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - <>c:<.cctor>b__3_4(Vector):Color:this       -4.30 (-16.20% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - <>c:<.cctor>b__3_7(Vector):Color:this      -76.85 (-9.26% of base) : Bytemark\Bytemark\Bytemark.dasm -Fourier:DoFPUTransIteration(System.Double[],System.Double[],int):long     -142.40 (-7.43% of base) : Bytemark\Bytemark\Bytemark.dasm - NeuralJagged:randomize_wts()       -7.40 (-7.30% of base) : Benchstones\BenchF\NewtR\NewtR\NewtR.dasm - Benchstone.BenchF.NewtR:Bench():bool     -142.00 (-6.66% of base) : Bytemark\Bytemark\Bytemark.dasm - Neural:randomize_wts()       -7.40 (-6.01% of base) : Benchstones\BenchF\Secant\Secant\Secant.dasm - Benchstone.BenchF.Secant:Bench():bool       -4.30 (-5.95% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - Color:Brightness():float:this       -5.80 (-5.84% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:SqrtDoubleTest()      -77.80 (-5.07% of base) : Benchstones\BenchF\Lorenz\Lorenz\Lorenz.dasm - Benchstone.BenchF.Lorenz:Bench():bool     -243.40 (-5.04% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm -Algorithms.ScalarFloatRenderer:RenderSingleThreadedNoADT(float,float,float,float,float):this       -4.90 (-5.03% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:AbsSingleTest()       -4.90 (-5.03% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:AbsDoubleTest()     -243.00 (-5.02% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.ScalarFloatRenderer:RenderSingleThreadedWithADT(float,float,float,float,float):this     -243.40 (-5.02% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.ScalarDoubleRenderer:RenderSingleThreadedNoADT(float,float,float,float,float):this     -243.00 (-5.00% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.ScalarDoubleRenderer:RenderSingleThreadedWithADT(float,float,float,float,float):this      -13.47 (-4.95% of base) : Benchstones\BenchF\Bisect\Bisect\Bisect.dasm - Benchstone.BenchF.Bisect:Inner(byref,byref,byref,byref)      -33.80 (-4.80% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - Color:ChangeHue(float):this      -27.30 (-4.65% of base) : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm - Benchstone.BenchF.NewtE:Bench():bool       -5.80 (-4.42% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:SqrtSingleTest()     -107.80 (-4.24% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - <>c__DisplayClass4_0:<RenderMultiThreadedWithADT>b__0(int):this (2 methods)     -107.40 (-4.23% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - <>c__DisplayClass5_0:<RenderMultiThreadedNoADT>b__0(int):this (2 methods)       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:RoundSingleTest()       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:CeilingSingleTest()       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:FloorSingleTest()       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:CeilingDoubleTest()       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:FloorDoubleTest()       -4.90 (-4.18% of base) : Math\Functions\Functions\Functions.dasm - Functions.MathTests:RoundDoubleTest()       -6.30 (-3.53% of base) : Benchstones\BenchF\Bisect\Bisect\Bisect.dasm - Benchstone.BenchF.Bisect:Bench():bool     -213.57 (-3.40% of base) : Benchstones\BenchF\DMath\DMath\DMath.dasm - Benchstone.BenchF.DMath:Bench(int):bool       -1.90 (-3.34% of base) : SciMark\SciMark\SciMark.dasm - SciMark2.FFT:num_flops(int):double      -18.25 (-2.67% of base) : Burgers\Burgers\Burgers.dasm - Burgers:Main():int      -31.65 (-2.13% of base) : Burgers\Burgers\Burgers.dasm - Burgers:GetCalculated2(int,int,double,double,double,System.Double[]):System.Double[]      -31.25 (-2.13% of base) : Burgers\Burgers\Burgers.dasm - Burgers:GetCalculated0(int,int,double,double,double,System.Double[]):System.Double[]      -31.25 (-2.10% of base) : Burgers\Burgers\Burgers.dasm - Burgers:GetCalculated1(int,int,double,double,double,System.Double[]):System.Double[]       -1.90 (-1.71% of base) : FractalPerf\FractalPerf\FractalPerf.dasm - FractalPerf.Fractal:.ctor(double,double,double,double):this      -79.60 (-1.50% of base) : Benchstones\BenchF\LLoops\LLoops\LLoops.dasm - Benchstone.BenchF.LLoops:Init():this       -3.10 (-1.38% of base) : SciMark\SciMark\SciMark.dasm - SciMark2.kernel:measureSOR(int,double,SciMark2.Random):double      -49.30 (-1.19% of base) : Bytemark\Bytemark\Bytemark.dasm - NeuralJagged:read_data_file():this       -5.10 (-1.16% of base) : Bytemark\Bytemark\Bytemark.dasm - Fourier:Run():double:this      -49.30 (-0.92% of base) : Bytemark\Bytemark\Bytemark.dasm - Neural:read_data_file():this      -43.20 (-0.85% of base) : BenchmarksGame\mandelbrot\mandelbrot-2\mandelbrot-2.dasm - BenchmarksGame.Mandelbrot_2:DoBench(int,System.IO.MemoryStream,bool)      -59.70 (-0.85% of base) : Bytemark\Bytemark\Bytemark.dasm - Huffman:DoHuffIteration(System.Byte[],System.Byte[],System.Byte[],int,int,huff_node[]):long       -2.55 (-0.84% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - RayTracer:Shade(ISect,Scene,int):Color:this       -1.40 (-0.69% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleStrictRenderer:RenderMultiThreadedWithADT(float,float,float,float,float):this       -1.40 (-0.69% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleStrictRenderer:RenderMultiThreadedNoADT(float,float,float,float,float):this       -1.40 (-0.69% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleRenderer:RenderMultiThreadedNoADT(float,float,float,float,float):this       -1.40 (-0.69% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleRenderer:RenderMultiThreadedWithADT(float,float,float,float,float):this      -28.13 (-0.56% of base) : Burgers\Burgers\Burgers.dasm - Burgers:GetCalculated3(int,int,double,double,double,System.Double[]):System.Double[]       -8.60 (-0.54% of base) : Benchstones\BenchF\Whetsto\Whetsto\Whetsto.dasm - Benchstone.BenchF.Whetsto:Bench():bool       -2.30 (-0.48% of base) : Benchstones\BenchF\Regula\Regula\Regula.dasm - Benchstone.BenchF.Regula:Inner(byref,byref,double,double,int,byref)       -5.50 (-0.41% of base) : BenchmarksGame\n-body\n-body-3\n-body-3.dasm - BenchmarksGame.NBodySystem:Energy():double:this       -4.50 (-0.36% of base) : Bytemark\Bytemark\Bytemark.dasm - ByteMark:bench_with_confidence(int,byref,byref,byref):int      -53.70 (-0.34% of base) : Benchstones\BenchF\FFT\FFT\FFT.dasm - Benchstone.BenchF.FFT:FastFourierT(System.Double[],System.Double[],int)       -1.40 (-0.02% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleRenderer:RenderSingleThreadedNoADT(float,float,float,float,float):this       -1.40 (-0.02% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - Algorithms.VectorDoubleStrictRenderer:RenderSingleThreadedNoADT(float,float,float,float,float):this
tannergooding reacted with heart emojitannergooding reacted with rocket emoji

Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

@tannergoodingtannergoodingtannergooding approved these changes

@AndyAyersMSAndyAyersMSAndyAyersMS approved these changes

Assignees

@briansullbriansull

Labels

area-CodeGen-coreclrCLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

Archived in project

Milestone

6.0.0

Development

Successfully merging this pull request may close these issues.

Allow GT_CNS_DBL to be a CSE candidate

6 participants

@briansull@tannergooding@ViktorHofer@AndyAyersMS@Dotnet-GitSync-Bot@JulieLeeMSFT

[8]ページ先頭

©2009-2025 Movatter.jp