- Notifications
You must be signed in to change notification settings - Fork2.6k
Remove JIT LEGACY_BACKEND code#18064
Remove JIT LEGACY_BACKEND code#18064
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Remove a few references to legacy backend.
Remove compCanUseSSE2Remove some unused liveness function arguments
@dotnet-bot test Windows_NT arm Cross Checked normal Build and Test |
mikedn commentedMay 20, 2018
Here and there may be additional clean up worth doing after removal of LEGACY_BACKEND code. For example, some RangeCheck code still uses |
@mikedn Yes, there certainly be some additional cleanup fallout. It would be good to open issues for those things when noticed; some are not obvious. |
@dotnet/jit-contrib @dotnet/arm32-contrib @dotnet/arm64-contrib PTAL This is a large change; I don't expect anyone to review it all. Perhaps some spot checking will be sufficient and useful. This might lead to ideas for further cleanup. I do want everyone to be aware this is happening, and have the opportunity to provide any feedback you might have. |
Would be nice to verify that this is a no-diff change. |
@@ -113,12 +105,12 @@ GTSTRUCT_1(Qmark , GT_QMARK) | |||
GTSTRUCT_1(PhiArg , GT_PHI_ARG) | |||
GTSTRUCT_1(StoreInd , GT_STOREIND) | |||
GTSTRUCT_N(Indir , GT_STOREIND, GT_IND, GT_NULLCHECK, GT_BLK, GT_STORE_BLK, GT_OBJ, GT_STORE_OBJ, GT_DYN_BLK, GT_STORE_DYN_BLK) | |||
#if defined(LEGACY_BACKEND) || !defined(_TARGET_ARM_) | |||
GTSTRUCT_1(PutArgStk , GT_PUTARG_STK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This declaration was also for all non ARM targets. Is it correct to remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Oh, it's declared on 112 line.
@dotnet-bot test Tizen armel Cross Release Build |
@dotnet-bot test Tizen armel Cross Checked Build |
@gbalykov Looks like the Tizen builds are failing because we now require clang5.0 for ARM, but Docker image hqueue/dotnet-buildtools-prereqs:ubuntu-14.04-cross-0cd4667-20172211042239-tizen-rootfs-20170925 doesn't have this, like our Ubuntu Docker image now does. |
@BruceForstall, Tizen docker image was updated in#17814, it was just merged to master. Could you, please, rebase on current master? |
alpencolt left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Great work!
I've looked almost only to ARM parts and it looks good.
We will run jit stress tests on armel and share results.
@@ -1292,10 +1290,6 @@ int LinearScan::ComputeAvailableSrcCount(GenTree* node) | |||
// | |||
void LinearScan::buildRefPositionsForNode(GenTree* tree, BasicBlock* block, LsraLocation currentLoc) | |||
{ | |||
#ifdef _TARGET_ARM_ | |||
assert(!isRegPairType(tree->TypeGet())); | |||
#endif // _TARGET_ARM_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can we keep this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
RegPair is gone, so what would the check be?
#ifndef LEGACY_BACKEND | ||
#ifdef _TARGET_ARM_ | ||
if (type == TYP_LONG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Will it work correct with paired register after this change?
* Returns the register that holds the low 32 bits of the long value given | ||
* by the register pair 'regPair'. | ||
*/ | ||
inline regNumber genRegPairLo(regPairNo regPair) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
As I see logic for register pairs were changed. There were a lot issues related to it in ARM/ARM64.
Could you point where is new logic for long types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There are no reg pairs anymore; long decomposition for arm/x86 forces LONG to INT types in RyuJIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
As I remember pairs were needed to store long values in neighboring registers which is declared in ARM ABI. Does the new implementation work well with this rule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes. Note that we have been running with arm RyuJIT for a long time, with no failures. I do remember seeing code that uses regNum and REG_NEXT(regNum) instead of the register pairs, which might be what you are referring to. Note that the RegPair concept allowed any pair of registers, and was mostly used by the old register allocator. The new allocator, LSRA, requires long decomposition. LclVars still have lvOtherReg (for 32-bit targets) and lvOtherArgReg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Got it! Thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@alpencolt - there are also a small number of nodes that support multiple registers, where there are actual target instructions that produce multiple registers (whereas decomposition expands instruction sequences when no such single instruction exists). Those instructions are:GT_CALL
,GT_COPY
, andGT_RELOAD
(most targets), plusGT_MUL_LONG
,GT_PUTARG_REG
,GT_PUTARG_SPLIT
andGT_BITCAST
on ARM. Instead of using thegtRegPair
these use additional register fields that are defined only for these nodes.
Note that this support has evolved a bit over time, and could use some cleanup (I think there may be an issue associated with that), though I believe that it is complete, just a but messier that it could be.
No need to rebase; just need to wait for the generator job to complete (https://ci.dot.net/job/dotnet_coreclr/job/master/job/generator/3446/) and re-trigger the jobs. @dotnet-bot test Tizen armel Cross Release Build |
ghost commentedMay 21, 2018
Is there a job that runs on CI on-demand for this, that dotnet-bot could read the results from and post it here? that'd be convenient |
CarolEidt left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM. I've only looked at the .h files, plus lowering, lsra, regalloc and simd files. I'll probably have a quick look at the emit stuff as well, but I'm fine with this as-is. I noted a few clean up items for future consideration.
src/jit/codegeninterface.h Outdated
#endif // FEATURE_STACK_FP_X87 | ||
#ifndef LEGACY_BACKEND | ||
regNumber genGetAssignedReg(GenTree* tree); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It appears that this is unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Removed. There will probably be a bit of fallout like this.
@@ -5,8 +5,6 @@ | |||
#ifndef REGALLOC_H_ | |||
#define REGALLOC_H_ | |||
// Some things that are used by both LSRA and regpredict allocators. | |||
enum FrameType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
For clarity, it would probably be good to have a comment that explains why this is separate from the lsra files, and eventually we should consider restructuring - though perhaps we should keep the regalloc files as "common" functionality in the event that we add an alternate register allocator in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm not sure there is a good reason it is separate, except for history, and maybe potential a multi-reg allocator case as you state. But it doesn't seem worth it unless we have two allocators. What about regalloc.cpp? Should that be incorporated directly into lsra.cpp?
Similarly, do we still need codegenlinear? Should it be put in codegencommon? (and codegenlinear.h in codegen.h?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Even if we don't have two register allocators, it doesn't seem a bad thing to keep the things that are interfaces to the rest of the compiler separate, but it's not like we have a clean separation now. I'd probably leave the regalloc stuff alone and perhaps even migrate some other things there over time.
However, for codegenlinear, I agree that it would be better moved to codegencommon.cpp and codegen.h. I don't feel it's urgent, though.
src/jit/regalloc.cpp Outdated
#ifdef _TARGET_ARM_ | ||
// Unaligned loads/stores for floating point values must first be loaded into integer register(s) | ||
// | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Now that this isn't really "shared" other than for the various non-x64/ux targets, this should also be cleaned up in future. There's really no reason that it needs to be special for x64/ux, and there's certainly no need for 3 different methods.
@kasper3 that has been on our "to do" list for a long time, but doesn't exist currently. |
sandreenko left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Morph changes looks good.
@@ -1814,7 +1736,7 @@ void fgArgInfo::ArgsComplete() | |||
// morphing the register arguments of any GT_IND with a GTF_IND_RNGCHK flag | |||
// Thus we can not reorder the argument after any stack based argument | |||
// (Note that GT_LCLHEAP sets the GTF_EXCEPT flag so we don't need to | |||
// check for it explicitly | |||
// check for it explicitly.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
typo?
Hmmm... I don't see one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I was confused by the bracket, now I see that it is fine.
#endif // LEGACY_BACKEND | ||
if (!opts.compCanUseSSE2) | ||
{ | ||
return fgMorphCastIntoHelper(tree, CORINFO_HELP_DBL2INT, oper); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This wasn't under LEGACY_BACKEND. Why is it not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
compCanUseSSE2 is always true
I ran Windows Checked asm diffs on frameworks + tests for x86, x64, arm32 altjit, and arm64 altjit. There were no diffs in any of these configurations (about 460K methods each). |
Note: there appears to be an undesired case of ARM-specific code inref counts under this `#ifdef` that should be fixed.
alpencolt left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
jitstress
andjitstressregs
are passed on armel Tizen
#else | ||
#define HAS_TINY_DESC 0 | ||
#endif | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice to get rid of HAS_TINY_DESC
IF_DEF(AWR_TRD, IS_FP_STK|IS_AM_WR, AMD ) // write [adr], read ST(n) | ||
//////(ARW_TRD, IS_FP_STK|IS_AM_RW, AMD ) // r/w [adr], read ST(n) | ||
#endif // FEATURE_STACK_FP_X87 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Does the calling convention still use the FPU stack for FP return values ?
I thought that it did.
tannergoodingMay 22, 2018 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
For x86, all calling conventions except for__vectorcall
return floating-point values on the FPU stack:
https://godbolt.org/g/QSqZVD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes, and we still have the two instructions we need:
// For RyuJIT/x86, we follow the x86 calling convention that requires// us to return floating point value on the x87 FP stack, so we need// these instructions regardless of whether we're using full stack fp.#ifdef _TARGET_X86_INST1(fld , "fld" , 1, IUM_WR, 0, 0, 0x0000D9)INST1(fstp , "fstp" , 1, IUM_WR, 0, 0, 0x0018D9)#endif // _TARGET_X86
Uh oh!
There was an error while loading.Please reload this page.
All code related to the
LEGACY_BACKEND
JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.