Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
This repository was archived by the owner on Jan 23, 2023. It is now read-only.
/coreclrPublic archive

Remove JIT LEGACY_BACKEND code#18064

Merged

Conversation

BruceForstall
Copy link

@BruceForstallBruceForstall commentedMay 19, 2018
edited
Loading

All code related to theLEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.

alpencolt, xoofx, dannyrb, factormystic, nietras, dv00d00, RyoukoKonpaku, KrzysztofBranicki, Fireboyd78, wtfblub, and 10 more reacted with thumbs up emojimikedn, tannergooding, sandreenko, wateret, ArtBlnd, ciplogic, erozenfeld, vbfox, pauldotknopf, khellang, and 38 more reacted with hooray emojimirbeta and danmoseley reacted with heart emojixiangzhai reacted with rocket emoji
@BruceForstall
Copy link
Author

@dotnet-bot test Windows_NT arm Cross Checked normal Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked normal Build and Test
@dotnet-bot test Ubuntu arm Cross Checked normal Build and Test
@dotnet-bot test Windows_NT x64 Build and Test
@dotnet-bot test Windows_NT x86 Checked Build and Test

@BruceForstall
Copy link
Author

@mikedn
Copy link

Here and there may be additional clean up worth doing after removal of LEGACY_BACKEND code. For example, some RangeCheck code still usesswitch even though now there's only one possible case. I guess that's best done in a separate PR, right?

@BruceForstall
Copy link
Author

@mikedn Yes, there certainly be some additional cleanup fallout. It would be good to open issues for those things when noticed; some are not obvious.

@BruceForstall
Copy link
Author

@dotnet/jit-contrib @dotnet/arm32-contrib @dotnet/arm64-contrib PTAL

This is a large change; I don't expect anyone to review it all. Perhaps some spot checking will be sufficient and useful. This might lead to ideas for further cleanup.

I do want everyone to be aware this is happening, and have the opportunity to provide any feedback you might have.

@AndyAyersMS
Copy link
Member

Would be nice to verify that this is a no-diff change.

tannergooding reacted with thumbs up emoji

@@ -113,12 +105,12 @@ GTSTRUCT_1(Qmark , GT_QMARK)
GTSTRUCT_1(PhiArg , GT_PHI_ARG)
GTSTRUCT_1(StoreInd , GT_STOREIND)
GTSTRUCT_N(Indir , GT_STOREIND, GT_IND, GT_NULLCHECK, GT_BLK, GT_STORE_BLK, GT_OBJ, GT_STORE_OBJ, GT_DYN_BLK, GT_STORE_DYN_BLK)
#if defined(LEGACY_BACKEND) || !defined(_TARGET_ARM_)
GTSTRUCT_1(PutArgStk , GT_PUTARG_STK)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This declaration was also for all non ARM targets. Is it correct to remove it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Oh, it's declared on 112 line.

@gbalykov
Copy link
Member

@dotnet-bot test Tizen armel Cross Release Build

@gbalykov
Copy link
Member

@dotnet-bot test Tizen armel Cross Checked Build

@BruceForstall
Copy link
Author

@gbalykov Looks like the Tizen builds are failing because we now require clang5.0 for ARM, but Docker image hqueue/dotnet-buildtools-prereqs:ubuntu-14.04-cross-0cd4667-20172211042239-tizen-rootfs-20170925 doesn't have this, like our Ubuntu Docker image now does.

@gbalykov
Copy link
Member

@BruceForstall, Tizen docker image was updated in#17814, it was just merged to master. Could you, please, rebase on current master?

Copy link

@alpencoltalpencolt left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Great work!
I've looked almost only to ARM parts and it looks good.

We will run jit stress tests on armel and share results.

@@ -1292,10 +1290,6 @@ int LinearScan::ComputeAvailableSrcCount(GenTree* node)
//
void LinearScan::buildRefPositionsForNode(GenTree* tree, BasicBlock* block, LsraLocation currentLoc)
{
#ifdef _TARGET_ARM_
assert(!isRegPairType(tree->TypeGet()));
#endif // _TARGET_ARM_

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can we keep this check?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

RegPair is gone, so what would the check be?


#ifndef LEGACY_BACKEND
#ifdef _TARGET_ARM_
if (type == TYP_LONG)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Will it work correct with paired register after this change?

* Returns the register that holds the low 32 bits of the long value given
* by the register pair 'regPair'.
*/
inline regNumber genRegPairLo(regPairNo regPair)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

As I see logic for register pairs were changed. There were a lot issues related to it in ARM/ARM64.
Could you point where is new logic for long types?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There are no reg pairs anymore; long decomposition for arm/x86 forces LONG to INT types in RyuJIT.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

As I remember pairs were needed to store long values in neighboring registers which is declared in ARM ABI. Does the new implementation work well with this rule?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes. Note that we have been running with arm RyuJIT for a long time, with no failures. I do remember seeing code that uses regNum and REG_NEXT(regNum) instead of the register pairs, which might be what you are referring to. Note that the RegPair concept allowed any pair of registers, and was mostly used by the old register allocator. The new allocator, LSRA, requires long decomposition. LclVars still have lvOtherReg (for 32-bit targets) and lvOtherArgReg.

alpencolt reacted with thumbs up emoji

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Got it! Thank you

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@alpencolt - there are also a small number of nodes that support multiple registers, where there are actual target instructions that produce multiple registers (whereas decomposition expands instruction sequences when no such single instruction exists). Those instructions are:GT_CALL,GT_COPY, andGT_RELOAD (most targets), plusGT_MUL_LONG,GT_PUTARG_REG,GT_PUTARG_SPLIT andGT_BITCAST on ARM. Instead of using thegtRegPair these use additional register fields that are defined only for these nodes.
Note that this support has evolved a bit over time, and could use some cleanup (I think there may be an issue associated with that), though I believe that it is complete, just a but messier that it could be.

alpencolt reacted with thumbs up emoji
@BruceForstall
Copy link
Author

@BruceForstall, Tizen docker image was updated in#17814, it was just merged to master. Could you, please, rebase on current master?

No need to rebase; just need to wait for the generator job to complete (https://ci.dot.net/job/dotnet_coreclr/job/master/job/generator/3446/) and re-trigger the jobs.

@dotnet-bot test Tizen armel Cross Release Build
@dotnet-bot test Tizen armel Cross Checked Build

@ghost
Copy link

Would be nice to verify that this is a no-diff change.

Is there a job that runs on CI on-demand for this, that dotnet-bot could read the results from and post it here? that'd be convenient

Copy link

@CarolEidtCarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM. I've only looked at the .h files, plus lowering, lsra, regalloc and simd files. I'll probably have a quick look at the emit stuff as well, but I'm fine with this as-is. I noted a few clean up items for future consideration.


#endif // FEATURE_STACK_FP_X87

#ifndef LEGACY_BACKEND
regNumber genGetAssignedReg(GenTree* tree);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It appears that this is unused.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Removed. There will probably be a bit of fallout like this.

@@ -5,8 +5,6 @@
#ifndef REGALLOC_H_
#define REGALLOC_H_

// Some things that are used by both LSRA and regpredict allocators.

enum FrameType

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For clarity, it would probably be good to have a comment that explains why this is separate from the lsra files, and eventually we should consider restructuring - though perhaps we should keep the regalloc files as "common" functionality in the event that we add an alternate register allocator in future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm not sure there is a good reason it is separate, except for history, and maybe potential a multi-reg allocator case as you state. But it doesn't seem worth it unless we have two allocators. What about regalloc.cpp? Should that be incorporated directly into lsra.cpp?

Similarly, do we still need codegenlinear? Should it be put in codegencommon? (and codegenlinear.h in codegen.h?)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Even if we don't have two register allocators, it doesn't seem a bad thing to keep the things that are interfaces to the rest of the compiler separate, but it's not like we have a clean separation now. I'd probably leave the regalloc stuff alone and perhaps even migrate some other things there over time.
However, for codegenlinear, I agree that it would be better moved to codegencommon.cpp and codegen.h. I don't feel it's urgent, though.

#ifdef _TARGET_ARM_
// Unaligned loads/stores for floating point values must first be loaded into integer register(s)
//

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Now that this isn't really "shared" other than for the various non-x64/ux targets, this should also be cleaned up in future. There's really no reason that it needs to be special for x64/ux, and there's certainly no need for 3 different methods.

@BruceForstall
Copy link
Author

Is there a job that runs on CI on-demand for this, that dotnet-bot could read the results from and post it here? that'd be convenient

@kasper3 that has been on our "to do" list for a long time, but doesn't exist currently.

Copy link

@sandreenkosandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Morph changes looks good.

@@ -1814,7 +1736,7 @@ void fgArgInfo::ArgsComplete()
// morphing the register arguments of any GT_IND with a GTF_IND_RNGCHK flag
// Thus we can not reorder the argument after any stack based argument
// (Note that GT_LCLHEAP sets the GTF_EXCEPT flag so we don't need to
// check for it explicitly
// check for it explicitly.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

typo?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

typo?

Hmmm... I don't see one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I was confused by the bracket, now I see that it is fine.

#endif // LEGACY_BACKEND
if (!opts.compCanUseSSE2)
{
return fgMorphCastIntoHelper(tree, CORINFO_HELP_DBL2INT, oper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This wasn't under LEGACY_BACKEND. Why is it not needed?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

compCanUseSSE2 is always true

@BruceForstall
Copy link
Author

Would be nice to verify that this is a no-diff change.

I ran Windows Checked asm diffs on frameworks + tests for x86, x64, arm32 altjit, and arm64 altjit. There were no diffs in any of these configurations (about 460K methods each).

tannergooding and fiigii reacted with thumbs up emoji

Note: there appears to be an undesired case of ARM-specific code inref counts under this `#ifdef` that should be fixed.
Copy link

@alpencoltalpencolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

jitstress andjitstressregs are passed on armel Tizen

#else
#define HAS_TINY_DESC 0
#endif

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice to get rid of HAS_TINY_DESC

IF_DEF(AWR_TRD, IS_FP_STK|IS_AM_WR, AMD ) // write [adr], read ST(n)
//////(ARW_TRD, IS_FP_STK|IS_AM_RW, AMD ) // r/w [adr], read ST(n)

#endif // FEATURE_STACK_FP_X87

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Does the calling convention still use the FPU stack for FP return values ?
I thought that it did.

Copy link
Member

@tannergoodingtannergoodingMay 22, 2018
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For x86, all calling conventions except for__vectorcall return floating-point values on the FPU stack:
https://godbolt.org/g/QSqZVD

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes, and we still have the two instructions we need:

// For RyuJIT/x86, we follow the x86 calling convention that requires// us to return floating point value on the x87 FP stack, so we need// these instructions regardless of whether we're using full stack fp.#ifdef _TARGET_X86_INST1(fld    , "fld"          , 1, IUM_WR, 0, 0, 0x0000D9)INST1(fstp   , "fstp"         , 1, IUM_WR, 0, 0, 0x0018D9)#endif // _TARGET_X86

@BruceForstallBruceForstall merged commit7153e44 intodotnet:masterMay 22, 2018
@BruceForstallBruceForstall deleted the Remove-LEGACY_BACKEND branchNovember 17, 2018 00:44
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.
Reviewers

@CarolEidtCarolEidtCarolEidt approved these changes

@mikednmikednmikedn left review comments

@tannergoodingtannergoodingtannergooding left review comments

@erozenfelderozenfelderozenfeld left review comments

@briansullbriansullbriansull left review comments

@sandreenkosandreenkosandreenko approved these changes

@alpencoltalpencoltalpencolt approved these changes

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

10 participants
@BruceForstall@mikedn@AndyAyersMS@gbalykov@alpencolt@CarolEidt@tannergooding@erozenfeld@briansull@sandreenko

[8]ページ先頭

©2009-2025 Movatter.jp