Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Optimize fptrunc(x)>=C1 --> x>=C2#99475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
kissholic wants to merge2 commits intollvm:main
base:main
Choose a base branch
Loading
fromkissholic:main
Open

Conversation

kissholic
Copy link

@kissholickissholic requested a review fromnikic as acode ownerJuly 18, 2024 11:55
@github-actionsGitHub Actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using@ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by theLLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on theLLVM Discord or on theforums.

@llvmbot
Copy link
Member

@llvm/pr-subscribers-llvm-transforms

Author: None (kissholic)

Changes

Fix#85265 (comment)


Full diff:https://github.com/llvm/llvm-project/pull/99475.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+31)
  • (added) llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll (+11)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cppindex abadf54a96767..2af3e92213f13 100644--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp@@ -22,10 +22,13 @@ #include "llvm/Analysis/Utils/Local.h" #include "llvm/Analysis/VectorUtils.h" #include "llvm/IR/ConstantRange.h"+#include "llvm/IR/Constants.h" #include "llvm/IR/DataLayout.h" #include "llvm/IR/InstrTypes.h"+#include "llvm/IR/Instruction.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/PatternMatch.h"+#include "llvm/Support/Casting.h" #include "llvm/Support/KnownBits.h" #include "llvm/Transforms/InstCombine/InstCombiner.h" #include <bitset>@@ -7882,6 +7885,30 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,   return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I); }+// Fold trunc(x) < constant --> x < constant if possible.+static Instruction *foldFCmpFpTrunc(FCmpInst &I, Instruction *LHSI,+                                    Constant *RHSC) {+  //+  FCmpInst::Predicate Pred = I.getPredicate();++  // Check that predicates are valid.+  if ((Pred != FCmpInst::FCMP_OGT) && (Pred != FCmpInst::FCMP_OLT) &&+      (Pred != FCmpInst::FCMP_OGE) && (Pred != FCmpInst::FCMP_OLE))+    return nullptr;++  auto *LType = LHSI->getOperand(0)->getType();+  auto *RType = RHSC->getType();++  if (!(LType->isFloatingPointTy() && RType->isFloatingPointTy() &&+        LType->getTypeID() >= RType->getTypeID()))+    return nullptr;++  auto *ROperand = llvm::ConstantFP::get(+      LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());++  return new FCmpInst(Pred, LHSI->getOperand(0), ROperand, "", &I);+}+ /// Optimize fabs(X) compared with zero. static Instruction *foldFabsWithFcmpZero(FCmpInst &I, InstCombinerImpl &IC) {   Value *X;@@ -8244,6 +8271,10 @@ Instruction *InstCombinerImpl::visitFCmpInst(FCmpInst &I) {                   cast<LoadInst>(LHSI), GEP, GV, I))             return Res;       break;+    case Instruction::FPTrunc:+      if (Instruction *NV = foldFCmpFpTrunc(I, LHSI, RHSC))+        return NV;+      break;   }   }diff --git a/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll b/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.llnew file mode 100644index 0000000000000..446111a60dd6c--- /dev/null+++ b/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll@@ -0,0 +1,11 @@+; RUN: opt -passes=instcombine -S < %s | FileCheck %s+++;CHECK-LABEL: @src(+;CHECK: %result = fcmp oge double %0, 1.000000e+02+;CHECK-NEXT: ret i1 %result+define i1 @src(double %0) {+    %trunc = fptrunc double %0 to float+    %result = fcmp oge float %trunc, 1.000000e+02+    ret i1 %result+}\ No newline at end of file

@dtcxzywdtcxzyw requested a review fromarsenmJuly 18, 2024 11:58
@dtcxzyw
Copy link
Member

Please read the guidelinehttps://llvm.org/docs/InstCombineContributorGuide.html.

%trunc =fptruncdouble%0tofloat
%result =fcmpogefloat%trunc,1.000000e+02
reti1%result
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Missing newline.

auto *LType = LHSI->getOperand(0)->getType();
auto *RType = RHSC->getType();

if (!(LType->isFloatingPointTy() && RType->isFloatingPointTy() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You shouldn't need to check isFloatingPointTy, that's implied by the operations in use already. Also not sure what a >= means when comparing type IDs but you probably don't need that either

return nullptr;

auto *ROperand = llvm::ConstantFP::get(
LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Don't use convertToDouble, keep this entirely in APFloat

%trunc =fptruncdouble%0tofloat
%result =fcmpogefloat%trunc,1.000000e+02
reti1%result
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Test more combinations of types, including vectors. Also test flag preservation behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A test offp128,x86_fp80, orppc_fp128 in particular would be helpful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Test more combinations of types, including vectors. Also test flag preservation behavior
@arsenm Sorry, could you give me a hint what 'flag preservation behavior' means in IR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I mean the flags on the compare should be preserved, and you don't have any tests using fast math flags. e.g.https://alive2.llvm.org/ce/z/uQr4-J

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A test offp128,x86_fp80, orppc_fp128 in particular would be helpful.

Sorry for late.

I tried to test these types, but an error is generated whose message is "floating point constant does not have type 'fp128'".

The error is emitted in the parse stage, and more modifications might be required to be conducted.

Should i ignore this problem, or if there are better solutions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The IR syntax for floating point constants is unnecessarily painful. You need to use a different prefix and have the correct hex length depending on the format. For this I usually just write fpext from a reasonable FP type to the long one, and run it through instsimplify to see how it should be printed

@arsenmarsenm added llvm:instcombineCovers the InstCombine, InstSimplify and AggressiveInstCombine passes and removed llvm:transforms labelsJul 18, 2024
@aengelke
Copy link
Contributor

This appears to be incorrect w.r.t. round-to-nearest rounding of fptrunc.alive2 The constant needs adjustment.

@kissholic
Copy link
Author

This appears to be incorrect w.r.t. round-to-nearest rounding of fptrunc.alive2 The constant needs adjustment.

It seems that the double type the fp constant converted to can express the same value with float type without lossing accuracy. The rmNearestTiesToEven has already been applied, and no difference appeared.🫣


if (RHSC->getType()->isVectorTy()) {
Type *LVecType = LHSI->getOperand(0)->getType();
Type *LEleType = dyn_cast<VectorType>(LVecType)->getElementType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

unchecked dyn_cast. Use the dyn_cast in the if expression instead of using isVectorTy

auto *ROperand = llvm::ConstantFP::get(
LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());
std::vector<Constant *> EleVec(EleNum);
for (uint64_t Idx = 0; Idx < EleNum; ++Idx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is trying too hard to handle the non-splat case. Just use m_APFloat which handles splat vectors and scalars at the same time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So i should split splat vector case from the normal cases, and combine it with scalars, if i understand correctly?😖

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You shouldn't have to really worry about the vector case. If you use m_APFloat, it should just work. It will handle scalars and splat vectors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You shouldn't have to really worry about the vector case. If you use m_APFloat, it should just work. It will handle scalars and splat vectors

Sorry, i still didn't get the point... It seems that m_APFloat can't cover the non-splat vector cases.

I also read the icmp-trunc optimization code and ran an int vector test case, but it seems not work in the int vector case.

I came up with an idea that FPExtInst may be applied to the constant, and left the optimization to constant extension part 😋 (joking).

Could you give some more hints? Thank you <3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It seems that m_APFloat can't cover the non-splat vector cases.

Correct. I'm saying it's a waste of time, and will multiply the patch size, to handle non-splat cases. If you really want to handle non-splat cases, it should be a follow up after the simple patch

ret <4 x i1> %cmp
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Add a scalable vector test

@aengelke
Copy link
Contributor

It seems that the double type the fp constant converted to can express the same value with float type without lossing accuracy. The rmNearestTiesToEven has already been applied, and no difference appeared.🫣

It's the opposite direction that is problematic. Consider input99.99999999. After the fptrunc, it will be value100.0f, which is >= 100.0f. But99.99999999 is not >=100.0. You need to find the smallest(/largest) value of the larger floating-point type which, after truncation, is satisfies the condition.

Thecontributor guide also says that you should provide alive2 proofs that your transformation is correct. I provided a proof above that the transformation as implemented/tested now is not correct.

Copy link
Contributor

@arsenmarsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you add the alive2 link proof to the description

@@ -7882,6 +7892,79 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,
return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I);
}

// Fold trunc(x) < constant --> x < constant if possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

fptrunc, not trunc

Comment on lines 7901 to 7905
if ((Pred == FCmpInst::FCMP_OGE) || (Pred == FCmpInst::FCMP_UGE) ||
(Pred == FCmpInst::FCMP_OLT) || (Pred == FCmpInst::FCMP_ULT))
RoundDown = true;
else if ((Pred == FCmpInst::FCMP_OGT) || (Pred == FCmpInst::FCMP_UGT) ||
(Pred == FCmpInst::FCMP_OLE) || (Pred == FCmpInst::FCMP_ULE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Don't need parentheses around all these == expressions

%result = fcmp fast oge float %trunc, 1.000000e+02
ret i1 %result
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The set of tested constants seems too simple for the complexity of the loop testing constant validity. Should have negative tests for off by one bit in each direction. Also test with the edge case constants (inf, nan) and some denormal values?

Comment on lines 7938 to 7939
APFloat DupUpBound = UpBound;
DupUpBound.next(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Name UpBoundNext, or similar?

APFloat LowBound = RoundDown ? ExtNextRValue : ExtRValue;
APFloat UpBound = RoundDown ? ExtRValue : ExtNextRValue;

while (true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This loop is the hard to review part and needs some comments explaining what constants are legal

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry for late, being occupied by work.

The alive proof will be posted soon.

The edge cases (inf, nan) has been folded before entering this optimization (e.g. fcmp oge x inf --> fcmp oeq x inf). Maybe filtering these cases is a good idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It's not always safe to rely on those folding before hand

kissholic added a commit to kissholic/llvm-project that referenced this pull requestOct 20, 2024
@kissholic
Copy link
Author

kissholic commentedOct 20, 2024
edited
Loading

Exclude nan and infinity from optimization. The two 'number' requires special comparison rules, and have been optimized well by other methods, which generate bool literal directly.

Also add special treatment for the max (and the min) representable float value, due to their next value is infinity.

alive2 proof:
https://alive2.llvm.org/ce/z/rphqKP
https://alive2.llvm.org/ce/z/UhU5nG
https://alive2.llvm.org/ce/z/v3zu93
https://alive2.llvm.org/ce/z/fQxsaA
https://alive2.llvm.org/ce/z/UYnaSb
https://alive2.llvm.org/ce/z/5VRZGg

@arsenmarsenm added the floating-pointFloating-point math labelOct 20, 2024
%result = fcmp olt float %trunc, -3.4028234663852885981170418348451692544e38
ret i1 %result
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Test the literal nan and inf cases

Comment on lines +7901 to +8047
if (Pred == FCmpInst::FCMP_OGE || Pred == FCmpInst::FCMP_UGE ||
Pred == FCmpInst::FCMP_OLT || Pred == FCmpInst::FCMP_ULT)
RoundDown = true;
else if (Pred == FCmpInst::FCMP_OGT || Pred == FCmpInst::FCMP_UGT ||
Pred == FCmpInst::FCMP_OLE || Pred == FCmpInst::FCMP_ULE)
RoundDown = false;
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

uge, ule cases not tested. Plus negative test for others

// Set the limit of ExtNextRValue.
if (NextRValue.isInfinity()) {
ExtNextRValue = ExtRValue * Two;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No braces. Can also defer construction of the 2 constant (or avoid it by using scalbn instead)

ExtNextRValue.convert(LEleType->getFltSemantics(),
APFloat::rmNearestTiesToEven, &lossInfo);

// Binary search to find the maximal (or minimal) value after RValue promotion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you write this with std::lower_bound/std::upper_bound?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

std::lower_bound/std::upper_bound (or similar algorithm like std::binary_search) seems only acceptForwardIt type and there is likely no suitable substitution algorithm in LLVM too, which may requires constructing a new complex wrapper struct of APFloat, such as implementing name required iteration methods, calculating the mean of two APFloat (without constructing a new APFloat divisor), defining a comparation function and so on.

Considering the internal complexity of APFloat wrapper, it is simpler to keep the original one (i think)?

@github-actionsGitHub Actions
Copy link

github-actionsbot commentedOct 20, 2024
edited
Loading

⚠️ C/C++ code formatter, clang-format found issues in your code.⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cppindex 28d094bd9..8bd421ff2 100644--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp@@ -8297,7 +8297,7 @@ static Instruction *foldFCmpFpTrunc(FCmpInst &I, const Instruction &FPTrunc,    // Check whether 'ExtMidValue' is a valid result since the assumption on   // imaged 'NextCValue' might not hold for new float types.-  // ppc_fp128 can't pass here when converting from max float because of+  // ppc_fp128 can't pass here when converting from max float because of   // APFloat implementation.   if (NextCValue.isInfinity()) {     // ExtMidValue --- narrowed ---> Finite

@kissholic
Copy link
Author

@arsenm
Copy link
Contributor

Why was this closed?

@kissholic
Copy link
Author

Why was this closed?

Sorry, the old commits seem to be blocked due to a merge operation. I tried to discard those commits and push it again.

@kissholickissholic reopened thisMar 16, 2025
@kissholic
Copy link
Author

ping

@kissholic
Copy link
Author

ping

Copy link
Contributor

@arsenmarsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The hard part of reviewing this is the binary search. I feel like this should be a directly computable value? I guess it's fine though, mostly nit comments

@@ -8031,6 +8031,101 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,
return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I);
}


// Fold fptrunc(x) < constant --> x < constant if possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
// Fold fptrunc(x) < constant --> x < constant if possible.
/// Fold fptrunc(x) < constant --> x < constant if possible.

This also covers other compare types, can you express that in the comment

@@ -8031,6 +8031,101 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,
return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I);
}


// Fold fptrunc(x) < constant --> x < constant if possible.
static Instruction *foldFCmpFpTrunc(FCmpInst &I, Instruction *LHSI,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
static Instruction *foldFCmpFpTrunc(FCmpInst &I, Instruction*LHSI,
static Instruction *foldFCmpFpTrunc(FCmpInst &I,constInstruction&FPTrunc,

if (RValue->isNaN() || RValue->isInfinity())
return nullptr;

Type *LType = LHSI->getOperand(0)->getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Seems more like a "Src" than a "Left" type

return nullptr;

Type *LType = LHSI->getOperand(0)->getType();
Type *RType = RHSC->getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Seems more like a "Dest" than a "Right" type

Comment on lines 8085 to 8086
APFloat LowBound = RoundDown ? ExtNextRValue : ExtRValue;
APFloat UpBound = RoundDown ? ExtRValue : ExtNextRValue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
APFloat LowBound = RoundDown ? ExtNextRValue : ExtRValue;
APFloat UpBound = RoundDown ? ExtRValue : ExtNextRValue;
constAPFloat&LowBound = RoundDown ? ExtNextRValue : ExtRValue;
constAPFloat&UpBound = RoundDown ? ExtRValue : ExtNextRValue;

APFloat TruncValue = ExtValue;
TruncValue.convert(REleType->getFltSemantics(),
APFloat::rmNearestTiesToEven, &lossInfo);
return TruncValue == *RValue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Do these compares amount to the same as directly using losesInfo?

Comment on lines 8080 to 8082
// Binary search to find the maximal (or minimal) value after 'RValue'
// promotion. 'RValue' should obey normal comparison rules, which means nan or
// inf is not allowed here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Move this comment down to the loop

// Binary search to find the maximal (or minimal) value after 'RValue'
// promotion. 'RValue' should obey normal comparison rules, which means nan or
// inf is not allowed here.
APFloat RoundValue{LEleType->getFltSemantics()};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Move this down to where it's used before the loop

// 'EqualRValue' indicates whether Mid is qualified to be the final round
// value. if 'EqualRValue' == true, 'Mid' might be the final round value
// if 'RoundDown' == true, 'UpBound' can't be the final round value
// if 'RoudnDown' == false, 'DownBound' can't be the final round value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
// if 'RoudnDown' == false, 'DownBound' can't be the final round value
// if 'RoundDown' == false, 'DownBound' can't be the final round value

APFloat::rmNearestTiesToEven, &lossInfo);
return TruncValue == *RValue;
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I feel like this should be a directly computable constant without a binary search

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes, binary search can be replaced by average value for most cases, and I got confused before. However, it seems a little tricky to tackle the comparison with the largest value. I tried several ways to work around, but all are unsatisfactory. Might be late for the final solution.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Although the absence ofnext(max_float) which is infinity actually, we can image its existence in wider types, and it should equal tomax_float + (max_float - prev(max_float)) in most cases (if not all). A bound check can ensure its correctness.

This method balance the efficiency and correctness, and keep consistence with logics for other floats despite the fact that it might not be standard for all float types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Are you planning on switching the patch to do this? Needs comments either way

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm trying to, but it seems that the rule can't be applied toppc_fp128 when converted frommax_float due to unknown precision loss. I want more investigation before decision

}


; max representable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Also test negative of max representable?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This case is covered byfcmp_trunc_mn. Maybe this name is misleading and should be changed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

New algorithm is adopted~

@kissholic
Copy link
Author

https://alive2.llvm.org/ce/z/bTf3ZW
Converting frommax float toppc_fp128 will be0xM47EFFFFFF0000000C150000000000000 which should pass the last finite check in the commit but fails. This error seems to be coming from the precision loss of addition of two IEEE 64-bit doubles inside ofppc_fp128. This case will be bypassed temporarily before the issue fixed.

@kissholickissholic requested a review fromarsenmJuly 18, 2025 10:26
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@dtcxzywdtcxzywdtcxzyw left review comments

@jcranmer-inteljcranmer-inteljcranmer-intel left review comments

@arsenmarsenmarsenm approved these changes

@nikicnikicAwaiting requested review from nikicnikic is a code owner

@andykaylorandykaylorAwaiting requested review from andykaylor

@kpnealkpnealAwaiting requested review from kpneal

Assignees
No one assigned
Labels
floating-pointFloating-point mathllvm:instcombineCovers the InstCombine, InstSimplify and AggressiveInstCombine passesllvm:transforms
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

missed optimization, fptrunc (x) >= C1 => x >= C2
6 participants
@kissholic@llvmbot@dtcxzyw@aengelke@arsenm@jcranmer-intel

[8]ページ先頭

©2009-2025 Movatter.jp