SplitInt intoInt andMinInt so we can use traits with bigint types
Add 256-bit bigint types for widening operations on 128-bit integers
Refactor test macros so that systems without implementations test againstrustc_apfloat rather than just being skipped. I needed this so I can actually debug.
Add implementations and tests for the new operations
Change powerpc symbol names to match what LLVM emits, fromhttps://gcc.gnu.org/wiki/Ieee128PowerPC

tgross35 force-pushed thef16-f128-intrinsics-min branch 7 times, most recently from2e9bb45 to7871c81Compare

May 11, 2024 05:43

Copy link

ContributorAuthor

tgross35 commentedMay 11, 2024•
edited
Loading

Powerpc seems to be hitting a SIGILL onstxvd2x vs34,0,r9. Fromhttps://bugzilla.redhat.com/show_bug.cgi?id=1045384 andhttps://bugzilla.redhat.com/show_bug.cgi?id=1002077 it seems like this may be a limitation of qemu, but information is scarce. I may just need to use the apfloat fallback.

Also looks like symbol names may be different, I need to double checkhttps://gcc.gnu.org/wiki/Ieee128PowerPC

Copy link

ContributorAuthor

tgross35 commentedMay 11, 2024•
edited
Loading

Ok, some more info. Compiling C for 32-bit powerpc seems to always emit__gcc_qadd regardless of-mlong-double flag. When forced to use__addkf3 it segfaults as expected because of thestxvd2x instruction. I think we are okay just ignoring this one as a likely limitation of qemu.

Compiling C for powerpc64 seems to use an ifunc resolver for__addkf3, which goes to__addkf3_hw for the single-instructionxsaddqp. Compiling Rust seems to do the exact same thing but I am getting different results, __addtf3(0x00000000000000000000000000000000, 0x00000000000000000000000000000001): std: 0x00000000000000000000000000000000, builtins: 0x00000000000000000000000000000001.~~I get the same results when calling__addkf3 directly in C~~edit: no I don't. Maybe the operation format of xsaddqp needs to be set somehow to choose between ppc doubledouble and ieee f128?

Copy link

ContributorAuthor

tgross35 commentedMay 12, 2024•
edited
Loading

Some asm for anyone who understands it better:

Assembly generated from Rust (incorrect result)

0000000000010d00 <.add_entry>:   10d00:   7c0802 a6     mflr    r0   10d04:   f821 ff91     stdu    r1,-112(r1)   10d08:   f8010080std     r0,128(r1)   10d0c:   4b ff c695bl      d3a0 <00000143.plt_call.__addkf3>   10d10:   e8410028     ld      r2,40(r1)   10d14:38210070     addi    r1,r1,112   10d18:   e8010010     ld      r0,16(r1)   10d1c:   7c0803 a6     mtlr    r0   10d20:   4e800020     blr000000000000d3a0 <00000143.plt_call.__addkf3>:    d3a0:       f8410028std     r2,40(r1)    d3a4:       3d62 ff ff     addisr11,r2,-1    d3a8:       e9 8b 7f58     ldr12,32600(r11)    d3ac:       7d8903 a6     mtctrr12    d3b0:       e8 4b 7f60     ld      r2,32608(r11)    d3b4:       4e800420     bctr0000000000056790 <.__addkf3_resolve>:56790:81 2d 8f 9c     lwzr9,-28772(r13)56794:75290040     andis.r9,r9,6456798:41820018     beq     567b0 <.__addkf3_resolve+0x20>   5679c:       e8628010     ld      r3,-32752(r2)   567a0:       4e800020     blr   567a4:60000000nop   567a8:60000000nop   567ac:60420000     ori     r2,r2,0   567b0:       e8628018     ld      r3,-32744(r2)   567b4:       4e800020     blr        ...   567c4:60000000nop   567c8:60000000nop   567cc:60420000     ori     r2,r2,0000000000005e6e0 <.__addkf3_hw>:   5e6e0:       fc421808     xsaddqp v2,v2,v3   5e6e4:       4e800020     blr        ...   5e6f4:60000000nop   5e6f8:60000000nop   5e6fc:60000000nop0000000000057050 <.__addkf3_sw>:57050:       fb41 ff d0std     r26,-48(r1)57054:       fb61 ff d8std     r27,-40(r1)57058:       fb81 ff e0std     r28,-32(r1)   5705c:       fb a1 ff e8std     r29,-24(r1)57060:       fb c1 ff f0std     r30,-16(r1)57064:       fb e1 ff f8std     r31,-8(r1)57068:       f821 ff41     stdu    r1,-192(r1)   5706c:39210070     addir9,r1,11257070:39410070     addir10,r1,112   // ... full sw implementation

Assembly generated from C (correct result)

0000000010000af8 <.add_entry>:    10000af8:   7c0802 a6     mflr    r0    10000afc:   f8010010std     r0,16(r1)    10000b00:   fb e1 ff f8std     r31,-8(r1)    10000b04:   f821 ff81     stdu    r1,-128(r1)    10000b08:   7c 3f0b78     mr      r31,r1    10000b0c:39200030     lir9,48    10000b10:39 5f0080     addir10,r31,128    10000b14:   7c 4a 4f99     stxvd2x vs34,r10,r9    10000b18:39200040     lir9,64    10000b1c:39 5f0080     addir10,r31,128    10000b20:   7c 6a 4f99     stxvd2x vs35,r10,r9    10000b24:39200040     lir9,64    10000b28:39 5f0080     addir10,r31,128    10000b2c:   7c 6a 4e99     lxvd2x  vs35,r10,r9    10000b30:39200030     lir9,48    10000b34:39 5f0080     addir10,r31,128    10000b38:   7c 4a 4e99     lxvd2x  vs34,r10,r9    10000b3c:   4b ff fb a5bl      100006e0 <00000019.plt_call.__addkf3>    10000b40:   e8410028     ld      r2,40(r1)    10000b44:   f0021496     xxmr    vs0,vs34    10000b48:   f0400491     xxmr    vs34,vs0    10000b4c:38 3f0080     addi    r1,r31,128    10000b50:   e8010010     ld      r0,16(r1)    10000b54:   7c0803 a6     mtlr    r0    10000b58:   eb e1 ff f8     ld      r31,-8(r1)    10000b5c:   4e800020     blr    10000b60:00000000     .long0x0    10000b64:00000001     .long0x1    10000b68:80010001     lwz     r0,1(r1)0000000010000c40 <.__addkf3_resolve>:    10000c40:81 2d 8f 9c     lwzr9,-28772(r13)    10000c44:75290040     andis.r9,r9,64    10000c48:41820018     beq     10000c60 <.__addkf3_resolve+0x20>    10000c4c:   e8628010     ld      r3,-32752(r2)    10000c50:   4e800020     blr    10000c54:60000000nop    10000c58:60000000nop    10000c5c:60420000     ori     r2,r2,0    10000c60:   e8628018     ld      r3,-32744(r2)    10000c64:   4e800020     blr        ...    10000c74:60000000nop    10000c78:60000000nop    10000c7c:60420000     ori     r2,r2,00000000010008b90 <.__addkf3_hw>:    10008b90:   fc421808     xsaddqp v2,v2,v3    10008b94:   4e800020     blr        ...    10008ba4:60000000nop    10008ba8:60000000nop    10008bac:60000000nop0000000010001500 <.__addkf3_sw>:10001500:   fb41 ff d0std     r26,-48(r1)10001504:   fb61 ff d8std     r27,-40(r1)10001508:   fb81 ff e0std     r28,-32(r1)    1000150c:   fb a1 ff e8std     r29,-24(r1)10001510:   fb c1 ff f0std     r30,-16(r1)10001514:   fb e1 ff f8std     r31,-8(r1)10001518:   f821 ff41     stdu    r1,-192(r1)    1000151c:39210070     addir9,r1,11210001520:39410070     addir10,r1,112    // ...

Rust code

#![feature(f128)]#[no_mangle]#[inline(never)]fnadd_entry(a:f128,b:f128) ->f128{    a + b}fnmain(){let a = f128::from_bits(0x0);let b = f128::from_bits(0x1);dbg!(a, b);let c =add_entry(a, b);dbg!(c);}

C code

#include<stdio.h>#include<stdlib.h>#include<inttypes.h>#define_Float128 __float128typedefstruct {#if__BYTE_ORDER==__LITTLE_ENDIANuint64_tlower,upper;#elif__BYTE_ORDER==__BIG_ENDIANuint64_tupper,lower;#else#error missing endian check#endif} __attribute__((aligned(_Alignof(_Float128))))u128;_Float128__addkf3(_Float128,_Float128);voidf128_print(_Float128val) {u128ival=*((u128*)(&val));printf("%#018"PRIx64"%016"PRIx64" %lf\n",ival.upper,ival.lower, (double)val);}_Float128new_f128(uint64_tupper,uint64_tlower) {u128val= { .lower=lower, .upper=upper };return*((_Float128*)(&val));}_Float128add_entry(_Float128a,_Float128b) {#ifdefUSE_ADDKF3return__addkf3(a,b);#elsereturna+b;#endif}intmain() {_Float128a=new_f128(0x0000000000000000,0x0000000000000000);_Float128b=new_f128(0x0000000000000000,0x0000000000000001);f128_print(a);f128_print(b);_Float128c=add_entry(a,b);f128_print(c);return0;}

tgross35 force-pushed thef16-f128-intrinsics-min branch 11 times, most recently from5119b1a to3133d8fCompare

May 12, 2024 22:25

Copy link

ContributorAuthor

tgross35 commentedMay 12, 2024

I don't have any further insight on powerpc64, so I'll just disable testing against the system for now unless you have any suggestions@Amanieu. We're still testing against apfloat so our compiler-builtins is correct, but using the system symbols may cause precision issues for users. I think we can probably just address this more after everything merges.

This PR is probably best reviewed by-commit.

tgross35 marked this pull request as ready for review

May 12, 2024 22:58

Copy link

Member

Amanieu commentedMay 13, 2024

@lu-zero Maybe you have some insight into the powerpc issues?

This was referencedMay 13, 2024

Add missing functions forf16 andf128#587

Closed

f128 symbols on powerpc64 give inaccurate resultsrust-lang/rust#125109

Open

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024

Discussed some more athttps://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/f128.20system.20libraries.20noncompliant.20platforms/near/438486364,@ecnelises is going to take a look at some point. This PR doesn't break anything new so I don't think it is blocked, openedrust-lang/rust#125109 to track the issue further.

Copy link

lu-zero commentedMay 14, 2024

@Amanieu I can build and try on real hardware. We can ask for real runners, I think.

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024•
edited
Loading

@lu-zero could you try the code at#606 (comment) on powerpc64? I put more about how I am building atrust-lang/rust#125109

Less pressing but it would also be nice if you have a way to confirm thatstxvd2x does not sigill on realpowerpc-* targets, as in#606 (comment)

Copy link

lu-zero commentedMay 14, 2024

so this is what I'm getting.

$ cargo run   Compiling test_f128 v0.1.0 (/home/lu_zero/test_f128)    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.23s     Running `target/debug/test_f128`[src/main.rs:12:5] a = 0x00000000000000000000000000000000[src/main.rs:12:5] b = 0x00000000000000000000000000000001[src/main.rs:14:5] c = 0x00000000000000000000000000000001

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024

Well that is interesting. That is 64-bit, correct?

Copy link

lu-zero commentedMay 14, 2024•
edited
Loading

Yes, tested on power9, used as ppc64le

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024

LE seems to work fine in qemu, the green test in CI isn't using the fallback. Onlypowerpc- andpowerpc64- are causing the issueshttps://github.com/rust-lang/compiler-builtins/blob/3133d8f555580d93645eb763f94ec4cb59ac5880/testcrate/build.rs

Copy link

lu-zero commentedMay 14, 2024

Sadly less and less people care about BE and it is fairly annoying to set up a BE environment =/

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024

Ah, well thanks for taking a look! I'll dig a little bit more but I think we are probably okay to not worry about it unless rust-lang/rust unit f128 unit tests have failures (we currently can't test much without this)

Copy link

lu-zero commentedMay 14, 2024

Either qemu gets fixed for BE or we'll need a BE runner.

Copy link

ContributorAuthor

tgross35 commentedMay 14, 2024

I don't think the ppc64 BE is a qemu issue since the C version works fine. 32-bit hopefully should be just qemu.

Agreed that native runners wouldn't be a bad thing.

This was referencedMay 14, 2024

Tracking Issue forf16 andf128 float typesrust-lang/rust#116909

Open

Add more intrinsics#611

Closed

tgross35 added8 commits

May 15, 2024 07:19

SplitInt intoInt andMinInt

9c6fcb5

`MinInt` contains the basic methods that are only needed by integersinvolved in widening operations, i.e. big integers. `Int` retains allother operations and convenience methods.

Addi256 andu256 bigint types

2868c26

Refactor float test macros to have a fallback

77faba1

Change float test macros to fall back to testing against `rustc_apfloat`when system implementations are not available, rather than just skippingtests.This allows for easier debugging where operations may not be supported.

Enable no-fail-fast for more usable test output

255c9f3

Implementf128 addition and subtraction

c8cc819

Implementf128 multiplication

58ad317

Implementf128 comparison

9bea196

Correct f128 extend and truncate symbol names on powerpc

6a847ab

PowerPC uses `kf` instead of `tf`:<https://gcc.gnu.org/wiki/Ieee128PowerPC>

tgross35 force-pushed thef16-f128-intrinsics-min branch from3133d8f to6a847abCompare

May 15, 2024 12:19

Copy link

Member

Amanieu commentedMay 16, 2024

This took a while to fully review, but LGTM!

Amanieu merged commit449643f intorust-lang:master

May 16, 2024

Copy link

ContributorAuthor

tgross35 commentedMay 16, 2024

Thanks for taking a look!

tgross35 deleted the f16-f128-intrinsics-min branch

May 16, 2024 17:47

This was referencedMay 17, 2024

Unable to find long double symbols onaarch64-apple-darwin#567

Closed

[PowerPC] SIGILL inPPCTargetLowering forpowerpc-unknown-linux-gnu with pwr9 target featurellvm/llvm-project#92233

Closed

tgross35 mentioned this pull request

Aug 26, 2024

release/19.x: [PowerPC] Respect endianness when bitcasting to fp128 (#95931)llvm/llvm-project#105623

Merged

Labels

None yet

Movatterモバイル変換

Add addition, subtraction, multiplication, and compare operations forf128#606

Add addition, subtraction, multiplication, and compare operations forf128#606

Uh oh!

Conversation

tgross35 commentedMay 10, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

tgross35 commentedMay 11, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

tgross35 commentedMay 11, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

tgross35 commentedMay 12, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

tgross35 commentedMay 12, 2024

Uh oh!

Amanieu commentedMay 13, 2024

Uh oh!

tgross35 commentedMay 14, 2024

Uh oh!

lu-zero commentedMay 14, 2024

Uh oh!

tgross35 commentedMay 14, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

lu-zero commentedMay 14, 2024

Uh oh!

tgross35 commentedMay 14, 2024

Uh oh!

lu-zero commentedMay 14, 2024• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

tgross35 commentedMay 14, 2024

Uh oh!

lu-zero commentedMay 14, 2024

Uh oh!

tgross35 commentedMay 14, 2024

Uh oh!

lu-zero commentedMay 14, 2024

Uh oh!

tgross35 commentedMay 14, 2024

Uh oh!

Amanieu commentedMay 16, 2024

Uh oh!

tgross35 commentedMay 16, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add addition, subtraction, multiplication, and compare operations for`f128`#606

Add addition, subtraction, multiplication, and compare operations for`f128`#606

tgross35 commentedMay 10, 2024•
edited
Loading

tgross35 commentedMay 11, 2024•
edited
Loading

tgross35 commentedMay 11, 2024•
edited
Loading

tgross35 commentedMay 12, 2024•
edited
Loading

tgross35 commentedMay 14, 2024•
edited
Loading

lu-zero commentedMay 14, 2024•
edited
Loading