On SPIR-V, this would decorate affected expressions with NoContraction.
On Metal, this would add "-fno-fast-math" (or a subset of it) to the affected MTLLibrary.
On DX12, this would add precise to the variable declarations used by the affected functions.

Note: SignedZeroInfNanPreserve and other features of VK_KHR_shader_float_controls are intentionally not included.

precise_math attribute on functions

c2e4178

kvark mentioned this pull request

Sep 1, 2021

Add method to disable fast-math on a per-shader basis. #2076

Open

github-actions bot commented Sep 1, 2021

Previews, as seen when this build job started (c2e4178):
WebGPU | IDL
WGSL
Explainer

dneto0 reviewed

Sep 1, 2021

Thanks for taking this stab. It's getting there

wgsl/index.bs Outdated

		<tr><td><dfn noexport dfn-for="attribute">`precise_math`</dfn>
		<td>None

		Indicates that the arithmetic computations in the function need to be performed with

dneto0 Sep 1, 2021

I have trouble with the word "precision" here, because that means "with more bits represented".
(Never mind the "precise" part of the attribute name, inherited from GLSL. It's good to reuse the GLSL word.)

Also, this should be constrained to floating point, I think.

How about:

Indicates that the floating point arithmetic computations in the function should be performed
without [=reassociation/reassociating=] subexpressions
while preserving infinities, NaNs, and signed zeroes

Apply this attribute when the correctness of the function is numerically sensitive, and it is acceptable to incur potential performance loss when forbidding such optimizations.

blah blah blah?

kvark Sep 1, 2021

Thank you!
I took the liberty of modifying this a bit more. Let me know if it needs more fixing!

kvark Sep 1, 2021

I felt that it was important to refer to the floating point evaluation section from here

Apply David's suggestions

9c8c768

kvark requested a review from dneto0

September 1, 2021 20:21

github-actions bot commented Sep 1, 2021

Previews, as seen when this build job started (9c8c768):
WebGPU | IDL
WGSL
Explainer

dneto0 previously approved these changes

Sep 1, 2021

Seems ok to me now.

The key word is "should", instead of "must"

The group should review this.

dneto0 added the wgsl (WebGPU Shading Language Issues) label

Sep 1, 2021

dneto0 added this to the V1.0 milestone

Sep 1, 2021

litherum commented Sep 7, 2021 (edited)

Metal exposes fastmath on the entire module: https://developer.apple.com/documentation/metal/mtlcompileoptions?language=objc. So this is a good idea, but it should be elevated to module-level (either by something at the global scope in the language, or as additional data to createShaderModule()).

litherum commented Sep 7, 2021

The SPIR-V registry says SignedZeroInfNanPreserve is missing before version 1.4. The earliest version of Vulkan to require SPIR-V 1.4 is Vulkan 1.2, which I thought was unavailable on most Android devices. Can we really require it?

kvark commented Sep 7, 2021

I don't think we require this SignedZeroInfNanPreserve. The precise_math is basically a "best effort" attribute. If SPIR-V doesn't support SignedZeroInfNanPreserve, then we don't use it.

So this is a good idea, but it should be elevated to module-level

There is definitely value in having it exposed in a more granular level than the module scope:

- SPIR-V implementations can use it.
- Metal implementations that generate an MTLLibrary at pipeline creation time (wgpu and Dawn, at least) can generate some entry points as precise and others with fast math, if requested by the user.

kdashg commented Sep 8, 2021

WGSL meeting minutes 2021-09-07

DN: Further discussion today: Concerns that it’s not testable. Also concerned that it’s not stable, no way to make sure that (because untestable) it will keep working. One possibility is to make it an extension with actual strict requirements, but without signing us up for these sometimes-impossible strict requirements in core.
MM: In version of spir-v that wgsl is targeting, there’s no way to guarantee/require ieee floats?
DN: OpenCL can, but not spir-v in general.
DM: Spir-v doesn’t support some things, but not everything we need?
DN: NoContraction is visible, feature in vulkan-spirv. Not sure whether the Vulkan conformance tests are strict enough to guarantee what we need. I would need more time to test whether it’s feasible on vulkan.
MM: When this extension is enabled, we can test that the math would be right. Problem is when we don’t have the extension, where we can’t really guarantee anything.
DN: When this was “SHOULD”, it’s easy to try for. Making an extension would require more work to see if we can support “MUST”.
GR: Why was it said that this would have no effect on DX12?
DN: I think the original poster did minor testing and didn’t have issues, so didn’t look into this?
GR: We do have precise which should work, but yeah, not sure how to test non-precise.
DM: precise is applied to variables? (yes)
MM: Another question is, can global variables be precise? This leads me to a recommendation that it be per-module, and also that this is all Metal can handle.
DN: Could (galaxy-brain idea) compile multiple times with and without precise as needed, since we control when entrypoints are used?
MM: Compiling multiple times would be bad, because compiling is already slow.
DM: I think we sort of already handle this (function granularity) in Metal backends.
MM: Why per-function, when no native API does that. Metal is per-module, others are per-variable.
DN: There’s some concerns about how deep (into function calls?) to propagate precise when on variables, at least in a way that’s not super verbose.
(timebox hit, tabled to next meeting)

dneto0 commented Sep 8, 2021

To fill in some detail:

First, @kvark is right about the implications of the best-effort framing.
That said, SignedZeroInfNanPreserve was first made available in the Vulkan extension VK_KHR_shader_float_controls / SPIR-V extension SPV_KHR_float_controls, which has support going back almost three years.
You can use the SPIR-V extension in pre-1.4 SPIR-V modules if you declare the extension (OpExtension "SPV_KHR_float_controls"). Saying the feature was "incorporated into 1.4" means you can use the feature without having to declare the extension.

kvark commented Sep 8, 2021

My reading of the current state of the debate is that we need to decide if this functionality is testable or not. I believe having it testable would make a stronger API, and thus we need to explore this path before proceeding (with this PR as it stands now).

It sounds like DX12 and Metal support this "precise" mode unconditionally, and there is a chance we'll be able to test it. In Vulkan, it's more complicated. As @dneto0 noted, there is an extension. However, one has to check for the properties of this extension before using them: https://vulkan.gpuinfo.org/listpropertiesextensions.php?extension=VK_KHR_shader_float_controls&platform=all
It's concerning to see "shaderSignedZeroInfNanPreserveFloat32" only supported by "46%" of reports.

If we make this an optional feature, we'd deny access to it for users who either don't care about shaderSignedZeroInfNanPreserveFloat32 specifically, or are happy with the Vulkan driver behavior by default. I don't think we want to end up in a situation where people write if features.contains(PreciseMath) || IsVulkan().

dneto0 commented Sep 8, 2021

My reading of the current state of the debate is that we need to decide if this functionality is testable or not.

Agreed. I wasn't sure on the call yesterday, so I investigated what Vulkan does to test NoContraction:

The NoContraction feature has been supported by SPIR-V / Vulkan from the start.
Its test is here

The test tempts the compiler to fuse a multiply-add into one operation (FMA).
FMA is spec'd to produce a rounded result where the intermediate results are computed with infinite precision and accuracy. The test uses sample values that produce catastrophic cancellation. A fused operation would produce a tiny-magnitude number (2**-46), but a non-fused result produces either zero or a small but larger number (2**-24).
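
To make the effect concrete, here is a minimal sketch that emulates float32 arithmetic in Python. It is not the Vulkan conformance test; the helper `f32` and the sample values are assumptions chosen for illustration so that the unfused result is zero and the fused result is 2**-46.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Illustrative inputs (an assumption, not the values used by any real test).
a = f32(1.0 + 2.0**-23)
b = f32(1.0 + 2.0**-23)
c = f32(-(1.0 + 2.0**-22))

# Unfused: the product is rounded to float32 before the addition.
unfused = f32(f32(a * b) + c)

# Fused: one rounding at the end. The product of two float32 values and
# this particular sum are both exact in binary64, so rounding the binary64
# result once to float32 models a correctly rounded FMA for these inputs.
fused = f32(a * b + c)

print(unfused, fused)
```

The exact product is 1 + 2**-22 + 2**-46. Rounding it to float32 drops the 2**-46 term, which is the only thing left after the cancellation with c.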

This depends crucially on the fact that certain basic operations (add, subtract, multiply) are "correctly rounded" (as defined by IEEE 754, and adopted by Vulkan and WGSL).

In general, catastrophic cancellation can be used to magnify errors for other undesirable cases: reassociation, distribution of multiply over addition.

So I think fusing, reassociation, and distribution aspects are testable.
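
A similar sketch for reassociation, again emulating float32 in Python. The helper `f32` and the inputs are assumptions picked for illustration; the point is only that the two groupings of the same sum give different correctly rounded results.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Illustrative inputs (an assumption): a large value, its negation, and 1.
a = f32(2.0**26)
b = f32(-(2.0**26))
c = f32(1.0)

# Source order: (a + b) + c. The large terms cancel first, so c survives.
as_written = f32(f32(a + b) + c)

# Reassociated: a + (b + c). Near 2**26 the float32 spacing is 4, so
# b + c rounds back to b and the contribution of c is lost.
reassociated = f32(a + f32(b + c))

print(as_written, reassociated)
```

A test built this way only needs to compare the shader output against the value for the source order.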

mrshannon commented Sep 8, 2021 (edited)

In an ideal world precise (meaning no fusing, reassociation, or distribution) and the other fast math optimizations would be separate. It seems they can be on DX12 and Vulkan, but as far as I could tell Metal is all or nothing. I think a majority of use cases could be solved with precise alone. So perhaps a lesser feature could be made core where at some level of:

- module
- function
- variable

precise mode could be enabled which would not enable shaderSignedZeroInfNanPreserveFloat32 (as support for that is not great, even on desktops) but would just:

- Use precise on DX12
- Use NoContraction and possibly Invariant on Vulkan
- Use -fno-fast-math on Metal.

We already have the invariant qualifier which maps to precise in HLSL but it can only be used for the built-in position output. Also this maps to Invariant in SPIR-V and not NoContraction while precise in HLSL implies both.

kvark commented Sep 8, 2021

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff. It sounds like you are suggesting to adopt the current PR but cut out everything related to VK_KHR_shader_float_controls, since it's not universally available. This means the Metal compiler wouldn't need "-fno-signed-zeros" for example, and possibly other things. Do I understand your proposal, @mrshannon?

Then we can have an optional feature exposing something that captures VK_KHR_shader_float_controls functionality, as a follow-up.

mrshannon commented Sep 8, 2021 (edited)

Do I understand your proposal, @mrshannon?

Yes, just disable fusing, reassociation, and distribution. With signed zeros and such not universal, and the lack of example code that would be affected by them, I am proposing scaling back to only what precise in HLSL promises, as there is plenty of rendering code in the wild which relies on that.

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

I was not sure if that was kosher since it was not documented in the Metal spec.

Then we can have an optional feature exposing something that captures VK_KHR_shader_float_controls functionality, as a follow-up.

Or you could wait until someone needs it. It's probably a failure of my imagination, but I can't think of a case where asymptotic limits would be of use in rendering.

Remove the nan/zero handling

09e6e9c

kvark commented Sep 8, 2021

The last commit here describes these semantics. I'm sure @dneto0 would want to put in more technical details of what is preserved, adding examples and such, and I'm hoping we can follow up with this.

munrocket commented Sep 8, 2021

Love to see where it is going, thanks @kvark. Floating point expansions definitely do not rely on NaN/Infinity/signed zeros.

litherum commented Sep 9, 2021

From talking with the Metal team, we haven't gotten requests to apply fastMath per function rather than per MTLLibrary.

This makes intuitive sense, because the use cases that need IEEE precision are things like scientific computing, where it's likely that all the functions in the library will need to be precise. Conversely, for use cases like games, it's likely that none of the functions in the library will need to be precise.

(Games do need things like the invariant keyword, but that's a different thing.)

litherum commented Sep 9, 2021 (edited)

Metal is not exactly all or nothing. As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

These things aren't API. Ideally, WebGPU / WGSL wouldn't rely on anything that isn't API in the 3 backend APIs. The API is a single boolean switch.

(Anything that isn't API is unsupported, and able/willing to be removed at any point in the future.)

litherum commented Sep 9, 2021 (edited)

It would be unfortunate to make fastMath a "best effort" attribute.

From an author's perspective, what's the point of a precision guarantee if the guarantee isn't actually guaranteed?

From an implementor's perspective, why would an implementor implement any of the feature at all if it just slows down code and doesn't actually have any expected (testable) behavior? Or, stated a different way: Let's say I want to implement this feature in a particular WebGPU implementation, and I sit down and start typing code into the computer to do it. How do I know when I'm done? Why shouldn't I consider myself to be done implementing the feature before writing a single line of code?

kvark commented Sep 9, 2021

@litherum it sounds like the desire to have this behavior testable is shared between all parties, so it's good to have this settled. The last version of the PR, which I mentioned in #2080 (comment), already makes it normative. It just doesn't spell out the exact norms affecting it, which is intended to be written at some point. So, no "best effort" any more.

As for the scope of the change, I'm curious what use cases there are to consider. From a distance, it felt useful to be able to make, say, vertex shaders precise but not the fragment shaders. Or even just the computation of one specific output of a vertex shader. But I haven't used this myself, so happy to hear ISV feedback!

@mrshannon could you share the intended usage of this attribute? Would you be doing it for the whole module, or potentially more granularly?

mrshannon commented Sep 9, 2021 (edited)

@mrshannon could you share the intended usage of this attribute? Would you be doing it for the whole module, or potentially more granularly?

First, I am specifically talking about precise as it exists in HLSL (rearrangement etc), not signed zero and the rest. We have two use cases:

The first is extremely large scale terrain generation in a compute shader which requires double precision. An existing example of this is Elite Dangerous, which uses real doubles on some cards and emulated doubles (which require precise) on others. Their reason for emulation is that it's faster than the real thing on some cards; our reason is that we don't have real doubles at all. In this case, while there will be calculations in the compute shader which do not require precise, it is likely that at least half of the compute shader module will require it.

Use in the wild: Generating the Universe in Elite Dangerous

The second case is when rendering very large objects (which cannot be handled in other ways). To avoid jitter we sometimes need to perform the model to camera space transform in double precision. Therefore, again, emulated doubles. But in this case the calculation is in the vertex shader and is pretty narrow in scope, as it is just used for the model to world transform, and furthermore is only used on a small subset of vertices (those close to the camera). Therefore it would be undesirable to require precise at the module level, since variable, statement, or function level would allow disabling the optimizations at a narrow scope for a tiny part of the vertex shader, and in our case only on some invocations.

Use in the wild: 3D Engine Design for Virtual Globes

Conversely, for use cases like games, it's likely that none of the functions in the library will need to be precise.
(Games do need things like the invariant keyword, but that's a different thing.)

This is not true, see Generating the Universe in Elite Dangerous. What is required is not IEEE but specifically the guarantees that HLSL gives with its precise decorator, which is more than what invariant guarantees, except on DX12 where invariant maps to precise.

In general there are cases where floating point error needs to be mitigated, even in rendering, which requires controlling the order of operations.

kainino0x commented Sep 10, 2021

As @kainino0x pointed out in #2076 (comment), we can pick a subset of fast-math stuff.

FWIW the flags I pointed to can probably only be used when invoking an MSL compiler via command line, but not via newLibraryWithSource. However I found the associated clang pragmas:
https://clang.llvm.org/docs/LanguageExtensions.html#extensions-to-specify-floating-point-flags
(I haven't tested them, and it's entirely possible they don't actually work because MSL's LLVM backend doesn't understand them.)

Of course @litherum's point that these aren't officially supported still stands.

mrshannon commented Sep 14, 2021 (edited)

Here is an implementation of emulated double in WebGPU that works right now in Chrome/Firefox on a MacBook and a PC with Linux; uncomment the trick with mix if you are testing precise math. https://codepen.io/munrocket/pen/vYZgyqa

@munrocket Not sure that it is working on Windows. Is the top of the fractal supposed to be filled with strange bands?

Also, I'm not sure you need mix; the select, or an if statement, should be enough.

munrocket commented Sep 14, 2021 (edited)

@mrshannon yes, it shows float32 with its limited precision. It’s intentional.

I am starting to think that fast-math is pretty OK even for these purposes, because the Dekker multiplication algorithm becomes roughly 10x smaller (2 FLOP vs 17 FLOP) with a hardware fma instruction. It is implicitly inherited from fma(a,b,c) in the current WebGPU implementations in Chrome/Firefox. Also, with the select trick you could implement NoContraction for Moller/Knuth’s summation, and it will be calculated correctly but a little bit slower. At least on my machines everything works pretty well. 😍

If you are going to expose precise math in this PR, then fma(a, b, c) will become the twice-rounded expression RN(RN(a * b) + c), and you will need to use a slower algorithm. I don’t know whether you could add support for hardware fma in this PR or not. But currently it is a trade-off.

Fast multiplication and slow summation VS slow multiplication and fast summation
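
The trade-off can be sketched with float32 emulated in Python. This is only an illustration under stated assumptions: `f32`, `two_product_fused`, and `two_product_twice_rounded` are hypothetical helper names, and the fused step is modelled by doing the arithmetic exactly in binary64 and rounding once, which is valid for the small inputs used here.

```python
import struct
from fractions import Fraction


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def two_product_fused(a: float, b: float) -> tuple[float, float]:
    """Product plus error term, with the error from one fused step."""
    p = f32(a * b)        # RN(a * b)
    e = f32(a * b - p)    # models fma(a, b, -p): a single rounding
    return p, e


def two_product_twice_rounded(a: float, b: float) -> tuple[float, float]:
    """The same two operations when fma degrades to RN(RN(a * b) + c)."""
    p = f32(a * b)
    e = f32(f32(a * b) - p)   # product rounded before the subtraction
    return p, e


# Illustrative inputs (an assumption), exactly representable in float32.
a = f32(1.0 + 2.0**-12)
b = f32(1.0 + 2.0**-13)

p_fused, e_fused = two_product_fused(a, b)
p_twice, e_twice = two_product_twice_rounded(a, b)

exact = Fraction(a) * Fraction(b)
print(Fraction(p_fused) + Fraction(e_fused) == exact)  # pair is exact
print(e_twice)                                         # error term lost
```

With a real single-rounding fma the two-operation form recovers the rounding error exactly. Once the product is rounded first, the error term collapses to zero, which is why a longer splitting-based multiplication would be needed instead.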

mrshannon commented Sep 14, 2021 (edited)

@munrocket We just tested the select trick in our implementation and it works. Thanks for the idea.

We are likely to use it over this PR (even if it is merged in) as it has better performance on Metal due to not disabling all optimizations and works at the expression and not at the module level.

munrocket commented Sep 15, 2021

Glad to help. The only reason someone would still need to turn off fast-math is if they detect that the host doesn’t support FMA in hardware. After that, fma emulation with the select trick will be painful.

I have removed mix from the NoContraction trick, thanks, because reordering is turned off without it. Also, if you find that some devices do not support this, please share.

kvark commented Sep 15, 2021

Hey users, if you keep finding nice hacks and workarounds for this, we'll have no incentive to do anything with the spec! 🤪

munrocket commented Sep 15, 2021 (edited)

Ha-ha, that was fun.

It's actually a miracle how it works, because the current rounding is UB and not specified, as is the fast-math mode. This PR still has potential. For example, if somebody figures out how to turn on correct math and fused multiply-add at the same time, mrshannon will probably use it.

kdashg commented Sep 15, 2021

WGSL meeting minutes 2021-09-14

DN: Discussing earlier and with MS. Our concern is two things
- Making sure when targeting FXC that the math survives as well as we hope. We’re concerned about stability/reliability here
- The name precise may end up promising more than all underlying platforms can do, so we may want to revisit the name if we can’t guarantee its behavior everywhere.
- Thanks for the feedback about infinities and NaNs not being too important
MS: Requires operations to not be reordered
DN: Ops that are correctly rounded are +/-/*, but division is a harder request. Do you need division? (maybe?)
DM: MM worried about spooky action at a distance. (SAAAD)
KN: If this is defined as best-effort, at least for Metal, I think would prefer to use clang pragmas rather than SAAAD. I’m worried that we wouldn’t want to use all the flags, for perf reasons. Just reordering is less bad.
DN: Rough consensus that we want this to be testable, rather than a pure hint.
DN: I want an investigation to show that assured non-reordering and non-reassociation are both implementable. On DX11 (FXC), DX12, and Vulkan (desktop and mobile).
DM: Sounds like something we need to figure out before v1.
AB: Why?
DM: We see real issues, MS and users of MoltenVk exist and they need this.
JG: Is this a need that e.g. webgl already had?
MS: We’re going from desktop directly to WebGPU. There’s extant code for this. Without this, we would need to do vert shading on CPU. We do that today, but we’re expecting to want to change this. We’d be pretty sad to not have this.

dneto0 commented Sep 15, 2021

Hey @munrocket thanks for this technique!

And thank you also for a nice compact test case. We had been discussing the need for a good way to test the behaviour.

Some thoughts:

- Will this continue to work with future implementations? I think basically yes. The select introduces a data dependency and possibly a control dependency on the test condition. The compiler must not be able to know the value of that condition (statically). The safest thing (from the programmer's perspective) is to pass in a known-to-be-zero value from the outside (a parameter buffer), and compare a value against that. This is what GLSL fuzz does (blog, paper).
- What's the performance cost? I think probably low? That assumes a few things: (a) we care most about throughput, (b) the test condition is cheap (e.g. compare against opaque zero), (c) the implementation evaluates both options and uses predicated execution, and at least one side is cheap. Then I would guess the additional costs are small, so performance is probably going to be pretty good.

So two thumbs up for this technique!

dneto0 commented Sep 15, 2021

Hey users, if you keep finding nice hacks and workarounds for this, we'll have no incentive to do anything with the spec!

It's a feature, not a bug. :-)
And this is why open processes can be so great.

dneto0 commented Sep 15, 2021

Another thing about the performance cost: Yes, this prevents the compiler from rearranging code to go faster, but that's exactly what the programmer wanted.

munrocket commented Sep 15, 2021

Will this continue to work with future implementations?

It works with round-to-nearest-even floating point rounding, which is usually the default, but is not specified for some reason in DX11, for example. Also, as mentioned: floating point arithmetic is not associative, and muladd should be allowed only for fma.

What's the performance cost?

Usually, emulated double addition takes 20 flop, and multiplication takes 24 flop with software FMA or 9 flop with hardware. So it's cheap. When we are using select I don't know, but it is possible to measure. Probably select is not so perfect; is it branching?

This is what GLSL fuzz does, papers

Interesting, if we need stronger confidence, we can pass a variable there.

dneto0 commented Sep 15, 2021

About the performance cost, I meant the additional performance cost of using the select. Thanks for the extra info for the cost of the double precision emulation. :-)

Right, rounding mode is not specified for graphics APIs because some devices use round-to-even, some use round-to-zero (which is cheaper in hardware).

Does select do branching? It is common for GPUs to use predicated execution: they execute both paths, but selectively turn off side effects of the path "not taken", and then only use the chosen result (Wikipedia). This trades off possibly wasting cycles stepping through the dead code path, but saves the machine from taking a branch and destroying internal state.

So that's why I would hope to make the evaluation of the "other" path and the condition cheap: we want that so on a predicated execution they don't waste too much extra time.

litherum commented Sep 21, 2021 (edited)

@mrshannon

The first is extremely large scale terrain generation in a compute shader which requires double precision.

The second case is when rendering very large objects ... in this case the calculation is in the vertex shader and is pretty narrow in scope

Both of these use cases are supported by putting the precise attribute on the entire module rather than the individual function - just put these two entry points and their dependent functionality in a separate module. Linking a vertex shader from one module and a fragment shader from a different module is supported.

kvark commented Sep 21, 2021

If I understand correctly, we are mostly fine with introducing a precise_math attribute (in a way that we can test), we just can't agree on what scope it covers:

function scope (the current shape of this PR):
- This is nice for SPIR-V and HLSL.
- On Metal, it has a SAaaD effect: adding an attribute to one function can end up affecting other functions. On wgpu and Dawn, it would affect all the functions in the call graph of a specific entry point (if one of them has the attribute). On Safari's implementation, it would affect all the functions in the module.
entry point scope:
- This is nice for Metal via wgpu or Dawn, since they build an MTLLibrary per entry point.
- Has SAaaD on SPIR-V and HLSL, since using a function from another entry point (which has the attribute enabled) would make it slower for other entry points using this function.
- Has SAaaD on Metal in Safari, since one entry point will affect the others.
module scope:
- Can be mapped to all of the APIs
- less optimal for SPIR-V and HLSL
- No SAaaD

dneto0 commented Sep 21, 2021

The user has a reasonable workaround, and it appears to be performant and likely stable over time. I thought this was an easy "not in V1.0" decision.

kvark commented Sep 21, 2021

I'm not happy about this workaround becoming sort of a tribal knowledge thing.
If we consider it good that people do this, can we hide the workaround behind something like:

fn compute(val: T) -> T

So doing let x = compute(a * b) + c would effectively put the NoContraction attribute on the intermediate result (and precise in HLSL). On Metal, it could use the select trick internally.

mrshannon commented Sep 21, 2021

Both of these use cases are supported by putting the precise attribute on the entire module rather than the individual function - just put these two entry points and their dependent functionality in a separate module. Linking a vertex shader from one module and a fragment shader from a different module is supported.

It would be wasteful in the 2nd case. Perhaps as much as 10% of vertices (depending on camera location) in any given object need emulated double vertex position. The rest can take the faster 32-bit float path as they are further from the camera.

mrshannon commented Sep 21, 2021

So doing let x = compute(a * b) + c would effectively put the NoContraction attribute on the intermediate result (and precise in HLSL). On Metal, it could use the select trick internally.

Either this or actual function scope (including Metal) would keep us from using the select trick. I agree with the tribal knowledge issue but I am not going to severely harm performance to avoid it. Not sure compute is the right term but I can't think of anything better at the moment.

kvark commented Sep 28, 2021

@litherum it looks like MSL supports [[clang::optnone]] on functions - KhronosGroup/SPIRV-Cross#1746. We could consider it as a direct effect of [[precise_math]] in WGSL.

kainino0x commented Sep 28, 2021

[[clang::optnone]] seems like far too heavy of a hammer to me.

The optnone attribute suppresses essentially all optimizations on a function or method, regardless of the optimization level applied to the compilation unit as a whole. This is particularly useful when you need to debug a particular function, but it is infeasible to build the entire application without optimization. Avoiding optimization on the specified function can improve the quality of the debugging information for that function.

kdashg modified the milestones: V1.0, post-V1

Sep 28, 2021

kdashg commented Sep 29, 2021

WGSL meeting minutes 2021-09-28

(Previously: MM: Let’s postpone.)
DM: Offline, discussed MSL’s clang::optnone, which is on function scope. Kai noted it’s a big hammer and may not do what’s wanted.
MM: I thought we postponed this until after MVP.
DM: New information came up.
MM: Would like to re-propose postponing until after MVP. That attribute is not part of the Metal API. So you can’t rely on it. Don’t think it’s a good solution.
DM: Other idea is to expose a function that shields optimizations across its argument vs. its result. Think we can implement it well on all backends. (NoContraction on SPIR-V, or select trick as discussed on the issue.)
MM: Not familiar with the technique, and I didn’t prepare; think it was a mistake to put this on the agenda.
DN: Also think this can be postponed until after MVP. Have workaround, even if it’s in the “lore” category.
JG: Will mark as milestone Post-V1.

kdashg mentioned this pull request

Apr 26, 2022

Expose fast-math #2077

Closed

dneto0 dismissed their stale review

April 13, 2023 20:31

revoking my own review. Let's reconsider with fresh eyes

kdashg mentioned this pull request

Nov 28, 2023

Missing intrinsic functions like rcp and rsqrt that are useful in other shading languages #4092

Open

rconde01 mentioned this pull request

Apr 7, 2024

Do we need a high-precision vs speed option? #4562

Closed

Copy link

Contributor

greggman commentedApr 7, 2024

I'm not sure this idea appeared above, but what about a module-level flag that only works if a feature like "high-precision" exists? You check whether the adapter supports "high-precision". If it does, you request a device with {requiredFeatures: ['high-precision']}. Now you can pass 'high-precision' to createShaderModule.

This way, if a GPU/driver can't pass the high-precision CTS tests, it doesn't advertise the feature.

If you don't like features bleeding into WGSL, you could move the check into pipeline creation: you use the precision keywords/options in WGSL, but when you try to make a pipeline without having requested the 'high-precision' feature, you get an error that your shader isn't supported on this device.
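
The feature check described above could be sketched roughly as follows. This is only an illustration: 'high-precision' is the hypothetical feature name from this comment and is not a WebGPU feature, and `AdapterLike` and `deviceDescriptorFor` are made-up names standing in for the relevant part of a real adapter.

```typescript
// Hypothetical feature name taken from the comment above; it is NOT part of WebGPU.
const HIGH_PRECISION = "high-precision";

// Minimal structural stand-in for the part of an adapter used here
// (a real adapter exposes a set-like `features` object of feature names).
interface AdapterLike {
  features: { has(name: string): boolean };
}

// Builds the descriptor that would be handed to requestDevice(), asking for the
// hypothetical feature only when the adapter advertises it.
function deviceDescriptorFor(adapter: AdapterLike): { requiredFeatures: string[] } {
  const requiredFeatures = adapter.features.has(HIGH_PRECISION) ? [HIGH_PRECISION] : [];
  return { requiredFeatures };
}
```

With a real adapter the result would be passed to `adapter.requestDevice(...)`. How the flag would then reach createShaderModule is left open here, since the comment does not specify it.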

TimTheBig commented Oct 26, 2025

Is there any way I can get this moving again?

TimTheBig reviewed

Oct 26, 2025

wgsl/index.bs

		* without [=Reassociation\|reassociating=] subexpressions

		Note: this translates to `NoContraction` decoration in SPIR-V, `precise` qualifier in HLSL,
		and a subset of `"-fno-fast-math" group of compile options in MSL.

TimTheBig Oct 26, 2025

    and a subset of `"-fno-fast-math" group of compile options in MSL.

Should the subset not be documented?
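
For context on why the quoted text forbids reassociating subexpressions, here is a small illustration that floating-point addition is not associative. It is only a sketch: it uses TypeScript's 64-bit numbers rather than WGSL's f32, and the values are chosen purely for demonstration.

```typescript
// Floating-point addition is not associative, so regrouping changes the result.
const a = 1e20;
const b = -1e20;
const c = 1;

// Evaluated as written: a and b cancel exactly, then c is added.
const leftToRight = (a + b) + c;

// Reassociated: c is too small to change b, so it is lost to rounding before a and b cancel.
const reassociated = a + (b + c);

console.log(leftToRight, reassociated);
```

An optimizer that is free to regroup the expression may produce either value, which is the behaviour the attribute is meant to rule out.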

Labels

wgsl

WebGPU Shading Language Issues

precise_math attribute on functions #2080
