Specify immediate data API and WGSL <immediate> address space #5423


Draft
shaoboyan091 wants to merge 13 commits into gpuweb:main from shaoboyan091:gpu-spec

Conversation

@shaoboyan091
Contributor

Landing the immediate data API specification into the WebGPU spec. Ref to proposal: immediate-data.md.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for immediate data in WebGPU pipelines, allowing small amounts of data to be passed directly to shaders without requiring buffer bindings. This is useful for frequently changing small data like transformation matrices.

  • Introduces maxImmediateSize limit (64 bytes) for immediate data ranges
  • Adds immediateSize parameter to pipeline layouts and validation for immediate data usage
  • Implements setImmediateData() method for setting immediate data in encoders


@jimblandy
Contributor

minutes from API committee meeting 2025-10-29
  • KN: Everything should be more or less done in the proposal. Is there anything else people know they want fixed before it moves forward? Shaobo has begun work on the spec PR.
  • JB: Does the proposal include the word-granularity initialization checks that we agreed to in Toronto?
  • KN: Yes
  • JB: Okay, cool. Then I’m not aware of any other concerns that would block drafting a PR against the spec.

spec/index.bs Outdated
::
The current dynamic offsets for each {{GPUBindingCommandsMixin/[[bind_groups]]}} entry.

: <dfn>\[[immediate_data]]</dfn>, of type [=ordered map=]&lt;{{GPUSize32}}, [=list=]&lt;[=byte=]&gt;&gt;, initially empty
Contributor

nit: The array of arrays of bytes is a little confusing to read, I think it would be easier to describe if we tracked the data and the initialization-state separately. Data as an array of N bytes, initialization state as an array of N/4 booleans.

jimblandy reacted with thumbs up emoji
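
The split suggested above (N data bytes plus N/4 initialization flags) can be sketched as a small state model. This is only an illustration, not spec text: the class name `ImmediateState`, the method `missingSlots`, and the requirement that offsets and sizes are multiples of 4 are assumptions made for the example.

```typescript
// Hypothetical model of the encoder state discussed above.
// All names here are illustrative and do not come from the spec.
const SLOT_BYTES = 4;

class ImmediateState {
  readonly data: Uint8Array; // N bytes of immediate data
  readonly isSet: boolean[]; // N/4 flags, one per 32-bit slot

  constructor(maxImmediateSize: number) {
    if (maxImmediateSize % SLOT_BYTES !== 0) {
      throw new RangeError("maxImmediateSize must be a multiple of 4");
    }
    this.data = new Uint8Array(maxImmediateSize); // zero-filled
    this.isSet = [];
    for (let i = 0; i < maxImmediateSize / SLOT_BYTES; i++) {
      this.isSet.push(false);
    }
  }

  // Copy `bytes` to `rangeOffset` and mark every covered slot as set.
  // Assumption: offset and size are multiples of 4.
  setImmediates(rangeOffset: number, bytes: Uint8Array): void {
    if (rangeOffset % SLOT_BYTES !== 0 || bytes.length % SLOT_BYTES !== 0) {
      throw new RangeError("offset and size must be multiples of 4");
    }
    if (rangeOffset + bytes.length > this.data.length) {
      throw new RangeError("write exceeds maxImmediateSize");
    }
    this.data.set(bytes, rangeOffset);
    const first = rangeOffset / SLOT_BYTES;
    for (let i = 0; i < bytes.length / SLOT_BYTES; i++) {
      this.isSet[first + i] = true;
    }
  }

  // Slots the shader can access that have not been written yet.
  // A draw or dispatch would fail validation if this list is non-empty.
  missingSlots(accessibleSlots: number[]): number[] {
    return accessibleSlots.filter((slot) => !this.isSet[slot]);
  }
}
```

With this shape, validation only needs the shader's set of accessible slot indices and never has to inspect the data bytes themselves.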
- |pipeline| must not be `null`.
- Let |pipelineLayout| be |pipeline|.{{GPUPipelineBase/[[layout]]}}.
- Let |immediateSize| be |pipelineLayout|.{{GPUPipelineLayout/[[immediateSize]]}}.
- For each immediate data variable that is [=statically used=] by any entry point in |pipeline|:
Contributor

When I read this I think "immediate data variable" means one var<immediate>, of which there is of course only one (at most). But that var itself has the SizeOf of the entire struct, whereas we actually want only accessible parts of the struct to need to be initialized. That is going to require the WGSL spec to "reflect" some new information for us - the map of slot -> is_accessible - for us to validate against.

I would recommend combining this PR with the WGSL PR, since there will be new integration between the specs.

Contributor, Author

OK, let me do this and abandon the WGSL PR.

Contributor

Unfortunately the WGSL spec doesn't really define "padding" or "contains actual data". It probably should, but that's probably too much to ask to fix in this PR.

I think the text in this PR is good enough for all but the most adversarial reader. I recommend a followup PR (by the WGSL editors) to more carefully define padding and non-padding (or whatever we call it).

Contributor

API spec still needs to be updated to use AccessibleBytes?

wgsl/index.bs Outdated
<td>Invocations in the same [=shader stage=]
<td>[=access/read=]
<td>For [=immediate data=] variables.<br>
[=type/concrete|Concrete=] [=constructible=] [=host-shareable=] types, excluding arrays and structures containing array members.<br>
Contributor

Copying Alan's comment here:
#5424 (comment)

Contributor

I agree with Alan here, the "Concrete ..." part is redundant, and should be removed.

@kainino0x added the wgsl (WebGPU Shading Language Issues) and api (WebGPU API) labels on Oct 31, 2025
@kainino0x changed the title from "Specify immediate data API in the WebGPU spec" to "Specify immediate data API and WGSL" on Oct 31, 2025
@kainino0x changed the title from "Specify immediate data API and WGSL" to "Specify immediate data API and WGSL <immediate> address space" on Oct 31, 2025
Contributor

@alan-baker left a comment

Copying my other requests from #5424

There ought to be either a language feature or enable extension as part of this specification.

Additionally, there need to be some updates to the Resource Interface and Resource Layout Compatibility sections in the Entry Points chapter.

dneto0 reacted with thumbs up emoji
spec/index.bs Outdated
GPUSize64 dynamicOffsetsDataStart,
GPUSize32 dynamicOffsetsDataLength);

undefined setImmediateData(GPUSize32 rangeOffset, AllowSharedBufferSource data,
Contributor

We didn't really discuss this in the group, but do you have any opinion about setImmediates instead? This is to try to improve the ergonomics very slightly but might be a bit confusing to developers given that an ArrayBuffer is passed.

mwyrzykowski reacted with thumbs up emoji
Contributor, Author

I'm OK with setImmediates (TBH, I'm OK with any simplified name), but it would be better to go through the community meeting.

mwyrzykowski reacted with thumbs up emoji


setImmediates sounds good 👍

@jimblandy moved this to Current agenda in WGSL on Nov 3, 2025
@jimblandy added this to the Milestone 2 milestone on Nov 3, 2025
Contributor

@dneto0 left a comment

I agree with other review comments. Once those are resolved I'm happy with this landing.


@mwyrzykowski commented on Nov 4, 2025 (edited)

> Copying my other requests from #5424
>
> There ought to be either a language feature or enable extension as part of this specification.

Would prefer not using an enable extension since this is supported or can be easily emulated by all core devices (and compat too I think).

I.e., WGSL language extension would be my preference.

dneto0 reacted with thumbs up emoji

@mwyrzykowski

I can't make the meeting today but this is 👍 from Apple's side. Metal doesn't have the concept of immediates and there was some discussion about this taking up a buffer slot. If the UA is using argument buffers then it would not, since a slot is already needed for dynamic data and you can reuse / share the same slot for immediates. Basically any transient data (dynamic offsets, immediates, other constants) can be passed in a single call to set[Vertex/Fragment]Bytes.

But we are also fine with this taking up a slot. This allows UAs to not have to migrate to ABs for everything for their Metal backends. I suppose it's a tradeoff, but either way is 👍 from Apple's side given our AB implementation is not really impacted either way.

shaoboyan091 reacted with thumbs up emoji

@jimblandy
Contributor

minutes from WGSL committee meeting 2025-11-4
  • JB: PR for specifying immediate data. Understanding is this is a popular thing in graphics APIs. Think a lot of graphics folks would be happy if it was in the API and shipped. So, looks like most of the conversation has been Kai.
  • DN: This is a way of getting constant data into shader that is uniform. Popular because it's fast, fewer indirections. Because it's fast, it's tiny. Feature allows a single variable in the "immediate" address space which is declared at module scope, it's uniform. From a WGSL perspective, slides in with other address spaces. Most restrictions are API side. Make sure data gets in (API call), part of the binding pipeline descriptor to say space size. Validation for space size on API side covers the variable in the WGSL shader. Single immediate space of memory used for all the shaders in a given pipeline (unified model). There is some validation about which 32-bit words have to be set in order to supply data. Don't have to write padding. Discussion about the imprecision of padding bytes. We don't define padding or the inverse in a structure. Kind of implied. We could/recommend cleaning that up after this lands. Otherwise, editorial feedback. TLDR; fast uniform data access in shader.
  • JB: Restriction against arrays, no arrays in the value, not even in struct. Is the reason that the platform level instructions for accessing don't allow dynamic offsets?
  • AB: Vulkan has a restriction that it's dynamically uniform
  • JB: Do mention the WGSL out of bounds rules, if there are no arrays then that only applies to vectors and matrices. So, guess what's going on is the WebGPU backend needs to lift dynamic indexing of vectors and matrices in the immediate data into something. Lifting to a variable and indexing into that or something
  • DN: That sounds right. Saw out of bounds doesn't really apply but you're right vector and matrices does exercise that.
  • JB: Otherwise, editorial comments. Only remaining interesting thing is question of render bundles. Way it's specified that after bundle runs the immediate data is considered uninitialized. Chose that because it's the most restrictive and easiest to implement. If decide later due to render bundle transparency can consider relaxing this restriction. Want to start most restrictive and allow more over time. What is the status of this for CTS implementation and Dawn implementation?
  • DS: Think the implementation is done, modulo bug fixes. We use this mechanism to pass side channel data for Dawn. Don’t think CTS is started.
  • JB: Precedent I have in mind, when talking about subgroup_id, we said it looks good, and is accepted but wait for CTS before merging to main. Think the same reasoning applies here. Want to make sure we have CTS passing and the benefit from CTS before going ahead and shipping. MW has comment would prefer not using enable as it can be supported or emulated everywhere.
  • DN: On WGSL side because it's a new feature we want enable or feature so we can write skipping conformance tests. Agree with MW that this should be everywhere, question in my mind is emulation steals uniform buffer slot, do we worry or need wording in API that the limits possibly get eaten into. Not a language concern, but an API concern.
  • JB: So as far as WGSL is concerned … ok. How does this pr phrase it now, just no extension at all? Ok, seems like a good thing to fix.
  • MW: Perhaps an issue for API meeting. Seems the only backend which doesn't support it is metal and there are ways to do it without taking a buffer slot (probably outside the scope of this meeting)
  • JB: Then not a lot of discussion for this meeting. So, resolved on language extension?
  • MW: <thumbs up>
  • DN: That's my preference
  • JB: Mozilla is thumbs up.
  • RESOLUTION: Make it a language extension, PR is good, needs CTS before landing PR

@jimblandy moved this from Current agenda to Waiting on CTS in WGSL on Nov 6, 2025
wgsl/index.bs Outdated
The set of accessible slots is computed as follows:

<blockquote algorithm="accessible slots">
<dfn export>AccessibleSlots</dfn>(|T|) computes a [=set=] of slot indices (where each slot is 4 bytes):
Contributor

As far as I can tell there could be nested structs. If we want to go with this algorithm, maybe we should instead recursively find all scalar and vector types and require the range [offset, offset + size) be set in the API?

As it stands for the structure cases I could have:

struct S {
  a : vec4u,
  b : u32
}
struct T {
  x : S,
  y : u32,
}
var<immediate> imm : T;

Since the align of S is 16, SizeOf(S) is 32. That means some padding bytes are required to be set.

The spec uses, but does not define padding bytes. I wonder if it would be better to just define padding bytes and use that here.
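
The sizes quoted in this comment can be checked with a small layout calculator. The sketch below is only an illustration: it models 32-bit scalars, vectors, and nested structs under the usual WGSL alignment rules, and the names `Ty`, `layout`, `alignOf`, and `sizeOf` are made up for the example.

```typescript
// Illustrative layout calculator covering only the types in the example.
// Assumes 32-bit scalars; all names are hypothetical.
type Ty =
  | { kind: "scalar" }
  | { kind: "vec"; n: 2 | 3 | 4 }
  | { kind: "struct"; members: Ty[] };

// roundUp(k, n): smallest multiple of k that is >= n.
function roundUp(k: number, n: number): number {
  return Math.ceil(n / k) * k;
}

function alignOf(t: Ty): number {
  switch (t.kind) {
    case "scalar":
      return 4;
    case "vec":
      return t.n === 2 ? 8 : 16;
    case "struct":
      // A struct is aligned to its most strictly aligned member.
      return t.members.reduce((a, m) => Math.max(a, alignOf(m)), 1);
  }
}

// Member offsets plus the first byte past the last member.
function layout(members: Ty[]): { offsets: number[]; end: number } {
  const offsets: number[] = [];
  let end = 0;
  for (const m of members) {
    const offset = roundUp(alignOf(m), end);
    offsets.push(offset);
    end = offset + sizeOf(m);
  }
  return { offsets, end };
}

function sizeOf(t: Ty): number {
  switch (t.kind) {
    case "scalar":
      return 4;
    case "vec":
      return 4 * t.n;
    case "struct":
      // Struct size is rounded up to the struct's own alignment.
      return roundUp(alignOf(t), layout(t.members).end);
  }
}

const u32: Ty = { kind: "scalar" };
const vec4u: Ty = { kind: "vec", n: 4 };
const S: Ty = { kind: "struct", members: [vec4u, u32] };
const T: Ty = { kind: "struct", members: [S, u32] };

console.log(sizeOf(S), sizeOf(T), layout([S, u32]).offsets);
```

Under these rules S occupies 32 bytes although only bytes 0 to 19 hold data, and T places y at offset 32 for a total size of 48, so a per-member rule based on SizeOf would require bytes 20 to 31 to be set even though they are padding.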

@Kangz
Contributor

> There ought to be either a language feature or enable extension as part of this specification.

@amaiorano is implementing this speculatively in Dawn; we are going with immediate_address_space. What do folks think of this name?

@alan-baker
Contributor

> There ought to be either a language feature or enable extension as part of this specification.
>
> @amaiorano is implementing this speculatively in Dawn, we are going with immediate_address_space, what do folks think of this name?

That works, but I'd also be OK with just immediates.

@shaoboyan091
Contributor, Author

Rebased and iterated on the PR. The new changes:

  • Added wording to the WGSL resource section.
  • Instead of adding "padding", defined "AccessibleSlots" in the WGSL spec.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.



Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



Comment on lines +11289 to +11293
: <dfn>\[[immediate_data]]</dfn>, of type [=byte sequence=], initially empty
::
The current immediate data bytes.
The length equals the device's {{supported limits/maxImmediateSize}}.
Values are set by {{GPUBindingCommandsMixin/setImmediates()}}.
Copilot AI, Dec 1, 2025

The [[immediate_data]] internal slot is described as "initially empty" but also states "The length equals the device's {{supported limits/maxImmediateSize}}." These two statements are contradictory. It should be initialized to a byte sequence of length maxImmediateSize with all bytes set to a default value (e.g., 0), not "initially empty".

Contributor

+1

Comment on lines +11295 to +11299
: <dfn>\[[immediate_data_set]]</dfn>, of type [=list=]&lt;{{boolean}}&gt;, initially empty
::
Tracks which 32-bit word slots of immediate data have been set.
The length equals the device's {{supported limits/maxImmediateSize}} divided by 4.
Each entry corresponds to a 4-byte slot and is initially `false`.
Copilot AI, Dec 1, 2025

The [[immediate_data_set]] internal slot is described as "initially empty" but also states "The length equals the device's {{supported limits/maxImmediateSize}} divided by 4." These two statements are contradictory. It should be initialized to a list of length maxImmediateSize / 4 with all entries set to false, not "initially empty".

Contributor

+1

shaoboyan091 added a commit to shaoboyan091/types that referenced this pull request on Dec 2, 2025
Kangz pushed a commit to gpuweb/types that referenced this pull request on Dec 2, 2025
wgsl/index.bs Outdated
Resources are shared by all invocations of the shader.

- There are four kinds of resources:
+ There are five kinds of resources:
Contributor

These edits look inconsistent with Shader Interfaces now. Also the carve-outs for immediate data aren't great with the rest of the spec. I wonder if we should instead introduce a new interface for immediates. That might make all this cleaner. Thoughts, @dneto0?

Contributor

Yes, I think so.

There is a strong association with the pipeline layout rules, and those are all about things that have binding points.

Immediates are more like overrides, and override-declarations are already separately called out in the Shader Interface (section 13.3).

Also, override-declarations are called out in footnote 4 of the table in section 7, Variable and value declarations.

Tomorrow I'll try to make a patch to this PR to do this.

Contributor

I've reworked it to make immediate data variables their own aspect of the shader interface, not part of the resource interface.
PTAL @alan-baker

Contributor

@dneto0 left a comment

This should add "immediate variable declaration, if any" to the list at the top of 13.3 Shader Interface, and to the bulleted list that defines 'interface of a shader' (add a bullet similar to override-declarations).

Then I think we don't have to touch the "resource interface" section at all (i.e. no additions for immediates).


|data|: Data to write into the immediate data range.
|dataOffset|: Offset into |data| to begin writing from. Given in elements if
|data| is a {{TypedArray}} and bytes otherwise. Defaults to 0.
|size|: Size of content to write from |data|. Given in elements if
Contributor

nit: Maybe this should be called dataSize, so it's clearly associated with the type of data (not specified in bytes). We can change writeBuffer as well.
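
To make the unit question concrete, here is a minimal sketch of how a (data, dataOffset, size) triple maps to a byte range under the "elements if TypedArray, bytes otherwise" convention quoted above. The helper name `resolveByteRange` and its return shape are assumptions for illustration, and SharedArrayBuffer is left out for brevity.

```typescript
// Hypothetical helper, not part of the API: converts (data, dataOffset, size)
// into a byte range relative to the start of `data`.
type Source = ArrayBuffer | ArrayBufferView;

function resolveByteRange(
  data: Source,
  dataOffset: number = 0,
  size?: number,
): { byteOffset: number; byteLength: number } {
  // Typed arrays are measured in elements; ArrayBuffer and DataView in bytes.
  const isTypedArray = ArrayBuffer.isView(data) && !(data instanceof DataView);
  const bytesPerElement = isTypedArray
    ? (data as unknown as { BYTES_PER_ELEMENT: number }).BYTES_PER_ELEMENT
    : 1;

  const totalBytes = data.byteLength;
  const byteOffset = dataOffset * bytesPerElement;
  // When size is omitted, take everything from dataOffset to the end.
  const byteLength =
    size === undefined ? totalBytes - byteOffset : size * bytesPerElement;

  if (byteLength < 0 || byteOffset + byteLength > totalBytes) {
    throw new RangeError("requested range exceeds the source data");
  }
  return { byteOffset, byteLength };
}

// Elements for a Float32Array: 4 elements in, 8 elements long.
console.log(resolveByteRange(new Float32Array(16), 4, 8));
// Bytes for a plain ArrayBuffer: 4 bytes in, to the end.
console.log(resolveByteRange(new ArrayBuffer(64), 4));
```

The same call shape therefore yields different byte counts depending on the argument type, which is the ambiguity a name like dataSize is meant to flag.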

@dneto0
Contributor

WGSL part LGTM now

@kainino0x marked this pull request as draft on December 16, 2025 04:57
@kainino0x
Contributor

Marking draft just since we said we would finish CTS before landing this.

* <dfn noexport id="accessible-bytes">AccessibleBytes</dfn>(|T|) is the set of byte offsets within an instance of type |T| that contain data.
* If |T| is a scalar or vector, then [=AccessibleBytes=](|T|) is the set of integers `k` such that `0 <= k <` [=SizeOf=](|T|).
* If |T| is a matrix with |C| columns and |R| rows, then [=AccessibleBytes=](|T|) is the union of sets `{ k + i * Stride | k in AccessibleBytes(vec|R|) }` for `i` in `0..C-1`, where `Stride` is [=roundUp=]([=AlignOf=](vec|R|), [=SizeOf=](vec|R|)).
* If |T| is a structure |S|, then [=AccessibleBytes=](|T|) is the union of sets `{ k + Offset | k in AccessibleBytes(M_i) }` for each member `M_i` at offset `Offset` = [=OffsetOfMember=](|S|, `i`).
Member

These are very hard to decipher. I think it might be better to write it out as something more than the set notation (I didn't realize the | in `k + Offset | k in ..` was doing set-builder notation and wasn't or'ing together two things).

Maybe something like:

If |T| is a structure |S|, then [=AccessibleBytes=](|T|) is the set calculated as:
  * for each member `i` in S
    * for each k in AccessibleBytes(`i`)
      * Then the set contains k + `Offset`, where `Offset` = [=OffsetOfMember=](|S|, `i`)

A similar reformatting for matrix could also be helpful:

If |T| is a matrix with |C| columns and |R| rows, then [=AccessibleBytes=](|T|) is the set calculated as:
  * for each `i` in 0..C-1
    * for each k in AccessibleBytes(vec|R|)
      * Then the set contains k + i * `Stride`, where `Stride` is [=roundUp=]([=AlignOf=](vec|R|), [=SizeOf=](vec|R|))
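
Written as nested loops, the definition reads almost like code. The following is a sketch under stated assumptions, not the spec's algorithm: it handles only 32-bit scalar components, takes struct member offsets as given inputs (standing in for OffsetOfMember), and uses made-up names `Ty` and `accessibleBytes`.

```typescript
// Illustrative AccessibleBytes over a tiny type model.
// Assumes 32-bit components; struct offsets are supplied by the caller.
type Ty =
  | { kind: "scalar" }
  | { kind: "vec"; n: 2 | 3 | 4 }
  | { kind: "mat"; cols: 2 | 3 | 4; rows: 2 | 3 | 4 }
  | { kind: "struct"; members: { offset: number; type: Ty }[] };

function accessibleBytes(t: Ty): number[] {
  const out: number[] = [];
  switch (t.kind) {
    case "scalar":
    case "vec": {
      // Every byte in [0, SizeOf(T)) holds data.
      const size = t.kind === "scalar" ? 4 : 4 * t.n;
      for (let k = 0; k < size; k++) out.push(k);
      break;
    }
    case "mat": {
      // Each column is a vecR placed at i * Stride.
      const colSize = 4 * t.rows;
      const colAlign = t.rows === 2 ? 8 : 16;
      const stride = Math.ceil(colSize / colAlign) * colAlign;
      for (let i = 0; i < t.cols; i++) {
        for (let k = 0; k < colSize; k++) out.push(k + i * stride);
      }
      break;
    }
    case "struct":
      // Shift each member's accessible bytes by the member's offset.
      for (const m of t.members) {
        for (const k of accessibleBytes(m.type)) out.push(k + m.offset);
      }
      break;
  }
  return out;
}

// mat3x3: three 12-byte columns at a 16-byte stride.
const mat3x3: Ty = { kind: "mat", cols: 3, rows: 3 };
console.log(accessibleBytes(mat3x3).length);
```

For a 3x3 matrix this skips bytes 12 to 15 of each 16-byte column, which is exactly the padding that callers should not be required to set.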


Reviewers

@dj2 left review comments
@Kangz left review comments
@kainino0x left review comments
@dneto0 left review comments
Copilot code review: Copilot left review comments
@alan-baker approved these changes
@mwyrzykowski approved these changes

Assignees

No one assigned

Labels

api (WebGPU API), wgsl (WebGPU Shading Language Issues)

Projects

Status: Waiting on CTS

Milestone

Milestone 2


8 participants

@shaoboyan091 @jimblandy @mwyrzykowski @Kangz @alan-baker @dneto0 @kainino0x @dj2
