Introduces a new utility function, `MatchPermutedSliceAndPartitionOffset`, to detect a pattern where a `DynamicSlice` consumes the output of an `AllGather` with a permuted set of offsets. This pattern is equivalent to a `CollectivePermute` and can be optimized accordingly. #97189
Draft
copybara-service wants to merge 1 commit into master from exported_pr_783030292
+1,380 −1,012
d2b7e75 to fc555ad

…set\`, to detect a pattern where a `DynamicSlice` consumes the output of an `AllGather` with a permuted set of offsets. This pattern is equivalent to a `CollectivePermute` and can be optimized accordingly.

The logic is divided into four main sections:

1. **Initial Checks:** Verifies that the `AllGather` is a suitable candidate for this optimization. It ensures the operation is performed across multiple partitions with a single replica and uses flattened IDs (i.e., `use_global_device_ids` is enabled).
2. **Locate DynamicSlice:** Finds the `DynamicSlice` user of the `AllGather`, correctly traversing through any intervening `Reshape` or `Bitcast` operations that do not alter the data.
3. **Extract Offset Specifications:** Derives the memory offset-to-partition ID mappings from both the `AllGather` (the source layout) and the `DynamicSlice` (the destination/permuted layout).
4. **Match Specifications:** Compares the source and destination offset maps to derive the permutation. For each memory offset, it pairs the source partition ID from the `AllGather` with the destination partition ID from the `DynamicSlice`.

This CL also introduces a few key data structures to support this logic:

* **`PartitionOffsetSpec`**: Represents the mapping from a memory offset to a partition ID for each replica group. This is used to model the data layout produced by the `AllGather` and the permuted access pattern of the `DynamicSlice`.
* **`PermutationPairs`**: A type alias for a list of `(source_id, destination_id)` pairs, representing a permute operation for CP.

PiperOrigin-RevId: 783030292
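The "Match Specifications" step can be illustrated with a small Python model. This is only a sketch under stated assumptions: the real implementation is C++ inside XLA and is not shown in this description, so the type aliases below merely mirror the names `PartitionOffsetSpec` and `PermutationPairs`, and `match_offset_specs` is a hypothetical helper, not the actual `MatchPermutedSliceAndPartitionOffset` signature.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-ins for the structures named in the description.
# The real XLA definitions are C++ and may differ in shape.
OffsetToPartition = Dict[int, int]             # memory offset -> partition ID
PartitionOffsetSpec = List[OffsetToPartition]  # one map per replica group
PermutationPairs = List[Tuple[int, int]]       # (source_id, destination_id)


def match_offset_specs(
    all_gather_spec: PartitionOffsetSpec,
    dynamic_slice_spec: PartitionOffsetSpec,
) -> Optional[PermutationPairs]:
    """Pair source and destination partition IDs that share a memory offset.

    Returns None when the two specs do not describe the same offsets,
    i.e. when the pattern does not match.
    """
    if len(all_gather_spec) != len(dynamic_slice_spec):
        return None
    pairs: PermutationPairs = []
    for src_map, dst_map in zip(all_gather_spec, dynamic_slice_spec):
        # Every offset written by the AllGather must be read by exactly
        # one partition's DynamicSlice, and vice versa.
        if src_map.keys() != dst_map.keys():
            return None
        for offset in sorted(src_map):
            pairs.append((src_map[offset], dst_map[offset]))
    return pairs


# Four partitions, shard size 8. The AllGather places partition p's shard
# at offset 8 * p; partition p's DynamicSlice reads offset 8 * ((p + 1) % 4).
all_gather_spec = [{0: 0, 8: 1, 16: 2, 24: 3}]
dynamic_slice_spec = [{8: 0, 16: 1, 24: 2, 0: 3}]
print(match_offset_specs(all_gather_spec, dynamic_slice_spec))
# [(0, 3), (1, 0), (2, 1), (3, 2)]
```

Each resulting pair reads as "the shard produced by `source_id` is the one consumed by `destination_id`", which is the information a `CollectivePermute` needs.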
fc555ad to ec5c229
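The claim that an `AllGather` followed by a permuted `DynamicSlice` is equivalent to a `CollectivePermute` can be checked on a toy model. The sketch below simulates partitions as plain Python lists; the function names are hypothetical, it assumes one shard per partition and shard-aligned slice offsets, and it does not call any XLA API.

```python
from typing import List, Optional, Tuple

Shard = List[int]


def all_gather_then_slice(shards: List[Shard], slice_offsets: List[int]) -> List[Shard]:
    """Each partition gathers every shard, then slices out one shard-sized window."""
    shard_size = len(shards[0])
    # After the gather, every partition holds the same concatenated buffer.
    gathered = [x for shard in shards for x in shard]
    # Partition p keeps only the window starting at slice_offsets[p].
    return [gathered[off:off + shard_size] for off in slice_offsets]


def collective_permute(shards: List[Shard], pairs: List[Tuple[int, int]]) -> List[Optional[Shard]]:
    """Send each source partition's shard directly to its destination partition."""
    out: List[Optional[Shard]] = [None] * len(shards)
    for src, dst in pairs:
        out[dst] = shards[src]
    return out


def pairs_from_offsets(slice_offsets: List[int], shard_size: int) -> List[Tuple[int, int]]:
    """The shard at offset `off` came from partition off // shard_size."""
    return [(off // shard_size, dst) for dst, off in enumerate(slice_offsets)]


shards = [[0, 1], [10, 11], [20, 21], [30, 31]]
slice_offsets = [2, 4, 6, 0]  # partition p reads the shard of partition (p + 1) % 4

pairs = pairs_from_offsets(slice_offsets, shard_size=2)
assert all_gather_then_slice(shards, slice_offsets) == collective_permute(shards, pairs)
```

In this model the permute moves one shard per partition, whereas the gather materializes all shards on every partition before discarding most of them, which is the motivation for rewriting the pattern.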