coderabbitaibot reviewed

Contributor

coderabbitaibot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tensorrt_llm/_torch/models/modeling_llama.py (1)

1-3: Missing required NVIDIA Apache-2.0 header.

Per repo guidelines, prepend the current-year header to this file.

Apply:

+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
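A header check like the one requested above could be scripted roughly as follows. This is a minimal sketch, not part of the repository's tooling: `HEADER_TEMPLATE`, `has_header`, and `prepend_header` are hypothetical names, and the check only looks at the first line of the file (it does not handle shebang lines or verify the year).

```python
# Header text copied from the review suggestion; {year} is filled in at call time.
HEADER_TEMPLATE = """\
# Copyright (c) {year}, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""


def has_header(source):
    """Returns True if the first line looks like the NVIDIA copyright line."""
    lines = source.splitlines()
    if not lines:
        return False
    first_line = lines[0]
    return (first_line.startswith("# Copyright (c)")
            and "NVIDIA CORPORATION" in first_line)


def prepend_header(source, year):
    """Returns source with the header prepended, unless one is already there."""
    if has_header(source):
        return source
    return HEADER_TEMPLATE.format(year=year) + source
```

Calling `prepend_header(text, 2025)` on the contents of a file that lacks the header returns the text with the 13 header lines in front; calling it again leaves the text unchanged.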

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14aa34f and 580a733.

📒 Files selected for processing (1)

tensorrt_llm/_torch/models/modeling_llama.py (2 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

tensorrt_llm/_torch/models/modeling_llama.py

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.

Files:

tensorrt_llm/_torch/models/modeling_llama.py
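Several of the Python guidelines listed above can be seen together in a short, Python 3.8-compatible module. The sketch below is only an illustration: `LatencyTracker`, `total_length`, `example_usage`, and `MAX_SAMPLES` are hypothetical names and do not come from the TensorRT-LLM code base.

```python
"""Small module illustrating several of the Python guidelines above."""
from collections import abc  # keep the namespace: abc.Iterable, not Iterable

MAX_SAMPLES = 1000  # constant: upper SNAKE_CASE


class LatencyTracker:
    """Collects latency samples and reports percentiles.

    Attributes:
        samples: Recorded latencies in milliseconds.
    """

    def __init__(self):
        # Every externally visible member is initialized in the constructor.
        self.samples = []

    def add_samples(self, latencies_ms):
        """Appends latencies until MAX_SAMPLES is reached.

        Args:
            latencies_ms: Iterable of latencies in milliseconds.

        Raises:
            TypeError: If latencies_ms is not iterable.
        """
        if not isinstance(latencies_ms, abc.Iterable):
            raise TypeError("latencies_ms must be iterable")
        for latency_ms in latencies_ms:
            if len(self.samples) < MAX_SAMPLES:
                self.samples.append(latency_ms)

    def percentile(self, percent):
        """Returns a simple rank-based percentile.

        Args:
            percent: Integer percentile in the range [0, 100].

        Returns:
            The sample at the requested rank.

        Raises:
            ValueError: If no samples have been recorded.
        """
        if not self.samples:
            raise ValueError("no samples recorded")
        ordered = sorted(self.samples)
        index = min(len(ordered) - 1, percent * len(ordered) // 100)
        return ordered[index]


def total_length(value):
    """Returns len(value) for sized objects and 1 for anything else."""
    try:
        size = len(value)  # keep the try body minimal
    except TypeError:  # the specific exception len() raises for unsized objects
        return 1
    else:
        return size  # main logic lives in the else branch


def example_usage():
    """Returns the 99th-percentile latency of the samples 1..100."""
    tracker = LatencyTracker()
    tracker.add_samples(range(1, 101))
    # 'k' prefix because the name would otherwise start with a digit.
    k_99th_percentile = tracker.percentile(99)
    return k_99th_percentile
```

The import keeps the `abc` namespace rather than importing `Iterable` directly, the docstrings follow the Google style, and the duck-typing probe in `total_length` catches only `TypeError` and does its real work in the `else` branch.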

**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

tensorrt_llm/_torch/models/modeling_llama.py

🧠 Learnings (1)

📚 Learning: 2025-08-14T06:36:40.701Z

Learnt from: timlee0212
PR: NVIDIA/TensorRT-LLM#6886
File: tensorrt_llm/_torch/models/modeling_deepseekv3.py:0-0
Timestamp: 2025-08-14T06:36:40.701Z
Learning: In the DeepSeek V3 model (tensorrt_llm/_torch/models/modeling_deepseekv3.py), the disagreement between the AllReduce.__init__ guard and the _compute_mlp_tp_size logic for MNNVL usage is expected by design. The AllReduce component and the MLP TP-size computation intentionally use different criteria for MNNVL availability decisions.

Applied to files:

tensorrt_llm/_torch/models/modeling_llama.py

🧬 Code graph analysis (1)

tensorrt_llm/_torch/models/modeling_llama.py (2)

tensorrt_llm/_torch/distributed/ops.py (1)
AllReduce (362-491)
tensorrt_llm/_utils.py (1)
get_sm_version (681-683)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Pre-commit Check

🔇 Additional comments (1)

tensorrt_llm/_torch/models/modeling_llama.py (1)
649-652: Require explicit AllReduce.strategy or a safe ModelConfig default
Script output shows AllReduce() invocations without a strategy kwarg at:
./tensorrt_llm/_torch/distributed/ops.py:287
./tensorrt_llm/_torch/distributed/ops.py:362
./tensorrt_llm/_torch/distributed/ops.py:424
./tensorrt_llm/_torch/distributed/ops.py:494
./tensorrt_llm/_torch/models/modeling_deepseekv3.py:688
./tensorrt_llm/_torch/models/modeling_llama.py:454
./tensorrt_llm/_torch/models/modeling_qwen3_moe.py:206
./tests/unittest/_torch/auto_deploy/unit/multigpu/transformations/library/test_collective_fusion.py:20
./tests/unittest/_torch/multi_gpu/test_allreduce.py:129
./tests/unittest/_torch/multi_gpu/test_allreduce.py:340
./tests/unittest/_torch/multi_gpu/test_allreduce.py:502
./tests/unittest/_torch/multi_gpu/test_user_buffers.py:695
./tests/unittest/_torch/multi_gpu/test_user_buffers.py:696
./tests/unittest/_torch/multi_gpu/test_user_buffers.py:697
Action: update these call sites to pass strategy (e.g., strategy=model_config.allreduce_strategy or an explicit AllReduceStrategy like AUTO), or ensure ModelConfig supplies that safe default.
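A call-site list like the one above could be gathered with Python's standard `ast` module. The sketch below is an assumption about how such a scan might look, not the script the reviewer ran; `find_calls_missing_kwarg` is a hypothetical helper. Unlike a plain text search, it reports only call expressions, so a `class AllReduce` definition line is not flagged.

```python
import ast


def find_calls_missing_kwarg(source, callee="AllReduce", kwarg="strategy"):
    """Returns sorted line numbers of callee(...) calls that lack kwarg.

    Calls that expand **kwargs are skipped, because the keyword may be
    supplied through the expansion.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both AllReduce(...) and module.AllReduce(...).
        if isinstance(func, ast.Name):
            name = func.id
        else:
            name = getattr(func, "attr", None)
        if name != callee:
            continue
        # keyword.arg is None for a **kwargs expansion.
        keyword_names = {keyword.arg for keyword in node.keywords}
        if kwarg not in keyword_names and None not in keyword_names:
            missing.append(node.lineno)
    return sorted(missing)
```

Running the function over the text of each `.py` file and printing `path:lineno` for every hit would give output in the same shape as the list above.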
⛔ Skipped due to learnings
Learnt from: timlee0212
PR: NVIDIA/TensorRT-LLM#6886
File: tensorrt_llm/_torch/models/modeling_deepseekv3.py:0-0
Timestamp: 2025-08-14T06:36:40.701Z
Learning: In the DeepSeek V3 model (tensorrt_llm/_torch/models/modeling_deepseekv3.py), the disagreement between the AllReduce.__init__ guard and the _compute_mlp_tp_size logic for MNNVL usage is expected by design. The AllReduce component and the MLP TP-size computation intentionally use different criteria for MNNVL availability decisions.
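The action item in this comment (pass a strategy explicitly instead of relying on the constructor default) can be sketched with stand-in classes. Everything below is hypothetical and does not import TensorRT-LLM: the `AllReduceStrategy` and `AllReduce` classes are simplified stand-ins, and the SM-version threshold of 100 for Blackwell is an assumption used only to illustrate the "force NCCL on pre-Blackwell" idea from the PR title.

```python
import enum


class AllReduceStrategy(enum.Enum):
    """Stand-in enum; not the TensorRT-LLM definition."""
    AUTO = "auto"
    NCCL = "nccl"


# Assumption: an SM version below 100 means a pre-Blackwell GPU.
BLACKWELL_SM_VERSION = 100


def select_allreduce_strategy(configured, sm_version):
    """Forces NCCL on pre-Blackwell GPUs, else keeps the configured strategy."""
    if sm_version < BLACKWELL_SM_VERSION:
        return AllReduceStrategy.NCCL
    return configured


class AllReduce:
    """Stand-in op that only records which strategy it was given."""

    def __init__(self, strategy=AllReduceStrategy.AUTO):
        self.strategy = strategy


def build_allreduce(configured, sm_version):
    """Builds the op with an explicit strategy rather than the default."""
    return AllReduce(strategy=select_allreduce_strategy(configured, sm_version))
```

In the real code base the configured value would presumably come from something like `model_config.allreduce_strategy` and the SM version from `get_sm_version()`, both of which are named in this review; the sketch takes them as plain arguments so it stays self-contained.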

tensorrt_llm/_torch/models/modeling_llama.py (Outdated, resolved)

hyukn changed the title ~~[None][fix] Pass allreduce strategy and disable AR fusion due to perf regression on preBlackwell~~ [https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on preBlackwell

hyukn changed the title ~~[https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on preBlackwell~~ [https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on pre-Blackwell

[https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR …

b666707

…fusion due to perf regression on preBlackwell.
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>

hyukn force-pushed the fix/llama3_allreduce_strategy branch from 580a733 to b666707

September 16, 2025 12:13

hyukn changed the title ~~[https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on pre-Blackwell~~ [https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on pre-Blackwell arch

Collaborator, Author

hyukn commented Sep 16, 2025

/bot run

Collaborator

tensorrt-cicd commented Sep 16, 2025

PR_Github #18794 [ run ] triggered by Bot

Collaborator

tensorrt-cicd commented Sep 16, 2025

PR_Github #18788 [ run ] completed with state ABORTED
LLM/release-1.0/L0_MergeRequest_PR #396 (Blue Ocean) completed with status: ABORTED

hyukn changed the title ~~[https://nvbugs/5517023][fix] Pass allreduce strategy and disable AR fusion due to perf regression on pre-Blackwell arch~~ [https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL on pre-Blackwell arch

Collaborator

tensorrt-cicd commented Sep 16, 2025

PR_Github #18794 [ run ] completed with state SUCCESS
/LLM/release-1.0/L0_MergeRequest_PR pipeline #399 completed with status: 'FAILURE'

Collaborator, Author

hyukn commented Sep 17, 2025

/bot run

Collaborator

tensorrt-cicd commented Sep 17, 2025

PR_Github #18848 [ run ] triggered by Bot

litaotju added the Release Blocker label (PRs that are blocking the final release build or branching out the release branch)

Collaborator

tensorrt-cicd commented Sep 17, 2025

PR_Github #18848 [ run ] completed with state SUCCESS
/LLM/release-1.0/L0_MergeRequest_PR pipeline #403 completed with status: 'SUCCESS'

Superjomn merged commit 88fe78e into NVIDIA:release/1.0

8 checks passed

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

0560467

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

601fc88

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

f00c0c3

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

litaotju mentioned this pull request

Revert "[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL on pre-Blackwell arch" #7810

Closed

litaotju added a commit to litaotju/TensorRT-LLM that referenced this pull request

Revert "[https://nvbugs/5517023][fix] Pass allreduce strategy and for…

17188ca

…ce NCCL on pre-Blackwell arch (NVIDIA#7768)"
This reverts commit 88fe78e.
Signed-off-by: Tao Li <tali@nvidia.com>

litaotju mentioned this pull request

[https://nvbugs/1234567][fix] Revert https://github.com/NVIDIA/TensorRT-LLM/pull/7768/files #7813

Merged

1 task

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

2c0a9f4

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

Sep 18, 2025

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

d267578

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

Sep 18, 2025

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

ef98158

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

Sep 18, 2025

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

10ed474

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

44a17cf

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

cf834e3

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request

[https://nvbugs/5517023][fix] Pass allreduce strategy and force NCCL …

096df09

…on pre-Blackwell arch (NVIDIA#7768)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request