github-project-automation bot moved this to Backlog in AutoDeploy Board

lucaslie requested a review from MrGeva

October 20, 2025 19:41

lucaslie self-assigned this

lucaslie moved this from Backlog to In review in AutoDeploy Board

Member, Author

lucaslie commented Oct 20, 2025

/bot run

lucaslie force-pushed the ll/torch_dtype_deprecation branch from 81aeb50 to 47140c1

October 20, 2025 19:43

lucaslie removed the request for review from a team

October 20, 2025 19:43

Contributor

coderabbitai bot commented Oct 20, 2025 (edited)

📝 Walkthrough

The PR refactors dtype specification in HuggingFace model loading. Dtype normalization moves from initialization time to config-update time, converting string dtype values to torch.dtype objects. The parameter key shifts from "torch_dtype" to "dtype" across implementation and tests.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| HF model dtype handling: `tensorrt_llm/_torch/auto_deploy/models/hf.py` | Removed runtime normalization of `torch_dtype` in `__init__`. Added dtype string-to-torch.dtype conversion in `_recursive_update_config` for "torch_dtype" and "dtype" keys via `setattr()`. Updated `build_and_load_model` to use `"dtype": "auto"` instead of `"torch_dtype": "auto"`. |
| Test configurations: `tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py`, `tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py` | Updated model_kwargs parameter key from `"torch_dtype"` to `"dtype"` with value `"bfloat16"` in test fixtures for Bamba and Nemotron models. |
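As a rough illustration of the config-update-time normalization summarized above, the following is a minimal sketch and not the code in `hf.py`. The function name `recursive_update_config`, the `to_dtype` callback, the `DTYPE_KEYS` tuple, and the choice to leave `"auto"` untouched are all assumptions made for this example; stand-in objects replace the real HF config and torch so the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict

# Keys whose string values name a dtype (both spellings, per the summary above).
DTYPE_KEYS = ("torch_dtype", "dtype")


def recursive_update_config(
    config: Any, update: Dict[str, Any], to_dtype: Callable[[str], Any]
) -> Any:
    """Hypothetical helper: apply a nested update, converting dtype strings."""
    for key, value in update.items():
        current = getattr(config, key, None)
        if isinstance(value, dict) and current is not None and not isinstance(current, dict):
            # Nested sub-config object: recurse instead of overwriting it.
            recursive_update_config(current, value, to_dtype)
        elif key in DTYPE_KEYS and isinstance(value, str) and value != "auto":
            # Assumption: "auto" is left as-is for the loader to resolve.
            setattr(config, key, to_dtype(value))
        else:
            setattr(config, key, value)
    return config


# Stand-in objects so the sketch runs without torch or transformers installed.
config = SimpleNamespace(dtype=None, text_config=SimpleNamespace(hidden_size=8))
recursive_update_config(
    config,
    {"dtype": "bfloat16", "text_config": {"hidden_size": 16}},
    to_dtype=lambda name: f"<dtype {name}>",
)
print(config.dtype, config.text_config.hidden_size)
```

With PyTorch available, `to_dtype` could plausibly be `lambda name: getattr(torch, name)`, which maps `"bfloat16"` to `torch.bfloat16`.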

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

The changes are well-localized with straightforward refactoring: dtype normalization relocates from initialization to config updates, and parameter keys rename consistently across files. Test updates are repetitive. Minimal review complexity, though understanding the rationale for the normalization shift requires brief context.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description Check | ⚠️ Warning | The PR description is largely incomplete compared to the required template structure. While the author provided a brief contextual note about HF deprecating torch_dtype and referenced a related PR, the two critical sections, Description and Test Coverage, are empty and only contain placeholder comments from the template. The Description section should explain the issue and solution in detail, and the Test Coverage section should list the relevant tests that safeguard the changes. Only the PR Checklist box was marked as complete, but the substantive explanatory sections that help reviewers understand the scope and testing strategy are missing. | Add a detailed Description section explaining the issue (HF's deprecation of torch_dtype) and the solution (replacing it with dtype across AutoDeploy). Include a Test Coverage section that lists which tests validate the dtype parameter changes, such as the tests modified in the PR (test_hybrid_patches.py and _model_test_utils.py). Briefly explain what behavioral changes, if any, result from this transition, and confirm that existing functionality is preserved. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The PR title "[None][chore] AutoDeploy: replace HF's deprecated keyword torch_dtype --> dtype" is well-formed and directly reflects the primary change across all three modified files. The title follows the required format with the ticket reference and type indicator, and it clearly communicates that this is a refactoring task to replace a deprecated HuggingFace keyword with its modern equivalent. The changes in the raw summary confirm this is the main objective: updating the AutoDeploy code, test configurations, and test files to use "dtype" instead of "torch_dtype". |

coderabbitai bot reviewed

Contributor

coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b289d5 and 47140c1.

📒 Files selected for processing (3)

tensorrt_llm/_torch/auto_deploy/models/hf.py (2 hunks)
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py (2 hunks)
tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py
tensorrt_llm/_torch/auto_deploy/models/hf.py

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.

Files:

tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py
tensorrt_llm/_torch/auto_deploy/models/hf.py

**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py
tensorrt_llm/_torch/auto_deploy/models/hf.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: ixlmar
PR: NVIDIA/TensorRT-LLM#8263
File: examples/models/contrib/sdxl/run_sdxl.py:0-0
Timestamp: 2025-10-13T13:55:04.170Z
Learning: The `diffusers` library (e.g., `DiffusionPipeline`, `StableDiffusionXLPipeline`, `StableDiffusion3Pipeline`) uses the `torch_dtype` parameter in `from_pretrained()` calls, not `dtype`. Only the `transformers` library has migrated to using `dtype`.
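To make the library-specific keyword in this learning concrete, here is a small hedged sketch of a kwargs migration helper. The function `migrate_dtype_kwarg` and its `library` argument are hypothetical and exist only for illustration; the only fact taken from the learning is that `transformers` expects `dtype` while `diffusers` still expects `torch_dtype`.

```python
from typing import Any, Dict


def migrate_dtype_kwarg(
    model_kwargs: Dict[str, Any], library: str = "transformers"
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy using the dtype key `library` expects."""
    if library == "transformers":
        new_key, old_key = "dtype", "torch_dtype"
    else:
        # Per the learning above, diffusers still uses `torch_dtype`.
        new_key, old_key = "torch_dtype", "dtype"

    kwargs = dict(model_kwargs)  # Do not mutate the caller's dict.
    if old_key in kwargs:
        value = kwargs.pop(old_key)
        if new_key in kwargs and kwargs[new_key] != value:
            raise ValueError(
                f"Conflicting values for {old_key!r} and {new_key!r}: "
                f"{value!r} vs {kwargs[new_key]!r}"
            )
        kwargs[new_key] = value
    return kwargs


print(migrate_dtype_kwarg({"torch_dtype": "bfloat16"}))
print(migrate_dtype_kwarg({"dtype": "bfloat16"}, library="diffusers"))
```

Returning a copy keeps shared test fixtures unchanged, and raising on conflicting values avoids silently picking one of two different dtypes.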

🔇 Additional comments (3)

tensorrt_llm/_torch/auto_deploy/models/hf.py (1)
276-276: LGTM! Correct migration to `dtype` for transformers.
The change from `torch_dtype` to `dtype` is correct for the `transformers` library, which has migrated to the new parameter name. HuggingFace's `from_pretrained` will handle the `"auto"` value appropriately.
Based on learnings
tests/unittest/_torch/auto_deploy/unit/singlegpu/models/test_hybrid_patches.py (1)
51-51: LGTM! Test updated consistently with implementation.
The change from `torch_dtype` to `dtype` aligns with the implementation changes in `hf.py`. The concrete dtype value `"bfloat16"` will be correctly converted to `torch.bfloat16` by the `_recursive_update_config` logic.
tests/unittest/_torch/auto_deploy/_utils_test/_model_test_utils.py (1)
468-468: LGTM! Model configs updated consistently.
Both model configurations correctly updated from `torch_dtype` to `dtype`. The changes align with the implementation in `hf.py` and ensure test coverage for the new parameter name across different model types.
Also applies to: 487-487

tensorrt_llm/_torch/auto_deploy/models/hf.py (outdated, resolved)

Collaborator

tensorrt-cicd commented Oct 20, 2025

PR_Github #21943 [ run ] triggered by Bot. Commit: 47140c1

[None][chore] AutoDeploy: replace HF's deprecated keyword torch_dtype --> dtype

Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>

c1e4528

lucaslie force-pushed the ll/torch_dtype_deprecation branch from 47140c1 to c1e4528

October 20, 2025 19:50

Member, Author

lucaslie commented Oct 20, 2025

/bot run

Collaborator

tensorrt-cicd commented Oct 20, 2025

PR_Github #21944 [ run ] triggered by Bot. Commit: c1e4528

Collaborator

tensorrt-cicd commented Oct 20, 2025

PR_Github #21943 [ run ] completed with state ABORTED. Commit: 47140c1
LLM/main/L0_MergeRequest_PR #16541 (Blue Ocean) completed with status: ABORTED

nvchenghaoz approved these changes

lucaslie enabled auto-merge (squash)

October 20, 2025 20:31

govind-ramnarayan approved these changes

Collaborator

tensorrt-cicd commented Oct 20, 2025

PR_Github #21944 [ run ] completed with state SUCCESS. Commit: c1e4528
/LLM/main/L0_MergeRequest_PR pipeline #16543 completed with status: 'FAILURE'

Member, Author

lucaslie commented Oct 20, 2025

/bot run

Collaborator

tensorrt-cicd commented Oct 20, 2025

PR_Github #21952 [ run ] triggered by Bot. Commit: c1e4528

Collaborator

tensorrt-cicd commented Oct 21, 2025

PR_Github #21952 [ run ] completed with state SUCCESS. Commit: c1e4528
/LLM/main/L0_MergeRequest_PR pipeline #16549 completed with status: 'FAILURE'

MrGeva approved these changes

lucaslie mentioned this pull request

[AutoDeploy]: Dashboard updates with modular config system #8543

Closed

1 task

Member, Author

lucaslie commented Oct 21, 2025

/bot run

Collaborator

tensorrt-cicd commented Oct 21, 2025

PR_Github #22074 [ run ] triggered by Bot. Commit: c1e4528

Collaborator

tensorrt-cicd commented Oct 21, 2025

PR_Github #22074 [ run ] completed with state SUCCESS. Commit: c1e4528
/LLM/main/L0_MergeRequest_PR pipeline #16645 completed with status: 'FAILURE'

Merge branch 'main' into ll/torch_dtype_deprecation

747051c

Member, Author

lucaslie commented Oct 21, 2025

/bot run

Collaborator

tensorrt-cicd commented Oct 21, 2025

PR_Github #22082 [ run ] triggered by Bot. Commit: 747051c

Collaborator

tensorrt-cicd commented Oct 21, 2025

PR_Github #22082 [ run ] completed with state SUCCESS. Commit: 747051c
/LLM/main/L0_MergeRequest_PR pipeline #16649 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

lucaslie merged commit 9b54b3b into NVIDIA:main

5 checks passed

github-project-automation bot moved this from In review to Done in AutoDeploy Board