Collaborator

HuiGao-NV left a comment

LGTM

coderabbitaibot reviewed

Contributor

coderabbitaibot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)

tests/integration/defs/accuracy/test_llm_api_pytorch.py (1)

2067-2070: Make the 0.6 fraction overridable via an environment variable to de-flake across lab hardware.

Different CI runners/driver stacks may still OOM or underutilize. Allow an env override with a brief comment.

Apply:

-        kv_cache_config = KvCacheConfig(
-            enable_block_reuse=False,
-            free_gpu_memory_fraction=0.6,
-        )
+        # Tuneable for CI/hardware variance; default 60% avoids OOM while reuse stays off for Eagle3.
+        mem_frac = float(os.getenv("TRTLLM_FREE_GPU_MEM_FRAC_EAGLE3_QWEN3_8B", "0.6"))
+        kv_cache_config = KvCacheConfig(
+            enable_block_reuse=False,
+            free_gpu_memory_fraction=mem_frac,
+        )
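The suggested override calls float() on the raw environment string, so a typo or an out-of-range value would only surface later, inside the test. A minimal sketch of a stricter reader follows. The helper name read_mem_fraction is hypothetical and not part of TensorRT-LLM, and the accepted range (0, 1] is an assumption about what free_gpu_memory_fraction is meant to take.

```python
import os


def read_mem_fraction(env_name: str, default: float = 0.6) -> float:
    """Return a memory fraction from the environment, or the default.

    Hypothetical helper for illustration; the (0, 1] range is an assumption.
    """
    raw = os.getenv(env_name)
    # Treat an unset or empty variable as "use the default".
    if raw is None or raw.strip() == "":
        return default
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(f"{env_name}={raw!r} is not a number") from None
    # Reject values outside (0, 1]; NaN also fails this comparison.
    if not 0.0 < value <= 1.0:
        raise ValueError(f"{env_name}={value} must be in (0, 1]")
    return value
```

In the test, the result would be passed where the diff above uses mem_frac, for example read_mem_fraction("TRTLLM_FREE_GPU_MEM_FRAC_EAGLE3_QWEN3_8B").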

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c2f04f and 5c55ee3.

📒 Files selected for processing (1)

tests/integration/defs/accuracy/test_llm_api_pytorch.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

tests/integration/defs/accuracy/test_llm_api_pytorch.py

**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.

Files:

tests/integration/defs/accuracy/test_llm_api_pytorch.py

**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

tests/integration/defs/accuracy/test_llm_api_pytorch.py

🧬 Code graph analysis (1)

tests/integration/defs/accuracy/test_llm_api_pytorch.py (1)

tensorrt_llm/llmapi/llm_args.py (1)
KvCacheConfig (923-1002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Pre-commit Check

🔇 Additional comments (2)

tests/integration/defs/accuracy/test_llm_api_pytorch.py (2)
2067-2070: Free-GPU-memory fraction set to 0.6 is a sensible, low-risk fix to unskip Eagle3 on 8B.
This targets KV cache pressure without altering execution paths. Looks consistent with other tests using 0.5–0.9.
2067-2070: Run the updated test with the correct path
python -u -m pytest -s -v tests/integration/defs/accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_eagle3
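As a rough illustration of why lowering the fraction relieves KV cache pressure, the sketch below assumes the fraction is applied to whatever GPU memory is still free when the KV cache is sized. The 40 GiB figure and the helper kv_cache_budget_bytes are made up for the example and are not taken from this PR or from TensorRT-LLM.

```python
GIB = 1024 ** 3


def kv_cache_budget_bytes(free_bytes: int, fraction: float) -> int:
    """Bytes reserved for the KV cache under the stated assumption."""
    return int(free_bytes * fraction)


# Hypothetical amount of memory still free when the KV cache is sized.
free_bytes = 40 * GIB

for fraction in (0.5, 0.6, 0.9):
    budget = kv_cache_budget_bytes(free_bytes, fraction)
    headroom = free_bytes - budget
    print(f"fraction={fraction:.1f}: "
          f"KV cache {budget / GIB:.1f} GiB, headroom {headroom / GIB:.1f} GiB")
```

Under that assumption, a smaller fraction leaves more headroom for other allocations such as the draft model and activations, at the cost of a smaller KV cache.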

Collaborator

tensorrt-cicd commented Sep 10, 2025

PR_Github #18361 [ run ] triggered by Bot

Collaborator

tensorrt-cicd commented Sep 10, 2025

PR_Github #18361 [ run ] completed with state SUCCESS
/LLM/release-1.0/L0_MergeRequest_PR pipeline #372 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

Collaborator, Author

leslie-fang25 commented Sep 10, 2025

@NVIDIA/trt-llm-release-branch-approval please kindly take a look

chzblych approved these changes

chzblych merged commit 9ca8662 into NVIDIA:release/1.0

7 checks passed

dominicshanshan pushed the following commits to dominicshanshan/TensorRT-LLM that referenced this pull request. Each carries the same message:

[https://nvbugs/5436461][fix] Adjust free_gpu_memory_fraction of test_eagle3 (NVIDIA#7673)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> (present on every commit except 2751822)

2751822
0f51b99
e84f118
45a7e04
91d8603
2f4b1da
a7800ba
af1fd95
33fdf99
a3b4099
2f6d96a
2cf40ce
f61f4c2
f3b4a3d
5333064
4d466a0
7b1e106
711992a
e0fd478
daf2173 (Sep 18, 2025)
0921661 (Sep 18, 2025)
2c6dad5 (Sep 18, 2025)
82fd963
4967432
09c5581

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request