Pull requests: NVIDIA/TensorRT-Model-Optimizer
- Optimize calibrate_draft_vocab to read only required lines when calib… (#618, opened Nov 27, 2025 by Ofir408)
- Add all example e2e tests for GitHub PR merge / nightly (#617, opened Nov 27, 2025 by kevalmorabia97)
- Add build replacement library to the compress algorithm (#616, opened Nov 27, 2025 by danielkorzekwa)
- [5680954,5620660@2][ONNX][Autocast] Update value info in converted graph (#611, opened Nov 26, 2025 by gcunhase)
- Add checkpoint save/load to ForwardHook + add IterativeChannelContributionHook (#610, opened Nov 26, 2025 by danielkorzekwa)
- Support attention quantization for diffusers >= 0.35.0 (#608, opened Nov 25, 2025 by shengliangxu, draft)
- Convert compressed-tensor int4 format to GPTQ int4 format (#590, opened Nov 20, 2025 by Edwardf0t1)
- Product Rename: TensorRT Model Optimizer to Model Optimizer (#583, opened Nov 20, 2025 by kevalmorabia97, 1 of 2 tasks)
- [OMNIML-2852] [2/n] Add Core Sparse Attention Infrastructure (#527, opened Nov 7, 2025 by kaix-nv)
- [Draft] [5526696] Add kv cache quantization support for onnx quantization (#486, opened Oct 31, 2025 by zhanghaoc)
- Preserve original rope scaling type in export due to transformers library AutoConfig issue (#452, opened Oct 17, 2025 by Edwardf0t1)