pytorch/ao (Public)

enable smoothquant for int8 static tensor #3468


Open
jcaip wants to merge 40 commits into main from jcaip/enable-smoothquant

Changes from 1 commit (40 commits total)
48cdb61  Int8Tensor migration (jcaip, Dec 1, 2025)
0b73aed  ruff fixes (jcaip, Dec 1, 2025)
1e49945  add init (jcaip, Dec 1, 2025)
669b6ee  fix ruff again (jcaip, Dec 1, 2025)
9071526  update (jcaip, Dec 1, 2025)
1539e0f  wip (jcaip, Dec 2, 2025)
d9a2b1b  Merge branch 'main' into jcaip/int8-tensor (jcaip, Dec 3, 2025)
673f228  undo update tests (jcaip, Dec 3, 2025)
739fd64  fix ruff (jcaip, Dec 3, 2025)
750db1a  fix varname (jcaip, Dec 3, 2025)
9410488  fix typing (jcaip, Dec 3, 2025)
45a3a76  add tests (jcaip, Dec 3, 2025)
4e2f09c  fix dtype (jcaip, Dec 3, 2025)
dd80cca  fix ci (jcaip, Dec 3, 2025)
7f73062  address granularity cr (jcaip, Dec 4, 2025)
ac6a2b6  update _choose_quant_func_and_quantize_tensor (jcaip, Dec 4, 2025)
f28df4a  make block size required attribute (jcaip, Dec 4, 2025)
328585e  made dtype required as well (jcaip, Dec 4, 2025)
ce4d568  address nits (jcaip, Dec 4, 2025)
a665d45  skip per tensor weight only test for now (jcaip, Dec 4, 2025)
0338016  add static quant (jcaip, Dec 3, 2025)
ee39691  add static quant (jcaip, Dec 4, 2025)
9eb0aa9  update (jcaip, Dec 5, 2025)
d4a1514  static quant working eager + compile (jcaip, Dec 6, 2025)
3cdea56  remove file (jcaip, Dec 6, 2025)
fa9022d  added asserts (jcaip, Dec 6, 2025)
8ce5cde  undo smoothquant change (jcaip, Dec 6, 2025)
6f64121  fix return (jcaip, Dec 6, 2025)
8ae921d  Merge branch 'main' into jcaip/static-quant-rebased (jcaip, Dec 7, 2025)
5b9e243  got smoothquant + int8 static working (jcaip, Dec 8, 2025)
7a0e38f  generalized smoothquat code (jcaip, Dec 8, 2025)
3d18edf  free tests (jcaip, Dec 8, 2025)
9e07f8b  fix static scale check (jcaip, Dec 8, 2025)
4274e02  update (jcaip, Dec 8, 2025)
b5309eb  address cr feedback (jcaip, Dec 9, 2025)
a732fee  Merge branch 'jcaip/static-quant-rebased' into jcaip/enable-smoothquant (jcaip, Dec 9, 2025)
0c23589  Merge branch 'main' into jcaip/enable-smoothquant (jcaip, Dec 9, 2025)
0872986  update (jcaip, Dec 17, 2025)
049830f  fix ruff (jcaip, Dec 17, 2025)
2586ab6  fix varname (jcaip, Dec 18, 2025)
fix static scale check
jcaip committed Dec 8, 2025
commit 9e07f8b0839b1a441f836d3800c3546e8626d79d


test/prototype/test_smoothquant.py (7 changes: 6 additions & 1 deletion)

@@ -106,7 +106,12 @@ def test_smoothquant_accuracy(self, alpha, base_config, device, input_dtype):
         # Step 1. Basic quantization
         basic_model = deepcopy(m)
         if isinstance(base_config, Int8StaticActivationInt8WeightConfig):
-            quantize_(basic_model, Int8DynamicActivationInt8WeightConfig(version=2))
+            quantize_(
+                basic_model,
+                Int8DynamicActivationInt8WeightConfig(
+                    version=2, granularity=base_config.granularity
+                ),
+            )
         else:
             quantize_(basic_model, base_config)
         out_basic = basic_model(*x)
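The change above makes the dynamic-quant baseline use the same granularity as the static config under test, so the two runs differ only in when activation scales are computed. As a rough, hypothetical illustration of what granularity means for int8 scales (the function and names below are ours, not torchao's):

```python
# Hypothetical sketch: the `granularity` setting decides how many int8
# scales a weight matrix gets.
def int8_scales(matrix, per_row=False):
    """Absmax int8 scales: one per row (PerRow) or one overall (PerTensor)."""
    if per_row:
        # PerRow: each row gets its own absmax-derived scale.
        return [max(abs(v) for v in row) / 127 for row in matrix]
    # PerTensor: a single scale for the whole matrix.
    return [max(abs(v) for row in matrix for v in row) / 127]

w = [[1.0, -2.0], [0.5, 4.0]]
print(len(int8_scales(w)))                # per-tensor: 1 scale
print(len(int8_scales(w, per_row=True)))  # per-row: 2 scales
```

If the baseline and the static path used different granularities, their quantization error would differ for reasons unrelated to static vs. dynamic activation scaling, which is what the fix avoids.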
torchao/prototype/smoothquant/core.py (4 changes: 2 additions & 2 deletions)

@@ -52,7 +52,7 @@ def calculate_qparams(self, weight_quant_kwargs=None):
         inputs = [inp.to(self.device) for inp in self.inputs]
         acc = torch.cat(inputs, dim=0)
         # Reshape if needed: [batch, seq, features] -> [batch*seq, features]
-        temp = acc
+        example_input_for_quantization = acc
         if acc.ndim > 2:
             acc = acc.view(-1, acc.shape[-1])

@@ -71,7 +71,7 @@ def calculate_qparams(self, weight_quant_kwargs=None):

         if weight_quant_kwargs is not None:
             quant_smooth_activation = _choose_quant_func_and_quantize_tensor(
-                temp / smoothing_factor, weight_quant_kwargs
+                example_input_for_quantization / smoothing_factor, weight_quant_kwargs
             )
             return smoothing_factor, quant_smooth_activation.scale
         else:
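The hunks above rename the cached pre-smoothing activation and quantize it after dividing by the smoothing factor. For context, a minimal sketch of the standard SmoothQuant recipe as described in the SmoothQuant paper (not torchao's implementation):

```python
# Per input channel j, the SmoothQuant smoothing factor is
#   s_j = max|X_j| ** alpha / max|W_j| ** (1 - alpha)
# Activations are divided by s and weights multiplied by s, shifting
# quantization difficulty from activations onto weights.
def smoothing_factor(act_absmax, weight_absmax, alpha=0.5):
    return [
        (a ** alpha) / (w ** (1.0 - alpha))
        for a, w in zip(act_absmax, weight_absmax)
    ]

# Channel 0 has a large activation outlier (8.0) and a small weight
# range (0.5), so it gets a large smoothing factor.
s = smoothing_factor([8.0, 2.0], [0.5, 2.0], alpha=0.5)
```

Dividing the example input by `smoothing_factor` before quantization, as the diff does, means the computed static activation scale matches the smoothed activations the model will actually see at inference time.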
torchao/quantization/quantize_/workflows/int8/int8_tensor.py (12 changes: 5 additions & 7 deletions)

@@ -3,7 +3,7 @@
 #
 # This source code is licensed under the BSD 3-Clause license found in the
 # LICENSE file in the root directory of this source tree.
-
+import math
 from dataclasses import dataclass
 from typing import List, Optional

@@ -140,12 +140,10 @@ def from_hp(
     else:
         # Scale can be provided in the case of static quant
        assert scale.ndim == hp_tensor.ndim
-        # if isinstance(granularity, PerTensor):
-        #     assert scale.numel() == 1
-        # elif isinstance(granularity, PerRow):
-        #     breakpoint()
-        #     assert scale.numel() == block_size[-1]
-
+        num_expected_values = math.prod(
+            [num_dim // bs for (bs, num_dim) in zip(block_size, hp_tensor.shape)]
+        )
+        assert scale.numel() == num_expected_values
        zero_point = torch.zeros_like(scale, dtype=torch.int8)

        int_data = quantize_affine(
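This is the "fix static scale check" commit itself: the commented-out per-granularity branches (including a leftover `breakpoint()`) are replaced by a single formula that validates the provided static scale for any block size. A standalone sketch of the same arithmetic:

```python
import math

# Mirrors the new check: the expected number of scale values is the
# product over dimensions of (dim_size // block_size).
def expected_scale_numel(shape, block_size):
    return math.prod(d // b for b, d in zip(block_size, shape))

shape = (128, 256)
print(expected_scale_numel(shape, (128, 256)))  # per-tensor block: 1 scale
print(expected_scale_numel(shape, (1, 256)))    # per-row block: 128 scales
```

One formula covers per-tensor, per-row, and any future blockwise granularity, which is why the hard-coded `PerTensor`/`PerRow` branches could be deleted.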
