I tried the following, and it seems to break right now:
```
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4 --groupsize 64
```

Output:

```
Loading model ...
Quantizing model weights for int4 weight-only affine per-channel groupwise quantization
linear: layers.0.attention.wqkv, in=4096, out=6144
Traceback (most recent call last):
  File "/data/users/jerryzh/gpt-fast/quantize.py", line 622, in <module>
    quantize(args.checkpoint_path, args.mode, args.groupsize, args.calibration_tasks, args.calibration_limit, args.calibration_seq_length, args.pad_calibration_inputs, args.percdamp, args.blocksize, args.label)
  File "/data/users/jerryzh/gpt-fast/quantize.py", line 569, in quantize
    quantized_state_dict = quant_handler.create_quantized_state_dict()
  File "/home/jerryzh/.conda/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/users/jerryzh/gpt-fast/quantize.py", line 433, in create_quantized_state_dict
    weight_int4pack, scales_and_zeros = prepare_int4_weight_and_scales_and_zeros(
  File "/data/users/jerryzh/gpt-fast/quantize.py", line 363, in prepare_int4_weight_and_scales_and_zeros
    weight_int4pack = torch.ops.aten._convert_weight_to_int4pack(weight_int32, inner_k_tiles)
  File "/home/jerryzh/.conda/envs/sglang/lib/python3.10/site-packages/torch/_ops.py", line 1123, in __call__
    return self._op(*args, **(kwargs or {}))
RuntimeError: Expected in.dtype() == at::kByte to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
```

It's probably because of @yanbing-j's recent refactors, but I'm not sure whether we want to migrate to torchao's quant at some point, so I'm not sure it's worth fixing now.
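For context on the error: `at::kByte` is uint8, so the failing check suggests `torch.ops.aten._convert_weight_to_int4pack` now expects the int4 weight values pre-packed two-per-byte into a uint8 tensor, while `quantize.py` is still passing an int32 tensor. A minimal sketch of that kind of nibble packing in plain Python (the helper name and the choice of which value goes in the high nibble are assumptions for illustration, not the op's actual internal layout):

```python
def pack_int4_to_uint8(values):
    """Pack pairs of 4-bit values (each in 0..15) into single bytes.

    Hypothetical helper: the second value of each pair goes into the
    high nibble. The real packing layout expected by
    torch.ops.aten._convert_weight_to_int4pack is an internal detail
    and may differ.
    """
    assert len(values) % 2 == 0, "need an even number of int4 values"
    out = []
    for lo, hi in zip(values[0::2], values[1::2]):
        assert 0 <= lo <= 15 and 0 <= hi <= 15, "values must fit in 4 bits"
        out.append((hi << 4) | lo)  # two int4 values share one uint8
    return bytes(out)
```

If this is indeed the cause, the fix in `prepare_int4_weight_and_scales_and_zeros` would be to hand the op a uint8 tensor packed along these lines instead of `weight_int32`.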