mixtral-moe example fails to torch.compile #232

Open

Description

zou3519 opened on Jul 23, 2025

pytorch-triton           3.4.0+git11ec6354
torch                    2.9.0.dev20250723+cu128
torchaudio               2.8.0.dev20250723+cu128
torchvision              0.24.0.dev20250723+cu128

Repro:

CUDA_VISIBLE_DEVICES=1 python generate.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --prompt "The capital of France is:" --num_samples=1 --compile

Gives the following:

RuntimeError: Error: accessing tensor output of CUDAGraphs that has been overwritten by a subsequent run. Stack trace:
  File "/home/rzou/dev/lab/gpt-fast/mixtral-moe/generate.py", line 68, in decode_one_token
    return sample(logits, **sampling_kwargs)
  File "/home/rzou/dev/lab/gpt-fast/mixtral-moe/generate.py", line 56, in sample
    idx_next = multinomial_sample_one_no_sync(probs)
  File "/home/rzou/dev/lab/gpt-fast/mixtral-moe/generate.py", line 42, in multinomial_sample_one_no_sync
    return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)
To prevent overwriting, clone the tensor outside of torch.compile() or call torch.compiler.cudagraph_mark_step_begin() before each model invocation.
