
Commit f32562c

PaulZhang12 authored and facebook-github-bot committed

[Inductor] Add envvar to disable decomposeK (#154421)

Summary:
Pull Request resolved: #154421

Add an env var to the Inductor config to disable the decomposeK autotuning choice.

Test Plan: `buck test 'fbcode//mode/opt' fbcode//caffe2/test/inductor:max_autotune -- --exact 'caffe2/test/inductor:max_autotune - test_max_autotune_decompose_k_dynamic_False_sizes2 (caffe2.test.inductor.test_max_autotune.TestMaxAutotune)' --run-disabled`

Reviewed By: eellison

Differential Revision: D75174823

1 parent 0db9c64, commit f32562c
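As a usage note (not part of the commit): the diffs in this commit add a kill switch that can be set either through the `TORCHINDUCTOR_DISABLE_DECOMPOSE_K` environment variable or through `torch._inductor.config` (the new test uses `config.patch(disable_decompose_k=True)`). The following stdlib-only sketch mirrors the env-var parsing added in `torch/_inductor/config.py`, so it runs without a PyTorch build; it is an illustration of the parsing rule, not the Inductor code itself.

```python
import os

# The commit's kill switch: setting the variable to "1" disables the
# decomposeK autotuning choice for GEMMs.
os.environ["TORCHINDUCTOR_DISABLE_DECOMPOSE_K"] = "1"

# Mirrors the config line added in torch/_inductor/config.py: only the
# exact string "1" enables the flag; unset or any other value leaves it off.
disable_decompose_k = os.environ.get("TORCHINDUCTOR_DISABLE_DECOMPOSE_K") == "1"
print(disable_decompose_k)  # True
```

Note that, as with other `TORCHINDUCTOR_*` variables parsed this way, the flag is read when the config module is imported, so the variable should be set before importing `torch`.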

File tree

3 files changed: +27 −0 lines changed

test/inductor/test_max_autotune.py

Lines changed: 23 additions & 0 deletions

```diff
@@ -1366,6 +1366,29 @@ def test_func3(x, y, z, m, l):
         self.assertEqual(hits(), 0)
         self.assertEqual(misses(), 7)
 
+    @skipIfXpu
+    @unittest.skipIf(TEST_WITH_ROCM, "decompose_k not supported on ROCm")
+    @unittest.skipIf(
+        config.cpp_wrapper, "decompose_k not supported for cpp_wrapper yet"
+    )
+    @config.patch(
+        max_autotune=True,
+        max_autotune_gemm_backends="TRITON",
+        autotune_fallback_to_aten=False,
+        disable_decompose_k=True,
+    )
+    def test_max_autotune_disable_decompose_K(self):
+        M, N, K = (32, 32, 32768)
+
+        a = torch.randn(M, K, dtype=torch.float16, device="cuda", requires_grad=True)
+        b = torch.randn(K, N, dtype=torch.float16, device="cuda", requires_grad=True)
+
+        compiled_func = torch.compile(lambda a, b: a @ b)
+        out, code = run_and_get_code(compiled_func, a, b)
+
+        for codegen in code:
+            FileCheck().check_not("decompose_k").run(codegen)
+
 
 class TestMaxAutotunePrecompile(TestCase):
     def test_precompilation_threads(self):
```

torch/_inductor/config.py

Lines changed: 3 additions & 0 deletions

```diff
@@ -396,6 +396,9 @@ def prologue_fusion_enabled() -> bool:
 # enable slow autotuning passes to select gemm algorithms
 max_autotune_gemm = os.environ.get("TORCHINDUCTOR_MAX_AUTOTUNE_GEMM") == "1"
 
+# disable decomposeK autotune choice for gemm
+disable_decompose_k = os.environ.get("TORCHINDUCTOR_DISABLE_DECOMPOSE_K") == "1"
+
 # Modifies the number of autotuning choices displayed, set to None for all
 autotune_num_choices_displayed: Optional[int] = 10
```

torch/_inductor/utils.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -1595,6 +1595,7 @@ def use_decompose_k_choice(m: _IntLike, n: _IntLike, k: _IntLike) -> bool:
         )
         and not V.graph.aot_mode  # TODO: Support AOTI for decomposeK
         and not V.graph.cpp_wrapper
+        and not config.disable_decompose_k
     )
```

0 commit comments
