TMA pointwise scheduler tests #5565
base: main
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
liqiangxl commented Nov 20, 2025
!test
github-actionsbot commented Nov 20, 2025 • edited by xwang233
Review updated until commit a21ff37
Description
| Relevant files | |
|---|---|
| Enhancement | 9 files |
| Tests | |
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
| Debug Output |
Test failures
(Medium, 90)
NVFuser internal assert (Unknown tensor map data type) in test_direct_ops opinfo suite

| Test Name | GB200 / H100 | Source |
|---|---|---|
| `tests.python.direct.test_repro.test_issue1277` | ❌ ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_abs_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_abs_complex64` | ❌ ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_acos_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_acos_complex64` | ❌ ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_acosh_complex128` | ❌ ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_acosh_complex64` | ❌ ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_add_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_add_complex64` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_asin_complex128` | ❌ | |

... with 57 more test failures omitted. Check internal logs.

(Medium, 46)
NVFuser internal assert: Unknown tensor map data type on complex dtype ops (opinfo direct & UnaryTests)

| Test Name | GB200 / H100 | Source |
|---|---|---|
| `UnaryTests/UnaryTest.Neg/std__complex_float_` | ❌ ❌ | Link |
| `tests.python.opinfo.test_direct_ops.test_correctness_abs_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_acos_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_add_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_add_complex64` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_asin_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_asin_complex64` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_asinh_complex128` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_asinh_complex64` | ❌ | |
| `tests.python.opinfo.test_direct_ops.test_correctness_atan_complex128` | ❌ | |

... with 35 more test failures omitted. Check internal logs.

(Medium, 12)
NVFuser internal assertion failures in BlockQuantizationSchedulingTestSuite and MatmulSchedulerTest

| Test Name | GB200 | Source |
|---|---|---|
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/__bfloat_1024x1024_WithGlobalScale_NoSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/__bfloat_128x64_NoGlobalScale_WithSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/__bfloat_2048x128_NoGlobalScale_NoSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/__bfloat_2048x128_WithGlobalScale_WithSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/__bfloat_2048x2048_WithGlobalScale_NoSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/float_1024x1024_NoGlobalScale_NoSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/float_1024x1024_WithGlobalScale_WithSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/float_128x64_WithGlobalScale_NoSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/float_2048x128_NoGlobalScale_WithSwizzle` | ❌ | Link |
| `BlockQuantizationSchedulingTestSuite/BlockQuantizationSchedulingTest.AutoScheduleSingleOp/float_2048x2048_NoGlobalScale_NoSwizzle` | ❌ | Link |

... with 2 more test failures omitted. Check internal logs.

(Medium, 9)
Multiple NVFuser internal assertion failures across grouped_mm, multidevice matmul/transformer, and thunderfx MoE tests

| Test Name | GB200 / GB200 (dist.) / H100 / H100 (dist.) | Source |
|---|---|---|
| `tests.python.direct.test_with_id_model_indexer.test_layout_op_and_cutlass_nvfp4_grouped_mm[out_dtype=torch.bfloat16-tokens_per_expert_neg_one=[115, 144, 8]-config=[1024, 128, 256]]` | ❌ | |
| `tests.python.multidevice.test_matmul.test_linear_reduce_scatter` | ❌ ❌ | |
| `tests.python.multidevice.test_matmul.test_sequence_parallel_linear` | ❌ ❌ | |
| `tests.python.multidevice.test_transformer.test_grouped_mlp` | ❌ ❌ | |
| `tests.python.test_moe.test_llama4_moe_thunderfx` | ❌ ❌ | |

(Medium, 8)
NVFuser TMA analysis internal asserts (merge-discontiguous / extent divisibility) in PointwiseTest, ResizeTest, matmul_stride, and issue1953 suites

| Test Name | GB200 | H100 | Source |
|---|---|---|---|
| `PointwiseTest.Issue1567VectorizationFactorAnalysisCase3` | ❌ | ❌ | Link |
| `ResizeTest.PadAndCacheUses` | ❌ | ❌ | Link |
| `tests.python.direct.test_matmul.test_matmul_stride` | ❌ | ❌ | |
| `tests.python.direct.test_repro.test_issue1953` | ❌ | ❌ | |

(Medium, 2)
nvFuser internal input-size assert in test_schedule_ops::TestScheduleOps.test_concretize_reshape_pointwise

| Test Name | GB200 | H100 | Source |
|---|---|---|---|
| `tests.python.test_schedule_ops.TestScheduleOps.test_concretize_reshape_pointwise` | ❌ | ❌ | |

(Medium, 2)
nvFuser split-after-parallelization assertion in multidevice transformer tests

| Test Name | GB200 | H100 | Source |
|---|---|---|---|
| `tests.python.multidevice.test_transformer.test_grouped_mlp` | ❌ | ❌ | |

(Medium, 2)
Heuristic string mismatch in test_tutorial_compute_heuristics_and_schedule

| Test Name | GB200 | H100 | Source |
|---|---|---|---|
| `tests.python.direct.test_tutorial.test_tutorial_compute_heuristics_and_schedule` | ❌ | ❌ | |

(Medium, 1)
nvFuser pointwise heuristic unroll factor mismatch in PointwiseTest

| Test Name | GB200 | Source |
|---|---|---|
| `PointwiseTest.Heuristicst1Compute2Unroll4` | ❌ | Link |
liqiangxl commented Nov 20, 2025
!test
liqiangxl commented Nov 21, 2025
!test
liqiangxl commented Nov 21, 2025
!test
liqiangxl commented Nov 21, 2025
!test
enable and run ci tests