viable/strict/1766042748

Optimize Triton template heuristics (#170444)

Summary:
This diff contains three small optimizations:
1) Directly cache the triton Config object import. Not a huge win, but measurably faster than relying on importlib's cache.
2) Only copy configs when the new value is different from the old one. Configs are fairly large objects, so unnecessary dict copies get expensive.
3) Replace `gcd(k, BLOCK_K) == BLOCK_K` with `(k % BLOCK_K) == 0`. This is equivalent when `BLOCK_K > 0`, which must be true.

Test Plan:
```
tlp buck run mode/opt //scripts/paulzhan:repro
```
and then looking at perfetto.

Differential Revision: D88415189
Pull Request resolved: #170444
Approved by: https://github.com/PaulZhang12, https://github.com/eellison, https://github.com/shunting314
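The equivalence claimed in point 3, and the copy-only-on-change idea from point 2, can be checked with a small standalone sketch. The helper names (`divides_gcd`, `divides_mod`, `with_kwarg`) and the plain dict standing in for a config are hypothetical and are not taken from the PR; this is an illustration of the reasoning, not the Inductor code itself.

```python
from math import gcd


def divides_gcd(k: int, block_k: int) -> bool:
    """Original form of the check: BLOCK_K divides k iff gcd(k, BLOCK_K) == BLOCK_K."""
    return gcd(k, block_k) == block_k


def divides_mod(k: int, block_k: int) -> bool:
    """Replacement form: a single modulo, valid because BLOCK_K > 0."""
    return k % block_k == 0


def with_kwarg(config: dict, key: str, value: object) -> dict:
    """Return a config where config[key] == value, copying only when needed.

    `config` is a plain dict standing in for a much larger config object.
    """
    if key in config and config[key] == value:
        return config  # unchanged: reuse the same object, no copy
    updated = dict(config)
    updated[key] = value
    return updated


# Compare both checks exhaustively on a small range of positive block sizes.
assert all(
    divides_gcd(k, b) == divides_mod(k, b)
    for k in range(0, 257)
    for b in range(1, 129)
)

base = {"BLOCK_K": 32, "num_warps": 4}
same = with_kwarg(base, "BLOCK_K", 32)     # value unchanged, so no copy is made
changed = with_kwarg(base, "BLOCK_K", 64)  # value differs, so a copy is made
```

Here `same is base` holds, while `changed` is a new dict and `base` is left untouched.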

Dec 18, 2025
863d0eb

viable/strict/1766040973

Shorten the file names in libtorch_agnostic tests (#170664)

To fix
```
ninja: error: Stat(C:/actions-runner/_work/pytorch/pytorch/test/cpp_extensions/libtorch_agnostic_2_10_extension/build/temp.win-amd64-cpython-310/Release/actions-runner/_work/pytorch/pytorch/test/cpp_extensions/libtorch_agnostic_2_10_extension/libtorch_agnostic_2_10/csrc/get_any_data_ptr.obj): Filename longer than 260 characters
```
in #170564

Pull Request resolved: #170664
Approved by: https://github.com/mikaylagawarecki
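A check like the following could flag such paths before a build reaches ninja. This is a minimal sketch: the function name `too_long` is hypothetical, and the limit is taken literally from the error text ("longer than 260 characters"), which is an assumption about where exactly the cutoff lies.

```python
# Limit taken from the ninja error message above (assumed cutoff).
MAX_PATH = 260


def too_long(paths: list[str], limit: int = MAX_PATH) -> list[str]:
    """Return the paths that are longer than `limit` characters."""
    return [p for p in paths if len(p) > limit]


# Synthetic example paths, not real build outputs.
candidates = [
    "C:/work/build/" + "nested_dir/" * 30 + "get_any_data_ptr.obj",
    "C:/work/build/short.obj",
]
flagged = too_long(candidates)
```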

Dec 18, 2025
c047f39

viable/strict/1766033750

[17/N] Use Python 3.10 typing (#169735)

This PR fixes typing of accelerator files.

Pull Request resolved: #169735
Approved by: https://github.com/albanD

Dec 18, 2025
70971ea

v2.10.0-rc2

[c10d] Add thread safety when calling ncclCommGetAsyncError (#170633)

[c10d] Add thread safety when calling ncclCommGetAsyncError (#170424)

Fixes #169484

Pull Request resolved: #170424
Approved by: https://github.com/kwen2501

(cherry picked from commit 9d0d198)

Co-authored-by: Rohit Singh Rathaur <rrathaur@redhat.com>

Dec 18, 2025
ffe4a52

trunk/7031901e40749c8761d30d4f20bbe9ed3a9285c9

[BE][Inductor] Move bmm template into separate file (#170482)

Summary:
The inductor kernel files embed multiple Jinja templates inline, making them harder to read and maintain. This change switches bmm.py to using `load_kernel_template()`, placing each template in its own file and restoring proper Jinja syntax highlighting.

To add a new template named, for example, new_mm, place the Jinja code in _inductor/kernel/templates/new_mm.py.jinja, then just call load_template("new_mm").

Test Plan: CI

Differential Revision: D89233930
Pull Request resolved: #170482
Approved by: https://github.com/jananisriram
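The one-file-per-template layout described above can be illustrated with a small stand-in loader. This is a sketch only: `make_loader`, the caching choice, and the placeholder template body are all hypothetical, and the real Inductor helper may work differently. The only detail taken from the commit message is the `<name>.py.jinja` file naming.

```python
import tempfile
from functools import lru_cache
from pathlib import Path


def make_loader(templates_dir: Path):
    """Build a cached loader that reads `<name>.py.jinja` from templates_dir."""

    @lru_cache(maxsize=None)
    def load(name: str) -> str:
        # Each template lives in its own file, named after the template.
        return (templates_dir / f"{name}.py.jinja").read_text(encoding="utf-8")

    return load


# Demo with a throwaway directory standing in for the templates folder.
with tempfile.TemporaryDirectory() as tmp:
    templates = Path(tmp)
    (templates / "new_mm.py.jinja").write_text(
        "{# placeholder template body #}\n", encoding="utf-8"
    )
    load = make_loader(templates)
    source = load("new_mm")
```

Keeping templates in separate `.jinja` files lets editors apply Jinja syntax highlighting, which is the readability benefit the commit message describes.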

Dec 18, 2025
7031901

trunk/392330c7f29afad69b5935d7dd4d3e802f40f507

[audio hash update] update the pinned audio hash (#170727)

This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml).
Update the pinned audio hash.

Pull Request resolved: #170727
Approved by: https://github.com/pytorchbot

Dec 18, 2025
392330c

trunk/70971eabdcd2d92efc29a2d70aac85f2096b9042

[17/N] Use Python 3.10 typing (#169735)

This PR fixes typing of accelerator files.

Pull Request resolved: #169735
Approved by: https://github.com/albanD

Dec 18, 2025
70971ea

trunk/863d0ebb5c3401f8d2f88e8946511784ba0b41ab

Optimize Triton template heuristics (#170444)

Summary:
This diff contains three small optimizations:
1) Directly cache the triton Config object import. Not a huge win, but measurably faster than relying on importlib's cache.
2) Only copy configs when the new value is different from the old one. Configs are fairly large objects, so unnecessary dict copies get expensive.
3) Replace `gcd(k, BLOCK_K) == BLOCK_K` with `(k % BLOCK_K) == 0`. This is equivalent when `BLOCK_K > 0`, which must be true.

Test Plan:
```
tlp buck run mode/opt //scripts/paulzhan:repro
```
and then looking at perfetto.

Differential Revision: D88415189
Pull Request resolved: #170444
Approved by: https://github.com/PaulZhang12, https://github.com/eellison, https://github.com/shunting314

Dec 18, 2025
863d0eb

trunk/614ff1a63ed8b4056ce9b9a9bafd2f15e8eb06a4

Skip failing tests on xpu with complex dtype on windows (#165049)

Fixes intel/torch-xpu-ops#1195

On xpu we use the std:: implementation of trig kernels. The issue comes from differences in how compiler headers implement trigonometric functions on complex dtypes. The Windows compiler implementation is not conformant with ISO 9899. For example, the following code
```
#include <cmath>
#include <complex>
#include <iostream>
#include <limits>

int main() {
  std::complex<float> x(std::numeric_limits<float>::infinity(),
                        std::numeric_limits<float>::infinity());
  std::cout << std::sinh(x) << std::endl;
}
```
Compiled with g++: `(inf,-nan)`
Compiled with msvc: `(inf,inf)`

While ISO 9899 clearly says:
> csinh(+∞ + i∞) returns ±∞ + iNaN (where the sign of the real part of the result is unspecified) and raises the ‘‘invalid’’ floating-point exception

These tests use numpy as the reference, and numpy is implemented according to ISO 9899, hence those tests fail on Windows.

The same failures can be observed on cpu, and those tests are skipped there. I propose we do the same for xpu. (intel/torch-xpu-ops#1195 (comment))

Pull Request resolved: #165049
Approved by: https://github.com/guangyey, https://github.com/EikanWang, https://github.com/albanD

Dec 18, 2025
614ff1a

trunk/57e5b3769c8d58f45e0a742f2f157c1a41f0a654

[CI] Swap TPUs from v6 to v7 (#170690)

Fixes #ISSUE_NUMBER

Pull Request resolved: #170690
Approved by: https://github.com/seemethere

Dec 18, 2025
57e5b37

Tags: pytorch/pytorch