Enable CPU fused kernel on Windows #25578
Conversation
Immocat commented Sep 6, 2019 • edited
Thank you @peterjc123 for the implementation. I am writing a Unity native plugin (C++) on Windows to run neural-net inference every frame, and CPU-only is indeed much slower without this feature. I tried the plugin with CUDA LibTorch, but Unity crashes at the exact line that does the inference, so I think I will give up on the CUDA version. My question is: is this feature finished in your fork branch? If you've already finished it, may I try to build it from your last commit (on Windows, CPU only)? Thank you very much!
peterjc123 commented Sep 7, 2019
@Immocat No, it is still in an early stage. There are some difficulties that I have to tackle before it can be merged into master.
For the CUDA JIT fusion conflicts, maybe you could try building the static version of LibTorch. Below are the steps:

```cmd
:: Essential
set BUILD_SHARED_LIBS=OFF

:: [Optional] If you want to build with the VS 2019 generator, change the value in the next line to `Visual Studio 16 2019`.
:: Note: This value is ignored if Ninja is detected. However, you can force its use with `set USE_NINJA=OFF`.
set CMAKE_GENERATOR=Visual Studio 15 2017

:: Read the content in the previous section carefully before you proceed.
:: [Optional] If you want to override the underlying toolset used by Ninja and Visual Studio with CUDA, run the following script block.
:: "Visual Studio 2017 Developer Command Prompt" will be run automatically.
:: Make sure you have CMake >= 3.12 before you do this when you use the Visual Studio generator.
:: It's an essential step if you use Python 3.5.
set CMAKE_GENERATOR_TOOLSET_VERSION=14.11
set DISTUTILS_USE_SDK=1
for /f "usebackq tokens=*" %i in (`"%ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe" -version [15^,16^) -products * -latest -property installationPath`) do call "%i\VC\Auxiliary\Build\vcvarsall.bat" x64 -vcvars_ver=%CMAKE_GENERATOR_TOOLSET_VERSION%

:: [Optional] If you want to override the CUDA host compiler
set CUDAHOSTCXX=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.11.25503\bin\HostX64\x64\cl.exe

python tools\build_libtorch.py
```
peterjc123 force-pushed from 491b8cf to c4731a1

peterjc123 commented Sep 7, 2019 • edited
The basic functionality is working now. However, there are still some points to improve:
peterjc123 force-pushed from 854deda to 56fe412

ezyang commented Sep 9, 2019
This is nifty stuff. Let us know if there is stuff we can do to help move it along.
peterjc123 commented Sep 9, 2019
@ezyang Could you please tell me where the JIT frontend is? That is, how can I disable it on the Python side?
ezyang commented Sep 9, 2019
Are you talking about the TorchScript compiler? It's not really disableable; when you request a function to be compiled for TorchScript, we recursively collect the source code reachable from it and compile it. Maybe you could tell us more about what's going on? cc @suo for perhaps more comments
peterjc123 commented Sep 10, 2019 • edited
@ezyang
The following is what I want to do now. First, I want to add a check for the VS environment before every JIT fuse call. If it is not activated, we will try to activate it; if we cannot find it at all, we will skip the fusion step. Do you know where I should add this code?
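A rough sketch of the detection step described above (the helper name and return contract here are illustrative, not the PR's actual code; the `vswhere.exe` path is the one from the build script earlier in the thread):

```python
import os
import shutil

# Hypothetical sketch: decide whether the CPU fuser can compile at all by
# checking for the MSVC compiler before a fusion pass runs.
VSWHERE = r"%ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe"

def activate_vs_env():
    """Return True if cl.exe is (or could be made) available, False otherwise."""
    # 1. Already activated? Inside a VS developer prompt, cl.exe is on PATH.
    if shutil.which("cl") is not None:
        return True
    # 2. Not activated: try to locate a VS installation via vswhere.
    #    A real implementation would then run vcvarsall.bat to set up the env.
    vswhere = os.path.expandvars(VSWHERE)
    if os.path.exists(vswhere):
        return True
    # 3. No compiler found: the caller should skip the fusion step entirely.
    return False
```

On a non-Windows machine (or a Windows box without Visual Studio) this returns `False`, which is exactly the "skip fusion" fallback the comment describes.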
ezyang commented Sep 10, 2019
Ah yes, I forgot about that. That will indeed turn off JIT globally; it's meant as an easy way to turn off script if you're debugging an issue without having to edit source code.
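For reference, the global switch being discussed here is presumably the `PYTORCH_JIT` environment variable; it has to be set before `torch` is imported:

```python
import os

# With PYTORCH_JIT=0, @torch.jit.script-ed functions run through the normal
# Python interpreter instead of being compiled to TorchScript. It must be set
# before the first `import torch`.
os.environ["PYTORCH_JIT"] = "0"

# import torch  # left commented so this sketch runs without torch installed
```

Equivalently, `set PYTORCH_JIT=0` (cmd) or `PYTORCH_JIT=0 python script.py` (sh) before launching the process.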
Actually, fusion can apply to trace too. Trace versus script refers to different ways of getting the IR in question; trace means we run your program and record what happened; script means we parse the literal program text. The IR can be fused in both cases.
I am not aware of any Windows specific behavior for torch.jit.trace, and we don't seem to have any macros on MSVC that would affect this.
For CPU, it's going to be somewhere like
peterjc123 commented Sep 11, 2019
@pytorchbot rebase this please
peterjc123 commented Sep 13, 2019
@xsacha Sure, it should be fairly easy to support Clang or other compilers, but not in this PR, and we may need a code refactor, otherwise the code will look messy. As for Android, I think it should be the same as on the desktop OSes: use the interpreter to run the operators with JIT script, and for JIT fusion only GCC is supported.
xsacha commented Sep 13, 2019
I'm just worried about the fact that we need a compiler on deployed systems where we do inference (the JIT is only for inference, right?).
peterjc123 commented Sep 13, 2019
@xsacha Yes, I agree with you that we should use some lightweight cross-platform compiler like llvmlite, which numba uses.
ezyang left a comment
This is very nice work. Inclusion of LGPL code is a blocker; we'll have to find an implementation somewhere else. I think my only other major concern is in-place mutation of environment variables in process.
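One conventional way to address the in-place environment mutation concern (a generic sketch, not the PR's actual code) is to snapshot the environment before activating the VS toolchain and restore it afterwards:

```python
import os
from contextlib import contextmanager

@contextmanager
def scoped_environ():
    """Restore os.environ to its prior state on exit, so activating the
    VS environment does not permanently mutate the host process."""
    saved = os.environ.copy()
    try:
        yield os.environ
    finally:
        os.environ.clear()
        os.environ.update(saved)
```

Usage: `with scoped_environ() as env: env["LIB"] = ...` keeps any variables set by `vcvarsall.bat` confined to the block; everything is rolled back when it exits, even on error.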
Commits:
- bug fix
- Enable the jit tests on Windows
- More fixes
- Fix tempfile for Windows
- More fixes
- Minor fixes
- add header
- lint changes
- Debugging stuff.....
- dllexport
- Change working dir to make git clean
- Cleanup
- Remove useless print
- Fix lint
- more lint fixes
- More fixes
- Fix comments.
peterjc123 commented Sep 16, 2019
@pytorchbot rebase this please
peterjc123 commented Sep 16, 2019
@ezyang Could you please take some time to review this PR?
facebook-github-bot left a comment
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
No description provided.