Releases: gorgonia/cu
CUDA 12 Support (Windows)
8ef88f2
What's Changed
CUDA 12 now works on Windows as well. Thanks to Mike (@hunjixin).
New Contributors
Full Changelog: v0.9.5...v0.9.6
CUDA 12 Support
498bd6b
What's Changed
- Incorrect variable passed in cuLaunchAndSync by @tkunicki in #58
- Fixed install command by @MarvinJWendt in #59
- Add CUDA 11.8 and alternative means of setting up cgo on Windows by @dalva24 in #60
- CUDA 12 by @neurlang in #69
New Contributors
- @tkunicki made their first contribution in #58
- @MarvinJWendt made their first contribution in #59
- @dalva24 made their first contribution in #60
- @neurlang made their first contribution in #69
Full Changelog: v0.9.4...v0.9.5
CUDA11 supported
a41082c
- CUDA11 initial work. First, we generate the new enums
- Added generateEnums, which generates the Go version of the CUresult type
- Updated tests such that they no longer fail. Added a Signal() method to BatchedContext, to force the BatchedContext to DoWork
- Updated benchmarking of batched vs no batched context. It would appear that for now Batching no longer confers a benefit
- Attempt #4 at getting CUDA11. Previous attempts were working based off a faulty copy of `cuda.h`
  - Updated Device to support UUID
  - Updated README
  - Updated genlib to do more things more carefully
- More work on CUDA11
  - Added more mappings into mappings.go to generate stuff
  - Changed the definition of Context, by adding one additional method to clear L2Cache
  - Added stubs for LaunchCooperativeKernel
  - Added Graph types. TODO next: add all the basic Graph data structure and then autogenerate all the things!
- Fixed mappings to also include @egonelbre's change in 2e25e65507. Fixed a bug where Fix() wasn't called, leading to weird generations
- Added some graph stuff, fixed some mappings stuff for genAPI. It seems that the graph functions will have to be manually written for now
- Updated graph.go from ages ago
- Updated more of CUDA11 Graph API into the library. Slowly getting there.
- Added the body of CopyParams
- Added AddMemsetNode method for Graph.
- Fixed a bunch of things
- Switched to modernc.org/cc instead of using the older github.com/cznic/cc
- cuDNN updated their website, so parse.py also has to change. As a result moredecls.go also changed
- Sorted the data in mappings.go. This will allow for better diffing
- Updated the generatethis pipeline
- Initial mappings generation.
- Mapped the old commented out mappings to new commented out mappings (see mappings.ods)
- Generated enums.
- Updated enums and enum strings
- Added more generated data structures
- Added methods
- Generated stubs. 7 TODOs
- Added more incompletes report
- Manually fixed the TODO of SpatialTransformer
- Manually fixed generated_rnndata.go
- Manually fixed generated_seqdata.go
- Manually fixed generated_backend.go
- Manually fixed generated_tensortransform.go
- Fixed the missing getters
- Fixed all the .C()s of the generated types
- Generated a new API
- Fixed random C int issues. Now to handle the rest
- Updated INCOMPLETES_REPORTS
- Fixed variable collision in _BackendAttributeTypeNames
- gencudnn enum generation syntax fixes added
- Updated INCOMPLETES
- Variable renaming added as per the review
- AlgorithmDescriptor syntax fixes added
- AlgorithmPerformance syntax fixes added
- Activation cudnnActivationDescriptor_t return method name change added
- Syntax fixes added on FusedOpVariantParams
- FusedOpConsts syntax fixes added
- C type retrieve function added for cudnnStatus
- Tensor file syntax fixes added; tensor file unreachable code removed
- Method receiver renaming added
- optensor syntax fixes added
- generated_api syntax fixes added
- Code review changes added
- Go modules updated; algorithmdescriptor Algorithm type changes added
- Review changes added; GetRNNLinLayerBiasParams & GetRNNLinLayerMatrixParams methods moved to manually written API.go file
- Fixed a bug in parse.py where when parsing the documentation for CUDA11, the function names have `()`
- Removed deprecated functions from being generated
- More deprecated stuff no longer generated
- Fixed up algorithmdescriptor.go
- Fixed some auto generated issues
- Manually fixed the fused ops generation
- Fixed even more autogenerated errors
- Fixed up more of the auto generated issues
- Renamed API to todo, because eh, I'll figure it out later

Co-authored-by: Aruna Prabhashwara <wg.aruna.p@gmail.com>
CUDA 10.2 supported
v0.9.3
Added some more documentation and support for CUDA 10.2.
New CUDA versions supported
Fixed the convolution.c import.
Use CUDA 10.1.
v0.9.1
4f793ce
v0.9.0 never got out of beta, and Go modules did not handle it well. This release fixes that.
Beta release of v0.9.0
a49599f
Pre-release
Features:
- CUDA 9 support
- CuDNN 7 support
- JIT support (thanks @egonelbre)
- nvRTC support (thanks @egonelbre)
- Full CUBLAS support
- Move towards a unified generation method
- Various API changes
- Various fixes (@egonelbre)
- Bug fixes (thanks to @egonelbre):
  - CString not freed
v0.8.0
Merge remote-tracking branch 'origin/master'