pytorch/pytorch

Mobile Support, Named Tensors, Quantization, Type Promotion and many more

nairbv released this 10 Oct 17:26

· 2 commits to v1.3.0 since this release

v1.3.0

de394b6

Breaking Changes
Highlights
- [Experimental]: Mobile Support
- [Experimental]: Named Tensor Support
- [Experimental]: Quantization support
- Type Promotion
- Deprecations
New Features
- TensorBoard: 3D Mesh and Hyperparameter Support
- Distributed
- Libtorch Binaries with C++11 ABI
- New TorchScript features
Improvements
- C++ Frontend Improvements
  - Autograd
  - New torch::nn modules
  - New torch::nn::functional functions
  - tensor Construction API
  - Other C++ Improvements
- Distributed Improvements
- Performance Improvements
- JIT Improvements
- ONNX Exporter Improvements
  - Adding Support for ONNX IR v4
  - Adding Support for ONNX Opset 11
  - Exporting More Torch Operators/Models to ONNX
  - Enhancing ONNX Export Infra
- Other Improvements
Bug Fixes
- TensorBoard Bug Fixes
- C++ API Bug fixes
- JIT
- Other Bug Fixes
Documentation Updates
- Distributed
- JIT
- Other documentation improvements

Breaking Changes

Type Promotion: Mixed dtype operations may return a different dtype and value than in previous versions. (22273,26981)

Previous versions of PyTorch supported a limited number of mixed dtype operations. These operations could result in loss of precision by, for example, truncating floating-point zero-dimensional tensors or Python numbers.

In Version 1.3, PyTorch supports NumPy-style type promotion (with slightly modified rules, see the full documentation). These rules generally retain precision and are less surprising to users.

Version 1.2:

    >>> torch.tensor(1) + 2.5
    tensor(3)
    >>> torch.tensor([1]) + torch.tensor(2.5)
    tensor([3])
    >>> torch.tensor(True) + 5
    tensor(True)

Version 1.3:

    >>> torch.tensor(1) + 2.5
    tensor(3.5000)
    >>> torch.tensor([1]) + torch.tensor(2.5)
    tensor([3.5000])
    >>> torch.tensor(True) + 5
    tensor(6)

Type Promotion: in-place operations whose result_type is a lower dtype category (bool < integer < floating-point) than the in-place operand now throw an Error. (22273,26981)

Version 1.2:

    >>> int_tensor = torch.tensor(1)
    >>> int_tensor.add_(1.5)
    tensor(2)
    >>> bool_tensor = torch.tensor(True)
    >>> bool_tensor.add_(5)
    tensor(True)

Version 1.3:

    >>> int_tensor = torch.tensor(1)
    >>> int_tensor.add_(1.5)
    RuntimeError: result type Float cannot be cast to the desired output type Long
    >>> bool_tensor = torch.tensor(True)
    >>> bool_tensor.add_(5)
    RuntimeError: result type Long cannot be cast to the desired output type Bool

These rules can be checked at runtime via `torch.can_cast`.
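As a minimal sketch of such a runtime check, the helper below decides whether an in-place add is allowed and otherwise falls back to an out-of-place add. The name `add_inplace_or_promote` is hypothetical and not part of PyTorch; only `torch.result_type` and `torch.can_cast` are library calls.

```python
import torch


def add_inplace_or_promote(tensor, other):
    """Hypothetical helper: add in place when the promoted dtype fits, else out of place."""
    result_dtype = torch.result_type(tensor, other)
    if torch.can_cast(result_dtype, tensor.dtype):
        return tensor.add_(other)  # allowed: the result fits the in-place operand's dtype
    return tensor + other          # promoted copy; `tensor` itself is left unchanged


int_tensor = torch.tensor(1)
same = add_inplace_or_promote(int_tensor, 2)        # integer + integer: in place
promoted = add_inplace_or_promote(int_tensor, 1.5)  # integer + float: new float tensor
print(same.dtype, promoted.dtype)
```

Under these assumptions, the first call mutates `int_tensor`, while the second returns a new floating-point tensor instead of raising the `RuntimeError` shown above.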

`torch.flatten`: 0-dimensional inputs now return a 1-dim tensor. (25406).

Version 1.2:

    >>> torch.flatten(torch.tensor(0))
    tensor(0)

Version 1.3:

    >>> torch.flatten(torch.tensor(0))
    tensor([0])
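Code that relied on the old zero-dimensional result may need an explicit reshape. The following is a small sketch, assuming PyTorch 1.3 or later, that inspects the dimensionality and recovers a zero-dimensional tensor when one is needed:

```python
import torch

scalar = torch.tensor(0)            # zero-dimensional input
flat = torch.flatten(scalar)        # one-dimensional result with a single element

print(scalar.dim(), flat.dim())     # number of dimensions before and after
back_to_scalar = flat.reshape(())   # explicit reshape back to zero dimensions
```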

`nn.functional.affine_grid`: when `align_corners = True`, changed the behavior of 2D affine transforms on 1D data and 3D affine transforms on 2D data (i.e., when one of the spatial dimensions has unit size).

Previously, all grid points along a unit dimension were considered arbitrarily to be at -1; now they are considered to be at 0 (the center of the input image).
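To illustrate the new behavior, here is a small sketch (assuming PyTorch 1.3 or later) that builds an identity grid for an output whose height has unit size. The tensor names are chosen for illustration only.

```python
import torch
import torch.nn.functional as F

# Identity 2D affine transform for a batch of one.
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])

# Output size (N, C, H, W) with H == 1: a 2D transform applied to 1D data.
grid = F.affine_grid(theta, (1, 1, 1, 3), align_corners=True)

x_coords = grid[0, 0, :, 0]  # coordinates along the width
y_coords = grid[0, 0, :, 1]  # coordinates along the unit-size height
print(x_coords, y_coords)
```

Along the unit-size dimension, every grid point sits at 0, while the width coordinates still span the range from -1 to 1.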

`torch.gels`: removed deprecated operator, use `torch.lstsq` instead. (26480).

`utils.data.DataLoader`: made a number of Iterator attributes private (e.g. `num_workers`, `pin_memory`). (22273)

[C++] `Variable::backward` will no longer implicitly create a gradient for non-1-element Variables. Previously, a gradient tensor of all 1s would be implicitly created. This behavior matches the Python API. (26150)

    auto x = torch::randn({5, 5}, torch::requires_grad());
    auto y = x * x;
    y.backward(); // ERROR: "grad can be implicitly created only for scalar outputs"
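For comparison, the Python behavior that the C++ API now matches can be sketched as follows. This is an illustrative example, not code taken from the PR: the implicit call fails for a non-scalar output, and passing an explicit gradient of ones reproduces the old implicit behavior.

```python
import torch

x = torch.randn(5, 5, requires_grad=True)
y = x * x

message = ""
try:
    y.backward()                    # non-scalar output and no gradient supplied
except RuntimeError as err:
    message = str(err)
print(message)

y.backward(torch.ones_like(y))      # explicit gradient of all ones
print(torch.allclose(x.grad, 2 * x.detach()))
```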

[C++] All option specifiers (e.g. `GRUOptions::bidirectional_`) are now private; use the function variants (`GRUOptions::bidirectional(...)`) instead. (26419).

Highlights

[Experimental]: Mobile Support

In PyTorch 1.3, we are launching experimental support for mobile. Now you can run any TorchScript model directly without any conversion. Here is the full list of features in this release:

Support for full TorchScript inference on mobile;
Prebuilt LibTorch libraries for Android/iOS on JCenter/CocoaPods;
Java wrapper for Android with functionality to cover common inference cases (loading and invoking the model);
Support for all forward ops on mobile CPU (backward ops are not supported yet);
Some optimized fp32 operator implementations for ARM CPUs (based on Caffe2Go);
Some optimized int8 operator implementations for ARM CPUs (based on QNNPACK);

We decided not to create a new framework for mobile so that you can use the same APIs you are already familiar with to run the same TorchScript models on Android/iOS devices without any format conversion. This way you can have the shortest path from research ideas to production-ready mobile apps.

The tutorials, demo apps and download links for prebuilt libraries can be found at: https://pytorch.org/mobile/

This is an experimental release. We are working on other features like customized builds to make PyTorch smaller, faster and better for your specific use cases. Stay tuned and give us your feedback!

[Experimental]: Named Tensor Support

Named Tensors aim to make tensors easier to use by allowing users to associate explicit names with tensor dimensions. In most cases, operations that take dimension parameters will accept dimension names, avoiding the need to track dimensions by position. In addition, named tensors use names to automatically check that APIs are being used correctly at runtime, providing extra safety. Names can also be used to rearrange dimensions, for example, to support "broadcasting by name" rather than "broadcasting by position".

Create a named tensor by passing a `names` argument to most tensor factory functions.

    >>> tensor = torch.zeros(2, 3, names=('C', 'N'))
    tensor([[0., 0., 0.],
            [0., 0., 0.]], names=('C', 'N'))

Named tensors propagate names across operations.

    >>> tensor.abs()
    tensor([[0., 0., 0.],
            [0., 0., 0.]], names=('C', 'N'))

Rearrange to a desired ordering by using `align_to`.

    >>> tensor = tensor.align_to('N', 'C', 'H', 'W')
    >>> tensor.names, tensor.shape
    (('N', 'C', 'H', 'W'), torch.Size([3, 2, 1, 1]))

And more! Please see our documentation on named tensors.
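As a further sketch of operations that take dimension names instead of positions, the snippet below reduces and permutes by name. It assumes the experimental named tensor API is available; the variable names are illustrative.

```python
import torch

imgs = torch.zeros(2, 3, names=('C', 'N'))

per_channel = imgs.sum('N')           # reduce over a dimension by name
reordered = imgs.align_to('N', 'C')   # permute dimensions by name

print(per_channel.names)   # names of the remaining dimensions
print(reordered.names, reordered.shape)
```

Because the reduction is addressed by name, the same call keeps working if the dimension order of `imgs` changes.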

[Experimental]: Quantization support

PyTorch now supports quantization from the ground up, starting with support for quantized tensors. Convert a float tensor to a quantized tensor and back by:

    x = torch.rand(10, 1, dtype=torch.float32)
    xq = torch.quantize_per_tensor(x, scale=0.5, zero_point=8, dtype=torch.quint8)
    # xq is a quantized tensor with data represented as quint8
    xdq = xq.dequantize()
    # convert back to floating point
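To make the affine mapping concrete, the following sketch uses inputs that are exact multiples of the scale, so the round trip is lossless. The stored integers follow `round(x / scale) + zero_point`; the input values are chosen for illustration.

```python
import torch

x = torch.tensor([0.0, 1.0, 2.0])
xq = torch.quantize_per_tensor(x, scale=0.5, zero_point=8, dtype=torch.quint8)

stored = xq.int_repr()       # underlying uint8 values: round(x / scale) + zero_point
restored = xq.dequantize()   # (stored - zero_point) * scale

print(stored, restored)
print(xq.q_scale(), xq.q_zero_point())
```

For inputs that are not multiples of the scale, the dequantized values differ from the originals by up to half a quantization step, provided no clamping occurs.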

We also support 8 bit quantized implementations of most common operators in CNNs, including:

Tensor operations:
- view, clone, resize, slice
- add, multiply, cat, mean, max, sort, topk
Modules/Functionals (in torch.nn.quantized)
- Conv2d
- Linear
- Avgpool2d, AdaptiveAvgpool2d, MaxPool2d, AdaptiveMaxPool2d
- Interpolate
- Upsample
Fused operations for preserving better accuracy (in torch.nn.intrinsic)
- ConvReLU2d, ConvBnReLU2d, ConvBn2d
- LinearReLU
- add_relu

We also support dynamic quantized operators, which take in floating point activations, but use quantized weights (in torch.nn.quantized.dynamic).

LSTM
Linear

Quantization also requires support for methods to collect statistics from tensors and calculate quantization parameters (implementing interface torch.quantization.Observer). We support several methods to do so:

MinMaxObserver
MovingAverageMinMaxObserver
PerChannelMinMaxObserver
MovingAveragePerChannelMinMaxObserver
HistogramObserver
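As a rough sketch of how an observer is used, the snippet below feeds one tensor through a `MinMaxObserver`, reads back the computed quantization parameters, and applies them. It assumes the default per-tensor affine scheme; the sample values are illustrative.

```python
import torch
from torch.quantization import MinMaxObserver

observer = MinMaxObserver(dtype=torch.quint8)
observer(torch.tensor([-1.0, 0.0, 3.0]))   # the forward pass records the running min/max

scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)

# Use the observed parameters to quantize new data.
xq = torch.quantize_per_tensor(torch.tensor([3.0]), float(scale), int(zero_point), torch.quint8)
```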

For quantization aware training, we support fake-quantization operators and modules to mimic quantization during training:

torch.fake_quantize_per_tensor_affine, torch.fake_quantize_per_channel_affine
torch.quantization.FakeQuantize
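A minimal sketch of the per-tensor fake-quantization operator is shown below: the values are rounded onto the quantization grid but remain in floating point, which is what lets training observe quantization error. The input values and parameters are chosen for illustration.

```python
import torch

x = torch.tensor([0.1, 0.26, 1.0])

# Arguments: input, scale, zero_point, quant_min, quant_max.
fq = torch.fake_quantize_per_tensor_affine(x, 0.5, 0, 0, 255)

print(fq, fq.dtype)   # still a float tensor, snapped to multiples of the scale
```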

In addition, we also support workflows in torch.quantization for:

post-training dynamic quantization
static post training quantization
quantization aware training
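The first of these workflows can be sketched as follows, assuming a build with a quantized CPU backend available. The toy model is hypothetical; `torch.quantization.quantize_dynamic` swaps the listed module types for dynamically quantized versions and leaves the original model untouched by default.

```python
import torch
import torch.nn as nn

# Hypothetical toy model used only for illustration.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()

# Replace nn.Linear modules with dynamically quantized versions (int8 weights).
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

out = quantized(torch.randn(3, 4))   # activations stay in floating point
print(quantized)
```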

All quantized operators are compatible with TorchScript.

For more details, see the documentation at: https://pytorch.org/docs/master/quantization.html

Type Promotion

Arithmetic and comparison operations may now perform mixed-type operations that promote to a common dtype.

The example below was not allowed in version 1.2. In version 1.3, the same code returns a tensor with `dtype=torch.float32`.

    >>> torch.tensor([1], dtype=torch.int) + torch.tensor([1], dtype=torch.float32)

See the full documentation for more details.

torch.result_type Provide function to determine result of mixed-type operations (26012).
torch.can_cast Expose casting rules for type promotion (26805).
torch.promote_types Expose promotion logic (26655).
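The three helpers listed above can be exercised together in a short sketch (variable names are illustrative):

```python
import torch

ints = torch.tensor([1], dtype=torch.int)
floats = torch.tensor([1], dtype=torch.float32)

result = torch.result_type(ints, floats)                    # dtype the sum will have
promoted = torch.promote_types(torch.int32, torch.float32)  # same logic, on dtypes only
castable = torch.can_cast(torch.float32, torch.int32)       # float to int is not allowed

print(result, promoted, castable)
print((ints + floats).dtype)
```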

Deprecations

`nn.functional.affine_grid` / `nn.functional.grid_sample`: using the default value of `align_corners` is now deprecated, because it will change in the 1.4 release.

The `align_corners` parameter was added in this release; the behavior in the previous release was equivalent to setting the parameter to `True`. This is also the current default value, but it will change to `False` in the 1.4 release. Note that using the default will trigger a warning, as demonstrated below; set the value explicitly to remove the warning.

    >>> torch.nn.functional.affine_grid(torch.randn(1, 2, 3),
                                        (1, 3, 2, 2))
    UserWarning: Default grid_sample and affine_grid behavior will be changed
    to align_corners=False from 1.4.0. See the documentation of grid_sample for details.
    ...
    >>> torch.nn.functional.affine_grid(torch.randn(1, 2, 3),
                                        (1, 3, 2, 2),
                                        align_corners=True)
    # NO WARNING!
    ...
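When the two functions are used together, one reasonable pattern is to pass the same explicit `align_corners` value to both so that the grid and the sampler agree. The sketch below is illustrative only and checks that an identity transform reproduces its input.

```python
import torch
import torch.nn.functional as F

image = torch.arange(16.0).reshape(1, 1, 4, 4)
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]])

# Pass the same explicit value to both calls so that they use matching conventions.
grid = F.affine_grid(identity, image.shape, align_corners=True)
resampled = F.grid_sample(image, grid, align_corners=True)

print(torch.allclose(resampled, image, atol=1e-4))
```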

[C++] Deprecate `torch::Tensor::data<T>()` in favor of `torch::Tensor::data_ptr<T>()` (24847,24886).

New Features

TensorBoard: 3D Mesh and Hyperparameter Support

torch.utils.tensorboard supports 3D mesh and points plus hyperparameter logging. More details can be found in the documentation for `SummaryWriter` with `add_mesh` and `add_hparams`.

A simple example exercising both methods:

    import torch
    from torch.utils.tensorboard import SummaryWriter

    vertices_tensor = torch.as_tensor([
        [1, 1, 1],
        [-1, -1, 1],
        [1, -1, -1],
        [-1, 1, -1],
    ], dtype=torch.float).unsqueeze(0)
    colors_tensor = torch.as_tensor([
        [255, 0, 0],
        [0, 255, 0],
        [0, 0, 255],
        [255, 0, 255],
    ], dtype=torch.int).unsqueeze(0)
    faces_tensor = torch.as_tensor([
        [0, 2, 3],
        [0, 3, 1],
        [0, 1, 2],
        [1, 3, 2],
    ], dtype=torch.int).unsqueeze(0)

    with SummaryWriter() as w:
        w.add_mesh('my_mesh', vertices=vertices_tensor, colors=colors_tensor, faces=faces_tensor)
        for i in range(5):
            w.add_hparams({'lr': 0.1*i, 'bsize': i},
                          {'hparam/accuracy': 10*i, 'hparam/loss': 10*i})

Distributed

This release adds macOS support for `torch.distributed` with the Gloo backend. You can more easily switch from development (e.g. on macOS) to deployment (e.g. on Linux) without having to change a single line of code. The prebuilt binaries for macOS (stable and nightly) include support out of the box.

torch.distributed.all_reduce_coalesced Support allreduce of a list of same-device tensors (24949,25470,24876)
torch.distributed.all_reduce Add bitwise reduction ops (BAND, BOR, BXOR) (26824)

Libtorch Binaries with C++11 ABI

We now provide Libtorch binaries for building applications compatible with the C++11 ABI. The download links for libtorch binaries with C++11 ABI can be found at https://pytorch.org/ under “QUICK START LOCALLY”.

New TorchScript features

Add `not in` support for TorchScript (23637).
You can now raise exceptions in one side of an if branch (23565).
Add `torch.jit.is_scripting()` API (25955).
Make assertions like `x is not None` unwrap the optional type of `x` (23949).
Add dictionary augmented assignment (`+=`) support to TorchScript (23639).
Support `grad` and `data` attribute for tensor in TorchScript (23842).
Add `@ignore` for TorchScript classes (23614).
Support nn.GRU in script (23266).
Support tensor as a key type in TorchScript (23638).
Add support for ModuleDict (25715).
Bind `set_grad_enabled()` into TorchScript (25350).
Add `in` membership checks for lists (25796).
Add `tuple` keyword (25474).
Add `__getitem__` to class types (25664).
Add `__setitem__` to class types (25750).
Make JIT dicts ordered, matching Python 3.6+ semantics (26465).
Added invert bitwise operation to TorchScript (22324).
Add `min()` and `max()` for lists to TorchScript (26351).
Support iterables and ranges in list comprehensions (26768).
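Several of the features listed above can be combined in one scripted function. The sketch below is illustrative (the function `summarize` is hypothetical) and assumes it is run from a source file, since TorchScript needs access to the function's source code.

```python
import torch
from typing import Dict, List, Optional


@torch.jit.script
def summarize(values: List[int], bonus: Optional[int]) -> Dict[str, int]:
    stats: Dict[str, int] = {"total": 0}
    for v in values:
        stats["total"] += v              # dictionary augmented assignment
    if bonus is not None:                # refines Optional[int] to int
        stats["total"] += bonus
    if 0 not in values:                  # `not in` membership check on a list
        stats["no_zero"] = 1
    squares = [i * i for i in range(len(values))]  # comprehension over a range
    stats["n_squares"] = len(squares)
    return stats


print(summarize([1, 2, 3], 4))
print(summarize([0, 5], None))
```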

Improvements

C++ Frontend Improvements

We are on our way to better API parity between our Python and C++ frontends. Specifically, we made the following improvements:

Autograd

Tensor autograd APIs
- torch::Tensor::data Added (26008).
- torch::Tensor::grad Don’t create a gradient for non-1-element Variables [BC-breaking] (26150).
- torch::Tensor::is_leaf Added (26186).
- torch::Tensor::output_nr Added (26216).
- torch::Tensor::_version Added (26217).
Add support for custom autograd functions in C++ API
- For example usage, please see the PR description and test cases in (23572, 23628, and 23803)
torch::autograd::backward and torch::autograd::grad (24342)
torch::autograd::Variable::register_hook (24393).

New torch::nn modules

Containers
- torch::nn::ModuleList (24317).
Linear layers
- torch::nn::Identity (26713).
Convolution layers
- torch::nn::Fold (24160).
Pooling layers
- torch::nn::MaxPool1d / MaxPool2d / MaxPool3d (24860,26521).
- torch::nn::AvgPool1d / AvgPool2d / AvgPool3d (25800).
- torch::nn::AdaptiveMaxPool1d / AdaptiveMaxPool2d / AdaptiveMaxPool3d (26755,26772,26775).
Loss functions
- torch::nn::L1Loss (25902).
Distance functions
- torch::nn::CosineSimilarity (26424)
- torch::nn::PairwiseDistance (26424)

New torch::nn::functional functions

Pooling functions
- torch::nn::functional::max_pool1d / max_pool2d / max_pool3d (26262).
- torch::nn::functional::max_pool1d_with_indices / max_pool2d_with_indices / max_pool3d_with_indices (26521).
- torch::nn::functional::avg_pool1d / avg_pool2d / avg_pool3d (26262).
- torch::nn::functional::adaptive_max_pool1d / adaptive_max_pool2d / adaptive_max_pool3d (26755,26772,26775).
- torch::nn::functional::adaptive_max_pool1d_with_indices / adaptive_max_pool2d_with_indices / adaptive_max_pool3d_with_indices (26755,26772,26775).
Distance functions
- torch::nn::functional::cosine_similarity (26424).
- torch::nn::functional::pairwise_distance (26424).

tensor Construction API

Add support for multidimensional inputs to `torch::tensor` (26210,26890,26756).
- From now on, we can use `torch::tensor({{1, 2}, {3, 4}})` in C++ to construct the same tensor as `torch.tensor([[1, 2], [3, 4]])` in Python. Some caveats are noted in this comment.
Add support for bool and BFloat16 dtypes to `torch::tensor` (23337).

Other C++ Improvements

Add `torch::nn::Module::unregister_module` function, for unregistering a submodule from a `torch::nn::Module` (26088).

Distributed Improvements

torch.distributed Detect and handle NCCL errors appropriately instead of blocking peers until timeout in `ProcessGroupNCCL` (25012,25905)
torch.distributed Make scatter/gather arguments optional (25575)
torch.distributed.launch Add a -m flag to allow users to launch python modules (24910).
torch.distributed Add function to get NCCL version for logging (26583).
torch.distributed Add timeout parameter to connect function in TCPStore (26554).
torch.distributed Use timeout in connect function to guard against an infinite loop (26364).
torch.nn.modules.batchnorm Allow SyncBatchNorm to run without DDP in inference mode (24815)

Performance Improvements

torch.argmax/argmin Rewrite as TensorIterator reductions (26181).
torch.erfinv Vectorize unary operator (26629).
torch.sin/cos/tan Use intrinsics for trigonometric functions on CPU (26431).
Fix possible deadlock in SharedCache inside a forked child proc (25158).
torch.qr Fix a regression (23591).
nn.Conv Use Caffe2's implementation of grouped depthwise 3x3 convolutions (26556).
nn.Conv Use parallel_for in DepthwiseConvKernel (26879).
nn.Conv Change shape for conv and unary ops (25477).
Fix pin_memory_thread not exiting quickly (23646).
Increase predefined_minimum_secs to reduce variation (23734).
Enhance Tensor indexSelect performance (23055).
Separate input shapes to reduce default execution time (24136).
constraints.lower_cholesky Vectorize LowerCholeskyTransform (24131).
Speed up an integer to the power of a positive integer on CPU (26020).
[ROCm] Enable jit fusion (22872).
[ROCm] Use MIOpen for transpose convolutions (26172).

JIT Improvements

Enable CPU fused kernel on Windows (25578).
Expose an API to iterate all the registered operators (23207).
Include recursive class compilations in error call stack (23454).
Substantial improvements to saved model format speed and size.
- Compress debug symbols when serializing TorchScript models. (23659).
- Compress all non-Tensor components of a serialized TorchScript model. (23723).
- Perform string uniquing by value in pickle serialization. (23741).
- Implement a bunch of pickle serialization features that optimize for size. (23759).
- Implement more size-oriented opcodes in the depickler. (26454).
Cache node operators to speed up optimization (24827).
Allow forward hooks in tracing (23613).
Add Pickler C++ API (23241).
Open up AliasAnalysisKind for any ops (23810).
Add the ability to compile exports on traced modules (24298).
Make `NoneType` a subtype of `Optional[T]` (25361).

ONNX Exporter Improvements

In PyTorch 1.3, we have added support for exporting graphs with ONNX IR v4 semantics, and set it as default. We have achieved good initial coverage for ONNX Opset 11, which was released recently with ONNX 1.6. Further enhancement to Opset 11 coverage will follow in the next release. We have enabled export for about 20 new PyTorch operators. Also, we have focused on enabling the export for all models in torchvision. We have introduced some necessary groundwork for that in this release, e.g., accepting PyTorch models with inputs/outputs of Dict or String. We continue to work on torchvision models, such as FasterRCNN and MaskRCNN, to enable their export.

Adding Support for ONNX IR v4

Provide an option to exclude the weights from model inputs (#23284)
Make graph inputs without weights as default (#26146)

Adding Support for ONNX Opset 11

Introduce ONNX Opset 11 support (#23739)
Add export for torch.Interpolate in Opset 11 (#24805,#27179)
Add export for tensor.gather, tensor.scatter and tensor.scatter_add in Opset 11 (#24790)
Add export for tensor.clamp in Opset 11 (#25797)
Add export for torch.topk and torch.sort in Opset 11 (#25739)

Exporting More Torch Operators/Models to ONNX

Export torch.pixel_shuffle (#23739)
Export torch.multinomial (#23581)
Export torch.norm’s frobenius_norm (#23536)
Export torch.std (#22310)
Export torch.empty and torch.empty_like (#24166)
Export torch.rsqrt (#24153)
Export torch.log1p (#25808)
Export torch.unique (#25050)
Export torch.gelu (#24475)
Export tensor.index_fill and tensor.index_copy (#23052)
Export torch.round (#26126)
Export torch.baddbmm (#25738)
Export torch.remainder (#24410)
Export torch.cumsum (#24476)
Export tensor.size with negative axis (#26436)
Export RNN/LSTM with h0/c0 initial state (#22813)

Enhancing ONNX Export Infra

Enable exporting PyTorch models which have Dict and String as inputs and outputs (#25889)
Systematically solving mismatched types caused by implicit type conversion for binary arithmetic operators by adding an ONNX type conversions pass. (#24378)
Correctly validate dynamic axes names. (#23974)
Enable ONNX Runtime tests for Opset 10 and partially for Opset 11 (#22993)

Other Improvements

Error checking: many operators now check the strides of the output tensor and raise an error if it contains inner overlaps that would produce incorrect results (23063).
torch.det/logdet/slogdet Allowing batching (22909).
torch.logical_not Add new operator (23839).
torch.logical_xor Add new operator (23847).
torch.symeig Improve the stability of gradient updates (23018).
torch.eye Enable for bool and half (24148).
torch.tril / triu Enable for bool and half (24163).
torch.logical_not/xor support non-bool tensors. (23916,23978).
torch.index_select Implement indexing methods for sparse tensors (24937).
torch.lu_solve Enable broadcasting of batch dimensions (24333).
torch.cholesky Enable batches greater than 262140 (24438).
torch.det Simplify generation of singular matrices to avoid numerical issue on PowerPC (25773).
torch.erfinv In the CUDA implementation, use erfinv() for double to preserve accuracy (25337).
torch.erfinv Add a float version of erfinv on CPU (26070).
torch.cuda.stream Updates autograd engine to respect streams set in forward (8354).
torch.backends.mkldnn.enabled Allow disabling MKLDNN at runtime (25459).
torch.cholesky_solve Add derivative (26185).
torch.cholesky_inverse Add derivative (26451).
torch.polygamma Ensure that n is non-negative (26294).
torch.pinverse Enable batching (26095).
torch.digamma/trigamma Fix type mismatches on CUDA (25791).
torch.where Enable for bool tensor on CUDA (26430).
torch.load default encoding change to 'utf-8' (26421).
torch.repeat_interleave Respect the current stream (26946).
torch.bernoulli_ Implement for bool tensors (25076).
torch.norm Fix nuclear norm with requires_grad=True (26303).
torch.hub.download_url_to_file Make function public (26723).
nn.modules.conv add padding_mode to repr (23996).
nn.Transformer Extend to support BERT (gelu) (24181).
nn.BatchNorm2d Add support for non-affine batch norm with float stats and half inputs (22750).
nn.Parameter Fix type hints (25586).
nn.CTCLoss Improve error message (26325).
nn.Conv Allow batch size of 0 (26214).
nn.LSTM/GRU enable double backward for non-cudnn (26660).
optim.Adagrad Add epsilon argument (24980).
optim.LBFGS Change default tolerance_grad to 1e-7 (25240).
optim.lr_scheduler.OneCycleLR Add new 1cycle learning rate scheduler (25324).
optimizer.step Fix type annotation (26930).
bfloat16 Add support for sub, mul, and div on CPU (22851).
bfloat16 Enabled comparison ops on CPU (24182).
bfloat16 Enabled masked methods (24183).
bfloat16 Enabled torch.mm and torch.mv (24224).
bfloat16 Enable log_softmax and CrossEntropyLoss (24457).
bfloat16 Enabled conv methods (26167).
bfloat16 Enabled dtype on CUDA (26407).
quasirandom.SobolEngine Use random seed if not specified (24884).
utils.data.dataloader Add possible out of shared memory error message (25730).
cuda.set_rng_state Add type hint (26200).
Zero sized tensor support for repeat_interleave (23717).
Recommend `~` and `bitwise_not()` when user tries to apply neg (`-`) on a bool tensor. (23621).
Fix double backward of inplace op on view (23502).
autograd.grad Validate shapes of outputs (25349).
Enable libflame as a LAPACK choice (25795).
Fix race condition in CUDA initialization (25788).
Include `iteration_` in SGD optimizer serialization (26906).
[C++] `torch::tensor` Fix an ambiguous overload issue in constructor (26890).
[XLA] Check device before accessing data_ptr in PackLayer (26056).
[XLA] Allow overwriting catch-all kernels (25947).
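As an illustration of the new `optim.lr_scheduler.OneCycleLR` scheduler listed above, the following sketch records the learning rate over a ten-step cycle. The parameter, optimizer settings and step count are hypothetical, and the loop omits the loss and backward pass that a real training loop would include.

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.1, total_steps=10)

lrs = []
for _ in range(10):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()    # a real loop would compute a loss and call backward() first
    scheduler.step()

print(lrs)   # rises to max_lr early in the cycle, then anneals towards a small value
```

With the default settings, the schedule starts below `max_lr`, peaks at `max_lr` after the warm-up portion, and ends well below the starting rate.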

Bug Fixes

TensorBoard Bug Fixes

SummaryWriter.add_graph: Fix empty graph output in some cases (25599).
Update Caffe2 contrib TensorBoard logging to not require TensorFlow (25259).
SummaryWriter.make_video: Fix write_gif call to moviepy for newer lib (21218).

C++ API Bug fixes

Fixes mismatch of device and data type when computing `step_size` in LBFGS optimizer (25909).

JIT

Fix list comprehension that changes the type of the original iterable (24271).
Fix double copying of constants during recursive scripting (24412).
Fix frontend error message (23576).
Clear recursive error stack on each compilation (23458).
Fix bugs in assignment to optionals (25059).
Make `torch.jit.Attribute` work when `PYTORCH_ENABLED=0` (23851).
Fix unicode in comments causing compilation errors (24218).
Correctly raise an error if an `nn.Module` has not been initialized but you try to script it (24852).
Fix annotated assignment to variables (25094).
dictPop: dereference dict.find() iterator before calling dict.erase() (25056).
Fix closures which always throw (25278).
Add source location to class instantiation error (24990).
Fix `AliasAnalysisKind::PURE` on MSVC (25375).
Emit script function calls during tracing (25089).
Resolve `NamedTuple` types properly in Python (26443).
Fix schema matching of tuples to vartype lists (25944).
Correctly preserve ignored function return value type (25262).
Fix missing newline in compiled from source range highlight (25802).
Fix use-after-free bug in `optional` (25965).
Fix torch.arange traced as constant (25363).
Preserve module names in recursive script (24505).
Properly resolve ignored module method type annotations (26683).
Make `is_optional` check more robust (26312).
Fix builtin lookup for Python functions (26688).
Typevar matching fix + implicit conversions from Scalar to int/float (26453).
Fix range for non-int inputs and pow implementation (26926).

Other Bug Fixes

torch.is_pinned pin_memory should not copy on already pinned tensors (23484).
torch.cdist Fix incorrect gradients on CUDA non-batch tensors (22915).
torch.from_numpy Fix failure on windows for int32 (25139).
torch.tensor Fix memory leak creating a tensor from numpy (24267).
torch.index Don't save `self` in `index` backward (25594).
torch.bincount Fix int32 overflow on CUDA (25748).
torch.bernoulli Fix the distribution sampler (26864).
torch.pow Fix precision (25476).
torch.cdist Fix gradient computation when first arg is 1xn (26254).
torch.scatter_add_ Fix scatter CPU kernel when (input size, src size) > index size (25839).
nn.ConvTranspose2d Fixed an error with float16 inputs and weights on CUDA. (23552).
nn.CTCLoss Fix zero-length targets on CUDA (23298).
nn.Conv2d Correct an overflow in an error message (25146).
optim.Adam apply a small mathematical fix. (23737).
dataloader Fix IndexError on shutdown if not all workers are started (23761).
Tensor.repeat Fix crash for 0 repeats (23766).
torch.pin_memory only use one thread (25111).
distributions.Uniform, HalfCauchy, Gamma Fix `log_prob` when value is a float (23017).
Fix typing error for Padding with asymmetric signatures (24895).
Avoid race condition in `intrusive_ptr.reset_()` (24464).
torch.hub: Fix SSL cert issue for hub in Python 2 (25042).
Fix int overflow issue in CUDA kernels. (24818).
Module.cuda Fix type hints (25018).
Fix bug in assertNotEqual for int tensors (25412).
Fix 'in' returning true incorrectly (24156).
Fix bugs in bulk loader when `batch_size=None` or with namedtuple (26065).
Fix serialization issue in big endian arch (26383).
Fix `Vec256::abs()` for floating point when applied on -0.0 (26422).
Fix cyclic reference in _LRScheduler (25776).
Fix a build failure on s390x (26233).
[XLA] Fix tensor construction from array (24283).

Documentation Updates

Distributed

torch.distributed Error phrasing in torch.distributed helper functions (25574)
torch.distributions.negative_binomial clarified ambiguous doc string in NegativeBinomial (25923)

JIT

Add technical documentation for the serialization format (23456).
Fix trace docs (24191).
Add `trace_module` to docs (24258).
Cleanup distinction around `script` and `trace` (24208).
Fix `item()` call in docs (25404).
Misc doc updates / fixes (24371,24445).
