Unfold#
- class torch.nn.modules.fold.Unfold(kernel_size, dilation=1, padding=0, stride=1)
Extracts sliding local blocks from a batched input tensor.
Consider a batched `input` tensor of shape $(N, C, *)$, where $N$ is the batch dimension, $C$ is the channel dimension, and $*$ represents arbitrary spatial dimensions. This operation flattens each sliding `kernel_size`-sized block within the spatial dimensions of `input` into a column (i.e., last dimension) of a 3-D `output` tensor of shape $(N, C \times \prod(\text{kernel\_size}), L)$, where $C \times \prod(\text{kernel\_size})$ is the total number of values within each block (a block has $\prod(\text{kernel\_size})$ spatial locations, each containing a $C$-channeled vector), and $L$ is the total number of such blocks:

$$L = \prod_d \left\lfloor \frac{\text{spatial\_size}[d] + 2 \times \text{padding}[d] - \text{dilation}[d] \times (\text{kernel\_size}[d] - 1) - 1}{\text{stride}[d]} + 1 \right\rfloor,$$

where $\text{spatial\_size}$ is formed by the spatial dimensions of `input` ($*$ above), and $d$ ranges over all spatial dimensions.

Therefore, indexing `output` at the last dimension (column dimension) gives all values within a certain block.

The `padding`, `stride` and `dilation` arguments specify how the sliding blocks are retrieved:

- `stride` controls the stride for the sliding blocks.
- `padding` controls the amount of implicit zero-padding on both sides for `padding` number of points for each dimension before reshaping.
- `dilation` controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but this link has a nice visualization of what `dilation` does.
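As a concrete sketch (the input shape and parameter values below are chosen only for illustration), the formula for $L$ can be evaluated directly and checked against the shape that `nn.Unfold` actually produces:

```python
import math

import torch
import torch.nn as nn

# Illustrative 4-D input: batch N=2, channels C=3, spatial size 10 x 12.
x = torch.randn(2, 3, 10, 12)

kernel_size = (4, 5)
dilation = (1, 1)
padding = (1, 2)
stride = (2, 3)

# Number of blocks per spatial dimension d, following
# floor((spatial_size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
blocks = [
    (size + 2 * p - d * (k - 1) - 1) // s + 1
    for size, k, d, p, s in zip(x.shape[2:], kernel_size, dilation, padding, stride)
]
L = math.prod(blocks)  # total number of sliding blocks

unfold = nn.Unfold(kernel_size, dilation=dilation, padding=padding, stride=stride)
out = unfold(x)

# Output shape is (N, C * prod(kernel_size), L) = (2, 3 * 20, 5 * 4)
print(out.shape)  # torch.Size([2, 60, 20])
```

Each of the `L` columns of `out` holds one flattened $C \times 4 \times 5$ block.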
- Parameters
kernel_size (int or tuple) – the size of the sliding blocks
dilation (int or tuple, optional) – a parameter that controls the stride of elements within the neighborhood. Default: 1
padding (int or tuple, optional) – implicit zero padding to be added on both sides of input. Default: 0
stride (int or tuple, optional) – the stride of the sliding blocks in the input spatial dimensions. Default: 1
If `kernel_size`, `dilation`, `padding` or `stride` is an int or a tuple of length 1, their values will be replicated across all spatial dimensions.

For the case of two input spatial dimensions this operation is sometimes called `im2col`.
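This replication can be seen directly: for a 4-D input (two spatial dimensions), passing an int is equivalent to passing a matching pair. The shapes below are illustrative only:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 2, 6, 6)

# An int is replicated across both spatial dimensions,
# so kernel_size=3, stride=2 behaves like kernel_size=(3, 3), stride=(2, 2).
unfold_int = nn.Unfold(kernel_size=3, stride=2)
unfold_tuple = nn.Unfold(kernel_size=(3, 3), stride=(2, 2))

out_int = unfold_int(x)
out_tuple = unfold_tuple(x)
print(torch.equal(out_int, out_tuple))  # True
```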
Note
`Fold` calculates each combined value in the resulting large tensor by summing all values from all containing blocks. `Unfold` extracts the values in the local blocks by copying from the large tensor. So, if the blocks overlap, they are not inverses of each other.

In general, folding and unfolding operations are related as follows. Consider `Fold` and `Unfold` instances created with the same parameters:

>>> fold_params = dict(kernel_size=..., dilation=..., padding=..., stride=...)
>>> fold = nn.Fold(output_size=..., **fold_params)
>>> unfold = nn.Unfold(**fold_params)

Then for any (supported) `input` tensor the following equality holds:

fold(unfold(input)) == divisor * input

where `divisor` is a tensor that depends only on the shape and dtype of the `input`:

>>> input_ones = torch.ones(input.shape, dtype=input.dtype)
>>> divisor = fold(unfold(input_ones))

When the `divisor` tensor contains no zero elements, then `fold` and `unfold` operations are inverses of each other (up to constant divisor).

Warning
Currently, only 4-D input tensors (batched image-like tensors) are supported.
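The divisor relation from the note above can be checked numerically. The parameters below are a minimal sketch with deliberately overlapping blocks (stride smaller than the kernel):

```python
import torch
import torch.nn as nn

# Fold/Unfold pair created with the same parameters; stride=1 with a
# 2x2 kernel makes the blocks overlap, so fold(unfold(x)) != x in general.
fold_params = dict(kernel_size=(2, 2), dilation=1, padding=0, stride=1)
fold = nn.Fold(output_size=(4, 4), **fold_params)
unfold = nn.Unfold(**fold_params)

x = torch.randn(1, 3, 4, 4)  # 4-D input, as currently required

# divisor counts how many blocks each input location is copied into
input_ones = torch.ones(x.shape, dtype=x.dtype)
divisor = fold(unfold(input_ones))

# fold sums the overlapping copies, hence the elementwise divisor factor
recon = fold(unfold(x))
print(torch.allclose(recon, divisor * x))  # True
```

Here `divisor` has no zero entries, so `recon / divisor` recovers `x` exactly.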
- Shape:
Input: $(N, C, *)$
Output: $(N, C \times \prod(\text{kernel\_size}), L)$ as described above
Examples:
>>> unfold = nn.Unfold(kernel_size=(2, 3))
>>> input = torch.randn(2, 5, 3, 4)
>>> output = unfold(input)
>>> # each patch contains 30 values (2x3=6 vectors, each of 5 channels)
>>> # 4 blocks (2x3 kernels) in total in the 3x4 input
>>> output.size()
torch.Size([2, 30, 4])

>>> # Convolution is equivalent with Unfold + Matrix Multiplication + Fold (or view to output shape)
>>> inp = torch.randn(1, 3, 10, 12)
>>> w = torch.randn(2, 3, 4, 5)
>>> inp_unf = torch.nn.functional.unfold(inp, (4, 5))
>>> out_unf = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)
>>> out = torch.nn.functional.fold(out_unf, (7, 8), (1, 1))
>>> # or equivalently (and avoiding a copy),
>>> # out = out_unf.view(1, 2, 7, 8)
>>> (torch.nn.functional.conv2d(inp, w) - out).abs().max()
tensor(1.9073e-06)