Unfold#
- class torch.nn.modules.fold.Unfold(kernel_size, dilation=1, padding=0, stride=1)
Extracts sliding local blocks from a batched input tensor.
Consider a batched `input` tensor of shape $(N, C, *)$, where $N$ is the batch dimension, $C$ is the channel dimension, and $*$ represents arbitrary spatial dimensions. This operation flattens each sliding `kernel_size`-sized block within the spatial dimensions of `input` into a column (i.e., last dimension) of a 3-D `output` tensor of shape $(N, C \times \prod(\text{kernel\_size}), L)$, where $C \times \prod(\text{kernel\_size})$ is the total number of values within each block (a block has $\prod(\text{kernel\_size})$ spatial locations, each containing a $C$-channeled vector), and $L$ is the total number of such blocks:

$$L = \prod_d \left\lfloor \frac{\text{spatial\_size}[d] + 2 \times \text{padding}[d] - \text{dilation}[d] \times (\text{kernel\_size}[d] - 1) - 1}{\text{stride}[d]} + 1 \right\rfloor,$$

where $\text{spatial\_size}$ is formed by the spatial dimensions of `input` ($*$ above), and $d$ ranges over all spatial dimensions.

Therefore, indexing `output` at the last dimension (column dimension) gives all values within a certain block.

The `padding`, `stride` and `dilation` arguments specify how the sliding blocks are retrieved:

- `stride` controls the stride for the sliding blocks.
- `padding` controls the amount of implicit zero-padding on both sides for `padding` number of points for each dimension before reshaping.
- `dilation` controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but this link has a nice visualization of what `dilation` does.
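As a concrete sketch (the input shape and parameter values below are chosen only for illustration), the formula for $L$ can be evaluated directly and checked against the shape that `nn.Unfold` actually produces:

```python
import math

import torch
import torch.nn as nn

# Illustrative 4-D input: batch N=2, channels C=3, spatial size 10 x 12.
x = torch.randn(2, 3, 10, 12)

kernel_size = (4, 5)
dilation = (1, 1)
padding = (1, 2)
stride = (2, 3)

# Number of blocks per spatial dimension d, following
# floor((spatial_size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
blocks = [
    (size + 2 * p - d * (k - 1) - 1) // s + 1
    for size, k, d, p, s in zip(x.shape[2:], kernel_size, dilation, padding, stride)
]
L = math.prod(blocks)  # total number of sliding blocks

unfold = nn.Unfold(kernel_size, dilation=dilation, padding=padding, stride=stride)
out = unfold(x)

# Output shape is (N, C * prod(kernel_size), L) = (2, 3 * 20, 5 * 4)
print(out.shape)  # torch.Size([2, 60, 20])
```

Each of the `L` columns of `out` holds one flattened $C \times 4 \times 5$ block.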
- Parameters
kernel_size (int or tuple) – the size of the sliding blocks
dilation (int or tuple, optional) – a parameter that controls the stride of elements within the neighborhood. Default: 1
padding (int or tuple, optional) – implicit zero padding to be added on both sides of input. Default: 0
stride (int or tuple, optional) – the stride of the sliding blocks in the input spatial dimensions. Default: 1
If `kernel_size`, `dilation`, `padding` or `stride` is an int or a tuple of length 1, their values will be replicated across all spatial dimensions.

For the case of two input spatial dimensions this operation is sometimes called `im2col`.
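This replication can be seen directly: for a 4-D input (two spatial dimensions), passing an int is equivalent to passing a matching pair. The shapes below are illustrative only:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 2, 6, 6)

# An int is replicated across both spatial dimensions,
# so kernel_size=3, stride=2 behaves like kernel_size=(3, 3), stride=(2, 2).
unfold_int = nn.Unfold(kernel_size=3, stride=2)
unfold_tuple = nn.Unfold(kernel_size=(3, 3), stride=(2, 2))

out_int = unfold_int(x)
out_tuple = unfold_tuple(x)
print(torch.equal(out_int, out_tuple))  # True
```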
Note
`Fold` calculates each combined value in the resulting large tensor by summing all values from all containing blocks. `Unfold` extracts the values in the local blocks by copying from the large tensor. So, if the blocks overlap, they are not inverses of each other.

In general, folding and unfolding operations are related as follows. Consider `Fold` and `Unfold` instances created with the same parameters:

>>> fold_params = dict(kernel_size=..., dilation=..., padding=..., stride=...)
>>> fold = nn.Fold(output_size=..., **fold_params)
>>> unfold = nn.Unfold(**fold_params)

Then for any (supported) `input` tensor the following equality holds:

fold(unfold(input)) == divisor * input

where `divisor` is a tensor that depends only on the shape and dtype of the `input`:

>>> input_ones = torch.ones(input.shape, dtype=input.dtype)
>>> divisor = fold(unfold(input_ones))

When the `divisor` tensor contains no zero elements, then `fold` and `unfold` operations are inverses of each other (up to constant divisor).

Warning
Currently, only 4-D input tensors (batched image-like tensors) are supported.
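The divisor relation from the note above can be checked numerically. The parameters below are a minimal sketch with deliberately overlapping blocks (stride smaller than the kernel):

```python
import torch
import torch.nn as nn

# Fold/Unfold pair created with the same parameters; stride=1 with a
# 2x2 kernel makes the blocks overlap, so fold(unfold(x)) != x in general.
fold_params = dict(kernel_size=(2, 2), dilation=1, padding=0, stride=1)
fold = nn.Fold(output_size=(4, 4), **fold_params)
unfold = nn.Unfold(**fold_params)

x = torch.randn(1, 3, 4, 4)  # 4-D input, as currently required

# divisor counts how many blocks each input location is copied into
input_ones = torch.ones(x.shape, dtype=x.dtype)
divisor = fold(unfold(input_ones))

# fold sums the overlapping copies, hence the elementwise divisor factor
recon = fold(unfold(x))
print(torch.allclose(recon, divisor * x))  # True
```

Here `divisor` has no zero entries, so `recon / divisor` recovers `x` exactly.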
- Shape:
Input: $(N, C, *)$
Output: $(N, C \times \prod(\text{kernel\_size}), L)$ as described above
Examples:
>>> unfold = nn.Unfold(kernel_size=(2, 3))
>>> input = torch.randn(2, 5, 3, 4)
>>> output = unfold(input)
>>> # each patch contains 30 values (2x3=6 vectors, each of 5 channels)
>>> # 4 blocks (2x3 kernels) in total in the 3x4 input
>>> output.size()
torch.Size([2, 30, 4])

>>> # Convolution is equivalent with Unfold + Matrix Multiplication + Fold (or view to output shape)
>>> inp = torch.randn(1, 3, 10, 12)
>>> w = torch.randn(2, 3, 4, 5)
>>> inp_unf = torch.nn.functional.unfold(inp, (4, 5))
>>> out_unf = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)
>>> out = torch.nn.functional.fold(out_unf, (7, 8), (1, 1))
>>> # or equivalently (and avoiding a copy),
>>> # out = out_unf.view(1, 2, 7, 8)
>>> (torch.nn.functional.conv2d(inp, w) - out).abs().max()
tensor(1.9073e-06)