BackwardCFunction
- class torch.autograd.function.BackwardCFunction
This class is used for internal autograd work. Do not use.
- mark_dirty(*args)
Mark given tensors as modified in an in-place operation.

This should be called at most once, in either the setup_context() or forward() methods, and all arguments should be inputs.

Every tensor that's been modified in-place in a call to forward() should be given to this function, to ensure correctness of our checks. It doesn't matter whether the function is called before or after modification.

Example:
>>> import torch
>>> from torch.autograd import Function
>>> from torch.autograd.function import once_differentiable
>>>
>>> class Inplace(Function):
>>>     @staticmethod
>>>     def forward(ctx, x):
>>>         x_npy = x.numpy()  # x_npy shares storage with x
>>>         x_npy += 1
>>>         ctx.mark_dirty(x)
>>>         return x
>>>
>>>     @staticmethod
>>>     @once_differentiable
>>>     def backward(ctx, grad_output):
>>>         return grad_output
>>>
>>> a = torch.tensor(1., requires_grad=True, dtype=torch.double).clone()
>>> b = a * a
>>> Inplace.apply(a)  # This would lead to wrong gradients!
>>>                   # but the engine would not know unless we mark_dirty
>>> b.backward()  # RuntimeError: one of the variables needed for gradient
>>>               # computation has been modified by an inplace operation
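As a complementary illustration, the following is a minimal sketch of a case where the in-place modification is safe and gradients flow normally. The class name `AddOneInplace` and the variable names are hypothetical and not part of the PyTorch API. The sketch assumes the input is a non-leaf tensor, because leaf tensors that require grad cannot be modified in place.

```python
import torch
from torch.autograd import Function


class AddOneInplace(Function):
    """Hypothetical example: adds 1 to its input in place."""

    @staticmethod
    def forward(ctx, x):
        x.add_(1)          # modify the input in place
        ctx.mark_dirty(x)  # tell autograd that x was modified
        return x

    @staticmethod
    def backward(ctx, grad_output):
        # d(x + 1)/dx = 1, so the gradient passes through unchanged
        return grad_output


leaf = torch.tensor([1.0, 2.0], requires_grad=True)
a = leaf.clone()  # non-leaf tensor, so in-place modification is allowed
out = AddOneInplace.apply(a)
out.sum().backward()
```

Here no earlier operation saved `a` for its backward pass, so the modification does not invalidate any saved value, and `leaf.grad` receives a gradient of ones.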
- mark_non_differentiable(*args)
Mark outputs as non-differentiable.

This should be called at most once, in either the setup_context() or forward() methods, and all arguments should be tensor outputs.

This will mark outputs as not requiring gradients, increasing the efficiency of backward computation. You still need to accept a gradient for each output in backward(), but it's always going to be a zero tensor with the same shape as the shape of a corresponding output.

This is used e.g. for indices returned from a sort. See example:
>>> import torch
>>> from torch.autograd import Function
>>> from torch.autograd.function import once_differentiable
>>>
>>> class Func(Function):
>>>     @staticmethod
>>>     def forward(ctx, x):
>>>         sorted, idx = x.sort()
>>>         ctx.mark_non_differentiable(idx)
>>>         ctx.save_for_backward(x, idx)
>>>         return sorted, idx
>>>
>>>     @staticmethod
>>>     @once_differentiable
>>>     def backward(ctx, g1, g2):  # still need to accept g2
>>>         x, idx = ctx.saved_tensors
>>>         grad_input = torch.zeros_like(x)
>>>         grad_input.index_add_(0, idx, g1)
>>>         return grad_input
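The effect on the outputs can be observed directly. The sketch below uses a hypothetical `MaxWithIndex` function (not part of PyTorch) that returns the maximum of a 1-D tensor together with its index. It assumes a 1-D floating-point input.

```python
import torch
from torch.autograd import Function


class MaxWithIndex(Function):
    """Hypothetical example: maximum of a 1-D tensor and its index."""

    @staticmethod
    def forward(ctx, x):
        value, idx = x.max(dim=0)
        ctx.mark_non_differentiable(idx)  # the integer index carries no gradient
        ctx.save_for_backward(idx)
        ctx.shape = x.shape               # non-tensor data is stored on ctx
        return value, idx

    @staticmethod
    def backward(ctx, grad_value, grad_idx):
        # grad_idx must still be accepted, even though idx is non-differentiable
        (idx,) = ctx.saved_tensors
        grad_input = grad_value.new_zeros(ctx.shape)
        grad_input[idx] = grad_value      # only the max element gets gradient
        return grad_input


x = torch.tensor([1.0, 5.0, 3.0], requires_grad=True)
value, idx = MaxWithIndex.apply(x)
value.backward()
```

After the call, `value.requires_grad` is `True` while `idx.requires_grad` is `False`, and `x.grad` is nonzero only at the position of the maximum.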
- save_for_backward(*tensors)
Save given tensors for a future call to backward().

save_for_backward should be called at most once, in either the setup_context() or forward() methods, and only with tensors.

All tensors intended to be used in the backward pass should be saved with save_for_backward (as opposed to directly on ctx) to prevent incorrect gradients and memory leaks, and enable the application of saved tensor hooks. See torch.autograd.graph.saved_tensors_hooks. See Extending torch.autograd for more details.

Note that if intermediary tensors, tensors that are neither inputs nor outputs of forward(), are saved for backward, your custom Function may not support double backward. Custom Functions that do not support double backward should decorate their backward() method with @once_differentiable so that performing double backward raises an error. If you'd like to support double backward, you can either recompute intermediaries based on the inputs during backward or return the intermediaries as the outputs of the custom Function. See the double backward tutorial for more details.

In backward(), saved tensors can be accessed through the saved_tensors attribute. Before returning them to the user, a check is made to ensure they weren't used in any in-place operation that modified their content.

Arguments can also be None. This is a no-op.

See Extending torch.autograd for more details on how to use this method.
Example:
>>> import torch
>>> from torch.autograd import Function
>>> from torch.autograd.function import once_differentiable
>>>
>>> class Func(Function):
>>>     @staticmethod
>>>     def forward(ctx, x: torch.Tensor, y: torch.Tensor, z: int):
>>>         w = x * z
>>>         out = x * y + y * z + w * y
>>>         ctx.save_for_backward(x, y, w, out)
>>>         ctx.z = z  # z is not a tensor
>>>         return out
>>>
>>>     @staticmethod
>>>     @once_differentiable
>>>     def backward(ctx, grad_out):
>>>         x, y, w, out = ctx.saved_tensors
>>>         z = ctx.z
>>>         gx = grad_out * (y + y * z)
>>>         gy = grad_out * (x + z + w)
>>>         gz = None
>>>         return gx, gy, gz
>>>
>>> a = torch.tensor(1., requires_grad=True, dtype=torch.double)
>>> b = torch.tensor(2., requires_grad=True, dtype=torch.double)
>>> c = 4
>>> d = Func.apply(a, b, c)
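The note on double backward can be illustrated with a function that saves only its input and therefore omits @once_differentiable. The following is a minimal sketch. The `Cube` class is a hypothetical example, and the numerical checks use `torch.autograd.gradcheck` and `torch.autograd.gradgradcheck` with double-precision inputs.

```python
import torch
from torch.autograd import Function, gradcheck, gradgradcheck


class Cube(Function):
    """Hypothetical example: y = x ** 3 with double-backward support."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # only an input is saved, no intermediaries
        return x ** 3

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Built from differentiable ops on a saved input, so autograd can
        # differentiate this expression again for double backward.
        return grad_out * 3 * x ** 2


x = torch.randn(4, dtype=torch.double, requires_grad=True)
first_order_ok = gradcheck(Cube.apply, (x,))
second_order_ok = gradgradcheck(Cube.apply, (x,))
```

Because the backward formula depends only on the saved input and on `grad_out`, both checks are expected to pass. A function that saved an intermediary instead would need one of the two workarounds described above.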
- save_for_forward(*tensors)
Save given tensors for a future call to jvp().

save_for_forward should be called at most once, in either the setup_context() or forward() methods, and all arguments should be tensors.

In jvp(), saved objects can be accessed through the saved_tensors attribute.

Arguments can also be None. This is a no-op.

See Extending torch.autograd for more details on how to use this method.
Example:
>>> import torch
>>> import torch.autograd.forward_ad as fwAD
>>>
>>> class Func(torch.autograd.Function):
>>>     @staticmethod
>>>     def forward(ctx, x: torch.Tensor, y: torch.Tensor, z: int):
>>>         ctx.save_for_backward(x, y)
>>>         ctx.save_for_forward(x, y)
>>>         ctx.z = z
>>>         return x * y * z
>>>
>>>     @staticmethod
>>>     def jvp(ctx, x_t, y_t, _):
>>>         x, y = ctx.saved_tensors
>>>         z = ctx.z
>>>         return z * (y * x_t + x * y_t)
>>>
>>>     @staticmethod
>>>     def vjp(ctx, grad_out):
>>>         x, y = ctx.saved_tensors
>>>         z = ctx.z
>>>         return z * grad_out * y, z * grad_out * x, None
>>>
>>> a = torch.tensor(1., requires_grad=True, dtype=torch.double)
>>> t = torch.tensor(1., dtype=torch.double)
>>> b = torch.tensor(2., requires_grad=True, dtype=torch.double)
>>> c = 4
>>>
>>> with fwAD.dual_level():
>>>     a_dual = fwAD.make_dual(a, t)
>>>     d = Func.apply(a_dual, b, c)
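To see the tangent that jvp() produces, the output can be unpacked with `fwAD.unpack_dual`. The sketch below is a hedged variation on the example above: the class name `ScaledMul` and the chosen values are illustrative assumptions, and both tensor inputs are made dual so that both tangents are explicitly defined.

```python
import torch
import torch.autograd.forward_ad as fwAD
from torch.autograd import Function


class ScaledMul(Function):
    """Hypothetical example: out = x * y * z, with z a plain Python int."""

    @staticmethod
    def forward(ctx, x, y, z):
        ctx.save_for_forward(x, y)  # made available to jvp()
        ctx.z = z
        return x * y * z

    @staticmethod
    def jvp(ctx, x_t, y_t, _):
        # The third tangent corresponds to the non-tensor argument z
        x, y = ctx.saved_tensors
        return ctx.z * (y * x_t + x * y_t)


x = torch.tensor(3.0, dtype=torch.double)
y = torch.tensor(2.0, dtype=torch.double)
x_tangent = torch.tensor(1.0, dtype=torch.double)
y_tangent = torch.tensor(0.5, dtype=torch.double)

with fwAD.dual_level():
    x_dual = fwAD.make_dual(x, x_tangent)
    y_dual = fwAD.make_dual(y, y_tangent)
    out_dual = ScaledMul.apply(x_dual, y_dual, 4)
    primal, tangent = fwAD.unpack_dual(out_dual)
    # Read the values while the dual level is still active
    primal_value = primal.item()
    tangent_value = tangent.item()
```

With these inputs the primal is `3 * 2 * 4` and the tangent follows the product rule, `4 * (2 * 1.0 + 3 * 0.5)`.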
- set_materialize_grads(value)
Set whether to materialize grad tensors. Default is True.

This should be called only from either the setup_context() or forward() methods.

If True, undefined grad tensors will be expanded to tensors full of zeros prior to calling the backward() and jvp() methods.

Example:
>>> import torch
>>> from torch.autograd import Function
>>> from torch.autograd.function import once_differentiable
>>>
>>> class SimpleFunc(Function):
>>>     @staticmethod
>>>     def forward(ctx, x):
>>>         return x.clone(), x.clone()
>>>
>>>     @staticmethod
>>>     @once_differentiable
>>>     def backward(ctx, g1, g2):
>>>         return g1 + g2  # No check for None necessary
>>>
>>> # We modify SimpleFunc to handle non-materialized grad outputs
>>> class Func(Function):
>>>     @staticmethod
>>>     def forward(ctx, x):
>>>         ctx.set_materialize_grads(False)
>>>         ctx.save_for_backward(x)
>>>         return x.clone(), x.clone()
>>>
>>>     @staticmethod
>>>     @once_differentiable
>>>     def backward(ctx, g1, g2):
>>>         x, = ctx.saved_tensors
>>>         grad_input = torch.zeros_like(x)
>>>         if g1 is not None:  # We must check for None now
>>>             grad_input += g1
>>>         if g2 is not None:
>>>             grad_input += g2
>>>         return grad_input
>>>
>>> a = torch.tensor(1., requires_grad=True)
>>> b, _ = Func.apply(a)  # induces g2 to be undefined
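The undefined gradient can be made visible by recording what backward() receives. The following is a small sketch under that assumption. The `TwoClones` class and the `seen` list are hypothetical names introduced only for illustration.

```python
import torch
from torch.autograd import Function

seen = []  # records, per backward call, whether each incoming grad was None


class TwoClones(Function):
    """Hypothetical example: returns two copies of the input."""

    @staticmethod
    def forward(ctx, x):
        ctx.set_materialize_grads(False)  # keep undefined grads as None
        return x.clone(), x.clone()

    @staticmethod
    def backward(ctx, g1, g2):
        seen.append((g1 is None, g2 is None))
        grad = None
        for g in (g1, g2):
            if g is not None:  # skip outputs that received no gradient
                grad = g if grad is None else grad + g
        return grad


a = torch.tensor(1.0, requires_grad=True)
first, _unused = TwoClones.apply(a)
first.backward()  # only the first output takes part in the backward pass
```

Only `first` contributes to the backward pass, so `g2` arrives as `None` rather than as a zero tensor, and no zero tensor is allocated for the unused output.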