
SWALR

class torch.optim.swa_utils.SWALR(optimizer, swa_lr, anneal_epochs=10, anneal_strategy='cos', last_epoch=-1)[source]

Anneals the learning rate in each parameter group to a fixed value.

This learning rate scheduler is meant to be used with the Stochastic Weight Averaging (SWA) method (see torch.optim.swa_utils.AveragedModel).

Parameters
  • optimizer (torch.optim.Optimizer) – wrapped optimizer

  • swa_lr (float or list) – the learning rate value for all param groups together, or separately for each group.

  • anneal_epochs (int) – number of epochs in the annealing phase (default: 10)

  • anneal_strategy (str) – “cos” or “linear”; specifies the annealing strategy: “cos” for cosine annealing, “linear” for linear annealing (default: “cos”)

  • last_epoch (int) – the index of the last epoch (default: -1)

The SWALR scheduler can be used together with other schedulers to switch to a constant learning rate late in training, as in the example below.

Example

>>> loader, optimizer, model = ...
>>> lr_lambda = lambda epoch: 0.9
>>> scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer,
>>>     lr_lambda=lr_lambda)
>>> swa_scheduler = torch.optim.swa_utils.SWALR(optimizer,
>>>     anneal_strategy="linear", anneal_epochs=20, swa_lr=0.05)
>>> swa_start = 160
>>> for i in range(300):
>>>     for input, target in loader:
>>>         optimizer.zero_grad()
>>>         loss_fn(model(input), target).backward()
>>>         optimizer.step()
>>>     if i > swa_start:
>>>         swa_scheduler.step()
>>>     else:
>>>         scheduler.step()
get_last_lr()[source]

Return last computed learning rate by current scheduler.

Return type

list[float]
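As a minimal sketch of how get_last_lr() behaves during annealing (the toy model, optimizer, and hyperparameter values below are illustrative assumptions, not part of the API): with anneal_strategy="linear", each call to step() moves the learning rate from the optimizer's initial value toward swa_lr, and after anneal_epochs steps it stays fixed at swa_lr.

```python
import torch

# Illustrative setup: any model and optimizer would do.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# Anneal linearly from the optimizer's lr (0.5) down to swa_lr=0.05
# over 5 epochs, then hold it constant.
swa_scheduler = torch.optim.swa_utils.SWALR(
    optimizer, swa_lr=0.05, anneal_epochs=5, anneal_strategy="linear"
)

for epoch in range(7):
    optimizer.step()
    swa_scheduler.step()
    # get_last_lr() returns one entry per param group.
    print(epoch, swa_scheduler.get_last_lr())
```

After the annealing phase ends, get_last_lr() keeps returning [0.05] on every subsequent step.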

get_lr()[source]

Get learning rate.

load_state_dict(state_dict)[source]

Load the scheduler’s state.

Parameters

state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().

state_dict()[source]

Return the state of the scheduler as a dict.

It contains an entry for every variable in self.__dict__ which is not the optimizer.

Return type

dict[str,Any]
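A short sketch of the state_dict()/load_state_dict() round trip, for example when resuming training from a checkpoint (the model, optimizer, and step counts below are illustrative assumptions): the returned dict captures the scheduler's progress, so a freshly constructed SWALR can pick up exactly where the old one left off.

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
swa_scheduler = torch.optim.swa_utils.SWALR(optimizer, swa_lr=0.05)

# Advance a few epochs, then capture the scheduler state.
for _ in range(3):
    optimizer.step()
    swa_scheduler.step()
state = swa_scheduler.state_dict()

# A fresh scheduler resumes from the saved state; the dict holds every
# attribute of the scheduler except the wrapped optimizer itself.
new_scheduler = torch.optim.swa_utils.SWALR(optimizer, swa_lr=0.05)
new_scheduler.load_state_dict(state)
assert new_scheduler.last_epoch == swa_scheduler.last_epoch
```

In a real checkpoint you would typically pass this dict through torch.save alongside the model and optimizer states.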

step(epoch=None)[source]

Perform a step.