SWALR
- class torch.optim.swa_utils.SWALR(optimizer, swa_lr, anneal_epochs=10, anneal_strategy='cos', last_epoch=-1)
Anneals the learning rate in each parameter group to a fixed value.
This learning rate scheduler is meant to be used with the Stochastic Weight Averaging (SWA) method (see torch.optim.swa_utils.AveragedModel).
- Parameters
optimizer (torch.optim.Optimizer) – wrapped optimizer
swa_lr (float or list) – the target learning rate, given either as one value for all param groups or as a list with one value per group.
anneal_epochs (int) – number of epochs in the annealing phase (default: 10)
anneal_strategy (str) – “cos” or “linear”; specifies the annealing strategy: “cos” for cosine annealing, “linear” for linear annealing (default: “cos”)
last_epoch (int) – the index of the last epoch (default: -1)
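The parameters above can be illustrated with a minimal sketch. The two parameter groups, their starting learning rates, and the target values are hypothetical and chosen only for illustration. With these settings, each group's learning rate should move linearly toward its own target over five scheduler steps and then stay there.

```python
import torch

# Two hypothetical parameter groups with different starting learning rates.
w1 = torch.nn.Parameter(torch.zeros(2))
w2 = torch.nn.Parameter(torch.zeros(2))
optimizer = torch.optim.SGD(
    [{"params": [w1], "lr": 0.5}, {"params": [w2], "lr": 0.1}]
)

# One target learning rate per group; a single float would apply to both.
swa_scheduler = torch.optim.swa_utils.SWALR(
    optimizer, swa_lr=[0.05, 0.01], anneal_epochs=5, anneal_strategy="linear"
)

# Record the learning rates before any step and after each scheduler step.
history = [[group["lr"] for group in optimizer.param_groups]]
for _ in range(8):
    optimizer.step()      # no gradients were computed, so parameters stay unchanged
    swa_scheduler.step()  # one scheduler step per (simulated) epoch
    history.append([group["lr"] for group in optimizer.param_groups])

for epoch, lrs in enumerate(history):
    print(epoch, [round(lr, 4) for lr in lrs])
```

Passing anneal_strategy="cos" instead should give the same end points with a cosine-shaped transition between them.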
The SWALR scheduler can be used together with other schedulers to switch to a constant learning rate late in the training, as in the example below.
Example
>>> loader, optimizer, model = ...
>>> lr_lambda = lambda epoch: 0.9
>>> scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer,
>>>         lr_lambda=lr_lambda)
>>> swa_scheduler = torch.optim.swa_utils.SWALR(optimizer,
>>>         anneal_strategy="linear", anneal_epochs=20, swa_lr=0.05)
>>> swa_start = 160
>>> for i in range(300):
>>>     for input, target in loader:
>>>         optimizer.zero_grad()
>>>         loss_fn(model(input), target).backward()
>>>         optimizer.step()
>>>     if i > swa_start:
>>>         swa_scheduler.step()
>>>     else:
>>>         scheduler.step()
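The example above leaves the loader, optimizer, model, and loss function unspecified. A self-contained variant might look like the sketch below. The toy data, the linear model, the MSE loss, the starting learning rate of 0.1, and the shortened epoch counts are all assumptions made so the script runs quickly; they are not part of the original example.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the elided loader, model, and loss function.
inputs = torch.randn(32, 4)
targets = torch.randn(32, 1)
loader = [(inputs[i:i + 8], targets[i:i + 8]) for i in range(0, 32, 8)]
model = torch.nn.Linear(4, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Regular schedule: multiply the learning rate by 0.9 after every epoch.
scheduler = torch.optim.lr_scheduler.MultiplicativeLR(
    optimizer, lr_lambda=lambda epoch: 0.9
)
# SWA schedule: anneal linearly to 0.05 over 5 epochs, then hold.
swa_scheduler = torch.optim.swa_utils.SWALR(
    optimizer, anneal_strategy="linear", anneal_epochs=5, swa_lr=0.05
)

swa_start = 20  # shortened from the original example for a quick run
for epoch in range(30):
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    # Only one of the two schedulers is stepped in any given epoch.
    if epoch > swa_start:
        swa_scheduler.step()
    else:
        scheduler.step()

final_lr = optimizer.param_groups[0]["lr"]
print(final_lr)
```

Note that the annealing starts from whatever learning rate the first scheduler left behind, so in this sketch the learning rate rises toward swa_lr instead of decaying to it.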
- load_state_dict(state_dict)
Load the scheduler’s state.
- Parameters
state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().
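A checkpoint round trip could be sketched as follows. The helper make_scheduler and the single-parameter setup are hypothetical. The sketch also saves and restores the optimizer state, on the assumption that the current learning rate is kept in the optimizer's param groups and so has to be restored along with the scheduler.

```python
import torch


def make_scheduler():
    # Hypothetical single-parameter setup used only for illustration.
    param = torch.nn.Parameter(torch.zeros(1))
    optimizer = torch.optim.SGD([param], lr=0.1)
    scheduler = torch.optim.swa_utils.SWALR(
        optimizer, swa_lr=0.05, anneal_epochs=4, anneal_strategy="linear"
    )
    return optimizer, scheduler


# Run the original scheduler halfway through its annealing phase.
optimizer, scheduler = make_scheduler()
for _ in range(2):
    optimizer.step()
    scheduler.step()

# Capture both states in memory, as a checkpoint would.
checkpoint = {
    "optimizer": optimizer.state_dict(),
    "scheduler": scheduler.state_dict(),
}

# Build fresh objects and restore them from the checkpoint.
resumed_optimizer, resumed_scheduler = make_scheduler()
resumed_optimizer.load_state_dict(checkpoint["optimizer"])
resumed_scheduler.load_state_dict(checkpoint["scheduler"])

# Continue both runs side by side and compare their learning rates.
original_lrs, resumed_lrs = [], []
for _ in range(2):
    optimizer.step()
    scheduler.step()
    resumed_optimizer.step()
    resumed_scheduler.step()
    original_lrs.append(optimizer.param_groups[0]["lr"])
    resumed_lrs.append(resumed_optimizer.param_groups[0]["lr"])

print(original_lrs, resumed_lrs)
```

If the restore works as intended, the resumed run should follow the same learning rates as the uninterrupted one.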