ReduceLROnPlateau#
- class torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)[source]#
Reduce learning rate when a metric has stopped improving.
Models often benefit from reducing the learning rate by a factor of 2-10 once learning stagnates. This scheduler reads a metrics quantity and if no improvement is seen for a ‘patience’ number of epochs, the learning rate is reduced.
- Parameters
optimizer (Optimizer) – Wrapped optimizer.
mode (str) – One of min, max. In min mode, lr will be reduced when the quantity monitored has stopped decreasing; in max mode it will be reduced when the quantity monitored has stopped increasing. Default: ‘min’.
factor (float) – Factor by which the learning rate will be reduced. new_lr = lr * factor. Default: 0.1.
patience (int) – The number of allowed epochs with no improvement after which the learning rate will be reduced. For example, consider the case of having no patience (patience = 0). In the first epoch, a baseline is established and is always considered good as there’s no previous baseline. In the second epoch, if the performance is worse than the baseline, we have what is considered an intolerable epoch. Since the count of intolerable epochs (1) is greater than the patience level (0), the learning rate is reduced at the end of this epoch. From the third epoch onwards, the learning rate continues to be reduced at the end of each epoch if the performance is worse than the baseline. If the performance improves or remains the same, the learning rate is not adjusted. Default: 10.
threshold (float) – Threshold for measuring the new optimum, to only focus on significant changes. Default: 1e-4.
threshold_mode (str) – One of rel, abs. In rel mode, dynamic_threshold = best * ( 1 + threshold ) in ‘max’ mode or best * ( 1 - threshold ) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. Default: ‘rel’.
cooldown (int) – Number of epochs to wait before resuming normal operation after lr has been reduced. Default: 0.
min_lr (float or list) – A scalar or a list of scalars. A lower bound on the learning rate of all param groups or each group respectively. Default: 0.
eps (float) – Minimal decay applied to lr. If the difference between new and old lr is smaller than eps, the update is ignored. Default: 1e-8.
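The threshold / threshold_mode rules above can be sketched as a small pure-Python helper. This is an illustration of the formulas listed for threshold_mode, not the scheduler's internal implementation; the function name is chosen here for clarity.

```python
def dynamic_threshold(best, threshold, threshold_mode, mode):
    """Value the monitored metric must beat to count as an improvement.

    Implements the rules stated for threshold_mode:
      rel: best * (1 + threshold) in 'max' mode, best * (1 - threshold) in 'min' mode
      abs: best + threshold in 'max' mode, best - threshold in 'min' mode
    """
    if threshold_mode == 'rel':
        return best * (1 + threshold) if mode == 'max' else best * (1 - threshold)
    # threshold_mode == 'abs'
    return best + threshold if mode == 'max' else best - threshold
```

For example, with a best validation loss of 1.0, threshold=1e-4, and the defaults mode='min', threshold_mode='rel', a new loss only counts as an improvement if it falls below 0.9999.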
Example
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> scheduler = ReduceLROnPlateau(optimizer, "min")
>>> for epoch in range(10):
>>>     train(...)
>>>     val_loss = validate(...)
>>>     # Note that step should be called after validate()
>>>     scheduler.step(val_loss)
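The patience behavior described under the patience parameter can be illustrated with a minimal pure-Python stand-in. This is a simplified sketch ('min' mode, threshold and cooldown ignored), not the scheduler's actual source; the function and variable names are illustrative.

```python
def simulate_plateau(losses, patience=0, factor=0.1, lr=0.1):
    """Sketch of ReduceLROnPlateau's counting logic in 'min' mode.

    Tracks the best loss seen so far; once the run of epochs without
    improvement exceeds `patience`, multiplies lr by `factor` and
    resets the counter.
    """
    best = float('inf')
    bad_epochs = 0
    for loss in losses:
        if loss < best:        # improvement: update baseline, reset counter
            best = loss
            bad_epochs = 0
        else:                  # intolerable epoch
            bad_epochs += 1
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
    return lr
```

With patience=0, the loss sequence [1.0, 0.5, 0.6] reduces lr once: the second epoch improves on the baseline, the third does not, so the count of intolerable epochs (1) exceeds patience (0) and lr drops from 0.1 to 0.01, matching the worked example in the parameter description.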
