Optim

Optimizer
class lightning_asr.optim.optimizer.Optimizer(optim, scheduler=None, scheduler_period=None, max_grad_norm=0)

    This is a wrapper class for torch.optim.Optimizer. It adds learning rate scheduling and gradient norm clipping on top of the wrapped optimizer.
    Parameters
        optim (torch.optim.Optimizer) – optimizer object; the parameters to be optimized should be given when the object is instantiated, e.g. torch.optim.Adam, torch.optim.SGD
        scheduler (kospeech.optim.lr_scheduler, optional) – learning rate scheduler
        scheduler_period (int, optional) – number of timesteps during which the learning rate scheduler is applied
        max_grad_norm (int, optional) – value used for gradient norm clipping
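    A minimal construction sketch, for illustration only: the placeholder model, the wrapped torch.optim.Adam instance (one of the optimizers named above), and the max_grad_norm value are assumptions, and only the constructor arguments documented above are used.

        import torch
        from lightning_asr.optim.optimizer import Optimizer

        # Placeholder model; any torch.nn.Module would do for this sketch.
        model = torch.nn.Linear(80, 10)

        # The wrapped optimizer receives the parameters to be optimized when it
        # is instantiated, exactly as when it is used on its own.
        adam = torch.optim.Adam(model.parameters(), lr=1e-3)

        # Wrap it to get learning rate scheduling and gradient norm clipping.
        # scheduler/scheduler_period are left unset here; max_grad_norm=400 is
        # only an example clipping value, not a recommended default.
        optimizer = Optimizer(adam, scheduler=None, scheduler_period=None, max_grad_norm=400)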
AdamP
class lightning_asr.optim.adamp.AdamP(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, delta=0.1, wd_ratio=0.1, nesterov=False)

    Paper: “AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights”
    Copied from https://github.com/clovaai/AdamP/
    Copyright (c) 2020 Naver Corp. MIT License.
    step(closure=None)

        Performs a single optimization step (parameter update).
        Parameters
            closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
        Note
            Unless otherwise specified, this function should not modify the .grad field of the parameters.
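    Because AdamP follows the standard torch.optim.Optimizer interface (including the step() method above), it can be used as a drop-in replacement in an ordinary training step. The model, batch, and non-default weight_decay value in the sketch below are placeholders chosen for illustration:

        import torch
        from lightning_asr.optim.adamp import AdamP

        model = torch.nn.Linear(80, 10)      # placeholder model
        criterion = torch.nn.CrossEntropyLoss()

        optimizer = AdamP(
            model.parameters(),
            lr=0.001,
            betas=(0.9, 0.999),
            weight_decay=1e-2,               # wd_ratio only takes effect when weight decay is non-zero
            delta=0.1,
            wd_ratio=0.1,
        )

        inputs = torch.randn(4, 80)          # placeholder batch
        targets = torch.randint(0, 10, (4,))

        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()                     # single optimization step, as documented above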
RAdam
class lightning_asr.optim.radam.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, degenerated_to_sgd=True)

    Paper: “On the Variance of the Adaptive Learning Rate and Beyond”
    Refer to https://github.com/LiyuanLucasLiu/RAdam
    Copyright (c) LiyuanLucasLiu. Apache 2.0 License.
    step(closure=None)

        Performs a single optimization step (parameter update).
        Parameters
            closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
        Note
            Unless otherwise specified, this function should not modify the .grad field of the parameters.
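    RAdam exposes the same step(closure=None) interface, so a closure can be passed to reevaluate the model and return the loss, as described above. The model and batch in this sketch are placeholders; the constructor arguments echo the defaults in the signature:

        import torch
        from lightning_asr.optim.radam import RAdam

        model = torch.nn.Linear(80, 10)      # placeholder model
        criterion = torch.nn.CrossEntropyLoss()
        optimizer = RAdam(model.parameters(), lr=0.001, betas=(0.9, 0.999),
                          weight_decay=0, degenerated_to_sgd=True)

        inputs = torch.randn(4, 80)          # placeholder batch
        targets = torch.randint(0, 10, (4,))

        def closure():
            # Reevaluates the model and returns the loss (see the closure parameter above).
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            return loss

        loss = optimizer.step(closure)       # returns the loss computed by the closure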