Optim

Optimizer

class lightning_asr.optim.optimizer.Optimizer(optim, scheduler=None, scheduler_period=None, max_grad_norm=0)[source]

This is a wrapper class around torch.optim.Optimizer. It provides learning rate scheduling and gradient norm clipping.

Parameters
  • optim (torch.optim.Optimizer) – optimizer object; the parameters to be optimized should be given when instantiating it, e.g. torch.optim.Adam, torch.optim.SGD

  • scheduler (kospeech.optim.lr_scheduler, optional) – learning rate scheduler

  • scheduler_period (int, optional) – number of timesteps for which the learning rate scheduler is applied

  • max_grad_norm (int, optional) – maximum gradient norm used for gradient clipping
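
A minimal usage sketch of the wrapper. The constructor arguments follow the signature above; the step()/zero_grad() calls at the end are an assumption that the wrapper mirrors the kospeech optimizer wrapper referenced above, so the exact interface may differ.

    import torch
    import torch.nn as nn

    from lightning_asr.optim.optimizer import Optimizer

    model = nn.Linear(80, 10)  # illustrative model, not part of the library
    base_optim = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Wrap the torch optimizer; the scheduler is optional and gradients are clipped to max_grad_norm.
    optimizer = Optimizer(base_optim, scheduler=None, scheduler_period=None, max_grad_norm=5)

    inputs = torch.randn(8, 80)
    targets = torch.randint(0, 10, (8,))

    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()

    # Assumed interface: clip gradients, advance the schedule if set, then step the wrapped optimizer.
    optimizer.step(model)
    optimizer.zero_grad()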

AdamP

class lightning_asr.optim.adamp.AdamP(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, delta=0.1, wd_ratio=0.1, nesterov=False)[source]

Paper: “AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights”. Copied from https://github.com/clovaai/AdamP/ (Copyright (c) 2020 Naver Corp., MIT License).

step(closure=None)[source]

Performs a single optimization step (parameter update).

Parameters

closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.
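
A short usage sketch; the model and data below are illustrative, and AdamP is used through the standard torch.optim.Optimizer training-loop interface (zero_grad / backward / step) that its step(closure=None) signature above reflects.

    import torch
    import torch.nn as nn

    from lightning_asr.optim.adamp import AdamP

    model = nn.Linear(80, 10)  # illustrative model
    optimizer = AdamP(model.parameters(), lr=1e-3, betas=(0.9, 0.999),
                      weight_decay=1e-2, delta=0.1, wd_ratio=0.1, nesterov=False)

    inputs = torch.randn(8, 80)
    targets = torch.randint(0, 10, (8,))

    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()  # performs a single optimization step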

RAdam

class lightning_asr.optim.radam.RAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, degenerated_to_sgd=True)[source]

Paper: “On the Variance of the Adaptive Learning Rate and Beyond”. Refer to https://github.com/LiyuanLucasLiu/RAdam (Copyright (c) LiyuanLucasLiu, Apache 2.0 License).

step(closure=None)[source]

Performs a single optimization step (parameter update).

Parameters

closure (callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.
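
A sketch of step() with the optional closure argument; the closure re-evaluates the model and returns the loss, as described above (the model and data are illustrative).

    import torch
    import torch.nn as nn

    from lightning_asr.optim.radam import RAdam

    model = nn.Linear(80, 10)  # illustrative model
    optimizer = RAdam(model.parameters(), lr=1e-3, weight_decay=0, degenerated_to_sgd=True)

    inputs = torch.randn(8, 80)
    targets = torch.randint(0, 10, (8,))

    def closure():
        # Re-evaluate the model and return the loss, as step(closure) expects.
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(inputs), targets)
        loss.backward()
        return loss

    loss = optimizer.step(closure)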