LARS

class stable_ssl.optimizers.LARS(params, lr=1.0, momentum=0, eta=0.001, dampening=0, weight_decay=0, nesterov=False, epsilon=0)

Bases: Optimizer

Implements the LARS (Layer-wise Adaptive Rate Scaling) optimizer.

Parameters:
  • params (iterable) – Iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – Learning rate.

  • momentum (float, optional) – Momentum factor. Default is 0.

  • eta (float, optional) – LARS trust coefficient (η in the LARS paper), which scales the layer-wise learning rate. Default is 1e-3.

  • dampening (float, optional) – Dampening for momentum. Default is 0.

  • weight_decay (float, optional) – Weight decay (L2 penalty). Default is 0.

  • nesterov (bool, optional) – Enables Nesterov momentum. Default is False.

  • epsilon (float, optional) – Epsilon to prevent division by zero. Default is 0.
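
To make the roles of eta, weight_decay, and epsilon more concrete, here is a minimal pure-Python sketch of the layer-wise scaling described in the LARS paper, applied to one layer stored as a flat list. It is not the stable_ssl implementation: the helper names (l2_norm, lars_local_lr, lars_step) are hypothetical, the placement of epsilon in the denominator and the fallback to 1.0 for zero norms are assumptions, and dampening and nesterov are left out for brevity.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_local_lr(weights, grads, eta=1e-3, weight_decay=0.0, epsilon=0.0):
    """Layer-wise factor: eta * ||w|| / (||g|| + weight_decay * ||w|| + epsilon).

    Assumptions (not confirmed by the stable_ssl docs): epsilon is added to
    the denominator, and the factor falls back to 1.0 when either norm is zero.
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    if w_norm == 0.0 or g_norm == 0.0:
        return 1.0
    return eta * w_norm / (g_norm + weight_decay * w_norm + epsilon)


def lars_step(weights, grads, velocity, lr=1.0, momentum=0.0, eta=1e-3,
              weight_decay=0.0, epsilon=0.0):
    """One momentum update for a single layer.

    Returns (new_weights, new_velocity); dampening and Nesterov are omitted.
    """
    scale = lr * lars_local_lr(weights, grads, eta, weight_decay, epsilon)
    new_velocity = [
        momentum * v + scale * (g + weight_decay * w)
        for w, g, v in zip(weights, grads, velocity)
    ]
    new_weights = [w - v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Example: ||w|| = 5 and ||g|| = 10, so the layer-wise factor is eta * 5 / 10.
w, v = lars_step([3.0, 4.0], [6.0, 8.0], velocity=[0.0, 0.0])
```

Because the factor depends on the ratio of norms, a layer with large gradients relative to its weights takes a proportionally smaller step, which is the behavior that eta controls.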

step(closure=None)

Perform a single optimization step.

Parameters:
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.
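
The closure contract can be illustrated without any deep learning library. The sketch below uses a hypothetical stand-in class, ToySGD, rather than LARS itself. It assumes the usual PyTorch convention, in which step calls the closure to recompute the loss and gradients and then returns that loss. With a real PyTorch optimizer, the closure would typically clear the gradients, compute the loss, call backward on it, and return it.

```python
class ToySGD:
    """Hypothetical stand-in optimizer, used only to show the closure contract."""

    def __init__(self, params, lr=0.1):
        self.params = params  # dict mapping parameter name to float value
        self.grads = {}       # gradients, filled in by the closure
        self.lr = lr

    def step(self, closure=None):
        loss = None
        if closure is not None:
            # Re-evaluate the model so gradients match the current parameters.
            loss = closure()
        for name, grad in self.grads.items():
            self.params[name] -= self.lr * grad
        return loss


params = {"w": 2.0}
opt = ToySGD(params, lr=0.1)


def closure():
    # The loss is w**2, so its gradient with respect to w is 2 * w.
    w = params["w"]
    opt.grads["w"] = 2.0 * w
    return w * w


# The loss is computed at the parameters as they were before the update.
loss = opt.step(closure)
```

Passing a closure is optional: when it is omitted, the gradients are expected to have been computed before step is called.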