LinearWarmupCosineAnnealing

class stable_ssl.optim.LinearWarmupCosineAnnealing(optimizer, total_steps, start_factor=0.01, end_lr=0.0, peak_step=0.01)[source]

Combine linear warmup with cosine annealing decay.

This function creates a scheduler that first linearly warms up the learning rate, then decays it along a cosine curve. This combination is commonly used in self-supervised learning, where the warmup phase stabilizes early training and improves convergence.

Parameters:
  • optimizer (torch.optim.Optimizer) – The optimizer to schedule.

  • total_steps (int) – Total number of training steps.

  • start_factor (float, optional) – Initial warmup factor: the learning rate starts at start_factor times the optimizer's base learning rate. Defaults to 0.01.

  • end_lr (float, optional) – Final learning rate after annealing. Defaults to 0.0.

  • peak_step (float, optional) – Fraction of total_steps at which warmup ends and cosine annealing begins. Defaults to 0.01.

Returns:

Combined warmup and annealing scheduler.

Return type:

torch.optim.lr_scheduler.SequentialLR
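
Conceptually, the returned scheduler chains a linear warmup phase into a cosine decay phase. A minimal sketch of how such a SequentialLR could be composed from PyTorch's built-in LinearLR and CosineAnnealingLR (an illustration of the documented semantics, with warmup_cosine as a hypothetical name; not the library's actual source):

>>> from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR
>>> def warmup_cosine(optimizer, total_steps, start_factor=0.01, end_lr=0.0, peak_step=0.01):
...     warmup_steps = int(peak_step * total_steps)  # peak_step is a fraction of total_steps
...     warmup = LinearLR(optimizer, start_factor=start_factor, total_iters=warmup_steps)
...     decay = CosineAnnealingLR(optimizer, T_max=total_steps - warmup_steps, eta_min=end_lr)
...     return SequentialLR(optimizer, schedulers=[warmup, decay], milestones=[warmup_steps])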

Example

>>> import torch
>>> from stable_ssl.optim import LinearWarmupCosineAnnealing
>>> model = torch.nn.Linear(8, 2)  # toy model for illustration
>>> optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
>>> scheduler = LinearWarmupCosineAnnealing(optimizer, total_steps=1000)
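
The scheduler advances once per optimization step, not once per epoch. A typical loop (sketch; loss computation and backward pass omitted) is:

>>> for step in range(1000):
...     optimizer.step()
...     scheduler.step()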