OptimConfig#
- class stable_ssl.config.OptimConfig(optimizer: str = 'LARS', lr: float = 1.0, batch_size: int = 256, epochs: int = 1000, max_steps: int = -1, weight_decay: float = 0, momentum: float | None = None, nesterov: bool | None = None, betas: Tuple[float, float] | None = None, grad_max_norm: float | None = None)[source]#
Bases: object
Configuration for the optimizer parameters.
- Parameters:
optimizer (str) – Type of optimizer to use (e.g., “AdamW”, “RMSprop”, “SGD”, “LARS”). Default is “LARS”.
lr (float) – Learning rate for the optimizer. Default is 1.0.
batch_size (int, optional) – Batch size for training. Default is 256.
epochs (int, optional) – Number of epochs to train the model. Default is 1000.
max_steps (int, optional) – Maximum number of steps per epoch. A value of -1 means no limit. Default is -1.
weight_decay (float) – Weight decay for the optimizer. Default is 0.
momentum (float, optional) – Momentum for the optimizer. Default is None.
nesterov (bool, optional) – Whether to use Nesterov momentum. Default is None.
betas (Tuple[float, float], optional) – Betas for the AdamW optimizer. Default is None.
grad_max_norm (float, optional) – Maximum norm for gradient clipping. If None, gradients are not clipped. Default is None.
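The signature above behaves like a plain dataclass of keyword fields with defaults. The following stand-alone sketch mirrors that signature for illustration (it is a hypothetical re-implementation, not the library's source), along with a typical instantiation overriding a few fields:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class OptimConfig:
    """Illustrative stand-in mirroring the documented signature."""

    optimizer: str = "LARS"
    lr: float = 1.0
    batch_size: int = 256
    epochs: int = 1000
    max_steps: int = -1          # -1 means no per-epoch step limit
    weight_decay: float = 0
    momentum: Optional[float] = None
    nesterov: Optional[bool] = None
    betas: Optional[Tuple[float, float]] = None
    grad_max_norm: Optional[float] = None


# Example: configure AdamW with explicit betas and gradient clipping;
# unspecified fields keep their defaults (e.g. epochs stays 1000).
cfg = OptimConfig(optimizer="AdamW", lr=1e-3, betas=(0.9, 0.999), grad_max_norm=1.0)
print(cfg.optimizer, cfg.lr, cfg.epochs)
```

In an actual configuration file or script, the same keyword arguments would be passed to `stable_ssl.config.OptimConfig` directly.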