RepeatedRandomSampler

class stable_pretraining.data.RepeatedRandomSampler(data_source_or_len: int | Iterable, n_views: int = 1, replacement: bool = False, seed: int = 0, pass_view_idx: bool = False)

Bases: DistributedSampler

Sampler that repeats each dataset index consecutively for multi-view learning.

IMPORTANT: This sampler repeats each index n_views times in a row, creating sequences like [0,0,0,0, 1,1,1,1, 2,2,2,2, …] for n_views=4. This means:

  • The DataLoader loads the same image n_views times consecutively.

  • Each repeated index goes through the transform pipeline separately.

  • The batch_size passed to the DataLoader counts total augmented samples, not unique images. For example, batch_size=128 with n_views=8 yields only 16 unique images, each appearing 8 times with different augmentations.
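The batch-size arithmetic above can be checked with a few lines of plain Python. This is only an illustrative sketch: the helper name unique_images_per_batch is hypothetical and not part of stable_pretraining, and it assumes batch_size is chosen as a multiple of n_views so that all views of an image land in the same batch.

```python
def unique_images_per_batch(batch_size: int, n_views: int) -> int:
    """Number of distinct dataset indices in one batch (hypothetical helper)."""
    if n_views < 1:
        raise ValueError("n_views must be at least 1")
    # Assumption of this sketch: batch_size is a multiple of n_views,
    # otherwise the views of one image would be split across two batches.
    if batch_size % n_views != 0:
        raise ValueError("batch_size should be a multiple of n_views")
    return batch_size // n_views


print(unique_images_per_batch(128, 8))  # 16
```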

Designed to work with RoundRobinMultiViewTransform which uses a counter to apply different augmentations to each repeated occurrence of the same image.
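The counter idea can be sketched as follows. This is not the library's RoundRobinMultiViewTransform; the class RoundRobinSketch and the string-based "augmentations" are hypothetical stand-ins that only illustrate how a call counter can route each consecutive repeat of an index to a different transform.

```python
from typing import Any, Callable, List


class RoundRobinSketch:
    """Hypothetical stand-in, not the library's RoundRobinMultiViewTransform."""

    def __init__(self, transforms: List[Callable[[Any], Any]]):
        if not transforms:
            raise ValueError("need at least one transform")
        self.transforms = transforms
        self.counter = 0  # number of calls made so far

    def __call__(self, sample: Any) -> Any:
        # Pick the transform by call count, cycling through the list.
        transform = self.transforms[self.counter % len(self.transforms)]
        self.counter += 1
        return transform(sample)


# Two toy "augmentations" and a sampler-like index stream with n_views=2.
rr = RoundRobinSketch([lambda x: f"{x}-viewA", lambda x: f"{x}-viewB"])
indices = [0, 0, 1, 1]
views = [rr(f"img{i}") for i in indices]
print(views)  # ['img0-viewA', 'img0-viewB', 'img1-viewA', 'img1-viewB']
```

In this sketch the number of transforms equals n_views, so each repeat of an index receives a distinct augmentation.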

Example behavior with n_views=3:

Dataset indices: [0, 1, 2, 3, 4]
Sampler output: [0,0,0, 1,1,1, 2,2,2, 3,3,3, 4,4,4]
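The index pattern can be reproduced with a short pure-Python sketch. The function repeated_indices is hypothetical and is not how the sampler is implemented: it ignores distributed sharding and the replacement option, and its shuffle and seed arguments are assumptions added only to show that each index stays grouped even when the order is randomized.

```python
import random
from typing import List


def repeated_indices(n_items: int, n_views: int,
                     shuffle: bool = False, seed: int = 0) -> List[int]:
    """Repeat each index n_views times in a row (hypothetical helper)."""
    order = list(range(n_items))
    if shuffle:
        # Shuffle the unique indices first; repeats stay consecutive.
        random.Random(seed).shuffle(order)
    return [idx for idx in order for _ in range(n_views)]


print(repeated_indices(5, 3))
# [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```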

Parameters:
  • data_source_or_len (int | Iterable) – dataset to sample from, or its length

  • n_views (int) – number of times to repeat each index consecutively, default=1

  • replacement (bool) – samples are drawn on-demand with replacement if True, default=False

  • seed (int) – random seed for shuffling, default=0

  • pass_view_idx (bool) – whether to pass the view index to the dataset's __getitem__, default=False

Note: For an alternative approach that loads each image once, consider using MultiViewTransform with a standard sampler.
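To make the contrast concrete, here is a minimal sketch of the load-once idea. It is not the library's MultiViewTransform: the function make_views and the toy string transforms are hypothetical, and the sketch only shows one loaded sample being turned into several views inside a single transform call.

```python
from typing import Any, Callable, List


def make_views(sample: Any, transforms: List[Callable[[Any], Any]]) -> List[Any]:
    """Apply every transform to one already-loaded sample (hypothetical helper)."""
    return [transform(sample) for transform in transforms]


# The sample is loaded once, then augmented twice.
loaded = "img0"
two_views = make_views(loaded, [lambda x: f"{x}-viewA", lambda x: f"{x}-viewB"])
print(two_views)  # ['img0-viewA', 'img0-viewB']
```

With this pattern a batch of B dataset items yields B times the number of views, so batch_size counts unique images rather than augmented samples.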

Examples using RepeatedRandomSampler:


Multi-layer probe for vision models.

Supervised Learning Example
