RepeatedRandomSampler
- class stable_pretraining.data.RepeatedRandomSampler(data_source_or_len: int | Iterable, n_views: int = 1, replacement: bool = False, seed: int = 0, pass_view_idx: bool = False)
Bases: DistributedSampler
Sampler that repeats each dataset index consecutively for multi-view learning.
IMPORTANT: This sampler repeats each index n_views times in a row, creating sequences like [0,0,0,0, 1,1,1,1, 2,2,2,2, …] for n_views=4. This means:
- The DataLoader will load the SAME image multiple times consecutively.
- Each repeated index goes through the transform pipeline separately.
- BATCH SIZE: The batch_size in DataLoader refers to total augmented samples. For example, batch_size=128 with n_views=8 means only 16 unique images, each appearing 8 times with different augmentations.
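The batch-size arithmetic above can be checked with a small helper. This is only an illustrative sketch: `unique_images_per_batch` is a hypothetical name, not part of stable_pretraining, and the requirement that batch_size be a multiple of n_views is an assumption made here so that all views of one image land in the same batch.

```python
def unique_images_per_batch(batch_size: int, n_views: int) -> int:
    """Return how many distinct dataset indices fit in one batch.

    Hypothetical helper for illustration only; not a library function.
    """
    if n_views < 1:
        raise ValueError("n_views must be >= 1")
    # Assumption: batch_size is a multiple of n_views, so the views of one
    # image are never split across two batches.
    if batch_size % n_views != 0:
        raise ValueError("batch_size should be a multiple of n_views")
    return batch_size // n_views


print(unique_images_per_batch(128, 8))  # 16
```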
Designed to work with RoundRobinMultiViewTransform which uses a counter to apply different augmentations to each repeated occurrence of the same image.
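To make the counter idea concrete, here is a minimal sketch of a round-robin transform. `RoundRobinSketch` is a hypothetical stand-in written for this page, not the library's RoundRobinMultiViewTransform, whose actual implementation and constructor arguments are not shown here. String methods stand in for image augmentations.

```python
from typing import Any, Callable, Sequence


class RoundRobinSketch:
    """Hypothetical illustration of a counter-based round-robin transform."""

    def __init__(self, transforms: Sequence[Callable[[Any], Any]]):
        self.transforms = list(transforms)
        self.counter = 0  # number of calls made so far

    def __call__(self, sample: Any) -> Any:
        # Pick the transform for this call, then advance the counter so the
        # next repeated occurrence of the sample gets the next transform.
        transform = self.transforms[self.counter % len(self.transforms)]
        self.counter += 1
        return transform(sample)


# Three "augmentations" applied to three consecutive loads of the same sample.
views = RoundRobinSketch([str.lower, str.upper, str.title])
print([views("cat") for _ in range(3)])  # ['cat', 'CAT', 'Cat']
```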
- Example behavior with n_views=3:
Dataset indices: [0, 1, 2, 3, 4]
Sampler output: [0,0,0, 1,1,1, 2,2,2, 3,3,3, 4,4,4]
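The repetition pattern can be reproduced in plain Python. This is a sketch of the index sequence only, under the assumption that shuffling (when enabled) permutes the dataset indices before each one is repeated. `repeated_indices` and its `shuffle` flag are hypothetical; the sketch does not model the distributed sharding or the replacement option of the real sampler.

```python
import random
from typing import Iterator


def repeated_indices(
    n_items: int, n_views: int = 1, shuffle: bool = False, seed: int = 0
) -> Iterator[int]:
    """Yield each index n_views times in a row (illustrative sketch)."""
    order = list(range(n_items))
    if shuffle:
        # Seeded permutation, so the order is reproducible for a given seed.
        random.Random(seed).shuffle(order)
    for idx in order:
        for _ in range(n_views):
            yield idx


print(list(repeated_indices(5, n_views=3)))
# [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```

With shuffle=True the order of the indices changes, but each index still appears in an unbroken run of n_views.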
- Parameters:
data_source_or_len (int | Iterable) – dataset to sample from, or its length
n_views (int) – number of times to repeat each index consecutively, default=1
replacement (bool) – samples are drawn on-demand with replacement if True, default=False
seed (int) – random seed for shuffling, default=0
pass_view_idx (bool) – whether to pass the view index to the dataset's __getitem__, default=False
Note: For an alternative approach that loads each image once, consider using MultiViewTransform with a standard sampler.
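The difference between the two approaches is how often the underlying sample is loaded. The sketch below illustrates the load-once idea in plain Python. `CountingDataset` and `multi_view_once` are hypothetical names used only for illustration and do not reflect the actual MultiViewTransform API.

```python
from typing import Any, Callable, List, Sequence


class CountingDataset:
    """Toy dataset that counts how many times an item is loaded."""

    def __init__(self, items: Sequence[Any]):
        self.items = list(items)
        self.loads = 0

    def __getitem__(self, index: int) -> Any:
        self.loads += 1
        return self.items[index]


def multi_view_once(
    dataset: CountingDataset, index: int, transforms: Sequence[Callable[[Any], Any]]
) -> List[Any]:
    """Load the sample a single time, then apply every transform to it."""
    sample = dataset[index]
    return [transform(sample) for transform in transforms]


data = CountingDataset(["cat", "dog"])
views = multi_view_once(data, 0, [str.upper, str.title])
print(views, data.loads)  # ['CAT', 'Cat'] 1
```

By contrast, a sampler that repeats index 0 twice would call the dataset's __getitem__ twice for the same two views.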
Examples using RepeatedRandomSampler:

sphx_glr_auto_examples_multi_layer_probe.py