CIFAR-10
Overview
The CIFAR-10 dataset is a widely-used image classification benchmark for single-label classification consisting of 60,000 32×32 RGB color images across 10 balanced classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck), with 6,000 images per class.
Train: 50,000 images (5,000 per class)
Test: 10,000 images (1,000 per class)
Data Structure
When accessing an example using ds[i], you will receive a dictionary with the following keys:
Key |
Type |
Description |
|---|---|---|
|
|
32×32×3 RGB image |
|
int |
Class label (0-9) |
Usage Example
Basic Usage
from stable_datasets.images.cifar10 import CIFAR10
# First run will download + prepare cache, then return the split as a HF Dataset
ds = CIFAR10(split="train")
# If you omit the split (split=None), you get a DatasetDict with all available splits
ds_all = CIFAR10(split=None)
sample = ds[0]
print(sample.keys()) # {"image", "label"}
# Optional: make it PyTorch-friendly
ds_torch = ds.with_format("torch")
References
Official website: https://www.cs.toronto.edu/~kriz/cifar.html
License: MIT License
Citation
@Techreport{krizhevsky2009learning,
author = {Krizhevsky, Alex and Hinton, Geoffrey},
address = {Toronto, Ontario},
institution = {University of Toronto},
number = {0},
publisher = {Technical report, University of Toronto},
title = {Learning multiple layers of features from tiny images},
year = {2009},
title_with_no_special_chars = {Learning multiple layers of features from tiny images},
url = {https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf}
}