stable_datasets.images package
Submodules
stable_datasets.images.arabic_characters module
- class ArabicCharacters(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
Arabic Handwritten Characters Dataset
Abstract: Handwritten Arabic character recognition systems face several challenges, including the unlimited variation in human handwriting and the limited availability of large public databases. In this work, we model a deep learning architecture that can be effectively applied to recognizing Arabic handwritten characters. A Convolutional Neural Network (CNN) is a special type of feed-forward multilayer network trained in supervised mode. The CNN was trained and tested on our database of 16,800 handwritten Arabic characters. Optimization methods were implemented to increase the performance of the CNN. Common machine learning methods usually apply a combination of a feature extractor and a trainable classifier; the use of a CNN leads to significant improvements over different machine-learning classification algorithms. Our proposed CNN yields an average misclassification error of 5.1% on the test data.
Context: The motivation of this study is to use cross-knowledge learned from multiple works to enhance the performance of Arabic handwritten character recognition. In recent years, interest in Arabic handwritten character recognition has grown alongside the diversity of handwriting styles, making it important to develop new and advanced solutions for handwriting recognition. Deep learning systems need a huge amount of data (images) to be able to make good decisions.
Content: The dataset is composed of 16,800 characters written by 60 participants aged 19 to 40 years; 90% of participants are right-handed. Each participant wrote each character (from ’alef’ to ’yeh’) ten times on two forms, as shown in Fig. 7(a) and 7(b) of the paper. The forms were scanned at a resolution of 300 dpi, and each block was segmented automatically using Matlab 2016a to determine its coordinates. The database is partitioned into two sets: a training set (13,440 characters, 480 images per class) and a test set (3,360 characters, 120 images per class). The writers of the training and test sets are disjoint, and the assignment of writers to the test set was randomized to ensure that the test-set writers do not all come from a single institution (ensuring variability of the test set).
- SOURCE: Mapping = mappingproxy({'homepage': 'https://github.com/mloey/Arabic-Handwritten-Characters-Dataset', 'assets': mappingproxy({'train': 'https://github.com/mloey/Arabic-Handwritten-Characters-Dataset/raw/master/Train%20Images%2013440x32x32.zip', 'test': 'https://github.com/mloey/Arabic-Handwritten-Characters-Dataset/raw/master/Test%20Images%203360x32x32.zip'}), 'citation': '@article{el2017arabic,\n title={Arabic handwritten characters recognition using convolutional neural network},\n author={El-Sawy, Ahmed and Loey, Mohamed and El-Bakry, Hazem},\n journal={WSEAS Transactions on Computer Research},\n volume={5},\n pages={11--19},\n year={2017}}'})
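A minimal construction sketch, assuming only the constructor signature shown above; what the built object exposes afterwards (iteration, indexing, conversion) is not documented here, and the keyword values are illustrative placeholders:

    # Hypothetical usage of the BaseDatasetBuilder-style constructor shown above.
    from stable_datasets.images.arabic_characters import ArabicCharacters

    builder = ArabicCharacters(
        split="train",                       # 'train' or 'test', matching the SOURCE assets
        download_dir="~/data/downloads",     # placeholder: where raw archives are fetched
        processed_cache_dir="~/data/cache",  # placeholder: where processed examples are cached
    )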
stable_datasets.images.arabic_digits module
- class ArabicDigits(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Arabic Handwritten Digits Dataset.
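Because this builder derives from HuggingFace datasets' GeneratorBasedBuilder, the standard builder protocol (download_and_prepare followed by as_dataset) should apply; a sketch under that assumption, with the split name and example fields unverified:

    from stable_datasets.images.arabic_digits import ArabicDigits

    builder = ArabicDigits()
    builder.download_and_prepare()             # download assets and write Arrow files
    train = builder.as_dataset(split="train")  # returns a datasets.Dataset
    print(train[0])                            # e.g. {'image': ..., 'label': ...} (assumed fields)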
stable_datasets.images.awa2 module
- class AWA2(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
The Animals with Attributes 2 (AwA2) dataset provides images across 50 animal classes, useful for attribute-based classification and zero-shot learning research. See https://cvml.ista.ac.at/AwA2/ for more information.
stable_datasets.images.beans module
- class Beans(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Bean disease dataset for classification of three classes: Angular Leaf Spot, Bean Rust, and Healthy leaves.
stable_datasets.images.cars196 module
- class Cars196(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
Cars-196 Dataset
The Cars-196 dataset, also known as the Stanford Cars dataset, is a benchmark dataset for fine-grained visual classification of automobiles. It contains 16,185 color images covering 196 car categories, where each category is defined by a specific combination of make, model, and year. The dataset is split into 8,144 training images and 8,041 test images, with the first 98 classes used exclusively for training and the remaining 98 classes reserved for testing, ensuring that training and test classes are disjoint. Images are collected from real-world scenes and exhibit significant variation in viewpoint, background, and lighting conditions. Each image is annotated with a class label and a tight bounding box around the car, making the dataset suitable for fine-grained recognition tasks that require precise object localization and strong generalization to unseen categories.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://ai.stanford.edu/~jkrause/cars/car_dataset.html', 'assets': mappingproxy({'train': 'https://huggingface.co/datasets/haodoz0118/cars196-img/resolve/main/cars196_train.zip', 'test': 'https://huggingface.co/datasets/haodoz0118/cars196-img/resolve/main/cars196_test.zip'}), 'citation': '@inproceedings{krause20133d,\n title={3d object representations for fine-grained categorization},\n author={Krause, Jonathan and Stark, Michael and Deng, Jia and Fei-Fei, Li},\n booktitle={Proceedings of the IEEE international conference on computer vision workshops},\n pages={554--561},\n year={2013}}'})
stable_datasets.images.cassava module
Legacy Cassava loader (to be refactored into a BaseDatasetBuilder).
This module was moved under stable_datasets.images to align the repository layout. It still exposes the original imperative cassava.load(…) API for now.
- class cassava[source]
Bases: object
Plant image classification.
The data consists of two folders: a training folder with five subfolders containing the images for the five respective classes, and a test folder containing test images.
- static download(path)[source]
- static load(path=None)[source]
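A sketch of the legacy imperative API listed above; the calls follow the signatures of download(path) and load(path=None), but the return structure of load is not documented here and is an assumption:

    from stable_datasets.images.cassava import cassava

    # Fetch the archives into a local directory.
    cassava.download("~/data/cassava")

    # Load the data; what load() returns (e.g. arrays, file lists) is an
    # assumption -- inspect the loader before relying on a structure.
    data = cassava.load("~/data/cassava")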
stable_datasets.images.celeb_a module
- class CelebA(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
The CelebA dataset is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations.
stable_datasets.images.cifar10 module
- class CIFAR10(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
Image classification. The `CIFAR-10 <https://www.cs.toronto.edu/~kriz/cifar.html>`_ dataset was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://www.cs.toronto.edu/~kriz/cifar.html', 'assets': mappingproxy({'train': 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz', 'test': 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'}), 'citation': '@article{krizhevsky2009learning,\n title={Learning multiple layers of features from tiny images},\n author={Krizhevsky, Alex and Hinton, Geoffrey and others},\n year={2009},\n publisher={Toronto, ON, Canada}}'})
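Since SOURCE is a read-only class attribute (a mappingproxy), the homepage, download URLs, and citation can be inspected without downloading anything:

    from stable_datasets.images.cifar10 import CIFAR10

    print(CIFAR10.SOURCE["homepage"])         # https://www.cs.toronto.edu/~kriz/cifar.html
    print(CIFAR10.SOURCE["assets"]["train"])  # cifar-10-python.tar.gz URL
    print(CIFAR10.SOURCE["citation"])         # BibTeX entry for Krizhevsky (2009)

    # mappingproxy is immutable: CIFAR10.SOURCE["homepage"] = ... raises TypeError.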
stable_datasets.images.cifar100 module
- class CIFAR100(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
CIFAR-100 dataset, a variant of CIFAR-10 with 100 classes.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://www.cs.toronto.edu/~kriz/cifar.html', 'assets': mappingproxy({'train': 'https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz', 'test': 'https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz'}), 'citation': '@article{krizhevsky2009learning,\n title={Learning multiple layers of features from tiny images},\n author={Krizhevsky, Alex and Hinton, Geoffrey and others},\n year={2009},\n publisher={Toronto, ON, Canada}}'})
stable_datasets.images.cifar100_c module
- class CIFAR100C(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
CIFAR-100-C dataset with corrupted CIFAR-100 images.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://zenodo.org/records/3555552', 'assets': mappingproxy({'test': 'https://zenodo.org/records/3555552/files/CIFAR-100-C.tar?download=1'}), 'citation': '@article{hendrycks2019robustness,\n title={Benchmarking Neural Network Robustness to Common Corruptions and Perturbations},\n author={Dan Hendrycks and Thomas Dietterich},\n journal={Proceedings of the International Conference on Learning Representations},\n year={2019}}'})
stable_datasets.images.cifar10_c module
- class CIFAR10C(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
CIFAR-10-C dataset with corrupted CIFAR-10 images.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://zenodo.org/records/2535967', 'assets': mappingproxy({'test': 'https://zenodo.org/records/2535967/files/CIFAR-10-C.tar?download=1'}), 'citation': '@article{hendrycks2019robustness,\n title={Benchmarking Neural Network Robustness to Common Corruptions and Perturbations},\n author={Dan Hendrycks and Thomas Dietterich},\n journal={Proceedings of the International Conference on Learning Representations},\n year={2019}}'})
stable_datasets.images.country211 module
- class Country211(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Country211: Image Classification Dataset for Geolocation. This dataset uses a subset of the YFCC100M dataset, filtered by GPS coordinates to include images labeled with ISO-3166 country codes. Each country has a balanced sample of images for training, validation, and testing.
stable_datasets.images.cub200 module
- class CUB200(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Caltech-UCSD Birds-200-2011 (CUB-200-2011) Dataset
stable_datasets.images.dsprites module
- class DSprites(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
dSprites dataset of 2D shapes procedurally generated from six ground-truth latent factors (color, shape, scale, orientation, and x/y position), commonly used in disentangled-representation learning research.
stable_datasets.images.dtd module
- class DTD(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
Describable Textures Dataset (DTD)
DTD is a texture database consisting of 5640 images, organized according to a list of 47 terms (categories) inspired by human perception. There are 120 images per category. Image sizes range between 300x300 and 640x640, and at least 90% of each image's surface represents the category attribute. The images were collected from Google and Flickr by entering the proposed attributes and related terms as search queries, and were annotated using Amazon Mechanical Turk in several iterations. For each image, we provide the key attribute (main category) and a list of joint attributes.
The data is split into three equal parts (train, validation, and test), with 40 images per class in each split. We provide ground-truth annotations for both key and joint attributes, as well as the 10 splits of the data used for evaluation.
- SOURCE: Mapping = mappingproxy({'homepage': 'https://www.robots.ox.ac.uk/~vgg/data/dtd/', 'assets': mappingproxy({'train': 'https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz', 'test': 'https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz', 'val': 'https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz'}), 'citation': '@InProceedings{cimpoi14describing,\n Author = {M. Cimpoi and S. Maji and I. Kokkinos and S. Mohamed and and A. Vedaldi},\n Title = {Describing Textures in the Wild},\n Booktitle = {Proceedings of the {IEEE} Conf. on Computer Vision and Pattern Recognition ({CVPR})},\n Year = {2014}}'})
stable_datasets.images.e_mnist module
- class EMNIST(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
- BUILDER_CONFIGS = [EMNISTConfig(name='byclass', version=1.0.0, data_dir=None, data_files=None, description=None), EMNISTConfig(name='bymerge', version=1.0.0, data_dir=None, data_files=None, description=None), EMNISTConfig(name='balanced', version=1.0.0, data_dir=None, data_files=None, description=None), EMNISTConfig(name='letters', version=1.0.0, data_dir=None, data_files=None, description=None), EMNISTConfig(name='digits', version=1.0.0, data_dir=None, data_files=None, description=None), EMNISTConfig(name='mnist', version=1.0.0, data_dir=None, data_files=None, description=None)]
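Each EMNISTConfig above corresponds to one official EMNIST split; under the HuggingFace builder protocol, a variant is selected via the config_name constructor argument (a sketch, with split names assumed):

    from stable_datasets.images.e_mnist import EMNIST

    # Valid names are the configs listed above:
    # byclass, bymerge, balanced, letters, digits, mnist.
    builder = EMNIST(config_name="letters")
    builder.download_and_prepare()
    train = builder.as_dataset(split="train")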
- class EMNISTConfig(variant, **kwargs)[source]
Bases: BuilderConfig
stable_datasets.images.face_pointing module
stable_datasets.images.fashion_mnist module
- class FashionMNIST(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Grayscale image classification.
Fashion-MNIST is a dataset of Zalando’s article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
stable_datasets.images.fgvc_aircraft module
- class FGVCAircraft(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
FGVC Aircraft Dataset.
stable_datasets.images.flowers102 module
- class Flowers102(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Flowers102 Dataset.
stable_datasets.images.food101 module
- class Food101(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
A challenging dataset of 101 food categories, with 101,000 images. For each class, 250 manually reviewed test images are provided as well as 750 training images. On purpose, the training images were not cleaned, and thus still contain some amount of noise. This comes mostly in the form of intense colors and sometimes wrong labels. All images were rescaled to have a maximum side length of 512 pixels.
stable_datasets.images.hasy_v2 module
- class HASYv2(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
The HASYv2 dataset contains handwritten symbol images of 369 classes. Each image is 32x32 pixels in size.
stable_datasets.images.imagenet module
- exception DownloadError(message='')[source]
Bases: Exception
Base class for exceptions in this module.
- download(n_images, min_size, n_threads, wnids_list, out_dir)[source]
- download_images(dir_path, image_url_list, n_images, min_size)[source]
- get_url_request_list_function(request_url)[source]
- get_words_wnid(wnid)[source]
- main(wnid, out_dir, n_threads, n_images, fullsubtree, noroot, nosubtree, min_size)[source]
- mkdir(path)[source]
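These are imperative download helpers rather than a dataset builder. A hedged sketch of calling download with the signature shown above; the WordNet ID and directory are illustrative, and the exact semantics of min_size are assumed to be a pixel threshold:

    from stable_datasets.images.imagenet import download

    # Fetch up to 100 images for the given synset(s) using 4 threads,
    # skipping images below the minimum size. 'n02084071' is the WordNet
    # ID for 'dog'; substitute the synsets you actually need.
    download(
        n_images=100,
        min_size=64,
        n_threads=4,
        wnids_list=["n02084071"],
        out_dir="~/data/imagenet_subset",
    )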
stable_datasets.images.imagenette module
- class Imagenette(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
ImageNet-style classification datasets in three configurations: the full 1000-class ImageNet ('imagenet'), the 10-class Imagenette subset ('imagenette'), and the 100-class ImageNet-100 subset ('imagenet100').
- BUILDER_CONFIGS = [BuilderConfig(name='imagenet', version=1.1.0, data_dir=None, data_files=None, description='1000-class version'), BuilderConfig(name='imagenette', version=1.1.0, data_dir=None, data_files=None, description='10-class version'), BuilderConfig(name='imagenet100', version=1.1.0, data_dir=None, data_files=None, description='100-class version')]
stable_datasets.images.k_mnist module
- class KMNIST(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Kuzushiji-MNIST and Kuzushiji-49 datasets.
stable_datasets.images.linnaeus5 module
- class Linnaeus5(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Linnaeus 5 Dataset: RGB images (256x256) for classification across 5 categories.
stable_datasets.images.med_mnist module
- class MedMNIST(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
MedMNIST, a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D.
- BUILDER_CONFIGS = [MedMNISTConfig(name='pathmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST PathMNIST (2D)'), MedMNISTConfig(name='chestmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST ChestMNIST (2D, multi-label)'), MedMNISTConfig(name='dermamnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST DermaMNIST (2D)'), MedMNISTConfig(name='octmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST OCTMNIST (2D)'), MedMNISTConfig(name='pneumoniamnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST PneumoniaMNIST (2D)'), MedMNISTConfig(name='retinamnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST RetinaMNIST (2D)'), MedMNISTConfig(name='breastmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST BreastMNIST (2D)'), MedMNISTConfig(name='bloodmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST BloodMNIST (2D)'), MedMNISTConfig(name='tissuemnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST TissueMNIST (2D)'), MedMNISTConfig(name='organamnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST OrganAMNIST (2D)'), MedMNISTConfig(name='organcmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST OrganCMNIST (2D)'), MedMNISTConfig(name='organsmnist', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST OrganSMNIST (2D)'), MedMNISTConfig(name='organmnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST OrganMNIST3D (3D)'), MedMNISTConfig(name='nodulemnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST NoduleMNIST3D (3D)'), MedMNISTConfig(name='adrenalmnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST AdrenalMNIST3D (3D)'), MedMNISTConfig(name='fracturemnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST FractureMNIST3D (3D)'), MedMNISTConfig(name='vesselmnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST VesselMNIST3D (3D)'), MedMNISTConfig(name='synapsemnist3d', version=1.0.0, data_dir=None, data_files=None, description='MedMNIST SynapseMNIST3D (3D)')]
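MedMNIST combines the BaseDatasetBuilder constructor with the per-variant configs listed above; a sketch selecting one 2D variant, assuming the variant name is forwarded through **kwargs as config_name (this keyword is not confirmed by the signature):

    from stable_datasets.images.med_mnist import MedMNIST

    builder = MedMNIST(
        config_name="pathmnist",          # assumed keyword; any config name above is valid
        split="train",
        download_dir="~/data/downloads",  # placeholder path
    )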
- class MedMNISTConfig(*, num_classes: int, is_3d: bool = False, multi_label: bool = False, **kwargs)[source]
Bases: BuilderConfig
BuilderConfig with per-variant metadata used by MedMNIST._info().
stable_datasets.images.mnist module
- class MNIST(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
MNIST Dataset using raw IDX files for digit classification.
stable_datasets.images.not_mnist module
- class NotMNIST(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
NotMNIST Dataset that contains images of letters A-J.
stable_datasets.images.patch_camelyon module
PatchCamelyon dataset (stub).
This file was previously a broken legacy loader at the top-level package. It was moved under stable_datasets.images to match the repository layout.
TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.
- class PatchCamelyon(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
stable_datasets.images.places365_small module
- class Places365Small(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
The Places365-Standard dataset (small version) for image classification.
- static extract_train_class(input_string)[source]
stable_datasets.images.rock_paper_scissor module
- class RockPaperScissor(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Rock Paper Scissors dataset.
stable_datasets.images.stl10 module
- class STL10(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
STL-10 Dataset
The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms.
stable_datasets.images.svhn module
- class SVHN(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
SVHN (Street View House Numbers) Dataset for image classification.
stable_datasets.images.tiny_imagenet module
- class TinyImagenet(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Tiny ImageNet dataset for image classification tasks. It contains 200 classes with 500 training images, 50 validation images, and 50 test images per class.
stable_datasets.images.tiny_imagenet_c module
- class TinyImagenetC(cache_dir: str | None = None, dataset_name: str | None = None, config_name: str | None = None, hash: str | None = None, base_path: str | None = None, info: DatasetInfo | None = None, features: Features | None = None, token: bool | str | None = None, repo_id: str | None = None, data_files: str | list | dict | DataFilesDict | None = None, data_dir: str | None = None, storage_options: dict | None = None, writer_batch_size: int | None = None, config_id: str | None = None, **config_kwargs)[source]
Bases: GeneratorBasedBuilder
Tiny ImageNet-C dataset for image classification tasks with corruptions applied.
Module contents
The package re-exports the builders documented above at the top level: ArabicCharacters, CIFAR10, CIFAR100, CIFAR100C, CIFAR10C, Cars196, DTD, and MedMNIST. See the per-module entries for details.