stable_datasets package
Subpackages
- stable_datasets.images package
- Submodules
- stable_datasets.images.arabic_characters module
- stable_datasets.images.arabic_digits module
- stable_datasets.images.awa2 module
- stable_datasets.images.beans module
- stable_datasets.images.cars196 module
- stable_datasets.images.cassava module
- stable_datasets.images.celeb_a module
- stable_datasets.images.cifar10 module
- stable_datasets.images.cifar100 module
- stable_datasets.images.cifar100_c module
- stable_datasets.images.cifar10_c module
- stable_datasets.images.country211 module
- stable_datasets.images.cub200 module
- stable_datasets.images.dsprites module
- stable_datasets.images.dtd module
- stable_datasets.images.e_mnist module
- stable_datasets.images.face_pointing module
- stable_datasets.images.fashion_mnist module
- stable_datasets.images.fgvc_aircraft module
- stable_datasets.images.flowers102 module
- stable_datasets.images.food101 module
- stable_datasets.images.hasy_v2 module
- stable_datasets.images.imagenet module
- stable_datasets.images.imagenette module
- stable_datasets.images.k_mnist module
- stable_datasets.images.linnaeus5 module
- stable_datasets.images.med_mnist module
- stable_datasets.images.mnist module
- stable_datasets.images.not_mnist module
- stable_datasets.images.patch_camelyon module
- stable_datasets.images.places365_small module
- stable_datasets.images.rock_paper_scissor module
- stable_datasets.images.stl10 module
- stable_datasets.images.svhn module
- stable_datasets.images.tiny_imagenet module
- stable_datasets.images.tiny_imagenet_c module
- Module contents
- stable_datasets.timeseries package
- Submodules
- stable_datasets.timeseries.CatsDogs module
- stable_datasets.timeseries.JapaneseVowels module
- stable_datasets.timeseries.MosquitoSound module
- stable_datasets.timeseries.Phoneme module
- stable_datasets.timeseries.RightWhaleCalls module
- stable_datasets.timeseries.TUTacousticscenes2017 module
- stable_datasets.timeseries.UCR_multivariate module
- stable_datasets.timeseries.UCR_univariate module
- stable_datasets.timeseries.UrbanSound module
- stable_datasets.timeseries.VoiceGenderDetection module
- stable_datasets.timeseries.audiomnist module
- stable_datasets.timeseries.birdvox_70k module
- stable_datasets.timeseries.birdvox_dcase_20k module
- stable_datasets.timeseries.brain_mnist module
- stable_datasets.timeseries.dcase_2019_task4 module
- stable_datasets.timeseries.dclde module
- stable_datasets.timeseries.esc module
- stable_datasets.timeseries.freefield1010 module
- stable_datasets.timeseries.fsd_kaggle_2018 module
- stable_datasets.timeseries.groove_MIDI module
- stable_datasets.timeseries.gtzan module
- stable_datasets.timeseries.high_gamma module
- stable_datasets.timeseries.irmas module
- stable_datasets.timeseries.picidae module
- stable_datasets.timeseries.seizures_neonatal module
- stable_datasets.timeseries.sonycust module
- stable_datasets.timeseries.speech_commands module
- stable_datasets.timeseries.vocalset module
- stable_datasets.timeseries.warblr module
- Module contents
Submodules
stable_datasets.cassava module
- class cassava[source]
Bases:
object
Plant images classification.
The data consists of two folders: a training folder with 5 subfolders, one per class, each containing the images for that class, and a test folder containing the test images.
Participants train their models on the images in the training folder and provide a submission file, like the sample provided, that lists each image name exactly as it appears in the test folder together with the corresponding class prediction. Labels correspond to the disease categories: cmd, healthy, cgm, cbsd, cbb.
Please cite this paper if you use the dataset for your project: https://arxiv.org/pdf/1908.02900.pdf
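The submission format described above can be sketched as follows. This is only an illustration, not the competition's official tooling: the column names `image_name` and `label` and the file names are hypothetical, and the sample submission file remains the authority on the exact layout. Only the five label strings come from the description.

```python
import csv
import io

# Disease categories named in the dataset description.
CLASSES = ["cmd", "healthy", "cgm", "cbsd", "cbb"]


def write_submission(predictions, out):
    """Write (image_name, label) rows to a CSV file object.

    The column names are hypothetical; follow the sample submission file.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["image_name", "label"])
    for image_name, label in predictions:
        if label not in CLASSES:
            raise ValueError(f"unknown label: {label!r}")
        writer.writerow([image_name, label])


# Hypothetical file names, for illustration only.
buffer = io.StringIO()
write_submission([("test-img-0.jpg", "cmd"), ("test-img-1.jpg", "healthy")], buffer)
submission_text = buffer.getvalue()
```

Rejecting labels outside the five categories catches typos before the file is written.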
- static download(path)[source]
Download the cassava dataset and store the result in the given path.
- Parameters:
path (str) – the path where the downloaded files will be stored. If the directory does not exist, it is created.
- static load(path=None)[source]
- Parameters:
path (str, optional) – the path to look for the data, and where the data will be downloaded if not present. Defaults to $DATASET_PATH.
- Returns:
train_images (array)
train_labels (array)
valid_images (array)
valid_labels (array)
test_images (array)
test_labels (array)
stable_datasets.utils module
- class BaseDatasetBuilder(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases:
GeneratorBasedBuilder
Base class for stable-datasets that enables direct dataset loading.
- bulk_download(urls: Iterable[str], dest_folder: str | Path, backend: str = 'filesystem', cache_dir: str = '~/.stable_datasets/') list[Path][source]
Download multiple files concurrently and return their local paths.
- Parameters:
urls – Iterable of URL strings to download.
dest_folder – Destination folder for downloads.
backend – requests_cache backend (e.g. “filesystem”).
cache_dir – Cache directory for requests_cache.
- Returns:
Local file paths in the same order as the input URLs.
- Return type:
list[Path]
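The ordering guarantee above (paths returned in the same order as the input URLs, even though downloads run concurrently) can be sketched as follows. This is not the library's implementation: the thread pool, `fetch_all`, and the stand-in `fake_fetch` (which only computes a path and performs no network access) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fake_fetch(url, dest_folder):
    """Stand-in for a real download: only computes the local path."""
    return Path(dest_folder) / url.rsplit("/", 1)[-1]


def fetch_all(urls, dest_folder, fetch=fake_fetch, max_workers=4):
    """Run fetch concurrently and return results in input order."""
    urls = list(urls)  # accept any iterable, including generators
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which task finishes first.
        return list(pool.map(lambda u: fetch(u, dest_folder), urls))


paths = fetch_all(
    ["https://example.com/a.zip", "https://example.com/b.zip"],
    "downloads",
)
```

Using `Executor.map` rather than collecting futures as they complete is what keeps the output aligned with the input.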
- download(url: str, dest_folder: str | Path | None = None, backend: str = 'filesystem', cache_dir: str = '~/.stable_datasets/', progress_bar: bool = True, _progress_dict=None, _task_id=None) Path[source]
Download a single file from a URL with caching and optional progress tracking.
- Parameters:
url – URL to download from.
dest_folder – Destination folder for the downloaded file. If None, defaults to ~/.stable_datasets/downloads/.
backend – requests_cache backend (e.g. “filesystem”).
cache_dir – Cache directory for requests_cache.
progress_bar – Whether to show a tqdm progress bar (for standalone use).
_progress_dict – Internal shared dict for bulk_download progress reporting.
_task_id – Internal task ID key for bulk_download progress reporting.
- Returns:
Local path to the downloaded file.
- Return type:
Path
- Raises:
Exception – Any exception from network/file operations is logged and re-raised.
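Two behaviours described above can be sketched in isolation: falling back to ~/.stable_datasets/downloads/ when dest_folder is None, and logging an exception before re-raising it. The helper names `resolve_destination` and `guarded` are hypothetical, and deriving the file name from the URL path is an assumption; the library may choose names differently.

```python
import logging
from pathlib import Path
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Default taken from the dest_folder parameter description above.
DEFAULT_DOWNLOAD_DIR = Path("~/.stable_datasets/downloads/")


def resolve_destination(url, dest_folder=None):
    """Return the local path a file from `url` would be saved to."""
    folder = Path(dest_folder) if dest_folder is not None else DEFAULT_DOWNLOAD_DIR
    filename = Path(urlparse(url).path).name  # assumption: name from URL path
    if not filename:
        raise ValueError(f"cannot derive a file name from {url!r}")
    return folder.expanduser() / filename


def guarded(operation, *args):
    """Log any exception from `operation`, then re-raise it unchanged."""
    try:
        return operation(*args)
    except Exception:
        logger.exception("operation failed: %r", args)
        raise
```

A bare `raise` inside the `except` block preserves the original exception type and traceback, so callers can still handle specific errors.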
- load_from_tsfile_to_dataframe(full_file_path_and_name, return_separate_X_and_y=True, replace_missing_vals_with='NaN')[source]
Load data from a .ts file into a Pandas DataFrame. Credit to https://github.com/sktime/sktime/blob/7d572796ec519c35d30f482f2020c3e0256dd451/sktime/datasets/_data_io.py#L379
- Parameters:
full_file_path_and_name (str) – The full pathname of the .ts file to read.
return_separate_X_and_y – True if X and y values should be returned separately, as a DataFrame (X) and a numpy array (y); False otherwise. This is only relevant for data that
replace_missing_vals_with (str) – The value that missing values in the text file should be replaced with prior to parsing.
- Returns:
DataFrame, ndarray – If return_separate_X_and_y is true, a tuple containing a DataFrame and a numpy array holding the time series and the corresponding class values.
DataFrame – If return_separate_X_and_y is false, a single DataFrame containing all time series and (if relevant) a column "class_vals" with the associated class values.
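The two return shapes can be illustrated with a deliberately simplified parser. This sketch is not the sktime code credited above: it uses plain Python lists in place of a DataFrame and a numpy array, ignores all metadata lines, and assumes every case ends with a class label. The function name `parse_ts_text`, the `dims` key, and the sample data are hypothetical. It assumes the usual .ts layout, in which dimensions are separated by colons, values by commas, and missing values are written as `?`.

```python
def parse_ts_text(text, return_separate_X_and_y=True, replace_missing_vals_with="NaN"):
    """Parse a tiny subset of the .ts format into plain Python lists."""
    X, y, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not in_data:
            in_data = line.lower() == "@data"
            continue  # metadata lines such as @classLabel are ignored here
        line = line.replace("?", replace_missing_vals_with)
        *dims, label = line.split(":")
        X.append([[float(v) for v in dim.split(",")] for dim in dims])
        y.append(label)
    if return_separate_X_and_y:
        return X, y
    # Single structure: each row carries its label under "class_vals".
    return [{"dims": dims, "class_vals": label} for dims, label in zip(X, y)]


SAMPLE = """\
@problemName toy
@univariate false
@classLabel true a b
@data
1.0,2.0,3.0:4.0,5.0,6.0:a
7.0,?,9.0:1.0,1.0,1.0:b
"""

X, y = parse_ts_text(SAMPLE)
rows = parse_ts_text(SAMPLE, return_separate_X_and_y=False)
```

Replacing `?` with the string "NaN" before conversion works here because `float("NaN")` yields a floating-point NaN.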