stable_datasets.timeseries package
Submodules
stable_datasets.timeseries.CatsDogs module
stable_datasets.timeseries.JapaneseVowels module
stable_datasets.timeseries.MosquitoSound module
stable_datasets.timeseries.Phoneme module
stable_datasets.timeseries.RightWhaleCalls module
stable_datasets.timeseries.TUTacousticscenes2017 module
stable_datasets.timeseries.UCR_multivariate module
stable_datasets.timeseries.UCR_univariate module
stable_datasets.timeseries.UrbanSound module
stable_datasets.timeseries.VoiceGenderDetection module
stable_datasets.timeseries.audiomnist module
stable_datasets.timeseries.birdvox_70k module
stable_datasets.timeseries.birdvox_dcase_20k module
stable_datasets.timeseries.brain_mnist module
BrainMNIST dataset (stub).
Moved under stable_datasets.timeseries per project convention.
Reference: http://mindbigdata.com/opendb/index.html
TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.
- class BrainMNIST(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
stable_datasets.timeseries.dcase_2019_task4 module
stable_datasets.timeseries.dclde module
DCLDE dataset (stub).
This file was previously a legacy imperative loader at the top-level package. It was moved under stable_datasets.timeseries to match the repository layout.
TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.
- class DCLDE(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder
stable_datasets.timeseries.esc module
- load(path=None)[source]
ESC-10/50: Environmental Sound Classification
https://github.com/karolpiczak/ESC-50#download
The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification.
The dataset consists of 5-second-long recordings organized into 50 semantical classes (with 40 examples per class) loosely arranged into 5 major categories:
Animals; Natural soundscapes & water sounds; Human, non-speech sounds; Interior/domestic sounds; Exterior/urban noises.
Clips in this dataset have been manually extracted from public field recordings gathered by the Freesound.org project. The dataset has been prearranged into 5 folds for comparable cross-validation, making sure that fragments from the same original source file are contained in a single fold.
- Parameters:
path (str, optional) – the path to look for the data and where the data will be downloaded if not present (default: $DATASET_path)
- Returns:
wavs (array) – the waveforms as a NumPy array (matrix) whose first dimension indexes the recordings and whose second dimension is time
fine_labels (array) – the labels of the fine-grained classes (50 different ones) as an integer vector
coarse_labels (array) – the labels of each class's major category (5 of them)
folds (array) – the fold of each recording as an integer from 1 to 5, specifying how to split the data. Do not split a fold between train and test sets: doing so would place different segments of the same original recording in both, optimistically biasing the results.
esc10 (array) – the boolean vector specifying if the corresponding datum (wav, label, …) is in the ESC-10 dataset or not. That is, to load the ESC-10 dataset simply load ESC-50 and use this boolean vector to extract only the ESC-10 data.
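The fold and ESC-10 conventions described above can be sketched as follows. This is a minimal illustration, not the module's own code: the arrays are synthetic stand-ins shaped like the documented return values, the helper names fold_split and select_esc10 are hypothetical, and the placeholder label layout and ESC-10 mask are assumptions made only so the example runs without downloading the data.

```python
import numpy as np

# Synthetic stand-ins shaped like the documented return values of load().
# With the real data these five arrays would come from the loader instead.
rng = np.random.default_rng(0)
n_clips, n_samples = 2000, 16  # real clips are 5 s long; 16 samples keeps the sketch small
wavs = rng.standard_normal((n_clips, n_samples)).astype(np.float32)
fine_labels = np.repeat(np.arange(50), 40)      # 50 classes, 40 clips per class
coarse_labels = fine_labels // 10               # assumption: 10 consecutive classes per category
folds = np.tile(np.arange(1, 6), n_clips // 5)  # fold ids from 1 to 5
esc10 = fine_labels % 5 == 0                    # placeholder mask; real ESC-10 membership differs


def fold_split(folds, test_fold):
    """Return boolean (train, test) masks that keep every fold whole."""
    test_mask = np.asarray(folds) == test_fold
    return ~test_mask, test_mask


def select_esc10(esc10, *arrays):
    """Apply the ESC-10 boolean mask to each of the given arrays."""
    mask = np.asarray(esc10, dtype=bool)
    return tuple(a[mask] for a in arrays)


# Hold out fold 1 for testing and train on folds 2 to 5.
train_mask, test_mask = fold_split(folds, test_fold=1)
x_train, y_train = wavs[train_mask], fine_labels[train_mask]
x_test, y_test = wavs[test_mask], fine_labels[test_mask]

# Restrict everything to the ESC-10 subset with the same mask.
wavs10, labels10, folds10 = select_esc10(esc10, wavs, fine_labels, folds)
```

Looping test_fold over 1 to 5 gives the five-fold cross-validation the dataset was arranged for, and the same fold_split can be applied to folds10 when evaluating on the ESC-10 subset.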
stable_datasets.timeseries.freefield1010 module
stable_datasets.timeseries.fsd_kaggle_2018 module
Legacy FSDKaggle2018 loader (to be refactored into a BaseDatasetBuilder).
This module was moved under stable_datasets.timeseries to align the repository layout. It still exposes the original imperative FSDKaggle2018.load(…) API for now.
- class FSDKaggle2018[source]
Bases: object
FSDKaggle2018 Sound Classification https://zenodo.org/record/2552860
- download()[source]
- load()[source]
stable_datasets.timeseries.groove_MIDI module
stable_datasets.timeseries.gtzan module
stable_datasets.timeseries.high_gamma module
High-Gamma dataset (stub).
Moved under stable_datasets.timeseries as it is an EEG/time-series dataset.
Reference: https://github.com/robintibor/high-gamma-dataset
TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.
- class HighGamma(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]
Bases: BaseDatasetBuilder