stable_datasets.timeseries package

Submodules

stable_datasets.timeseries.CatsDogs module

stable_datasets.timeseries.JapaneseVowels module

stable_datasets.timeseries.MosquitoSound module

stable_datasets.timeseries.Phoneme module

stable_datasets.timeseries.RightWhaleCalls module

stable_datasets.timeseries.TUTacousticscenes2017 module

stable_datasets.timeseries.UCR_multivariate module

stable_datasets.timeseries.UCR_univariate module

stable_datasets.timeseries.UrbanSound module

stable_datasets.timeseries.VoiceGenderDetection module

stable_datasets.timeseries.audiomnist module

stable_datasets.timeseries.birdvox_70k module

stable_datasets.timeseries.birdvox_dcase_20k module

stable_datasets.timeseries.brain_mnist module

BrainMNIST dataset (stub).

Moved under stable_datasets.timeseries per project convention.

Reference: http://mindbigdata.com/opendb/index.html

TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.

class BrainMNIST(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]

Bases: BaseDatasetBuilder

SOURCE: Mapping = mappingproxy({'homepage': 'http://mindbigdata.com/opendb/index.html', 'citation': 'TBD', 'assets': mappingproxy({})})

VERSION: datasets.Version = 0.0.0
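
The SOURCE value shown above is a read-only mapping. As a minimal sketch, such a nested, immutable mapping can be built with the standard library's types.MappingProxyType. The variable name source is illustrative, and this page does not show how BaseDatasetBuilder consumes SOURCE, so nothing here should be read as the library's actual implementation.

```python
from types import MappingProxyType

# Illustrative only: mirrors the SOURCE value displayed for the BrainMNIST stub.
source = MappingProxyType({
    "homepage": "http://mindbigdata.com/opendb/index.html",
    "citation": "TBD",
    "assets": MappingProxyType({}),  # no downloadable assets declared yet
})

# Reads work like a normal dict.
homepage = source["homepage"]

# Writes are rejected with TypeError, which keeps the metadata constant.
try:
    source["citation"] = "changed"
    write_rejected = False
except TypeError:
    write_rejected = True
```

An empty assets mapping fits the stub status of the class: there is nothing to download until the builder is implemented.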

stable_datasets.timeseries.dcase_2019_task4 module

stable_datasets.timeseries.dclde module

DCLDE dataset (stub).

This file was previously a legacy imperative loader at the top-level package. It was moved under stable_datasets.timeseries to match the repository layout.

TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.

class DCLDE(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]

Bases: BaseDatasetBuilder

SOURCE: Mapping = mappingproxy({'homepage': 'TBD', 'citation': 'TBD', 'assets': mappingproxy({})})

VERSION: datasets.Version = 0.0.0

stable_datasets.timeseries.esc module

load(path=None)[source]

ESC-10/50: Environmental Sound Classification

https://github.com/karolpiczak/ESC-50#download

The ESC-50 dataset is a labeled collection of 2000 environmental audio recordings suitable for benchmarking methods of environmental sound classification.

The dataset consists of 5-second-long recordings organized into 50 semantical classes (with 40 examples per class) loosely arranged into 5 major categories:

Animals
Natural soundscapes & water sounds
Human, non-speech sounds
Interior/domestic sounds
Exterior/urban noises

Clips in this dataset have been manually extracted from public field recordings gathered by the Freesound.org project. The dataset has been prearranged into 5 folds for comparable cross-validation, making sure that fragments from the same original source file are contained in a single fold.

Parameters:

path (str, optional) – (default $DATASET_path) the path where the data is looked up, and to which it is downloaded if not present

Returns:

wavs (array) – the waveforms as a numpy array (matrix); the first dimension indexes the recordings and the second dimension is time
fine_labels (array) – the fine class labels (50 different ones) as an integer vector
coarse_labels (array) – the labels of the major category of each class (5 of them)
folds (array) – the fold of each recording, as an integer from 1 to 5, specifying how to split the data. A fold should not be split between train and test sets: that would place different fragments of the same recording in both sets and bias the results optimistically.
esc10 (array) – a boolean vector specifying whether the corresponding datum (wav, label, …) belongs to the ESC-10 dataset. To load ESC-10, load ESC-50 and use this boolean vector to extract only the ESC-10 data.
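
The return values described above can be combined as in the following sketch. It uses small synthetic arrays in place of the real output of load(), so the shapes and values are assumptions made for illustration. Only the roles of folds and esc10 come from the description above, and the helper split_by_fold is hypothetical.

```python
import numpy as np

# Synthetic stand-ins for the arrays returned by load(); the real dataset has
# 2000 recordings. Shapes and values here are made up for illustration.
rng = np.random.default_rng(0)
n_clips, n_samples = 20, 8
wavs = rng.standard_normal((n_clips, n_samples))
fine_labels = rng.integers(0, 50, size=n_clips)
folds = np.repeat(np.arange(1, 6), n_clips // 5)  # fold ids 1..5
esc10 = np.arange(n_clips) % 2 == 0               # boolean ESC-10 membership


def split_by_fold(test_fold):
    """Hold out one whole fold for testing; train on the other four."""
    test_mask = folds == test_fold
    return ~test_mask, test_mask


# Five-fold cross-validation that never splits a fold.
for test_fold in range(1, 6):
    train_mask, test_mask = split_by_fold(test_fold)
    x_train, y_train = wavs[train_mask], fine_labels[train_mask]
    x_test, y_test = wavs[test_mask], fine_labels[test_mask]
    # Train and test sets never share a fold.
    assert not set(folds[train_mask]) & set(folds[test_mask])

# Restrict to ESC-10 by applying the same boolean mask to every array.
wavs10, labels10, folds10 = wavs[esc10], fine_labels[esc10], folds[esc10]
```

Selecting whole folds with a mask keeps all fragments of one source recording on the same side of the split, which is the property the folds array is meant to guarantee.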

stable_datasets.timeseries.freefield1010 module

stable_datasets.timeseries.fsd_kaggle_2018 module

Legacy FSDKaggle2018 loader (to be refactored into a BaseDatasetBuilder).

This module was moved under stable_datasets.timeseries to align the repository layout. It still exposes the original imperative FSDKaggle2018.load(…) API for now.

class FSDKaggle2018[source]

Bases: object

FSDKaggle2018 Sound Classification https://zenodo.org/record/2552860

download()[source]

load()[source]

stable_datasets.timeseries.groove_MIDI module

stable_datasets.timeseries.gtzan module

stable_datasets.timeseries.high_gamma module

High-Gamma dataset (stub).

Moved under stable_datasets.timeseries as it is an EEG/time-series dataset.

Reference: https://github.com/robintibor/high-gamma-dataset

TODO: Implement as a HuggingFace-compatible builder using BaseDatasetBuilder and the local download helpers in stable_datasets.utils.

class HighGamma(*args, split=None, processed_cache_dir=None, download_dir=None, **kwargs)[source]

Bases: BaseDatasetBuilder

SOURCE: Mapping = mappingproxy({'homepage': 'https://github.com/robintibor/high-gamma-dataset', 'citation': 'TBD', 'assets': mappingproxy({})})
