Classify raw sound events
License: CC BY-SA 3.0
Context
With this dataset we hope to give a cheeky nod to the “cats and dogs” image dataset.
In fact, this dataset is intended to be the audio counterpart of the famous “cats and dogs” image classification task, available here on Kaggle.
Content
The dataset consists of many WAV files for both the cat and dog classes:
- cat has 164 WAV files, corresponding to 1323 sec of audio
- dog has 113 WAV files, corresponding to 598 sec of audio
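The two classes are not balanced, either in file count or in audio duration. As a rough illustration, the sketch below derives the mean clip length and each class's share from the figures listed above. The helper name `summarize` is made up for this example and is not part of the dataset.

```python
# Figures taken from the list above: (number of WAV files, total seconds of audio).
CLASSES = {"cat": (164, 1323), "dog": (113, 598)}


def summarize(classes):
    """Return per-class mean clip length and share of files / audio."""
    total_files = sum(n for n, _ in classes.values())
    total_secs = sum(s for _, s in classes.values())
    return {
        name: {
            "mean_len": secs / n_files,           # average seconds per file
            "file_share": n_files / total_files,  # fraction of all files
            "audio_share": secs / total_secs,     # fraction of all audio
        }
        for name, (n_files, secs) in classes.items()
    }


stats = summarize(CLASSES)
for name, s in stats.items():
    print(f"{name}: mean clip {s['mean_len']:.2f} s, "
          f"{s['file_share']:.1%} of files, {s['audio_share']:.1%} of audio")
# cat: mean clip 8.07 s, 59.2% of files, 68.9% of audio
# dog: mean clip 5.29 s, 40.8% of files, 31.1% of audio
```

Cat clips are longer on average, so the imbalance is larger when measured in seconds than in files. This may matter if you cut the audio into fixed-length windows before training.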
You can find a visual description of the WAV files here: Visualizing woofs & meows 🐱. In Accessing the Dataset 2 we propose a train/test split that you can use.
All the WAV files contain 16 kHz audio and vary in length.
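Because the clips vary in length, it can help to check the sample rate and duration of each file before batching. The following is a minimal sketch using Python's standard `wave` module, assuming the files are plain PCM WAV (the only kind `wave` reads). To stay self-contained it writes two synthetic 16 kHz tones to a temporary folder in place of real dataset files. The file names and the helpers `wav_info` and `write_demo_wav` are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnframes() / rate


def write_demo_wav(path, seconds, rate=16000):
    """Write a mono 16-bit sine tone; a stand-in for a real dataset file."""
    n = int(seconds * rate)
    samples = (int(12000 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(n))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(rate)
        wf.writeframes(b"".join(struct.pack("<h", s) for s in samples))


demo_dir = tempfile.mkdtemp()
infos = {}
for name, seconds in [("demo_a.wav", 0.5), ("demo_b.wav", 1.25)]:
    path = os.path.join(demo_dir, name)
    write_demo_wav(path, seconds)
    infos[name] = wav_info(path)
    print(name, infos[name])
```

To inspect the real data, call `wav_info` on each file path in the folder where you unpacked the dataset. Every file should report a rate of 16000, with varying durations.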
Acknowledgements
We deserve little credit for proposing the dataset here. Much of the work was done by the AE-Dataset creators (from which we extracted the two classes) and by the people behind FreeSound, from which the AE-Dataset was extracted.
PS: the AE-Dataset has a policy saying you can mention them: Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, “Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition”, Proc. Interspeech 2016, San Francisco.
Inspiration
You might use this dataset as a raw audio classification challenge 😉
A more challenging dataset is available here