Audio Cats and Dogs

user1

3 years ago

Classify raw sound events

Context

This dataset is a playful nod to the well-known “cats and dogs” image dataset.
It is meant to be the audio counterpart of that famous “cats and dogs” image classification task, which is also available on Kaggle.

Content

The dataset consists of WAV files for two classes, cat and dog:

cat: 164 WAV files, 1323 sec of audio in total
dog: 113 WAV files, 598 sec of audio in total
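The two classes are not balanced, in file count or in total duration. The short sketch below only does arithmetic on the figures quoted above to make that visible; the dictionary layout and variable names are illustrative and not part of the dataset.

```python
# Per-class totals as stated in the dataset description.
stats = {
    "cat": {"files": 164, "seconds": 1323},
    "dog": {"files": 113, "seconds": 598},
}

total_files = sum(s["files"] for s in stats.values())
total_seconds = sum(s["seconds"] for s in stats.values())

summary = {}
for label, s in stats.items():
    summary[label] = {
        # Average clip length for this class, in seconds.
        "mean_clip_seconds": s["seconds"] / s["files"],
        # Share of all files and of all audio belonging to this class.
        "file_share": s["files"] / total_files,
        "audio_share": s["seconds"] / total_seconds,
    }

for label, row in summary.items():
    print(
        f"{label}: {row['mean_clip_seconds']:.2f} s/clip, "
        f"{row['file_share']:.1%} of files, {row['audio_share']:.1%} of audio"
    )
```

On these numbers the cat clips average about 8 seconds and the dog clips about 5 seconds, so a classifier could in principle pick up on clip length alone; that is worth keeping in mind when evaluating a model.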

A visual description of the WAV files is available here: Visualizing woofs & meows 🐱. In Accessing the Dataset 2 we propose a train/test split that you can use.
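If you prefer to build your own split, here is a minimal sketch of a stratified split that keeps the cat/dog ratio the same in both parts. It is not the split proposed in the notebook mentioned above; the function name, the 20% test fraction, the seed and the placeholder file names are all assumptions made for illustration.

```python
import random


def stratified_split(files_by_class, test_fraction=0.2, seed=0):
    """Split each class separately so both parts keep the class ratio."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label, files in sorted(files_by_class.items()):
        shuffled = sorted(files)  # sort first so the result does not depend on input order
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(name, label) for name in shuffled[:n_test]]
        train += [(name, label) for name in shuffled[n_test:]]
    return train, test


# Placeholder file names; the real names in the dataset may differ.
files_by_class = {
    "cat": [f"cat_{i}.wav" for i in range(164)],
    "dog": [f"dog_{i}.wav" for i in range(113)],
}
train, test = stratified_split(files_by_class)
print(len(train), "training files,", len(test), "test files")
```

Splitting per class matters here because the dataset has noticeably more cat files than dog files.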

All WAV files contain 16 kHz audio and vary in length.
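Because the clips vary in length, most models need them trimmed or padded to a common size first. The sketch below reads a file with Python's standard `wave` module and forces it to a fixed number of samples. It assumes 16-bit mono PCM, which the description above does not confirm, and the helper names `load_wav` and `fix_length` are made up for this example.

```python
import array
import sys
import wave

EXPECTED_RATE = 16_000  # sample rate stated in the dataset description


def load_wav(path):
    """Read a WAV file and return (samples, sample_rate).

    Assumption: 16-bit mono PCM. Other layouts raise ValueError.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = wf.getframerate()
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples, rate


def fix_length(samples, target_len):
    """Trim or zero-pad so the clip has exactly target_len samples."""
    clipped = samples[:target_len]
    padding = array.array("h", [0] * (target_len - len(clipped)))
    return clipped + padding
```

For example, `fix_length(samples, 5 * EXPECTED_RATE)` would give every clip exactly 5 seconds of audio at 16 kHz; the 5-second target is an arbitrary choice, and checking that the returned rate equals `EXPECTED_RATE` is a cheap sanity test before training.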

Acknowledgements

We deserve little credit for this dataset. Most of the work was done by the creators of the AE-Dataset, from which we extracted the two classes, and by the people behind FreeSound, from which the AE-Dataset itself was extracted.

PS: the AE-Dataset has a policy saying you can credit its authors as follows: Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, “Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition”, Proc. Interspeech 2016, San Francisco.

Inspiration

You can use this dataset to try your hand at raw audio classification 😉
A more challenging dataset is available here