Site icon Zdataset

Pokémon for Data Mining and Machine Learning

(Almost) all Pokémon stats until generation 6: 21 variables per Pokémon

LicenseCC BY-NC-SA 4.0

Tagsarts and entertainmentgamesvideo gamesanime and mangacomics and animationand 1 more

Context

With the rise of the popularity of machine learning, this is a good opportunity to share a wide database of the even more popular video-game Pokémon by Nintendo, Game freak, and Creatures, originally released in 1996.

Pokémon started as a Role Playing Game (RPG), but due to its increasing popularity, its owners ended up producing many TV series, manga comics, and so on, as well as other types of video-games (like the famous Pokémon Go!).

This dataset is focused on the stats and features of the Pokémon in the RPGs. Until now (08/01/2017) seven generations of Pokémon have been published. All in all, this dataset does not include the data corresponding to the last generation, since 1) I created the databased when the seventh generation was not released yet, and 2) this database is a modification+extension of the database “721 Pokemon with stats” by Alberto Barradas (https://www.kaggle.com/abcsds/pokemon), which does not include (of course) the latest generation either.

Content

This database includes 21 variables per each of the 721 Pokémon of the first six generations, plus the Pokémon ID and its name. These variables are briefly described next:

Notes

Please note that many Pokémon are multi-form, and also some of them can Mega-evolve. I wanted to keep the structure of the dataset as simple and general as possible, as well as the Number variable (the ID of the Pokémon) unique. Hence, in the cases of the multi-form Pokémon, or the ones capable of Mega-evolve, I just chose one of the forms, the one I (and my brother) considered the standard and/or the most common. The specific choice for each of this Pokémon are shown below:

Acknowledgements

As said at the beginning, this database was based on the Kaggle database “721 Pokemon with stats” by Alberto Barradas (https://www.kaggle.com/abcsds/pokemon). The other resources I mainly used are listed below:

Possible future work

This dataset can be used with different objectives, such as, Pokémon clustering, trying to find relations or dependencies between the variables, and also for supervised classification purposes, where the class could be the Primary Type, but also many of the other variables.

Author

Asier López Zorrilla

Exit mobile version