This dataset contains sprites, animated sprites, Bulbapedia artwork, and tabular data for all of the Pokémon listed on Bulbapedia. Built on Bulbapedia and the PokéAPI, it aggregates the resources required to train a generative diffusion model for Pokémon sprite animations.
Note that all of the sprites are owned by Nintendo, so use at your own risk!
Currently, there are three datasets:
- Animated sprite GIFs for all Generation V Pokémon, in both normal and shiny variants.
- Unwrapped Generation V sprites (i.e. each frame of a GIF saved as an individual image).
- A "full art" dataset, with the header images from Bulbapedia entries.
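To illustrate what "unwrapped" means for the second dataset, here is a minimal sketch, assuming Pillow is installed, of how an animated GIF could be split into one image per frame. The function name `unwrap_gif` and the output file naming are hypothetical and are not taken from this repository's code.

```python
from pathlib import Path

from PIL import Image, ImageSequence


def unwrap_gif(gif_path, out_dir):
    """Save every frame of an animated GIF as its own PNG; return the paths.

    Hypothetical helper for illustration only, not part of this package.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(gif_path).stem

    paths = []
    with Image.open(gif_path) as gif:
        # ImageSequence.Iterator yields each frame of the animation in order.
        for index, frame in enumerate(ImageSequence.Iterator(gif)):
            out_path = out_dir / f"{stem}_{index:03d}.png"
            # Convert to RGBA so palette-based frames keep their transparency.
            frame.convert("RGBA").save(out_path)
            paths.append(out_path)
    return paths
```

Calling `unwrap_gif("some_sprite.gif", "frames/")` would write `some_sprite_000.png`, `some_sprite_001.png`, and so on into `frames/`.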
All three datasets have conditional and unconditional features available, which are scraped and stored in CSV format. Look at the call signatures in the dataset code to understand available options.
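As a rough sketch of how CSV-stored features can serve both conditional and unconditional use, the snippet below reads a CSV with the standard library and returns either a dictionary of selected columns or `None`. The helper names, the `name` key column, and the column names passed in are assumptions for illustration; check the dataset code for the actual files and options.

```python
import csv


def load_features(csv_path, key_column="name"):
    """Read a feature CSV into a dict keyed by `key_column`.

    `key_column` is an assumed column name, not one confirmed by the dataset.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[key_column]: row for row in csv.DictReader(handle)}


def condition_for(features, key, columns=None):
    """Return the selected feature columns for one entry.

    Passing columns=None models the unconditional case and returns None.
    """
    if columns is None:
        return None
    row = features[key]
    return {column: row[column] for column in columns}
```

With this shape, a training loop can switch between conditional and unconditional sampling by changing only the `columns` argument.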
Below are examples from each of the three datasets:
Clone the repository and then run
conda env create -f environment.yml
conda activate poke_sprite_dataset
to set up the conda environment, and run
python setup.py install
to install as a module.
Next run
python build_dataset.py
to download the data and create a data directory.
From here, look in example/ for Jupyter notebooks demonstrating usage of the API.
Enjoy!