Official PyTorch implementation for our paper "HyenaPixel: Global Image Context with Convolutions".

Visualization of HyenaPixel

Setup

Create a conda environment and install the requirements.

conda create -n hyenapixel python=3.10
conda activate hyenapixel
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -e .
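
After installation, a quick sanity check can confirm that the environment sees PyTorch and a CUDA device. The snippet below is a minimal sketch and not part of the repository; the helper name `check_environment` is hypothetical.

```python
import importlib.util


def check_environment():
    """Report whether torch/torchvision are installed and whether CUDA is visible.

    Hypothetical helper for illustration; it is not shipped with this repository.
    """
    report = {}
    for name in ("torch", "torchvision"):
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None

    report["cuda"] = False
    if report["torch"]:
        import torch

        report["torch_version"] = torch.__version__
        report["cuda"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda` is reported as `False` on a GPU machine, the installed wheel may not match the local driver; the cu118 index URL above targets CUDA 11.8.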

Dataset

Prepare ImageNet-1k with this script.

Models

| Model | Resolution | Params | Top-1 Acc (%) | Download |
| --- | --- | --- | --- | --- |
| hpx_former_s18 | 224 | 29M | 83.2 | HuggingFace |
| hpx_former_s18_384 | 384 | 29M | 84.7 | HuggingFace |
| hb_former_s18 | 224 | 28M | 83.5 | HuggingFace |
| c_hpx_former_s18 | 224 | 28M | 83.0 | HuggingFace |
| hpx_a_former_s18 | 224 | 28M | 83.6 | HuggingFace |
| hb_a_former_s18 | 224 | 27M | 83.2 | HuggingFace |
| hpx_former_b36 | 224 | 111M | 84.9 | HuggingFace |
| hb_former_b36 | 224 | 102M | 85.2 | HuggingFace |

Usage

Training

We trained our models on 8 NVIDIA A100 GPUs using the SLURM scripts located in ./scripts/. Adjust the SLURM parameters NUM_GPU and GRAD_ACCUM_STEPS to match your system.
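
When training on fewer GPUs, one common approach is to raise GRAD_ACCUM_STEPS so that the effective batch size stays the same. The sketch below assumes the per-GPU batch size is held fixed, so the effective batch size scales with NUM_GPU × GRAD_ACCUM_STEPS; whether the scripts in ./scripts/ follow this convention is an assumption. The helper name and the reference accumulation value in the usage line are placeholders, not values taken from the scripts.

```python
def grad_accum_steps(reference_gpus: int, reference_accum: int, num_gpu: int) -> int:
    """Return the GRAD_ACCUM_STEPS that preserves the effective batch size.

    Assumes: effective_batch = per_gpu_batch * num_gpu * grad_accum_steps,
    with per_gpu_batch unchanged. Hypothetical helper for illustration only.
    """
    if min(reference_gpus, reference_accum, num_gpu) < 1:
        raise ValueError("all arguments must be positive integers")

    total = reference_gpus * reference_accum
    if total % num_gpu != 0:
        # No integer step count reproduces the reference batch size exactly.
        raise ValueError(
            f"{num_gpu} GPUs cannot evenly reproduce {total} GPU-steps per update"
        )
    return total // num_gpu


if __name__ == "__main__":
    # Placeholder reference: 8 GPUs with 4 accumulation steps (not from the scripts).
    for gpus in (8, 4, 2, 1):
        print(f"NUM_GPU={gpus} GRAD_ACCUM_STEPS={grad_accum_steps(8, 4, gpus)}")
```

Read the reference values from the script you intend to run before applying this calculation.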

For object detection and segmentation, see the detection and segmentation folders.

Validation

Run the following command to validate hpx_former_s18. Replace data/imagenet with the path to your ImageNet-1k copy and hpx_former_s18 with the model you intend to validate.

python validate.py data/imagenet --model hpx_former_s18
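
To validate several models in a row, the command above can be generated for each model name in the table. This is a minimal sketch: the helper `validation_command` is hypothetical, and it passes only the two arguments shown above. Any further flags a model might need (for example for the 384-pixel variant) are not covered here.

```python
import shlex

# Model names as listed in the Models table.
MODELS = [
    "hpx_former_s18",
    "hpx_former_s18_384",
    "hb_former_s18",
    "c_hpx_former_s18",
    "hpx_a_former_s18",
    "hb_a_former_s18",
    "hpx_former_b36",
    "hb_former_b36",
]


def validation_command(model: str, data_dir: str = "data/imagenet") -> list[str]:
    """Build the argument list for the validation command shown above.

    Hypothetical helper; it only mirrors the documented invocation.
    """
    return ["python", "validate.py", data_dir, "--model", model]


if __name__ == "__main__":
    for model in MODELS:
        # Print the commands; pass the list to subprocess.run(..., check=True)
        # from the repository root to execute them instead.
        print(shlex.join(validation_command(model)))
```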

Acknowledgments

Our implementation is based on HazyResearch/safari, rwightman/pytorch-image-models and sail-sg/metaformer. This research has been funded by the Federal Ministry of Education and Research of Germany under grant no. 01IS22094C WEST-AI.

Bibtex

@article{spravil2024hyenapixel,
  title={HyenaPixel: Global Image Context with Convolutions},
  author={Julian Spravil and Sebastian Houben and Sven Behnke},
  journal={arXiv preprint arXiv:2402.19305},
  year={2024},
}