Vehicle View Synthesis by Generative Adversarial Network, International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2023 (Accepted).
We add a GAN to the AICITY2021_Track2_DMT ReID model; our model transforms vehicle poses with a generative adversarial network for ReID. AICITY2021_Track2_DMT is the 1st-place solution of Track 2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at the CVPR 2021 Workshop.
ReID model backbone: AICITY2021_Track2_DMT. Detailed information about the NVIDIA AI City Challenge 2021 can be found here.
The code is modified from AICITY2020_DMT_VehicleReID, TransReID, reid_strong_baseline, and AICITY2021_Track2_DMT.
2023/05/05 - We released detailed information on Vehicle View Synthesis by Generative Adversarial Network; see the paper and poster.
2023/03/01 - Our lab CCUMVL released the CCUMVL-Vehicle-ReID DATASET (please click here to visit the new webpage).
2023/02/17 - PTGAN: Vehicle View Synthesis by Generative Adversarial Network was accepted by ICASSP 2023.
- cd to the folder where you want to download this repo.
- Run git clone.
- Install dependencies:
pip install -r requirements.txt
We use CUDA 11.0 / Python 3.7 / torch 1.6.0 / torchvision 0.7.0 for training and testing.
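To confirm that your environment matches the versions above before training, a minimal sanity check could look like this (the expected values in the comments are simply the versions listed above):

```python
# Sanity-check the installed versions against the tested environment
# (CUDA 11.0 / Python 3.7 / torch 1.6.0 / torchvision 0.7.0).
import sys

import torch
import torchvision

print(f"python      : {sys.version.split()[0]}")   # expect 3.7.x
print(f"torch       : {torch.__version__}")        # expect 1.6.0
print(f"torchvision : {torchvision.__version__}")  # expect 0.7.0
print(f"CUDA build  : {torch.version.cuda}")       # expect 11.0
print(f"GPU visible : {torch.cuda.is_available()}")
```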
- Prepare datasets: download the Original dataset, Cropped_dataset, SPGAN_dataset, VeRi-776, and veri_pose.
Each folder in veri_pose is named xxx_y_z (e.g., 000_0_3), where:
- xxx: identity
- y: color
- z: type
The numbered sub-folders 1~8 correspond to the vehicle poses.
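For reference, a small helper like the following (hypothetical, not part of this repo) splits such a folder name into its fields and lists the pose sub-folders:

```python
from pathlib import Path

def parse_veri_pose_folder(name: str):
    """Split a veri_pose folder name 'xxx_y_z' (e.g. '000_0_3')
    into (identity, color, type)."""
    identity, color, vtype = name.split("_")
    return identity, int(color), int(vtype)

# Hypothetical path; adjust to your dataset location.
folder = Path("veri_pose/train/000_0_3")
identity, color, vtype = parse_veri_pose_folder(folder.name)
pose_dirs = sorted(p.name for p in folder.iterdir() if p.is_dir())
print(identity, color, vtype, pose_dirs)
```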
├── AIC21/
│   ├── AIC21_Track2_ReID/
│   │   ├── image_train/
│   │   ├── image_test/
│   │   ├── image_query/
│   │   ├── train_label.xml
│   │   ├── ...
│   │   ├── training_part_seg/
│   │   │   ├── cropped_patch/
│   │   ├── cropped_aic_test
│   │   │   ├── image_test/
│   │   │   ├── image_query/
│   ├── AIC21_Track2_ReID_Simulation/
│   │   ├── sys_image_train/
│   │   ├── sys_image_train_tr/
│   ├── veri_pose/
│   │   ├── train/
│   │   │   ├── 000_0_3/
│   │   │   │   ├── 0/
│   │   │   │   ├── 3/
│   │   │   │   ├── 4/
│   │   │   │   ├── ...
│   │   │   ├── ...
│   │   ├── query/
│   │   │   ├── 0002_c002_00030600_0.jpg
│   │   │   ├── ...
│   │   ├── test/
│   │   │   ├── 000_0_3/
│   │   │   │   ├── 1/
│   │   │   │   ├── 3/
│   │   │   │   ├── 4/
│   │   │   │   ├── ...
│   │   │   ├── ...
│   │   ├── train_label.xml
│   │   ├── query_label.xml
│   │   ├── test_label.xml
│   │   ├── ...
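To catch path mistakes early, a quick illustrative check that the expected layout is in place might be:

```python
from pathlib import Path

ROOT = Path("AIC21")  # adjust to where you placed the data
expected = [
    "AIC21_Track2_ReID/image_train",
    "AIC21_Track2_ReID/image_test",
    "AIC21_Track2_ReID/image_query",
    "AIC21_Track2_ReID/train_label.xml",
    "AIC21_Track2_ReID_Simulation/sys_image_train",
    "veri_pose/train",
    "veri_pose/query",
    "veri_pose/test",
]
missing = [p for p in expected if not (ROOT / p).exists()]
print("dataset layout OK" if not missing else f"missing: {missing}")
```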
- Put pre-trained models into ./pretrained/
We use one GTX 1080 Ti GPU (11 GB) for training.
- cd to the folder where you want to download this repo.
- Run git clone; we use this code to train the GAN models.
- Prepare datasets: download the Original dataset, VeRi-776, and veri_pose.
├── veri_pose/
│   ├── train/
│   │   ├── 000_0_3/
│   │   │   ├── 0/
│   │   │   ├── 3/
│   │   │   ├── 4/
│   │   │   ├── ...
│   │   ├── ...
│   ├── query/
│   │   ├── 0002_c002_00030600_0.jpg
│   │   ├── ...
│   ├── test/
│   │   ├── 000_0_3/
│   │   │   ├── 1/
│   │   │   ├── 3/
│   │   │   ├── 4/
│   │   │   ├── ...
│   │   ├── ...
│   ├── train_label.xml
│   ├── query_label.xml
│   ├── test_label.xml
│   ├── list_color.txt
│   ├── list_type.txt
│   ├── ...
- Start training. Open Visdom to monitor the training process:
python -m visdom.server -port [xxxx]
Train the orthogonal encoder:
python orthogonal_encoder.py
Train the GAN model:
python train.py --stage 1 --dataset train --dataroot [path]
python train.py --stage 2 --dataset train --dataroot [path]
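If you prefer to launch both stages unattended, a thin wrapper script along these lines can chain them (a sketch; the data root is a placeholder, and it assumes stage 2 builds on stage 1's output):

```python
# Run the two GAN training stages back to back.
import subprocess

DATAROOT = "/path/to/veri_pose"  # placeholder; point this at your data

for stage in (1, 2):
    subprocess.run(
        ["python", "train.py",
         "--stage", str(stage),
         "--dataset", "train",
         "--dataroot", DATAROOT],
        check=True,  # abort immediately if a stage fails
    )
```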
- Put the trained models into:
  - ./code_GAN/gan/weights/
    - model_130.pth
  - ./code_GAN/weights/GAN_stage_2/
    - 17_net_Di.pth
    - 17_net_Dp.pth
    - 17_net_E.pth
    - 17_net_G.pth
├── PTGAN/
│   ├── code_GAN/
│   │   ├── gan/
│   │   │   ├── weights/
│   │   │   │   ├── model_130.pth
│   │   ├── weights/
│   │   │   ├── GAN_stage_2/
│   │   │   │   ├── 17_net_Di.pth
│   │   │   │   ├── 17_net_Dp.pth
│   │   │   │   ├── 17_net_E.pth
│   │   │   │   ├── 17_net_G.pth
│   │   ├── ...
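After placing the weights, a quick integrity check is to load each checkpoint with torch.load (a sketch; this only verifies that the files deserialize, not that they match the model definitions):

```python
import torch

CKPTS = [
    "code_GAN/gan/weights/model_130.pth",
    "code_GAN/weights/GAN_stage_2/17_net_E.pth",
    "code_GAN/weights/GAN_stage_2/17_net_G.pth",
    "code_GAN/weights/GAN_stage_2/17_net_Di.pth",
    "code_GAN/weights/GAN_stage_2/17_net_Dp.pth",
]

for path in CKPTS:
    state = torch.load(path, map_location="cpu")  # load on CPU, no GPU needed
    size = len(state) if hasattr(state, "keys") else "?"
    print(f"{path}: loaded ({size} entries)")
```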
We use eight GTX 1080 Ti GPUs (11 GB) for training. You can train a single backbone as follows.
# ResNext101-IBN-a
python train.py --config_file configs/stage1/resnext101a_384.yml MODEL.DEVICE_ID "('0')"
python train_stage2_v1.py --config_file configs/stage2/resnext101a_384.yml MODEL.DEVICE_ID "('0')" OUTPUT_DIR './logs/stage2/resnext101a_384/v1'
python train_stage2_v2.py --config_file configs/stage2/resnext101a_384.yml MODEL.DEVICE_ID "('0')" OUTPUT_DIR './logs/stage2/resnext101a_384/v2'
You should train the camera and viewpoint models before the inference stage. You can also directly use our trained results (track_cam_rk.npy and track_view_rk.npy):
python train_cam.py --config_file configs/camera_view/camera_101a.yml
python train_view.py --config_file configs/camera_view/view_101a.yml
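How exactly track_cam_rk.npy and track_view_rk.npy enter the final ranking is defined by the inference code; purely as an illustration of the general idea (an assumption, not the verified formula used in this repo), camera/viewpoint scores can be used to bias an appearance distance matrix before sorting:

```python
import numpy as np

# Illustrative only: assume each .npy file holds a (num_query, num_gallery)
# score matrix and that the weights below are tunable hyperparameters.
dist = np.load("dist_mat.npy")        # hypothetical appearance distance matrix
cam = np.load("track_cam_rk.npy")
view = np.load("track_view_rk.npy")

LAMBDA_CAM, LAMBDA_VIEW = 0.1, 0.05   # made-up weights for the sketch
final_dist = dist + LAMBDA_CAM * cam + LAMBDA_VIEW * view
ranking = np.argsort(final_dist, axis=1)  # gallery indices, best match first
```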
You can train all eight backbones by following run.sh. Then, you can ensemble all the results:
python ensemble.py
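Conceptually, ensembling here amounts to combining the per-backbone results before ranking; a minimal sketch (not the exact implementation in ensemble.py) that averages hypothetical per-backbone distance matrices:

```python
import numpy as np

# Hypothetical per-backbone distance matrices, each (num_query, num_gallery).
paths = [
    "logs/stage2/resnext101a_384/v1/dist_mat.npy",
    "logs/stage2/resnext101a_384/v2/dist_mat.npy",
    # ... one entry per trained backbone
]
dist = np.mean([np.load(p) for p in paths], axis=0)  # average the distances
indices = np.argsort(dist, axis=1)  # final per-query ranking
```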
All trained ReID models can be downloaded from here.
After training all of the above models, you can test a single backbone as follows.
python PTGAN.py --config_file configs/stage2/resnext101a_384_veri_gan.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT './logs/stage2/resnext101a_384/v1/resnext101_ibn_a_2.pth' OUTPUT_DIR './logs/stage2/resnext101a_384/veri_gan_v1'
python PTGAN.py --config_file configs/stage2/resnext101a_384_veri_gan.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT './logs/stage2/resnext101a_384/v2/resnext101_ibn_a_2.pth' OUTPUT_DIR './logs/stage2/resnext101a_384/veri_gan_v2'
These are the ReID test results. We train the models on the AICITY2021 dataset and test on VeRi-776 with the stage2/v1 ReID model. (Other results will be added in the future.)
Baseline
Backbones | mAP | R-1 | R-5 | R-10 |
---|---|---|---|---|
ResNet101-IBN-a | 48.8 | 86.1 | 86.4 | 91.5 |
ResNet101-IBN-a+ | 48.7 | 87.0 | 87.3 | 91.8 |
ResNet101-IBN-a++ | 48.6 | 87.1 | 87.4 | 92.3 |
ResNext101-IBN-a | 48.1 | 86.1 | 86.4 | 91.1 |
ResNest101 | 48.2 | 85.9 | 86.3 | 91.0 |
SeResNet101-IBN | 46.5 | 85.0 | 85.2 | 90.2 |
DenseNet169-IBN | 46.5 | 84.8 | 85.2 | 90.2 |
TransReID | 48.9 | 87.0 | 87.4 | 91.3 |
Our Model
Transform to gallery pose
Backbones | mAP | R-1 | R-5 | R-10 |
---|---|---|---|---|
ResNet101-IBN-a | 47.4 | 85.7 | 88.9 | 93.1 |
ResNet101-IBN-a+ | 47.2 | 86.2 | 90.0 | 93.7 |
ResNet101-IBN-a++ | 47.5 | 86.7 | 90.6 | 94.0 |
ResNext101-IBN-a | 46.9 | 85.5 | 89.3 | 92.6 |
ResNest101 | 46.3 | 84.9 | 90.2 | 94.3 |
SeResNet101-IBN | 47.0 | 84.9 | 90.2 | 91.7 |
DenseNet169-IBN | 45.1 | 84.2 | 87.9 | 92.1 |
TransReID | 47.0 | 85.6 | 91.1 | 94.0 |
Transform all photos
Backbones | mAP | R-1 | R-5 | R-10 |
---|---|---|---|---|
ResNet101-IBN-a | 47.1 | 86.1 | 88.9 | 92.3 |
ResNet101-IBN-a+ | 46.9 | 86.2 | 89.7 | 92.8 |
ResNet101-IBN-a++ | 47.1 | 86.5 | 89.6 | 93.6 |
ResNext101-IBN-a | 46.5 | 85.8 | 88.9 | 92.3 |
ResNest101 | 46.7 | 85.1 | 88.3 | 92.3 |
SeResNet101-IBN | 46.6 | 85.0 | 87.6 | 91.7 |
DenseNet169-IBN | 44.8 | 84.6 | 87.8 | 92.1 |
TransReID | 50.3 | 87.7 | 90.9 | 93.6 |
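For reference, the mAP and Rank-k numbers above follow the standard ReID evaluation protocol. A compact single-query version of that computation (a simplified sketch, not the evaluation code shipped with this repo) looks like:

```python
import numpy as np

def eval_query(dist_row, q_pid, q_cam, g_pids, g_cams, topk=(1, 5, 10)):
    """AP and Rank-k for one query: rank the gallery by distance, drop
    same-identity/same-camera entries, then score the binary match list."""
    order = np.argsort(dist_row)
    keep = ~((g_pids[order] == q_pid) & (g_cams[order] == q_cam))
    matches = (g_pids[order][keep] == q_pid).astype(np.float32)

    cmc = {k: float(matches[:k].any()) for k in topk}  # Rank-k hit or miss
    hits = np.cumsum(matches)
    precision = hits / (np.arange(len(matches)) + 1)
    ap = (precision * matches).sum() / max(matches.sum(), 1.0)
    return ap, cmc  # mAP is the mean of ap over all queries
```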
@inproceedings{luo2021empirical,
title={An Empirical Study of Vehicle Re-Identification on the AI City Challenge},
author={Luo, Hao and Chen, Weihua and Xu, Xianzhe and Gu, Jianyang and Zhang, Yuqi and Liu, Chong and Jiang, Qiyi and He, Shuting and Wang, Fan and Li, Hao},
booktitle={Proc. CVPR Workshops},
year={2021}
}
If you find our work valuable for your research, please consider citing the following reference:
@INPROCEEDINGS{10096633,
author={Hu, Chan-Shuo and Tseng, Sung-Wei and Fan, Xin-Yun and Chiang, Chen-Kuo},
booktitle={ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Vehicle View Synthesis by Generative Adversarial Network},
year={2023},
volume={},
number={},
pages={1-5},
doi={10.1109/ICASSP49357.2023.10096633}}