pp-pad

This repository contains the PyTorch implementation of pp-pad (peripheral prediction padding) from the following ICIP 2023 paper:

  • Kensuke Mukai and Takao Yamanaka, "Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding," International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 2023. arXiv: https://arxiv.org/abs/2307.07725

[Figure: outline of pp-pad]

More qualitative comparisons: Results_summary.md

Downloads

In addition to this repository, download the following files, and then place them under the pp-pad folders.

Examples

Change current directory:

cd program

To train the network:

python PSPNet.py --cfg config/cfg_sample.yaml

To evaluate the translation invariance and classification accuracy:

python segment_val.py --cfg config/cfg_sample.yaml

The results are saved in the 'program/outputs' directory. The file paths and other settings are specified in the config file 'config/*.yaml'.

To save the visualized results, set the hyperparameter 'save_patches' in the config file (.yaml) to True and then run segment_val.py:

python segment_val.py --cfg config/cfg_sample.yaml

Sample images can be found in the 'program/samples' directory.

Settings in Config (program/config)

The YAML file can be modified in a text editor.
Sample config file: program/config/cfg_sample.yaml

[padding mode]

  • padding_mode: Select from 'pp-pad', 'zeros', 'reflect', 'replicate', 'circular', 'partial', 'cap_pretrain', or 'cap_train'. Use 'cap_pretrain' and 'cap_train' in this order, since CAP requires pretraining. A conceptual sketch of pp-pad follows this list.
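The sketch below is a minimal, conceptual illustration of the pp-pad idea and not the repository's implementation: padding values are predicted from the peripheral region of the feature map by a small trainable layer (a 2x3 predictor here, mirroring the '2x3' variants in the result tables below). The class name ToyPPPad and all details are hypothetical.

```python
# Conceptual sketch of peripheral prediction padding (pp-pad).
# NOT the repository's implementation: ToyPPPad and all details are
# hypothetical. Idea: instead of constant or copied padding, predict
# each padding row from the nearby peripheral strip with a small
# trainable layer (a 2x3 predictor here).
import torch
import torch.nn as nn

class ToyPPPad(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Maps a 2-row peripheral strip to one predicted padding row.
        self.predict = nn.Conv2d(channels, channels,
                                 kernel_size=(2, 3), padding=(0, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Predict the top padding row from the top two rows; the other
        # three sides would be handled analogously (omitted for brevity).
        top = self.predict(x[:, :, :2, :])   # (N, C, 1, W)
        return torch.cat([top, x], dim=2)    # (N, C, H+1, W)

x = torch.randn(1, 8, 16, 16)
print(ToyPPPad(8)(x).shape)  # torch.Size([1, 8, 17, 16])
```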

[path for PSPNet.py]

  • dataset: Dataset folder
  • pretrain: Initial weights for the network
  • outputs: Output folder

[training parameters]

  • num_epoches: Number of training epochs
  • val_output_interval: Interval for validation [epochs]
  • batch_size
  • batch_multiplier: Weights are updated every batch_multiplier mini-batches, so the effective batch size is batch_size x batch_multiplier (see the gradient-accumulation sketch after this list)
  • optimizer: sgd or adam
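For reference, batch_multiplier corresponds to the standard gradient-accumulation pattern sketched below. The loop is illustrative (the model, data loader, and values are stand-ins), not the repository's training code.

```python
# Gradient accumulation: weights are updated every `batch_multiplier`
# mini-batches, so the effective batch size is
# batch_size * batch_multiplier. All objects below are stand-ins.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                      # stand-in for the network
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = [(torch.randn(8, 10), torch.randint(0, 2, (8,)))
          for _ in range(8)]                  # stand-in for the dataloader

batch_multiplier = 4                          # hypothetical value

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = criterion(model(x), y)
    (loss / batch_multiplier).backward()      # scale so gradients average
    if (step + 1) % batch_multiplier == 0:
        optimizer.step()                      # effective batch: 8 * 4 = 32
        optimizer.zero_grad()
```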

[path for segment_val.py]

  • val_images: Validation file list for evaluation of translation invariance and classification accuracy. Only 100 images were used for evaluation due to the computational cost.
  • weights: Network model for evaluation

[image sizes]

  • input_size: Patch size extracted from the original image in training and evaluation
  • expanded_size: The original image is first resized so that its long side equals expanded_size, and then cropped to the input_size specified above (see the sketch after this list)
  • patch_stride: Stride of the sliding window that creates overlapping patches during evaluation
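A minimal sketch of this resize-then-crop preprocessing, assuming hypothetical sizes (the repository's transforms may differ):

```python
# Resize the long side to expanded_size, then crop an
# input_size x input_size patch. Sizes and image are stand-ins.
import torchvision.transforms.functional as F
from PIL import Image

expanded_size, input_size = 760, 475          # hypothetical values
img = Image.new('RGB', (1000, 800))           # stand-in dataset image

# Resize so the LONG side equals expanded_size, keeping aspect ratio.
scale = expanded_size / max(img.size)
img = img.resize((round(img.width * scale), round(img.height * scale)))

# Crop a patch (top-left here; training/evaluation choose positions,
# e.g. a sliding window with patch_stride during evaluation).
patch = F.crop(img, top=0, left=0, height=input_size, width=input_size)
print(img.size, patch.size)                   # (760, 608) (475, 475)
```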

[dataset info]

  • color_mean: Mean values of images in the dataset
  • color_std: Standard deviations of images in the dataset

[to save patches and visualize inference results]

  • save_patches: True / False
  • sample_images: Image files to save patches and visualize inference results
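Putting the settings above together, a config file might look like the following hypothetical example. The key names come from the descriptions above, but all values are illustrative and the real file may group keys differently; refer to program/config/cfg_sample.yaml for actual values.

```yaml
# Hypothetical layout with the keys described above; all values are
# illustrative, and the real file may group keys differently.
padding_mode: pp-pad          # or zeros / reflect / replicate / ...
dataset: ../dataset/          # illustrative paths
pretrain: ../weights/pretrain.pth
outputs: ./outputs/
num_epoches: 320
val_output_interval: 10
batch_size: 4
batch_multiplier: 4           # effective batch size: 4 x 4 = 16
optimizer: sgd
val_images: ../dataset/val_100.txt
weights: ./outputs/model.pth
input_size: 475
expanded_size: 760
patch_stride: 100
color_mean: [0.485, 0.456, 0.406]
color_std: [0.229, 0.224, 0.225]
save_patches: False
sample_images: [sample1.jpg]
```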

Results

The values in the following results differ from those in the original ICIP 2023 paper [1], especially in meanE, because there were bugs in the initial implementation for calculating meanE. The following are the results obtained with the current code. The network was trained for 320 epochs.

[Simple mean IoU (excluding background), meanE & disR (including background class)]

  • mIoU is the simple (unweighted) mean of per-class IoU. The background class was excluded from the IoU calculation; a sketch of this computation follows the list.
  • meanE and disR were calculated including the background class.
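A sketch of the mIoU computation under this definition (illustrative only; the function name and details are hypothetical, not the repository's evaluation code):

```python
# Simple (unweighted) mean of per-class IoU, skipping the background
# class. Illustrative only; name and details are hypothetical.
import numpy as np

def miou_excluding_background(pred, gt, num_classes, bg=0):
    ious = []
    for c in range(num_classes):
        if c == bg:
            continue                      # background excluded from mIoU
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                     # skip classes absent from both
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

pred = np.random.randint(0, 21, (475, 475))   # stand-in label maps
gt = np.random.randint(0, 21, (475, 475))
print(miou_excluding_background(pred, gt, num_classes=21))
```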

Patch Size: 475x475

| | Methods | mIoU ↑ | meanE_in ↓ | disR_in ↓ |
| --- | --- | --- | --- | --- |
| Conventional | Zero | 0.3233 | 0.4536 | 0.5847 |
| | Reflect | 0.3090 | 0.4826 | 0.6087 |
| | Replicate | 0.3100 | 0.4745 | 0.6038 |
| | Circular | 0.3062 | 0.4923 | 0.6126 |
| Previous | Partial [17] | 0.3184 | 0.4575 | 0.5893 |
| Proposed | PP-Pad (2x3) | 0.3203 | 0.4221 | 0.5443 |
| | PP-Pad (2x3 conv) | 0.3255 | 0.4195 | 0.5423 |

Patch Size: 512x512

| | Methods | mIoU ↑ | meanE_in ↓ | disR_in ↓ |
| --- | --- | --- | --- | --- |
| Previous | CAP [19] | 0.3300 | 0.4440 | 0.5794 |
| Proposed | PP-Pad (2x3) | 0.3301 | 0.4315 | 0.5533 |
| | PP-Pad (2x3 conv) | 0.3307 | 0.4238 | 0.5505 |

[Weighted average IoU, meanE, and disR excluding background class]

  • Weighted-average version of mIoU (mIoU_weighted): each patch was weighted by its number of effective pixels, where 'effective' means that the union of the predicted and ground-truth areas is non-zero for at least one class (excluding the background class) when calculating IoU for each patch. With this weighting, patches filled with background contribute less to mIoU, since the background class was excluded from the mIoU calculation. The formula after this list makes the weighting explicit.
  • meanE and disR were calculated excluding the background class in the annotations (meanE_ex, disR_ex).
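In formula form (notation introduced here, not from the paper), with IoU_p the IoU of patch p and w_p its number of effective pixels:

```math
\mathrm{mIoU_{weighted}} = \frac{\sum_p w_p \, \mathrm{IoU}_p}{\sum_p w_p}
```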

Patch Size: 475x475

| | Methods | mIoU_weighted ↑ | meanE_ex ↓ | disR_ex ↓ |
| --- | --- | --- | --- | --- |
| Conventional | Zero | 0.4102 | 0.5564 | 0.7088 |
| | Reflect | 0.3941 | 0.6016 | 0.7605 |
| | Replicate | 0.3990 | 0.5953 | 0.7565 |
| | Circular | 0.3879 | 0.6205 | 0.7593 |
| Previous | Partial [17] | 0.4062 | 0.5900 | 0.7462 |
| Proposed | PP-Pad (2x3) | 0.4120 | 0.5272 | 0.6820 |
| | PP-Pad (2x3 conv) | 0.4182 | 0.5118 | 0.6745 |

Patch Size: 512x512

| | Methods | mIoU_weighted ↑ | meanE_ex ↓ | disR_ex ↓ |
| --- | --- | --- | --- | --- |
| Previous | CAP [19] | 0.4189 | 0.5419 | 0.7040 |
| Proposed | PP-Pad (2x3) | 0.4258 | 0.5370 | 0.6932 |
| | PP-Pad (2x3 conv) | 0.4295 | 0.5213 | 0.6827 |

References

  1. Kensuke Mukai and Takao Yamanaka, "Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding," ICIP 2023. https://arxiv.org/abs/2307.07725
  2. Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia, "Pyramid Scene Parsing Network," CVPR 2017. https://arxiv.org/abs/1612.01105
  3. Yu-Hui Huang, Marc Proesmans, and Luc Van Gool, "Context-aware Padding for Semantic Segmentation," arXiv 2021. https://arxiv.org/abs/2109.07854
  4. Guilin Liu, Kevin J. Shih, Ting-Chun Wang, Fitsum A. Reda, Karan Sapra, Zhiding Yu, Andrew Tao, and Bryan Catanzaro, "Partial Convolution based Padding," arXiv 2018. https://arxiv.org/abs/1811.11718

Versions

The code was confirmed to work with the following versions.

  • Python 3.7.13
  • PyTorch 1.13.0+cu117
  • NVIDIA Driver 510.108.03
  • CUDA 11.6
