Improving Human-object Interaction with Auxiliary Semantic Information and Enhanced Instance Representation

In this study, we improve the performance of human-object interaction (HOI) detection by building on HOTR, an end-to-end Transformer-based model. Specifically, we propose a simple but effective mechanism for enhancing instance representations, explore semantic information as an additional source of knowledge, and introduce cross-attention to fuse multi-level high-level feature maps within the Transformer architecture. The resulting model achieves a significant improvement over the baseline HOTR model and is competitive with other models.

We propose three modules: Enhanced Instance Pointers, Semantic-guided Mechanism, and Multi-level Cross-attention.
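
As a rough illustration of the general idea behind the third module, the sketch below shows plain scaled dot-product cross-attention, in which queries taken from one feature level attend over keys and values taken from another. It is a generic textbook formulation in pure Python, not the implementation in this repository: it omits the learned projections, multiple heads, and normalization that a real Transformer layer would have, and the names `softmax` and `cross_attention` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> List[Vector]:
    """Fuse `values` into each query using scaled dot-product attention.

    Assumption: queries come from one feature level, while keys and values
    come from another level; no learned weights are applied here.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return fused
```

Each output row is a convex combination of the value vectors, so a query that matches all keys equally simply receives their average.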

1. Environmental Setup

We ran the experiments for all three modules in a Google Colab environment.

!pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
!pip install cython scipy
!pip install pycocotools
!pip install opencv-python
!pip install wandb
!pip install tensorflow
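
After installation, it can be useful to confirm that the pinned versions were actually picked up, since Colab ships with its own preinstalled PyTorch. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper names are hypothetical, and it assumes the installed version string matches the pin exactly, including the `+cu110` suffix.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple

# Version pins copied from the pip commands above.
PINS = ["torch==1.7.1+cu110", "torchvision==0.8.2+cu110", "torchaudio==0.7.2"]


def parse_pin(pin: str) -> Tuple[str, str]:
    """Split 'name==version' into (name, version)."""
    name, _, version = pin.partition("==")
    return name.strip(), version.strip()


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: Iterable[str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package name to (wanted version, installed version)."""
    return {
        name: (wanted, installed_version(name))
        for name, wanted in map(parse_pin, pins)
    }


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINS).items():
        status = "ok" if found == wanted else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

If a package is reported as a mismatch, restarting the Colab runtime after the install step is usually worth trying before anything else.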

2. How to Train/Test

For both training and testing, you can either run on a single GPU or multiple GPUs.

# Train from epoch 1
!python main.py \
		--group_name vcoco \
		--run_name vcoco_single_run_000001 \
		--HOIDet \
		--validate  \
		--share_enc \
		--pretrained_dec \
		--lr 1e-4 \
		--num_hoi_queries 16 \
		--set_cost_idx 10 \
		--hoi_act_loss_coef 10 \
		--hoi_eos_coef 0.1 \
		--temperature 0.05 \
		--no_aux_loss \
		--hoi_aux_loss \
		--dataset_file vcoco \
		--frozen_weights https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth \
		--data_path /v-coco/data  \
		--output_dir  /checkpoints

# Resume training from a saved checkpoint
!python main.py \
		--group_name HOTR_vcoco \
		--run_name vcoco_single_run_000001 \
		--HOIDet \
		--validate \
		--share_enc \
		--pretrained_dec \
		--lr 1e-4 \
		--num_hoi_queries 16 \
		--set_cost_idx 10 \
		--hoi_act_loss_coef 10 \
		--hoi_eos_coef 0.1 \
		--temperature 0.05 \
		--no_aux_loss \
		--hoi_aux_loss \
		--dataset_file vcoco \
		--frozen_weights https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth \
		--resume checkpoints/HOTR_vcoco/vcoco_single_run_000001/checkpoint99.pth \
		--start_epoch 99 \
		--data_path /v-coco/data  \
		--output_dir  checkpoints
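
The two training commands above share most of their flags, so in a notebook it may be convenient to build the argument list programmatically. The sketch below is one way to do that; `build_command` and `resume_flags` are hypothetical helpers, not part of this repository, and the checkpoint layout `<output_dir>/<group_name>/<run_name>/checkpoint<epoch>.pth` is an assumption inferred from the example path in the resume command.

```python
import shlex
from typing import Dict, List, Union

FlagValue = Union[str, int, float, bool, None]

# Flags shared by both training commands (values copied from the README).
COMMON: Dict[str, FlagValue] = {
    "HOIDet": True, "validate": True, "share_enc": True, "pretrained_dec": True,
    "lr": "1e-4", "num_hoi_queries": 16, "set_cost_idx": 10,
    "hoi_act_loss_coef": 10, "hoi_eos_coef": 0.1, "temperature": 0.05,
    "no_aux_loss": True, "hoi_aux_loss": True, "dataset_file": "vcoco",
    "frozen_weights": "https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth",
    "data_path": "/v-coco/data",
}


def build_command(flags: Dict[str, FlagValue], script: str = "main.py") -> List[str]:
    """Turn a flag dictionary into an argument list for `python main.py`."""
    cmd = ["python", script]
    for name, value in flags.items():
        if value is True:
            cmd.append(f"--{name}")  # boolean switch, no value
        elif value is not False and value is not None:
            cmd.extend([f"--{name}", str(value)])
    return cmd


def resume_flags(output_dir: str, group_name: str, run_name: str, epoch: int) -> Dict[str, FlagValue]:
    """Flags for resuming; the path layout is an assumption (see lead-in)."""
    ckpt = f"{output_dir}/{group_name}/{run_name}/checkpoint{epoch}.pth"
    return {"resume": ckpt, "start_epoch": epoch}


if __name__ == "__main__":
    group, run = "HOTR_vcoco", "vcoco_single_run_000001"
    flags = {"group_name": group, "run_name": run, **COMMON,
             **resume_flags("checkpoints", group, run, 99),
             "output_dir": "checkpoints"}
    print(shlex.join(build_command(flags)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` instead of being printed, which avoids shell quoting issues.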

For testing, you can use your own trained weights by passing the checkpoint path (saved under the group name and run name used for training) to the 'resume' argument.

!python main.py \
		--HOIDet \
		--share_enc \
		--pretrained_dec \
		--num_hoi_queries 16 \
		--object_threshold 0 \
		--temperature 0.05 \
		--no_aux_loss \
		--eval \
		--dataset_file vcoco \
		--data_path /v-coco/data \
		--resume /checkpoints/checkpoint99.pth
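
When evaluating your own weights, you may want to pick the most recent checkpoint in a run directory rather than typing the epoch by hand. The following is a minimal sketch, assuming checkpoints are named `checkpoint<epoch>.pth` as in the example paths above; `latest_checkpoint` is a hypothetical helper and not part of this repository.

```python
import re
from pathlib import Path
from typing import Optional, Union

# Matches file names such as "checkpoint99.pth" and captures the epoch number.
_CKPT_NAME = re.compile(r"checkpoint(\d+)\.pth")


def latest_checkpoint(run_dir: Union[str, Path]) -> Optional[Path]:
    """Return the checkpoint with the highest epoch number, or None if absent."""
    best_path: Optional[Path] = None
    best_epoch = -1
    for path in Path(run_dir).glob("checkpoint*.pth"):
        match = _CKPT_NAME.fullmatch(path.name)
        if match is None:
            continue  # skip files without an epoch number in the name
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_epoch, best_path = epoch, path
    return best_path


if __name__ == "__main__":
    found = latest_checkpoint("checkpoints/HOTR_vcoco/vcoco_single_run_000001")
    print(found if found is not None else "no checkpoint found")
```

Epochs are compared numerically, so `checkpoint99.pth` ranks above `checkpoint9.pth`, which a plain alphabetical sort would not guarantee in general.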

To use our provided weights, download the weights provided below, then pass the path of the downloaded file to the 'resume' argument. For example, to test our pre-trained weights on the V-COCO dataset, we place the downloaded weights at checkpoints/vcoco.pth.

3. Results

Here, we provide the results for V-COCO Scenario 1 (60.88 mAP) and Scenario 2 (65.69 mAP). These results are obtained without applying any priors on the scores (see iCAN).

# queries	Scenario 1 (mAP)	Scenario 2 (mAP)	Checkpoint
16	60.88	65.69	download
