GitHub - CodeGoat24/CA-GAN: [ICME2023] Official PyTorch Implementation for CA-GAN.

CA-GAN: Object Placement via Coalescing Attention based Generative Adversarial Network

Yibin Wang, Yuchao Feng, Jie Wu, Honghui Xu, Jianwei Zheng

(†corresponding author)

[Zhejiang University of Technology]

ICME 2023

⏬ Pre-trained Models

We provide models for TERSE [arXiv], PlaceNet [arXiv], GracoNet [arXiv] and our CA-GAN:

Method    FID    Acc.   LPIPS  Model & logs
TERSE     46.88  68.8%  0      baidu disk (code: zkk8)
PlaceNet  37.01  69.2%  0.161  baidu disk (code: rap8)
GracoNet  28.10  82.9%  0.207  baidu disk (code: cayr)
CA-GAN    23.21  86.7%  0.270  baidu disk (code: 90yf)

🔧 Environment Setup

Install Python 3.6 and PyTorch 1.9.1 (requires CUDA >= 10.2):

conda install pytorch==1.9.1 torchvision==0.10.1 torchaudio==0.9.1 cudatoolkit=10.2 -c pytorch

🌓 Data preparation

Download and extract OPA dataset from the official link: google drive. We expect the directory structure to be the following:

<PATH_TO_OPA>
  background/       # background images
  foreground/       # foreground images with masks
  composite/        # composite images with masks
  train_set.csv     # train annotation
  test_set.csv      # test annotation

Then, run the preprocessing script:

python tool/preprocess.py --data_root <PATH_TO_OPA>

This creates the following new files and directories:

<PATH_TO_OPA>
  com_pic_testpos299/          # test set positive composite images (resized to 299)
  train_data.csv               # transformed train annotation
  train_data_pos.csv           # train annotation for positive samples
  test_data.csv                # transformed test annotation
  test_data_pos.csv            # test annotation for positive samples
  test_data_pos_unique.csv     # test annotation for positive samples with different fg/bg pairs 

💻 Training

To train CA-GAN on a single NVIDIA RTX 3090 GPU with batch size 32 for 15 epochs, run:

python main.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME>

To reproduce the baseline models, simply replace main.py with main_terse.py, main_placenet.py, or main_graconet.py for training.

To monitor the losses during training, use TensorBoard:

tensorboard --logdir result/<YOUR_EXPERIMENT_NAME>/tblog --port <YOUR_SPECIFIED_PORT>

🔥 Inference

To predict composite images from a trained CA-GAN model, run:

python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type evaluni --repeat 10

To run inference with the baseline models, simply replace infer.py with infer_terse.py, infer_placenet.py, or infer_graconet.py.

You can also use our provided models directly. For example, to run inference with our best CA-GAN model, please 1) download CA-GAN.zip from the table above, 2) place it under result and uncompress it:

mv path/to/your/downloaded/CA-GAN.zip result/CA-GAN.zip
cd result
unzip CA-GAN.zip
cd ..

and 3) run:

python infer.py --data_root <PATH_TO_OPA> --expid CA-GAN --epoch 15 --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid CA-GAN --epoch 15 --eval_type evaluni --repeat 10

The procedure for inferring our provided baseline models is similar. Remember to use --epoch 11 for TERSE and GracoNet, and --epoch 9 for PlaceNet.
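For convenience, the per-model inference scripts and checkpoint epochs listed above can be collected into a small helper that assembles the command lines. This is an illustrative sketch, not part of the repository: it assumes each provided model archive is unpacked under result/ with an experiment name equal to the method name (as done for CA-GAN above).

```python
# Sketch: build inference command lines for the provided models.
# Script names and checkpoint epochs are taken from this README;
# the expid-equals-method-name convention is an assumption.
INFER_SCRIPTS = {
    "CA-GAN": "infer.py",
    "TERSE": "infer_terse.py",
    "PlaceNet": "infer_placenet.py",
    "GracoNet": "infer_graconet.py",
}
BEST_EPOCH = {"CA-GAN": 15, "TERSE": 11, "PlaceNet": 9, "GracoNet": 11}

def infer_command(model, data_root, eval_type="eval", repeat=None):
    """Return the inference command line for one provided model."""
    cmd = (f"python {INFER_SCRIPTS[model]} --data_root {data_root} "
           f"--expid {model} --epoch {BEST_EPOCH[model]} --eval_type {eval_type}")
    if repeat is not None:
        cmd += f" --repeat {repeat}"
    return cmd

if __name__ == "__main__":
    # Print both evaluation variants for every provided model.
    for model in INFER_SCRIPTS:
        print(infer_command(model, "<PATH_TO_OPA>"))
        print(infer_command(model, "<PATH_TO_OPA>", eval_type="evaluni", repeat=10))
```

Running the script prints the full set of commands, which you can paste into a shell or feed to a job scheduler.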

🌈 Evaluation

To evaluate FID score, run:

sh script/eval_fid.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE> <PATH_TO_OPA/com_pic_testpos299>

To evaluate LPIPS score, run:

sh script/eval_lpips.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE>

To evaluate the accuracy score, please follow the instructions in the GracoNet repository.

🙏 Acknowledgements

Some of the evaluation code in this repo is borrowed and modified from OPA, FID-Pytorch, GracoNet, and Perceptual Similarity. We thank the authors for their great work.

🖊️ BibTeX

If you find CA-GAN useful or relevant to your research, please kindly cite our paper:

@inproceedings{wang2023gan,
  title={Ca-gan: Object placement via coalescing attention based generative adversarial network},
  author={Wang, Yibin and Feng, Yuchao and Wu, Jie and Xu, Honghui and Zheng, Jianwei},
  booktitle={2023 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={2375--2380},
  year={2023},
  organization={IEEE}
}

📧 Contact

If you have any technical comments or questions, please open a new issue or feel free to contact Yibin Wang.
