GitHub - CodeGoat24/CA-GAN: [ICME2023] Official PyTorch Implementation for CA-GAN.

CA-GAN: Object Placement via Coalescing Attention based Generative Adversarial Network

Yibin Wang, Yuchao Feng, Jie Wu, Honghui Xu, Jianwei Zheng

(†corresponding author)

[Zhejiang University of Technology]

ICME 2023

⏬ Pre-trained Models

We provide models for TERSE [arXiv], PlaceNet [arXiv], GracoNet [arXiv] and our CA-GAN:

Method    FID    Acc.   LPIPS  Model & logs
TERSE     46.88  68.8%  0      baidu disk (code: zkk8)
PlaceNet  37.01  69.2%  0.161  baidu disk (code: rap8)
GracoNet  28.10  82.9%  0.207  baidu disk (code: cayr)
CA-GAN    23.21  86.7%  0.270  baidu disk (code: 90yf)

🔧 Environment Setup

Install Python 3.6 and PyTorch 1.9.1 (requires CUDA >= 10.2):

conda install pytorch==1.9.1 torchvision==0.10.1 torchaudio==0.9.1 cudatoolkit=10.2 -c pytorch

🌓 Data preparation

Download and extract OPA dataset from the official link: google drive. We expect the directory structure to be the following:

<PATH_TO_OPA>
  background/       # background images
  foreground/       # foreground images with masks
  composite/        # composite images with masks
  train_set.csv     # train annotation
  test_set.csv      # test annotation

Then, run the preprocessing script:

python tool/preprocess.py --data_root <PATH_TO_OPA>

This creates the following new files and directories:

<PATH_TO_OPA>
  com_pic_testpos299/          # test set positive composite images (resized to 299)
  train_data.csv               # transformed train annotation
  train_data_pos.csv           # train annotation for positive samples
  test_data.csv                # transformed test annotation
  test_data_pos.csv            # test annotation for positive samples
  test_data_pos_unique.csv     # test annotation for positive samples with different fg/bg pairs 

💻 Training

To train CA-GAN on a single NVIDIA RTX 3090 GPU with batch size 32 for 15 epochs, run:

python main.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME>

To reproduce the baseline models, simply replace main.py with main_terse.py, main_placenet.py, or main_graconet.py for training.

To monitor the losses during training, use TensorBoard:

tensorboard --logdir result/<YOUR_EXPERIMENT_NAME>/tblog --port <YOUR_SPECIFIED_PORT>

🔥 Inference

To predict composite images from a trained CA-GAN model, run:

python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type evaluni --repeat 10

To run inference with the baseline models, simply replace infer.py with infer_terse.py, infer_placenet.py, or infer_graconet.py.

You can also use our provided models directly. For example, to run inference with our best CA-GAN model, please 1) download CA-GAN.zip from the table above, 2) place it under result and uncompress it:

mv path/to/your/downloaded/CA-GAN.zip result/CA-GAN.zip
cd result
unzip CA-GAN.zip
cd ..

and 3) run:

python infer.py --data_root <PATH_TO_OPA> --expid CA-GAN --epoch 15 --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid CA-GAN --epoch 15 --eval_type evaluni --repeat 10

The procedure for inferring our provided baseline models is similar. Remember to use --epoch 11 for TERSE and GracoNet, and --epoch 9 for PlaceNet.
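For convenience, the per-model inference scripts and checkpoint epochs listed above can be collected into a small helper that assembles the command lines. This is an illustrative sketch, not part of the repository: it assumes each provided model archive is unpacked under result/ with an experiment name equal to the method name (as done for CA-GAN above).

```python
# Sketch: build inference command lines for the provided models.
# Script names and checkpoint epochs are taken from this README;
# the expid-equals-method-name convention is an assumption.
INFER_SCRIPTS = {
    "CA-GAN": "infer.py",
    "TERSE": "infer_terse.py",
    "PlaceNet": "infer_placenet.py",
    "GracoNet": "infer_graconet.py",
}
BEST_EPOCH = {"CA-GAN": 15, "TERSE": 11, "PlaceNet": 9, "GracoNet": 11}

def infer_command(model, data_root, eval_type="eval", repeat=None):
    """Return the inference command line for one provided model."""
    cmd = (f"python {INFER_SCRIPTS[model]} --data_root {data_root} "
           f"--expid {model} --epoch {BEST_EPOCH[model]} --eval_type {eval_type}")
    if repeat is not None:
        cmd += f" --repeat {repeat}"
    return cmd

if __name__ == "__main__":
    # Print both evaluation variants for every provided model.
    for model in INFER_SCRIPTS:
        print(infer_command(model, "<PATH_TO_OPA>"))
        print(infer_command(model, "<PATH_TO_OPA>", eval_type="evaluni", repeat=10))
```

Running the script prints the full set of commands, which you can paste into a shell or feed to a job scheduler.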

🌈 Evaluation

To evaluate FID score, run:

sh script/eval_fid.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE> <PATH_TO_OPA/com_pic_testpos299>

To evaluate LPIPS score, run:

sh script/eval_lpips.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE>

To evaluate the accuracy score, please follow the instructions in the GracoNet repository.

🙏 Acknowledgements

Some of the evaluation code in this repo is borrowed and modified from OPA, FID-Pytorch, GracoNet, and Perceptual Similarity. We thank the authors for their great work.

🖊️ BibTeX

If you find CA-GAN useful or relevant to your research, please kindly cite our paper:

@inproceedings{wang2023gan,
  title={Ca-gan: Object placement via coalescing attention based generative adversarial network},
  author={Wang, Yibin and Feng, Yuchao and Wu, Jie and Xu, Honghui and Zheng, Jianwei},
  booktitle={2023 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={2375--2380},
  year={2023},
  organization={IEEE}
}

📧 Contact

If you have any technical comments or questions, please open a new issue or feel free to contact Yibin Wang.
