Generative adversarial networks

Generative Adversarial Networks (GAN)
What is real?
Ding Li 2021.4

Hyper-Realistic Face Generator StyleGAN medium

Generative Adversarial Network
The generator and the discriminator learn by competing with each other. In the end, the fakes look real.

Generator: Learning
Add randomness to the generated picture.
Cost → update the generator to fool the discriminator better.
Cost → update the discriminator to judge real/fake better.

Binary Cross Entropy (BCE) Loss Function
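
A minimal sketch of the BCE loss in plain Python, assuming the usual form BCE = -(1/m) Σ [y log(p) + (1 - y) log(1 - p)], where p is the discriminator's predicted probability and y is the label (1 for real, 0 for fake). The name `bce_loss`, the clipping constant `eps`, and the sample predictions are illustrative and do not come from the slides.

```python
import math

def bce_loss(predictions, labels, eps=1e-12):
    """Mean binary cross entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(predictions)

# Hypothetical discriminator outputs (probability that an image is real).
d_real = [0.9, 0.8]  # on real images
d_fake = [0.2, 0.1]  # on generated images

# Discriminator objective: real images labelled 1, fake images labelled 0.
disc_loss = (bce_loss(d_real, [1, 1]) + bce_loss(d_fake, [0, 0])) / 2

# Generator objective: it wants its fakes to be labelled 1.
gen_loss = bce_loss(d_fake, [1, 1])

print(f"discriminator loss: {disc_loss:.4f}")
print(f"generator loss:     {gen_loss:.4f}")
```

With these sample values the discriminator is doing well, so its loss is small while the generator's loss is large; training pushes the two in opposite directions.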

Example: Generate Handwritten Digits
Generator: noise (randn, 64) → block 128 → block 256 → block 512 → block 1024 → linear 784 → Sigmoid → 28x28 image
Generator block: linear → BatchNorm → ReLU
Discriminator: image (784) → block 512 → block 256 → block 128 → linear 1 → True/False?
Discriminator block: linear → Leaky ReLU

Objective for prediction
Discriminator: Real → 1, Fake → 0
Generator: Fake → 1

Activation functions
ReLU: f(z) = max(0, z)
Leaky ReLU: f(z) = max(0.1z, z)
Sigmoid: f(z) = 1 / (1 + e^(−z))

python
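
The diagram's activation functions and layer widths can be written down in a few lines of plain Python. This is only a sketch: the constant and function names are made up for illustration, and the parameter count covers the linear layers only (BatchNorm parameters are left out). A real implementation would normally use a deep learning framework.

```python
import math

def relu(z):
    return max(0.0, z)

def leaky_relu(z, slope=0.1):
    # Slope 0.1 follows the slide's f(z) = max(0.1z, z); valid for 0 < slope < 1.
    return max(slope * z, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Layer widths read off the diagram: noise 64 -> ... -> 784 pixels (28x28).
GENERATOR_DIMS = [64, 128, 256, 512, 1024, 784]
# Flattened image 784 -> ... -> 1 real/fake score.
DISCRIMINATOR_DIMS = [784, 512, 256, 128, 1]

def linear_param_count(dims):
    """Weights plus biases of the linear layers only (BatchNorm excluded)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(dims, dims[1:]))

print("generator linear parameters:    ", linear_param_count(GENERATOR_DIMS))
print("discriminator linear parameters:", linear_param_count(DISCRIMINATOR_DIMS))
print("output pixels:", GENERATOR_DIMS[-1], "=", 28 * 28)
```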

Covariate Shift & Batch Normalization
Figure panels: Covariate Shift, Normalization
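
A small sketch of what batch normalization does to a single feature, assuming the standard training-time formula (subtract the batch mean, divide by the batch standard deviation, then scale by `gamma` and shift by `beta`). The function name and the two sample batches are hypothetical; they only illustrate that inputs with shifted distributions end up on the same scale.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a batch, then apply a scale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)  # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Two hypothetical batches of the same feature whose distribution has shifted.
batch_a = [1.0, 2.0, 3.0, 4.0]
batch_b = [101.0, 102.0, 103.0, 104.0]

norm_a = batch_norm(batch_a)
norm_b = batch_norm(batch_b)
print(norm_a)
print(norm_b)
```

After normalization both batches have mean 0 and roughly unit variance, so the next layer sees the same input distribution despite the shift.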

Conditional Generation & Interpolation
Class vector for digits GAN
Digit:   0   1       2   3   4   5       6   7   8   9
         0   1       0   0   0   0       0   0   0   0
         0   0.875   0   0   0   0.125   0   0   0   0
         0   0.75    0   0   0   0.25    0   0   0   0
         ……
         0   0.125   0   0   0   0.875   0   0   0   0
         0   0       0   0   0   1       0   0   0   0
python
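
The class vectors above move from a one-hot "1" to a one-hot "5" in steps of 0.125. A minimal sketch of that interpolation follows. The helper names are illustrative, and how the class vector is fed to the generator (for example, concatenated with the noise vector) is an assumption that the slides do not spell out.

```python
def one_hot(digit, num_classes=10):
    """Class vector with 1.0 at the digit's position and 0.0 elsewhere."""
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]

def interpolate(start, end, steps):
    """Return steps + 1 vectors moving linearly from start to end."""
    vectors = []
    for i in range(steps + 1):
        t = i / steps
        vectors.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    return vectors

# Eight steps of 0.125 each, from digit 1 to digit 5.
for row in interpolate(one_hot(1), one_hot(5), steps=8):
    print(row)
```

Each intermediate vector still sums to 1, so the generator is asked for an image that is partly a "1" and partly a "5".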

Controllable Generation & Z-Space Disentanglement
python
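
One common way to picture controllable generation is to shift the noise vector z along a direction associated with a single feature: z' = z + α·d. The sketch below assumes such a direction is already known (the slides do not say how it is found), and all names and values are hypothetical. In a disentangled Z-space, moving along the direction changes only the targeted feature.

```python
def move_along_direction(z, direction, alpha):
    """Shift latent vector z by alpha units along the normalized direction."""
    norm = sum(d * d for d in direction) ** 0.5
    return [zi + alpha * di / norm for zi, di in zip(z, direction)]

z = [0.5, -1.0, 0.25, 2.0]        # hypothetical 4-dimensional noise vector
direction = [0.0, 3.0, 0.0, 4.0]  # hypothetical direction tied to one feature

for alpha in (-1.0, 0.0, 1.0):
    print(alpha, move_along_direction(z, direction, alpha))
```

Components of z where the direction is zero are left untouched, which is the behaviour a disentangled latent space is meant to provide.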

Evaluation
Pre-train an image model, then use it to calculate image similarity.

Fréchet Inception Distance (FID) and Inception Score
python
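
A simplified sketch of both metrics in plain Python. The Fréchet distance between two Gaussians is ||μ1 − μ2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^(1/2)); the version below assumes diagonal covariances so that no matrix square root is needed, whereas real FID code uses full covariance matrices of Inception-v3 features. The Inception Score is exp of the mean KL divergence between each image's class prediction p(y|x) and the marginal p(y). Function names and sample numbers are illustrative.

```python
import math

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Squared Frechet distance between Gaussians with diagonal covariances."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2))
    return mean_term + cov_term

def inception_score(class_probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    kl_sum = 0.0
    for p in class_probs:
        kl_sum += sum(pk * math.log(pk / mk) for pk, mk in zip(p, marginal) if pk > 0)
    return math.exp(kl_sum / n)

# Hypothetical feature statistics of real and generated images.
fid = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [4.0, 1.0])
print("Frechet distance:", fid)

# Confident and diverse predictions score higher than uniform ones.
print("IS (confident, diverse):", inception_score([[1.0, 0.0], [0.0, 1.0]]))
print("IS (uniform):           ", inception_score([[0.5, 0.5], [0.5, 0.5]]))
```

Lower FID means the generated feature distribution is closer to the real one; a higher Inception Score means predictions are both confident per image and varied across images.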

PULSE & Machine Bias
PULSE turns Obama white

StyleGAN
Noise Mapping
Style GAN, Style GAN 2, python, YouTube

Data Augmentation & Image Synthesis
Training with fake data (paper)
GauGAN: Semantic Image Synthesis (paper, blog, YouTube)
StackGAN: Text to Photo-realistic Image Synthesis (paper)

Talking Head Models
Realistic Neural Talking Head Models (paper, YouTube)

Synthesis of Diagnostic Quality Cancer Pathology Images (paper)
Sample images (size=1024×1024 pixels) from GANs trained on TCGA Image Dataset (in order: PTC, HCC, LGG, RCC, SCC).
Sample images from GANs trained on OVCARE Dataset (in order: CCC, ENC, HGSC, LGSC, MUC)
Training of computer-aided diagnostic systems can benefit from synthetic images where labeled datasets are limited (e.g., rare cancers).

Image to Image Translation
pix2pixHD (paper GitHub YouTube)
pix2pix (paper GitHub)

Pix2pix Architecture: U-Net + PatchGAN
python
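
A PatchGAN discriminator outputs a grid of real/fake scores, each judging one patch of the input rather than the whole image. The size of that patch is the receptive field of one output unit, which can be computed from the convolution kernels and strides. The layer stack below is an assumption taken from the commonly used 70×70 PatchGAN configuration, not from the slides.

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of (kernel, stride) convolutions."""
    rf = 1
    # Walk from the output back to the input, growing the field at each layer.
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf

# Assumed stack: three stride-2 and two stride-1 convolutions, all with 4x4 kernels.
PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print("patch size:", receptive_field(PATCHGAN_LAYERS))
```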

Cycle GAN: Mapping Between Two Unpaired Piles (paper GitHub YouTube)
Cycle Consistency Loss
python
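
Cycle consistency asks that translating an image to the other domain and back returns the original: the loss adds the L1 error of X → Y → X and of Y → X → Y. The sketch below uses toy functions on short lists in place of the two generators; all names and values are hypothetical stand-ins for illustration.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g_xy, g_yx):
    """L1 error after translating X -> Y -> X plus Y -> X -> Y."""
    forward = l1_distance(g_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(g_yx(y)), y)
    return forward + backward

# Toy stand-ins for the two generators.
def to_y(img):
    return [2.0 * p for p in img]

def to_x_good(img):
    return [0.5 * p for p in img]  # exact inverse of to_y

def to_x_bad(img):
    return [0.4 * p for p in img]  # imperfect inverse

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print("consistent pair:  ", cycle_consistency_loss(x, y, to_y, to_x_good))
print("inconsistent pair:", cycle_consistency_loss(x, y, to_y, to_x_bad))
```

The loss is zero only when the two mappings undo each other, which is what lets CycleGAN train on unpaired piles of images.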

EditGAN: High-Precision Semantic Image Editing (paper GitHub YouTube)
(1) EditGAN builds on a GAN framework that jointly models images and their semantic segmentations.
(2 & 3) Users can modify segmentation masks, based on which we perform optimization in the GAN’s latent space to realize the edit.
(4) Users can perform editing simply by applying previously learnt editing vectors and manipulate images at interactive rates.
DatasetGAN for semantic segmentations
Raising eyebrows editing. Images are shown before and after editing. Segmentation masks are shown before editing and as the target mask after manual modification.

Coursera: Generative Adversarial Networks (GANs) Specialization
Papers and Websites
• Generative Adversarial Networks (Goodfellow et al., 2014): https://arxiv.org/abs/1406.2661
• Hyperspherical Variational Auto-Encoders (Davidson, Falorsi, De Cao, Kipf, and Tomczak, 2018): https://www.researchgate.net/figure/Latent-space-visualization-of-the-10-MNIST-digits-in-2-dimensions-of-both-N-VAE-left_fig2_324182043
• Analyzing and Improving the Image Quality of StyleGAN (Karras et al., 2020): https://arxiv.org/abs/1912.04958
• Semantic Image Synthesis with Spatially-Adaptive Normalization (Park, Liu, Wang, and Zhu, 2019): https://arxiv.org/abs/1903.07291
• Few-shot Adversarial Learning of Realistic Neural Talking Head Models (Zakharov, Shysheya, Burkov, and Lempitsky, 2019): https://arxiv.org/abs/1905.08233
• Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling (Wu, Zhang, Xue, Freeman, and Tenenbaum, 2017): https://arxiv.org/abs/1610.07584
• These Cats Do Not Exist (Glover and Mott, 2019): http://thesecatsdonotexist.com/
• Large Scale GAN Training for High Fidelity Natural Image Synthesis (Brock, Donahue, and Simonyan, 2019): https://arxiv.org/abs/1809.11096
• Deconvolution and Checkerboard Artifacts (Odena et al., 2016): http://doi.org/10.23915/distill.00003
• Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (Radford, Metz, and Chintala, 2016): https://arxiv.org/abs/1511.06434
• Wasserstein GAN (Arjovsky, Chintala, and Bottou, 2017): https://arxiv.org/abs/1701.07875
• Improved Training of Wasserstein GANs (Gulrajani et al., 2017): https://arxiv.org/abs/1704.00028
• Interpreting the Latent Space of GANs for Semantic Face Editing (Shen, Gu, Tang, and Zhou, 2020): https://arxiv.org/abs/1907.10786
• CelebFaces Attributes Dataset (CelebA): http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
• StyleGAN - Official TensorFlow Implementation: https://github.com/NVlabs/stylegan
• Stanford Vision Lab: http://vision.stanford.edu/
• Review: Inception-v3 — 1st Runner Up (Image Classification) in ILSVRC 2015 (Tsang, 2018): https://medium.com/@sh.tsang/review-inception-v3-1st-runner-up-image-classification-in-ilsvrc-2015-17915421f77c
• HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models (Zhou et al., 2019): https://arxiv.org/abs/1904.01121
• Improved Precision and Recall Metric for Assessing Generative Models (Kynkäänniemi, Karras, Laine, Lehtinen, and Aila, 2019): https://arxiv.org/abs/1904.06991
• The Fréchet Distance between Multivariate Normal Distributions (Dowson and Landau, 1982): https://core.ac.uk/reader/82269844
• Hyperspherical Variational Auto-Encoders (Davidson, Falorsi, De Cao, Kipf, and Tomczak, 2018): https://arxiv.org/abs/1804.00891
• Generating Diverse High-Fidelity Images with VQ-VAE-2 (Razavi, van den Oord, and Vinyals, 2019): https://arxiv.org/abs/1906.00446
• Conditional Image Generation with PixelCNN Decoders (van den Oord et al., 2016): https://arxiv.org/abs/1606.05328
• Glow: Better Reversible Generative Models (Dhariwal and Kingma, 2018): https://openai.com/blog/glow/
• Machine Bias (Angwin, Larson, Mattu, and Kirchner, 2016): https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
• Fairness Definitions Explained (Verma and Rubin, 2018): https://fairware.cs.umass.edu/papers/Verma.pdf
• Does Object Recognition Work for Everyone? (DeVries, Misra, Wang, and van der Maaten, 2019): https://arxiv.org/abs/1906.02659
• PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models (Menon, Damian, Hu, Ravi, and Rudin, 2020): https://arxiv.org/abs/2003.03808
• What a machine learning tool that turns Obama white can (and can't) tell us about AI bias (Vincent, 2020): https://www.theverge.com/21298762/face-depixelizer-ai-machine-learning-tool-pulse-stylegan-obama-bias
• Mitigating Unwanted Biases with Adversarial Learning (Zhang, Lemoine, and Mitchell, 2018): https://m-mitchell.com/papers/Adversarial_Bias_Mitigation.pdf
• Tutorial on Fairness Accountability Transparency and Ethics in Computer Vision at CVPR 2020 (Gebru and Denton, 2020): https://sites.google.com/view/fatecv-tutorial/schedule?authuser=0
• Coupled Generative Adversarial Networks (Liu and Tuzel, 2016): https://arxiv.org/abs/1606.07536
• Progressive Growing of GANs for Improved Quality, Stability, and Variation (Karras, Aila, Laine, and Lehtinen, 2018): https://arxiv.org/abs/1710.10196
• A Style-Based Generator Architecture for Generative Adversarial Networks (Karras, Laine, and Aila, 2019): https://arxiv.org/abs/1812.04948
• The Unusual Effectiveness of Averaging in GAN Training (Yazici et al., 2019): https://arxiv.org/abs/1806.04498v2
• StyleGAN Faces Training (Branwen, 2019): https://www.gwern.net/images/gan/2019-03-16-stylegan-facestraining.mp4
• Facebook AI Proposes Group Normalization Alternative to Batch Normalization (Peng, 2018): https://medium.com/syncedreview/facebook-ai-proposes-group-normalization-alternative-to-batch-normalization-fb0699bffae7

• Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (Ledig et al., 2017): https://arxiv.org/abs/1609.04802
• Multimodal Unsupervised Image-to-Image Translation (Huang et al., 2018): https://github.com/NVlabs/MUNIT
• StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks (Zhang et al., 2017): https://arxiv.org/abs/1612.03242
• Few-Shot Adversarial Learning of Realistic Neural Talking Head Models (Zakharov, Shysheya, Burkov, and Lempitsky, 2019): https://arxiv.org/abs/1905.08233
• Snapchat: https://www.snapchat.com
• MaskGAN: Towards Diverse and Interactive Facial Image Manipulation (Lee, Liu, Wu, and Luo, 2020): https://arxiv.org/abs/1907.11922
• When AI generated paintings dance to music... (2019): https://www.youtube.com/watch?v=85l961MmY8Y
• Data Augmentation Generative Adversarial Networks (Antoniou, Storkey, and Edwards, 2018): https://arxiv.org/abs/1711.04340
• Training progression of StyleGAN on H&E tissue fragments (Zhou, 2019): https://twitter.com/realSharonZhou/status/1182877446690852867
• Establishing an evaluation metric to quantify climate change image realism (Sharon Zhou, Luccioni, Cosne, Bernstein, and Bengio, 2020): https://iopscience.iop.org/article/10.1088/2632-2153/ab7657/meta
• Deepfake example (2019): https://en.wikipedia.org/wiki/File:Deepfake_example.gif
• Introduction to adversarial robustness (Kolter and Madry): https://adversarial-ml-tutorial.org/introduction/
• Large Scale GAN Training for High Fidelity Natural Image Synthesis (Brock, Donahue, and Simonyan, 2019): https://openreview.net/pdf?id=B1xsqj09Fm
• GazeGAN - Unpaired Adversarial Image Generation for Gaze Estimation (Sela, Xu, He, Navalpakkam, and Lagun, 2017): https://arxiv.org/abs/1711.09767
• Data Augmentation using GANs for Speech Emotion Recognition (Chatziagapi et al., 2019): https://pdfs.semanticscholar.org/395b/ea6f025e599db710893acb6321e2a1898a1f.pdf
• GAN-based Synthetic Medical Image Augmentation for increased CNN Performance in Liver Lesion Classification (Frid-Adar et al., 2018): https://arxiv.org/abs/1803.01229
• GANsfer Learning: Combining labelled and unlabelled data for GAN based data augmentation (Bowles, Gunn, Hammers, and Rueckert, 2018): https://arxiv.org/abs/1811.10669
• Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks (Sandfort, Yan, Pickhardt, and Summers, 2019): https://www.nature.com/articles/s41598-019-52737-x/figures/3
• De-identification without losing faces (Li and Lyu, 2019): https://arxiv.org/abs/1902.04202
• Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing (Beaulieu-Jones et al., 2019): https://www.ahajournals.org/doi/epub/10.1161/CIRCOUTCOMES.118.005122
• DeepPrivacy: A Generative Adversarial Network for Face Anonymization (Hukkelås, Mester, and Lindseth, 2019): https://arxiv.org/abs/1909.04538
• GAIN: Missing Data Imputation using Generative Adversarial Nets (Yoon, Jordon, and van der Schaar, 2018): https://arxiv.org/abs/1806.02920
• Conditional Infilling GANs for Data Augmentation in Mammogram Classification (E. Wu, K. Wu, Cox, and Lotter, 2018): https://link.springer.com/chapter/10.1007/978-3-030-00946-5_11
• The Effectiveness of Data Augmentation in Image Classification using Deep Learning (Perez and Wang, 2017): https://arxiv.org/abs/1712.04621
• CIFAR-10 and CIFAR-100 Dataset; Learning Multiple Layers of Features from Tiny Images (Krizhevsky, 2009): https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
• DeOldify... (Antic, 2019): https://twitter.com/citnaj/status/1124904251128406016
• pix2pixHD (Wang et al., 2018): https://github.com/NVIDIA/pix2pixHD
• [4k, 60 fps] Arrival of a Train at La Ciotat (The Lumière Brothers, 1896) (Shiryaev, 2020): https://youtu.be/3RYNThid23g
• Image-to-Image Translation with Conditional Adversarial Networks (Isola, Zhu, Zhou, and Efros, 2018): https://arxiv.org/abs/1611.07004
• Pose Guided Person Image Generation (Ma et al., 2018): https://arxiv.org/abs/1705.09368
• AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks (Xu et al., 2017): https://arxiv.org/abs/1711.10485
• Patch-Based Image Inpainting with Generative Adversarial Networks (Demir and Unal, 2018): https://arxiv.org/abs/1803.07422
• Image Segmentation Using DIGITS 5 (Heinrich, 2016): https://developer.nvidia.com/blog/image-segmentation-using-digits-5/
• Stroke of Genius: GauGAN Turns Doodles into Stunning, Photorealistic Landscapes (Salian, 2019): https://blogs.nvidia.com/blog/2019/03/18/gaugan-photorealistic-landscapes-nvidia-research/
• Crowdsourcing the creation of image segmentation algorithms for connectomics (Arganda-Carreras et al., 2015): https://www.frontiersin.org/articles/10.3389/fnana.2015.00142/full
• U-Net: Convolutional Networks for Biomedical Image Segmentation (Ronneberger, Fischer, and Brox, 2015): https://arxiv.org/abs/1505.04597
• Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Zhu, Park, Isola, and Efros, 2020): https://arxiv.org/abs/1703.10593
• PyTorch implementation of CycleGAN (2017): https://github.com/togheppi/CycleGAN
• Distribution Matching Losses Can Hallucinate Features in Medical Image Translation (Cohen, Luck, and Honari, 2018): https://arxiv.org/abs/1805.08841
• Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks (Sandfort, Yan, Pickhardt, and Summers, 2019): https://www.nature.com/articles/s41598-019-52737-x.pdf
• Unsupervised Image-to-Image Translation (NVIDIA, 2018): https://github.com/mingyuliutw/UNIT
