Title

Image Processing Using Multi-Code GAN Prior

Information

Paper: https://arxiv.org/abs/1912.07116
GitHub: https://github.com/genforce/mganprior

Summary

The authors propose reconstructing an image with multiple latent codes: each code is fed through a pretrained GAN, and the resulting feature maps are combined at some intermediate layer with adaptive weights, which yields noticeably better reconstructions. The same scheme also applies to colorization, super-resolution, inpainting, denoising, and other tasks.

Research Objective

Improve the quality of image reconstruction with pretrained GANs.

Problem Statement

The usual approach to image reconstruction is to recover a latent code from the image, and there are two ways to do this:

  1. Optimize the latent code by back-propagating a reconstruction loss
  2. Train an extra encoder that maps images to latent codes

Neither approach reaches the expected reconstruction quality, and the gap widens when the input image lies in a different domain from the GAN's training data. The authors argue that a single latent code simply cannot reconstruct an arbitrary image faithfully (otherwise image compression would have seen a major breakthrough).

The authors aim to achieve high-quality reconstruction of high-resolution images on top of existing pretrained GAN models.
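For concreteness, a minimal sketch of baseline (1), assuming a frozen pretrained generator `G` is available as a callable; the function name and hyperparameters are illustrative, not the paper's:

```python
import torch

# Minimal sketch: invert ONE latent code by back-propagating a pixel-wise
# reconstruction loss through a frozen pretrained generator G.
def invert_single_code(G, x, latent_dim=512, steps=1000, lr=1e-2):
    z = torch.randn(1, latent_dim, requires_grad=True)  # random initialization
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(G(z), x)    # reconstruction loss
        loss.backward()                                 # gradient w.r.t. z only
        opt.step()
    return z.detach()
```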

Method(s)

Structure

[Figure: overview of the mGANprior framework]
The authors use multiple latent codes: each code is mapped by the pretrained GAN to an intermediate feature map, and the maps are fused with adaptive per-channel weights (Feature Composition & Adaptive Channel Importance) before being decoded into the final image.
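A minimal sketch of the composition step, assuming the pretrained generator has been split at layer $\ell$ into two hypothetical callables `G1` (latent code → feature map) and `G2` (feature map → image):

```python
import torch

def compose(G1, G2, zs, alphas):
    """Fuse the N codes' feature maps with adaptive channel weights.

    zs:     list of N latent codes, each of shape (1, latent_dim)
    alphas: tensor of shape (N, C), one channel-weight vector per code
    """
    feats = [G1(z) for z in zs]                  # N feature maps of shape (1, C, H, W)
    weighted = [f * a.view(1, -1, 1, 1)          # scale each channel by its weight
                for f, a in zip(feats, alphas)]
    fused = torch.stack(weighted).sum(dim=0)     # sum the contributions of all codes
    return G2(fused)                             # decode fused features to an image
```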

Optimization

General form of the loss: the N latent codes and the channel weights are optimized jointly,

$$\{z_n^*\}_{n=1}^{N},\ \{\alpha_n^*\}_{n=1}^{N} = \arg\min_{\{z_n\},\{\alpha_n\}} \mathcal{L}\big(x^{\mathrm{inv}},\ x\big), \qquad x^{\mathrm{inv}} = G_2^{(\ell)}\Big(\sum_{n=1}^{N} G_1^{(\ell)}(z_n) \odot \alpha_n\Big)$$

where $\odot$ scales each channel of the $n$-th feature map by its weight, and $\mathcal{L}$ combines a pixel-wise L2 term with a perceptual (VGG feature) term. A sketch of this optimization loop follows; the task-specific objectives are listed after it.
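A minimal sketch of that loop, reusing `compose` from the previous sketch; `vgg_feat` stands for an assumed fixed feature extractor (e.g. a truncated VGG16), and the equal weighting of the two loss terms is an assumption rather than the paper's exact setting:

```python
import torch
import torch.nn.functional as F

def mgan_invert(G1, G2, vgg_feat, x, N=20, latent_dim=512, C=512,
                steps=1000, lr=1e-2):
    # The N latent codes and N channel-weight vectors are the only free
    # variables; the generator itself stays frozen.
    zs = torch.randn(N, 1, latent_dim, requires_grad=True)
    alphas = torch.ones(N, C, requires_grad=True)
    opt = torch.optim.Adam([zs, alphas], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_inv = compose(G1, G2, list(zs), alphas)           # x^inv
        loss = (F.mse_loss(x_inv, x)                        # pixel-wise term
                + F.l1_loss(vgg_feat(x_inv), vgg_feat(x)))  # perceptual term
        loss.backward()
        opt.step()
    return zs.detach(), alphas.detach()
```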

  • Colorization: given a grayscale image $I_{gray}$,

    $$L_{color} = \mathcal{L}\big(\mathrm{gray}(x^{\mathrm{inv}}),\ I_{gray}\big)$$

  • Super-resolution: given a low-resolution image $I_{LR}$,

    $$L_{SR} = \mathcal{L}\big(\mathrm{down}(x^{\mathrm{inv}}),\ I_{LR}\big)$$

  • Inpainting: given the original image $I_{ori}$ and a binary mask $m$,

    $$L_{inp} = \mathcal{L}\big(x^{\mathrm{inv}} \circ m,\ I_{ori} \circ m\big)$$

    where $\circ$ denotes pixel-wise multiplication. Minimal sketches of all three objectives follow.
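Each objective applies a known degradation to `x_inv` before comparing against the observation, leaving the GAN prior to fill in the rest; the grayscale weights, the pooling-based `down(·)`, and the mask convention below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

# L is the reconstruction loss from above (pixel-wise + perceptual).

def colorization_loss(x_inv, I_gray, L):
    w = x_inv.new_tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)
    return L((x_inv * w).sum(dim=1, keepdim=True), I_gray)   # gray(x^inv) vs I_gray

def super_resolution_loss(x_inv, I_lr, L, factor=8):
    return L(F.avg_pool2d(x_inv, kernel_size=factor), I_lr)  # down(x^inv) vs I_LR

def inpainting_loss(x_inv, I_ori, m, L):
    # m is 1 on known pixels and 0 in the hole; * is the pixel-wise product
    return L(x_inv * m, I_ori * m)
```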

Evaluation

The GAN priors are PGGAN and StyleGAN models pretrained on different datasets, including CelebA-HQ, FFHQ, and LSUN.

Comparison with other inversion methods

  1. Directly optimizing a single latent code
  2. Generating the latent code with an encoder
  3. A combination of 1 and 2
  4. The proposed mGANprior method

Metrics: Peak Signal-to-Noise Ratio (PSNR, higher is better) and LPIPS (lower is better).
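A quick sketch of the two metrics: PSNR computed directly from the MSE, and LPIPS assumed to come from the official `lpips` package (which expects inputs scaled to [-1, 1]):

```python
import torch
import lpips

def psnr(x, y, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB; higher is better."""
    mse = torch.mean((x - y) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)

lpips_fn = lpips.LPIPS(net='alex')  # perceptual distance; lower is better
# distance = lpips_fn(x_reconstructed, x_target)
```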
[Figure: quantitative (PSNR/LPIPS) and qualitative comparisons with the other inversion methods]
The proposed method leads on both PSNR and LPIPS and restores details more faithfully; it can even reconstruct Asian faces with a GAN pretrained on Western faces.

Ablation studies

  1. Number of latent codes
    [Figure 4 in the paper: reconstruction quality vs. number of latent codes and composition layer]
    Using more latent codes recovers more detail, but the gains saturate beyond about 20 codes.
  2. At which layer to perform feature composition
    The results are also in Figure 4. The PGGAN experiments show that the higher the composition layer, the better the reconstruction. However, feature maps at higher layers carry more pixel-level information and less semantic information, which works against reusing the GAN's semantic prior knowledge.
  3. Visualizing each latent code's effect on the final result
    [Figure: visualization of each latent code's contribution]
    Each latent code turns out to be responsible for a particular region of the generated image, so using multiple latent codes is meaningful (a simplified probe of this is sketched below).
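A simplified, hypothetical probe of that observation (not the paper's exact analysis): decode each code's weighted feature map with all other codes zeroed out and inspect which region of the image it dominates; `G1`, `G2`, and the shapes follow the earlier sketches:

```python
import torch

def per_code_images(G1, G2, zs, alphas):
    outs = []
    for z, a in zip(zs, alphas):
        f = G1(z) * a.view(1, -1, 1, 1)  # keep only this code's contribution
        outs.append(G2(f))               # image generated from that code alone
    return outs
```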

Experiments on different image tasks

Not covered in detail here: these are per-task comparison experiments (colorization, inpainting, and so on).

Effect of the composition layer on each task

  • Reconstruction: the higher the layer, the better, since higher layers carry content details
  • Colorization: layer 8 works best, since colorization is a low-level rendering task
  • Inpainting: layer 4 works best, since the GAN has to fill in the missing parts

Conclusion

The authors conclude that their method reconstructs images with high fidelity and performs strongly on colorization, super-resolution, inpainting, and semantic manipulation.

Criticism

In the experiment on the number of latent codes, the authors plot a figure whose vertical axis is "correlation"; I cannot make sense of that figure.

Reference

  • Improving GAN generation quality
    • [23] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. In ICLR, 2018.
    • [8] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale gan training for high fidelity natural image synthesis. In ICLR, 2019.
    • [24] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, 2019.
  • Improving GAN training stability
    • [1] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein gan. arXiv preprint arXiv:1701.07875, 2017.
    • [7] David Berthelot, Thomas Schumm, and Luke Metz. Began: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
    • [17] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of wasserstein gans. In NeurIPS, 2017.
  • Face attribute editing
    • [27] (new architecture) Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, and Marc’Aurelio Ranzato. Fader networks: Manipulating images by sliding attributes. In NeurIPS, 2017.
    • [36] (new loss) Yujun Shen, Ping Luo, Junjie Yan, Xiaogang Wang, and Xiaoou Tang. Faceid-gan: Learning a symmetry three-player gan for identity-preserving face synthesis. In CVPR, 2018.
    • [35] Yujun Shen, Jinjin Gu, Xiaoou Tang, and Bolei Zhou. Interpreting the latent space of gans for semantic face editing. In CVPR, 2020.

  • Super-resolution
    • [28] (new loss) Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, 2017.
    • [42]Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. Esrgan: Enhanced super-resolution generative adversarial networks. In ECCV Workshop, 2018.
  • Image-to-image translation
    • [53] (new architecture) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
    • [11] Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In CVPR, 2018.
    • [31]Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, and Jan Kautz. Few-shot unsupervised image-to-image translation. In ICCV, 2019.
  • GAN as a prior for image generation and reconstruction
    • [40] Paul Upchurch, Jacob Gardner, Geoff Pleiss, Robert Pless, Noah Snavely, Kavita Bala, and Kilian Weinberger. Deep feature interpolation for image content changes. In CVPR, 2017.
    • [39] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, 2018.
    • [2] ShahRukh Athar, Evgeny Burnaev, and Victor Lempitsky. Latent convolutional models. In ICLR, 2019.
  • Inferring the latent code from an image (GAN inversion)
    • Optimizing the latent code by back-propagating a reconstruction loss
      • [30] Zachary C Lipton and Subarna Tripathi. Precise recovery of latent vectors from generative adversarial networks. In ICLR Workshop, 2017.
      • [12] Antonia Creswell and Anil Anthony Bharath. Inverting the generator of a generative adversarial network. TNNLS, 2018.
      • [32] Fangchang Ma, Ulas Ayaz, and Sertac Karaman. Invertibility of convolutional generative networks from partial measurements. In NeurIPS, 2018.
    • Training an extra encoder
      • [34] Guim Perarnau, Joost Van De Weijer, Bogdan Raducanu, and Jose M. Álvarez. Invertible conditional gans for image editing. In NeurIPS Workshop, 2016.
      • [52] Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, and Alexei A Efros. Generative visual manipulation on the natural image manifold. In ECCV, 2016.
      • [6]David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Seeing what a gan cannot generate. In ICCV, 2019.
      • [5]David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Inverting layers of a large generator. In ICLR Workshop, 2019.
  • Perceptual features
    • [22]Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016.