Title
Image Processing Using Multi-Code GAN Prior
Information
Paper: https://arxiv.org/abs/1912.07116
GitHub: https://github.com/genforce/mganprior
Summary
The authors propose using multiple latent codes for image reconstruction. Each code is fed through a pretrained GAN, and the intermediate feature maps they produce are composed at a chosen layer with adaptive channel weights, which yields noticeably better reconstructions. The same approach also applies to colorization, super-resolution, inpainting, denoising, and similar tasks.
Research Objective
Problem Statement
The common way to reconstruct an image is to recover a latent code for it, and there are two approaches:
- optimize the latent code by backpropagating a reconstruction loss
- train an extra encoder that maps images to latent codes
Neither produces reconstructions of the expected quality, and the gap widens when the input image comes from a different domain than the GAN's training data. The authors argue that a single latent code is simply not enough to reconstruct an image faithfully (otherwise image compression would already have seen a major breakthrough). A minimal sketch of the first, optimization-based baseline follows.
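A minimal PyTorch-style sketch of that baseline (optimizing a single latent code against a pixel reconstruction loss). `G`, `latent_dim`, and the hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import torch

def invert_single_code(G, target, latent_dim=512, steps=1000, lr=0.1):
    """Recover one latent code for `target` by gradient descent on a pixel loss."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((G(z) - target) ** 2)  # pixel-wise reconstruction loss
        loss.backward()
        opt.step()
    return z.detach()
```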
Method(s)
Structure
The authors use multiple latent codes: each code is pushed through the GAN generator up to an intermediate layer, and the resulting feature maps are combined with per-channel adaptive weights (Feature Composition & Adaptive Channel Importance); the remaining layers then render the final image. A sketch of the composition step is given below.
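A minimal PyTorch-style sketch of feature composition with adaptive channel importance, assuming the generator has been split into `G1` (layers up to the composition point) and `G2` (the remaining layers). The names and shapes are illustrative, not the repo's actual API.

```python
import torch

def compose(G1, G2, zs, alphas):
    """Multi-code feature composition (sketch).

    zs:     list of N latent codes, each of shape [1, latent_dim]
    alphas: tensor of shape [N, C], one importance weight per channel
            of the composition layer's feature maps
    """
    feats = [G1(z) for z in zs]                                # N maps of shape [1, C, H, W]
    weighted = [f * a.view(1, -1, 1, 1) for f, a in zip(feats, alphas)]
    composed = torch.stack(weighted, dim=0).sum(dim=0)         # channel-weighted sum
    return G2(composed)                                        # final image
```

Both the latent codes `zs` and the channel weights `alphas` are optimized jointly against the task loss.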
Optimization
- General form of the loss (see the sketch after this list)
- Colorization
- Super-resolution
- Inpainting
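A hedged reconstruction of these objectives from the paper's setup (the exact notation and loss weights may differ from the original): $x^{\mathrm{inv}}$ is the composed output of the generator, $\phi$ a VGG feature extractor, $m$ a binary mask, and $\mathrm{gray}(\cdot)$ / $\mathrm{down}(\cdot)$ the grayscale and downsampling operators.

```latex
% Hedged sketch of the mGANprior objectives; notation may differ from the paper.
\begin{align}
  \{z_n^*\},\{\alpha_n^*\}
    &= \arg\min_{\{z_n\},\{\alpha_n\}} \mathcal{L}\bigl(x^{\mathrm{inv}}, x\bigr),
    \qquad
    \mathcal{L}(x_1, x_2) = \| x_1 - x_2 \|_2^2
      + \lambda\,\| \phi(x_1) - \phi(x_2) \|_1 \\
  \mathcal{L}_{\mathrm{color}} &= \mathcal{L}\bigl(\mathrm{gray}(x^{\mathrm{inv}}),\, I_{\mathrm{gray}}\bigr) \\
  \mathcal{L}_{\mathrm{SR}}    &= \mathcal{L}\bigl(\mathrm{down}(x^{\mathrm{inv}}),\, I_{\mathrm{LR}}\bigr) \\
  \mathcal{L}_{\mathrm{inp}}   &= \bigl\| \bigl(x^{\mathrm{inv}} - I_{\mathrm{ori}}\bigr) \circ m \bigr\|_2^2
\end{align}
```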
Evaluation
The prior GANs are PGGAN and StyleGAN, pretrained on different datasets including CelebA-HQ, FFHQ, and LSUN.
Comparison with other inversion methods
- directly optimizing a single latent code
- using an encoder to produce the latent code
- a combination of the two above
- the proposed mGANprior method
Metrics: Peak Signal-to-Noise Ratio (PSNR, higher is better) and LPIPS (lower is better).
The proposed method wins on both PSNR and LPIPS, recovers details more faithfully, and can even reconstruct East Asian faces with a GAN pretrained on Western faces. A small sketch of the PSNR metric is given below.
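A minimal sketch of the PSNR metric for reference (LPIPS relies on a learned network, e.g. the `lpips` package, so it is not reproduced here); assumes images given as float arrays with values in [0, 1].

```python
import numpy as np

def psnr(x, y, max_val=1.0):
    """Peak Signal-to-Noise Ratio between two images with values in [0, max_val]."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")       # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```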
Ablations of the method itself
- Number of latent codes: using more latent codes yields finer details, but the gains plateau beyond about 20 codes.
- Which layer to perform feature composition at: the results (also in Figure 4) for PGGAN show that higher layers give better reconstructions; however, at higher layers the feature maps carry more pixel-level information and less semantic information, which works against reusing the GAN's semantic prior.
- Visualizing each latent code's effect on the final result: each latent code turns out to be responsible for a particular part of the image, which justifies using multiple latent codes.
Experiments on different image tasks
Not covered in detail here; the authors run comparative experiments across tasks (colorization, inpainting, etc.) on how the choice of composition layer affects the results:
- Reconstruction: the higher the layer, the better the reconstruction, since higher layers carry content details.
- Colorization: the 8th layer works best, since colorization is a low-level rendering task.
- Inpainting: the 4th layer works best, since the GAN has to fill in the missing regions.
Conclusion
The authors conclude that their method reconstructs images with high fidelity and performs strongly on colorization, super-resolution, image inpainting, and semantic manipulation.
Criticism
In the experiment on the number of latent codes, the authors include a plot whose vertical axis is correlation; I could not make sense of that plot.
Reference
- Improving GAN generation quality
- [23] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. In ICLR, 2018.
- [8] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale gan training for high fidelity natural image synthesis. In ICLR, 2019.
- [24] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, 2019.
- Improving GAN training stability
- [1] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein gan. arXiv preprint arXiv:1701.07875, 2017.
- [7] David Berthelot, Thomas Schumm, and Luke Metz. Began: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
- [17] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of wasserstein gans. In NeurIPS, 2017.
- Face attribute editing
- [27] (new network architecture) Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, and Marc'Aurelio Ranzato. Fader networks: Manipulating images by sliding attributes. In NeurIPS, 2017.
- [36] (new loss) Yujun Shen, Ping Luo, Junjie Yan, Xiaogang Wang, and Xiaoou Tang. Faceid-gan: Learning a symmetry three-player gan for identity-preserving face synthesis. In CVPR, 2018.
- [35] Yujun Shen, Jinjin Gu, Xiaoou Tang, and Bolei Zhou. Interpreting the latent space of gans for semantic face editing. In CVPR, 2020.
- Super-resolution
- [28] (new loss) Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, 2017.
- [42] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. Esrgan: Enhanced super-resolution generative adversarial networks. In ECCV Workshop, 2018.
- Image-to-image translation
- [53] (new network architecture) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
- [11] Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In CVPR, 2018.
- [31] Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, and Jan Kautz. Few-shot unsupervised image-to-image translation. In ICCV, 2019.
- GANs as priors for image generation and reconstruction
- [40] Paul Upchurch, Jacob Gardner, Geoff Pleiss, Robert Pless, Noah Snavely, Kavita Bala, and Kilian Weinberger. Deep feature interpolation for image content changes. In CVPR, 2017.
- [39] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. In CVPR, 2018.
- [2] Latent convolutional models. In ICLR, 2019.
- Generating a latent code from an image (GAN inversion)
- Optimizing the latent code by backpropagating a reconstruction loss
- [30] Zachary C Lipton and Subarna Tripathi. Precise recovery of latent vectors from generative adversarial networks. In ICLR Workshop, 2017.
- [12] Antonia Creswell and Anil Anthony Bharath. Inverting the generator of a generative adversarial network. TNNLS, 2018.
- [32] Fangchang Ma, Ulas Ayaz, and Sertac Karaman. Invertibility of convolutional generative networks from partial measurements. In NeurIPS, 2018.
- Training an extra encoder
- [34] Guim Perarnau, Joost Van De Weijer, Bogdan Raducanu, and Jose M. Álvarez. Invertible conditional gans for image editing. In NeurIPS Workshop, 2016.
- [52] Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, and Alexei A Efros. Generative visual manipulation on the natural image manifold. In ECCV, 2016.
- [6] David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Seeing what a gan cannot generate. In ICCV, 2019.
- [5] David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, and Antonio Torralba. Inverting layers of a large generator. In ICLR Workshop, 2019.
- Perceptual features
- [22] Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016.