Reusing Discriminators for Encoding:
Towards Unsupervised Image-to-Image Translation

Runfa Chen
Wenbing Huang
Binghui Huang
Fuchun Sun*
Bin Fang
*Corresponding author: Fuchun Sun.
THUAI
BNRist
Tsinghua University
Code [GitHub]
CVPR 2020 [Paper]



Spotlight-1min



Abstract

Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks abandon the discriminator once training is complete. This paper proposes a novel role for the discriminator: reusing it to encode the images of the target domain. The proposed architecture, termed NICE-GAN, has two advantages over previous approaches. First, it is more compact, since no independent encoding component is required. Second, this plug-in encoder is trained directly by the adversarial loss, which makes it more informative, and it is trained more effectively if a multi-scale discriminator is applied. The main issue in NICE-GAN is the coupling of translation with discrimination along the encoder, which could cause training inconsistency when we play the min-max game via GAN. To tackle this issue, we develop a decoupled training strategy in which the encoder is trained only when maximizing the adversarial loss and is kept frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and human preference. Comprehensive ablation studies are also carried out to isolate the validity of each proposed component. Our code is available at https://github.com/alpc91/NICE-GAN-pytorch.
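The decoupled training strategy described in the abstract can be illustrated with a small PyTorch sketch. This is a toy under stated assumptions, not the released implementation: the module names (`encoder`, `classifier`, `generator`), the layer sizes, the single translation direction, and the binary cross-entropy loss are all hypothetical stand-ins chosen for brevity. The only point it demonstrates is that the shared encoder is updated in the discriminator step and left untouched in the generator step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy modules (illustrative sizes, not the paper's architecture).
# The discriminator is classifier(encoder(x)); the translator is generator(encoder(x)).
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.LeakyReLU(0.2))
classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
generator = nn.Sequential(nn.Conv2d(8, 3, 3, padding=1), nn.Tanh())

# The encoder belongs to the discriminator's optimizer only.
opt_d = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3
)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()  # assumption: stand-in for the adversarial loss


def set_requires_grad(module, flag):
    """Freeze or unfreeze every parameter of a module."""
    for p in module.parameters():
        p.requires_grad_(flag)


def discriminator_step(x_real, x_src):
    """Maximize the adversarial loss: encoder and classifier are both updated."""
    with torch.no_grad():  # the generator is not trained in this step
        fake = generator(encoder(x_src))
    real_logits = classifier(encoder(x_real))
    fake_logits = classifier(encoder(fake))
    # Minimizing this BCE is equivalent to maximizing the adversarial objective.
    loss = bce(real_logits, torch.ones_like(real_logits)) + bce(
        fake_logits, torch.zeros_like(fake_logits)
    )
    opt_d.zero_grad()
    loss.backward()
    opt_d.step()
    return loss.item()


def generator_step(x_src):
    """Minimize the adversarial loss: only the generator is updated."""
    set_requires_grad(encoder, False)     # encoder is frozen here
    set_requires_grad(classifier, False)
    fake = generator(encoder(x_src))
    logits = classifier(encoder(fake))    # gradients still reach the generator
    loss = bce(logits, torch.ones_like(logits))
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    set_requires_grad(encoder, True)      # unfreeze for the next discriminator step
    set_requires_grad(classifier, True)
    return loss.item()
```

Freezing with `requires_grad_(False)` still lets gradients flow through the encoder to the generated image, so the generator receives a training signal while the encoder weights stay fixed.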


Try NICE-GAN / Download the PyTorch version


[GitHub]


Paper

Runfa Chen, Wenbing Huang, Binghui Huang,
Fuchun Sun, Bin Fang.

Reusing Discriminators for Encoding:
Towards Unsupervised Image-to-Image Translation

In CVPR, 2020 [ArXiv (preferred)] [CVF].



Poster


[PDF]


Acknowledgements

This research was jointly funded by the National Natural Science Foundation of China (NSFC) and the German Research Foundation (DFG) in project Cross Modal Learning, NSFC 61621136008/DFG TRR-169, and the National Natural Science Foundation of China (Grant No. 91848206). We thank Chengliang Zhong, Mingxuan Jing and Dr. Tao Kong for their insightful advice. Special thanks to Amy for her language guidance and support.