Jonathan Hui


GAN — CycleGAN (Playing magic with pictures)

Much GAN research focuses on model convergence and mode collapse, because GAN training is a game-theoretic problem that we do not yet know how to solve efficiently. But GANs can be fun, in particular for cross-domain transfer. CycleGAN transfers styles between images. For example, we start by collecting three sets of pictures: one of real scenery, one of Monet paintings and one of Van Gogh paintings. Can we take a real picture and transfer the style of Monet or Van Gogh onto it? Conversely, can we make a Monet painting look real?

Can we turn a summer picture into a winter one?


In addition, the sets of images are not paired, i.e. we do not have real photos of the same locations that Monet painted. CycleGAN learns the style of his paintings as a whole and applies it to other types of images.

CycleGAN

The concept of applying a GAN to an existing design is very simple. We can treat the original problem as a simple image reconstruction. We use a deep network G to convert image x to y. We reverse the process with another deep network F to reconstruct the image. Then, we use the mean squared error (MSE) to guide the training of G and F.
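The round trip x → G(x) → F(G(x)) and its MSE can be sketched in a few lines. This is only a toy illustration: the functions `G` and `F` below are hypothetical stand-ins (simple invertible pixel transforms on a flat list), not deep networks, and `mse` and `reconstruction_loss` are names chosen for this example.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat images."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Hypothetical stand-ins for the deep networks G and F (assumption:
# real CycleGAN generators are convolutional networks, not formulas).
def G(x):
    """Convert an image x into the target domain y."""
    return [2.0 * p + 1.0 for p in x]


def F(y):
    """Convert y back, attempting to reconstruct the original x."""
    return [(p - 1.0) / 2.0 for p in y]


def reconstruction_loss(x, G, F):
    """MSE between x and its reconstruction F(G(x))."""
    return mse(F(G(x)), x)


x = [0.0, 0.25, 0.5, 1.0]
perfect = reconstruction_loss(x, G, F)            # F inverts G exactly
broken = reconstruction_loss(x, G, lambda y: y)   # F does nothing useful
print(perfect, broken)
```

When F inverts G the loss is zero; any mismatch between the reconstruction and x raises it, which is the signal that guides the training of both networks.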

However, we are not interested in reconstructing images. We want to create y resembling a certain style. In a GAN, a discriminator D is added to an existing design to guide the generator network to perform better. D acts as a critic that compares training samples with generated images. Through this criticism, we use backpropagation to modify the generator so that it produces images addressing the shortcomings identified by the discriminator. In this problem, we introduce a discriminator D to make sure y resembles Van Gogh paintings.

Network design

CycleGAN transfers pictures from one domain to another. To transform pictures between real images and Van Gogh paintings, we build three networks.

  • A generator G to convert a real image to a Van Gogh style picture.
  • A generator F to convert a Van Gogh style picture to a real image.
  • A discriminator D to distinguish real Van Gogh pictures from generated ones.

For the reverse direction, we just reverse the data flow and build an additional discriminator Dx to distinguish real images from generated ones.

Cost function

Training a GAN is like training the design without the GAN and then adding back the adversarial loss for the generator and the discriminator.

We first determine the reconstruction cost:

  • This is the cycle consistency loss, which measures the L1-norm reconstruction cost for the real image (x → y → reconstructed x) and for the paintings (y → x → reconstructed y).
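Written out, the cycle consistency loss described above takes the following form. This follows the formulation in the CycleGAN paper; the symbol \mathcal{L}_{cyc} is the paper's notation rather than something defined in this article.

```latex
\mathcal{L}_{cyc}(G, F) =
    \mathbb{E}_{x}\big[\, \lVert F(G(x)) - x \rVert_1 \,\big]
  + \mathbb{E}_{y}\big[\, \lVert G(F(y)) - y \rVert_1 \,\big]
```

The first term covers the x → y → reconstructed x cycle and the second covers y → x → reconstructed y.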

Then we add the adversarial loss for the generator and the discriminator.

  • Adversarial loss
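For the direction real image → Van Gogh painting, the adversarial loss can be written in the standard GAN form below. This follows the CycleGAN paper, with the discriminator for the painting domain written as D as in this article (the paper calls it D_Y); X and Y denote the sets of real images and paintings.

```latex
\mathcal{L}_{GAN}(G, D, X, Y) =
    \mathbb{E}_{y}\big[ \log D(y) \big]
  + \mathbb{E}_{x}\big[ \log\big(1 - D(G(x))\big) \big]
```

The discriminator D tries to maximize this value while the generator G tries to minimize it. The reverse direction uses the same form with F and Dx, i.e. \mathcal{L}_{GAN}(F, D_x, Y, X).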

Here is the final objective function:
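As formulated in the CycleGAN paper, the full objective combines the adversarial loss \mathcal{L}_{GAN} for each direction with the cycle consistency loss \mathcal{L}_{cyc}, weighted by a hyperparameter \lambda. The discriminator names D and D_x follow this article; the paper writes them as D_Y and D_X.

```latex
\mathcal{L}(G, F, D_x, D) =
    \mathcal{L}_{GAN}(G, D, X, Y)
  + \mathcal{L}_{GAN}(F, D_x, Y, X)
  + \lambda \, \mathcal{L}_{cyc}(G, F)

G^{*}, F^{*} = \arg\min_{G, F} \; \max_{D_x, D} \; \mathcal{L}(G, F, D_x, D)
```

A larger \lambda puts more emphasis on faithful reconstruction relative to fooling the discriminators.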

In addition, CycleGAN retains a history of the last 50 generated images to train the discriminator. Discriminators and generators can overfit to each other and fall into a cycle of greedy optimization that leads to mode collapse. We use experience replay so the discriminator will not be overzealous in beating the current generator. It also needs to beat images produced by earlier versions of the generator, which leads to a more generalized solution.
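A minimal sketch of such a history buffer is shown below. The class name `ImageHistory`, the `query` method and the 50% swap probability are assumptions made for this illustration; the buffer size of 50 comes from the text, and images are represented by arbitrary Python objects rather than tensors.

```python
import random


class ImageHistory:
    """Keeps up to `capacity` previously generated images for replay."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.images = []
        self.rng = rng or random.Random()

    def query(self, image):
        """Return the image to show the discriminator for this step."""
        # While the buffer is filling up, store and use the new image.
        if len(self.images) < self.capacity:
            self.images.append(image)
            return image
        # Assumed policy: half of the time, hand back an older image
        # and store the new one in its place.
        if self.rng.random() < 0.5:
            idx = self.rng.randrange(self.capacity)
            old = self.images[idx]
            self.images[idx] = image
            return old
        # Otherwise use the freshly generated image and leave the buffer as is.
        return image


history = ImageHistory(capacity=50, rng=random.Random(0))
# Integers stand in for generated images; the index is the generation step.
shown = [history.query(step) for step in range(200)]
```

With this policy the discriminator regularly sees outputs from earlier generator states, so it cannot specialize on the quirks of the current generator alone.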

Further readings

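If you want to learn more about GANs:

GAN — A comprehensive review into the gangsters of GANs (Part 1): https://readmedium.com/gan-a-comprehensive-review-into-the-gangsters-of-gans-part-1-95ff52455672

Or if you are interested in seeing more GAN applications:

GAN — Some cool applications of GANs: https://readmedium.com/gan-some-cool-applications-of-gans-4c9ecca35900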

Reference

CycleGAN paper: https://arxiv.org/pdf/1703.10593.pdf

CycleGAN PyTorch implementation: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
