Generative Adversarial Networks (GAN)


[View code on Github](https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/gan/original/__init__.py)


This is an implementation of Generative Adversarial Networks.

The generator, $G(\pmb{z}; \theta_g)$, generates samples that match the distribution of the data, while the discriminator, $D(\pmb{x}; \theta_d)$, gives the probability that $\pmb{x}$ came from the data rather than from $G$.

We train $D$ and $G$ simultaneously on a two-player min-max game with value function $V(G, D)$.

$$\min_G \max_D V(D, G) = \mathbb{E}_{\pmb{x} \sim p_{data}(\pmb{x})} \big[\log D(\pmb{x})\big] + \mathbb{E}_{\pmb{z} \sim p_{\pmb{z}}(\pmb{z})} \big[\log \big(1 - D(G(\pmb{z}))\big)\big]$$

$p_{data}(\pmb{x})$ is the probability distribution over the data, whilst $p_{\pmb{z}}(\pmb{z})$ is the probability distribution of $\pmb{z}$, which is set to Gaussian noise.

This file defines the loss functions. There is an MNIST example with two multilayer perceptrons for the generator and the discriminator; a rough sketch of what such networks could look like follows the imports below.

```python
import torch
import torch.nn as nn
import torch.utils.data
```
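
The actual MNIST experiment lives elsewhere in the repository; the following is only a hedged sketch of what the two multilayer perceptrons could look like. The class names, layer sizes, and latent dimension are illustrative assumptions, not the experiment's real configuration.

```python
# Illustrative sketch only: assumed layer sizes and a 100-dimensional latent vector.
class SketchGenerator(nn.Module):
    def __init__(self, latent_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 28 * 28), nn.Tanh(),  # flattened 28x28 image in [-1, 1]
        )

    def forward(self, z: torch.Tensor):
        return self.net(z)


class SketchDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # A single output logit; the loss classes below apply the sigmoid internally
        self.net = nn.Sequential(
            nn.Linear(28 * 28, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x: torch.Tensor):
        return self.net(x)
```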


Discriminator Loss

The discriminator should ascend on the gradient,

$$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \Big[\log D\big(\pmb{x}^{(i)}\big) + \log \Big(1 - D\big(G\big(\pmb{z}^{(i)}\big)\big)\Big)\Big]$$

$m$ is the mini-batch size and $(i)$ is used to index samples in the mini-batch. $\pmb{x}$ are samples from $p_{data}$ and $\pmb{z}$ are samples from $p_{\pmb{z}}$.

```python
class DiscriminatorLogitsLoss(nn.Module):
    def __init__(self, smoothing: float = 0.2):
        super().__init__()
```


We use PyTorch Binary Cross Entropy Loss, which is $-\sum \big[y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\big]$, where $y$ are the labels and $\hat{y}$ are the predictions. Note the negative sign. We use labels equal to $1$ for $\pmb{x}$ from $p_{data}$ and labels equal to $0$ for $\pmb{x}$ from $p_{G}$. Then descending on the sum of these losses is the same as ascending on the above gradient.

BCEWithLogitsLoss combines a sigmoid layer and binary cross entropy loss in one numerically stable operation.

```python
        self.loss_true = nn.BCEWithLogitsLoss()
        self.loss_false = nn.BCEWithLogitsLoss()
```
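
As a quick sanity check (a hypothetical snippet, not part of the implementation): with a target of $1$, BCEWithLogitsLoss reduces to $-\log D(\pmb{x})$, and with a target of $0$ it reduces to $-\log\big(1 - D(\pmb{x})\big)$, so descending on these two losses is exactly ascending on the gradient above.

```python
# Hypothetical check that BCE with targets 1 / 0 matches the two terms of the GAN objective
logit = torch.tensor([[0.7]])
bce = nn.BCEWithLogitsLoss()
d = torch.sigmoid(logit)  # D(x) = sigmoid of the logit

assert torch.allclose(bce(logit, torch.ones_like(logit)), -torch.log(d))       # -log D(x)
assert torch.allclose(bce(logit, torch.zeros_like(logit)), -torch.log(1 - d))  # -log(1 - D(x))
```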


We use label smoothing because it seems to work better in some cases.

```python
        self.smoothing = smoothing
```


Labels are registered as buffers and persistence is set to `False`.

```python
        self.register_buffer('labels_true', _create_labels(256, 1.0 - smoothing, 1.0), False)
        self.register_buffer('labels_false', _create_labels(256, 0.0, smoothing), False)
```
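
The third argument of `register_buffer` is `persistent`; setting it to `False` keeps these label tensors out of the module's `state_dict` (they are cheap to re-create), while they still follow the module across devices. A hypothetical illustration:

```python
# Hypothetical illustration of non-persistent buffers (not part of the implementation)
loss_fn = DiscriminatorLogitsLoss()
print('labels_true' in loss_fn.state_dict())  # False: non-persistent buffers are not checkpointed
loss_fn.to(torch.device('cpu'))               # ...but they still move with the module
print(loss_fn.labels_true.shape)              # torch.Size([256, 1])
```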


`logits_true` are logits from $D(\pmb{x}^{(i)})$ and `logits_false` are logits from $D(G(\pmb{z}^{(i)}))$.

```python
    def forward(self, logits_true: torch.Tensor, logits_false: torch.Tensor):
        # Grow the label buffers if the batch is larger than the pre-allocated 256 labels
        if len(logits_true) > len(self.labels_true):
            self.register_buffer("labels_true",
                                 _create_labels(len(logits_true), 1.0 - self.smoothing, 1.0, logits_true.device), False)
        if len(logits_false) > len(self.labels_false):
            self.register_buffer("labels_false",
                                 _create_labels(len(logits_false), 0.0, self.smoothing, logits_false.device), False)

        # Return the losses for samples from p_data and from p_G separately
        return (self.loss_true(logits_true, self.labels_true[:len(logits_true)]),
                self.loss_false(logits_false, self.labels_false[:len(logits_false)]))
```
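
A hedged usage sketch of the discriminator update, assuming the illustrative networks from the sketch above (the optimizer settings and batch shapes here are assumptions, not the experiment's values): the two returned losses are summed, and descending on that sum with respect to the discriminator parameters corresponds to ascending on the gradient above.

```python
# Hypothetical discriminator step, assuming the sketch networks defined earlier
generator, discriminator = SketchGenerator(), SketchDiscriminator()
d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
discriminator_loss = DiscriminatorLogitsLoss()

real_images = torch.randn(64, 28 * 28)  # stand-in for a flattened MNIST batch
z = torch.randn(64, 100)                # latent noise

d_optimizer.zero_grad()
logits_true = discriminator(real_images)
logits_false = discriminator(generator(z).detach())  # detach: no generator gradients in this step
loss_true, loss_false = discriminator_loss(logits_true, logits_false)
(loss_true + loss_false).backward()
d_optimizer.step()
```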


Generator Loss

The generator should descend on the gradient,

$$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log \Big(1 - D\big(G\big(\pmb{z}^{(i)}\big)\big)\Big)$$

```python
class GeneratorLogitsLoss(nn.Module):
    def __init__(self, smoothing: float = 0.2):
        super().__init__()
        self.loss_true = nn.BCEWithLogitsLoss()
        self.smoothing = smoothing
```


We use labels equal to $1$ for $\pmb{x}$ from $p_{G}$, so this loss is $-\log D(G(\pmb{z}))$. Descending on it pushes $D(G(\pmb{z}))$ towards $1$, which moves the generator in the same direction as descending on the above gradient; this non-saturating form is the one suggested in the original paper because it gives stronger gradients early in training.

```python
        self.register_buffer('fake_labels', _create_labels(256, 1.0 - smoothing, 1.0), False)
```


```python
    def forward(self, logits: torch.Tensor):
        # Grow the label buffer if the batch is larger than the pre-allocated 256 labels
        if len(logits) > len(self.fake_labels):
            self.register_buffer("fake_labels",
                                 _create_labels(len(logits), 1.0 - self.smoothing, 1.0, logits.device), False)

        return self.loss_true(logits, self.fake_labels[:len(logits)])
```
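
And a matching hedged sketch of the generator update, under the same assumptions as the discriminator step above:

```python
# Hypothetical generator step, continuing the sketch above
g_optimizer = torch.optim.Adam(generator.parameters(), lr=2e-4)
generator_loss = GeneratorLogitsLoss()

g_optimizer.zero_grad()
logits = discriminator(generator(torch.randn(64, 100)))  # no detach: gradients flow back to the generator
generator_loss(logits).backward()
g_optimizer.step()
```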


Create smoothed labels, sampled uniformly from $[r_1, r_2]$.

```python
def _create_labels(n: int, r1: float, r2: float, device: torch.device = None):
    return torch.empty(n, 1, requires_grad=False, device=device).uniform_(r1, r2)
```
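
For example (a hypothetical call; exact values vary per run), with the default smoothing of $0.2$ the "real" labels are drawn uniformly from $[0.8, 1.0]$ and the "fake" labels from $[0.0, 0.2]$:

```python
# Hypothetical example of the smoothed label helper
real_labels = _create_labels(3, 0.8, 1.0)
fake_labels = _create_labels(3, 0.0, 0.2)
print(real_labels.shape)  # torch.Size([3, 1])
print(real_labels)        # e.g. tensor([[0.93], [0.81], [0.88]]) -- uniform in [0.8, 1.0]
print(fake_labels)        # e.g. tensor([[0.05], [0.17], [0.02]]) -- uniform in [0.0, 0.2]
```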
