docs/gan/original/index.html
[View code on Github](https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/gan/original/__init__.py)
This is an implementation of Generative Adversarial Networks.
The generator, $G(\mathbf{z}; \theta_g)$, generates samples that match the distribution of the data, while the discriminator, $D(\mathbf{x}; \theta_d)$, gives the probability that $\mathbf{x}$ came from the data rather than from $G$.
We train $D$ and $G$ simultaneously on a two-player min-max game with value function $V(G, D)$:
$$\min_G \max_D V(D, G) = \mathbb{E}_{\mathbf{x} \sim p_{data}(\mathbf{x})} \big[\log D(\mathbf{x})\big] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})} \big[\log \big(1 - D(G(\mathbf{z}))\big)\big]$$
$p_{data}(\mathbf{x})$ is the probability distribution over the data, whilst $p_{\mathbf{z}}(\mathbf{z})$ is the probability distribution of $\mathbf{z}$, which is set to Gaussian noise.
This file defines the loss functions. There is an MNIST example with two multilayer perceptrons for the generator and the discriminator.
```python
import torch
import torch.nn as nn
import torch.utils.data
```
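As a rough illustration of the kind of networks these losses are paired with, here is a minimal sketch of a multilayer-perceptron generator and discriminator for MNIST. The class names and layer sizes are assumptions for this illustration, not the code from the linked experiment.

```python
class Generator(nn.Module):
    """Hypothetical MLP generator: maps a 100-d noise vector to a 28x28 image."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(100, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 28 * 28), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor):
        return self.layers(z).view(-1, 1, 28, 28)


class Discriminator(nn.Module):
    """Hypothetical MLP discriminator: maps a 28x28 image to a single logit."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(28 * 28, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x: torch.Tensor):
        return self.layers(x.view(-1, 28 * 28))
```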
The discriminator should ascend on the gradient,
$$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \Big[\log D\big(\mathbf{x}^{(i)}\big) + \log \Big(1 - D\big(G(\mathbf{z}^{(i)})\big)\Big)\Big]$$
$m$ is the mini-batch size and $(i)$ is used to index samples in the mini-batch. $\mathbf{x}$ are samples from $p_{data}$ and $\mathbf{z}$ are samples from $p_{\mathbf{z}}$.
```python
class DiscriminatorLogitsLoss(nn.Module):
```
```python
    def __init__(self, smoothing: float = 0.2):
        super().__init__()
```
We use PyTorch Binary Cross Entropy Loss, which is $-\sum \big[y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\big]$, where $y$ are the labels and $\hat{y}$ are the predictions. Note the negative sign. We use labels equal to $1$ for $\mathbf{x}$ from $p_{data}$ and labels equal to $0$ for $\mathbf{x}$ from $p_G$. Then descending on the sum of these is the same as ascending on the above gradient.
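To spell out that step (ignoring the label smoothing described below): with label $y = 1$ for samples from $p_{data}$ and $y = 0$ for generated samples, the averaged binary cross entropy over a mini-batch becomes
$$-\frac{1}{m} \sum_{i=1}^{m} \Big[\log D\big(\mathbf{x}^{(i)}\big) + \log \Big(1 - D\big(G(\mathbf{z}^{(i)})\big)\Big)\Big]$$
which is the negative of the quantity above, so descending on this loss ascends on that gradient.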
`BCEWithLogitsLoss` combines a sigmoid and binary cross entropy loss.
```python
        self.loss_true = nn.BCEWithLogitsLoss()
        self.loss_false = nn.BCEWithLogitsLoss()
```
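For clarity, `nn.BCEWithLogitsLoss` applied to raw logits gives the same result (up to floating point) as applying a sigmoid first and then `nn.BCELoss`; a small illustrative check:

```python
logits = torch.randn(8, 1)    # raw discriminator outputs (logits)
targets = torch.ones(8, 1)    # labels

with_logits = nn.BCEWithLogitsLoss()(logits, targets)
manual = nn.BCELoss()(torch.sigmoid(logits), targets)
assert torch.allclose(with_logits, manual, atol=1e-6)
```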
We use label smoothing because it seems to work better in some cases.
```python
        self.smoothing = smoothing
```
Labels are registered as buffers and persistence is set to `False`, so they are not saved in the module's `state_dict`.
```python
        self.register_buffer('labels_true', _create_labels(256, 1.0 - smoothing, 1.0), False)
        self.register_buffer('labels_false', _create_labels(256, 0.0, smoothing), False)
```
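As a quick illustration of what the `persistent=False` flag (the third positional argument) does: the label buffers travel with the module between devices but are excluded from checkpoints.

```python
loss_fn = DiscriminatorLogitsLoss()
# Non-persistent buffers do not appear in the state_dict.
assert 'labels_true' not in loss_fn.state_dict()
# They still move with the module, e.g. loss_fn.to('cuda') would move them too.
```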
`logits_true` are the logits from $D\big(\mathbf{x}^{(i)}\big)$ and `logits_false` are the logits from $D\big(G(\mathbf{z}^{(i)})\big)$.
```python
    def forward(self, logits_true: torch.Tensor, logits_false: torch.Tensor):
```
```python
        # Grow the label buffers if the batch is larger than the pre-allocated labels
        if len(logits_true) > len(self.labels_true):
            self.register_buffer("labels_true",
                                 _create_labels(len(logits_true), 1.0 - self.smoothing, 1.0, logits_true.device), False)
        if len(logits_false) > len(self.labels_false):
            self.register_buffer("labels_false",
                                 _create_labels(len(logits_false), 0.0, self.smoothing, logits_false.device), False)

        # Return the two losses separately so the caller can log and sum them
        return (self.loss_true(logits_true, self.labels_true[:len(logits_true)]),
                self.loss_false(logits_false, self.labels_false[:len(logits_false)]))
```
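A hedged sketch of how this loss might be used inside a discriminator update step; the `discriminator`, `generator`, and `optimizer` names and the 100-d noise vector are assumptions for illustration, not code from this file.

```python
discriminator_loss = DiscriminatorLogitsLoss()

def discriminator_step(discriminator, generator, real_images, optimizer):
    optimizer.zero_grad()
    # Logits for real data and for generated samples
    logits_true = discriminator(real_images)
    z = torch.randn(real_images.shape[0], 100, device=real_images.device)
    logits_false = discriminator(generator(z).detach())
    # Sum the two BCE terms; descending on this ascends the objective for D
    loss_true, loss_false = discriminator_loss(logits_true, logits_false)
    (loss_true + loss_false).backward()
    optimizer.step()
```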
The generator should descend on the gradient,
$$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \Big[\log \Big(1 - D\big(G(\mathbf{z}^{(i)})\big)\Big)\Big]$$
```python
class GeneratorLogitsLoss(nn.Module):
```
```python
    def __init__(self, smoothing: float = 0.2):
        super().__init__()
        self.loss_true = nn.BCEWithLogitsLoss()
        self.smoothing = smoothing
```
We use labels equal to $1$ for $\mathbf{x}$ from $p_G$. Then descending on this loss is the same as descending on the above gradient.
```python
        self.register_buffer('fake_labels', _create_labels(256, 1.0 - smoothing, 1.0), False)
```
```python
    def forward(self, logits: torch.Tensor):
        # Grow the label buffer if the batch is larger than the pre-allocated labels
        if len(logits) > len(self.fake_labels):
            self.register_buffer("fake_labels",
                                 _create_labels(len(logits), 1.0 - self.smoothing, 1.0, logits.device), False)

        return self.loss_true(logits, self.fake_labels[:len(logits)])
```
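And a corresponding hedged sketch of a generator update step; again, the network and optimizer names are illustrative assumptions.

```python
generator_loss = GeneratorLogitsLoss()

def generator_step(discriminator, generator, batch_size, optimizer, device):
    optimizer.zero_grad()
    z = torch.randn(batch_size, 100, device=device)
    # Do not detach here: gradients must flow back through the generator
    logits = discriminator(generator(z))
    loss = generator_loss(logits)
    loss.backward()
    optimizer.step()
```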
Create smoothed labels by sampling uniformly between `r1` and `r2`.
```python
def _create_labels(n: int, r1: float, r2: float, device: torch.device = None):
    return torch.empty(n, 1, requires_grad=False, device=device).uniform_(r1, r2)
```
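For instance, with the default smoothing of $0.2$, the "real" labels are sampled uniformly from $[0.8, 1.0]$ and the "fake" labels from $[0.0, 0.2]$:

```python
real_labels = _create_labels(4, 0.8, 1.0)   # shape (4, 1), values in [0.8, 1.0]
fake_labels = _create_labels(4, 0.0, 0.2)   # shape (4, 1), values in [0.0, 0.2]
print(real_labels.shape)  # torch.Size([4, 1])
```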