
Fixed Positional Encodings

The positional encoding encodes the position along the sequence into a vector of size $d_{model}$.

\begin{align}
PE_{p,2i} &= \sin\left(\frac{p}{10000^{\frac{2i}{d_{model}}}}\right) \\
PE_{p,2i+1} &= \cos\left(\frac{p}{10000^{\frac{2i}{d_{model}}}}\right)
\end{align}

Where $1 \leq 2i, 2i + 1 \leq d_{model}$ are the feature indexes in the encoding, and $p$ is the position.
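As a quick worked example (not from the original page), take $d_{model} = 4$. The denominators are $10000^{0/4} = 1$ and $10000^{2/4} = 100$, so the encoding of position $p = 1$ is

$$PE_1 = \left[\sin(1),\ \cos(1),\ \sin(0.01),\ \cos(0.01)\right] \approx \left[0.8415,\ 0.5403,\ 0.0100,\ 1.0000\right]$$

Lower feature indexes oscillate quickly with position while higher ones vary slowly, which gives each position a distinct pattern.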

import math

import numpy as np
import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):


    def __init__(self, d_model: int, dropout_prob: float, max_len: int = 5000):
        super().__init__()
        self.dropout = nn.Dropout(dropout_prob)

        self.register_buffer('positional_encodings', get_positional_encoding(d_model, max_len), False)
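The third argument to register_buffer is persistent=False: the pre-computed encodings live on the module (and move with it across devices), but they are not saved in its state_dict, since they can always be recomputed from d_model and max_len.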


    def forward(self, x: torch.Tensor):
        pe = self.positional_encodings[:x.shape[0]].detach().requires_grad_(False)
        x = x + pe
        x = self.dropout(x)
        return x
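Here is a minimal usage sketch (the sizes are arbitrary and not from the original file). The module expects inputs of shape [seq_len, batch_size, d_model], which broadcast against the [max_len, 1, d_model] encodings built below:

pe_layer = PositionalEncoding(d_model=512, dropout_prob=0.1)
# Token embeddings for a batch of 32 sequences, each 10 tokens long
x = torch.zeros(10, 32, 512)
y = pe_layer(x)  # same shape; encodings are broadcast over the batch dimension, then dropout is applied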


def get_positional_encoding(d_model: int, max_len: int = 5000):


Empty encoding vectors

    encodings = torch.zeros(max_len, d_model)


Position indexes

    position = torch.arange(0, max_len, dtype=torch.float32).unsqueeze(1)


$2i$

    two_i = torch.arange(0, d_model, 2, dtype=torch.float32)


$10000^{\frac{2i}{d_{model}}}$

    div_term = torch.exp(two_i * -(math.log(10000.0) / d_model))
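The exponential of a logarithm here is just a power written in a convenient form:

$$\exp\left(2i \cdot \frac{-\ln 10000}{d_{model}}\right) = 10000^{-\frac{2i}{d_{model}}} = \frac{1}{10000^{\frac{2i}{d_{model}}}}$$

which is the factor each position is multiplied by before taking the sine and cosine.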


$PE_{p,2i} = \sin\left(\frac{p}{10000^{\frac{2i}{d_{model}}}}\right)$

    encodings[:, 0::2] = torch.sin(position * div_term)


$PE_{p,2i+1} = \cos\left(\frac{p}{10000^{\frac{2i}{d_{model}}}}\right)$

    encodings[:, 1::2] = torch.cos(position * div_term)


Add batch dimension

    encodings = encodings.unsqueeze(1).requires_grad_(False)

    return encodings
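A quick shape check (a sketch, not part of the original code): the returned tensor is [max_len, 1, d_model], so it broadcasts across the batch dimension when added to an input.

enc = get_positional_encoding(d_model=512, max_len=100)
assert enc.shape == (100, 1, 512)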


def _test_positional_encoding():
    import matplotlib.pyplot as plt

    plt.figure(figsize=(15, 5))
    pe = get_positional_encoding(20, 100)
    plt.plot(np.arange(100), pe[:, 0, 4:8].numpy())
    plt.legend(["dim %d" % p for p in [4, 5, 6, 7]])
    plt.title("Positional encoding")
    plt.show()


if __name__ == '__main__':
    _test_positional_encoding()
