This is an annotated PyTorch experiment to train a gMLP model. The paper also applies Stochastic Depth regularization, where some layers are dropped at random during training. We have not implemented that here.
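For reference, stochastic depth randomly skips a whole residual block during training and rescales at inference. Below is a minimal, illustrative sketch of the idea; the wrapper class and its names are our own, not part of the paper's code or labml_nn.

import torch
import torch.nn as nn


class StochasticDepth(nn.Module):
    """Illustrative wrapper: skip `layer` entirely with probability `drop_prob` while training."""

    def __init__(self, layer: nn.Module, drop_prob: float):
        super().__init__()
        self.layer = layer
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            # At inference, keep the layer but scale by the survival probability
            return x + (1. - self.drop_prob) * self.layer(x)
        if torch.rand(1).item() < self.drop_prob:
            # Drop the whole layer; only the residual path survives
            return x
        return x + self.layer(x)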
This is based on the training loop and configurations for a simple transformer auto-regressive NLP task.
from labml import experiment
from labml.configs import option
from labml_nn.transformers import TransformerConfigs
from labml_nn.transformers.basic.autoregressive_experiment import Configs as BasicAutoRegressionConfigs
from labml_nn.transformers.gmlp import GMLPBlock
This inherits the training loop and configurations from the simple transformer auto-regressive NLP task.
class Configs(BasicAutoRegressionConfigs):
Transformer
    transformer: TransformerConfigs = 'gMLP'
gMLP Block
    gmlp: GMLPBlock
d_ffn for the gMLP projection layer
    d_ffn: int = 2048
Create a gMLP block
@option(Configs.gmlp, 'gMLP')
def _gmlp_configs(c: Configs):
    return GMLPBlock(c.d_model, c.d_ffn, c.seq_len)
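For context, this is roughly what a gMLP block computes, per the paper: normalize, project the channels up to d_ffn, apply GeLU, gate with a spatial (along-the-sequence) projection, and project back to d_model with a residual connection. The sketch below is illustrative and batch-first; the actual GMLPBlock also causally masks the spatial projection for autoregressive modelling and may differ in details.

import torch
import torch.nn as nn


class SpatialGatingUnit(nn.Module):
    """Split channels in half; one half gates the other after mixing along the sequence."""

    def __init__(self, d_z: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_z // 2)
        # Linear projection across token positions (this sketch omits the causal mask)
        self.spatial_proj = nn.Linear(seq_len, seq_len)
        # The paper initializes near identity: weights ~ 0, biases 1
        nn.init.zeros_(self.spatial_proj.weight)
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: [batch, seq_len, d_z]
        z1, z2 = z.chunk(2, dim=-1)
        z2 = self.norm(z2)
        # Mix z2 along the sequence dimension, then use it to gate z1
        z2 = self.spatial_proj(z2.transpose(1, 2)).transpose(1, 2)
        return z1 * z2


class GMLPBlockSketch(nn.Module):
    def __init__(self, d_model: int, d_ffn: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.proj_in = nn.Linear(d_model, d_ffn)
        self.activation = nn.GELU()
        self.sgu = SpatialGatingUnit(d_ffn, seq_len)
        self.proj_out = nn.Linear(d_ffn // 2, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, d_model]
        z = self.activation(self.proj_in(self.norm(x)))
        z = self.sgu(z)
        # Residual connection around the whole block
        return x + self.proj_out(z)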
@option(Configs.transformer, 'gMLP')
def _transformer_configs(c: Configs):
We use our configurable transformer implementation
    conf = TransformerConfigs()
Set the vocabulary sizes for embeddings and generating logits
    conf.n_src_vocab = c.n_tokens
    conf.n_tgt_vocab = c.n_tokens
Set model size
    conf.d_model = c.d_model
Replace the encoder layer with a gMLP layer
    conf.encoder_layer = c.gmlp

    return conf
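Setting encoder_layer is enough because the configurable transformer builds its encoder by stacking copies of that one prototype layer, n_layers deep. Below is a minimal sketch of that stacking pattern, assuming deep copies so each layer gets its own parameters; the actual labml_nn code may differ in details.

import copy
import torch.nn as nn


def stack_layers(prototype: nn.Module, n_layers: int) -> nn.ModuleList:
    # Deep-copy the prototype so the stacked layers do not share weights
    return nn.ModuleList([copy.deepcopy(prototype) for _ in range(n_layers)])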
def main():
Create experiment
    experiment.create(name="gMLP")
Create configs
    conf = Configs()
Override configurations
    experiment.configs(conf, {
Use character level tokenizer
        'tokenizer': 'character',
Prompt separator is blank
        'prompt_separator': '',
Starting prompt for sampling
        'prompt': 'It is ',
Use Tiny Shakespeare dataset
        'text': 'tiny_shakespeare',
Use a context size of 256
        'seq_len': 256,
Train for 128 epochs
        'epochs': 128,
Batch size 32
        'batch_size': 32,
Switch between training and validation 10 times per epoch
        'inner_iterations': 10,
Model size
        'd_model': 512,
        'd_ffn': 2048,
Use the Noam optimizer
        'optimizer.optimizer': 'Noam',
        'optimizer.learning_rate': 1.,
    })
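The Noam schedule, from "Attention Is All You Need", warms the learning rate up linearly and then decays it with the inverse square root of the step: lr = factor * d_model^(-0.5) * min(step^(-0.5), step * warmup^(-1.5)). Here is a sketch, assuming the configured learning_rate of 1.0 acts as the multiplicative factor (the usual convention) and using the paper's default warmup of 4000 steps; the experiment's actual warmup may differ.

def noam_lr(step: int, d_model: int = 512, warmup: int = 4000, factor: float = 1.) -> float:
    # Linear warmup for `warmup` steps, then inverse square-root decay
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)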
Set models for saving and loading
    experiment.add_pytorch_models({'model': conf.model})
Start the experiment
    with experiment.start():
Run training
        conf.run()
if __name__ == '__main__':
    main()