
Gated Linear Units and Variants

This notebook trains a simple transformer model for auto-regressive language modeling and compares different variants of the position-wise feedforward network (FFN).
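The FFN variants replace the usual two-layer feedforward block with a gated form. As a minimal sketch (not the library's implementation), a GEGLU-style FFN computes FFN(x) = (GELU(x W1) * (x V)) W2, where the GELU branch gates the linear branch element-wise:

```python
import torch
import torch.nn as nn

class GLUFeedForward(nn.Module):
    """Hypothetical sketch of a GEGLU position-wise FFN:
    a GELU-activated branch gates a parallel linear branch."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)  # gate branch
        self.v = nn.Linear(d_model, d_ff, bias=False)   # linear branch
        self.w2 = nn.Linear(d_ff, d_model, bias=False)  # output projection
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # element-wise gating, then project back to d_model
        return self.w2(self.act(self.w1(x)) * self.v(x))

ffn = GLUFeedForward(d_model=16, d_ff=64)
out = ffn(torch.randn(2, 5, 16))  # (batch, seq, d_model) in and out
```

Other variants in the paper swap GELU for Swish (SwiGLU), sigmoid (the original GLU), ReLU (ReGLU), or an identity (bilinear).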

Annotated trainer code is at simple.py

Install the labml-nn package

!pip install labml-nn

Imports

import dataclasses

import torch
import torch.nn as nn
from labml import experiment
from labml_nn.transformers.glu_variants.simple import Configs, Trainer

Create an experiment

experiment.create(name="glu_variants")

Initialize configurations

conf = Configs()

Set the experiment configurations from the Configs dataclass, converted to a dictionary; entries in this dictionary can override the defaults

experiment.configs(dataclasses.asdict(conf))
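Since Configs is a dataclass, individual fields can be overridden before it is converted with dataclasses.asdict. A minimal sketch with a hypothetical config dataclass (the real fields live in labml_nn.transformers.glu_variants.simple):

```python
import dataclasses

@dataclasses.dataclass
class DemoConfigs:
    # hypothetical fields for illustration only
    d_model: int = 512
    d_ff: int = 2048
    glu_variant: str = 'GLU'

conf = DemoConfigs()
# override one field before logging the configurations
conf = dataclasses.replace(conf, glu_variant='GEGLU')
conf_dict = dataclasses.asdict(conf)
```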

Create Trainer

trainer = Trainer(conf)

Register the PyTorch model so the experiment can save and load its checkpoints

experiment.add_pytorch_models({'model': trainer.model})

Start the experiment and run the training loop.

with experiment.start():
    trainer.train()
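Inside trainer.train(), the auto-regressive objective is to predict each next token from the tokens before it. A hypothetical sketch of one such training step (a toy embedding-plus-projection stands in for the transformer):

```python
import torch
import torch.nn as nn

# Toy stand-in model: embedding followed by a vocabulary projection.
vocab_size, d_model, seq_len, batch = 100, 32, 10, 4
model = nn.Sequential(nn.Embedding(vocab_size, d_model),
                      nn.Linear(d_model, vocab_size))
opt = torch.optim.Adam(model.parameters())

tokens = torch.randint(0, vocab_size, (batch, seq_len))
logits = model(tokens[:, :-1])  # predict token t+1 from tokens up to t
# compare predictions at positions [:-1] with targets shifted by one
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```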