Pay Attention to MLPs (gMLP)


This is a PyTorch implementation of the paper Pay Attention to MLPs.

The paper introduces a Multilayer Perceptron (MLP) based architecture with gating, which the authors name gMLP. The model consists of a stack of $L$ identical gMLP blocks, each combining channel projections with a spatial gating unit that mixes information across token positions.
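As a rough sketch of the idea (not the annotated implementation itself), a single gMLP block can be written in PyTorch as follows. The class names `GMLPBlock` and `SpatialGatingUnit` and the hyperparameter choices are illustrative; the near-identity initialization of the spatial projection follows the paper's recommendation.

```python
import torch
from torch import nn
import torch.nn.functional as F


class SpatialGatingUnit(nn.Module):
    """Split features in half and gate one half with a learned
    linear projection across the sequence (token) dimension."""

    def __init__(self, d_ffn: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_ffn // 2)
        # Projection along the sequence dimension; initialized so the
        # gate starts near identity, as suggested in the paper.
        self.proj = nn.Linear(seq_len, seq_len)
        nn.init.zeros_(self.proj.weight)
        nn.init.ones_(self.proj.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u, v = x.chunk(2, dim=-1)          # each: [batch, seq, d_ffn // 2]
        v = self.norm(v)
        # Apply the projection over the sequence dimension.
        v = self.proj(v.transpose(-1, -2)).transpose(-1, -2)
        return u * v                        # element-wise gating


class GMLPBlock(nn.Module):
    """One gMLP block: norm -> channel projection -> GELU ->
    spatial gating -> channel projection, with a residual connection."""

    def __init__(self, d_model: int, d_ffn: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.proj_in = nn.Linear(d_model, d_ffn)
        self.sgu = SpatialGatingUnit(d_ffn, seq_len)
        self.proj_out = nn.Linear(d_ffn // 2, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = self.norm(x)
        x = F.gelu(self.proj_in(x))
        x = self.sgu(x)
        x = self.proj_out(x)
        return x + shortcut
```

Stacking $L$ of these blocks (plus token embeddings and an output head) gives the full model; note that the spatial projection ties the block to a fixed `seq_len`.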

Here is the training code for an autoregressive language model based on gMLP.