Gated Linear Units and Variants

docs/transformers/glu_variants/index.html
