
configs.py


from labml.configs import BaseConfigs


Transformer Configurations

This defines the configurations for a transformer. The configurations are calculated using option functions; these are evaluated lazily, so only the options that are actually needed get computed.
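The lazy option-function idea can be sketched in plain Python. This is a simplified stand-in for illustration, not labml's actual implementation; the `LazyConfigs`, `option`, and `d_head` names are invented for this sketch:

```python
class LazyConfigs:
    """Simplified stand-in: option functions are registered per attribute
    and run only when that attribute is first read."""
    _options = {}

    @classmethod
    def option(cls, name):
        def register(fn):
            cls._options[name] = fn
            return fn
        return register

    def __getattr__(self, name):
        # Called only when `name` has not been set yet: compute it lazily.
        if name in self._options:
            value = self._options[name](self)
            setattr(self, name, value)  # cache the computed value
            return value
        raise AttributeError(name)


class Configs(LazyConfigs):
    d_model = 512
    n_heads = 8


@Configs.option('d_head')
def _d_head(c):
    # Derived option: computed only if `d_head` is actually accessed.
    return c.d_model // c.n_heads
```

Accessing `Configs().d_head` triggers the option function once and caches the result; options that are never read are never computed.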

class RWKVConfigs(BaseConfigs):

Number of attention heads

n_heads: int = 8

Transformer embedding size

d_model: int = 512

Number of layers

n_layers: int = 6

Dropout probability

dropout: float = 0.1

Number of tokens in the source vocabulary (for token embeddings)

n_src_vocab: int

Number of tokens in the target vocabulary (to generate logits for prediction)

n_tgt_vocab: int
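Assuming the class above and an installed `labml` package, a minimal usage sketch might look like the following (the vocabulary size is illustrative, e.g. for a character-level task):

```python
from labml.configs import BaseConfigs


class RWKVConfigs(BaseConfigs):
    n_heads: int = 8
    d_model: int = 512
    n_layers: int = 6
    dropout: float = 0.1
    n_src_vocab: int
    n_tgt_vocab: int


conf = RWKVConfigs()
# The two vocabulary sizes have no defaults, so they must be set
# before the configurations are used in an experiment:
conf.n_src_vocab = 65  # illustrative value
conf.n_tgt_vocab = 65
```

The fields with defaults (`n_heads`, `d_model`, `n_layers`, `dropout`) can be left as-is or overridden the same way.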
