GPT-NeoX Tokenizer

This initializes a Hugging Face tokenizer from the downloaded vocabulary.

    from tokenizers import Tokenizer

    from labml import lab, monit

Load NeoX Tokenizer

Returns the tokenizer

    @monit.func('Load NeoX Tokenizer')
    def get_tokenizer() -> Tokenizer:

        vocab_file = lab.get_data_path() / 'neox' / 'slim_weights' / '20B_tokenizer.json'
        tokenizer = Tokenizer.from_file(str(vocab_file))

        return tokenizer
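To see the load step in isolation, here is a minimal sketch that does not need the downloaded NeoX file. It builds a toy word-level tokenizer, saves it to a temporary JSON file, and reloads it with the same `Tokenizer.from_file(str(...))` call. The three-entry vocabulary, the `toy_tokenizer.json` file name, and the variable names are assumptions made for illustration; they are not part of the NeoX code.

```python
import tempfile
from pathlib import Path

from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace

# Toy vocabulary (hypothetical, for illustration only).
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
toy = Tokenizer(WordLevel(vocab=vocab, unk_token="[UNK]"))
toy.pre_tokenizer = Whitespace()  # split the input on whitespace and punctuation

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name standing in for 20B_tokenizer.json.
    vocab_file = Path(tmp) / "toy_tokenizer.json"
    toy.save(str(vocab_file))

    # Same pattern as get_tokenizer: build the path with pathlib,
    # then convert it to a string before passing it to from_file.
    reloaded = Tokenizer.from_file(str(vocab_file))

encoding = reloaded.encode("hello world")
ids = encoding.ids        # token ids looked up in the toy vocabulary
tokens = encoding.tokens  # the matching token strings
print(ids, tokens)
```

With the real tokenizer returned by `get_tokenizer()`, the same `encode(...).ids` and `decode(ids)` calls should apply. The ids will differ because they come from the NeoX vocabulary rather than this toy one.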
