[View code on Github](https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/neox/utils/__init__.py)
```python
import typing
from typing import List, Optional

import torch

from labml import logger
from labml.logger import Text
from labml_nn.neox.tokenizer import get_tokenizer

if typing.TYPE_CHECKING:
    from tokenizers import Tokenizer
```
Tokenizer singleton
```python
_TOKENIZER: Optional['Tokenizer'] = None
```
* `text` is the text to tokenize
* **Returns** the token ids
```python
def get_tokens(text: str) -> List[int]:
    global _TOKENIZER
    if _TOKENIZER is None:
        _TOKENIZER = get_tokenizer()
    return _TOKENIZER.encode_batch([text])[0].ids
```
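The lazy-singleton pattern above loads the tokenizer once and caches it for every later call. A minimal standalone sketch of the same pattern (`_load_tokenizer` is a hypothetical stand-in for the real `get_tokenizer`, which is assumed to be expensive to call):

```python
from typing import Optional

_TOKENIZER: Optional[object] = None


def _load_tokenizer() -> object:
    # Hypothetical stand-in for labml_nn's get_tokenizer();
    # assume the real load is expensive.
    return object()


def get_tokenizer_singleton() -> object:
    # Return the cached tokenizer, loading it on first use only.
    global _TOKENIZER
    if _TOKENIZER is None:
        _TOKENIZER = _load_tokenizer()
    return _TOKENIZER


a = get_tokenizer_singleton()  # first call triggers the load
b = get_tokenizer_singleton()  # later calls reuse the cached instance
print(a is b)  # True
```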
Pretty prints target tokens alongside outputs from the model(s).
* `ids` are the target token ids
* `xs` are the model(s) outputs

```python
def print_token_outputs(ids: List[int], *xs: torch.Tensor):
    ids = ids + [-1]
    xs = [[-1] + x[0].max(dim=-1)[1].tolist() for x in xs]

    print_tokens(ids, xs)
```
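The expression `x[0].max(dim=-1)[1]` takes the greedy (argmax) token at each position of the model's logits. A pure-Python sketch of that step, using made-up logit values for illustration:

```python
# Hypothetical logits for a 2-position sequence over a 3-token vocabulary.
logits = [
    [0.1, 2.5, 0.3],  # position 0 -> token 1
    [1.9, 0.2, 0.4],  # position 1 -> token 0
]


def argmax(row):
    # Index of the largest value in a row (what max(dim=-1)[1] does per position).
    return max(range(len(row)), key=row.__getitem__)


# A leading -1 aligns predictions with the shifted targets,
# mirroring what print_token_outputs does.
predicted = [-1] + [argmax(row) for row in logits]
print(predicted)  # [-1, 1, 0]
```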
Pretty prints tokens for comparison
* `target` are the target token ids
* `others` are the sampled outputs from the model(s)

```python
def print_tokens(target: List[int], others: List[List[int]]):
```
Load tokenizer
```python
    global _TOKENIZER
    if _TOKENIZER is None:
        _TOKENIZER = get_tokenizer()
```
Convert the tokens to a list of strings
```python
    text = []
    for i in range(len(target)):
        tokens = [_TOKENIZER.decode([target[i]]) if target[i] != -1 else '---']
        for j in range(len(others)):
            tokens.append(_TOKENIZER.decode([others[j][i]]) if others[j][i] != -1 else '---')

        text.append(tokens)
```
Stats
```python
    correct = [0 for _ in others]
    total = 0
```
Iterate through tokens
```python
    for i in range(len(target)):
        parts = [(f'{i}: ', Text.meta)]
        parts += [('"', Text.subtle), (text[i][0], Text.subtle), ('"', Text.subtle), '\t']
```
Empty target
```python
        if target[i] == -1:
            for j in range(len(others)):
                parts += [('"', Text.subtle), (text[i][j + 1], Text.subtle), ('"', Text.subtle), '\t']

            logger.log(parts)
            continue
```
Number of tokens
```python
        total += 1
```
Other outputs
```python
        for j in range(len(others)):
            correct[j] += 1 if others[j][i] == target[i] else 0

            parts += [('"', Text.subtle),
                      (text[i][j + 1], Text.success if others[j][i] == target[i] else Text.danger),
                      ('"', Text.subtle), '\t']

        logger.log(parts)
```
Stats
```python
    parts = [(f'{total}', Text.highlight), '\t']
    for j in range(len(others)):
        parts += [(f'{correct[j]}', Text.value), '\t']
    logger.log(parts)
```
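The accuracy bookkeeping above can be seen in isolation with a minimal sketch that drops the tokenizer and logger; the token ids below are hypothetical:

```python
# -1 marks positions with no target; these are skipped, as in print_tokens.
target = [5, 9, -1, 7]
others = [[5, 2, -1, 7],   # predictions from a hypothetical model A
          [5, 9, -1, 3]]   # predictions from a hypothetical model B

correct = [0 for _ in others]
total = 0
for i, t in enumerate(target):
    if t == -1:
        continue
    total += 1
    for j in range(len(others)):
        # Count a match for model j when its token equals the target.
        correct[j] += 1 if others[j][i] == t else 0

print(total, correct)  # 3 [2, 2]
```

Out of the 3 non-empty targets, each model matches 2, which is what the final stats line would print.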
Split `n_layers` into `n_chunks`. This is used for pipeline-parallel training.
* `n_layers` is the number of layers
* `n_chunks` is the number of chunks
* **Returns** a list with the number of layers for each chunk
```python
def balance_layers_simple(n_layers: int, n_chunks: int):
    balance = []
    for i in range(n_chunks):
        balance.append((n_layers - sum(balance)) // (n_chunks - i))

    return list(reversed(balance))
```
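Because the list is built back-to-front and then reversed, any remainder layers land in the earliest chunks. A standalone copy of the function (restated here so the example runs without `labml_nn`) shows this:

```python
def balance_layers_simple(n_layers: int, n_chunks: int):
    # Same logic as above: give each remaining chunk the floor of an even
    # share, then reverse so larger shares come first.
    balance = []
    for i in range(n_chunks):
        balance.append((n_layers - sum(balance)) // (n_chunks - i))
    return list(reversed(balance))


# 10 layers over 3 chunks: the remainder goes to the earliest chunk.
print(balance_layers_simple(10, 3))  # [4, 3, 3]
# An exact split stays even.
print(balance_layers_simple(44, 4))  # [11, 11, 11, 11]
```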