docs/examples/llm/ai21.ipynb
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llm/ai21.ipynb" target="_parent"></a>
This notebook shows how to use AI21's foundation models in LlamaIndex. The default model is jamba-1.5-mini.
Other supported models are jamba-1.5-large and jamba-instruct. If you want to use the older Jurassic models, specify the model name j2-mid or j2-ultra.
If you're opening this Notebook on colab, you probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-ai21
!pip install llama-index
When creating an AI21 instance, you can pass the API key as a parameter. If not provided as a parameter, it defaults to the value of the environment variable AI21_API_KEY.
import os
from llama_index.llms.ai21 import AI21
# EITHER
api_key = "YOUR API KEY"
os.environ["AI21_API_KEY"] = api_key
llm = AI21()
# OR
llm = AI21(api_key=api_key)
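The lookup order above (explicit argument first, environment variable as fallback) can be sketched as a small helper. This is a hypothetical illustration, not part of llama-index:

```python
import os


# Hypothetical helper illustrating the key resolution order:
# an explicit api_key argument wins, otherwise fall back to AI21_API_KEY.
def resolve_api_key(explicit_key=None):
    key = explicit_key or os.environ.get("AI21_API_KEY")
    if key is None:
        raise ValueError(
            "Pass api_key or set the AI21_API_KEY environment variable"
        )
    return key
```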
Chat with a list of messages. Messages must be ordered from oldest to newest, starting with a user message and alternating between user and assistant roles.
from llama_index.core.llms import ChatMessage
from llama_index.llms.ai21 import AI21
messages = [
ChatMessage(role="user", content="hello there"),
ChatMessage(
role="assistant", content="Arrrr, matey! How can I help ye today?"
),
ChatMessage(role="user", content="What is your name?"),
]
# Use `preamble_override` to specify the voice and tone of the assistant.
resp = AI21(api_key=api_key).chat(
messages, preamble_override="You are a pirate with a colorful personality"
)
print(resp)
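The alternation rule above (start with a user message, then alternate user/assistant) can be sketched as a small check. This is a hypothetical helper, not an AI21 or llama-index API:

```python
def roles_alternate(roles):
    """Return True if roles start with 'user' and alternate user/assistant."""
    if not roles or roles[0] != "user":
        return False
    pattern = ("user", "assistant")
    return all(role == pattern[i % 2] for i, role in enumerate(roles))


roles_alternate(["user", "assistant", "user"])  # a valid ordering
```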
Complete with a prompt.
from llama_index.llms.ai21 import AI21
api_key = "Your api key"
resp = AI21(api_key=api_key).complete("Paul Graham is ")
print(resp)
from llama_index.core.llms import ChatMessage
from llama_index.llms.ai21 import AI21
prompt = "What is the meaning of life?"
messages = [
ChatMessage(role="user", content=prompt),
]
chat_resp = await AI21(api_key=api_key).achat(messages)
complete_resp = await AI21(api_key=api_key).acomplete(prompt)
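Top-level `await` works inside a notebook; in a plain script, wrap the async calls with `asyncio.run`. A sketch using a stand-in coroutine in place of the real `achat`/`acomplete` calls:

```python
import asyncio


# Stand-in for AI21(api_key=...).acomplete(prompt); swap in the real call.
async def fake_acomplete(prompt):
    return f"completion for: {prompt}"


async def main():
    return await fake_acomplete("What is the meaning of life?")


result = asyncio.run(main())
```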
Configure parameters passed to the model to adjust its behavior. For instance, setting a lower temperature will cause less variation between calls. Setting temperature=0 will generate the same answer to the same question every time.
from llama_index.llms.ai21 import AI21
llm = AI21(
model="jamba-1.5-mini", api_key=api_key, max_tokens=100, temperature=0.5
)
resp = llm.complete("Paul Graham is ")
print(resp)
Stream generated responses token by token using the stream_chat method.
from llama_index.llms.ai21 import AI21
from llama_index.core.llms import ChatMessage
llm = AI21(api_key=api_key, model="jamba-1.5-mini")
messages = [
ChatMessage(
role="system", content="You are a pirate with a colorful personality"
),
ChatMessage(role="user", content="Tell me a story"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
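Each streamed chunk carries only the newly generated text in its delta field, so joining the deltas reconstructs the full message. A simulation with a stand-in generator in place of `stream_chat` (the chunk shape here is an assumption for illustration):

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    delta: str  # the new text in this chunk, as in r.delta above


def fake_stream():
    # Stand-in for llm.stream_chat(messages)
    for piece in ["Arrr, ", "matey! ", "Gather round."]:
        yield Chunk(piece)


full_text = "".join(chunk.delta for chunk in fake_stream())
```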
Different models use different tokenizers.
from llama_index.llms.ai21 import AI21
llm = AI21(api_key=api_key, model="jamba-1.5-mini")
tokenizer = llm.tokenizer
tokens = tokenizer.encode("Hello llama-index!")
decoded = tokenizer.decode(tokens)
print(decoded)
from llama_index.core.agent import FunctionAgent
from llama_index.llms.ai21 import AI21
from llama_index.core.tools import FunctionTool
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def subtract(a: int, b: int) -> int:
    """Subtract b from a and return the result."""
    return a - b


def divide(a: int, b: int) -> float:
    """Divide a by b and return the result as a float."""
    return a / b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b
multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
divide_tool = FunctionTool.from_defaults(fn=divide)
llm = AI21(model="jamba-1.5-mini", api_key=api_key)
agent = FunctionAgent(
tools=[multiply_tool, add_tool, subtract_tool, divide_tool],
llm=llm,
)
response = await agent.run(
"My friend Moses had 10 apples. He ate 5 apples in the morning. Then he found a box with 25 apples. He divided all his apples between his 5 friends. How many apples did each friend get?"
)
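Worked by hand, the agent's task reduces to subtracting, adding, then dividing. A plain-Python check of the expected arithmetic:

```python
# The word problem step by step:
# 10 apples - 5 eaten = 5; 5 + 25 found = 30; 30 / 5 friends = 6 apples each.
remaining = 10 - 5
total = remaining + 25
per_friend = total / 5
```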