docs/examples/llm/asi1.ipynb
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llm/asi1.ipynb" target="_parent"></a>
ASI1-Mini is an advanced, agentic LLM developed by Fetch.ai, a founding member of the Artificial Superintelligence Alliance, and designed for decentralized operations. Its architecture lets it execute tasks and collaborate with other agents for efficient, adaptable problem-solving in complex environments.
This notebook demonstrates how to use ASI models with LlamaIndex, covering basic completion, chat, streaming, function calling, structured prediction, RAG, and more. If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
First, let's install the required packages:
%pip install llama-index-llms-asi llama-index-llms-openai llama-index-core
You'll need to set your API key for ASI, and optionally for OpenAI if you want to compare models or use OpenAI embeddings in the RAG examples later:
import os
# Set your API keys here - To get the API key visit https://asi1.ai/chat and login
os.environ["ASI_API_KEY"] = "your-api-key"
Let's start with a basic completion example using ASI:
from llama_index.llms.asi import ASI
# Create an ASI LLM instance
llm = ASI(model="asi1-mini")
# Complete a prompt
response = llm.complete("Who is Paul Graham? ")
print(response)
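Beyond its printed form, the returned CompletionResponse also exposes the generated text and the raw provider payload (attribute names as defined by llama-index-core):
# Inspect the CompletionResponse fields directly
print(response.text)  # the generated text only
print(response.raw)  # raw provider response, handy for debugging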
Now let's try chat functionality:
from llama_index.core.base.llms.types import ChatMessage
# Create messages
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
# Get chat response
chat_response = llm.chat(messages)
print(chat_response)
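The ChatResponse wraps a ChatMessage, so you can pull out the role and content directly:
# Access the underlying ChatMessage
print(chat_response.message.role)
print(chat_response.message.content)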
ASI supports streaming for both chat and completion responses, via the stream_chat and stream_complete endpoints.
Using the stream_chat endpoint
from llama_index.core.llms import ChatMessage
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
Using the stream_complete endpoint
resp = llm.stream_complete("Paul Graham is ")
for r in resp:
    print(r.delta, end="")
ASI supports images in the input of chat messages for many models.
Using the content blocks feature of chat messages, you can easily combine text and images in a single LLM prompt.
!wget https://cdn.pixabay.com/photo/2016/07/07/16/46/dice-1502706_640.jpg -O image.png
from llama_index.core.llms import ChatMessage, TextBlock, ImageBlock
from llama_index.llms.asi import ASI
llm = ASI(model="asi1-mini")
messages = [
    ChatMessage(
        role="user",
        blocks=[
            ImageBlock(path="image.png"),
            TextBlock(text="Describe the image in a few sentences."),
        ],
    )
]
resp = llm.chat(messages)
print(resp.message.content)
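You can also reference a remote image directly instead of downloading it first (a small variation, assuming your llama-index-core version supports ImageBlock's url field):
messages = [
    ChatMessage(
        role="user",
        blocks=[
            ImageBlock(
                url="https://cdn.pixabay.com/photo/2016/07/07/16/46/dice-1502706_640.jpg"
            ),
            TextBlock(text="Describe the image in a few sentences."),
        ],
    )
]

resp = llm.chat(messages)
print(resp.message.content)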
ASI LLMs have native support for function calling. This integrates conveniently with LlamaIndex tool abstractions, letting you plug any arbitrary Python function into the LLM.
In the example below, we define a function to generate a Song object.
from pydantic import BaseModel
from llama_index.core.tools import FunctionTool
from llama_index.llms.asi import ASI
class Song(BaseModel):
    """A song with name and artist"""

    name: str
    artist: str


def generate_song(name: str, artist: str) -> Song:
    """Generates a song with provided name and artist."""
    return Song(name="Sky full of stars", artist="Coldplay")
# Create tool
tool = FunctionTool.from_defaults(fn=generate_song)
The strict parameter tells ASI whether or not to use constrained sampling when generating tool calls/structured outputs. This means that the generated tool call schema will always contain the expected fields.
Since this can increase latency, it defaults to False.
from llama_index.llms.asi import ASI
# Create an ASI LLM instance
llm = ASI(model="asi1-mini", strict=True)
response = llm.predict_and_call(
    [tool],
    "Pick a random song for me",
    # strict=True  # can also be set per call to override the instance setting
)
print(str(response))
llm = ASI(model="asi1-mini")
response = llm.predict_and_call(
    [tool],
    "Generate five songs from the Beatles",
    allow_parallel_tool_calls=True,
)

for s in response.sources:
    print(f"Name: {s.tool_name}, Input: {s.raw_input}, Output: {str(s)}")
While automatic tool calling with predict_and_call provides a streamlined experience, manual tool calling gives you more control over the process: you can inspect each tool call, execute the tools yourself, and decide how results are added back into the chat history.
ASI supports manual tool calling, but requires more specific prompting compared to some other LLMs. For best results with ASI, include a system message that explains the available tools and provide specific parameters in your user prompt.
The following example demonstrates manual tool calling with ASI to generate a song:
from pydantic import BaseModel
from llama_index.core.tools import FunctionTool
from llama_index.core.llms import ChatMessage
class Song(BaseModel):
    """A song with name and artist"""

    name: str
    artist: str


def generate_song(name: str, artist: str) -> Song:
    """Generates a song with provided name and artist."""
    return Song(name=name, artist=artist)
# Create tool
tool = FunctionTool.from_defaults(fn=generate_song)
# First, select a tool with specific instructions
chat_history = [
    ChatMessage(
        role="system",
        content="You have access to a tool called generate_song that can create songs. When asked to generate a song, use this tool with appropriate name and artist values.",
    ),
    ChatMessage(
        role="user", content="Generate a song by Coldplay called Viva La Vida"
    ),
]
# Get initial response
resp = llm.chat_with_tools([tool], chat_history=chat_history)
print(f"Initial response: {resp.message.content}")
# Check for tool calls
tool_calls = llm.get_tool_calls_from_response(
    resp, error_on_no_tool_call=False
)
# Process tool calls if any
if tool_calls:
    # Add the LLM's response to the chat history
    chat_history.append(resp.message)

    for tool_call in tool_calls:
        tool_name = tool_call.tool_name
        tool_kwargs = tool_call.tool_kwargs
        print(f"Calling {tool_name} with {tool_kwargs}")

        tool_output = tool(**tool_kwargs)
        print(f"Tool output: {tool_output}")

        # Add tool response to chat history
        chat_history.append(
            ChatMessage(
                role="tool",
                content=str(tool_output),
                additional_kwargs={"tool_call_id": tool_call.tool_id},
            )
        )

    # Get final response
    resp = llm.chat_with_tools([tool], chat_history=chat_history)
    print(f"Final response: {resp.message.content}")
else:
    print("No tool calls detected in the response.")
You can use ASI to extract structured data from text:
from llama_index.core.prompts import PromptTemplate
from pydantic import BaseModel
from typing import List
class MenuItem(BaseModel):
    """A menu item in a restaurant."""

    course_name: str
    is_vegetarian: bool


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str
    menu_items: List[MenuItem]


# Create prompt template
prompt_tmpl = PromptTemplate(
    "Generate a restaurant in a given city {city_name}"
)
# Option 1: Use structured_predict
restaurant_obj = llm.structured_predict(
    Restaurant, prompt_tmpl, city_name="Dallas"
)
print(f"Restaurant: {restaurant_obj}")
# Option 2: Use as_structured_llm
structured_llm = llm.as_structured_llm(Restaurant)
restaurant_obj2 = structured_llm.complete(
    prompt_tmpl.format(city_name="Miami")
).raw
print(f"Restaurant: {restaurant_obj2}")
Note: Structured streaming is currently not supported with ASI.
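If you want to verify this yourself, LlamaIndex LLMs expose a stream_structured_predict method; with ASI it is expected to fail or yield nothing, so guard the call (a hedged sketch, assuming your llama-index-core version provides this method):
# Structured streaming is not supported by ASI; guard the attempt
try:
    for partial in llm.stream_structured_predict(
        Restaurant, prompt_tmpl, city_name="Boston"
    ):
        print(partial)
except Exception as e:
    print(f"Structured streaming unavailable: {e}")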
ASI supports async operations:
from llama_index.llms.asi import ASI
# Create an ASI LLM instance
llm = ASI(model="asi1-mini")
resp = await llm.acomplete("who is Paul Graham")
print(resp)
resp = await llm.astream_complete("Paul Graham is ")
async for delta in resp:
    print(delta.delta, end="")
import asyncio
import nest_asyncio
# Enable nest_asyncio for Jupyter notebooks
nest_asyncio.apply()
async def test_async():
    # Async completion
    resp = await llm.acomplete("Paul Graham is ")
    print(f"Async completion: {resp}")

    # Async chat
    resp = await llm.achat(messages)
    print(f"Async chat: {resp}")

    # Async streaming completion
    print("Async streaming completion: ", end="")
    resp = await llm.astream_complete("Paul Graham is ")
    async for delta in resp:
        print(delta.delta, end="")
    print()

    # Async streaming chat
    print("Async streaming chat: ", end="")
    resp = await llm.astream_chat(messages)
    async for delta in resp:
        print(delta.delta, end="")
    print()


# Run async tests
asyncio.run(test_async())
Let's implement a simple RAG application with ASI:
%pip install llama-index-embeddings-openai
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding
os.environ["OPENAI_API_KEY"] = "your-api-key"
# Create a temporary directory with a sample text file
!mkdir -p temp_data
!echo "Paul Graham is a programmer, writer, and investor. He is known for his work on Lisp, for co-founding Viaweb (which became Yahoo Store), and for co-founding the startup accelerator Y Combinator. He is also known for his essays on his website. He studied at HolaHola High school" > temp_data/paul_graham.txt
# Load documents
documents = SimpleDirectoryReader("temp_data").load_data()
llm = ASI(model="asi1-mini")
# Create an index (using OpenAI for embeddings)
index = VectorStoreIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(),
)

# Create a query engine with ASI as the LLM for generation
query_engine = index.as_query_engine(llm=llm)
# Query the index
response = query_engine.query("Where did Paul Graham study?")
print(response)
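Instead of passing the LLM and embedding model explicitly, you can also set them globally with Settings, so every index and query engine picks them up by default:
from llama_index.core import Settings

# Make ASI the default LLM and OpenAI the default embedding model
Settings.llm = ASI(model="asi1-mini")
Settings.embed_model = OpenAIEmbedding()

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()  # uses Settings.llm
print(query_engine.query("Where did Paul Graham study?"))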
If you have a LlamaCloud account, you can use ASI with LlamaCloud for RAG:
# Install required packages
%pip install llama-cloud-services
import os
from llama_cloud_services import LlamaCloudIndex
from llama_index.llms.asi import ASI
# Set your LlamaCloud API key
os.environ["LLAMA_CLOUD_API_KEY"] = "your-key"
os.environ["OPENAI_API_KEY"] = "your-key"
# Connect to an existing LlamaCloud index
try:
    # Connect to the index
    index = LlamaCloudIndex(
        name="your-index-name",
        project_name="Default",
        organization_id="your-id",
        api_key=os.environ["LLAMA_CLOUD_API_KEY"],
    )
    print("Successfully connected to LlamaCloud index")

    # Create an ASI LLM
    llm = ASI(model="asi1-mini")

    # Create a retriever
    retriever = index.as_retriever()

    # Create a query engine with ASI
    query_engine = index.as_query_engine(llm=llm)

    # Test retriever
    query = "What is the revenue of Uber in 2021?"
    print(f"\nTesting retriever with query: {query}")
    nodes = retriever.retrieve(query)
    print(f"Retrieved {len(nodes)} nodes\n")

    # Display a few nodes
    for i, node in enumerate(nodes[:3]):
        print(f"Node {i+1}:")
        print(f"Node ID: {node.node_id}")
        print(f"Score: {node.score}")
        print(f"Text: {node.text[:200]}...\n")

    # Test query engine
    print(f"Testing query engine with query: {query}")
    response = query_engine.query(query)
    print(f"Response: {response}")
except Exception as e:
    print(f"Error: {e}")
If desired, you can have separate LLM instances use separate API keys:
from llama_index.llms.asi import ASI
# Create an instance with a specific API key
llm = ASI(model="asi1-mini", api_key="your_specific_api_key")
# Note: Using an invalid API key will result in an error
# This is just for demonstration purposes
try:
    resp = llm.complete("Paul Graham is ")
    print(resp)
except Exception as e:
    print(f"Error with invalid API key: {e}")
Rather than adding the same parameters to each chat or completion call, you can set them at a per-instance level with additional_kwargs:
from llama_index.llms.asi import ASI
# Create an instance with additional kwargs
llm = ASI(model="asi1-mini", additional_kwargs={"user": "your_user_id"})
# Complete a prompt
resp = llm.complete("Paul Graham is ")
print(resp)
from llama_index.core.base.llms.types import ChatMessage
# Create an instance with additional kwargs
llm = ASI(model="asi1-mini", additional_kwargs={"user": "your_user_id"})
# Create messages
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality"
    ),
    ChatMessage(role="user", content="What is your name"),
]
# Get chat response
resp = llm.chat(messages)
print(resp)
This notebook demonstrates the various ways you can use ASI with LlamaIndex. The integration supports most of the functionality available in LlamaIndex, including basic completion and chat, streaming, multimodal (image) inputs, automatic and manual function calling, structured prediction, async operations, and RAG (both local and with LlamaCloud).
Note that structured streaming is currently not supported with ASI.