
MultiOn Demo

llama-index-integrations/tools/llama-index-tools-multion/examples/multion.ipynb


[Open in Colab](https://colab.research.google.com/github/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-multion/examples/multion.ipynb)

This notebook walks through an example of using LlamaIndex with MultiOn to browse the web on the user's behalf.

First, we set up our OpenAI key and import the FunctionAgent that will control the MultiOn session:

```python
# Set up OpenAI
import os

os.environ["OPENAI_API_KEY"] = "sk-your-key"

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
```

We then import the MultiOn tool and initialize it with our API key:

```python
# Set up MultiOn tool
from llama_index.tools.multion import MultionToolSpec

multion_tool = MultionToolSpec(api_key="your-multion-key")
```

To support the MultiOn browsing session, we will also give our LlamaIndex agent a tool to search and summarize a user's Gmail inbox. We set up that tool below. For more information on the Gmail tool, see the Gmail notebook here.

We will use this tool later on to allow the agent to gain more context around our emails.

```python
# Import and initialize our tool spec
from llama_index.tools.google import GmailToolSpec
from llama_index.core.tools.ondemand_loader_tool import OnDemandLoaderTool

# Initialize the Gmail tool to search our inbox
gmail_tool = GmailToolSpec()

# Wrap the tool so we don't overflow the main Agent's context window
gmail_loader_tool = OnDemandLoaderTool.from_tool(
    gmail_tool.to_tool_list()[1],
    name="gmail_search",
    description="""
        This tool allows you to search the user's Gmail inbox and give directions for how to summarize or process the emails.

        You must always provide a query to filter the emails, as well as a query_str to process the retrieved emails.
        All parameters are required.

        If you need to reply to an email, ask this tool to build the reply directly.
        Examples:
            query='from:adam subject:dinner', max_results=5, query_str='Where are adams favourite places to eat'
            query='dentist appointment', max_results=1, query_str='When is the next dentist appointment'
            query='to:jerry', max_results=1, query_str='summarize and then create a response email to jerrys latest email'
            query='is:inbox', max_results=5, query_str='Summarize these emails'
        """,
)
```
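The point of the on-demand wrapper above is worth spelling out: rather than handing the agent raw search results (which could overflow its context window), the tool loads data only when called, processes it against `query_str`, and returns just a short answer. A minimal sketch of that pattern in plain Python — with no LlamaIndex dependency, and with entirely hypothetical stand-in functions in place of the real Gmail search and LLM summarizer:

```python
# Sketch of the on-demand loader pattern that OnDemandLoaderTool implements.
# fake_gmail_search and summarize are illustrative stand-ins, not real APIs.

def fake_gmail_search(query: str, max_results: int) -> list[str]:
    """Stand-in for a Gmail search: returns raw matching email bodies."""
    inbox = [
        "from:adam subject:dinner  Let's try the new taco place on Friday.",
        "from:dentist  Reminder: your appointment is on June 3 at 10am.",
        "from:jerry  Can you confirm the meeting details?",
    ]
    # Treat each space-separated term as a filter (ignoring any field prefix)
    return [
        msg
        for msg in inbox
        if all(term.split(":")[-1] in msg for term in query.split())
    ][:max_results]

def summarize(texts: list[str], query_str: str) -> str:
    """Stand-in for an LLM pass over the retrieved documents."""
    return f"Answer to {query_str!r} based on {len(texts)} email(s)."

def on_demand_tool(query: str, max_results: int, query_str: str) -> str:
    # Load only when the tool is invoked, then compress to a short answer
    # instead of returning the raw documents to the agent.
    docs = fake_gmail_search(query, max_results)
    return summarize(docs, query_str)
```

This is why the tool description insists that both `query` and `query_str` are always required: the first controls what gets loaded, the second controls what the agent actually receives back.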
```python
# Initialize our agent with the MultiOn and Gmail loader tools
agent = FunctionAgent(
    tools=[*multion_tool.to_tool_list(), gmail_loader_tool],
    system_prompt="""
    You are an AI agent that assists the user in crafting email responses based on previous conversations.

    The gmail_search tool connects directly to an API to search and retrieve emails, and answer questions based on the content.
    The browse tool allows you to control a web browser with natural language to complete arbitrary actions on the web.

    Use these two tools together to gain context on past emails and respond to conversations for the user.
    """,
    llm=OpenAI(model="gpt-4.1"),
)

# Context to store chat history across runs
from llama_index.core.workflow import Context

ctx = Context(agent)
```

Our agent is now set up and ready to browse the web!

```python
print(
    await agent.run(
        "browse to the latest email from Julian and open the email", ctx=ctx
    )
)
```
```python
print(
    await agent.run(
        "Summarize the email chain with julian and create a response to the last email"
        " that confirms all the details",
        ctx=ctx,
    )
)
```
```python
print(
    await agent.run(
        "pass the entire generated email to the browser and have it send the email as a"
        " reply to the chain",
        ctx=ctx,
    )
)
```