# PandasAI v3 Migration Guide
## Configuration

Configuration is now global via `pai.config.set()` instead of per-dataframe. Several options have been removed:

**Removed:** `save_charts`, `enable_cache`, `security`, `custom_whitelisted_dependencies`, `save_charts_path`, `custom_head`
**v2:**

```python
from pandasai import SmartDataframe

config = {
    "llm": llm,
    "save_charts": True,
    "enable_cache": True,
    "security": "standard"
}
df = SmartDataframe(data, config=config)
```
**v3:**

```python
import pandasai as pai

pai.config.set({
    "llm": llm,
    "save_logs": True,
    "verbose": False,
    "max_retries": 3
})
df = pai.DataFrame(data)
```
**Key Changes:**

- Configuration is set once globally with `pai.config.set()`
- Charts are returned as `ChartResponse` objects for manual handling

More details: See the config docs for configuration examples and more details.
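When porting an existing v2 config dict, it can help to strip the removed options programmatically. A minimal sketch (this helper is not part of pandasai; the key names come from the removed-options list above):

```python
# Options that existed in v2 but were removed in v3 (from the list above).
REMOVED_V2_OPTIONS = {
    "save_charts", "enable_cache", "security",
    "custom_whitelisted_dependencies", "save_charts_path", "custom_head",
}

def migrate_config(v2_config: dict) -> dict:
    """Return a copy of a v2 config dict with removed options stripped."""
    return {k: v for k, v in v2_config.items() if k not in REMOVED_V2_OPTIONS}

v2_config = {"llm": "llm-placeholder", "save_charts": True, "enable_cache": True}
v3_config = migrate_config(v2_config)
print(v3_config)  # only the "llm" key survives
```

The surviving dict can then be passed to `pai.config.set()` as shown above.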
## LLMs

LLMs are now extension-based. Install `pandasai-litellm` separately for unified access to 100+ models.
**v2:**

```python
from pandasai.llm import OpenAI
from pandasai import SmartDataframe

llm = OpenAI(api_token="your-api-key")
df = SmartDataframe(data, config={"llm": llm})
```
**v3:**

```bash
pip install pandasai-litellm
```

```python
import pandasai as pai
from pandasai_litellm.litellm import LiteLLM

llm = LiteLLM(model="gpt-4o-mini", api_key="your-api-key")
pai.config.set({"llm": llm})
df = pai.DataFrame(data)
```
**Key Changes:**

- Install `pandasai-litellm` for a unified LLM interface
- Requires both `pandasai` and `pandasai-litellm` packages

More details: See Large Language Models for supported models and configuration.
## Connectors

Connectors are now separate extensions. Install only what you need. Cloud connectors require an enterprise license.
**v2:**

```python
from pandasai.connectors import PostgreSQLConnector
from pandasai import SmartDataframe

connector = PostgreSQLConnector(config={
    "host": "localhost",
    "database": "mydb",
    "table": "sales"
})
df = SmartDataframe(connector)
```
**v3:**

```bash
pip install pandasai-sql[postgres]
```

```python
import pandasai as pai

df = pai.create(
    path="company/sales",
    description="Sales data from PostgreSQL",
    source={
        "type": "postgres",
        "connection": {
            "host": "localhost",
            "database": "mydb",
            "user": "${DB_USER}",
            "password": "${DB_PASSWORD}"
        },
        "table": "sales"
    }
)
```
**Key Changes:**

- Install only the connectors you need: `pandasai-sql[postgres]`, `pandasai-sql[mysql]`, etc.
- Use `pai.create()` with the semantic layer instead of connector classes
- Credentials can reference environment variables such as `${DB_USER}`

More details: See Data Ingestion for connector setup and configuration.
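The `${DB_USER}`-style placeholders follow the common shell-variable convention. As an illustration of the pattern (this helper is not pandasai code), Python's standard library can expand such placeholders against the process environment:

```python
import os
from string import Template

os.environ["DB_USER"] = "readonly"  # example value for demonstration

connection = {"host": "localhost", "user": "${DB_USER}"}

# Expand ${VAR} placeholders from the environment; keys without
# placeholders pass through unchanged.
resolved = {k: Template(v).safe_substitute(os.environ) for k, v in connection.items()}
print(resolved["user"])  # -> readonly
```

Keeping credentials in environment variables keeps them out of the semantic-layer definition files.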
## Skills

Skills now use the `@pai.skill` decorator and are automatically registered globally.
**v2:**

```python
from pandasai.skills import skill
from pandasai import Agent

@skill
def calculate_bonus(salary: float, performance: float) -> float:
    """Calculate employee bonus."""
    if performance >= 90:
        return salary * 0.15
    return salary * 0.10

agent = Agent([df])
agent.add_skills(calculate_bonus)
```
**v3:**

```python
import pandasai as pai
from pandasai import Agent

@pai.skill
def calculate_bonus(salary: float, performance: float) -> float:
    """Calculate employee bonus."""
    if performance >= 90:
        return salary * 0.15
    return salary * 0.10

# Skills are automatically available - no need to add them
agent = Agent([df])
```
**Key Changes:**

- Use `@pai.skill` instead of `@skill`
- No need to call `agent.add_skills()`; skills are registered globally
- Skills are available to `pai.chat()`, `SmartDataframe`, and `Agent`

More details: See Skills for detailed usage and examples.
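The decorated function remains a plain Python callable, so its logic can be sanity-checked without any LLM involved (the decorator is omitted here so the snippet runs standalone):

```python
def calculate_bonus(salary: float, performance: float) -> float:
    """15% bonus at or above a performance score of 90, else 10%."""
    if performance >= 90:
        return salary * 0.15
    return salary * 0.10

print(calculate_bonus(50000, 95))  # -> 7500.0
print(calculate_bonus(50000, 80))  # -> 5000.0
```

Unit-testing skills this way catches logic errors before they surface inside generated code.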
## Agent

The `Agent` class works mostly the same, but some methods have been removed in v3.

**Removed methods:** `clarification_questions()`, `rephrase_query()`, `explain()`
**v2:**

```python
from pandasai import Agent

agent = Agent(df)
clarifications = agent.clarification_questions('What is the GDP?')
rephrased = agent.rephrase_query('What is the GDP?')
explanation = agent.explain()
```
**v3:**

```python
from pandasai import Agent

agent = Agent(df)
# ❌ clarification_questions(), rephrase_query(), and explain() are removed in v3
# Use chat() and follow_up() instead
response = agent.chat('What is the GDP?')
follow_up = agent.follow_up('What about last year?')  # New: maintains context
```
**Key Changes:**

- `clarification_questions()`, `rephrase_query()`, and `explain()` have been removed
- The `follow_up()` method maintains conversation context

## Training

Training is now available through local vector stores (ChromaDB, Qdrant, Pinecone, LanceDB) for few-shot learning. The `train()` method is still available but requires a vector store.
**v2:**

```python
from pandasai import Agent

agent = Agent(df)
agent.train(queries=["query"], codes=["code"])
```
**v3:**

```python
from pandasai import Agent
from pandasai.ee.vectorstores import ChromaDB

# Instantiate with a vector store
vector_store = ChromaDB()
agent = Agent(df, vectorstore=vector_store)

# Train with the vector store
agent.train(queries=["query"], codes=["code"])
```
**Key Changes:**

- `train()` now requires a vector store instance passed to the `Agent`

More details: See Training the Agent for setup and examples.
## Installation

```bash
# Using pip
pip install pandasai pandasai-litellm

# Using poetry
poetry add pandasai pandasai-litellm

# For SQL connectors
pip install pandasai-sql[postgres]  # or mysql, sqlite, etc.
```
```python
# v2 imports
from pandasai import SmartDataframe, SmartDatalake, Agent
from pandasai.llm import OpenAI
from pandasai.skills import skill
from pandasai.connectors import PostgreSQLConnector

# v3 imports
import pandasai as pai
from pandasai import Agent
from pandasai_litellm.litellm import LiteLLM
```
```python
from pandasai_litellm.litellm import LiteLLM
import pandasai as pai

llm = LiteLLM(model="gpt-4o-mini", api_key="your-api-key")
pai.config.set({
    "llm": llm,
    "verbose": False,
    "save_logs": True,
    "max_retries": 3
})
```
Check the Backwards Compatibility section for details on the differences between `SmartDataframe`, `SmartDatalake`, and the new semantic DataFrames (`pai.DataFrame`), so you can decide whether to migrate.
**Option A: Keep SmartDataframe (backward compatible)**

```python
from pandasai import SmartDataframe

df = SmartDataframe(your_data)
response = df.chat("Your question")
```
**Option B: Use pai.DataFrame (recommended)**

```python
import pandasai as pai

# Simple approach
df = pai.DataFrame(your_data)
response = df.chat("Your question")

# With semantic layer (best for production)
df = pai.create(
    path="company/sales-data",
    df=your_data,
    description="Sales data by country and region",
    columns={
        "country": {"type": "string", "description": "Country name"},
        "sales": {"type": "float", "description": "Sales amount in USD"}
    }
)
response = df.chat("Your question")
```
**Multiple DataFrames:**

```python
# v2 style (still works)
from pandasai import SmartDatalake
lake = SmartDatalake([df1, df2])

# v3 recommended
import pandasai as pai
df1 = pai.DataFrame(data1)
df2 = pai.DataFrame(data2)
response = pai.chat("Your question", df1, df2)
```
```python
# v2
from pandasai.connectors import PostgreSQLConnector
connector = PostgreSQLConnector(config={...})
df = SmartDataframe(connector)

# v3
import pandasai as pai
df = pai.create(
    path="company/database-table",
    description="Description of your data",
    source={
        "type": "postgres",
        "connection": {
            "host": "localhost",
            "database": "mydb",
            "user": "${DB_USER}",
            "password": "${DB_PASSWORD}"
        },
        "table": "your_table"
    }
)
```
```python
# v2
from pandasai.skills import skill

@skill
def calculate_metric(value: float) -> float:
    """Calculate custom metric."""
    return value * 1.5

agent.add_skills(calculate_metric)

# v3
import pandasai as pai

@pai.skill
def calculate_metric(value: float) -> float:
    """Calculate custom metric."""
    return value * 1.5

# Skills automatically available
```
```python
# Remove: save_charts, enable_cache, security,
# custom_whitelisted_dependencies, save_charts_path

# v3 (keep only these)
pai.config.set({
    "llm": llm,
    "save_logs": True,
    "verbose": False,
    "max_retries": 3
})
```
Test your migration with these examples:
**Basic chat:**

```python
import pandasai as pai
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
df = pai.DataFrame(df)
response = df.chat("What is the sum of x?")
print(response)
```
**Multiple DataFrames:**

```python
df1 = pai.DataFrame({"sales": [100, 200, 300]})
df2 = pai.DataFrame({"costs": [50, 100, 150]})
response = pai.chat("What is the total profit?", df1, df2)
print(response)
```
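For reference, the expected answer to this check can be verified with plain pandas; the sketch below is just the arithmetic the question implies, not the code pandasai generates:

```python
import pandas as pd

df1 = pd.DataFrame({"sales": [100, 200, 300]})
df2 = pd.DataFrame({"costs": [50, 100, 150]})

# Row-wise profit, then the total: (100-50) + (200-100) + (300-150).
total_profit = (df1["sales"] - df2["costs"]).sum()
print(total_profit)  # -> 300
```

If the chat response does not match this value, inspect the generated code in the logs.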
**Skills:**

```python
@pai.skill
def test_skill(x: int) -> int:
    """Double the value."""
    return x * 2

df = pai.DataFrame({"values": [1, 2, 3]})
response = df.chat("Double the first value")
print(response)
```