This example demonstrates how to build a LangChain agent that performs secure data analysis using Daytona sandboxes. The agent uses the DaytonaDataAnalysisTool to execute Python code in an isolated environment, enabling automated data analysis workflows with natural language prompts.
In this example, the agent analyzes a vehicle valuations dataset to understand how vehicle prices vary by manufacturing year and generates a line chart showing average price per year.
[!TIP] It's recommended to use a virtual environment (venv or poetry) to isolate project dependencies.
To run this example, you need to set the following environment variables:
- DAYTONA_API_KEY: Required for access to Daytona sandboxes. Get it from Daytona Dashboard
- ANTHROPIC_API_KEY: Required for Claude AI model access. Get it from Anthropic Console

See the .env.example file for the exact structure. Copy .env.example to .env and fill in your API keys before running.
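A small pre-flight check can catch a missing key before any sandbox is created. The sketch below is only an illustration and is not part of data_analysis.py; the helper name `missing_env_vars` is hypothetical. It assumes the variables are already present in the process environment, for example after python-dotenv's `load_dotenv()` has read your .env file.

```python
import os

# The two variables this example requires.
REQUIRED_VARS = ("DAYTONA_API_KEY", "ANTHROPIC_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to test. (Hypothetical helper, not part of the example.)
    """
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


# Possible usage at the top of a script:
#   missing = missing_env_vars()
#   if missing:
#       raise SystemExit("Missing: " + ", ".join(missing))
```

Returning the list of missing names, rather than raising inside the helper, leaves the caller free to decide whether to exit, warn, or prompt.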
Before proceeding, complete the following steps:
Copy .env.example to .env and add your API keys.
Create and activate a virtual environment:
python3.10 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies:
pip install -U langchain langchain-anthropic langchain-daytona-data-analysis python-dotenv
Download the dataset:
curl -o dataset.csv https://download.daytona.io/dataset.csv
Or download it manually from https://download.daytona.io/dataset.csv and save it as dataset.csv.
Run the example:
python data_analysis.py
Analysis Prompt: The main prompt is configured in the agent.invoke() call inside data_analysis.py. You can modify this prompt to analyze different aspects of the data or try different visualization types.
Dataset Description: When uploading the dataset, provide a clear description of the columns and data cleaning instructions to help the agent understand how to process the data.
Result Handler: The process_data_analysis_result() function processes execution artifacts. You can customize this to handle different output types (charts, tables, logs, etc.).
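As one illustration of customizing the result handler, the sketch below prints the captured stdout and writes any charts to disk. The attribute names (`stdout`, `charts`, and a base64-encoded `png` on each chart) and the `output_dir` parameter are assumptions made for this sketch; check the artifacts type shipped with the package for the actual fields before relying on them.

```python
import base64
from pathlib import Path


def process_data_analysis_result(result, output_dir="."):
    """Print captured stdout and save each PNG chart as chart-<n>.png.

    Assumption: `result.stdout` is a string and every item in
    `result.charts` has a `png` attribute holding base64 data (or None).
    Returns the list of paths that were written.
    """
    print("Result stdout", result.stdout)
    saved = []
    for chart in getattr(result, "charts", None) or []:
        png = getattr(chart, "png", None)
        if not png:
            continue  # skip charts that carry no PNG payload
        path = Path(output_dir) / f"chart-{len(saved)}.png"
        path.write_bytes(base64.b64decode(png))
        print(f"Chart saved to {path}")
        saved.append(path)
    return saved
```

The same loop is a natural place to branch on other output types, for example writing tables to CSV or appending logs to a file.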
When you run the example, the agent follows this workflow:
You provide the data and describe what insights you need - the agent handles the rest.
When the agent completes the analysis, you'll see output like:
Result stdout Original dataset shape: (100000, 15)
After removing missing values: (100000, 15)
After removing non-numeric values: (99946, 15)
After removing year outliers: (96598, 15)
After removing price outliers: (90095, 15)
Cleaned data summary:
year price_in_euro
count 90095.000000 90095.000000
mean 2016.698563 22422.266707
std 4.457647 12964.727116
min 2005.000000 150.000000
25% 2014.000000 12980.000000
50% 2018.000000 19900.000000
75% 2020.000000 29500.000000
max 2023.000000 62090.000000
Average price by year:
year
2005.0 5968.124319
2006.0 6870.881523
2007.0 8015.234473
2008.0 8788.644495
2009.0 8406.198576
2010.0 10378.815972
2011.0 11540.640435
2012.0 13306.642261
2013.0 14512.707025
2014.0 15997.682899
2015.0 18563.864358
2016.0 20124.556294
2017.0 22268.083322
2018.0 24241.123673
2019.0 26757.469111
2020.0 29400.163494
2021.0 30720.168646
2022.0 33861.717552
2023.0 33119.840175
Name: price_in_euro, dtype: float64
Total number of vehicles analyzed: 90095
Year range: 2005 - 2023
Price range: €150.00 - €62090.00
Overall average price: €22422.27
Chart saved to chart-0.png
The agent generates a line chart showing that average vehicle prices rose from €5,968 in 2005 to €33,862 in 2022, apart from a small dip in 2009, and then fell slightly to €33,120 in 2023. The chart is saved as chart-0.png in your project directory.
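To double-check the trend against the numbers, you can scan the per-year averages for declines. The snippet below is a standalone sketch, not code produced by the agent: the values are copied from the sample output above and rounded to two decimals, and the function name `years_with_decline` is hypothetical.

```python
# Average price per year, copied from the sample output (rounded to 2 decimals).
avg_price = {
    2005: 5968.12, 2006: 6870.88, 2007: 8015.23, 2008: 8788.64,
    2009: 8406.20, 2010: 10378.82, 2011: 11540.64, 2012: 13306.64,
    2013: 14512.71, 2014: 15997.68, 2015: 18563.86, 2016: 20124.56,
    2017: 22268.08, 2018: 24241.12, 2019: 26757.47, 2020: 29400.16,
    2021: 30720.17, 2022: 33861.72, 2023: 33119.84,
}


def years_with_decline(series):
    """Return the years whose average is lower than the previous year's."""
    years = sorted(series)
    return [y for prev, y in zip(years, years[1:]) if series[y] < series[prev]]


print(years_with_decline(avg_price))  # [2009, 2023]
```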
The DaytonaDataAnalysisTool provides these key methods:
def download_file(remote_path: str) -> bytes
Downloads a file from the sandbox by its remote path.
def upload_file(file: IO, description: str) -> SandboxUploadedFile
Uploads a file to the sandbox with a description of its structure and contents.
def install_python_packages(package_names: str | list[str]) -> None
Installs Python packages in the sandbox using pip.
def close() -> None
Closes and deletes the sandbox environment. Always call this when finished to clean up resources.
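Because close() deletes the sandbox, it is worth guaranteeing that it runs even when the analysis fails. The sketch below shows one way to do that with try/finally, using only the upload_file and close methods listed above. Here `tool` stands for an already constructed DaytonaDataAnalysisTool; the wrapper name `run_with_cleanup` and the `analyze` callback are hypothetical.

```python
def run_with_cleanup(tool, dataset_path, description, analyze):
    """Upload a dataset, run `analyze(tool)`, and always close the sandbox.

    `tool` is expected to expose upload_file(file, description) and close(),
    as described above. `analyze` is any callable that drives the analysis,
    for example a function that invokes the agent (hypothetical callback).
    """
    try:
        # Open in binary mode so the file is uploaded byte-for-byte.
        with open(dataset_path, "rb") as f:
            tool.upload_file(f, description=description)
        return analyze(tool)
    finally:
        # Runs on success and on error, so the sandbox is not left behind.
        tool.close()
```

Taking the tool as a parameter keeps the wrapper independent of how the tool is constructed and lets you exercise the cleanup path with a stub object.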
For the complete API reference and additional methods, see the documentation.
See the main project LICENSE file for details.