docs/integrations/block-integrations/data.md

# Data

<!-- MANUAL: file_description -->

Blocks for creating, reading, and manipulating data structures including lists, dictionaries, spreadsheets, and persistent storage.

<!-- END MANUAL -->

## Create Dictionary

### What it is

Creates a dictionary with the specified key-value pairs. Use this when you know all the values you want to add upfront.

### How it works

<!-- MANUAL: how_it_works -->

This block creates a new dictionary from specified key-value pairs in a single operation. It's designed for cases where you know all the data upfront, rather than building the dictionary incrementally.

The block takes a dictionary input and outputs it as-is, making it useful as a starting point for workflows that need to pass structured data between blocks.

<!-- END MANUAL -->

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| values | Key-value pairs to create the dictionary with | Dict[str, Any] | Yes |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if dictionary creation failed | str |
| dictionary | The created dictionary containing the specified key-value pairs | Dict[str, Any] |

### Possible use case

<!-- MANUAL: use_case -->

API Request Payloads: Create complete request body objects with all required fields before sending to an API.

Configuration Objects: Build settings dictionaries with predefined values for initializing services or workflows.

Data Mapping: Transform input data into a structured format with specific keys expected by downstream blocks.

<!-- END MANUAL -->

## Create List

### What it is

Creates a list with the specified values. Use this when you know all the values you want to add upfront. This block can also yield the list in batches based on a maximum size or token limit.

### How it works

<!-- MANUAL: how_it_works -->

This block creates a list from provided values and can optionally chunk it into smaller batches. When max_size is set, the list is yielded in chunks of that size. When max_tokens is set, chunks are sized to fit within token limits for LLM processing.

This batching capability is particularly useful when processing large datasets that need to be split for API limits or memory constraints.

<!-- END MANUAL -->
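The size-based batching described above can be sketched as follows. This is an illustrative helper, not the block's actual implementation or API; `chunk_list` and its parameter names are assumptions chosen to mirror the `max_size` input.

```python
from typing import Any, Iterator, List

def chunk_list(values: List[Any], max_size: int = 0) -> Iterator[List[Any]]:
    """Yield `values` whole, or in chunks of at most `max_size` items."""
    if max_size <= 0:
        # No limit set: emit the full list in one piece.
        yield values
        return
    # Limit set: step through the list in max_size-sized windows.
    for i in range(0, len(values), max_size):
        yield values[i : i + max_size]

batches = list(chunk_list([1, 2, 3, 4, 5], max_size=2))
# batches == [[1, 2], [3, 4], [5]]
```

Token-based chunking (`max_tokens`) would follow the same pattern, but sized by a tokenizer's count per item rather than item count.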

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| values | A list of values to be combined into a new list | List[Any] | Yes |
| max_size | Maximum size of the list. If provided, the list will be yielded in chunks of this size | int | No |
| max_tokens | Maximum tokens for the list. If provided, the list will be yielded in chunks that fit within this token limit | int | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| list | The created list containing the specified values | List[Any] |

### Possible use case

<!-- MANUAL: use_case -->

Batch Processing: Split large datasets into manageable chunks for API calls with rate limits.

LLM Token Management: Divide text content into token-limited batches for processing by language models.

Parallel Processing: Create batches of work items that can be processed concurrently by multiple blocks.

<!-- END MANUAL -->

## File Read

### What it is

Reads a file and returns its content as a string, with optional chunking by delimiter and size limits.

### How it works

<!-- MANUAL: how_it_works -->

This block reads file content from various sources (URL, data URI, or local path) and returns it as a string. It supports chunking via delimiter (like newlines) or size limits, yielding content in manageable pieces.

Use skip_rows and skip_size to skip header content or initial bytes. When delimiter and limits are set, content is yielded chunk by chunk, enabling processing of large files without loading everything into memory.

<!-- END MANUAL -->
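The delimiter-based chunking with `skip_rows` and `row_limit` can be illustrated with a small sketch. This mimics the described behavior on an in-memory string; the real block streams from a file source, and `read_rows` is a hypothetical name.

```python
from typing import Iterator

def read_rows(text: str, delimiter: str = "\n",
              skip_rows: int = 0, row_limit: int = 0) -> Iterator[str]:
    """Split `text` on `delimiter`, skipping initial rows and capping the count."""
    rows = text.split(delimiter)[skip_rows:]
    if row_limit > 0:
        rows = rows[:row_limit]
    for row in rows:
        # Yield one chunk at a time, as the block does for large files.
        yield row

log = "timestamp,level,msg\nline1\nline2\nline3"
rows = list(read_rows(log, skip_rows=1, row_limit=2))
# rows == ["line1", "line2"]
```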

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| file_input | The file to read from (URL, data URI, or local path) | str (file) | Yes |
| delimiter | Delimiter to split the content into rows/chunks (e.g., `'\n'` for lines) | str | No |
| size_limit | Maximum size in bytes per chunk to yield (0 for no limit) | int | No |
| row_limit | Maximum number of rows to process (0 for no limit, requires delimiter) | int | No |
| skip_size | Number of characters to skip from the beginning of the file | int | No |
| skip_rows | Number of rows to skip from the beginning (requires delimiter) | int | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| content | File content, yielded as individual chunks when delimiter or size limits are applied | str |

### Possible use case

<!-- MANUAL: use_case -->

Log File Processing: Read and process log files line by line, filtering or transforming each entry.

Large Document Analysis: Read large text files in chunks for summarization or analysis without memory issues.

Data Import: Read text-based data files and process them row by row for database import.

<!-- END MANUAL -->

## Persist Information

### What it is

Persists key-value information for the current user.

### How it works

<!-- MANUAL: how_it_works -->

This block stores key-value data that persists across workflow runs. You can scope the persistence to either within_agent (available to all runs of this specific agent) or across_agents (available to all agents for this user).

The stored data remains available until explicitly overwritten, enabling state management and configuration persistence between workflow executions.

<!-- END MANUAL -->
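The scoping model can be sketched as a store keyed by a scope identifier plus the item key. Everything here is an illustrative assumption (the in-memory `store`, the `USER_ID`/`AGENT_ID` names, the `persist` helper); the real block persists server-side.

```python
# In-memory stand-in for the block's persistent backend.
store = {}
USER_ID, AGENT_ID = "user-1", "agent-A"

def persist(key: str, value, scope: str = "within_agent"):
    """Store `value` under `key`, scoped to this agent or to the whole user."""
    if scope == "within_agent":
        scope_key = f"{USER_ID}/{AGENT_ID}"   # visible only to this agent's runs
    else:
        scope_key = USER_ID                    # visible to all of the user's agents
    store[(scope_key, key)] = value
    return value  # the block echoes back the stored value

persist("last_id", 42)                             # within_agent (default)
persist("language", "en", scope="across_agents")   # shared across agents
```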

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| key | Key to store the information under | str | Yes |
| value | Value to store | Value | Yes |
| scope | Scope of persistence: `within_agent` (shared across all runs of this agent) or `across_agents` (shared across all agents for this user) | "within_agent" \| "across_agents" | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| value | Value that was stored | Value |

### Possible use case

<!-- MANUAL: use_case -->

User Preferences: Store user settings like preferred language or notification preferences for future runs.

Progress Tracking: Save the last processed item ID to resume batch processing where you left off.

API Token Caching: Store refreshed API tokens that can be reused across multiple workflow executions.

<!-- END MANUAL -->

## Read Spreadsheet

### What it is

Reads CSV and Excel files and outputs the data as a list of dictionaries and individual rows. Excel files are automatically converted to CSV format.

### How it works

<!-- MANUAL: how_it_works -->

This block parses CSV and Excel files, converting each row into a dictionary with column headers as keys. Excel files are automatically converted to CSV format before processing.

Configure delimiter, quote character, and escape character for proper CSV parsing. Use skip_rows to ignore headers or initial rows, and skip_columns to exclude unwanted columns from the output.

<!-- END MANUAL -->
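The row-to-dictionary conversion can be sketched with the standard library's `csv` module. Parameter names mirror the block's inputs, but this is a simplified stand-in, not the block's implementation (it omits Excel conversion, `quotechar`/`escapechar` handling, and streaming).

```python
import csv
import io

def read_spreadsheet(contents: str, delimiter: str = ",",
                     skip_rows: int = 0, strip: bool = True,
                     skip_columns: tuple = ()):
    """Parse CSV text into a list of dicts keyed by the header row."""
    lines = contents.splitlines()[skip_rows:]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter=delimiter)
    rows = []
    for row in reader:
        rows.append({
            k: (v.strip() if strip and isinstance(v, str) else v)
            for k, v in row.items()
            if k not in skip_columns  # drop unwanted columns
        })
    return rows

data = read_spreadsheet("name,qty\nwidget, 3\ngadget, 5")
# data == [{"name": "widget", "qty": "3"}, {"name": "gadget", "qty": "5"}]
```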

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| contents | The contents of the CSV/spreadsheet data to read | str | No |
| file_input | CSV or Excel file to read from (URL, data URI, or local path). Excel files are automatically converted to CSV | str (file) | No |
| delimiter | The delimiter used in the CSV/spreadsheet data | str | No |
| quotechar | The character used to quote fields | str | No |
| escapechar | The character used to escape the delimiter | str | No |
| has_header | Whether the CSV file has a header row | bool | No |
| skip_rows | The number of rows to skip from the start of the file | int | No |
| strip | Whether to strip whitespace from the values | bool | No |
| skip_columns | The columns to skip from the start of the row | List[str] | No |
| produce_singular_result | If True, yield individual 'row' outputs only (can be slow). If False, yield both 'rows' (all data) and individual 'row' outputs | bool | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| row | The data produced from each row in the spreadsheet | Dict[str, str] |
| rows | All the data in the spreadsheet as a list of rows | List[Dict[str, str]] |

### Possible use case

<!-- MANUAL: use_case -->

Data Import: Import product catalogs, contact lists, or inventory data from spreadsheet exports.

Report Processing: Parse generated CSV reports from other systems for analysis or transformation.

Bulk Operations: Process spreadsheets of email addresses, user records, or configuration data row by row.

<!-- END MANUAL -->

## Retrieve Information

### What it is

Retrieves key-value information for the current user.

### How it works

<!-- MANUAL: how_it_works -->

This block retrieves previously stored key-value data for the current user. Specify the key and scope to fetch the corresponding value. If the key doesn't exist, the default_value is returned.

Use within_agent scope for agent-specific data or across_agents for data shared across all user agents.

<!-- END MANUAL -->
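Retrieval with a fallback is effectively a scoped lookup with a default. As with the Persist sketch, the in-memory `store` and the names below are illustrative assumptions, not the block's real API.

```python
# Stand-in for previously persisted data, keyed by (scope identifier, key).
store = {("user-1/agent-A", "last_id"): 42}
USER_ID, AGENT_ID = "user-1", "agent-A"

def retrieve(key: str, scope: str = "within_agent", default=None):
    """Look up `key` in the given scope; return `default` if absent."""
    if scope == "within_agent":
        scope_key = f"{USER_ID}/{AGENT_ID}"
    else:
        scope_key = USER_ID
    return store.get((scope_key, key), default)

retrieve("last_id")             # -> 42
retrieve("missing", default=0)  # -> 0, the default_value when the key is absent
```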

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| key | Key to retrieve the information for | str | Yes |
| scope | Scope of persistence: `within_agent` (shared across all runs of this agent) or `across_agents` (shared across all agents for this user) | "within_agent" \| "across_agents" | No |
| default_value | Default value to return if key is not found | Default Value | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| value | Retrieved value or default value | Value |

### Possible use case

<!-- MANUAL: use_case -->

Resume Processing: Retrieve the last processed item ID to continue batch operations from where you left off.

Load Preferences: Fetch stored user preferences at workflow start to customize behavior.

State Restoration: Retrieve workflow state saved from a previous run to maintain continuity.

<!-- END MANUAL -->

## Screenshot Web Page

### What it is

Takes a screenshot of a specified website using the ScreenshotOne API.

### How it works

<!-- MANUAL: how_it_works -->

This block uses the ScreenshotOne API to capture screenshots of web pages. Configure viewport dimensions, output format, and whether to capture the full page or just the visible area.

Optional features include blocking ads, cookie banners, and chat widgets for cleaner screenshots. Caching can be enabled to improve performance for repeated captures of the same page.

<!-- END MANUAL -->
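A capture request of this shape can be sketched by assembling query parameters for ScreenshotOne's `take` endpoint. The endpoint and parameter names below follow ScreenshotOne's public API as I understand it, but verify them against their documentation; `ACCESS_KEY` is a placeholder, and this sketch only builds the URL rather than performing the request.

```python
from urllib.parse import urlencode

ACCESS_KEY = "YOUR_ACCESS_KEY"  # placeholder, not a real key

def build_screenshot_url(url: str, viewport_width: int = 1920,
                         viewport_height: int = 1080, full_page: bool = False,
                         fmt: str = "png", block_ads: bool = True) -> str:
    """Assemble a ScreenshotOne capture URL from block-style inputs."""
    params = {
        "access_key": ACCESS_KEY,
        "url": url,
        "viewport_width": viewport_width,
        "viewport_height": viewport_height,
        "full_page": str(full_page).lower(),   # API expects "true"/"false"
        "format": fmt,
        "block_ads": str(block_ads).lower(),
    }
    return "https://api.screenshotone.com/take?" + urlencode(params)

request_url = build_screenshot_url("https://example.com", full_page=True)
```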

### Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| url | URL of the website to screenshot | str | Yes |
| viewport_width | Width of the viewport in pixels | int | No |
| viewport_height | Height of the viewport in pixels | int | No |
| full_page | Whether to capture the full page length | bool | No |
| format | Output format (png, jpeg, webp) | "png" \| "jpeg" \| "webp" | No |
| block_ads | Whether to block ads | bool | No |
| block_cookie_banners | Whether to block cookie banners | bool | No |
| block_chats | Whether to block chat widgets | bool | No |
| cache | Whether to enable caching | bool | No |

### Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| image | The screenshot image data | str (file) |

### Possible use case

<!-- MANUAL: use_case -->

Visual Documentation: Capture screenshots of web pages for documentation, reports, or archives.

Competitive Monitoring: Regularly screenshot competitor websites to track design and content changes.

Visual Testing: Capture page renders for visual regression testing or design verification workflows.

<!-- END MANUAL -->