Built-in Components


This page lists all natively provided 🧩 Components and the ⚙️ Protocols they implement. They are used by the AutoGPT agent. Some components have additional configuration options, listed in a table under the component; see Component configuration to learn more.

!!! note
    If a configuration field uses an environment variable, it can still be passed using the configuration model. The value from the configuration takes precedence over the env var! The env var is applied only if the value in the configuration is not set.
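The precedence rule in the note can be sketched as follows. This is a minimal illustration, not the AutoGPT implementation: resolve_field is a hypothetical helper, and only the GITHUB_USERNAME variable name comes from the tables on this page.

```python
import os
from typing import Optional


def resolve_field(configured: Optional[str], env_var: str) -> Optional[str]:
    """Return the configured value if it is set, else fall back to the env var.

    Hypothetical helper used only to illustrate the precedence rule.
    """
    if configured is not None:
        return configured
    return os.environ.get(env_var)


# Example values (made up for illustration)
os.environ["GITHUB_USERNAME"] = "from-env"
explicit = resolve_field("from-config", "GITHUB_USERNAME")  # config wins
fallback = resolve_field(None, "GITHUB_USERNAME")           # env var is used
```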

SystemComponent

Essential component to allow an agent to finish.

DirectiveProvider

  • Constraints about API budget

MessageProvider

  • Current time and date
  • Remaining API budget and warnings if budget is low

CommandProvider

  • finish used when task is completed

UserInteractionComponent

Adds ability to interact with user in CLI.

CommandProvider

  • ask_user used to ask user for input

FileManagerComponent

Adds the ability to read and write persistent files on local storage, Google Cloud Storage, or Amazon S3. Necessary for saving and loading the agent's state (preserving the session).

FileManagerConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| storage_path | Path to agent files, e.g. state | str | agents/{agent_id}/[^1] |
| workspace_path | Path to files that agent has access to | str | agents/{agent_id}/workspace/[^1] |

[^1] This option is set dynamically during component construction rather than as a default inside the configuration model; {agent_id} is replaced with the agent's unique identifier.
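As a rough sketch of the footnote above, the defaults can be thought of as path templates that are filled in at construction time. The default_paths helper and the template constants below are hypothetical and only illustrate the substitution; the real component may build these paths differently.

```python
from typing import Dict

# Templates taken from the "Default" column of the table above
STORAGE_TEMPLATE = "agents/{agent_id}/"
WORKSPACE_TEMPLATE = "agents/{agent_id}/workspace/"


def default_paths(agent_id: str) -> Dict[str, str]:
    """Hypothetical helper: substitute the agent's identifier into the defaults."""
    return {
        "storage_path": STORAGE_TEMPLATE.format(agent_id=agent_id),
        "workspace_path": WORKSPACE_TEMPLATE.format(agent_id=agent_id),
    }


paths = default_paths("agent-123")  # "agent-123" is a made-up identifier
```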

DirectiveProvider

  • Resource information that it's possible to read and write files

CommandProvider

  • read_file used to read file
  • write_file used to write file
  • list_folder lists all files in a folder

CodeExecutorComponent

Lets the agent execute non-interactive shell commands and Python code. Python execution works only if Docker is available.

CodeExecutorConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| execute_local_commands | Enable shell command execution | bool | False |
| shell_command_control | Controls which list is used | "allowlist" \| "denylist" | "allowlist" |
| shell_allowlist | List of allowed shell commands | List[str] | [] |
| shell_denylist | List of prohibited shell commands | List[str] | [] |
| docker_container_name | Name of the Docker container used for code execution | str | "agent_sandbox" |

All shell command configuration options are intended for convenience only. This component is not secure and should not be used in production environments. It is recommended to use more appropriate sandboxing.
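To make the allowlist and denylist options concrete, here is a minimal sketch of how such a check could work. The is_command_allowed function is hypothetical and assumes the check looks only at the first word of the command line; the actual component may validate commands differently. As the warning above says, a check like this is a convenience, not a security boundary.

```python
import shlex
from typing import Sequence


def is_command_allowed(
    command_line: str,
    control: str = "allowlist",
    allowlist: Sequence[str] = (),
    denylist: Sequence[str] = (),
) -> bool:
    """Hypothetical check: decide based on the executable name only."""
    tokens = shlex.split(command_line)
    if not tokens:
        return False  # nothing to run
    executable = tokens[0]
    if control == "allowlist":
        # Only explicitly listed commands may run
        return executable in allowlist
    if control == "denylist":
        # Everything may run except listed commands
        return executable not in denylist
    raise ValueError(f"unknown shell_command_control: {control!r}")
```

Note that with the defaults shown in the table (allowlist mode and an empty allowlist), this sketch rejects every command.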

CommandProvider

  • execute_shell execute shell command
  • execute_shell_popen execute shell command with popen
  • execute_python_code execute Python code
  • execute_python_file execute Python file

ActionHistoryComponent

Keeps track of agent's actions and their outcomes. Provides their summary to the prompt.

ActionHistoryConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| llm_name | Name of the LLM used to compress the history | ModelName | "gpt-3.5-turbo" |
| max_tokens | Maximum number of tokens to use for the history summary | int | 1024 |
| spacy_language_model | Language model used for summary chunking using spaCy | str | "en_core_web_sm" |
| full_message_count | Number of cycles to include unsummarized in the prompt | int | 4 |

MessageProvider

  • Agent's progress summary

AfterParse

  • Register agent's action

ExecutionFailure

  • Rewinds the agent's action, so it isn't saved

AfterExecute

  • Saves the agent's action result in the history
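The three hooks above form a small lifecycle: an action is registered after parsing, removed again if execution fails, and paired with its result after a successful execution. The sketch below models that flow together with full_message_count. The ActionHistory and Episode classes and their method names are hypothetical, not the component's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Episode:
    action: str
    result: Optional[str] = None


@dataclass
class ActionHistory:
    full_message_count: int = 4
    episodes: List[Episode] = field(default_factory=list)

    def after_parse(self, action: str) -> None:
        # Register the proposed action before it is executed
        self.episodes.append(Episode(action))

    def execution_failure(self) -> None:
        # Rewind: drop the registered action so it is not saved
        if self.episodes and self.episodes[-1].result is None:
            self.episodes.pop()

    def after_execute(self, result: str) -> None:
        # Attach the result to the most recently registered action
        self.episodes[-1].result = result

    def split_for_prompt(self) -> Tuple[List[Episode], List[Episode]]:
        # Older episodes would be summarized; the newest ones stay verbatim
        cut = max(len(self.episodes) - self.full_message_count, 0)
        return self.episodes[:cut], self.episodes[cut:]
```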

GitOperationsComponent

Adds ability to interact with git repositories and GitHub.

GitOperationsConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| github_username | GitHub username, ENV: GITHUB_USERNAME | str | None |
| github_api_key | GitHub API key, ENV: GITHUB_API_KEY | str | None |

CommandProvider

  • clone_repository used to clone a git repository

ImageGeneratorComponent

Adds ability to generate images using various providers.

Hugging Face

To use text-to-image models from Hugging Face, you need a Hugging Face API token. Link to the appropriate settings page: Hugging Face > Settings > Tokens

Stable Diffusion WebUI

It is possible to use your own self-hosted Stable Diffusion WebUI with AutoGPT. Make sure you are running WebUI with --api enabled.

ImageGeneratorConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| image_provider | Image generation provider | "dalle" \| "huggingface" \| "sdwebui" | "dalle" |
| huggingface_image_model | Hugging Face image model, see available models | str | "CompVis/stable-diffusion-v1-4" |
| huggingface_api_token | Hugging Face API token, ENV: HUGGINGFACE_API_TOKEN | str | None |
| sd_webui_url | URL to self-hosted Stable Diffusion WebUI | str | "http://localhost:7860" |
| sd_webui_auth | Basic auth for Stable Diffusion WebUI, ENV: SD_WEBUI_AUTH | str of format {username}:{password} | None |
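The {username}:{password} format expected by sd_webui_auth can be split into its two parts as sketched below. parse_basic_auth is a hypothetical helper, and splitting on the first colon only (so that a password may itself contain colons) is an assumption rather than documented behavior.

```python
from typing import Tuple


def parse_basic_auth(value: str) -> Tuple[str, str]:
    """Hypothetical helper: split '{username}:{password}' on the first colon."""
    username, separator, password = value.partition(":")
    if not separator or not username:
        raise ValueError("expected a value of the form {username}:{password}")
    return username, password
```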

CommandProvider

  • generate_image used to generate an image given a prompt

WebSearchComponent

Allows the agent to search the web. Google credentials aren't required for DuckDuckGo. Instructions on how to set up a Google API key

WebSearchConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| google_api_key | Google API key, ENV: GOOGLE_API_KEY | str | None |
| google_custom_search_engine_id | Google Custom Search Engine ID, ENV: GOOGLE_CUSTOM_SEARCH_ENGINE_ID | str | None |
| duckduckgo_max_attempts | Maximum number of attempts to search using DuckDuckGo | int | 3 |
| duckduckgo_backend | Backend to be used for DDG sdk | "api" \| "html" \| "lite" | "api" |
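The duckduckgo_max_attempts option bounds how many times a search is tried. A minimal sketch of such a retry loop follows, assuming a retry is triggered when a search returns no results. Both search_with_retries and the search_fn callable are hypothetical stand-ins, not the DuckDuckGo SDK.

```python
from typing import Callable, List


def search_with_retries(
    search_fn: Callable[[str], List[str]],
    query: str,
    max_attempts: int = 3,
) -> List[str]:
    """Call search_fn until it returns results or the attempts run out."""
    for _ in range(max_attempts):
        results = search_fn(query)
        if results:
            return results
    return []  # every attempt came back empty
```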

DirectiveProvider

  • Resource information that it's possible to search the web

CommandProvider

  • search_web used to search the web using DuckDuckGo
  • google used to search the web using Google, requires API key

WebSeleniumComponent

Allows agent to read websites using Selenium.

WebSeleniumConfiguration

| Config variable | Details | Type | Default |
| --- | --- | --- | --- |
| llm_name | Name of the LLM used to read websites | ModelName | "gpt-3.5-turbo" |
| web_browser | Web browser used by Selenium | "chrome" \| "firefox" \| "safari" \| "edge" | "chrome" |
| headless | Run browser in headless mode | bool | True |
| user_agent | User agent used by the browser | str | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36" |
| browse_spacy_language_model | spaCy language model used for chunking text | str | "en_core_web_sm" |
| selenium_proxy | HTTP proxy to use with Selenium | str | None |

DirectiveProvider

  • Resource information that it's possible to read websites

CommandProvider

  • read_website used to read a specific url and look for specific topics or answer a question

ContextComponent

Adds ability to keep up-to-date file and folder content in the prompt.

MessageProvider

  • Content of elements in the context

CommandProvider

  • open_file used to open a file into context
  • open_folder used to open a folder into context
  • close_context_item remove an item from the context

WatchdogComponent

Watches whether the agent is looping and switches to smart mode if necessary.

AfterParse

  • Investigates what happened and switches to smart mode if necessary
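One simple way to detect looping, shown purely as an illustration, is to check whether the agent has proposed the same command with the same arguments several cycles in a row. The is_looping function and the (name, arguments) tuple shape are assumptions; this page does not describe the component's actual heuristic.

```python
from typing import Dict, List, Tuple

# Assumed shape of a recorded action: (command name, arguments)
Action = Tuple[str, Dict[str, str]]


def is_looping(history: List[Action], window: int = 2) -> bool:
    """Return True if the last `window` actions are all identical."""
    if window < 2 or len(history) < window:
        return False
    recent = history[-window:]
    return all(action == recent[0] for action in recent)
```

A caller could run this check after each parsed action and switch to a more capable model when it returns True.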