.openhands/microagents/glossary.md
The core AI entity in OpenHands that can perform software development tasks by interacting with tools, browsing the web, and modifying code.
A component that manages the agent's lifecycle, handles its state, and coordinates interactions between the agent and various tools.
The ability of an agent to hand off specific tasks to other specialized agents for better task completion.
A central registry of different agent types and their capabilities, allowing for easy agent selection and instantiation.
A specific capability or function that an agent can perform, such as file manipulation, web browsing, or code editing.
The current context and status of an agent, including its memory, active tools, and ongoing tasks.
A generalist agent in OpenHands designed to perform tasks by editing and executing code.
A system for web-based interactions and tasks.
A testing and evaluation environment for browser-based agent interactions and tasks.
A tool that enables agents to interact with web pages and perform web-based tasks.
Terminal- and execution-related functionality.
A persistent terminal session that maintains state and history for bash command execution. It uses tmux under the hood.
System-wide settings and options.
Settings that define an agent's behavior, capabilities, and limitations, including available tools and runtime settings.
Settings that control various aspects of OpenHands behavior, including runtime, security, and agent settings.
Configuration settings for language models used by agents, including model selection and parameters.
Settings for draft mode operations with language models, typically used for faster, lower-quality responses.
Settings that define how the runtime environment should be set up and operated.
Configuration settings that control security features and restrictions.
A sequence of interactions between a user and an agent, including messages, actions, and their results.
Metadata about a conversation, including its status, participants, and timeline.
A component that handles the creation, storage, and retrieval of conversations.
Additional information about conversations, such as tags, timestamps, and related resources.
The current state of a conversation, including whether it's active, completed, or failed.
A storage system for maintaining conversation history and related data.
Every Conversation comprises a series of Events. Each Event is either an Action or an Observation.
A continuous flow of events that represents the ongoing activities and interactions in the system.
A specific operation or command that an agent executes through available tools, such as running a command or editing a file.
The response or result returned by a tool after an agent's action, providing feedback about the action's outcome.
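The Action/Observation split described above can be illustrated with a minimal sketch. The class names, fields, and the `summarize` helper below are hypothetical and chosen only for illustration; they are not the actual OpenHands types.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Action:
    """Hypothetical action: something the agent asks a tool to do."""
    command: str


@dataclass
class Observation:
    """Hypothetical observation: what the tool reported back."""
    content: str
    exit_code: int


# An event is either an action or an observation.
Event = Union[Action, Observation]


def summarize(events: List[Event]) -> List[str]:
    """Render each event as a one-line description, in order."""
    lines = []
    for event in events:
        if isinstance(event, Action):
            lines.append(f"action: {event.command}")
        else:
            lines.append(f"observation: {event.content} (exit {event.exit_code})")
    return lines


# A conversation is modeled here as an ordered list of events.
conversation: List[Event] = [Action("ls"), Observation("README.md", 0)]
print(summarize(conversation))
```

Modeling a conversation as an ordered list keeps each action next to the observation it produced, which is the pairing the definitions above rely on.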
Different ways to interact with OpenHands.
A command-line interface mode for interacting with OpenHands agents without a graphical interface.
A graphical user interface mode for interacting with OpenHands agents through a web interface.
A mode of operation where OpenHands runs without a user interface, suitable for automation and scripting.
The system that decides which parts of the Event Stream (i.e. the conversation history) should be passed into each LLM prompt.
A storage system for maintaining agent memory and context across sessions.
A component that processes and summarizes conversation history to maintain context while staying within token limits.
A very simple Condenser strategy that reduces conversation history or content to stay within token limits.
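As a rough illustration of truncation, the sketch below keeps only the most recent events that fit in a token budget. It assumes that the oldest events are the ones dropped and that tokens can be approximated by whitespace-separated words; both are assumptions made for the example, and `truncate_history` is a hypothetical name rather than an OpenHands function.

```python
from typing import List


def truncate_history(events: List[str], max_tokens: int) -> List[str]:
    """Keep the newest events whose combined size fits within max_tokens."""
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so recent context is preserved first.
    for event in reversed(events):
        cost = len(event.split())  # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break
        kept.append(event)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["run ls", "README.md src tests", "edit src/main.py", "file saved"]
print(truncate_history(history, max_tokens=5))
```

A production condenser would likely use the model's own tokenizer and might summarize dropped events instead of discarding them.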
A specialized prompt that enhances OpenHands with domain-specific knowledge, repository-specific context, and task-specific workflows.
A central repository of available microagents and their configurations.
A general-purpose microagent available to all OpenHands users, triggered by specific keywords. Located in microagents/.
A type of microagent that provides repository-specific context and guidelines, stored in the .openhands/microagents/ directory.
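Keyword triggering, as mentioned above for general-purpose microagents, might look roughly like the following. The trigger table, the microagent names, and the `triggered_microagents` helper are hypothetical, and case-insensitive substring matching is an assumption about how triggers could be checked, not a description of the actual implementation.

```python
from typing import Dict, List

# Hypothetical mapping from microagent name to its trigger keywords.
MICROAGENT_TRIGGERS: Dict[str, List[str]] = {
    "github": ["github", "pull request"],
    "docker": ["docker", "container"],
}


def triggered_microagents(message: str) -> List[str]:
    """Return the names of microagents whose keywords appear in the message."""
    text = message.lower()
    return [
        name
        for name, keywords in MICROAGENT_TRIGGERS.items()
        if any(keyword in text for keyword in keywords)
    ]


print(triggered_microagents("Open a pull request on GitHub"))
```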
Components for managing and processing prompts.
A system for caching and reusing common prompts to improve performance.
A component that handles the loading, processing, and management of prompts used by agents, including microagents.
The process of interpreting and structuring responses from language models and tools.
The execution environment where agents perform their tasks, which can be local, remote, or containerized.
A REST API that receives agent actions (e.g. bash commands, python code, browsing actions), executes them in the runtime environment, and returns the results.
A component that handles the execution of actions in the runtime environment, managing the communication between the agent and the runtime.
A containerized runtime environment that provides isolation and reproducibility for agent operations.
A specialized runtime environment built on E2B for secure and isolated code execution.
A runtime environment that executes on the local machine, suitable for development and testing.
A runtime environment built on Modal for scalable and distributed agent operations.
A sandboxed environment that executes code and commands remotely, providing isolation and security for agent operations.
A component that builds a Docker image for the Action Execution Server based on a user-specified base image.
Security-related components and features.
A component that checks agent actions for potential security risks.
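To make the idea of checking actions for risk concrete, here is a minimal sketch that flags shell commands matching a short list of patterns. The pattern list and the `is_risky` function are hypothetical; a real analyzer would need far broader and more careful rules.

```python
import re
from typing import List

# Hypothetical patterns for commands that deserve a closer look.
RISKY_PATTERNS: List[str] = [
    r"\brm\s+-rf\s+/",               # recursive delete from an absolute path
    r"\bcurl\b.*\|\s*(sh|bash)\b",   # piping a download straight into a shell
]


def is_risky(command: str) -> bool:
    """Return True if the command matches any known risky pattern."""
    return any(re.search(pattern, command) for pattern in RISKY_PATTERNS)


print(is_risky("rm -rf /tmp/build"))
print(is_risky("ls -la"))
```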