docs/architecture/ARCHITECTURE.md
Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve a given objective. This document provides a detailed technical overview of Devika's system architecture and how the various components work together.
At a high level, Devika consists of the following key components:
Let's dive into each of these components in more detail.
The Agent class serves as the central engine that drives Devika's AI planning and execution loop. Here's how it works:
- The execute method is invoked on the Agent.
- The subsequent_execute method is invoked.

Throughout this process, the Agent Core is responsible for:
Devika's cognitive abilities are powered by a collection of specialized sub-agents. Each agent is implemented as a separate Python class. Agents communicate with the underlying LLMs through prompt templates defined in Jinja2 format. Key agents include:
Each agent follows a common pattern:
Agents aim to be stateless and idempotent where possible. State and history are managed by the Agent Core and passed into the agents as needed, which keeps the design modular and composable.
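The idea of a stateless agent that receives its context as arguments can be sketched as follows. This is a minimal illustration, not Devika's actual code: `PlannerAgent`, `PLANNER_PROMPT`, and `fake_llm` are hypothetical names, and the standard library's `string.Template` stands in for the Jinja2 templates so that the example runs without dependencies.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, List

# Hypothetical prompt. Devika's real prompts are Jinja2 templates;
# string.Template is a dependency-free stand-in for this sketch.
PLANNER_PROMPT = Template(
    "Conversation so far:\n$history\n\nWrite a step-by-step plan for: $task"
)


@dataclass(frozen=True)
class PlannerAgent:
    """Stateless sub-agent: all context arrives as method arguments."""

    complete: Callable[[str], str]  # any function mapping a prompt to model text

    def render(self, task: str, history: List[str]) -> str:
        return PLANNER_PROMPT.substitute(task=task, history="\n".join(history))

    @staticmethod
    def parse(response: str) -> List[str]:
        # Keep non-empty lines and drop leading list markers such as "1." or "-".
        steps = []
        for line in response.splitlines():
            cleaned = line.strip().lstrip("0123456789.-* ").strip()
            if cleaned:
                steps.append(cleaned)
        return steps

    def execute(self, task: str, history: List[str]) -> List[str]:
        # Render the prompt, call the model, parse the reply.
        return self.parse(self.complete(self.render(task, history)))


def fake_llm(prompt: str) -> str:
    """Canned response standing in for a real model call."""
    return "1. Research the API\n2. Write the code\n"
```

Because the agent stores nothing between calls, the caller (here, the role played by the Agent Core) decides what history to pass in, and the same inputs with a deterministic model function give the same plan.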
Devika's natural language processing capabilities are driven by state-of-the-art LLMs. The LLM class provides a unified interface to interact with different language models:
The LLM class abstracts away the specifics of each provider's API, allowing agents to interact with the models in a consistent way. It supports:
Choosing the right model for a given use case depends on factors such as desired quality, speed, and cost. The modular design makes it easy to swap one model for another.
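One way to picture a unified interface with swappable models is a small provider registry. The sketch below is an assumption-laden illustration rather than Devika's real LLM class: the `register` and `complete` methods, the `"provider/model-id"` naming scheme, and `echo_provider` are all hypothetical.

```python
from typing import Callable, Dict

# A provider takes (model_id, prompt) and returns the completion text.
Provider = Callable[[str, str], str]


class LLM:
    """Hypothetical unified front end over several model providers."""

    _providers: Dict[str, Provider] = {}

    @classmethod
    def register(cls, name: str, provider: Provider) -> None:
        cls._providers[name] = provider

    def __init__(self, model: str) -> None:
        # Assumed naming scheme for this sketch: "provider/model-id".
        self.provider_name, _, self.model_id = model.partition("/")
        if self.provider_name not in self._providers:
            raise ValueError(f"unknown provider: {self.provider_name!r}")

    def complete(self, prompt: str) -> str:
        # Dispatch to whichever provider was selected at construction time.
        return self._providers[self.provider_name](self.model_id, prompt)


def echo_provider(model_id: str, prompt: str) -> str:
    """Offline stand-in for a real API client."""
    return f"[{model_id}] {prompt}"


LLM.register("echo", echo_provider)
```

With this shape, swapping models is a one-string change at the call site, for example `LLM("echo/small").complete("hello")`, while agents keep calling the same method.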
Devika can interact with webpages in an automated fashion to gather information and perform actions. This is powered by the Browser and Crawler classes.
The Browser class uses Playwright to provide high-level web automation primitives:
The Crawler class defines an agent that can interact with a webpage based on natural language instructions. It leverages:
The start_interaction function sets up a loop where:
This allows performing a sequence of actions to achieve a higher-level objective, such as researching a topic, filling out a form, or interacting with an app.
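The observe, decide, act cycle behind such an interaction loop can be sketched without a real browser. Everything here is hypothetical: `FakePage` replaces the Playwright-backed page, `make_scripted_policy` replaces the LLM that would choose the next action, and the `("done", "")` sentinel is just one possible stop signal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Action = Tuple[str, str]  # (command, argument), e.g. ("click", "#submit")
Policy = Callable[[str, str], Action]  # (objective, observation) -> next action


@dataclass
class FakePage:
    """In-memory stand-in for a browser page; it only records actions."""

    log: List[Action] = field(default_factory=list)

    def describe(self) -> str:
        return f"{len(self.log)} actions performed so far"

    def perform(self, action: Action) -> None:
        self.log.append(action)


def run_interaction(
    page: FakePage, objective: str, choose_action: Policy, max_steps: int = 10
) -> List[Action]:
    """Observe the page, ask the policy for an action, apply it, repeat."""
    for _ in range(max_steps):
        action = choose_action(objective, page.describe())
        if action[0] == "done":  # the policy signals that the objective is met
            break
        page.perform(action)
    return page.log


def make_scripted_policy(actions: List[Action]) -> Policy:
    """Replay a fixed action list; a real policy would query an LLM."""
    remaining = iter(actions)
    return lambda objective, observation: next(remaining)
```

The `max_steps` cap is a design choice worth keeping in any real version, since a model-driven policy may never emit the stop signal.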
The ProjectManager class is responsible for creating, updating and querying projects and their associated metadata. Key functions include:
Project metadata is persisted in a SQLite database using SQLModel. The Projects table stores:
This allows the agent to work on multiple projects simultaneously and retain conversation history across sessions.
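As a rough sketch of per-project persistence, the example below keeps each project's conversation history in one SQLite row. It uses the standard library's `sqlite3` module in place of SQLModel so that it runs without dependencies, and the `ProjectStore` class, its method names, and the two-column schema (`name`, `messages`) are assumptions made for illustration, not Devika's actual schema.

```python
import json
import sqlite3
from typing import Dict, List


class ProjectStore:
    """Hypothetical project store: one row per project, history as JSON."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS projects ("
            "name TEXT PRIMARY KEY, messages TEXT NOT NULL)"
        )

    def create_project(self, name: str) -> None:
        # INSERT OR IGNORE makes creation safe to repeat.
        self.conn.execute(
            "INSERT OR IGNORE INTO projects (name, messages) VALUES (?, ?)",
            (name, "[]"),
        )
        self.conn.commit()

    def get_messages(self, name: str) -> List[Dict[str, str]]:
        row = self.conn.execute(
            "SELECT messages FROM projects WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        return json.loads(row[0])

    def add_message(self, name: str, sender: str, text: str) -> None:
        messages = self.get_messages(name)
        messages.append({"from": sender, "text": text})
        self.conn.execute(
            "UPDATE projects SET messages = ? WHERE name = ?",
            (json.dumps(messages), name),
        )
        self.conn.commit()
```

Passing a file path instead of `":memory:"` would keep the history across sessions, and keying rows by project name keeps separate projects isolated from one another.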
As the AI agent works on a task, we need to track and display its internal state to the user. The AgentState class handles this by providing an interface to:
Agent state includes information like:
Like projects, agent states are persisted in the SQLite database using SQLModel. The AgentStateModel table stores:
Having a persistent log of agent states is useful for:
Devika integrates with external services to augment its capabilities:
The GitHub and Netlify classes provide lightweight wrappers around the respective service APIs.
They handle authentication, making HTTP requests, and parsing responses.
This allows Devika to perform actions like:
Integrations are done in a modular way so that new services can be added easily.
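A thin, modular wrapper of this kind might look like the sketch below. It is not the real GitHub or Netlify client: `ServiceClient`, `ExampleHostingClient`, the `hosting.example.com` URL, and the `/sites` endpoint are hypothetical, and the code only builds requests with the standard library's `urllib.request.Request` rather than sending them.

```python
import json
from typing import Optional
from urllib.request import Request


class ServiceClient:
    """Hypothetical base wrapper: auth headers, request building, parsing."""

    base_url = ""

    def __init__(self, token: str) -> None:
        self.token = token

    def build_request(
        self, method: str, path: str, payload: Optional[dict] = None
    ) -> Request:
        data = json.dumps(payload).encode("utf-8") if payload is not None else None
        headers = {
            "Authorization": f"Bearer {self.token}",
            "Accept": "application/json",
        }
        if data is not None:
            headers["Content-Type"] = "application/json"
        return Request(self.base_url + path, data=data, headers=headers, method=method)

    @staticmethod
    def parse(body: bytes) -> dict:
        # Decode a JSON response body into a dictionary.
        return json.loads(body.decode("utf-8"))


class ExampleHostingClient(ServiceClient):
    """A new service only needs a base URL and its own endpoint helpers."""

    base_url = "https://hosting.example.com/api"  # placeholder, not a real API

    def deploy_request(self, site_name: str) -> Request:
        return self.build_request("POST", "/sites", {"name": site_name})
```

Keeping authentication and parsing in the base class means that adding another service is mostly a matter of declaring its endpoints, which is one plausible reading of the modular approach described above.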
Devika makes use of several utility modules to support its functioning:
- Config: Loads and provides access to configuration settings (API keys, folder paths, etc.)
- Logger: Sets up logging to console and file, with support for log levels and colors
- ReadCode: Recursively reads code files in a directory and converts them into a Markdown format
- SentenceBERT: Extracts keywords and semantic information from text using SentenceBERT embeddings
- Experts: A collection of domain-specific knowledge bases to assist in certain areas (e.g. webdev, physics, chemistry, math)

The utility modules aim to provide reusable functionality that is used across different parts of the system.
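To make the ReadCode idea concrete, here is a minimal sketch of walking a directory and emitting one Markdown section per code file. The function name `code_to_markdown`, the default suffix list, and the heading layout are assumptions for illustration and may differ from what Devika actually produces.

```python
from pathlib import Path
from typing import Iterable, Union

FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences


def code_to_markdown(
    root: Union[str, Path], suffixes: Iterable[str] = (".py", ".js", ".ts")
) -> str:
    """Recursively read code files under root and render them as Markdown."""
    root = Path(root)
    wanted = set(suffixes)
    sections = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in wanted:
            relative = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8").rstrip()
            # One heading per file, followed by its fenced contents.
            sections.append(f"### {relative}\n\n{FENCE}\n{body}\n{FENCE}")
    return "\n\n".join(sections)
```

Filtering by suffix keeps binary and unrelated files out of the result, which matters when the Markdown is later placed into an LLM prompt with a limited context window.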
Devika is a complex system that combines multiple AI and automation techniques to deliver an intelligent programming assistant. Key design principles include:
By understanding how the different components work together, we can extend, optimize and scale Devika to take on increasingly sophisticated software engineering tasks. The agent-based architecture provides a strong foundation to build more advanced AI capabilities in the future.