
Tools Basics - Enhancing DocsGPT Capabilities

docs/content/Tools/basics.mdx


import { Callout } from 'nextra/components';
import Image from 'next/image';
import { ToolCards } from '../../components/ToolCards';

Understanding DocsGPT Tools

DocsGPT Tools are extensions that let DocsGPT go beyond its core function of retrieving information from your documents. With tools, it can perform actions, interact with external data sources, and integrate with other services. You can find and configure the available tools in the "Tools" section of the DocsGPT application settings.

What are Tools?

  • Purpose: The primary purpose of Tools is to bridge the gap between understanding a user's request (natural language processing by the LLM) and executing a tangible action. This could involve fetching live data from the web, sending notifications, running code snippets, querying databases, or interacting with third-party APIs.

  • LLM as an Orchestrator: The Large Language Model (LLM) at the heart of DocsGPT is designed to act as an intelligent orchestrator. Based on your query and the declared capabilities of the available tools (defined in their metadata), the LLM decides if a tool is needed, which tool to use, and what parameters to pass to it.

  • Action-Oriented Interactions: Tools enable more dynamic and action-oriented interactions. For example:

    • "What's the latest news on renewable energy?" - This might trigger a web search tool to fetch current articles.
    • "Fetch the order status for customer ID 12345 from our database." - This could use a database tool.
    • "Summarize the content of this webpage and send the summary to the #general channel on Telegram." - This might involve a web scraping tool followed by a Telegram notification tool.
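The orchestration described above can be sketched in a few lines of Python. This is a minimal illustration, not DocsGPT's actual implementation: the `Tool` dataclass, the `REGISTRY` dict, the `dispatch` function, and the shape of the tool call (a dict with `name` and `arguments`) are hypothetical names chosen for the example, and the LLM's decision is represented by a hard-coded dict.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """Hypothetical tool record: metadata the LLM sees plus the code to run."""
    name: str
    description: str
    parameters: Dict[str, str]  # parameter name -> human-readable description
    run: Callable[..., Any]


def lookup_order_status(customer_id: str) -> str:
    # Stand-in for a real database query.
    fake_orders = {"12345": "shipped"}
    return fake_orders.get(customer_id, "unknown")


REGISTRY: Dict[str, Tool] = {
    "order_status": Tool(
        name="order_status",
        description="Look up the order status for a customer ID.",
        parameters={"customer_id": "The customer's ID as a string."},
        run=lookup_order_status,
    ),
}


def describe_tools(registry: Dict[str, Tool]) -> str:
    """Render the metadata that would be shown to the LLM."""
    return "\n".join(f"{t.name}: {t.description}" for t in registry.values())


def dispatch(tool_call: Dict[str, Any], registry: Dict[str, Tool]) -> Any:
    """Run the tool the LLM selected, validating its name and arguments."""
    tool = registry.get(tool_call["name"])
    if tool is None:
        raise KeyError(f"Unknown tool: {tool_call['name']}")
    unexpected = set(tool_call["arguments"]) - set(tool.parameters)
    if unexpected:
        raise ValueError(f"Unexpected arguments: {sorted(unexpected)}")
    return tool.run(**tool_call["arguments"])


# Assumed model output for "Fetch the order status for customer ID 12345".
llm_tool_call = {"name": "order_status", "arguments": {"customer_id": "12345"}}
result = dispatch(llm_tool_call, REGISTRY)
```

The split mirrors the division of labor in the text: the metadata is all the model needs to choose a tool and its parameters, while the application keeps control of execution and can reject unknown tools or arguments.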

Overview of Built-in Tools

DocsGPT includes a suite of pre-built tools that extend its capabilities out of the box. Below is an overview of the currently available tools.

<ToolCards
  items={[
    {
      title: 'API Tool',
      link: '/Tools/api-tool',
      description: 'A highly flexible tool that allows DocsGPT to interact with virtually any API without needing to write custom Python code.'
    },
    {
      title: 'Brave Search',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/brave.py',
      description: 'Enables DocsGPT to perform real-time web and image searches using the Brave Search API. Requires an API key.'
    },
    {
      title: 'DuckDuckGo Search',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/duckduckgo.py',
      description: 'Performs web and image searches using DuckDuckGo. No API key required.'
    },
    {
      title: 'CryptoPrice',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/cryptoprice.py',
      description: 'Fetches the current price of specified cryptocurrencies using the CryptoCompare public API.'
    },
    {
      title: 'Ntfy',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/ntfy.py',
      description: 'Allows DocsGPT to send push notifications to ntfy topics on a specified server, ideal for alerts and updates.'
    },
    {
      title: 'Telegram Bot',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/telegram.py',
      description: 'Allows DocsGPT to send messages or images to Telegram chats via a Telegram Bot. Requires a bot token and chat ID.'
    },
    {
      title: 'PostgreSQL Database',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/postgres.py',
      description: 'Connects to a PostgreSQL database to execute SQL queries and retrieve schema information.'
    },
    {
      title: 'Read Webpage (browser)',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/read_webpage.py',
      description: 'Fetches the HTML content of a URL and converts it to Markdown for the agent to read.'
    },
    {
      title: 'Remote Device',
      link: '/Tools/remote-device',
      description: 'Runs shell commands on a paired remote machine through the docsgpt-cli host. See the Remote Device guide.'
    },
    {
      title: 'MCP Tool',
      link: '/Guides/Integrations/mcp-tool-integration',
      description: 'Connects to remote Model Context Protocol (MCP) servers to access their dynamic tools and resources.'
    },
    {
      title: 'Memory',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/memory.py',
      description: 'Stores and retrieves information across conversations through a per-user memory file directory.'
    },
    {
      title: 'Notepad',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/notes.py',
      description: 'A single editable note. Supports viewing, overwriting, and string replacement.'
    },
    {
      title: 'Todo List',
      link: 'https://github.com/arc53/DocsGPT/blob/main/application/agents/tools/todo_list.py',
      description: 'Manages todo items: creating, viewing, updating, and deleting todos.'
    }
  ]}
/>
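To give a feel for what a notification tool such as Ntfy does, the sketch below builds (but does not send) an HTTP request for ntfy's publish API, where the topic is the URL path and the message is the request body. It is not the code in `ntfy.py`; the function name `build_ntfy_request`, the server URL, and the topic name are assumptions made for the example.

```python
import urllib.request
from typing import Optional


def build_ntfy_request(server_url: str, topic: str, message: str,
                       title: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request that publishes `message` to an ntfy topic."""
    url = f"{server_url.rstrip('/')}/{topic}"
    # ntfy reads an optional notification title from the `Title` header.
    headers = {"Title": title} if title else {}
    return urllib.request.Request(
        url, data=message.encode("utf-8"), headers=headers, method="POST"
    )


# Hypothetical server and topic; replace with your own.
request = build_ntfy_request("https://ntfy.sh/", "docsgpt-alerts",
                             "Indexing finished", title="DocsGPT")
# To actually send it: urllib.request.urlopen(request, timeout=10)
```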

Default Chat Tools

In a regular chat (no custom agent), DocsGPT can enable a small set of tools automatically so the assistant is useful out of the box. These are the default chat tools, controlled by the DEFAULT_CHAT_TOOLS setting:

```env
DEFAULT_CHAT_TOOLS=memory,read_webpage,scheduler
```

  • Default tools are config-free and run with synthetic, deterministic tool IDs (no manual setup needed).
  • Each user can opt out of individual default tools from their settings; the disabled list is stored per user.
  • Some default tools are excluded from headless runs (scheduled tasks and webhook triggers). For example, scheduler is skipped in those runs to prevent a scheduled task from chaining new schedules on every fire.

To change the defaults for the whole instance, set DEFAULT_CHAT_TOOLS to a comma-separated list of tool names. See App Configuration for the full settings reference.
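The rules above (a comma-separated setting, per-user opt-outs, and exclusions for headless runs) can be sketched as follows. This is an illustration of the described behavior, not DocsGPT's code: `parse_default_tools`, `resolve_tools`, and `HEADLESS_EXCLUDED` are hypothetical names, and the exclusion set contains only `scheduler` because that is the only example the text gives.

```python
from typing import Iterable, List, Set

# Assumption: tools skipped in headless runs. The text names only `scheduler`.
HEADLESS_EXCLUDED: Set[str] = {"scheduler"}


def parse_default_tools(raw: str) -> List[str]:
    """Split a comma-separated setting into tool names, dropping blanks and duplicates."""
    names: List[str] = []
    for part in raw.split(","):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names


def resolve_tools(raw: str, user_disabled: Iterable[str] = (),
                  headless: bool = False) -> List[str]:
    """Return the default tools left after user opt-outs and headless exclusions."""
    disabled = set(user_disabled)
    if headless:
        disabled |= HEADLESS_EXCLUDED
    return [name for name in parse_default_tools(raw) if name not in disabled]


setting = "memory,read_webpage,scheduler"
chat_tools = resolve_tools(setting, user_disabled=["memory"])
headless_tools = resolve_tools(setting, headless=True)
```

With the setting shown, `chat_tools` is `['read_webpage', 'scheduler']` for a user who has opted out of `memory`, and `headless_tools` is `['memory', 'read_webpage']`.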

Using Tools in DocsGPT (User Perspective)

Interacting with tools in DocsGPT is designed to be intuitive:

  1. Natural Language Interaction: As a user, you typically interact with DocsGPT using natural language queries or commands. The LLM within DocsGPT analyzes your input to determine if a specific task can or should be handled by one of the available and configured tools.

  2. Configuration in UI:

    • Tools are generally managed and configured within the DocsGPT application's settings, found under a "Tools" section in the GUI.
    • For tools that interact with external services (like Brave Search, Telegram, or any service via the API Tool), you might need to provide authentication credentials (e.g., API keys, tokens) or specific endpoint information during the tool's setup in the UI.
  3. Prompt Engineering for Tools: While the LLM aims to intelligently use tools, for more complex or reliable agent-like behaviors, you might need to customize the system prompts. Modifying the prompt can guide the LLM on when and how to prioritize or chain tools to achieve specific outcomes, especially if you're building an agent designed to perform a certain sequence of actions every time. For more on this, see Customising Prompts.
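As an illustration of the prompt customization in step 3, the sketch below composes a system prompt that spells out a fixed tool sequence. The helper `build_system_prompt` and the wording of the prompt are assumptions made for the example; how tools are named inside a DocsGPT prompt may differ, so treat the tool references as placeholders.

```python
from typing import List


def build_system_prompt(role: str, steps: List[str]) -> str:
    """Compose a system prompt that lists the tool steps in a fixed order."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"{role}\n\n"
        "For every request, follow these steps in order:\n"
        f"{numbered}\n\n"
        "If a tool call fails, report the error instead of guessing."
    )


prompt = build_system_prompt(
    "You are an assistant that summarizes webpages and shares the result.",
    [
        "Use the Read Webpage tool to fetch the URL the user provides.",
        "Summarize the page in at most five bullet points.",
        "Use the Telegram Bot tool to send the summary to the configured chat.",
    ],
)
```

Numbering the steps explicitly tends to make chaining more predictable than leaving the order for the model to infer from a free-form description.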

Advancing with Tools

Understanding the basics of DocsGPT Tools opens up many possibilities:

  • Leverage the API Tool: For quick integrations with numerous external services, explore the API Tool Detailed Guide.
  • Develop Custom Tools: If you have specific needs not covered by the built-in tools or the generic API Tool, you can develop your own. See the guide on [Developing Custom Tools](/Tools/creating-a-tool).
  • Build AI Agents: Tools are the fundamental building blocks for creating sophisticated AI agents within DocsGPT, where several tools can be combined to carry out multi-step tasks.

By harnessing the power of Tools, you can transform DocsGPT into a more versatile and proactive assistant tailored to your unique workflows.