V4.15.0 - FastGPT

📦 Upgrade Guide

1. Environment Variable Changes

1.1 fastgpt-app and fastgpt-pro

Check required variables

v4.15.0 introduces stricter environment variable validation. After upgrading, make sure the following variables are configured correctly.

dotenv

# Encryption key. Must be the same in both services.
AES256_SECRET_KEY=
# File token key. Must be the same in both services.
FILE_TOKEN_KEY=
# JWT secret for invoke callbacks. Must be at least 32 characters and the same in both services.
INVOKE_TOKEN_SECRET=
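
As a pre-flight check, the 32-character rule for INVOKE_TOKEN_SECRET can be verified before restarting. This is a minimal sketch only: `has_min_length` is a hypothetical helper, not part of FastGPT, and the example assumes the variable is already exported in the current shell.

```shell
# Hypothetical helper (not part of FastGPT): succeeds when the value
# passed as $1 has at least $2 characters.
has_min_length() {
  value=$1
  min=$2
  [ "${#value}" -ge "$min" ]
}

# Example check, assuming INVOKE_TOKEN_SECRET is exported in this shell.
if has_min_length "${INVOKE_TOKEN_SECRET:-}" 32; then
  echo "INVOKE_TOKEN_SECRET length OK"
else
  echo "INVOKE_TOKEN_SECRET must be at least 32 characters" >&2
fi
```

One common way to generate a suitable value is `openssl rand -hex 32`, which prints 64 hexadecimal characters. Whatever value you choose, configure the same one in both services.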

New environment variables

Required

dotenv

# SSE MCP Server address. Leave empty if you do not use SSE.
SSE_MCP_SERVER_PROXY_ENDPOINT=

Optional

The following variables have defaults or can be left unset without affecting normal usage.

dotenv

# File parsing worker concurrency (optional)
PARSE_FILE_WORKERS=10
# File parsing timeout in seconds (optional)
PARSE_FILE_TIMEOUT_SECONDS=600
# HTML-to-Markdown worker concurrency (optional)
HTML_TO_MARKDOWN_WORKERS=10
# Text chunking worker concurrency (optional)
TEXT_TO_CHUNKS_WORKERS=10
# Automatically sync MongoDB indexes. Use boolean strings instead of 0 or 1. (optional)
SYNC_INDEX=true
# Whether to enable trusted reverse proxy client IP verification (optional)
TRUSTED_PROXY_ENABLE=false
# Trusted reverse proxy IP/CIDR list, separated by commas or whitespace. Only takes effect when TRUSTED_PROXY_ENABLE=true.
# Only X-Forwarded-For/X-Real-IP from explicitly trusted proxies will be used for client IP resolution. (optional)
TRUSTED_PROXY_IPS=
# Maximum string length for synchronous system variable replacement, in M. Range: 1-100.
SYSTEM_MAX_STRING_LENGTH_M=100
# Maximum folder depth. Default: 4. Range: 2-20.
MAX_FOLDER_DEPTH=4
# Maximum input array length for Loop/Parallel nodes
WORKFLOW_MAX_LOOP_TIMES=100
# Parallel node concurrency limit. The final value is clamped to [5, 100].
WORKFLOW_PARALLEL_MAX_CONCURRENCY=10
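
To illustrate what "clamped to [5, 100]" means for WORKFLOW_PARALLEL_MAX_CONCURRENCY, here is a small sketch. The `clamp` function is a hypothetical illustration of the rule, not FastGPT's actual implementation.

```shell
# Hypothetical illustration of clamping: prints $1 limited to the
# inclusive range [$2, $3]. Expects integer arguments.
clamp() {
  value=$1
  min=$2
  max=$3
  if [ "$value" -lt "$min" ]; then
    echo "$min"
  elif [ "$value" -gt "$max" ]; then
    echo "$max"
  else
    echo "$value"
  fi
}

clamp 3 5 100     # prints 5   (raised to the lower bound)
clamp 10 5 100    # prints 10  (already in range)
clamp 500 5 100   # prints 100 (lowered to the upper bound)
```

In other words, setting the variable below 5 or above 100 does not fail; the effective value is simply pulled back into the range.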

Open-source edition variable changes

The open-source edition no longer uses the config.json configuration file. These settings have moved to environment variables. After removing the volume mount, add the following variables as needed:

dotenv

# Custom PDF parsing service URL
CUSTOM_PDF_PARSE_URL=
# Custom PDF parsing service key
CUSTOM_PDF_PARSE_KEY=
# Doc2x PDF parsing service key
DOC2X_KEY=
# TextIn service App ID
TEXTIN_APP_ID=
# TextIn service Secret Code
TEXTIN_SECRET_CODE=
# hnsw ef_search parameter for vector search. Only applies to PG / OB / OpenGauss.
HNSW_EF_SEARCH=100
# Maximum vector scan tuple count. Only applies to PG.
HNSW_MAX_SCAN_TUPLES=100000
# Maximum Knowledge Base file parsing queue concurrency
DATASET_PARSE_MAX_PROCESS=10
# Maximum vector training queue concurrency
VECTOR_MAX_PROCESS=10
# Maximum Q&A split queue concurrency
QA_MAX_PROCESS=10
# Maximum vision-language model processing queue concurrency
VLM_MAX_PROCESS=10

1.2 code-sandbox

Code Sandbox adds security-related environment variables such as SANDBOX_API_MAX_BODY_MB and SANDBOX_MAX_OUTPUT_MB, and supports grouped run queueing through queueId. Full defaults:

| Variable | Default | Description |
| --- | --- | --- |
| `SANDBOX_API_MAX_BODY_MB` | `8` | Maximum `/sandbox` API JSON body size, including `variables`, in MB. |
| `SANDBOX_MAX_OUTPUT_MB` | `10` | Maximum output JSON size for one code execution, including return values and logs, in MB. |
| `CHECK_INTERNAL_IP` | `true` | Enables internal IP checks for sandbox network requests by default to reduce SSRF risk. |
| `SANDBOX_MAX_TIMEOUT` | `60000` | Timeout for one code execution, in milliseconds. |
| `SANDBOX_MAX_MEMORY_MB` | `256` | Memory limit for one sandbox, in MB. The runtime reserves an extra `50` MB for overhead. |
| `SANDBOX_POOL_SIZE` | `20` | Number of pre-warmed JS/Python workers. |
| `SANDBOX_REQUEST_MAX_COUNT` | `30` | Maximum number of network requests allowed during one code execution. |
| `SANDBOX_REQUEST_TIMEOUT` | `60000` | Timeout for one network request from inside the sandbox, in milliseconds. |
| `SANDBOX_REQUEST_MAX_RESPONSE_MB` | `10` | Maximum response body size for one sandbox network request, in MB. |
| `SANDBOX_REQUEST_MAX_BODY_MB` | `5` | Maximum request body size for one sandbox network request, in MB. |
| `SANDBOX_QUEUE_ID_CONCURRENCY` | Empty | Number of requests with the same `queueId` that may enter execution at once. Empty disables queueing. |

1.3 fastgpt-plugin

The plugin service has been reworked. You must add AUTH_TOKEN and FASTGPT_BASE_URL, and update the MONGODB_URI variable:

Set AUTH_TOKEN for fastgpt-plugin. It must be at least 32 characters long.
Set PLUGIN_TOKEN in both fastgpt and fastgpt-pro to the same value as fastgpt-plugin's AUTH_TOKEN.
Change the database name in fastgpt-plugin's MONGODB_URI so it does not conflict with FastGPT's MongoDB database name. Example: mongodb://myusername:mypassword@fastgpt-mongo:27017/fastgpt-plugin?authSource=admin.
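
The database-name requirement above can be checked with a short sketch. `mongo_db_name` is a hypothetical helper, and the `fastgpt` database name used for the main service is an assumption for illustration; substitute the URIs from your own deployment.

```shell
# Hypothetical helper: prints the database name from a MongoDB URI of the
# form mongodb://user:pass@host:port/dbname?options
mongo_db_name() {
  uri=${1%%\?*}       # drop the query string, if any
  echo "${uri##*/}"   # keep what follows the last slash
}

# Example URIs; the main-service database name is assumed here.
app_db=$(mongo_db_name 'mongodb://myusername:mypassword@fastgpt-mongo:27017/fastgpt?authSource=admin')
plugin_db=$(mongo_db_name 'mongodb://myusername:mypassword@fastgpt-mongo:27017/fastgpt-plugin?authSource=admin')

if [ "$app_db" = "$plugin_db" ]; then
  echo "conflict: both services use database '$app_db'" >&2
else
  echo "ok: '$app_db' and '$plugin_db' are separate databases"
fi
```

The helper expects a URI that includes a database path segment. A URI without one (for example `mongodb://host:27017`) is not handled by this sketch.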

Additional variables you may adjust

dotenv

# ================ System =====================
# Auth token
AUTH_TOKEN=
# Maximum API request body size (MB)
MAX_API_SIZE=10
# FastGPT service URL. It can be an internal address and is used for callbacks to FastGPT APIs.
FASTGPT_BASE_URL=http://fastgpt-app:3000

# ================ Plugin runtime =====================
# Supported value: localPool
PLUGIN_RUNTIME_MODE=localPool
# Temporary file storage directory. Can be empty.
LOCAL_FILE_BASE_PATH=

# ================ Process pool =====================
# Health check interval (ms)
POOL_HEALTH_CHECK_INTERVAL=30000
# Maximum total process count
POOL_MAX_TOTAL_PODS=100
# Minimum process count for one Service
POOL_SERVICE_MIN_PODS=0
# Maximum process count for one Service
POOL_SERVICE_MAX_PODS=5
# Global idle timeout (ms)
POOL_SERVICE_IDLE_TIMEOUT=60000
# Process runtime timeout (ms)
POOL_SERVICE_POD_TIMEOUT=120000
# Maximum concurrent requests per process
POOL_SERVICE_MAX_CONCURRENT_REQUESTS_PER_POD=10
# Global maximum requests per process before automatic rotation
POOL_SERVICE_MAX_REQUESTS_PER_POD=100
# Global maximum process queue length
POOL_SERVICE_MAX_QUEUE_SIZE=500
# Global process queue timeout (ms)
POOL_SERVICE_QUEUE_TIMEOUT=60000
# Startup retry backoff base delay (ms)
POOL_SERVICE_STARTUP_RETRY_BASE_DELAY=1000
# Startup retry backoff maximum delay (ms)
POOL_SERVICE_STARTUP_RETRY_MAX_DELAY=10000

# ================ Database =====================
MONGODB_URI=mongodb://username:password@localhost:27017/fastgpt?authSource=admin&directConnection=true
MONGO_MAX_LINK=20
SYNC_INDEX=true
REDIS_URL=redis://default:password@localhost:6379/0

# ================ Object storage =====================
# S3 file prefix. Do not change it casually after use.
S3_FILE_BASE_PATH=system/plugin
STORAGE_VENDOR=minio
STORAGE_REGION=us-east-1
STORAGE_ACCESS_KEY_ID=minioadmin
STORAGE_SECRET_ACCESS_KEY=minioadmin
STORAGE_PUBLIC_BUCKET=fastgpt-public
STORAGE_PRIVATE_BUCKET=fastgpt-private
STORAGE_EXTERNAL_ENDPOINT=http://localhost:9000
STORAGE_S3_ENDPOINT=http://localhost:9000
STORAGE_S3_FORCE_PATH_STYLE=true
STORAGE_S3_MAX_RETRIES=3
STORAGE_PUBLIC_ACCESS_EXTRA_SUB_PATH=

# ================ Logs =====================
LOG_ENABLE_CONSOLE=true
# Console log level: "trace" | "debug" | "info" | "warning" | "error" | "fatal"
LOG_CONSOLE_LEVEL=info
LOG_ENABLE_OTEL=false
# Minimum log level stored in OTEL
LOG_OTEL_LEVEL=info
LOG_OTEL_SERVICE_NAME=fastgpt-plugin
LOG_OTEL_URL=http://localhost:4318/v1/logs

# ================ Metrics =====================
METRICS_ENABLE_OTEL=false
METRICS_OTEL_SERVICE_NAME=fastgpt-plugin
METRICS_OTEL_URL=http://localhost:4318/v1/metrics
METRICS_EXPORT_INTERVAL_MS=30000
METRICS_EXPORT_TIMEOUT_MS=10000
METRICS_INCLUDE_PLUGIN_VERSION=true
METRICS_INCLUDE_PLUGIN_ETAG=false
METRICS_INCLUDE_HOSTNAME=true
# For multi-node deployments, use Pod UID / container id / instance id. Empty generates an opaque id.
SERVICE_INSTANCE_ID=
DEPLOYMENT_ENVIRONMENT=

2. OpenSandbox Changes (as needed)

OpenSandbox and other sandbox provider settings have moved to Sandbox Configuration and are no longer built into the deployment yml.

For this upgrade, focus on:

Deploying the agent-sandbox-proxy service.
Updating OpenSandbox-related image versions.
Updating related environment variables in fastgpt-app and fastgpt-pro.

3. Image Changes

Update fastgpt-app (FastGPT main service) image tag: v4.15.0
Update fastgpt-pro (FastGPT commercial edition) image tag: v4.15.0
Update fastgpt-code-sandbox image tag: v4.15.0
Update fastgpt-plugin image tag: v1.0.0
Update aiproxy image tag: v0.6.5

If OpenSandbox is enabled, also update:

fastgpt-agent-sandbox-proxy image tag: v0.2.0
fastgpt-agent-sandbox image tag: v0.2.0

4. Start Services

Run docker compose up -d to restart services.

5. Reinstall System Tools

After upgrading the plugin service, reinstall all legacy system tools:

Download the zip package that contains all system tools.
Open the FastGPT web app, click Admin in the navbar, then Add Plugin, then Import/Update Plugin. Upload the zip package and confirm.

You can also install them one by one from the plugin marketplace: https://v2.marketplace.fastgpt.cn. The environment variable default now points to this address, so no marketplace-related variables are required.

6. Run Migration Scripts

Before running scripts:

Back up MongoDB, object storage, and your current deployment configuration.
Upgrade fastgpt-app / fastgpt-pro to image versions that include these root-admin APIs.
Prepare a reachable FastGPT {{host}} and {{rootkey}}. All APIs below require rootkey.
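
All of the scripts in this section are POST requests to the same API prefix with the same headers, so a small wrapper can reduce copy-paste mistakes. This is a sketch only: `admin_data_clean` and the environment variable names `FASTGPT_HOST` and `FASTGPT_ROOTKEY` are hypothetical and stand in for {{host}} and {{rootkey}}.

```shell
# Hypothetical wrapper (not part of FastGPT). Assumes FASTGPT_HOST and
# FASTGPT_ROOTKEY are exported; both names are illustrative.
admin_data_clean() {
  endpoint=$1   # last path segment, e.g. cleanupDuplicateChats
  payload=$2    # JSON request body
  curl -sS -X POST "https://${FASTGPT_HOST}/api/admin/dataClean/${endpoint}" \
    -H 'Content-Type: application/json' \
    -H "rootkey: ${FASTGPT_ROOTKEY}" \
    -d "$payload"
}
```

With the two variables exported, a dry-run call would look like `admin_data_clean cleanupDuplicateChats '{"dryRun":true,"sampleLimit":20}'`.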

6.1 Clean Duplicate appId-chatId Records (optional, but recommended)

The stable release syncs two unique indexes: { appId, chatId } and { sourceType, appId, chatId }. Before the first index sync, check for and clean up duplicate appId + chatId records in the chats collection. Otherwise, when SYNC_INDEX=true, the index sync may fail with an E11000 duplicate key error, and the unique constraints will not take effect.

This API depends on the upgraded fastgpt-app image. If the first stable-release startup already reports a unique index conflict but the service is still reachable, run the dry-run and cleanup commands below, then restart the service so it can sync indexes again.

Run the dry-run first. This does not delete data, and every deployment should run it at least once:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/cleanupDuplicateChats' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":true,"sampleLimit":20}'

Check these response fields:

duplicateDocumentCount: expected number of duplicate chats headers to delete.
samples: duplicate samples, including the retained keepId and candidate deleteIds.
deletedDocumentCount: number actually deleted during apply. It is always 0 in dry-run mode.

If duplicateDocumentCount=0, no apply step is needed. If it is greater than 0, confirm the samples and then apply the cleanup:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/cleanupDuplicateChats' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":false,"sampleLimit":20}'
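
The dry-run-then-apply decision can be scripted. The sketch below pulls duplicateDocumentCount out of a saved response and reports the next step. `duplicate_count` is a hypothetical helper, and the sample response shape is an assumption for illustration; only the field names come from this guide.

```shell
# Hypothetical helper: prints the duplicateDocumentCount value found in a
# JSON response. It matches on the field name, so it does not depend on
# how the response is wrapped. Prints nothing if the field is absent.
duplicate_count() {
  printf '%s\n' "$1" |
    sed -n 's/.*"duplicateDocumentCount"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Sample dry-run response; the exact shape is assumed for illustration.
response='{"data":{"duplicateDocumentCount":3,"deletedDocumentCount":0}}'
count=$(duplicate_count "$response")

if [ -z "$count" ]; then
  echo "duplicateDocumentCount not found; inspect the response manually" >&2
elif [ "$count" -gt 0 ]; then
  echo "$count duplicates found: review samples, then rerun with dryRun=false"
else
  echo "no duplicates: skip the apply step"
fi
```

A JSON-aware tool such as `jq` is more robust if it is available. The `sed` pattern is used here only to keep the sketch dependency-free.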

Cleanup policy: for each duplicate appId + chatId group, the API keeps the record with the latest updateTime. If timestamps are equal, it uses _id descending as a stable tie-breaker. The API only deletes duplicate chats headers. It does not delete message content in chatitems or chat_item_responses.

After cleanup, keep SYNC_INDEX=true and restart fastgpt-app / fastgpt-pro so the service can sync indexes again. You can then connect to MongoDB and confirm that both indexes report unique: true:

db.chats
  .getIndexes()
  .filter((idx) => ['appId_1_chatId_1', 'sourceType_1_appId_1_chatId_1'].includes(idx.name));

6.2 Workflow V1 -> V2 Migration (optional)

Run this only when upgrading directly from a version earlier than v4.8, or when your deployment still contains historical V1 Workflow data. The API defaults to dry-run mode. It scans, converts, and validates the saved structure without writing to the database.

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/v1WorkflowToV2' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":true}'

After confirming the returned statistics, apply the migration:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/v1WorkflowToV2' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":false}'

Skip this step if you already completed the V1 -> V2 migration in an earlier version, or if you are upgrading from v4.8 or later.
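
When scripting the version condition, compare version parts numerically. A plain string comparison would wrongly sort 4.15 before 4.8. The sketch below uses a hypothetical helper, `needs_v1_migration`, and covers only the version condition; it cannot tell whether historical V1 Workflow data still exists in your database.

```shell
# Hypothetical helper: succeeds when the given version (major.minor[.patch],
# optional leading "v") is earlier than 4.8.
needs_v1_migration() {
  version=${1#v}          # strip an optional leading "v"
  major=${version%%.*}    # text before the first dot
  rest=${version#*.}      # text after the first dot
  minor=${rest%%.*}       # text before the next dot
  [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 8 ]; }
}

for v in 4.7.1 4.8.0 4.15.0; do
  if needs_v1_migration "$v"; then
    echo "$v: run the V1 -> V2 migration"
  else
    echo "$v: skip unless V1 data remains"
  fi
done
```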

6.3 Workflow Dirty-Data Cleanup (required)

This script scans and fixes historical enum-expression strings, nullish values, and legacy-structure compatibility issues in apps.modules and app_versions.nodes. All self-hosted deployments should run the dry-run first. If the returned statistics show fixable data, apply the write step.

If 6.2 applies to your deployment, run this script after 6.2 completes.

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/initWorkflowData' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":true,"batchSize":1000,"writeBatchSize":10}'

After confirming the dry-run result, apply the cleanup:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/initWorkflowData' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"dryRun":false,"batchSize":1000,"writeBatchSize":10}'

Lower writeBatchSize if production write pressure is high. Documents that fail Zod validation are reported in the response and are not written back to the database.
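
To make writeBatchSize easy to tune between runs, the request body can be generated instead of edited by hand. `workflow_clean_payload` is a hypothetical helper, and the value 5 is only an example of a lower setting, not a recommendation from FastGPT.

```shell
# Hypothetical helper: prints the JSON body for initWorkflowData.
workflow_clean_payload() {
  dry_run=$1            # true or false
  write_batch_size=$2   # lower this under heavy write load
  printf '{"dryRun":%s,"batchSize":1000,"writeBatchSize":%s}\n' \
    "$dry_run" "$write_batch_size"
}

workflow_clean_payload true 10   # same body as the dry-run call
workflow_clean_payload false 5   # apply with a smaller write batch
```

The output can be passed to curl directly, for example `-d "$(workflow_clean_payload false 5)"`.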

6.4 Archive Legacy Sandboxes (optional)

If you used legacy sandbox workspaces, this API can fix historical sandbox status fields and optionally archive inactive workspaces to S3. This step does not affect newly generated sandboxes. Skipping it does not block the v4.15 stable upgrade; old workspaces simply will not be archived automatically. You can also delete old sandboxes manually.

Check only, without triggering archive:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/initSandboxArchive' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"runArchive":false,"inactiveDays":0}'

If you want to immediately archive inactive workspaces that match the condition:

bash

curl -X POST 'https://{{host}}/api/admin/dataClean/initSandboxArchive' \
  -H 'Content-Type: application/json' \
  -H 'rootkey: {{rootkey}}' \
  -d '{"runArchive":true,"inactiveDays":0}'

Major Impacts

API Key behavior has changed. FastGPT no longer distinguishes between app keys and system keys; only system keys are kept. For OpenAI SDK compatibility, pass the token as apikey-appId. Existing API keys remain compatible and continue to work. For details, see the FastGPT API documentation.
Some APIs now enforce stricter data format validation. If you see a zod parse error, please submit an issue. It may be caused by legacy data or custom data structures that do not match the declared schema.
LLM request traces now enforce team isolation. llm_request_records stores teamId, GET /api/core/ai/record/getRecord queries by { requestId, teamId }, and the unique index changes to { teamId, requestId }. Trace records written before this upgrade do not contain teamId and can no longer be queried; the UI will treat them as expired. Export relevant logs or keep original request details before upgrading if you need to investigate historical calls. If your self-hosted deployment has SYNC_INDEX disabled, run an index sync after upgrading so the old requestId_1 unique index is removed.
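
For the API key change above, the apikey-appId token is the system key and the app ID joined by a hyphen. The sketch below only shows that string composition. `openai_compatible_key` is a hypothetical helper and both argument values are placeholders; refer to the FastGPT API documentation for how the token is passed in a request.

```shell
# Hypothetical helper: joins a system API key and an app ID into the
# apikey-appId form described above.
openai_compatible_key() {
  api_key=$1
  app_id=$2
  printf '%s-%s\n' "$api_key" "$app_id"
}

# Placeholder values for illustration only.
openai_compatible_key 'fastgpt-xxxx' 'your-app-id'   # prints fastgpt-xxxx-your-app-id
```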

🚀 New Features

Added the Skill module. Agent V2 can bind and run static Skills.
Reworked the Agent V2 loop logic to improve stability for multi-step tool calls and orchestration.
Sandbox now supports custom npm and pip sources.
Reworked the plugin system architecture, added plugin-level runtime config, and moved system tool execution to local-pool.
The commercial edition now supports local direct-connect debugging for FastGPT plugins.
Reworked the chatbox UI with quick scroll-to-bottom, model-generated chat titles, and smoother streaming output.
Added LLM-generated chat titles.
Added the Loop node and deprecated the legacy batch execution node.
Knowledge Base search now supports native multimodal embedding models, image-to-image search, and permission filtering in Agent mode.
Multimodal models now support audio and video input.
API Key logic is optimized. API Key management is unified, and requests now explicitly pass app context.
Generated separate DevAPI and System OpenAPI documentation.
Added quick-reply output syntax.
Added DingTalk Knowledge Base integration for third-party Knowledge Bases.
Added model reasoning configuration.
Workflow template export now includes the template name and description.
Global variable inputs now support object-type data.
In tool call mode, when the virtual machine feature is enabled, files uploaded in the user chat input are injected directly into the VM.
Added worker pools for file parsing, HTML-to-Markdown conversion, and text chunking to prevent resource exhaustion under high concurrency.
Added a directory depth environment variable to avoid infinitely nested directories. Configure it with MAX_FOLDER_DEPTH.
S3 now supports CDN configuration.
Rerank now supports defaultConfig.
Share links and portal pages now support language switching and no longer force language detection from the browser.
Chat API now validates duplicate dataId values to prevent invalid data from entering Workflow execution and stream-resume merge logic.
The HTTP node now supports ignoring TLS certificate verification and returning the complete error object.

⚙️ Improvements

Plugin execution entries can now be fetched from object storage and cached in a local directory.
Optimized the OTEL log collection format.
Disabled invalid connection mode in Workflows.
Added mutually exclusive parent-child node selection to prevent jitter when moving selected parent and child nodes together.
Improved Workflow node name, description input, and long-name adaptation.
When the user is redirected from the Workflow editor because the login session expires, the draft is automatically saved for recovery.
In Workflow run details, file fields from form input nodes are displayed as file lists.
Strengthened validation for Workflow array reference types to avoid conflicts with two-dimensional data.
Image processing workers now support configuring whether images are converted to base64 before being sent to the model through MULTIPLE_DATA_TO_BASE64=true.
HTML output now automatically switches to preview mode after generation, reducing the need to open the preview manually.
Improved stream-resume pause and abnormal interruption recovery to reduce chats getting stuck in inaccurate generating or stopping states.
The most recent chat is remembered per app when switching apps, and local chat cache is cleared when switching teams.
Improved the Knowledge Base search test interaction and Knowledge Base data editing modal.
When a Knowledge Base is deleted, app orchestration now shows a graceful prompt.
Improved error prompts during Knowledge Base training and added one-click retry for all failed items.
Invalid Knowledge Base reference markers are now filtered out.
PDF parsing now uses liteparse instead of PDFJs, improving speed by 3x.
xlsx parsing now automatically removes empty rows and columns and supports merged cells.
Added validation for input guide configuration to prevent incorrect custom dictionary URL configuration.
Strengthened security protection for third-party Knowledge Base requests, HTTP tool parsing, IP detection, and Code Sandbox AST checks.
File injection in messages moved from system messages to user messages to improve cache hit rates.
Improved the reason hide toggle so reasoning can be hidden in the UI while still being preserved when requesting the LLM.
Optimized chat2messages adaptation to avoid standalone reason output.
Empty tool responses are now automatically filled with none to avoid errors in some models.
Improved the insufficient-balance prompt for non-admin users and visitors.
Template features are hidden when the user does not have creation permission.
Improved long-name display for apps, Knowledge Bases, files, and folders: names are truncated when they exceed the available width, and the full name is shown on hover.
Improved Skill-related modals, editing interactions, and list API performance.
Improved the login page UI.
Deduplicated site sync rate-limit error prompts.
Added virtual list rendering for apps and Knowledge Bases to improve large-list performance.
LLM request traces now use team-isolated queries to prevent request IDs from exposing request bodies, retrieved Knowledge Base chunks, and model responses across teams.

🐛 Bug Fixes

Fixed an issue where a model response error in Agent V2 mode caused steps to execute repeatedly.
Fixed missing charset in text responses when previewing or downloading Knowledge Base source files.
Fixed abnormal default values in Workflow single-node debugging.
Fixed abnormal defaultConfig override behavior in model configuration.
Fixed TTS playback errors when adapting to the latest OpenAI SDK.
Fixed oversized chunks that could occur when Knowledge Base data chunks contained code blocks.
Fixed abnormal multimodal file link retrieval from models.
Fixed potential security risks related to the training API, HTTP tool parsing, and private S3 object keys.
Fixed abnormal MCP tool expansion for tool calls after interactive nodes.
Fixed abnormal tool call parameter schemas for array and object types in Workflow tools.
Fixed UI offset in publish channel portals.
Fixed the v1/completions API where quoteList in nodeResponse did not return q and a.
Fixed conversation stream resume issues, including form restoration, file list restoration, node response preservation, duplicate interaction appending, temporary history titles, and cross-app chat leakage.
Stop conversation prompts are now synchronized with the backend generation state, and the warning toast shown during stop has been removed.
The v1/chat/completions API previously filtered out q/a/index when returning nodeResponse; this version restores those fields.

🛠️ Code Improvements

Reorganized the overall code structure, upgraded Next.js, switched to Turbopack builds, and upgraded the default container Node.js version to 24.
Unified Agent tool declaration and execution behavior.
The plugin service moved from the legacy runtime structure to a pnpm workspace monorepo, split into HTTP service entry, domain model, use cases, API adapter, infrastructure, SDK, and CLI.
Application-related API interfaces now use zod schemas consistently and generate documentation.
Split AI request, Workflow run detail, and chatbox code to reduce module coupling.
Optimized user-defined API key billing logic and token calculation dependencies.
Server-side environment loading now uses @t3-oss/env-core with stronger type checks. Other services also use centralized environment exports.
Upgraded project tooling, including ESLint, Prettier, textlint, lint-staged, and TS6.
Improved unit test performance, reducing full test runtime from 10 minutes to 5 minutes.
Strengthened GitHub Actions security.
Added design documentation and unit tests for stream-resume-related modules.
Changed the volume manager runtime from Bun to Node.js.
Images are now processed promptly inside workers instead of retaining base64 data, reducing memory usage.
Added string length protection for system string processing. When a string exceeds the configured limit, synchronous replacement is skipped to avoid high CPU load.
Workflow nodeResponse is now stored in a flattened structure to avoid save failures in large nested Workflows.
Removed temperature and max_tokens from all built-in LLM requests to avoid incompatibility with some models.
Fixed dirty enum-expression strings such as FlowNodeInputTypeEnum.*, FlowNodeOutputTypeEnum.*, and WorkflowIOValueTypeEnum.* in Workflow node configuration that caused input rendering and IO type checks to behave incorrectly.
Fixed an issue in Workflow text boxes where Ctrl+C could be intercepted by the node copy behavior, preventing text from being copied.
The chat API has been abstracted from app-specific handling into a platform-level capability.