core/http/views/settings.html
Configure watchdog and backend request settings
Configure automatic monitoring and management of backend processes
Enable Watchdog
Enable automatic monitoring of backend processes
Enable Idle Check
Automatically stop backends that are idle for too long
Idle Timeout
Time before an idle backend is stopped (e.g., 15m, 1h)
Enable Busy Check
Automatically stop backends that are busy for too long (stuck processes)
Busy Timeout
Time before a busy backend is stopped (e.g., 5m, 30m)
Check Interval
How often the watchdog checks backends and memory usage (e.g., 2s, 30s)
Force Eviction When Busy
Allow evicting models even when they have active API calls (default: disabled for safety)
LRU Eviction Max Retries
Maximum number of retries when waiting for busy models to become idle (default: 30)
LRU Eviction Retry Interval
Interval between retries when waiting for busy models to become idle (e.g., 1s, 2s; default: 1s)
Automatically evict backends when memory usage exceeds a threshold. GPU VRAM is monitored if available, otherwise system RAM. Backends are evicted in least-recently-used (LRU) order.
Current Memory Status
Refresh
System RAM
Memory monitoring unavailable
Enable Memory Reclaimer
Evict backends when memory usage exceeds threshold
Memory Threshold (%)
Backends are evicted when memory usage exceeds this percentage (valid range: 50-100)
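As an illustration of the threshold rule, here is a small sketch that validates the percentage and decides whether reclaiming should start. Both function names are hypothetical, and treating a zero total as "monitoring unavailable" is an assumption:

```go
package main

import "fmt"

// validateThreshold enforces the 50-100 range stated for the field.
func validateThreshold(percent float64) error {
	if percent < 50 || percent > 100 {
		return fmt.Errorf("memory threshold must be between 50 and 100, got %v", percent)
	}
	return nil
}

// shouldReclaim reports whether usage strictly exceeds the threshold.
// A total of zero is treated as "monitoring unavailable" (assumption),
// in which case nothing is evicted.
func shouldReclaim(usedBytes, totalBytes uint64, thresholdPercent float64) bool {
	if totalBytes == 0 {
		return false
	}
	// Compare used*100 against threshold*total to avoid a division.
	return float64(usedBytes)*100 > thresholdPercent*float64(totalBytes)
}
```

The comparison is strict, matching the wording "exceeds": usage exactly at the threshold does not trigger eviction in this sketch.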
Configure how backends handle multiple requests
Max Active Backends
Maximum number of models to keep loaded at once (0 = unlimited, 1 = single backend mode). The least recently used models are evicted when the limit is reached.
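The limit described above can be pictured as a small LRU list. This is a simplified sketch, not the application's implementation: `backendLRU` and its methods are hypothetical, and it evicts after recording the new model, whereas a real loader might evict first.

```go
package main

import "container/list"

// backendLRU tracks loaded models in most-recently-used order.
type backendLRU struct {
	maxActive int                      // 0 means unlimited
	order     *list.List               // front = most recently used
	items     map[string]*list.Element // model name -> list entry
}

func newBackendLRU(maxActive int) *backendLRU {
	return &backendLRU{
		maxActive: maxActive,
		order:     list.New(),
		items:     make(map[string]*list.Element),
	}
}

// touch marks a model as used, adding it if it is not yet loaded, and
// returns the names of any models evicted to respect maxActive.
func (l *backendLRU) touch(name string) []string {
	if el, ok := l.items[name]; ok {
		l.order.MoveToFront(el)
		return nil
	}
	l.items[name] = l.order.PushFront(name)

	var evicted []string
	for l.maxActive > 0 && l.order.Len() > l.maxActive {
		victim := l.order.Remove(l.order.Back()).(string)
		delete(l.items, victim)
		evicted = append(evicted, victim)
	}
	return evicted
}
```

With a limit of 2, touching models a, b, a, c in that order evicts b, because a was used more recently. With a limit of 1, every new model evicts the previous one, which corresponds to single backend mode.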
Parallel Backend Requests
Enable backends to handle multiple requests in parallel (if supported)
Configure default performance parameters for models
Default Threads
Number of threads to use for model inference (0 = auto)
Default Context Size
Default context window size for models
F16 Precision
Use 16-bit floating point precision
Debug Mode
Enable debug logging
Enable Tracing
Enable tracing of requests and responses
Tracing Max Items
Maximum number of tracing items to keep
Configure CORS and CSRF protection
Enable CORS
Enable Cross-Origin Resource Sharing
CORS Allow Origins
Comma-separated list of allowed origins
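A comma-separated origins value might be split and matched along these lines. The helper names are hypothetical, and treating `*` as "allow any origin" is an assumption borrowed from common CORS practice rather than something stated here:

```go
package main

import "strings"

// parseOrigins splits a comma-separated list, trimming whitespace and
// dropping empty entries.
func parseOrigins(raw string) []string {
	var origins []string
	for _, part := range strings.Split(raw, ",") {
		if o := strings.TrimSpace(part); o != "" {
			origins = append(origins, o)
		}
	}
	return origins
}

// originAllowed reports whether origin matches an allowed entry.
// The "*" wildcard is an assumption, not confirmed by the settings page.
func originAllowed(origin string, allowed []string) bool {
	for _, a := range allowed {
		if a == "*" || a == origin {
			return true
		}
	}
	return false
}
```

For example, the input `https://a.example, https://b.example` produces two entries, and only requests from those exact origins pass `originAllowed`.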
Enable CSRF Protection
Enable Cross-Site Request Forgery protection
Configure peer-to-peer networking
P2P Token
Authentication token for P2P network (set to 0 to generate a new token)
P2P Network ID
Network identifier for P2P connections
Federated Mode
Enable federated instance mode
Configure agent job retention and cleanup
Job Retention Days
Number of days to keep job history (default: 30)
Configure Open Responses API response storage
Response Store TTL
Time-to-live for stored responses (e.g., 1h, 30m; 0 = no expiration)
Manage API keys for authentication. Keys from environment variables are always included.
API Keys
List of API keys (one per line or comma-separated)
Note: API keys are sensitive. Handle with care.
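Since the field accepts keys either one per line or comma-separated, and keys from environment variables are always included, the handling could look roughly like this. Both helpers are hypothetical, and removing duplicates is an assumption:

```go
package main

import "strings"

// parseAPIKeys splits the field on commas and line breaks, trimming
// whitespace and skipping empty entries.
func parseAPIKeys(raw string) []string {
	parts := strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == '\n' || r == '\r'
	})
	var keys []string
	for _, p := range parts {
		if k := strings.TrimSpace(p); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

// mergeKeys returns envKeys followed by configured keys, without
// duplicates, so environment keys are always present.
func mergeKeys(envKeys, configured []string) []string {
	seen := make(map[string]bool)
	var merged []string
	for _, group := range [][]string{envKeys, configured} {
		for _, k := range group {
			if !seen[k] {
				seen[k] = true
				merged = append(merged, k)
			}
		}
	}
	return merged
}
```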
Configure model and backend galleries
Autoload Galleries
Automatically load model galleries on startup
Autoload Backend Galleries
Automatically load backend galleries on startup
Model Galleries (JSON)
Array of gallery objects with 'url' and 'name' fields
Backend Galleries (JSON)
Array of backend gallery objects with 'url' and 'name' fields
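Both gallery fields expect a JSON array of objects with `url` and `name` fields. A minimal validation sketch follows; the `gallery` struct and `parseGalleries` are hypothetical, and the URL in the usage note is a placeholder rather than a real gallery:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gallery mirrors the two fields named in the help text.
type gallery struct {
	URL  string `json:"url"`
	Name string `json:"name"`
}

// parseGalleries decodes a JSON array of galleries and checks that
// every entry has both fields set.
func parseGalleries(raw string) ([]gallery, error) {
	var galleries []gallery
	if err := json.Unmarshal([]byte(raw), &galleries); err != nil {
		return nil, fmt.Errorf("invalid gallery JSON: %w", err)
	}
	for i, g := range galleries {
		if g.URL == "" || g.Name == "" {
			return nil, fmt.Errorf("gallery %d must have both 'url' and 'name'", i)
		}
	}
	return galleries, nil
}
```

An input such as `[{"url": "https://example.com/index.yaml", "name": "example"}]` passes this check, while an entry missing either field is rejected.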
{{template "views/partials/footer" .}}