Metrics

OpenViking provides a machine-oriented metrics system for exposing runtime health, request quality, model usage, resource processing throughput, and probe health states.

Unlike the human-facing /api/v1/observer/* endpoints and the analytics-oriented /api/v1/stats/* endpoints, Metrics are designed for:

high-frequency scraping by Prometheus, Grafana Agent, and similar systems
low-cardinality, aggregatable metric models
monitoring, alerting, capacity observation, and regression diagnosis

Overview

Why Metrics

Metrics are well suited to answer questions like:

Has HTTP traffic increased abnormally over the last few minutes?
Are resource ingestion, retrieval, or model calls getting slower?
Is there queue backlog?
Are key dependencies such as storage, model providers, VikingDB, encryption, and async systems currently healthy?
Is a specific tenant showing abnormal traffic or error rates?

Compared with logs and observer snapshots, metrics are better for:

continuous scraping
time-series aggregation
dashboard visualization
alert rules

How Metrics Differ from Observer and Stats

Capability	Best For	Output Format	Typical Usage
`/metrics`	online monitoring, alerting, trend aggregation	Prometheus exposition text	Grafana dashboards, Prometheus scraping
`/api/v1/observer/*`	human inspection of component snapshots	JSON / status tables	debugging, health checks
`/api/v1/stats/*`	analytics-oriented statistics	JSON	memory health, staleness, session extraction

The boundary is:

/metrics only carries low-cardinality, low-cost metrics
/api/v1/stats/* continues to carry analytics-oriented statistics without being constrained by the Prometheus scraping model

Metrics Architecture

The current metrics stack in OpenViking has four layers:

text

Business logic / HTTP requests / background tasks
          │
          ▼
      DataSource
   (event emission / state reads)
          │
          ▼
      Collector
 (semantic routing + labels)
          │
          ▼
    MetricRegistry
   (in-process metric store)
          │
          ▼
      Exporter
 (Prometheus text rendering)
          │
          ▼
       /metrics

DataSource

DataSources provide inputs to the metrics system in two main forms:

Event-based: business code emits events at key points, such as retrieval completion, successful model calls, or resource ingestion stage completion
Read-based: current state is read before /metrics export, such as queue state, lock state, or probe state

Collector

Collectors turn inputs into metric semantics:

choose which metric to write
choose which labels to attach
define how failure is exposed, such as valid=1/0

MetricRegistry

The MetricRegistry is the in-process metric store that keeps the current metric values and serves them to the exporter.

Exporter

The first exporter implementation is the Prometheus exporter, which renders registry contents into Prometheus exposition text.

Usage

Accessing `/metrics`

In the current implementation, /metrics is not wired to get_request_context or other auth dependencies, so from the code-path perspective it currently behaves as a public scrape endpoint.

bash

curl http://localhost:1933/metrics

If your deployment protects /metrics at the gateway, reverse proxy, or service discovery layer, attach auth according to the deployment environment.

Prometheus Scrape Example

yaml

scrape_configs:
  - job_name: openviking
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:1933"]

Understanding Common Labels

Label	Meaning	Example
`account_id`	tenant dimension label	`test-account`, `__unknown__`, `__overflow__`
`route`	HTTP route template	`/api/v1/search/find`
`method`	HTTP method	`GET`, `POST`
`status`	request or stage status	`200`, `ok`, `error`
`operation`	structured operation name	`search.find`, `resources.add_resource`
`context_type`	retrieval context type	`resource`
`provider`	model or external service provider	`volcengine`
`model_name`	model name	`doubao-seed-1-8-251228`
`stage`	stage label (defined by each metric family)	resource stage: `parse`; token attribution stage: `embed_query`
`valid`	whether the current sample is fresh and valid	`1` / `0`

Notes:

account_id is only enabled on controlled allowlisted metric families to prevent high-cardinality growth
valid=0 means the current state/probe sample is a fallback or stale value, not that the label itself is malformed
stage semantics depend on the metric family:
- openviking_resource_stage_*: resource ingestion pipeline stages (for example parse/persist/process)
- openviking_operation_tokens_total: token attribution stages (for example embed_query/rerank/vlm)

Key Metric Families

The metric summaries below are based on representative metrics currently exposed by the collectors in openviking/metrics/collectors/.

Requests and Operations

Metric Family	Type	Common Labels	Meaning
`openviking_http_requests_total`	Counter	`account_id, method, route, status`	total HTTP requests
`openviking_http_request_duration_seconds`	Histogram	`account_id, method, route, status`	HTTP latency distribution
`openviking_http_inflight_requests`	Gauge	`account_id, route`	current inflight requests (in-process approximation)
`openviking_operation_requests_total`	Counter	`account_id, operation, status`	total structured operations
`openviking_operation_duration_seconds`	Histogram	`account_id, operation, status`	structured operation duration distribution

Typical usage:

inspect whether /api/v1/search/find or /api/v1/resources is slowing down
inspect whether a specific operation has elevated error rates

Retrieval and Resource Processing

Metric Family	Type	Common Labels	Meaning
`openviking_retrieval_requests_total`	Counter	`account_id, context_type`	retrieval request count
`openviking_retrieval_results_total`	Counter	`account_id, context_type`	total retrieved results
`openviking_retrieval_latency_seconds`	Histogram	`account_id, context_type`	retrieval latency distribution
`openviking_retrieval_zero_result_total`	Counter	`account_id, context_type`	retrieval zero-result count
`openviking_retrieval_rerank_used_total`	Counter	`account_id`	number of retrievals that used rerank
`openviking_retrieval_rerank_fallback_total`	Counter	`account_id`	retrieval rerank fallback count
`openviking_resource_stage_total`	Counter	`account_id, stage, status`	count of resource ingestion stages
`openviking_resource_stage_duration_seconds`	Histogram	`account_id, stage, status`	duration distribution of ingestion stages
`openviking_resource_wait_duration_seconds`	Histogram	`account_id, operation`	resource ingestion wait duration distribution (for example queue waiting)

Typical stage values include:

request
parse
summarize
persist
finalize
process

Vector, Memory, and Semantic Metrics

Metric Family	Type	Common Labels	Meaning
`openviking_vector_searches_total`	Counter	`operation`	vector search count
`openviking_vector_scored_total`	Counter	`operation`	total scored candidates
`openviking_vector_passed_total`	Counter	`operation`	total passed candidates
`openviking_vector_returned_total`	Counter	`operation`	total returned candidates
`openviking_vector_scanned_total`	Counter	`operation`	total scanned candidates
`openviking_memory_extracted_total`	Counter	`operation`	total extracted memory items
`openviking_semantic_nodes_total`	Counter	`status`	total semantic nodes

Model Calls and Tokens

Metric Family	Type	Common Labels	Meaning
`openviking_model_calls_total`	Counter	`model_type, provider, model_name`	unified model call count
`openviking_model_tokens_total`	Counter	`model_type, provider, model_name, token_type`	unified model token count
`openviking_vlm_calls_total`	Counter	`account_id, provider, model_name`	VLM call count
`openviking_vlm_tokens_input_total`	Counter	`account_id, provider, model_name`	VLM input tokens
`openviking_vlm_tokens_output_total`	Counter	`account_id, provider, model_name`	VLM output tokens
`openviking_vlm_tokens_total`	Counter	`account_id, provider, model_name`	VLM total tokens
`openviking_vlm_call_duration_seconds`	Histogram	`account_id, provider, model_name`	VLM call duration distribution
`openviking_embedding_requests_total`	Counter	`account_id, status`	embedding request count
`openviking_embedding_latency_seconds`	Histogram	`account_id, status`	embedding latency distribution
`openviking_embedding_errors_total`	Counter	`account_id, error_code`	embedding error count
`openviking_embedding_calls_total`	Counter	`account_id, provider, model_name`	embedding provider call count (per-call)
`openviking_embedding_call_duration_seconds`	Histogram	`account_id, provider, model_name`	embedding provider call duration distribution (per-call)
`openviking_embedding_tokens_input_total`	Counter	`account_id, provider, model_name`	embedding input tokens (per-call aggregate)
`openviking_embedding_tokens_output_total`	Counter	`account_id, provider, model_name`	embedding output tokens (per-call aggregate; may not appear if always 0)
`openviking_embedding_tokens_total`	Counter	`account_id, provider, model_name`	embedding total tokens (per-call aggregate)
`openviking_rerank_calls_total`	Counter	`account_id, provider, model_name`	rerank provider call count (per-call)
`openviking_rerank_call_duration_seconds`	Histogram	`account_id, provider, model_name`	rerank provider call duration distribution (per-call)
`openviking_rerank_tokens_input_total`	Counter	`account_id, provider, model_name`	rerank input tokens (per-call aggregate)
`openviking_rerank_tokens_output_total`	Counter	`account_id, provider, model_name`	rerank output tokens (per-call aggregate; may not appear if always 0)
`openviking_rerank_tokens_total`	Counter	`account_id, provider, model_name`	rerank total tokens (per-call aggregate)
`openviking_operation_tokens_total`	Counter	`account_id, operation, stage, token_type`	operation token aggregation (token attribution stages)

Notes:

openviking_model_* gives a unified cross-model view for embedding and VLM usage
openviking_vlm_* and openviking_embedding_* are better suited for workload-specific dashboards

Queues, Locks, and Runtime State

Metric Family	Type	Common Labels	Meaning
`openviking_queue_processed_total`	Counter	`queue`	total processed items per queue
`openviking_queue_errors_total`	Counter	`queue`	total error count per queue
`openviking_queue_pending`	Gauge	`queue`	pending queue items
`openviking_queue_in_progress`	Gauge	`queue`	in-progress queue items
`openviking_lock_active`	Gauge	none	current active locks
`openviking_lock_waiting`	Gauge	none	locks currently waiting
`openviking_lock_stale`	Gauge	none	potentially stale locks

These help answer:

Is there queue backlog?
Is there lock contention or stale locking?

Tasks and Task Tracker

Metric Family	Type	Common Labels	Meaning
`openviking_task_pending`	Gauge	`task_type`	pending tasks tracked by task tracker
`openviking_task_running`	Gauge	`task_type`	running tasks tracked by task tracker
`openviking_task_completed`	Gauge	`task_type`	completed tasks tracked by task tracker
`openviking_task_failed`	Gauge	`task_type`	failed tasks tracked by task tracker

Cache

Metric Family	Type	Common Labels	Meaning
`openviking_cache_hits_total`	Counter	`level`	cache hit count
`openviking_cache_misses_total`	Counter	`level`	cache miss count

Session

Metric Family	Type	Common Labels	Meaning
`openviking_session_lifecycle_total`	Counter	`account_id, action, status`	session lifecycle event count
`openviking_session_contexts_used_total`	Counter	`account_id, action`	session contexts used total
`openviking_session_archive_total`	Counter	`account_id, status`	session archive count

Probes and Health State

Metric Family	Type	Common Labels	Meaning
`openviking_service_readiness`	Gauge	may include `valid`	main service readiness
`openviking_api_key_manager_readiness`	Gauge	may include `valid`	API key manager readiness
`openviking_storage_readiness`	Gauge	`probe, valid`	storage probe, for example `agfs`
`openviking_model_provider_readiness`	Gauge	`provider, valid`	model provider readiness
`openviking_async_system_readiness`	Gauge	`probe, valid`	async system readiness
`openviking_retrieval_backend_readiness`	Gauge	`probe, valid`	retrieval backend readiness
`openviking_encryption_component_health`	Gauge	`valid`	overall encryption component health
`openviking_encryption_root_key_ready`	Gauge	`valid`	whether the root key is ready
`openviking_encryption_kms_provider_ready`	Gauge	`provider, valid`	KMS provider readiness

Meaning of valid:

valid="1": the sample was produced by a successful refresh
valid="0": the sample is a fallback or stale value and should be treated with caution

Encryption (Operational Metrics)

Metric Family	Type	Common Labels	Meaning
`openviking_encryption_operations_total`	Counter	`account_id, operation, status`	encrypt/decrypt operation count
`openviking_encryption_duration_seconds`	Histogram	`account_id, operation, status`	encrypt/decrypt duration distribution
`openviking_encryption_bytes_total`	Counter	`account_id, operation`	encrypt/decrypt processed bytes total
`openviking_encryption_payload_size_bytes`	Histogram	`account_id, operation`	encrypt/decrypt payload size distribution
`openviking_encryption_auth_failed_total`	Counter	`account_id, status`	auth-failed count
`openviking_encryption_key_derivation_total`	Counter	`account_id, status`	key derivation count
`openviking_encryption_key_derivation_duration_seconds`	Histogram	`account_id, status`	key derivation duration distribution
`openviking_encryption_key_load_duration_seconds`	Histogram	`account_id, status, provider`	key load duration distribution
`openviking_encryption_key_cache_hits_total`	Counter	`account_id, provider`	key cache hit count
`openviking_encryption_key_cache_misses_total`	Counter	`account_id, provider`	key cache miss count
`openviking_encryption_key_version_usage_total`	Counter	`account_id, key_version`	key version usage count

Component and Observer Aggregate Metrics

Metric Family	Type	Common Labels	Meaning
`openviking_component_health`	Gauge	`component, valid`	component health state
`openviking_component_errors`	Gauge	`component, valid`	component error state
`openviking_observer_components_total`	Gauge	`valid`	number of observed components
`openviking_observer_components_unhealthy`	Gauge	`valid`	number of unhealthy components
`openviking_observer_components_with_errors`	Gauge	`valid`	number of components with errors

Typical component values include:

queue
models
lock
retrieval
vikingdb

VikingDB and Model Usage Statistics

Metric Family	Type	Common Labels	Meaning
`openviking_vikingdb_collection_health`	Gauge	`collection, valid`	collection health
`openviking_vikingdb_collection_vectors`	Gauge	`collection, valid`	current vector count per collection
`openviking_model_usage_available`	Gauge	`model_type, valid`	whether model usage statistics are currently available

Possible model_type values include:

vlm
embedding
rerank

Configuration Example

Enabling Metrics

In ov.conf, the metrics subsystem can be explicitly enabled through server.observability.metrics:

json

{
  "server": {
    "observability": {
      "metrics": {
        "enabled": true,
        "account_dimension": {
          "enabled": true,
          "max_active_accounts": 100,
          "metric_allowlist": [
            "openviking_http_requests_total",
            "openviking_http_request_duration_seconds",
            "openviking_http_inflight_requests",
            "openviking_operation_requests_total",
            "openviking_operation_duration_seconds",
            "openviking_vlm_calls_total",
          "openviking_vlm_call_duration_seconds",
          "openviking_rerank_*"
          ]
        }
      }
    }
  }
}

Recommended mental model:

server.observability.metrics.enabled: master switch for the metrics subsystem
server.observability.metrics.account_dimension: controls whether account_id labels are enabled and where they are allowed

Exporters

By default, OpenViking exports metrics via Prometheus exposition format at /metrics. You can also enable additional exporters under server.observability.metrics.exporters.

Key fields:

server.observability.metrics.exporters.prometheus.enabled: enable the Prometheus exporter (serves /metrics)
server.observability.metrics.exporters.otel.enabled: enable OTLP export from the same in-process registry
server.observability.metrics.exporters.otel.protocol: "grpc" or "http"
server.observability.metrics.exporters.otel.tls.insecure: OTLP/gRPC only; true means plaintext (no TLS)
server.observability.metrics.exporters.otel.endpoint: OTLP endpoint (for gRPC, use host:4317; for HTTP, use a full URL)
server.observability.metrics.exporters.otel.service_name: OTLP service.name resource attribute (default "openviking-server")
server.observability.metrics.exporters.otel.export_interval_ms: OTLP push interval in milliseconds (default 10000)
server.observability.metrics.exporters.otel.headers: optional custom OTLP headers; sent as gRPC metadata for gRPC and HTTP headers for HTTP
When using gRPC, header keys in headers should be lowercase, for example x-byteapm-appkey; HTTP does not have this restriction

Example:

json

{
  "server": {
    "observability": {
      "metrics": {
        "enabled": true,
        "exporters": {
          "prometheus": {
            "enabled": true
          },
          "otel": {
            "enabled": true,
            "protocol": "grpc",
            "tls": {
              "insecure": true
            },
            "endpoint": "otel-collector:4317",
            "service_name": "openviking-server",
            "export_interval_ms": 10000,
            "headers": {}
          }
        }
      }
    }
  }
}

Recommended `account_id` Usage

enabled by default, but only allowlisted metric families will receive tenant ids (empty allowlist still yields __unknown__)
do not turn user_id, session_id, or resource_uri into labels
only enable tenant dimensions on a small set of critical dashboard and alert metrics
metric_allowlist supports a limited wildcard syntax: only trailing * prefix matches (e.g. openviking_rerank_*, openviking_embedding_*)
a standalone * is not supported, nor full glob/regex patterns

Architecture Overview - overall OpenViking architecture
Multi-Tenant - account/user/agent isolation model
Data Encryption - storage-layer encryption and isolation
Metrics API - /metrics endpoint usage
Metrics Design - metrics system design details

Metrics

Metrics

Overview

Why Metrics

How Metrics Differ from Observer and Stats

Metrics Architecture

DataSource

Collector

MetricRegistry

Exporter

Usage

Accessing /metrics

Prometheus Scrape Example

Understanding Common Labels

Key Metric Families

Requests and Operations

Retrieval and Resource Processing

Vector, Memory, and Semantic Metrics

Model Calls and Tokens

Queues, Locks, and Runtime State

Tasks and Task Tracker

Cache

Session

Probes and Health State

Encryption (Operational Metrics)

Component and Observer Aggregate Metrics

VikingDB and Model Usage Statistics

Configuration Example

Enabling Metrics

Exporters

Recommended account_id Usage

Related Documentation

Accessing `/metrics`

Recommended `account_id` Usage