src/go/plugin/framework/functions/README.md
This document describes how the functions manager works today (current implementation).
`src/go/plugin/framework/functions`

- `manager.go`
- `manager_worker.go`
- `scheduler.go`
- `parser.go`
- `finalizer.go`

The parser recognizes these line types:

- `FUNCTION ...`
- `FUNCTION_PAYLOAD ...` + payload body + `FUNCTION_PAYLOAD_END`
- `FUNCTION_CANCEL <transaction_id>`
- `FUNCTION_PROGRESS ...` (recognized/no-op event for manager)
- `QUIT`

Payload-mode control behavior:
FUNCTION_CANCEL <same payload uid>:
FUNCTION_CANCEL <different uid>:
FUNCTION_PROGRESS ...:
QUIT:
FUNCTION* control line:
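The line types the parser recognizes can be told apart by their first token. The sketch below illustrates that idea only; it is not the code in `parser.go`. The names `lineKind` and `classifyLine` are hypothetical, and payload mode (where body lines are collected until `FUNCTION_PAYLOAD_END`) is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// lineKind is a hypothetical label for a parsed input line.
type lineKind string

const (
	kindCall         lineKind = "call"
	kindPayloadStart lineKind = "payload_start"
	kindPayloadEnd   lineKind = "payload_end"
	kindCancel       lineKind = "cancel"
	kindProgress     lineKind = "progress"
	kindQuit         lineKind = "quit"
	kindUnknown      lineKind = "unknown"
)

// classifyLine inspects only the first whitespace-separated token, so
// "FUNCTION_PAYLOAD ..." is never mistaken for "FUNCTION ...".
func classifyLine(line string) lineKind {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return kindUnknown
	}
	switch fields[0] {
	case "FUNCTION":
		return kindCall
	case "FUNCTION_PAYLOAD":
		return kindPayloadStart
	case "FUNCTION_PAYLOAD_END":
		return kindPayloadEnd
	case "FUNCTION_CANCEL":
		return kindCancel
	case "FUNCTION_PROGRESS":
		return kindProgress
	case "QUIT":
		return kindQuit
	default:
		return kindUnknown
	}
}

func main() {
	for _, l := range []string{"FUNCTION ...", "FUNCTION_CANCEL tx-1", "QUIT", "hello"} {
		fmt.Printf("%-22q -> %s\n", l, classifyLine(l))
	}
}
```

Matching on the whole first token rather than a string prefix keeps the `FUNCTION*` family unambiguous.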
The dispatcher loop:
Admission checks:
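- manager stopping: `503`
- unregistered function or nil handler: `501`
- queue full: the dispatcher blocks in `scheduler.enqueue` until space frees (back-pressures stdin reader -> netdata via OS pipe). The only errors returned from this path are manager-stopping (`503`) on shutdown and invalid-request (`500`) for malformed input.

Accepted requests are scheduled as follows:

- worker count (`defaultWorkerCount = 1`)
- queue size (`defaultQueueSize = 64`)
- route key: `fn.Name` or `fn.Name|<matched-prefix>`
- state progression: `queued -> running -> awaiting_result`
- handler panic: `500`

Active requests are tracked by UID: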
- `invState` map (active entries)
- tombstones (`defaultTombstoneTTL = 60s`) to block immediate UID reuse

All terminal outputs go through:
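- the manager finalizer (`m.finalizeTerminal`)

`tryFinalize` guarantees:

- the first terminal result for a UID wins and is emitted as `FUNCRESULT`
- a tombstone is set for the UID
- late duplicate terminal outputs are dropped with a debug log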
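A first-wins finalizer with tombstones can be sketched as follows. This is a minimal illustration under assumptions, not the manager's implementation: the `finalizer` and `fakeClock` types and the `admit` method are hypothetical, and only the 60s TTL value and the `tryFinalize` name come from the text above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tombstoneTTL mirrors defaultTombstoneTTL = 60s from the text.
const tombstoneTTL = 60 * time.Second

// fakeClock is a hypothetical test clock so the example needs no sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time          { return c.t }
func (c *fakeClock) advance(d time.Duration) { c.t = c.t.Add(d) }

// finalizer is a hypothetical stand-in for the terminal-output path.
type finalizer struct {
	mu         sync.Mutex
	now        func() time.Time
	active     map[string]struct{}  // UIDs without a terminal result yet
	tombstones map[string]time.Time // finished UID -> tombstone expiry
}

func newFinalizer(now func() time.Time) *finalizer {
	return &finalizer{
		now:        now,
		active:     map[string]struct{}{},
		tombstones: map[string]time.Time{},
	}
}

// admit registers uid and reports false when it is active or tombstoned.
func (f *finalizer) admit(uid string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, ok := f.active[uid]; ok {
		return false
	}
	if exp, ok := f.tombstones[uid]; ok {
		if f.now().Before(exp) {
			return false
		}
		delete(f.tombstones, uid) // tombstone expired, UID may be reused
	}
	f.active[uid] = struct{}{}
	return true
}

// tryFinalize reports true only for the first caller per active UID and
// leaves a tombstone behind; later callers get false and should drop output.
func (f *finalizer) tryFinalize(uid string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, ok := f.active[uid]; !ok {
		return false
	}
	delete(f.active, uid)
	f.tombstones[uid] = f.now().Add(tombstoneTTL)
	return true
}

func main() {
	clock := &fakeClock{t: time.Unix(0, 0)}
	f := newFinalizer(clock.now)
	fmt.Println("admit:", f.admit("uid-1"))
	fmt.Println("first finalize:", f.tryFinalize("uid-1"))
	fmt.Println("late finalize:", f.tryFinalize("uid-1"))
	fmt.Println("reuse within TTL:", f.admit("uid-1"))
	clock.advance(tombstoneTTL)
	fmt.Println("reuse after TTL:", f.admit("uid-1"))
}
```

Holding one mutex across the check and the state change is what makes "first wins" hold when a handler response and a cancel fallback race for the same UID.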
Awaiting-result observability:
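- after the handler returns without a terminal result, the invocation stays in `awaiting_result`
- a warning timer starts (`defaultAwaitingWarnDelay = 30s`, capped by function timeout if lower)
- when the timer fires, the manager logs that the invocation is still in `awaiting_result` (diagnostic only, no forced finalize)

The functions manager owns an internal runtime store (`metrix.NewRuntimeStore()`) and
can register it as a runtime component when the runtime service is injected via
`SetRuntimeService(...)`.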
Registered component metadata:
- `functions.manager`
- `functions`
- `manager`

Pathology-focused metrics currently exposed:
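- `netdata.go.plugin.framework.functions.manager.invocations_active`
- `netdata.go.plugin.framework.functions.manager.invocations_awaiting_result`
- `netdata.go.plugin.framework.functions.manager.scheduler_pending`
- `netdata.go.plugin.framework.functions.manager.cancel_fallback_total`
- `netdata.go.plugin.framework.functions.manager.late_terminal_dropped_total`
- `netdata.go.plugin.framework.functions.manager.duplicate_uid_ignored_total`

Cancel handling by target state:

- queued: finalized immediately with `499`
- running or `awaiting_result`: the invocation context is canceled and a fallback timer starts (`defaultCancelFallbackDelay = 5s`)
- fallback timer fires while the invocation is still unresolved: finalized with `499`

Important limitation: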
- handlers have the signature `func(Function)` (no `context.Context` parameter)
- the fallback `499` is the deterministic safety net

Shutdown uses one bounded path for `ctx.Done()`, `QUIT`, and input close (EOF):
- wait up to `defaultShutdownDrainTimeout = 8s` for natural drain
- finalize anything still unresolved with `499`

Input close still enters the same bounded path above:
```mermaid
flowchart TD
    A["Input line"] --> B["Parser.parseEvent()"]
    B -->|call| C["dispatchInvocation()"]
    B -->|cancel| D["handleCancelEvent()"]
    B -->|progress| E["No-op"]
    B -->|quit| F["Shutdown(canceling)"]
    B -->|parse error| G["Warn + continue"]
    C --> C1{"Admission checks"}
    C1 -->|stopping| R503["respf 503"]
    C1 -->|unregistered/nil handler| R501["respf 501"]
    C1 -->|duplicate/tombstoned UID| DUP["ignore duplicate + log"]
    C1 -->|accepted| Q["scheduler.enqueue by route key + state=queued (blocks if queue full)"]
    Q --> S["keyScheduler"]
    S -->|same key busy| SQ["lane queue (serialized)"]
    S -->|key free| W["Worker"]
    W --> W1{"Start allowed?"}
    W1 -->|ctx canceled / cancelRequested| X["skip"]
    W1 -->|yes| W2["state=running; run handler"]
    W2 -->|panic| R500["respf 500"]
    W2 -->|return| AWAIT["state=awaiting_result"]
    D --> D1{"Cancel target state"}
    D1 -->|pre-admission payload UID| C499["respf 499"]
    D1 -->|queued| C499
    D1 -->|running/awaiting| TMR["cancel() + fallback timer"]
    D1 -->|unknown/done| NOP["debug no-op"]
    TMR -->|timer fires & still unresolved| C499
    R501 --> FIN["manager finalizer"]
    R503 --> FIN
    R500 --> FIN
    C499 --> FIN
    HRESP["Handler/dyncfg responder terminal output"] --> FIN
    FIN --> TF["tryFinalize(): first wins, tombstone set, emit FUNCRESULT"]
    TF --> SREL["scheduler.complete(key, uid)"]
    SREL -->|promote next same-key request| W
    TF --> OUT["stdout FUNCRESULT"]
    HRESP -->|late duplicate| DROP["drop + debug log"]
    A -->|ctx.Done| F
    A -->|input close| F2["Shutdown(bounded canceling path)"]
```
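The per-key serialization shown in the flow (same key busy goes to a lane queue, and `scheduler.complete(key, uid)` promotes the next same-key request) can be sketched as follows. This is a simplified single-goroutine illustration under assumptions, not the code in `scheduler.go`: the `job` type, the `ready` slice, and the `next` method are hypothetical, and the bounded queue size and blocking enqueue are omitted.

```go
package main

import "fmt"

// job is a hypothetical queued request identified by route key and UID.
type job struct{ key, uid string }

// keyScheduler allows at most one in-flight job per route key.
type keyScheduler struct {
	running map[string]string // key -> UID currently holding the key
	lanes   map[string][]job  // key -> FIFO of jobs waiting for the key
	ready   []job             // jobs cleared for a worker to pick up
}

func newKeyScheduler() *keyScheduler {
	return &keyScheduler{running: map[string]string{}, lanes: map[string][]job{}}
}

// enqueue makes the job ready if its key is free, else parks it in the lane.
func (s *keyScheduler) enqueue(j job) {
	if _, busy := s.running[j.key]; busy {
		s.lanes[j.key] = append(s.lanes[j.key], j)
		return
	}
	s.running[j.key] = j.uid
	s.ready = append(s.ready, j)
}

// next pops the oldest ready job, if any.
func (s *keyScheduler) next() (job, bool) {
	if len(s.ready) == 0 {
		return job{}, false
	}
	j := s.ready[0]
	s.ready = s.ready[1:]
	return j, true
}

// complete releases the key when uid is its current holder and promotes
// the next waiting job for that key; mismatched UIDs are ignored.
func (s *keyScheduler) complete(key, uid string) {
	holder, ok := s.running[key]
	if !ok || holder != uid {
		return
	}
	delete(s.running, key)
	lane := s.lanes[key]
	if len(lane) == 0 {
		delete(s.lanes, key)
		return
	}
	nextJob := lane[0]
	s.lanes[key] = lane[1:]
	s.running[key] = nextJob.uid
	s.ready = append(s.ready, nextJob)
}

func main() {
	s := newKeyScheduler()
	s.enqueue(job{key: "A", uid: "a1"})
	s.enqueue(job{key: "A", uid: "a2"}) // waits behind a1
	s.enqueue(job{key: "B", uid: "b1"}) // different key, ready at once
	for j, ok := s.next(); ok; j, ok = s.next() {
		fmt.Println("ready:", j.uid)
	}
	s.complete("A", "a1")
	j, _ := s.next()
	fmt.Println("promoted:", j.uid)
}
```

Releasing the key only from `complete`, which the flow ties to finalization rather than to handler return, keeps a request that is still awaiting its result from being overtaken by the next request on the same key.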