elixir/docs/token_accounting.md
This document explains how Codex reports token usage through the app-server protocol and how Symphony should account for it.
It is based on the current Codex source in codex-rs, especially:
- app-server/README.md
- protocol/src/protocol.rs
- app-server/src/bespoke_event_handling.rs
- app-server-protocol/src/protocol/v2.rs
- exec/src/event_processor_with_jsonl_output.rs
- state/src/extract.rs

In short:
- last_token_usage means "the latest increment".
- total_token_usage means "the cumulative total so far".
- thread/tokenUsage/updated is the live streaming notification for token usage.
- turn/completed carries final turn state, and turn-level usage is exposed separately from the live thread token stream.
- usage fields are event-specific. Do not assume every usage payload is a cumulative thread total.

Codex defines TokenUsageInfo like this:
pub struct TokenUsageInfo {
    pub total_token_usage: TokenUsage,
    pub last_token_usage: TokenUsage,
    pub model_context_window: Option<i64>,
}
The important behavior is in append_last_usage:
pub fn append_last_usage(&mut self, last: &TokenUsage) {
    self.total_token_usage.add_assign(last);
    self.last_token_usage = last.clone();
}
That gives the core semantics:
- last_token_usage: the newest chunk of usage that was just added
- total_token_usage: the accumulated total after adding that chunk

This is the most important accounting rule in the Codex source.
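The accumulation rule can be exercised in isolation. The following is a minimal sketch, not the Codex implementation: TokenUsage here is a simplified stand-in with only three fields, and the field-wise add_assign is an assumption about how the real type accumulates.

```rust
// Simplified stand-in for the Codex TokenUsage type (assumption: the real
// struct carries more fields; three are enough to show the semantics).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct TokenUsage {
    pub input_tokens: i64,
    pub output_tokens: i64,
    pub total_tokens: i64,
}

impl TokenUsage {
    // Assumed behaviour: field-wise accumulation.
    pub fn add_assign(&mut self, other: &TokenUsage) {
        self.input_tokens += other.input_tokens;
        self.output_tokens += other.output_tokens;
        self.total_tokens += other.total_tokens;
    }
}

#[derive(Clone, Debug, Default)]
pub struct TokenUsageInfo {
    pub total_token_usage: TokenUsage,
    pub last_token_usage: TokenUsage,
    pub model_context_window: Option<i64>,
}

impl TokenUsageInfo {
    // Same shape as the excerpt above: add to the total, replace the last.
    pub fn append_last_usage(&mut self, last: &TokenUsage) {
        self.total_token_usage.add_assign(last);
        self.last_token_usage = last.clone();
    }
}

fn main() {
    let mut info = TokenUsageInfo::default();
    info.append_last_usage(&TokenUsage { input_tokens: 10, output_tokens: 5, total_tokens: 15 });
    info.append_last_usage(&TokenUsage { input_tokens: 20, output_tokens: 7, total_tokens: 27 });

    // The total accumulates across calls; last only reflects the newest chunk.
    println!("total = {}", info.total_token_usage.total_tokens); // prints "total = 42"
    println!("last  = {}", info.last_token_usage.total_tokens); // prints "last  = 27"
}
```

Summing every last value reproduces the total only if no update is missed or duplicated, which is why the total is the safer field to consume.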
codex/event/token_count
Codex core emits token count events containing TokenUsageInfo.
These events can carry:
- info.total_token_usage
- info.last_token_usage
- info.model_context_window

Symphony sees these events wrapped inside the app-server message stream.
Meaning:
- total_token_usage is an absolute cumulative snapshot
- last_token_usage is the delta that produced that snapshot

thread/tokenUsage/updated
The app-server converts token count events into a dedicated thread-scoped notification:
let notification = ThreadTokenUsageUpdatedNotification {
    thread_id: conversation_id.to_string(),
    turn_id,
    token_usage,
};
ThreadTokenUsage is defined as:
pub struct ThreadTokenUsage {
    pub total: TokenUsageBreakdown,
    pub last: TokenUsageBreakdown,
    pub model_context_window: Option<i64>,
}
And it is populated directly from TokenUsageInfo:
impl From<CoreTokenUsageInfo> for ThreadTokenUsage {
    fn from(value: CoreTokenUsageInfo) -> Self {
        Self {
            total: value.total_token_usage.into(),
            last: value.last_token_usage.into(),
            model_context_window: value.model_context_window,
        }
    }
}
Meaning:
- thread/tokenUsage/updated is the canonical live notification for token usage
- tokenUsage.total is an absolute thread total
- tokenUsage.last is the latest increment that produced that total

The app-server README is explicit: token usage streams separately via thread/tokenUsage/updated.

turn/completed
The app-server README says turn/completed carries final turn state and token usage.
There are two important details:
- The turn/completed notification contains a final turn object.
- The exec event processor also emits a turn-completed event that includes a usage struct.

In the exec event processor, the turn-completed usage is built from the most recent captured total_token_usage:
if let Some(info) = &ev.info {
    self.last_total_token_usage = Some(info.total_token_usage.clone());
}
Then on turn completion:
let usage = if let Some(u) = &self.last_total_token_usage {
    Usage {
        input_tokens: u.input_tokens,
        cached_input_tokens: u.cached_input_tokens,
        output_tokens: u.output_tokens,
    }
}
Important consequence:
- That usage payload is not the same schema as ThreadTokenUsage.
- Keep it separate from thread/tokenUsage/updated accounting.

usage
Codex uses the word usage in multiple places.
That does not mean all usage maps have the same semantics.
Examples:
- thread/tokenUsage/updated.tokenUsage.total: absolute cumulative thread total
- thread/tokenUsage/updated.tokenUsage.last: latest delta
- usage: event-specific completion usage payload

Rule:
- Do not classify a usage map by name alone.

These are safe high-water-mark style counters:
- info.total_token_usage
- tokenUsage.total on thread/tokenUsage/updated

Use these when you want:
These are incremental additions:
- info.last_token_usage
- tokenUsage.last on thread/tokenUsage/updated

Use these only when:
model_context_window is not spend. It is the model's context limit.
Codex also has logic that can "fill to context window", which sets:
- total_token_usage.total_tokens = context_window
- last_token_usage.total_tokens = delta

So total_tokens can reflect context-window normalization behavior, not just a raw upstream token report.
For Symphony, model_context_window should be displayed or logged separately from spend.
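To make the normalization concrete, here is a hedged sketch of what a "fill to context window" step could look like. The method name fill_to_context_window, the single-field TokenUsage, and the choice of delta as the gap between the previous total and the window (clamped at zero) are all assumptions made for illustration, not a copy of the Codex logic.

```rust
// Simplified stand-in types (assumption: only total_tokens matters here).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct TokenUsage {
    pub total_tokens: i64,
}

#[derive(Debug, Default)]
pub struct TokenUsageInfo {
    pub total_token_usage: TokenUsage,
    pub last_token_usage: TokenUsage,
    pub model_context_window: Option<i64>,
}

impl TokenUsageInfo {
    // Hypothetical helper: pins the total to the context window and records
    // the jump as the last delta. Assumes delta = window - previous total.
    pub fn fill_to_context_window(&mut self, context_window: i64) {
        let previous_total = self.total_token_usage.total_tokens;
        let delta = (context_window - previous_total).max(0);
        self.model_context_window = Some(context_window);
        self.total_token_usage.total_tokens = context_window;
        self.last_token_usage.total_tokens = delta;
    }
}

fn main() {
    let mut info = TokenUsageInfo::default();
    info.total_token_usage.total_tokens = 90_000;
    info.fill_to_context_window(128_000);

    // The total now equals the window, which is a limit, not measured spend.
    println!("total  = {}", info.total_token_usage.total_tokens); // prints "total  = 128000"
    println!("last   = {}", info.last_token_usage.total_tokens); // prints "last   = 38000"
    println!("window = {:?}", info.model_context_window); // prints "window = Some(128000)"
}
```

Under these assumptions the reported total jumps to the window size even though no upstream report said so, which is the reason to log the window separately from spend.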
Track usage per active Codex thread.
For each thread, keep:
- absolute_total: latest accepted absolute total snapshot
- accumulated_total: the total you expose in UI/API
- last_seen_turn_id

When a token-related event arrives, use this precedence:
1. thread/tokenUsage/updated.tokenUsage.total
2. TokenCountEvent.info.total_token_usage

Ignore these for accounting:
- thread/tokenUsage/updated.tokenUsage.last
- TokenCountEvent.info.last_token_usage
- usage maps

usage
Do not treat generic params.usage as equivalent to a cumulative thread total unless the event type makes that meaning explicit.

- If an event provides an absolute_total, replace the stored absolute total.

If you misclassify a per-turn usage payload as an absolute thread total, later turns can appear to stall because a smaller per-turn number is compared against a larger cumulative baseline.
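The per-thread strategy can be sketched as a small tracker. Everything named here (UsageSignal, ThreadUsage, UsageTracker, apply) is hypothetical, and the update rule, which adds only the growth over the stored absolute total and keeps that total as a high-water mark, is one reasonable reading of the rules above rather than Symphony's actual implementation.

```rust
use std::collections::HashMap;

// Hypothetical classification of an incoming token signal by its source.
#[derive(Debug, Clone, Copy)]
pub enum UsageSignal {
    ThreadTotal(u64),     // thread/tokenUsage/updated.tokenUsage.total
    TokenCountTotal(u64), // TokenCountEvent.info.total_token_usage
    LastDelta(u64),       // tokenUsage.last or info.last_token_usage
    GenericUsage(u64),    // event-specific usage map, meaning not proven
}

#[derive(Debug, Default)]
pub struct ThreadUsage {
    pub absolute_total: u64,
    pub accumulated_total: u64,
    pub last_seen_turn_id: Option<String>,
}

#[derive(Debug, Default)]
pub struct UsageTracker {
    threads: HashMap<String, ThreadUsage>,
}

impl UsageTracker {
    // Applies one signal and returns the total exposed for that thread.
    pub fn apply(&mut self, thread_id: &str, turn_id: Option<&str>, signal: UsageSignal) -> u64 {
        let state = self.threads.entry(thread_id.to_string()).or_default();
        if let Some(id) = turn_id {
            state.last_seen_turn_id = Some(id.to_string());
        }
        match signal {
            // Absolute totals: count only growth over the stored snapshot.
            UsageSignal::ThreadTotal(total) | UsageSignal::TokenCountTotal(total) => {
                let delta = total.saturating_sub(state.absolute_total);
                state.accumulated_total += delta;
                state.absolute_total = state.absolute_total.max(total);
            }
            // Deltas and generic usage maps are ignored for accounting.
            UsageSignal::LastDelta(_) | UsageSignal::GenericUsage(_) => {}
        }
        state.accumulated_total
    }
}

fn main() {
    let mut tracker = UsageTracker::default();
    tracker.apply("thread-1", Some("turn-1"), UsageSignal::ThreadTotal(100));
    tracker.apply("thread-1", Some("turn-1"), UsageSignal::LastDelta(100));
    tracker.apply("thread-1", Some("turn-2"), UsageSignal::TokenCountTotal(250));
    let shown = tracker.apply("thread-1", Some("turn-2"), UsageSignal::GenericUsage(150));
    println!("thread-1 total = {shown}"); // prints "thread-1 total = 250"
}
```

Because both absolute sources feed the same high-water mark, receiving the same total from both streams does not double count. Had GenericUsage(150) been treated as an absolute total, it would have compared below the stored 250 and contributed nothing, which is the stall described above.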
- Prefer thread/tokenUsage/updated for live reporting.
- Treat tokenUsage.total as authoritative for thread totals.
- Track usage by thread_id, not just issue id.
- Do not treat every usage map as absolute.
- Do not add tokenUsage.last or last_token_usage into dashboard totals.
- Do not add usage on top of already-counted live thread totals unless you can prove it represents missing spend.

When reading raw app-server events:
- codex/event/token_count
  - info.total_token_usage
- thread/tokenUsage/updated
- turn/completed
total_token_usage Is The Durable Choice
Codex itself consistently prefers cumulative totals when it needs durable state:
- State extraction reads info.total_token_usage.total_tokens.
- The exec event processor stores the latest total_token_usage and uses that on turn completion.

That is a strong signal for Symphony:
If Symphony documents token reporting externally, the contract should be:
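- thread/tokenUsage/updated.tokenUsage.total is the primary source for thread totals.
- info.total_token_usage is the fallback.
- Do not use last for totals.
- Key usage by thread_id.
- Do not interpret usage by field name alone.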