Orchestrion (AST Rewriter)

.agents/skills/apm-integrations/references/orchestrion.md

Orchestrion is the required default for new instrumentations. It automatically wraps methods via JSON configuration with correct CJS/ESM handling built in. Orchestrion handles ESM code far more reliably than shimmer-based wrapping because it operates at the AST level rather than trying to monkey-patch module exports.

Required Files

Orchestrion integrations need three files:

packages/datadog-instrumentations/src/
├── <name>.js                           # Hooks file (registers module hooks)
└── helpers/
    ├── hooks.js                        # Entry pointing to <name>.js
    └── rewriter/
        ├── index.js                    # Main rewriter logic
        └── instrumentations/
            ├── langchain.js            # Reference: LangChain config
            └── <name>.js              # JSON config

Hooks file (packages/datadog-instrumentations/src/<name>.js):

javascript
'use strict'

const { addHook, getHooks } = require('./helpers/instrument')

for (const hook of getHooks('<npm-package>')) {
  addHook(hook, exports => exports)
}

getHooks reads the orchestrion JSON config and generates addHook entries so the module hooks are registered for the rewriter to process. Without this file, the rewriter will not be triggered.

hooks.js entry (packages/datadog-instrumentations/src/helpers/hooks.js):

javascript
'<name>': () => require('../<name>'),

Config Schema

Each entry in the instrumentations array:

javascript
{
  module: {
    name: string,            // npm package name (e.g. "bullmq", "@langchain/core")
    versionRange: string,    // semver range (e.g. ">=1.0.0")
    filePath: string,        // path within package (e.g. "dist/cjs/classes/queue.js")
  },

  // Option A: functionQuery (recommended)
  functionQuery: {
    kind: 'Async' | 'AsyncIterator' | 'Callback' | 'Sync',  // transform type (see below)
    methodName?: string,     // class method or property method name (or use functionName / expressionName)
    className?: string,      // scope to a specific class
    functionName?: string,   // target a FunctionDeclaration (alternative to methodName)
    expressionName?: string, // target a FunctionExpression/ArrowFunctionExpression
    index?: number,          // Callback only: argument index of the callback (-1 = last)
  },

  // Option B: astQuery (advanced, for edge cases)
  astQuery?: string,         // raw ESQuery selector string — bypasses functionQuery entirely

  channelName: string,       // used in the diagnostic channel name
}

functionQuery Targeting

| Field | Targets |
| --- | --- |
| methodName + className | A method on a specific class |
| methodName alone | Any class method or object property method with that name |
| functionName | A FunctionDeclaration by name |
| expressionName | A FunctionExpression or ArrowFunctionExpression by name |

astQuery (ESQuery Selectors)

For advanced cases where functionQuery fields are insufficient, use astQuery with a raw ESQuery selector string. This is parsed via esquery.parse() and matched against the AST directly. Internally, functionQuery is converted to ESQuery selectors — astQuery lets you write them directly.
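To give a feel for the selector syntax, here is a minimal sketch of an ESQuery selector for a class method. `classMethodSelector` is a hypothetical helper and the selector shape is an assumption for illustration; it is not the rewriter's actual functionQuery conversion, so check the rewriter source before relying on a specific selector.

```javascript
'use strict'

// Hypothetical helper: builds an ESQuery selector string for a class method.
// This is NOT the rewriter's real conversion, only an illustration of the
// kind of string an astQuery field holds.
function classMethodSelector (className, methodName) {
  return [
    `ClassDeclaration[id.name="${className}"]`, // class Client { ... }
    'ClassBody',
    `MethodDefinition[key.name="${methodName}"]`, // query (...) { ... }
    'FunctionExpression' // the method's function node
  ].join(' > ')
}

const selector = classMethodSelector('Client', 'query')
console.log(selector)
```

A class assigned as an expression (`const Client = class { ... }`) is a ClassExpression node rather than a ClassDeclaration, which is one of the edge cases where a hand-written selector may be needed.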

Basic Example

javascript
// instrumentations/<name>.js
module.exports = [
  {
    module: {
      name: '<npm-package>',
      versionRange: '>=1.0.0',
      filePath: 'dist/client.js'
    },
    functionQuery: {
      methodName: 'query',
      className: 'Client',
      kind: 'Async'
    },
    channelName: 'Client_query'
  }
]

Multiple methods can be wrapped by adding more entries to the array.

Channel Name Formation

Orchestrion channels follow this pattern:

tracing:orchestrion:{module.name}:{channelName}:{event}

Example with module.name: "@langchain/core" and channelName: "RunnableSequence_invoke":

  • tracing:orchestrion:@langchain/core:RunnableSequence_invoke:start
  • tracing:orchestrion:@langchain/core:RunnableSequence_invoke:asyncStart
  • tracing:orchestrion:@langchain/core:RunnableSequence_invoke:asyncEnd
  • tracing:orchestrion:@langchain/core:RunnableSequence_invoke:end
  • tracing:orchestrion:@langchain/core:RunnableSequence_invoke:error
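The pattern above can be sketched as a small helper that derives the prefix and the five event channel names from a config entry. `orchestrionChannels` is a hypothetical name used only for this illustration.

```javascript
'use strict'

const EVENTS = ['start', 'end', 'asyncStart', 'asyncEnd', 'error']

// Hypothetical helper: derives the channel prefix and the per-event channel
// names from module.name and channelName.
function orchestrionChannels (moduleName, channelName) {
  const prefix = `tracing:orchestrion:${moduleName}:${channelName}`
  return { prefix, names: EVENTS.map(event => `${prefix}:${event}`) }
}

const channels = orchestrionChannels('@langchain/core', 'RunnableSequence_invoke')
console.log(channels.prefix)
console.log(channels.names.join('\n'))
```

The `prefix` value is the string a plugin would place in its `static prefix`.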

Function Kinds and Transforms

Orchestrion supports four transform types, selected by the kind field:

| Kind | Transform | Behavior |
| --- | --- | --- |
| Async | tracePromise | Wraps in async arrow, calls channel.tracePromise(); handles promise resolution/rejection |
| AsyncIterator | traceAsyncIterator | Wraps async generators/iterators; creates TWO channels: base and _next (see async-iterator-pattern.md) |
| Callback | traceCallback | Intercepts callback at arguments[index] (default: last arg, i.e. -1), wraps it to publish asyncStart/asyncEnd/error events |
| Sync | traceSync | Wraps in non-async arrow, calls channel.traceSync(); handles synchronous return/throw. Note: Sync is the default when kind is omitted or unrecognized. |

All transforms dispatch to traceFunction (for standalone functions) or traceInstanceMethod (for class methods, including inherited ones via constructor patching).

For Callback kind, use the index field to specify which argument is the callback (defaults to -1, meaning the last argument).
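The index semantics can be illustrated with a simplified stand-in for the Callback transform. `resolveIndex` and `wrapCallback` are hypothetical helpers, and events are pushed to an array instead of being published on channels; the real transform is generated by the rewriter.

```javascript
'use strict'

// Resolve a possibly negative index (-1 = last argument) to a real position.
function resolveIndex (args, index = -1) {
  return index < 0 ? args.length + index : index
}

// Hypothetical stand-in for the Callback transform: replaces the callback at
// `index` with a wrapper that records events around the original callback.
function wrapCallback (args, index, events) {
  const position = resolveIndex(args, index)
  const original = args[position]
  if (typeof original !== 'function') return args // nothing to intercept

  const wrapped = [...args]
  wrapped[position] = function (err) {
    if (err) events.push('error')
    events.push('asyncStart')
    try {
      return original.apply(this, arguments)
    } finally {
      events.push('asyncEnd')
    }
  }
  return wrapped
}

// Usage: a callback-style function whose callback is the last argument.
function readValue (key, callback) {
  callback(null, key.toUpperCase())
}

const events = []
let received
const args = wrapCallback(['abc', (err, value) => { received = value }], -1, events)
readValue(...args)
console.log(events, received)
```

If the callback is not the last argument (for example `fn(callback, options)`), the entry would set `index: 0` instead of relying on the default.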

AsyncIterator Pattern (Two Plugins Required)

⚠️ CRITICAL: AsyncIterator is a special transform that requires TWO plugins and has specific implementation requirements.

When to use:

  • Method returns Promise<AsyncIterable>, Promise<AsyncIterableIterator>, or Promise<IterableReadableStream>
  • Async generator functions: async *methodName()

How it works:

  • Orchestrion creates TWO channels: base channel and {channelName}_next channel
  • Main plugin: Creates span when method is called
  • Next plugin: Finishes span when result.done === true (after all iterations complete)

📖 REQUIRED READING: If you are implementing an AsyncIterator integration, you MUST read the complete guide:

👉 AsyncIterator Pattern Reference (async-iterator-pattern.md) 👈

This pattern is complex and easy to get wrong. The reference document covers:

  • Two-channel pattern details
  • Complete plugin implementation examples
  • Common mistakes and how to avoid them
  • Testing strategies
  • Full working example (LangGraph)

DO NOT attempt to implement AsyncIterator without reading the full reference.

Finding the Right filePath

  1. Install the package: npm install <package>
  2. Search for the method definition:
bash
grep -r "methodName" node_modules/<package>/
  3. Use the path relative to the package root

IMPORTANT: Patch both CJS and ESM code paths. Many libraries duplicate their classes across separate CJS and ESM builds (e.g., dist/cjs/client.js and dist/esm/client.js). Each file path needs its own entry in the instrumentations array with the same functionQuery and channelName. If only one is patched, the instrumentation will silently fail for the other module format.
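One way to keep the CJS and ESM entries in sync is to generate them from a single base definition. This is a sketch only: `forEachBuild` is a hypothetical helper, not something the rewriter provides, and the file paths are placeholders.

```javascript
'use strict'

// Hypothetical helper: expands one logical instrumentation into one entry per
// build output, so the CJS and ESM copies of a class are both rewritten.
function forEachBuild (base, filePaths) {
  return filePaths.map(filePath => ({
    ...base,
    module: { ...base.module, filePath }
  }))
}

const entries = forEachBuild({
  module: { name: '<npm-package>', versionRange: '>=1.0.0' },
  functionQuery: { methodName: 'query', className: 'Client', kind: 'Async' },
  channelName: 'Client_query'
}, ['dist/cjs/client.js', 'dist/esm/client.js'])

module.exports = entries
```

Both entries share the same functionQuery and channelName, so a single plugin prefix covers either module format.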

Common locations:

  • dist/cjs/index.js / dist/esm/index.js — separate CJS/ESM builds
  • dist/index.js — single compiled output
  • lib/client.js — source files
  • src/index.mjs — ESM source

Plugin Subscription for Orchestrion

Set static prefix to match the orchestrion channel base. The TracingPlugin base class automatically subscribes to all events and routes them to bindStart, bindFinish, etc.

javascript
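// Path assumes the plugin lives in packages/datadog-plugin-<name>/src/
const TracingPlugin = require('../../dd-trace/src/plugins/tracing')

class MyPlugin extends TracingPlugin {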
  static id = '<name>'
  static prefix = 'tracing:orchestrion:<npm-package>:Client_query'

  bindStart (ctx) {
    const query = ctx.arguments?.[0]
    const instance = ctx.self

    this.startSpan(this.operationName(), {
      resource: query,
      meta: { component: '<name>' }
    }, ctx)

    return ctx.currentStore
  }
}

For integrations wrapping multiple methods, create a separate plugin class per method (each with its own static prefix), then combine them in a CompositePlugin. See langchain for this pattern.

The ctx Object in Orchestrion

  • ctx.arguments — the original method arguments (array)
  • ctx.self — the this context of the wrapped method (instance)
  • ctx.result — return value (on asyncEnd/end)
  • ctx.error — thrown error (on error)
  • ctx.currentStore — set by startSpan in bindStart
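To see how these fields are populated over the lifetime of a call, the following sketch uses Node's own `diagnostics_channel.tracingChannel` with a hand-written stand-in for a rewritten method. The channel name and the `Client` class are made up for the example, and it assumes a Node version that ships `tracingChannel`. The real wrapper is generated by the rewriter, and `ctx.currentStore` is omitted because it is set by the plugin's `startSpan`.

```javascript
'use strict'

const { tracingChannel } = require('node:diagnostics_channel')

// 'orchestrion:demo:Client_query' is a made-up name; tracingChannel() adds the
// 'tracing:' prefix and the ':start', ':end', ... suffixes.
const channel = tracingChannel('orchestrion:demo:Client_query')
const seen = []

channel.subscribe({
  start (ctx) { seen.push(['start', ctx.arguments[0], ctx.self.name]) },
  end (ctx) { seen.push(['end', ctx.result]) },
  error (ctx) { seen.push(['error', ctx.error.message]) }
})

class Client {
  constructor () { this.name = 'primary' }

  // Stand-in for a rewritten Sync method: the wrapper builds ctx from the call.
  query (sql) {
    const ctx = { arguments: [sql], self: this }
    return channel.traceSync(() => {
      if (!sql) throw new Error('empty query')
      return `rows for ${sql}`
    }, ctx)
  }
}

const client = new Client()
client.query('SELECT 1')
try {
  client.query('')
} catch (err) {
  // The wrapped function's own error still reaches the caller.
}
console.log(seen)
```

On the successful call `ctx.result` is available by the time `end` fires; on the failing call `error` fires before `end` and `ctx.result` stays undefined.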

Propagating Synchronous Errors From bindStart

Neither subscribers on the {prefix}:start channel nor bindStore transforms can propagate a synchronous throw to the caller of an orchestrion-wrapped function. Node's diagnostics_channel wraps both in try { ... } catch (err) { process.nextTick(() => triggerUncaughtException(err)); ... } (see wrapStoreRun and publish in lib/diagnostics_channel.js). The error therefore surfaces asynchronously as an uncaught exception, after the wrapped function has already run and the call has returned normally.

The wrapper's own catch block, however, does rethrow:

js
return ch.start.runStores(__apm$ctx, () => {
  try {
    const result = __apm$traced();
    __apm$ctx.result = result;
    return result;
  } catch (err) {
    __apm$ctx.error = err;
    ch.error.publish(__apm$ctx);
    throw err;                  // <- propagates the wrapped fn's error to the caller
  } finally {
    ch.end.publish(__apm$ctx);
  }
});

When a contract requires assert.throws(() => wrapped(...))-style synchronous propagation from a :start observer (the canonical case is AppSec WAF's abortController.abort() model), use the Proxy-on-arguments pattern:

js
bindStart (ctx) {
  // ... normal setup, span creation ...
  const abortController = new AbortController()
  if (startCh.hasSubscribers) {
    startCh.publish({ abortController, args })       // subscribers run sync
    if (abortController.signal.aborted) {
      // ctx.arguments is the SAME array reference the wrapper spreads into
      // the wrapped fn (__apm$wrapped.apply(this, ctx.arguments)). Replace
      // arguments[0] with a Proxy whose getters throw AbortError. The
      // wrapped fn's first property access (typically a destructure of args)
      // triggers the trap → the wrapper's catch+rethrow propagates AbortError
      // to the caller. Span lifecycle still completes via end.publish.
      ctx.arguments[0] = new Proxy({}, {
        get () { throw new AbortError('Aborted') },
        has () { throw new AbortError('Aborted') },
      })
      ctx.ddAborted = true
      return ctx.currentStore
    }
  }
  // ... rest of bindStart ...
}

error (ctx) {
  if (ctx.ddAborted) return                          // abort != error tag
  // ... regular error handling ...
}

Reference implementation: packages/datadog-plugin-graphql/src/execute.js (apm:graphql:execute:start contract).

Why this works:

  • ctx.arguments and the rewriter's __apm$arguments are the same array reference (confirmed by capturing the rewriter output: const __apm$traced = () => __apm$wrapped.apply(this, __apm$arguments)).
  • The wrapped fn's body almost always touches arguments[0] on the first statement (destructure, property read, validation). Any read triggers the Proxy trap.
  • The orchestrion wrapper's catch { ...; throw err } propagates the thrown error synchronously to the caller — confirmed in the rewriter template (vendor's code-transformer/index.js, wrapSync function).
  • :end still fires in finally, so the plugin's end(ctx) runs as usual and the span lifecycle completes cleanly. Combined with an error(ctx) that no-ops when ctx.ddAborted, the span finishes with error === 0 (matches the abort contract).
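The mechanism can be reproduced outside the tracer with a small self-contained sketch. Everything here is a stand-in: `callWrapped` imitates the wrapper's try/catch/finally, `execute` is an arbitrary function that destructures its first argument, and the local `AbortError` class is an assumption made for the demo.

```javascript
'use strict'

// Stand-in for the AbortError used by the plugin (assumption for this demo).
class AbortError extends Error {
  constructor (message) {
    super(message)
    this.name = 'AbortError'
  }
}

const lifecycle = []

// Simplified imitation of the wrapper's try/catch/finally.
function callWrapped (wrapped, ctx) {
  try {
    ctx.result = wrapped.apply(ctx.self, ctx.arguments)
    return ctx.result
  } catch (err) {
    ctx.error = err
    lifecycle.push('error')
    throw err // rethrown, so the caller sees the error synchronously
  } finally {
    lifecycle.push('end')
  }
}

// A wrapped function that destructures its first argument immediately.
function execute ({ schema }) {
  return `executed ${schema}`
}

const ctx = { self: undefined, arguments: [{ schema: 'demo' }] }

// What bindStart does on abort: swap arguments[0] in place with a Proxy whose
// traps throw on any property read.
ctx.arguments[0] = new Proxy({}, {
  get () { throw new AbortError('Aborted') },
  has () { throw new AbortError('Aborted') }
})

let caught
try {
  callWrapped(execute, ctx)
} catch (err) {
  caught = err
}
console.log(caught.name, lifecycle)
```

The destructuring in `execute` triggers the `get` trap before any of the function body runs, and the `end` step still executes in `finally`, which is what lets the span lifecycle complete.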

When NOT to use this:

  • If you control the wrapped fn (e.g., it's a method you can wrap with shimmer on top of orchestrion), do that — clearer and avoids the Proxy indirection.
  • If the wrapped fn might catch its own errors before they escape (some user-resolver patterns do this), the AbortError won't propagate. Verify the wrapped fn does NOT have a top-level try/catch around the property access that triggers the trap.

Common Issues

Wrong filePath

Symptom: No channel events published

Fix: Verify the method is actually defined in that file (not re-exported from elsewhere)

Case Mismatch

Symptom: Method not found

Fix: Match exact class/method name casing

Multiple Build Outputs

Symptom: Works in one context, not another

Fix: Check if the package has separate CJS/ESM builds with different file paths; each needs its own entry in the instrumentations array

Reference Implementations

LangChain (canonical, multi-method):

  • Config: packages/datadog-instrumentations/src/helpers/rewriter/instrumentations/langchain.js
  • Hooks file: packages/datadog-instrumentations/src/langchain.js
  • Plugin: packages/datadog-plugin-langchain/src/tracing.js

BullMQ (simpler, single-package):

  • Config: packages/datadog-instrumentations/src/helpers/rewriter/instrumentations/bullmq.json
  • Hooks file: packages/datadog-instrumentations/src/bullmq.js