This document describes the hierarchical Request-Id schema of the HTTP Correlation Protocol for telemetry correlation.
The main requirement for Request-Id is uniqueness: any two requests processed by the cluster must not collide. GUIDs or big random numbers help achieve this, but they require other identifiers to query all requests related to an operation.
A hierarchical Request-Id looks like |<root-id>.<local-id1>.<local-id2>. (e.g. |9e74f0e5-efc4-41b5-86d1-3524a43bd891.bcec871c_1.) and holds all the information needed to trace both the whole operation and a particular request.
The root-id serves as a common identifier for all requests involved in processing the operation, and local-ids represent internal activities (and requests) done within the scope of this operation.
An upstream service or client application may be instrumented with another tracing system, so an implementation MAY have a compatibility layer that parses another set of trace headers.
Therefore an implementation SHOULD be tolerant of other formats of trace identifiers and make a best effort to keep the root-id equivalent in the particular tracing system.
If Request-Id was not provided by the upstream service and the implementation decides to trace the request, it MUST generate a new Request-Id (see Root Request Id Generation) to represent the incoming request.
In a heterogeneous environment, implementations of this protocol with hierarchical Request-Id may interact with other services that do not implement this protocol but still have a notion of request Id. An implementation or logging system should be able to unambiguously identify whether a given Request-Id has the hierarchical schema.
Therefore every implementation which supports the hierarchical structure MUST prepend "|" (vertical bar) to a generated Request-Id.
It also MUST append "." (dot) to the end of a generated Request-Id to unambiguously mark its end (e.g. a search for |123 may return |1234, but a search for |123. would be exact).
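The two markers above can be checked with plain string operations. The following is a minimal sketch, not a reference implementation; the function names `is_hierarchical`, `root_id` and `ids_with_prefix` are hypothetical and introduced only for illustration.

```python
from typing import List


def is_hierarchical(request_id: str) -> bool:
    """A hierarchical Request-Id is recognized by its leading "|"."""
    return request_id.startswith("|")


def root_id(request_id: str) -> str:
    """Return the root-id: the text between the leading "|" and the first "."."""
    if not is_hierarchical(request_id):
        raise ValueError("not a hierarchical Request-Id")
    return request_id[1:].partition(".")[0]


def ids_with_prefix(ids: List[str], prefix: str) -> List[str]:
    """Simulate a prefix search over stored Request-Ids."""
    return [i for i in ids if i.startswith(prefix)]


stored = ["|123.", "|1234.", "|123.1."]
# Without the trailing dot the search also picks up the unrelated root 1234.
loose = ids_with_prefix(stored, "|123")
# With the trailing dot only ids under root 123 match.
exact = ids_with_prefix(stored, "|123.")
```

Here `loose` contains all three ids, while `exact` contains only the ids that belong to root `123`.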
The root Request-Id is the topmost Request-Id, generated by the first instrumented service. In a hierarchical Request-Id, it is the root node and is common to all requests involved in processing the operation. It MUST be unique to every high-level operation in the system, so for every traced operation the implementation MUST generate a sufficiently large identifier, e.g. a GUID or a 64-bit or 128-bit random number. Note that random numbers can be encoded as strings to decrease the Request-Id length.
Root Request-Id MUST contain only Base64 and "-" (hyphen) characters.
The same considerations apply to client applications that make HTTP requests and generate the root Request-Id.
Note that in addition to the unique part, it may be useful to include some meaningful information such as host name, device or process id, etc. An implementation is free to do so, as long as it keeps the root id relatively short.
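Root Request-Id generation under these rules might look like the sketch below. It shows two of the options mentioned above (a GUID and a Base64-encoded 64-bit random number); the function names and the choice to strip Base64 padding are assumptions made for this example, not part of the protocol.

```python
import base64
import re
import secrets
import uuid

# Base64 alphabet plus "-" (hyphen), as required for the root id.
ROOT_ID_CHARS = re.compile(r"^[A-Za-z0-9+/=-]+$")


def new_root_request_id_from_guid() -> str:
    """Root Request-Id built from a random GUID, e.g. |9e74f0e5-....  (38 chars)."""
    return "|" + str(uuid.uuid4()) + "."


def new_root_request_id_from_random(num_bytes: int = 8) -> str:
    """Root Request-Id built from a random number encoded as Base64 to keep it short."""
    encoded = base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
    # Assumption: padding carries no information here, so it is dropped.
    return "|" + encoded.rstrip("=") + "."


def is_valid_root_request_id(request_id: str) -> bool:
    """Check the "|" prefix, the "." terminator and the allowed character set."""
    if not (request_id.startswith("|") and request_id.endswith(".")):
        return False
    return ROOT_ID_CHARS.match(request_id[1:-1]) is not None
```

With the default of 8 random bytes, the Base64 variant produces a 13-character Request-Id, compared with 38 characters for the GUID variant.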
When Request-Id is provided by upstream service, there is no guarantee that it is unique within the entire system.
An implementation SHOULD make it unique by adding a small suffix to the incoming Request-Id to represent the internal activity, and use the result for outgoing requests. If the implementation does not trust the incoming Request-Id at all, the suffix may be as long as the root Request-Id. We recommend appending a random string of 8 characters (e.g. a 32-bit hex-encoded random integer).
The suffix MUST contain only Base64 and "-" (hyphen) characters.
Implementation MUST append "_" (underscore) to mark the end of generated incoming Request-Id.
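A sketch of how an implementation might derive the Request-Id for an incoming request, covering both the case where the upstream sent nothing and the case where an id must be extended. The function name `request_id_for_incoming` is hypothetical, and using a GUID for the root is just one of the permitted choices.

```python
import secrets
import uuid
from typing import Optional


def request_id_for_incoming(header_value: Optional[str]) -> str:
    """Return the Request-Id that represents an incoming request."""
    if not header_value:
        # No Request-Id from upstream: generate a new root Request-Id.
        return "|" + str(uuid.uuid4()) + "."
    # 4 random bytes rendered as 8 hex characters (a 32-bit random integer).
    suffix = secrets.token_hex(4)
    # "_" marks the end of the generated incoming Request-Id.
    return header_value + suffix + "_"
```

For an incoming header value of `|Guid.1.`, the result has the shape `|Guid.1.da4e9679_`, with a different random suffix on each call.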
When making a request to a downstream service, an implementation MUST append a small id to the incoming Request-Id and pass the new Request-Id to the downstream service.
Implementation MUST append "." (dot) to mark the end of generated outgoing Request-Id.
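One simple way to produce the small id is a per-request counter, which matches the |Guid.1. shape used elsewhere in this document. The class below is a sketch under that assumption; `RequestScope` and `next_outgoing_id` are hypothetical names, and an implementation may choose a different small id.

```python
import itertools
import threading


class RequestScope:
    """Holds the Request-Id of the request being processed and numbers its outgoing calls."""

    def __init__(self, request_id: str) -> None:
        self.request_id = request_id
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_outgoing_id(self) -> str:
        """Append the next request number and the "." terminator."""
        # The lock keeps numbers unique when requests are started concurrently.
        with self._lock:
            number = next(self._counter)
        return self.request_id + str(number) + "."


scope = RequestScope("|Guid.")
first = scope.next_outgoing_id()   # "|Guid.1."
second = scope.next_outgoing_id()  # "|Guid.2."
```

The same class works for a scope created from an extended incoming id: `RequestScope("|Guid.1.da4e9679_")` yields `|Guid.1.da4e9679_1.` for its first outgoing request.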
It may be useful to split incoming request processing into multiple logical sub-operations and assign different identifiers to them, as is done for outgoing requests, except that the sub-operation is processed within the same service.
Extending Request-Id may cause it to exceed the length limit.
To handle overflow, implementation:
As a result Request-Id will look like:
Beginning-Of-Incoming-Request-Id.LocalId#
Thus, to the extent possible, Request-Id will keep the valid part of the hierarchical Id.
The overflow suffix should be large enough to ensure that the new Request-Id does not collide with any previous or future Request-Id within the same operation. A random 32-bit integer (or an 8-character string) is a good candidate. Note that applications could asynchronously start multiple outgoing requests at almost the same time, which makes a timestamp, even with tick precision, a bad candidate for the overflow suffix.
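The overflow shape shown above (the beginning of the incoming Request-Id, a local id, then "#") could be produced roughly as follows. This is only a sketch: the `max_length` parameter is hypothetical because the limit is not stated in this section, and trimming back to the last "." or "_" so that only whole nodes remain is an assumption of this example.

```python
import secrets


def append_with_overflow(request_id: str, local_id: str, max_length: int) -> str:
    """Append local_id to request_id, falling back to the overflow form if too long."""
    candidate = request_id + local_id + "."
    if len(candidate) <= max_length:
        return candidate
    # 8 hex characters from a random 32-bit integer.
    overflow_suffix = secrets.token_hex(4)
    # Leave room for the suffix and the "#" marker.
    budget = max_length - len(overflow_suffix) - 1
    trimmed = request_id[:max(budget, 0)]
    # Assumption: cut back to the last delimiter so no partial node is kept.
    cut = max(trimmed.rfind("."), trimmed.rfind("_"))
    if cut < 0:
        raise ValueError("max_length is too small to keep the root node")
    return trimmed[:cut + 1] + overflow_suffix + "#"
```

For example, with `request_id="|root.aaaa.bbbb.cccc."`, `local_id="dddd"` and `max_length=25`, the normal form would be 26 characters long, so the last node is dropped and the result has the shape `|root.aaaa.bbbb.XXXXXXXX#`, where the X characters are random hex digits.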
Let's consider three services: service-a, service-b and service-c. The user calls service-a, which calls service-b to fulfill the user request.
User -> service-a -> service-b
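1. service-a receives the request from the user:
   - it does not find Request-Id and generates a new root Request-Id |Guid.
   - it logs that the request has started, along with Request-Id: |Guid.
2. service-a makes a request to service-b:
   - it generates a new Request-Id by appending request number to the parent request id: |Guid.1.
   - it logs and sends the request with Request-Id: |Guid.1.
3. service-b receives the request:
   - it finds Request-Id: |Guid.1. in the request headers
   - it generates a new Request-Id |Guid.1.da4e9679_ to uniquely describe operation within service-b
   - it logs the request and its response with Request-Id: |Guid.1.da4e9679_
4. service-a receives the response from service-b and logs it with Request-Id: |Guid.1.

As a result log records may look like:
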
| Message | Component name | Context |
|---|---|---|
| user starts request to service-a | user | |
| incoming request | service-a | Request-Id=\|Guid. |
| request to service-b | service-a | Request-Id=\|Guid.1. |
| incoming request | service-b | Request-Id=\|Guid.1.da4e9679_ |
| response | service-b | Request-Id=\|Guid.1.da4e9679_ |
| response from service-b | service-a | Request-Id=\|Guid.1. |
| response | service-a | Request-Id=\|Guid. |
| response from service-a | user | |

All logs for the operation can be queried by the Request-Id prefix |Guid., and logs for a particular request may be queried by exact Request-Id match.
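
The two kinds of query can be illustrated against the log records from the table. The in-memory list and the helper names `logs_for_operation` and `logs_for_request` below are assumptions for this sketch; a real logging system would run the equivalent prefix and equality filters.

```python
from typing import List, Tuple

# (message, component, Request-Id) rows taken from the example table.
LogRecord = Tuple[str, str, str]

RECORDS: List[LogRecord] = [
    ("incoming request", "service-a", "|Guid."),
    ("request to service-b", "service-a", "|Guid.1."),
    ("incoming request", "service-b", "|Guid.1.da4e9679_"),
    ("response", "service-b", "|Guid.1.da4e9679_"),
    ("response from service-b", "service-a", "|Guid.1."),
    ("response", "service-a", "|Guid."),
]


def logs_for_operation(records: List[LogRecord], root_prefix: str) -> List[LogRecord]:
    """All records of the operation: Request-Id starts with the root prefix."""
    return [r for r in records if r[2].startswith(root_prefix)]


def logs_for_request(records: List[LogRecord], request_id: str) -> List[LogRecord]:
    """Records of one particular request: exact Request-Id match."""
    return [r for r in records if r[2] == request_id]


whole_operation = logs_for_operation(RECORDS, "|Guid.")   # all six records
call_to_b = logs_for_request(RECORDS, "|Guid.1.")         # the two service-a records for the call
```

A prefix query with `|Guid.1.` would additionally return the two service-b records, since their Request-Id extends that of the outgoing call.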