docs/compilers/Dynamic Analysis/MetadataFormat.md
The format is based on concepts defined in the ECMA-335 Partition II metadata standard and in the Portable PDB format.
The physical layout of the Dynamic Analysis metadata blob starts with Header, followed by Tables, followed by Heaps. Layout of each of these three parts is defined in the following sections.
When stored in a managed PE file, the Dynamic Analysis metadata blob is embedded as a Manifest Resource (see ECMA-335 §6.2.2 and §22.24) of name <DynamicAnalysisData>.
Unless stated otherwise, all binary values are stored in little-endian format.
This document uses the term (un)signed compressed integer for an encoding of an (un)signed 29-bit integer as defined in ECMA-335 §23.2.
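The compressed integer encoding can be illustrated with a small decoder. The following is a minimal sketch based on the ECMA-335 definition, not code taken from any particular implementation; the function names `read_compressed_uint` and `read_compressed_int` are hypothetical. Note that the multi-byte forms place the most significant byte first, independent of the little-endian rule that applies to other values.

```python
from typing import Tuple


def read_compressed_uint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an unsigned compressed integer; return (value, next_pos)."""
    first = data[pos]
    if first & 0x80 == 0:        # 0xxxxxxx: 7-bit value in 1 byte
        return first, pos + 1
    if first & 0xC0 == 0x80:     # 10xxxxxx: 14-bit value in 2 bytes
        return ((first & 0x3F) << 8) | data[pos + 1], pos + 2
    if first & 0xE0 == 0xC0:     # 110xxxxx: 29-bit value in 4 bytes
        value = ((first & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer prefix")


def read_compressed_int(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a signed compressed integer; return (value, next_pos)."""
    raw, next_pos = read_compressed_uint(data, pos)
    bits = {1: 7, 2: 14, 4: 29}[next_pos - pos]
    value = raw >> 1             # undo the left rotation by one bit
    if raw & 1:                  # the sign bit was rotated into bit 0
        value -= 1 << (bits - 1)
    return value, next_pos
```

For instance, the single byte `0x7B` decodes to -3 as a signed compressed integer, and the two bytes `0x80 0x80` decode to 128 as an unsigned one.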
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | Signature | 0x44 0x41 0x4D 0x44 (ASCII string: "DAMD") |
| 4 | 1 | MajorVersion | Major version of the format (0) |
| 5 | 1 | MinorVersion | Minor version of the format (2) |
| 6 | 4*T | TableRowCounts | Row count (encoded as uint32) of each table in the metadata |
| 6 + T*4 | 4*H | HeapSizes | Size (encoded as uint32) of each heap in bytes |
The number of tables in this version of the format is T = 2. These tables are Document Table and Method Table and their sizes are stored in this order. No table may contain more than 0x1000000 rows.
The number of heaps in this version of the format is H = 2. These heaps are GUID Heap and Blob Heap and their sizes are stored in this order. No heap may be larger than 2^29 bytes (0.5 GB).
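To make the header layout concrete, here is a minimal sketch of a header reader for this version of the format (T = 2, H = 2). The `Header` type and `parse_header` function are hypothetical names introduced for illustration; only the field layout comes from the table above.

```python
import struct
from typing import NamedTuple, Tuple

TABLE_COUNT = 2  # T: Document table, Method table (in this order)
HEAP_COUNT = 2   # H: GUID heap, Blob heap (in this order)


class Header(NamedTuple):
    major_version: int
    minor_version: int
    table_row_counts: Tuple[int, ...]
    heap_sizes: Tuple[int, ...]


def parse_header(blob: bytes) -> Header:
    # "<" selects little-endian with no padding:
    # 4-byte signature, two version bytes, then T + H uint32 values.
    fmt = "<4sBB%dI" % (TABLE_COUNT + HEAP_COUNT)
    signature, major, minor, *sizes = struct.unpack_from(fmt, blob, 0)
    if signature != b"DAMD":
        raise ValueError("not a Dynamic Analysis metadata blob")
    rows = tuple(sizes[:TABLE_COUNT])
    heaps = tuple(sizes[TABLE_COUNT:])
    if any(count > 0x1000000 for count in rows):
        raise ValueError("table row count exceeds the format limit")
    return Header(major, minor, rows, heaps)
```

With these constants the header occupies 6 + 4*T + 4*H = 22 bytes, so the tables start at offset 22 of the blob.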
Entities stored in tables are referred to by row id if used in a context that implies the table. The first row of the table has row id 1. If the table is not implied by the context the entity is referred to by its token -- a 32-bit unsigned integer that combines the id of the table (in highest 8 bits) and the row id of the entity within that table (in lowest 24 bits).
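The token layout can be sketched as a pair of helpers. The names `make_token` and `split_token` are hypothetical, and the table id used in the usage note is a placeholder, since this document does not list the table ids.

```python
from typing import Tuple


def make_token(table_id: int, row_id: int) -> int:
    """Combine a table id (highest 8 bits) and a row id (lowest 24 bits)."""
    if not 0 <= table_id <= 0xFF:
        raise ValueError("table id must fit in 8 bits")
    if not 0 <= row_id <= 0xFFFFFF:
        raise ValueError("row id must fit in 24 bits")
    return (table_id << 24) | row_id


def split_token(token: int) -> Tuple[int, int]:
    """Return (table_id, row_id) for a 32-bit token."""
    return (token >> 24) & 0xFF, token & 0xFFFFFF
```

For a placeholder table id of 1, row 5 yields the token `0x01000005`, and `split_token(0x01000005)` returns `(1, 5)`.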
The Document table has the following columns:
The Method table has the following columns:
The encoding of GUID heap is the same as the encoding of ECMA #GUID heap defined in ECMA-335 §24.2.5.
Values stored in the GUID heap are referred to by their index. The first value stored in the heap has index 1, the second value stored in the heap has index 2, etc.
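A lookup by 1-based index might look like the sketch below. It assumes each entry is a 16-byte GUID and that the bytes use the mixed-endian layout that Python exposes as `bytes_le`; the function name `read_guid` is hypothetical.

```python
import uuid

GUID_SIZE = 16  # each heap entry is a 128-bit GUID


def read_guid(guid_heap: bytes, index: int) -> uuid.UUID:
    """Return the GUID stored at the given 1-based heap index."""
    if index < 1:
        raise IndexError("GUID heap indices start at 1")
    start = (index - 1) * GUID_SIZE
    if start + GUID_SIZE > len(guid_heap):
        raise IndexError("GUID index is past the end of the heap")
    # Assumption: the stored byte order matches uuid's bytes_le layout.
    return uuid.UUID(bytes_le=guid_heap[start:start + GUID_SIZE])
```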
The encoding of Blob heap is the same as the encoding of ECMA #Blob heap defined in ECMA-335 §24.2.4.
Values stored in the Blob heap are referred to by their offset in the heap (the distance between the start of the heap and the first byte of the encoded value). The first value of the heap has offset 0, size 1B, and encoded value 0x00 (it represents an empty blob).
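Reading a value at a given offset can be sketched as follows, relying on the ECMA-335 rule that each blob is prefixed with its length as an unsigned compressed integer. The function name `read_blob` is hypothetical.

```python
def read_blob(blob_heap: bytes, offset: int) -> bytes:
    """Return the blob content stored at the given Blob heap offset."""
    first = blob_heap[offset]
    # The blob starts with its length as an unsigned compressed integer.
    if first & 0x80 == 0:
        length, start = first, offset + 1
    elif first & 0xC0 == 0x80:
        length = ((first & 0x3F) << 8) | blob_heap[offset + 1]
        start = offset + 2
    elif first & 0xE0 == 0xC0:
        length = int.from_bytes(blob_heap[offset:offset + 4], "big") & 0x1FFFFFFF
        start = offset + 4
    else:
        raise ValueError("invalid blob length prefix")
    end = start + length
    if end > len(blob_heap):
        raise ValueError("blob extends past the end of the heap")
    return blob_heap[start:end]
```

In a heap consisting of the bytes `00 03 61 62 63`, offset 0 yields the empty blob and offset 1 yields the three bytes `61 62 63`.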
Span is a quadruple of integers and a document reference:
The values must satisfy the following constraints:
Spans blob has the following structure:
Blob ::= header span-record (span-record | document-record)*
| component | value stored | integer representation |
|---|---|---|
| InitialDocument | Document row id | unsigned compressed |
| component | value stored | integer representation |
|---|---|---|
| ΔLines | EndLine - StartLine | unsigned compressed |
| ΔColumns | EndColumn - StartColumn | ΔLines = 0: unsigned compressed, non-zero; ΔLines > 0: signed compressed |
| δStartLine | StartLine if this is the first span-record | unsigned compressed |
| | StartLine - PreviousSpan.StartLine otherwise | signed compressed |
| δStartColumn | StartColumn if this is the first span-record | unsigned compressed |
| | StartColumn - PreviousSpan.StartColumn otherwise | signed compressed |
Where PreviousSpan is the span encoded in the previous span-record.
| component | value stored | integer representation |
|---|---|---|
| ΔLines | 0 | unsigned compressed |
| ΔColumns | 0 | unsigned compressed |
| Document | Document row id | unsigned compressed |
Each span-record represents a single Span. When decoding the blob, the Document property of a Span is determined by the closest preceding document-record, or by InitialDocument if there is no preceding document-record.
The values of Start Line, Start Column, End Line and End Column of a Span are calculated based upon the values of the previous Span (if any) and the data stored in the record.
Note: This encoding is similar to the encoding of the sequence points blob in the Portable PDB format.
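The decoding rules for the spans blob can be sketched as a small reader. This is an illustrative reading of the tables above rather than a reference implementation; the `Span` type, `decode_spans` and the helper names are hypothetical. It distinguishes a document-record from a span-record by both ΔLines and ΔColumns being zero, and it computes deltas relative to the previous span-record even across document-records.

```python
from typing import List, NamedTuple, Optional, Tuple


class Span(NamedTuple):
    document: int      # Document row id
    start_line: int
    start_column: int
    end_line: int
    end_column: int


def _read_uint(data: bytes, pos: int) -> Tuple[int, int, int]:
    """Unsigned compressed integer: return (value, next_pos, bit_width)."""
    b = data[pos]
    if b & 0x80 == 0:
        return b, pos + 1, 7
    if b & 0xC0 == 0x80:
        return ((b & 0x3F) << 8) | data[pos + 1], pos + 2, 14
    if b & 0xE0 == 0xC0:
        return int.from_bytes(data[pos:pos + 4], "big") & 0x1FFFFFFF, pos + 4, 29
    raise ValueError("invalid compressed integer prefix")


def _read_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Signed compressed integer: return (value, next_pos)."""
    raw, pos, bits = _read_uint(data, pos)
    value = raw >> 1
    if raw & 1:                      # sign bit is stored in bit 0
        value -= 1 << (bits - 1)
    return value, pos


def decode_spans(blob: bytes) -> List[Span]:
    spans: List[Span] = []
    document, pos, _ = _read_uint(blob, 0)            # header: InitialDocument
    previous: Optional[Span] = None
    while pos < len(blob):
        delta_lines, pos, _ = _read_uint(blob, pos)
        if delta_lines == 0:
            delta_columns, pos, _ = _read_uint(blob, pos)
        else:
            delta_columns, pos = _read_int(blob, pos)
        if delta_lines == 0 and delta_columns == 0:
            document, pos, _ = _read_uint(blob, pos)  # document-record
            continue
        if previous is None:                          # first span-record
            start_line, pos, _ = _read_uint(blob, pos)
            start_column, pos, _ = _read_uint(blob, pos)
        else:                                         # deltas from PreviousSpan
            d_line, pos = _read_int(blob, pos)
            d_column, pos = _read_int(blob, pos)
            start_line = previous.start_line + d_line
            start_column = previous.start_column + d_column
        previous = Span(document, start_line, start_column,
                        start_line + delta_lines, start_column + delta_columns)
        spans.append(previous)
    return spans
```

The sketch does not validate the span constraints or reject a blob whose first record is a document-record; a stricter reader would add both checks.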
Document name blob is a sequence:
Blob ::= separator part+
where
The document name is a concatenation of the parts separated by the separator.
Note: This encoding is the same as the encoding of the document name blob in the Portable PDB format.
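As an illustration, a document name could be reassembled as sketched below. The sketch follows the Portable PDB definition of the document name blob, and these details are assumptions here: the separator is taken to be a single byte (0 meaning an empty separator), and each part is an unsigned compressed integer giving the Blob heap offset of a UTF-8 encoded string. The function names are hypothetical.

```python
from typing import List, Tuple


def _read_uint(data: bytes, pos: int) -> Tuple[int, int]:
    """Unsigned compressed integer: return (value, next_pos)."""
    b = data[pos]
    if b & 0x80 == 0:
        return b, pos + 1
    if b & 0xC0 == 0x80:
        return ((b & 0x3F) << 8) | data[pos + 1], pos + 2
    if b & 0xE0 == 0xC0:
        return int.from_bytes(data[pos:pos + 4], "big") & 0x1FFFFFFF, pos + 4
    raise ValueError("invalid compressed integer prefix")


def _read_blob(blob_heap: bytes, offset: int) -> bytes:
    """Return the length-prefixed blob stored at a Blob heap offset."""
    length, start = _read_uint(blob_heap, offset)
    return blob_heap[start:start + length]


def decode_document_name(name_blob: bytes, blob_heap: bytes) -> str:
    # Assumption: a single-byte separator, where 0 means "no separator".
    separator = "" if name_blob[0] == 0 else chr(name_blob[0])
    parts: List[str] = []
    pos = 1
    while pos < len(name_blob):
        # Assumption: each part is a Blob heap offset of a UTF-8 string.
        part_offset, pos = _read_uint(name_blob, pos)
        parts.append(_read_blob(blob_heap, part_offset).decode("utf-8"))
    return separator.join(parts)
```

Storing parts as heap offsets lets documents that share directory names reuse the same heap entries.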