Dynamic Analysis Metadata Format Specification (v 0.2)

Overview

The format is based on concepts defined in the ECMA-335 Partition II metadata standard and in the Portable PDB format.

Metadata Layout

The physical layout of the Dynamic Analysis metadata blob starts with Header, followed by Tables, followed by Heaps. Layout of each of these three parts is defined in the following sections.

When stored in a managed PE file, the Dynamic Analysis metadata blob is embedded as a Manifest Resource (see ECMA-335 §6.2.2 and §22.24) of name <DynamicAnalysisData>.

Unless stated otherwise, all binary values are stored in little-endian format.

This document uses the term (un)signed compressed integer for an encoding of an (un)signed 29-bit integer as defined in ECMA-335 §23.2.
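
As an illustration of the compressed integer encoding referenced above, the following is a minimal Python sketch of a decoder, not a normative implementation. The function names `read_compressed_uint` and `read_compressed_int` are hypothetical. The sketch assumes the ECMA-335 rules: the leading bits of the first byte select a 1-, 2-, or 4-byte big-endian encoding, and signed values keep their sign in the lowest bit of the encoded value.

```python
def read_compressed_uint(data, pos=0):
    """Decode an unsigned compressed integer; return (value, next_pos)."""
    b0 = data[pos]
    if (b0 & 0x80) == 0:       # 0xxxxxxx: 7-bit value in 1 byte
        return b0, pos + 1
    if (b0 & 0xC0) == 0x80:    # 10xxxxxx: 14-bit value in 2 bytes
        return ((b0 & 0x3F) << 8) | data[pos + 1], pos + 2
    if (b0 & 0xE0) == 0xC0:    # 110xxxxx: 29-bit value in 4 bytes
        value = ((b0 & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer at offset %d" % pos)


def read_compressed_int(data, pos=0):
    """Decode a signed compressed integer; return (value, next_pos)."""
    raw, next_pos = read_compressed_uint(data, pos)
    # The payload width depends on how many bytes were consumed.
    width = {1: 7, 2: 14, 4: 29}[next_pos - pos]
    value = raw >> 1           # the sign lives in the lowest bit
    if raw & 1:
        value -= 1 << (width - 1)
    return value, next_pos
```

Both functions return the position of the next unread byte, so consecutive values can be decoded by threading `next_pos` through successive calls.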

Header

| Offset | Size | Field | Description |
|:-------|:-----|:------|:------------|
| 0 | 4 | Signature | 0x44 0x41 0x4D 0x44 (ASCII string: "DAMD") |
| 4 | 1 | MajorVersion | Major version of the format (0) |
| 5 | 1 | MinorVersion | Minor version of the format (2) |
| 6 | 4*T | TableRowCounts | Row count (encoded as uint32) of each table in the metadata |
| 6 + T*4 | 4*H | HeapSizes | Size (encoded as uint32) of each heap in bytes |

The number of tables in this version of the format is T = 2. These tables are Document Table and Method Table and their sizes are stored in this order. No table may contain more than 0x1000000 rows.

The number of heaps in this version of the format is H = 2. These heaps are GUID Heap and Blob Heap and their sizes are stored in this order. No heap may be larger than 2^29 bytes (0.5 GB).
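
The header layout can be sketched in Python as follows. This is an illustrative reading of the field descriptions above, not a reference parser. The function name `parse_header` and the shape of the returned dictionary are assumptions made for the example.

```python
import struct

TABLE_NAMES = ("Document", "Method")   # T = 2, stored in this order
HEAP_NAMES = ("GUID", "Blob")          # H = 2, stored in this order
MAX_ROWS = 0x1000000
MAX_HEAP_SIZE = 512 * 1024 * 1024      # 0.5 GB


def parse_header(blob):
    """Parse the fixed-size header at the start of the metadata blob."""
    header_size = 6 + 4 * len(TABLE_NAMES) + 4 * len(HEAP_NAMES)
    if len(blob) < header_size:
        raise ValueError("blob is too short to contain a header")
    if blob[0:4] != b"DAMD":
        raise ValueError("bad signature")
    major, minor = blob[4], blob[5]
    # Row counts and heap sizes are little-endian uint32 values.
    rows = struct.unpack_from("<%dI" % len(TABLE_NAMES), blob, 6)
    sizes = struct.unpack_from("<%dI" % len(HEAP_NAMES), blob,
                               6 + 4 * len(TABLE_NAMES))
    if any(count > MAX_ROWS for count in rows):
        raise ValueError("table row count exceeds the limit")
    if any(size > MAX_HEAP_SIZE for size in sizes):
        raise ValueError("heap size exceeds the limit")
    return {
        "version": (major, minor),
        "row_counts": dict(zip(TABLE_NAMES, rows)),
        "heap_sizes": dict(zip(HEAP_NAMES, sizes)),
        "header_size": header_size,
    }
```

With T = 2 and H = 2 the header occupies 22 bytes, so under this reading the Tables part begins at offset 22.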

<a name="Tables"></a>Tables

Entities stored in tables are referred to by row id if used in a context that implies the table. The first row of the table has row id 1. If the table is not implied by the context the entity is referred to by its token -- a 32-bit unsigned integer that combines the id of the table (in highest 8 bits) and the row id of the entity within that table (in lowest 24 bits).
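
The token layout described above can be illustrated with a small Python sketch. The helper names `make_token` and `split_token` are hypothetical. The table ids are the ones given in the table headings of this document.

```python
DOCUMENT_TABLE = 0x01
METHOD_TABLE = 0x02


def make_token(table_id, row_id):
    """Combine a table id (highest 8 bits) and a row id (lowest 24 bits)."""
    if not 0 <= table_id <= 0xFF:
        raise ValueError("table id must fit in 8 bits")
    if not 1 <= row_id <= 0xFFFFFF:
        # Row ids are 1-based: the first row of a table has row id 1.
        raise ValueError("row id must be in [1, 0xFFFFFF]")
    return (table_id << 24) | row_id


def split_token(token):
    """Return (table_id, row_id) for a 32-bit token."""
    return (token >> 24) & 0xFF, token & 0xFFFFFF
```

For example, row 5 of the Method table corresponds to the token 0x02000005.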

<a name="DocumentTable"></a>Document Table: 0x01

The Document table has the following columns:

  • Name (Blob heap index of document name blob)
  • HashAlgorithm (Guid heap index)
  • Hash (Blob heap index)

<a name="MethodTable"></a>Method Table: 0x02

The Method table has the following columns:

  • Spans (Blob heap index of spans blob)

<a name="Heaps"></a>Heaps

GUID

The encoding of GUID heap is the same as the encoding of ECMA #GUID heap defined in ECMA-335 §24.2.5.

Values stored in the GUID heap are referred to by their index. The first value stored in the heap has index 1, the second value stored in the heap has index 2, etc.
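
A lookup by 1-based index might look like the Python sketch below. The function name `read_guid` is hypothetical. The sketch assumes that each entry occupies 16 bytes and uses the mixed-endian byte layout that .NET produces for GUIDs, which corresponds to `bytes_le` in Python's `uuid` module.

```python
import uuid

GUID_SIZE = 16


def read_guid(guid_heap, index):
    """Return the GUID stored at the given 1-based index of the GUID heap."""
    if index < 1:
        raise ValueError("GUID heap indices start at 1")
    start = (index - 1) * GUID_SIZE
    end = start + GUID_SIZE
    if end > len(guid_heap):
        raise ValueError("GUID index out of range")
    # Assumption: entries use the .NET (mixed-endian) GUID byte order.
    return uuid.UUID(bytes_le=bytes(guid_heap[start:end]))
```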

Blob

The encoding of Blob heap is the same as the encoding of ECMA #Blob heap defined in ECMA-335 §24.2.4.

Values stored in the Blob heap are referred to by their offset in the heap (the distance between the start of the heap and the first byte of the encoded value). The first value of the heap has offset 0, size 1B, and encoded value 0x00 (it represents an empty blob).
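
Reading a value from the Blob heap by offset can be sketched in Python as follows. The function names are hypothetical. The sketch assumes the ECMA #Blob convention that each value is prefixed with its length, encoded as an unsigned compressed integer.

```python
def read_compressed_uint(data, pos=0):
    """Decode an unsigned compressed integer; return (value, next_pos)."""
    b0 = data[pos]
    if (b0 & 0x80) == 0:
        return b0, pos + 1
    if (b0 & 0xC0) == 0x80:
        return ((b0 & 0x3F) << 8) | data[pos + 1], pos + 2
    if (b0 & 0xE0) == 0xC0:
        value = ((b0 & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer at offset %d" % pos)


def read_blob(blob_heap, offset):
    """Return the bytes of the blob whose encoding starts at `offset`."""
    length, start = read_compressed_uint(blob_heap, offset)
    end = start + length
    if end > len(blob_heap):
        raise ValueError("blob extends past the end of the heap")
    return bytes(blob_heap[start:end])
```

Under this reading, offset 0 always yields the empty blob, because the first byte of the heap is the length prefix 0x00.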

<a name="SpansBlob"></a>Spans Blob

Span is a quadruple of integers and a document reference:

  • Start Line
  • Start Column
  • End Line
  • End Column
  • Document

The values of a Span must satisfy the following constraints:

  • Start Line is within range [0, 0x20000000)
  • End Line is within range [0, 0x20000000)
  • Start Column is within range [0, 0x10000)
  • End Column is within range [0, 0x10000)
  • End Line is greater than or equal to Start Line.
  • If Start Line is equal to End Line, then End Column is greater than Start Column.
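
The constraints listed above can be expressed as a small validation routine. The following Python sketch is illustrative only. The `Span` class and the `span_violations` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

MAX_LINE = 0x20000000    # exclusive upper bound for line values
MAX_COLUMN = 0x10000     # exclusive upper bound for column values


@dataclass(frozen=True)
class Span:
    start_line: int
    start_column: int
    end_line: int
    end_column: int
    document: int        # Document row id


def span_violations(span):
    """Return a list describing every constraint the span violates."""
    problems = []
    if not 0 <= span.start_line < MAX_LINE:
        problems.append("Start Line out of range")
    if not 0 <= span.end_line < MAX_LINE:
        problems.append("End Line out of range")
    if not 0 <= span.start_column < MAX_COLUMN:
        problems.append("Start Column out of range")
    if not 0 <= span.end_column < MAX_COLUMN:
        problems.append("End Column out of range")
    if span.end_line < span.start_line:
        problems.append("End Line precedes Start Line")
    elif span.end_line == span.start_line and span.end_column <= span.start_column:
        problems.append("End Column must be greater than Start Column")
    return problems
```

An empty list means the span satisfies all of the constraints.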

Spans blob has the following structure:

Blob ::= header span-record (span-record | document-record)*

header

| component | value stored | integer representation |
|:----------|:-------------|:-----------------------|
| InitialDocument | Document row id | unsigned compressed |

span-record

| component | value stored | integer representation |
|:----------|:-------------|:-----------------------|
| ΔLines | EndLine - StartLine | unsigned compressed |
| ΔColumns | EndColumn - StartColumn | ΔLines = 0: unsigned compressed, non-zero; ΔLines > 0: signed compressed |
| δStartLine | StartLine if this is the first span-record | unsigned compressed |
| | StartLine - PreviousSpan.StartLine otherwise | signed compressed |
| δStartColumn | StartColumn if this is the first span-record | unsigned compressed |
| | StartColumn - PreviousSpan.StartColumn otherwise | signed compressed |

Where PreviousSpan is the span encoded in the previous span-record.

document-record

| component | value stored | integer representation |
|:----------|:-------------|:-----------------------|
| ΔLines | 0 | unsigned compressed |
| ΔColumns | 0 | unsigned compressed |
| Document | Document row id | unsigned compressed |

Each span-record represents a single Span. When decoding the blob, the Document property of a Span is determined by the closest preceding document-record, or by InitialDocument if there is no preceding document-record.

The values of Start Line, Start Column, End Line and End Column of a Span are calculated based upon the values of the previous Span (if any) and the data stored in the record.
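
The decoding procedure can be sketched in Python as follows. This is one possible reading of the record descriptions above, not a reference decoder. The function names are hypothetical, and the sketch assumes that a record with ΔLines = 0 and ΔColumns = 0 is always a document-record, which follows from the requirement that ΔColumns is non-zero in a span-record when ΔLines = 0.

```python
def read_compressed_uint(data, pos):
    """Decode an unsigned compressed integer; return (value, next_pos)."""
    b0 = data[pos]
    if (b0 & 0x80) == 0:
        return b0, pos + 1
    if (b0 & 0xC0) == 0x80:
        return ((b0 & 0x3F) << 8) | data[pos + 1], pos + 2
    if (b0 & 0xE0) == 0xC0:
        value = ((b0 & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer at offset %d" % pos)


def read_compressed_int(data, pos):
    """Decode a signed compressed integer; return (value, next_pos)."""
    raw, next_pos = read_compressed_uint(data, pos)
    width = {1: 7, 2: 14, 4: 29}[next_pos - pos]
    value = raw >> 1
    if raw & 1:
        value -= 1 << (width - 1)
    return value, next_pos


def decode_spans(blob):
    """Return a list of (start_line, start_col, end_line, end_col, document)."""
    document, pos = read_compressed_uint(blob, 0)    # header: InitialDocument
    spans = []
    prev_start = None                                # start of previous span
    while pos < len(blob):
        d_lines, pos = read_compressed_uint(blob, pos)
        if d_lines == 0:
            d_cols, pos = read_compressed_uint(blob, pos)
        else:
            d_cols, pos = read_compressed_int(blob, pos)
        if d_lines == 0 and d_cols == 0:             # document-record
            document, pos = read_compressed_uint(blob, pos)
            continue
        if prev_start is None:                       # first span-record
            start_line, pos = read_compressed_uint(blob, pos)
            start_col, pos = read_compressed_uint(blob, pos)
        else:                                        # deltas from previous span
            delta_line, pos = read_compressed_int(blob, pos)
            delta_col, pos = read_compressed_int(blob, pos)
            start_line = prev_start[0] + delta_line
            start_col = prev_start[1] + delta_col
        spans.append((start_line, start_col,
                      start_line + d_lines, start_col + d_cols, document))
        prev_start = (start_line, start_col)
    return spans
```

Note that a document-record changes only the current document. The start position used for the next delta is still taken from the most recent span-record.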


Note: This encoding is similar to the encoding of the sequence points blob in the Portable PDB format.


<a name="DocumentNameBlob"></a>Document Name Blob

Document name blob is a sequence:

Blob ::= separator part+

where

  • separator is a UTF8 encoded character, or byte 0 to represent an empty separator.
  • part is a compressed integer offset into the Blob heap, where the part is stored in UTF8 encoding (0 represents an empty string).

The document name is a concatenation of the parts separated by the separator.
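
Reconstructing a document name from its blob can be sketched in Python as follows. The function names are hypothetical. The sketch assumes that each part is an unsigned compressed integer offset into the Blob heap, and that the blob at that offset is length-prefixed as in the ECMA #Blob heap.

```python
def read_compressed_uint(data, pos):
    """Decode an unsigned compressed integer; return (value, next_pos)."""
    b0 = data[pos]
    if (b0 & 0x80) == 0:
        return b0, pos + 1
    if (b0 & 0xC0) == 0x80:
        return ((b0 & 0x3F) << 8) | data[pos + 1], pos + 2
    if (b0 & 0xE0) == 0xC0:
        value = ((b0 & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, pos + 4
    raise ValueError("invalid compressed integer at offset %d" % pos)


def read_blob(blob_heap, offset):
    """Return the bytes of the length-prefixed blob starting at `offset`."""
    length, start = read_compressed_uint(blob_heap, offset)
    return bytes(blob_heap[start:start + length])


def utf8_char_length(lead_byte):
    """Return the byte length of a UTF-8 character from its leading byte."""
    if lead_byte < 0x80:
        return 1
    if (lead_byte >> 5) == 0b110:
        return 2
    if (lead_byte >> 4) == 0b1110:
        return 3
    return 4


def decode_document_name(blob_heap, name_blob):
    """Join the parts of a document name blob with its separator."""
    if name_blob[0] == 0:                  # byte 0: empty separator
        separator, pos = "", 1
    else:
        pos = utf8_char_length(name_blob[0])
        separator = bytes(name_blob[:pos]).decode("utf-8")
    parts = []
    while pos < len(name_blob):
        offset, pos = read_compressed_uint(name_blob, pos)
        parts.append(read_blob(blob_heap, offset).decode("utf-8"))
    return separator.join(parts)
```

Storing parts separately allows documents that share directory names to reference the same Blob heap entries instead of repeating the full path.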


Note: This encoding is the same as the encoding of the document name blob in the Portable PDB format.