Back to Materialize

mz_introspection

doc/user/content/reference/system-catalog/mz_introspection.md

12335.6 KB
Original Source

The following sections describe the available objects in the mz_introspection schema.

{{< warning >}} The objects in the mz_introspection schema are not part of Materialize's stable interface. Backwards-incompatible changes to these objects may be made at any time. {{< /warning >}}

{{< warning >}} SELECT statements may reference these objects, but creating views that reference these objects is not allowed. {{< /warning >}}

Introspection relations are maintained by independently collecting internal logging information within each of the replicas of a cluster. Thus, in a multi-replica cluster, queries to these relations need to be directed to a specific replica by issuing the command SET cluster_replica = <replica_name>. Note that once this command is issued, all subsequent SELECT queries, for introspection relations or not, will be directed to the targeted replica. Replica targeting can be cancelled by issuing the command RESET cluster_replica.

For each of the below introspection relations, there exists also a variant with a _per_worker name suffix. Per-worker relations expose the same data as their global counterparts, but have an extra worker_id column that splits the information by Timely Dataflow worker.

mz_active_peeks

The mz_active_peeks view describes all read queries ("peeks") that are pending in the dataflow layer.

<!-- RELATION_SPEC mz_introspection.mz_active_peeks -->
FieldTypeMeaning
iduuidThe ID of the peek request.
object_idtextThe ID of the collection the peek is targeting. Corresponds to mz_catalog.mz_indexes.id, mz_catalog.mz_materialized_views.id, mz_catalog.mz_sources.id, or mz_catalog.mz_tables.id.
typetextThe type of the corresponding peek: index if targeting an index or temporary dataflow; persist for a source, materialized view, or table.
timemz_timestampThe timestamp the peek has requested.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_active_peeks_per_worker -->

mz_arrangement_sharing

The mz_arrangement_sharing view describes how many times each arrangement in the system is used.

<!-- RELATION_SPEC mz_introspection.mz_arrangement_sharing -->
FieldTypeMeaning
operator_iduint8The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
countbigintThe number of operators that share the arrangement.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_sharing_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_sharing_raw -->

mz_arrangement_sizes

The mz_arrangement_sizes view describes the size of each arrangement in the system.

The size, capacity, and allocations are an approximation, which may underestimate the actual size in memory. Specifically, reductions can use more memory than we show here.

<!-- RELATION_SPEC mz_introspection.mz_arrangement_sizes -->
FieldTypeMeaning
operator_iduint8The ID of the operator that created the arrangement. Corresponds to mz_dataflow_operators.id.
recordsbigintThe number of records in the arrangement.
batchesbigintThe number of batches in the arrangement.
sizebigintThe utilized size in bytes of the arrangement.
capacitybigintThe capacity in bytes of the arrangement. Can be larger than the size.
allocationsbigintThe number of separate memory allocations backing the arrangement.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_sizes_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_records_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_batcher_allocations_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_batcher_capacity_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_batcher_records_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_batcher_size_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_batches_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_heap_allocations_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_heap_capacity_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_arrangement_heap_size_raw -->

mz_compute_error_counts

The mz_compute_error_counts view describes the counts of errors in objects exported by dataflows in the system.

Dataflow exports that don't have any errors are not included in this view.

<!-- RELATION_SPEC mz_introspection.mz_compute_error_counts -->
FieldTypeMeaning
export_idtextThe ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
countnumericThe count of errors present in this dataflow export.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_error_counts_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_error_counts_raw -->

mz_compute_exports

The mz_compute_exports view describes the objects exported by dataflows in the system.

<!-- RELATION_SPEC mz_introspection.mz_compute_exports -->
FieldTypeMeaning
export_idtextThe ID of the index, materialized view, or subscription exported by the dataflow. Corresponds to mz_catalog.mz_indexes.id, mz_catalog.mz_materialized_views.id, or mz_internal.mz_subscriptions.
dataflow_iduint8The ID of the dataflow. Corresponds to mz_dataflows.id.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_exports_per_worker -->

mz_compute_frontiers

The mz_compute_frontiers view describes the frontier of each dataflow export in the system. The frontier describes the earliest timestamp at which the output of the dataflow may change; data prior to that timestamp is sealed.

<!-- RELATION_SPEC mz_introspection.mz_compute_frontiers -->
FieldTypeMeaning
export_idtextThe ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
timemz_timestampThe next timestamp at which the dataflow output may change.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_frontiers_per_worker -->

mz_compute_import_frontiers

The mz_compute_import_frontiers view describes the frontiers of each dataflow import in the system. The frontier describes the earliest timestamp at which the input into the dataflow may change; data prior to that timestamp is sealed.

<!-- RELATION_SPEC mz_introspection.mz_compute_import_frontiers -->
FieldTypeMeaning
export_idtextThe ID of the dataflow export. Corresponds to mz_compute_exports.export_id.
import_idtextThe ID of the dataflow import. Corresponds to mz_catalog.mz_sources.id or mz_catalog.mz_tables.id or mz_compute_exports.export_id.
timemz_timestampThe next timestamp at which the dataflow input may change.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_import_frontiers_per_worker -->

mz_compute_operator_durations_histogram

The mz_compute_operator_durations_histogram view describes a histogram of the duration in nanoseconds of each invocation for each dataflow operator.

<!-- RELATION_SPEC mz_introspection.mz_compute_operator_durations_histogram -->
FieldTypeMeaning
iduint8The ID of the operator. Corresponds to mz_dataflow_operators.id.
duration_nsuint8The upper bound of the duration bucket in nanoseconds.
countnumericThe (noncumulative) count of invocations in the bucket.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_operator_durations_histogram_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_operator_durations_histogram_raw -->

mz_cluster_prometheus_metrics

The mz_cluster_prometheus_metrics source exposes Prometheus metrics collected from each cluster replica process's internal metrics registry. Metrics are scraped periodically and presented as rows. Histograms are flattened into separate bucket, sum, and count rows. Summaries are flattened into separate quantile, sum, and count rows.

<!-- RELATION_SPEC mz_introspection.mz_cluster_prometheus_metrics NO_COMMENTS -->
FieldTypeMeaning
process_iduint8The ID of the process that collected the metric.
metric_nametextThe name of the Prometheus metric.
metric_typetextThe type of the metric: counter, gauge, histogram, or summary.
labelsmapThe label key-value pairs associated with the metric.
valuedouble precisionThe numeric value of the metric.
helptextThe help string describing the metric.

mz_dataflows

The mz_dataflows view describes the dataflows in the system.

<!-- RELATION_SPEC mz_introspection.mz_dataflows -->
FieldTypeMeaning
iduint8The ID of the dataflow.
nametextThe internal name of the dataflow.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflows_per_worker -->

mz_dataflow_addresses

The mz_dataflow_addresses view describes how the dataflow channels and operators in the system are nested into scopes.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_addresses -->
FieldTypeMeaning
iduint8The ID of the channel or operator. Corresponds to mz_dataflow_channels.id or mz_dataflow_operators.id.
addressbigint listA list of scope-local indexes indicating the path from the root to this channel or operator.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_addresses_per_worker -->

mz_dataflow_arrangement_sizes

The mz_dataflow_arrangement_sizes view describes the size of arrangements per operators under each dataflow.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_arrangement_sizes -->
FieldTypeMeaning
iduint8The ID of the dataflow. Corresponds to mz_dataflows.id.
nametextThe name of the dataflow.
recordsbigintThe number of records in all arrangements in the dataflow.
batchesbigintThe number of batches in all arrangements in the dataflow.
sizebigintThe utilized size in bytes of the arrangements.
capacitybigintThe capacity in bytes of the arrangements. Can be larger than the size.
allocationsbigintThe number of separate memory allocations backing the arrangements.

mz_dataflow_channels

The mz_dataflow_channels view describes the communication channels between dataflow operators. A communication channel connects one of the outputs of a source operator to one of the inputs of a target operator.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_channels -->
FieldTypeMeaning
iduint8The ID of the channel.
from_indexuint8The scope-local index of the source operator. Corresponds to mz_dataflow_addresses.address.
from_portuint8The source operator's output port.
to_indexuint8The scope-local index of the target operator. Corresponds to mz_dataflow_addresses.address.
to_portuint8The target operator's input port.
typetextThe container type of the channel.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_channels_per_worker -->

mz_dataflow_channel_operators

The mz_dataflow_channel_operators view associates dataflow channels with the operators that are their endpoints.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_channel_operators -->
FieldTypeMeaning
iduint8The ID of the channel. Corresponds to mz_dataflow_channels.id.
from_operator_iduint8The ID of the source of the channel. Corresponds to mz_dataflow_operators.id.
from_operator_addressuint8 listThe address of the source of the channel. Corresponds to mz_dataflow_addresses.address.
to_operator_iduint8The ID of the target of the channel. Corresponds to mz_dataflow_operators.id.
to_operator_addressuint8 listThe address of the target of the channel. Corresponds to mz_dataflow_addresses.address.
typetextThe container type of the channel.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_channel_operators_per_worker -->

mz_dataflow_global_ids

The mz_dataflow_global_ids view associates dataflow ids with global ids (ids of the form u8 or t5).

<!-- RELATION_SPEC mz_introspection.mz_dataflow_global_ids -->
FieldTypeMeaning
iduint8The dataflow ID.
global_idtextA global ID associated with that dataflow.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_dataflow_global_ids_per_worker -->

mz_dataflow_operators

The mz_dataflow_operators view describes the dataflow operators in the system.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_operators -->
FieldTypeMeaning
iduint8The ID of the operator.
nametextThe internal name of the operator.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operators_per_worker -->

mz_dataflow_operator_dataflows

The mz_dataflow_operator_dataflows view describes the dataflow to which each operator belongs.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_operator_dataflows -->
FieldTypeMeaning
iduint8The ID of the operator. Corresponds to mz_dataflow_operators.id.
nametextThe internal name of the operator.
dataflow_iduint8The ID of the dataflow hosting the operator. Corresponds to mz_dataflows.id.
dataflow_nametextThe internal name of the dataflow hosting the operator.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operator_dataflows_per_worker -->

mz_dataflow_operator_parents

The mz_dataflow_operator_parents view describes how dataflow operators are nested into scopes, by relating operators to their parent operators.

<!-- RELATION_SPEC mz_introspection.mz_dataflow_operator_parents -->
FieldTypeMeaning
iduint8The ID of the operator. Corresponds to mz_dataflow_operators.id.
parent_iduint8The ID of the operator's parent operator. Corresponds to mz_dataflow_operators.id.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operator_parents_per_worker -->

mz_expected_group_size_advice

The mz_expected_group_size_advice view provides advice on opportunities to set query hints. Query hints are applicable to dataflows maintaining MIN, MAX, or Top K query patterns. The maintainance of these query patterns is implemented inside an operator scope, called a region, through a hierarchical scheme for either aggregation or Top K computations.

<!-- RELATION_SPEC mz_introspection.mz_expected_group_size_advice -->
FieldTypeMeaning
dataflow_iduint8The ID of the dataflow. Corresponds to mz_dataflows.id.
dataflow_nametextThe internal name of the dataflow hosting the min/max aggregation or Top K.
region_iduint8The ID of the root operator scope. Corresponds to mz_dataflow_operators.id.
region_nametextThe internal name of the root operator scope for the min/max aggregation or Top K.
levelsbigintThe number of levels in the hierarchical scheme implemented by the region.
to_cutbigintThe number of levels that can be eliminated (cut) from the region's hierarchy.
savingsnumericA conservative estimate of the amount of memory in bytes to be saved by applying the hint.
hintdouble precisionThe hint value that will eliminate to_cut levels from the region's hierarchy.

mz_mappable_objects

The mz_mappable_objects identifies indexes (and their underlying views) and materialized views which can be debugged using the mz_lir_mapping view.

<!-- RELATION_SPEC mz_introspection.mz_mappable_objects -->
FieldTypeMeaning
nametextThe name of the object.
global_idtextThe global ID of the object.

See Which part of my query runs slowly or uses a lot of memory? for examples of debugging with mz_mappable_objects and mz_lir_mapping.

mz_lir_mapping

The mz_lir_mapping view describes the low-level internal representation (LIR) plan that corresponds to global ids of indexes (and their underlying views) and materialized views. You can find a list of all debuggable objects in mz_mappable_objects. LIR is a higher-level representation than dataflows; this view is used for profiling and debugging indices and materialized views. Note that LIR is not a stable interface and may change at any time. In particular, you should not attempt to parse operator descriptions. LIR nodes are implemented by zero or more dataflow operators with sequential ids. We use the range [operator_id_start, operator_id_end) to record this information. If an LIR node was implemented without any dataflow operators, operator_id_start will be equal to operator_id_end.

<!-- RELATION_SPEC mz_introspection.mz_lir_mapping -->
FieldTypeMeaning
global_idtextThe global ID.
lir_iduint8The LIR node ID.
operatortextThe LIR operator, in the format OperatorName INPUTS [OPTIONS].
parent_lir_iduint8The parent of this LIR node. May be NULL.
nestinguint2The nesting level of this LIR node.
operator_id_startuint8The first dataflow operator ID implementing this LIR operator (inclusive).
operator_id_enduint8The first dataflow operator ID after this LIR operator (exclusive).
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_lir_mapping_per_worker -->

mz_message_counts

The mz_message_counts view describes the messages and message batches sent and received over the dataflow channels in the system. It distinguishes between individual records (sent, received) and batches of records (batch_sent, batch_sent).

<!-- RELATION_SPEC mz_introspection.mz_message_counts -->
FieldTypeMeaning
channel_iduint8The ID of the channel. Corresponds to mz_dataflow_channels.id.
sentnumericThe number of messages sent.
receivednumericThe number of messages received.
batch_sentnumericThe number of batches sent.
batch_receivednumericThe number of batches received.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_message_counts_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_message_batch_counts_received_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_message_batch_counts_sent_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_message_counts_received_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_message_counts_sent_raw -->

mz_peek_durations_histogram

The mz_peek_durations_histogram view describes a histogram of the duration in nanoseconds of read queries ("peeks") in the dataflow layer.

<!-- RELATION_SPEC mz_introspection.mz_peek_durations_histogram -->
FieldTypeMeaning
typetextThe peek variant: index or persist.
duration_nsuint8The upper bound of the bucket in nanoseconds.
countnumericThe (noncumulative) count of peeks in this bucket.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_peek_durations_histogram_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_peek_durations_histogram_raw -->

mz_records_per_dataflow

The mz_records_per_dataflow view describes the number of records in each dataflow.

<!-- RELATION_SPEC mz_introspection.mz_records_per_dataflow -->
FieldTypeMeaning
iduint8The ID of the dataflow. Corresponds to mz_dataflows.id.
nametextThe internal name of the dataflow.
recordsbigintThe number of records in the dataflow.
batchesbigintThe number of batches in the dataflow.
sizebigintThe utilized size in bytes of the arrangements.
capacitybigintThe capacity in bytes of the arrangements. Can be larger than the size.
allocationsbigintThe number of separate memory allocations backing the arrangements.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_records_per_dataflow_per_worker -->

mz_records_per_dataflow_operator

The mz_records_per_dataflow_operator view describes the number of records in each dataflow operator in the system.

<!-- RELATION_SPEC mz_introspection.mz_records_per_dataflow_operator -->
FieldTypeMeaning
iduint8The ID of the operator. Corresponds to mz_dataflow_operators.id.
nametextThe internal name of the operator.
dataflow_iduint8The ID of the dataflow. Corresponds to mz_dataflows.id.
recordsbigintThe number of records in the operator.
batchesbigintThe number of batches in the dataflow.
sizebigintThe utilized size in bytes of the arrangement.
capacitybigintThe capacity in bytes of the arrangement. Can be larger than the size.
allocationsbigintThe number of separate memory allocations backing the arrangement.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_records_per_dataflow_operator_per_worker -->

mz_scheduling_elapsed

The mz_scheduling_elapsed view describes the total amount of time spent in each dataflow operator.

<!-- RELATION_SPEC mz_introspection.mz_scheduling_elapsed -->
FieldTypeMeaning
iduint8The ID of the operator. Corresponds to mz_dataflow_operators.id.
elapsed_nsnumericThe total elapsed time spent in the operator in nanoseconds.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_scheduling_elapsed_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_scheduling_elapsed_raw -->

mz_scheduling_parks_histogram

The mz_scheduling_parks_histogram view describes a histogram of dataflow worker park events. A park event occurs when a worker has no outstanding work.

<!-- RELATION_SPEC mz_introspection.mz_scheduling_parks_histogram -->
FieldTypeMeaning
slept_for_nsuint8The actual length of the park event in nanoseconds.
requested_nsuint8The requested length of the park event in nanoseconds.
countnumericThe (noncumulative) count of park events in this bucket.
<!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_scheduling_parks_histogram_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_scheduling_parks_histogram_raw --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_hydration_times_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_compute_operator_hydration_statuses_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operator_reachability --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operator_reachability_per_worker --> <!-- RELATION_SPEC_UNDOCUMENTED mz_introspection.mz_dataflow_operator_reachability_raw -->