Sources

Overview

Sources describe external systems you want Materialize to read data from, and provide details about how to decode and interpret that data. A simple way to think of this is that sources represent streams and their schemas; this isn't entirely accurate, but it provides a useful mental model.

In terms of SQL, sources are similar to a combination of tables and clients.

  • Like tables, sources are structured components that users can read from.
  • Like clients, sources are responsible for reading data. External sources provide all of the underlying data to process.

By looking at what comprises a source, we can develop a sense for how this combination works.

Source components

Sources consist of the following components:

| Component | Use | Example |
| --- | --- | --- |
| Connector | Provides actual bytes of data to Materialize | Kafka |
| Format | Structure of the external source's bytes, i.e. its schema | Avro |
| Envelope | Expresses how Materialize should handle the incoming data, plus any additional formatting information | Upsert |
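
As a rough illustration of how the three components fit together, the sketch below defines a source using the example values from the table: a Kafka connector, the Avro format, and the Upsert envelope. The connection names, broker address, schema registry URL, topic, and source name are all hypothetical placeholders, and authentication options are omitted; the dedicated CREATE SOURCE pages remain the reference for the exact syntax supported by your version.

```sql
-- Hypothetical connection to a Kafka broker (address is a placeholder;
-- authentication options are omitted for brevity).
CREATE CONNECTION kafka_connection TO KAFKA (
    BROKER 'localhost:9092'
);

-- Hypothetical connection to a schema registry holding the Avro schemas.
CREATE CONNECTION csr_connection TO CONFLUENT SCHEMA REGISTRY (
    URL 'http://localhost:8081'
);

CREATE SOURCE purchases
  -- Connector: where the bytes come from.
  FROM KAFKA CONNECTION kafka_connection (TOPIC 'purchases')
  -- Format: how to decode those bytes into columns.
  FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_connection
  -- Envelope: how to interpret each decoded record (here, as an upsert by key).
  ENVELOPE UPSERT;
```

Each clause of the `CREATE SOURCE` statement maps to one row of the table, which is why changing the external system, the encoding, or the update semantics usually means changing only the corresponding clause.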

Connectors

Materialize bundles native connectors for the following external systems:

{{< include-md file="shared-content/multilink-box-native-connectors.md" >}}

For details on the syntax, supported formats and features of each connector, check out the dedicated CREATE SOURCE documentation pages.

Sources and clusters

Sources require compute resources in Materialize, and so need to be associated with a cluster. If possible, dedicate a cluster just for sources.
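
A minimal sketch of dedicating a cluster to sources might look like the following. The cluster name, its size, and the source name are assumptions chosen for illustration (available sizes depend on your Materialize version and deployment), and the example uses Materialize's built-in load generator only so that it does not depend on an external system.

```sql
-- Hypothetical cluster reserved for ingestion; the size value is an
-- assumption and may not match the sizes available in your environment.
CREATE CLUSTER ingest_cluster (SIZE = '100cc');

-- Associate the source with the dedicated cluster via IN CLUSTER.
CREATE SOURCE counter
  IN CLUSTER ingest_cluster
  FROM LOAD GENERATOR COUNTER;
```

Keeping sources on their own cluster means that ingestion does not compete for resources with the queries and views running on other clusters.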

See also Operational guidelines.