Whereas Materialize Cloud gives you a fully managed service for Materialize, Self-Managed Materialize allows you to deploy Materialize in your own infrastructure.
Self-Managed Materialize deployments on Kubernetes consist of several layers of components that work together to provide a fully functional database environment. Understanding these components and how they interact is essential for deploying, managing, and troubleshooting your Self-Managed Materialize deployment.
This page provides an overview of the core architectural components in a Self-Managed deployment, from the infrastructure level (Helm chart) down to the application level (clusters and replicas).
A Self-Managed Materialize deployment is organized into the following layers:
| Layer | Component | Description |
|---|---|---|
| Infrastructure | Helm Chart | Package manager component that bootstraps the Kubernetes deployment |
| Orchestration | Materialize Operator | Kubernetes operator that manages Materialize instances |
| Database | Materialize Instance | The Materialize database instance itself |
| Compute | Clusters and Replicas | Isolated compute resources for workloads |
The Helm chart is the entry point for deploying Materialize in a self-managed Kubernetes environment. It serves as a package manager component that defines and deploys the Materialize Operator.
You interact with the Helm chart through standard Helm commands. For example:
To add the Materialize Helm chart repository:

```shell
helm repo add materialize https://materializeinc.github.io/materialize
```

To update the repository index:

```shell
helm repo update materialize
```

To install the Materialize Helm chart and deploy the Materialize Operator and other resources:

```shell
helm install materialize materialize/materialize-operator
```

To upgrade the Materialize Helm chart (and the Materialize Operator and other resources):

```shell
helm upgrade materialize materialize/materialize-operator
```

To uninstall the Helm chart (and the Materialize Operator and other resources):

```shell
helm uninstall materialize
```
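For repeatable installs, it can help to install into a dedicated namespace and pin the chart version. The following is a minimal sketch that wraps the install command in shell functions. The `materialize` namespace, the `CHART_VERSION` variable, and the function names are assumptions for illustration, not values the chart requires.

```shell
# Assumed defaults; override them from the environment as needed.
RELEASE="${RELEASE:-materialize}"
NAMESPACE="${NAMESPACE:-materialize}"

# Install the operator chart into its own namespace. If CHART_VERSION is
# set, pass it to Helm so the same chart version is installed every time.
install_operator() {
  set -- install "$RELEASE" materialize/materialize-operator \
    --namespace "$NAMESPACE" --create-namespace
  if [ -n "${CHART_VERSION:-}" ]; then
    set -- "$@" --version "$CHART_VERSION"
  fi
  helm "$@"
}

# Report the state of the release after installation.
check_operator() {
  helm status "$RELEASE" --namespace "$NAMESPACE"
}
```

For example, `CHART_VERSION=<version> install_operator` followed by `check_operator`. Leaving `CHART_VERSION` unset installs whatever chart version the local repository index currently resolves to.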
When you install the Materialize Helm chart, it deploys the Materialize Operator and other resources to your Kubernetes cluster.
Once installed, the Materialize Operator handles the deployment and management of Materialize instances.
The Materialize Operator (implemented as orchestratord) is a Kubernetes operator that automates the deployment and lifecycle management of Materialize instances. It implements the Kubernetes operator pattern to extend Kubernetes with domain-specific knowledge about Materialize.
The operator watches for Materialize custom resources and creates/manages all the Kubernetes resources required to run a Materialize instance, including:
- balancerd and console pods, used as the ingress layer for Materialize.
- environmentd and clusterd, which are the database control plane and compute resources, respectively.

For configuration options for the Materialize Operator, see the Materialize Operator Configuration page.
A Materialize instance is the actual database that you connect to and interact with. Each instance is an isolated Materialize deployment (deployed via a Kubernetes Custom Resource) with its own data, configuration, and compute resources.
When you create a Materialize instance, the operator deploys three core components as Kubernetes resources:
- balancerd: A pgwire and HTTP proxy that routes all Materialize client connections to environmentd for handling. balancerd is deployed as a Kubernetes Deployment.
- environmentd: The main database control plane, deployed as a Kubernetes StatefulSet. environmentd runs as a Kubernetes pod and is the primary component of a Materialize instance. On startup, environmentd creates several built-in clusters.
- console: The web-based administration interface, deployed as a Kubernetes Deployment.
A Materialize instance manages:
To deploy Materialize instances with the operator, create and apply Materialize custom resources (CRs). For a full list of fields available for the Materialize CR, see Materialize CRD Field Descriptions.
```yaml
apiVersion: materialize.cloud/v1alpha1
kind: Materialize
metadata:
  name: 12345678-1234-1234-1234-123456789012
  namespace: materialize-environment
spec:
  environmentdImageRef: materialize/environmentd:{{< self-managed/versions/get-latest-version >}}
  # ... additional fields omitted for brevity
```
When you first apply the Materialize custom resource, the operator automatically creates all required Kubernetes resources.
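As a sketch of that first apply, the functions below apply a manifest and then list what the operator created. The `materialize.yaml` file name and the function names are hypothetical; the namespace is the one used in the example manifest.

```shell
# Namespace from the example manifest; the manifest path is hypothetical.
MZ_NAMESPACE="${MZ_NAMESPACE:-materialize-environment}"
MZ_MANIFEST="${MZ_MANIFEST:-materialize.yaml}"

# Apply the Materialize custom resource so the operator can act on it.
apply_instance() {
  kubectl apply -f "$MZ_MANIFEST"
}

# List the workloads and services the operator created for the instance.
show_instance_resources() {
  kubectl get all --namespace "$MZ_NAMESPACE"
}
```

Running `apply_instance` and then `show_instance_resources` should eventually show the environmentd, balancerd, and console workloads once the operator has reconciled the resource.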
To modify an instance, update its custom resource with your changes and set the
requestRollout field to a new UUID value. When you apply the updated resource, the
operator rolls out the changes.
{{< note >}} If you do not specify a new requestRollout UUID, the operator
watches for updates but does not roll out the changes.
{{< /note >}}
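One way to supply a fresh UUID is to patch the resource from the command line. The sketch below assumes that `requestRollout` lives under `spec` in the Materialize custom resource and that `uuidgen` is installed; the function names are illustrative.

```shell
# Build a JSON merge patch that sets requestRollout to the given UUID.
# Assumes the field lives under spec in the Materialize custom resource.
rollout_patch() {
  printf '{"spec":{"requestRollout":"%s"}}' "$1"
}

# Patch the named instance with a freshly generated UUID to request a rollout.
# Usage: request_rollout <instance-name> [namespace]
request_rollout() {
  kubectl patch materialize "$1" \
    --namespace "${2:-materialize-environment}" \
    --type merge \
    --patch "$(rollout_patch "$(uuidgen)")"
}
```

The patch only sets the rollout UUID. Any other changes to the resource still need to be applied, either in the same manifest or before calling `request_rollout`.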
Once deployed, you interact with a Materialize instance through the Materialize Console or standard PostgreSQL-compatible tools and drivers:
```shell
# Connect with psql
psql "postgres://materialize@<host>:6875/materialize"
```
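For scripted checks, the connection URL can be assembled from its parts. This is a small sketch; the helper names `mz_url` and `mz_query` are hypothetical, and the defaults simply mirror the user, port, and database in the example URL.

```shell
# Build a connection URL from parts. The defaults mirror the example URL:
# user "materialize", port 6875, database "materialize".
# Usage: mz_url <host> [port] [user] [database]
mz_url() {
  printf 'postgres://%s@%s:%s/%s' \
    "${3:-materialize}" "$1" "${2:-6875}" "${4:-materialize}"
}

# Run a single statement and exit, which is handy for smoke tests.
# Usage: mz_query <host> <sql>
mz_query() {
  psql "$(mz_url "$1")" --command "$2"
}
```

For example, `mz_query <host> 'SELECT 1;'` confirms that the endpoint accepts connections without opening an interactive session.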
Once connected, you can issue SQL commands to create sources, define views, run queries, and manage the database:
```sql
-- Create a source
CREATE SOURCE my_source FROM KAFKA ...;

-- Create a materialized view
CREATE MATERIALIZED VIEW my_view AS
  SELECT ... FROM my_source ...;

-- Query the view
SELECT * FROM my_view;
```
Clusters are isolated pools of compute resources that execute workloads in Materialize. They provide resource isolation and fault tolerance for your data processing pipelines.
For a comprehensive overview of clusters in Materialize, see the Clusters concept page.
Each replica contains identical compute resources and processes the same data independently, providing fault tolerance and high availability.
When you create a cluster with one or more replicas in Materialize, the instance coordinates with the operator to create a Kubernetes StatefulSet for each replica.
For example:
```sql
-- Create a cluster with 2 replicas
CREATE CLUSTER my_cluster SIZE = '100cc', REPLICATION FACTOR = 2;
```
This creates two separate StatefulSets in Kubernetes, each running compute processes.
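To see this from both sides, you can compare what Kubernetes reports with what Materialize reports. The sketch below assumes the `materialize-environment` namespace from the example manifest; the function names are illustrative.

```shell
# List StatefulSets in the instance namespace. After creating the cluster
# above, expect one entry per replica alongside any other StatefulSets.
# Usage: list_statefulsets [namespace]
list_statefulsets() {
  kubectl get statefulsets --namespace "${1:-materialize-environment}"
}

# Ask Materialize which cluster replicas it knows about.
# Usage: show_replicas <connection-url>
show_replicas() {
  psql "$1" --command 'SHOW CLUSTER REPLICAS;'
}
```

The replicas listed by `show_replicas` for `my_cluster` should correspond to two of the StatefulSets listed by `list_statefulsets`.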
You interact with clusters primarily through SQL:
```sql
-- Create a cluster
CREATE CLUSTER ingest_cluster SIZE = '50cc', REPLICATION FACTOR = 1;

-- Use the previous cluster for a source
CREATE SOURCE my_source
  IN CLUSTER ingest_cluster
  FROM KAFKA ...;

-- Create a cluster for materialized views
CREATE CLUSTER compute_cluster SIZE = '100cc', REPLICATION FACTOR = 2;

-- Use the previous cluster for a materialized view
CREATE MATERIALIZED VIEW my_view
  IN CLUSTER compute_cluster AS
  SELECT ... FROM my_source ...;

-- Resize a cluster
ALTER CLUSTER compute_cluster SET (SIZE = '200cc');
```
Materialize handles the underlying Kubernetes resource creation and management automatically.
The following workflow summarizes how the components work together:

1. Install the Helm chart: This deploys the Materialize Operator to your Kubernetes cluster.
2. Create a Materialize instance: Apply a Materialize custom resource. The operator detects it and creates all necessary Kubernetes resources, including the environmentd, balancerd, and console pods.
3. Connect to the instance: Open the Materialize Console through the console service endpoint on port 8080, or connect a SQL client to the balancerd service endpoint on port 6875.

   If authentication is enabled, you must first connect to the Materialize Console and set up users.
4. Create clusters: Issue SQL commands to create clusters. Materialize coordinates with the operator to provision StatefulSets for replicas.
5. Run your workloads: Create sources, materialized views, indexes, and sinks on your clusters.
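For the connection step, one option during evaluation is to reach the service endpoints through a local port-forward. This is a sketch only: the Service names depend on your deployment, so the functions take them as arguments, and the `materialize-environment` namespace is assumed from the example manifest.

```shell
MZ_NAMESPACE="${MZ_NAMESPACE:-materialize-environment}"

# List Services so you can find the console and balancerd endpoints.
list_endpoints() {
  kubectl get services --namespace "$MZ_NAMESPACE"
}

# Forward a local port to the same port on a Service.
# Usage: forward <service-name> <port>
forward() {
  kubectl port-forward "service/$1" "$2:$2" --namespace "$MZ_NAMESPACE"
}
```

After `list_endpoints`, running `forward <balancerd-service> 6875` makes the SQL endpoint available on `localhost:6875`, and `forward <console-service> 8080` does the same for the Console. Each `forward` call blocks until interrupted, so run them in separate terminals.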
To help you get started, Materialize provides Terraform modules.
{{< important >}} These modules are intended for evaluation/demonstration purposes and for serving as a template when building your own production deployment. The modules should not be directly relied upon for production deployments: future releases of the modules will contain breaking changes. Instead, to use as a starting point for your own production deployment, either:
Fork the repo and pin to a specific version; or
Use the code as a reference when developing your own deployment.
{{</ important >}}
{{< tabs >}} {{< tab "Terraform Modules (New!)" >}}
Materialize provides Terraform modules, which provide concrete examples and an opinionated model for deploying Materialize.
{{< yaml-table data="self_managed/terraform_list" >}}
{{< /tab >}} {{< tab "Legacy Terraform Modules" >}}
{{< yaml-table data="self_managed/terraform_list_legacy" >}} {{< /tab >}} {{< /tabs >}}