Install on GCP
Materialize provides a set of modular Terraform modules that deploy all of the services Materialize needs to run on Google Cloud. The modules are intended as a simple set of examples for deploying Materialize: you can use them as is, or take individual modules from the example and integrate them with your existing DevOps tooling.

{{% self-managed/materialize-components-sentence %}} The example on this page deploys a complete Materialize environment on GCP using the modular Terraform setup from this repository.

{{< warning >}}

{{< self-managed/terraform-disclaimer >}}

{{< /warning >}}

What Gets Created

This example provisions the following infrastructure:

Networking

| Resource | Description |
|----------|-------------|
| VPC Network | Custom VPC with auto-create subnets disabled |
| Subnet | `192.168.0.0/20` primary range with private Google access enabled |
| Secondary Ranges | Pods: `192.168.64.0/18`, Services: `192.168.128.0/20` |
| Cloud Router | For NAT and routing configuration |
| Cloud NAT | For outbound internet access from private nodes |
| VPC Peering | Service networking connection for Cloud SQL private access |
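The networking resources above can be sketched in Terraform roughly as follows (the resource names and layout are illustrative assumptions, not the module's actual code):

```hcl
# Illustrative sketch of the network layer; names are hypothetical.
resource "google_compute_network" "vpc" {
  name                    = "simple-demo-vpc"
  auto_create_subnetworks = false # custom subnets only
}

resource "google_compute_subnetwork" "subnet" {
  name                     = "simple-demo-subnet"
  network                  = google_compute_network.vpc.id
  ip_cidr_range            = "192.168.0.0/20"
  private_ip_google_access = true # reach Google APIs from private nodes

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "192.168.64.0/18"
  }
  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "192.168.128.0/20"
  }
}
```

GKE uses the two secondary ranges for Pod and Service IPs, which is why they are defined on the subnet rather than as separate subnets.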

Compute

| Resource | Description |
|----------|-------------|
| GKE Cluster | Regional cluster with Workload Identity enabled |
| Generic Node Pool | `e2-standard-8` machines, autoscaling 2-5 nodes, 50 GB disk, for general workloads |
| Materialize Node Pool | `n2-highmem-8` machines, autoscaling 2-5 nodes, 100 GB disk, 1 local SSD, swap enabled, dedicated taints for Materialize workloads |
| Service Account | GKE service account with Workload Identity binding |
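As a sketch, the dedicated Materialize node pool might look like this in Terraform (the taint key/value and resource names are assumptions; check the module source for the real definitions):

```hcl
# Illustrative sketch of the Materialize node pool; taint key is hypothetical.
resource "google_container_node_pool" "materialize" {
  name    = "simple-demo-materialize"
  cluster = google_container_cluster.primary.id

  autoscaling {
    min_node_count = 2
    max_node_count = 5
  }

  node_config {
    machine_type    = "n2-highmem-8"
    disk_size_gb    = 100
    local_ssd_count = 1

    # Keep general workloads off these nodes; only pods that
    # tolerate this taint (the Materialize workloads) schedule here.
    taint {
      key    = "materialize"
      value  = "true"
      effect = "NO_SCHEDULE"
    }
  }
}
```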

Database

| Resource | Description |
|----------|-------------|
| Cloud SQL PostgreSQL | Private IP only (no public IP) |
| Tier | `db-custom-2-4096` (2 vCPUs, 4 GB memory) |
| Database | `materialize` database with UTF8 charset |
| User | `materialize` user with auto-generated password |
| Network | Connected via VPC peering for private access |
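The database layer can be sketched as follows (the instance name and PostgreSQL version are illustrative assumptions):

```hcl
# Illustrative sketch of the private-IP Cloud SQL instance.
resource "google_sql_database_instance" "postgres" {
  name             = "simple-demo-pg"
  database_version = "POSTGRES_15" # assumed; the module pins its own version
  region           = "us-central1"

  settings {
    tier = "db-custom-2-4096" # 2 vCPUs, 4 GB memory

    ip_configuration {
      ipv4_enabled    = false                         # no public IP
      private_network = google_compute_network.vpc.id # reached via VPC peering
    }
  }
}

resource "google_sql_database" "materialize" {
  name     = "materialize"
  instance = google_sql_database_instance.postgres.name
  charset  = "UTF8"
}
```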

Storage

| Resource | Description |
|----------|-------------|
| Cloud Storage Bucket | Regional bucket for Materialize persistence |
| Access | HMAC keys for S3-compatible access. A Workload Identity service account with storage permissions is also configured but is not currently used by Materialize for GCS access; in the future, HMAC keys will be removed in favor of Workload Identity Federation or Kubernetes ServiceAccounts that impersonate IAM service accounts. |
| Versioning | Disabled (for testing; enable in production) |
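A sketch of the storage layer (the bucket name and service account reference are assumptions):

```hcl
# Illustrative sketch of the persistence bucket plus HMAC key.
resource "google_storage_bucket" "persist" {
  name     = "simple-demo-materialize-persist"
  location = "us-central1" # regional bucket

  versioning {
    enabled = false # testing only; enable in production
  }
}

# The HMAC key lets Materialize use GCS through its S3-compatible XML API.
resource "google_storage_hmac_key" "persist" {
  service_account_email = google_service_account.materialize.email
}
```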

Kubernetes Add-ons

| Resource | Description |
|----------|-------------|
| cert-manager | Certificate management controller for Kubernetes that automates TLS certificate provisioning and renewal |
| Self-signed ClusterIssuer | Issues self-signed TLS certificates used by the Materialize instance for secure internal communication between components (balancerd, console) |

Materialize

| Resource | Description |
|----------|-------------|
| Operator | Materialize Kubernetes operator in the `materialize` namespace |
| Instance | Single Materialize instance in the `materialize-environment` namespace |
| Load Balancers | GCP Load Balancers for access to Materialize |

{{< yaml-table data="self_managed/default_ports" >}}
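For orientation, installing the operator with the Terraform Helm provider looks roughly like this (the chart repository and chart name below are assumptions; the module pins its own chart and version):

```hcl
# Illustrative sketch only; chart coordinates are hypothetical.
resource "helm_release" "materialize_operator" {
  name             = "simple-demo" # matches name_prefix
  namespace        = "materialize"
  create_namespace = true
  repository       = "https://materializeinc.github.io/materialize"
  chart            = "materialize-operator"
}
```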

Prerequisites

GCP Account Requirements

A Google account with permission to:

  • Enable Google Cloud APIs/services for your project.
  • Create:
    • GKE clusters
    • Cloud SQL instances
    • Cloud Storage buckets
    • VPC networks and networking resources
    • Service accounts and IAM bindings

Required Tools

License Key

{{< yaml-table data="self_managed/license_key" >}}

Getting started: Simple example

{{< warning >}}

{{< self-managed/terraform-disclaimer >}}

{{< /warning >}}

{{< tip >}}

The simple example used in this tutorial enables Password authentication for the Materialize instance. To use a different authentication method, update authenticator_kind. See Authentication for the supported authentication mechanisms. {{< /tip >}}

Step 1: Set Up the Environment

  1. Open a terminal window.

  2. Clone the Materialize Terraform repository and go to the gcp/examples/simple directory.

    ```bash
    git clone https://github.com/MaterializeInc/materialize-terraform-self-managed.git
    cd materialize-terraform-self-managed/gcp/examples/simple
    ```
  3. Authenticate to GCP with your user account.

    ```bash
    gcloud auth login
    ```
  4. Find the list of GCP projects:

    ```bash
    gcloud projects list
    ```
  5. Set your active GCP project, substituting <PROJECT_ID> with your project ID:

    ```bash
    gcloud config set project <PROJECT_ID>
    ```
  6. Enable the following APIs for your project:

    ```bash
    gcloud services enable container.googleapis.com               # For creating Kubernetes clusters
    gcloud services enable compute.googleapis.com                 # For creating GKE nodes and other compute resources
    gcloud services enable sqladmin.googleapis.com                # For creating databases
    gcloud services enable cloudresourcemanager.googleapis.com    # For managing GCP resources
    gcloud services enable servicenetworking.googleapis.com       # For private network connections
    gcloud services enable iamcredentials.googleapis.com          # For security and authentication
    gcloud services enable iam.googleapis.com                     # For managing IAM service accounts and policies
    gcloud services enable storage.googleapis.com                 # For Cloud Storage buckets
    ```
  7. Authenticate application default credentials for Terraform:

    ```bash
    gcloud auth application-default login
    ```

Step 2: Configure Terraform Variables

  1. Create a terraform.tfvars file and specify the following variables:

    | Variable | Description |
    |----------|-------------|
    | `project_id` | Set to your GCP project ID. |
    | `name_prefix` | Set a prefix for all resource names (e.g., `simple-demo`); also used as the release name for the Operator. |
    | `region` | Set the GCP region for the deployment (e.g., `us-central1`). |
    | `license_key` | Set to your Materialize license key. |
    | `labels` | Set to the labels to apply to resources. |

    ```hcl
    project_id  = "my-gcp-project"
    name_prefix = "simple-demo"
    region      = "us-central1"
    license_key = "your-materialize-license-key"
    labels = {
      environment = "demo"
      created_by  = "terraform"
    }
    # internal_load_balancer = false   # default = true (internal load balancer). Set to false for a public load balancer.
    # ingress_cidr_blocks = ["x.x.x.x/n", ...]
    # k8s_apiserver_authorized_networks  = ["x.x.x.x/n", ...]
    ```

    {{% include-from-yaml data="self_managed/installation" name="installation-tfvars-variables-optional" %}}

Step 3: Apply the Terraform

  1. Initialize the Terraform directory to download the required providers and modules:

    ```bash
    terraform init
    ```
  2. Apply the Terraform configuration to create the infrastructure.

    ```bash
    terraform apply
    ```

    If you are satisfied with the planned changes, type yes when prompted to proceed.

  3. From the output, you will need the following field(s) to connect:

    • console_load_balancer_ip for the Materialize Console.
    • balancerd_load_balancer_ip to connect PostgreSQL-compatible clients/drivers.
    • external_login_password_mz_system for the mz_system password.

    ```bash
    terraform output -raw <field_name>
    ```

    {{< tip >}} Your shell may show an ending marker (such as %) because the output did not end with a newline. Do not include the marker when using the value. {{< /tip >}}

  4. Configure kubectl to connect to your GKE cluster, replacing:

    • <your-gke-cluster-name> with your cluster name; i.e., the gke_cluster_name in the Terraform output. For the simple example, your cluster name has the form <name_prefix>-gke; e.g., simple-demo-gke.

    • <your-region> with your cluster location; i.e., the gke_cluster_location in the Terraform output. Your region can also be found in your terraform.tfvars file.

    • <your-project-id> with your GCP project ID.

    ```bash
    # gcloud container clusters get-credentials <your-gke-cluster-name> --region <your-region> --project <your-project-id>
    gcloud container clusters get-credentials $(terraform output -raw gke_cluster_name) \
      --region $(terraform output -raw gke_cluster_location) \
      --project <your-project-id>
    ```

Step 4 (Optional): Verify the status of your deployment

  1. Check the status of your deployment: {{% include-from-yaml data="self_managed/installation" name="installation-verify-status" %}}

Step 5: Connect to Materialize

You can connect to Materialize via the Materialize Console or PostgreSQL-compatible tools/drivers using the following ports:

{{< yaml-table data="self_managed/default_ports" >}}

Connect using the Materialize Console

{{% include-from-yaml data="self_managed/installation" name="installation-access-methods" %}}

Using the console_load_balancer_ip and external_login_password_mz_system from the Terraform output, you can connect to Materialize via the Materialize Console.

  1. To connect to the Materialize Console, open a browser to https://<console_load_balancer_ip>:8080, substituting your <console_load_balancer_ip>.

    From the terminal, you can type:

    ```sh
    open "https://$(terraform output -raw console_load_balancer_ip):8080/materialize"
    ```

    {{< tip >}}

    {{% include-from-yaml data="self_managed/installation" name="install-uses-self-signed-cluster-issuer" %}}

    {{< /tip >}}

  2. Log in as mz_system, using external_login_password_mz_system as the password.

  3. Create new users and log out.

    In general, other than the initial login to create new users for a new deployment, avoid using mz_system, since mz_system is also used by the Materialize Operator for upgrades and maintenance tasks.

    For more information, see the documentation on authentication and authorization for Self-Managed Materialize.

  4. Log in as one of the newly created users.

Connect using psql

{{% include-from-yaml data="self_managed/installation" name="installation-access-methods" %}}

Using the balancerd_load_balancer_ip and external_login_password_mz_system from the Terraform output, you can connect to Materialize via PostgreSQL-compatible clients/drivers, such as psql:

  1. To connect using psql, in the connection string, specify:

    • mz_system as the user
    • balancerd_load_balancer_ip as the host
    • 6875 as the port:
    ```bash
    psql "postgres://mz_system@$(terraform output -raw balancerd_load_balancer_ip):6875/materialize"
    ```

    When prompted for the password, enter the external_login_password_mz_system value.

  2. Create new users and log out.

    In general, other than the initial login to create new users for a new deployment, avoid using mz_system, since mz_system is also used by the Materialize Operator for upgrades and maintenance tasks.

    For more information, see the documentation on authentication and authorization for Self-Managed Materialize.

  3. Log in as one of the newly created users.

Customizing Your Deployment

{{< tip >}} To reduce cost in your demo environment, you can tweak machine types and database tiers in main.tf. {{< /tip >}}
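For instance, a lower-cost demo could override machine types and the database tier in main.tf along these lines (the input variable names below are hypothetical; check the actual module inputs):

```hcl
# Hypothetical overrides for a cheaper demo; real input names may differ.
module "gke" {
  # ...
  generic_machine_type     = "e2-standard-4" # instead of e2-standard-8
  materialize_machine_type = "n2-highmem-4"  # instead of n2-highmem-8
}

module "database" {
  # ...
  tier = "db-custom-1-3840" # 1 vCPU, 3.75 GB, instead of db-custom-2-4096
}
```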

You can customize each module independently.

{{< note >}} GCP Storage Authentication Limitation: Materialize currently only supports HMAC key authentication for GCS access (S3-compatible API). While the modules configure both HMAC keys and Workload Identity, Materialize uses HMAC keys for actual storage access. {{< /note >}}

See also:

Cleanup

{{% self-managed/cleanup-cloud %}}

See Also