distribution/ecs/README.md
Create a Quickwit module using:

```terraform
module "quickwit" {
  source                       = "github.com/quickwit-oss/quickwit//distribution/ecs/quickwit"
  vpc_id                       = # VPC in which all resources will be created
  subnet_ids                   = [...] # At least 2 private subnets must be specified
  quickwit_ingress_cidr_blocks = [...] # List of CIDR blocks allowed to access the Quickwit API
}
```
The Quickwit cluster runs in private subnets. For ECS to pull the
`quickwit/quickwit` image, the specified subnets must be configured with a NAT
Gateway (no public IPs are attached to the tasks).

To get the list of available configurations, check the
`./quickwit/variables.tf` file.
Metastore database backups are disabled, as restoring one would lead to
inconsistencies with the index store on S3. To ensure high availability,
enable `rds_config.multi_az` instead. To use your own Postgres database
instead of creating a new RDS instance, configure the
`external_postgres_uri_secret_arn` variable (e.g. the ARN of an SSM parameter
with the value `postgres://user:password@domain:port/db`).
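For instance, a minimal sketch of wiring in an external database through an
SSM parameter (the parameter name and connection string are placeholders):

```terraform
resource "aws_ssm_parameter" "quickwit_metastore_uri" {
  name  = "/quickwit/metastore-uri"
  type  = "SecureString"
  value = "postgres://user:password@my-db-host:5432/quickwit"
}

module "quickwit" {
  source = "github.com/quickwit-oss/quickwit//distribution/ecs/quickwit"
  # ... other required variables (vpc_id, subnet_ids, ...) ...
  # Skip the RDS instance and use the external database instead.
  external_postgres_uri_secret_arn = aws_ssm_parameter.quickwit_metastore_uri.arn
}
```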
Using NAT Gateways for the image registry is quite costly (approx. $0.05/hour/AZ). If you are not already using NAT Gateways in the AZs where Quickwit will be deployed, you should probably push the Quickwit image to ECR and use ECR interface VPC endpoints instead (approx. $0.01/hour/AZ).
When using the default image, you will quickly run into Docker Hub rate
limiting. We recommend pushing the Quickwit image to ECR and configuring it as
`quickwit_image`. Note that the architecture of the image that you push to ECR
must match the `quickwit_cpu_architecture` variable (`ARM64` by default).
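As a rough sketch (the ECR repository URL and image tag are placeholders):

```terraform
module "quickwit" {
  source = "github.com/quickwit-oss/quickwit//distribution/ecs/quickwit"
  # ... other required variables ...
  # Hypothetical ECR repository; the image architecture pushed there
  # must match quickwit_cpu_architecture.
  quickwit_image            = "123456789012.dkr.ecr.us-east-1.amazonaws.com/quickwit:latest"
  quickwit_cpu_architecture = "ARM64"
}
```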
Sidecar containers and custom logging can be configured using the variables
`sidecar_container_definitions`, `sidecar_container_dependencies`,
`log_configuration`, and `enable_cloudwatch_logging`. See the ECS
documentation on [custom log routing](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_firelens.html).
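A minimal sketch of a FireLens log-router sidecar (the variable shapes are
assumed here; check `./quickwit/variables.tf` for the exact types):

```terraform
module "quickwit" {
  source = "github.com/quickwit-oss/quickwit//distribution/ecs/quickwit"
  # ... other required variables ...
  # Hypothetical sidecar following the ECS container definition schema.
  sidecar_container_definitions = [
    {
      name      = "log_router"
      image     = "amazon/aws-for-fluent-bit:stable"
      essential = true
      firelensConfiguration = { type = "fluentbit" }
    }
  ]
  # Make the Quickwit container wait for the sidecar to start.
  sidecar_container_dependencies = [
    { containerName = "log_router", condition = "START" }
  ]
}
```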
You can use sidecars to inject additional secrets as files. This can be
useful for configuring sources such as Kafka. See `./example/kafka.tf` for an
example.
To access external AWS services like the Kinesis source, use the
`quickwit_indexer.extra_task_policy_arns` variable to attach the necessary
IAM policies to indexers.
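For example, a sketch granting indexers read access to Kinesis (assuming the
remaining `quickwit_indexer` fields keep their defaults):

```terraform
module "quickwit" {
  source = "github.com/quickwit-oss/quickwit//distribution/ecs/quickwit"
  # ... other required variables ...
  quickwit_indexer = {
    # AWS managed policy; swap in your own least-privilege policy ARN.
    extra_task_policy_arns = ["arn:aws:iam::aws:policy/AmazonKinesisReadOnlyAccess"]
  }
}
```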
We provide an example of a self-contained deployment with an ad hoc VPC.
> [!IMPORTANT]
> This stack costs ~$200/month to run (Fargate tasks, NAT Gateways, and RDS).
To make it easy to access your Quickwit cluster, the example stack includes
a bastion instance. Access is secured using an SSH key pair that you need to
provide (e.g. generated with `ssh-keygen -t ed25519`).

In the `./example` directory, create a `terraform.tfvars` file with the public
key of your key pair:

```terraform
bastion_public_key = "ssh-ed25519 ..."
```
> [!NOTE]
> You can skip the creation of the bastion by not specifying the
> `bastion_public_key` variable, but that would make it hard to access and
> experiment with the created Quickwit cluster.
In the same directory (`./example`), run:

```bash
terraform init
terraform apply
```
On success, the apply command outputs the IP of the bastion EC2 instance.
You can port forward Quickwit's search UI using:

```bash
ssh -N -L 7280:searcher.quickwit:7280 -i {your-private-key-file} ubuntu@{bastion_ip}
```
To ingest an example dataset, log into the bastion:

```bash
ssh -i {your-private-key-file} ubuntu@{bastion_ip}
```

Then, from the bastion:

```bash
# create the log index
wget https://raw.githubusercontent.com/quickwit-oss/quickwit/main/config/tutorials/hdfs-logs/index-config.yaml
curl -X POST \
  -H "content-type: application/yaml" \
  --data-binary @index-config.yaml \
  http://indexer.quickwit:7280/api/v1/indexes

# import some data
wget https://quickwit-datasets-public.s3.amazonaws.com/hdfs-logs-multitenants-10000.json
curl -X POST \
  -H "content-type: application/json" \
  --data-binary @hdfs-logs-multitenants-10000.json \
  "http://indexer.quickwit:7280/api/v1/hdfs-logs/ingest?commit=force"
```
If your SSH tunnel to the searcher is still running, you should be able to see the ingested data in the UI.
By default, the example stack uses Docker Hub to pull the Quickwit image. This
is convenient but quickly runs into rate limiting. To avoid this, in the
`terraform.tfvars` file, set `dockerhub_pull_through_creds_secret_arn` to the
ARN of an AWS Secrets Manager secret with the following content:

```json
{"username":"...","accessToken":"..."}
```
This will: