Back to Automq

DevKit - AutoMQ Local Development Environment

devkit/README.md

1.6.614.7 KB
Original Source

DevKit - AutoMQ Local Development Environment

Overview

DevKit is AutoMQ's local development environment. One-command Kafka cluster with MinIO S3 storage, JDWP debugging, chaos testing, and optional feature stacks.

Core principle: All commands run via just. Kafka commands execute inside containers — use shortcuts or just bin from the host.

Supported node counts: 1, 3, 4, 5. Node numbering starts at 0.

Prerequisites

  • Docker & Docker Compose
  • justbrew install just
  • envsubstbrew install gettext (macOS)
  • JDK 17+ (for just start-build only)

Quick Start

bash
# First time: build image + code, then start
just start-build        # single node
just start-build 3      # 3-node cluster

# After first build
just start              # single node
just start 3            # 3-node cluster
just start 3 tabletopic # 3-node + Iceberg table topic

Architecture

MinIO (S3 storage)
  └── mc (bucket init, one-shot)
        └── AutoMQ Nodes
              ├── node 0-2: controller + broker (KRaft voters)
              └── node 3+:  broker only

Node roles by count: a 4-node cluster has nodes 0-2 as controllers, node 3 as broker-only.

Optional feature stacks: tabletopic → Schema Registry + Iceberg REST Catalog, analytics → Spark + Trino.

Decision: Which start command to use?

Java code changed + Dockerfile changed → just start-build
Java code changed only                 → just restart-build
Config files changed only              → just restart
Nothing changed                        → just start [N]

Lifecycle

bash
just start                      # Start (skip build), single node
just start 3 tabletopic         # Start 3-node + features
just start-build                # Build image & code, then start
just stop                       # Stop containers (keep data)
just clean                      # Stop + remove all volumes
just restart                    # Restart containers (no rebuild)
just restart-build              # Rebuild code & image, then restart
just status                     # Show container status
just logs                       # Last 100 lines of all nodes
just logs 0                     # Last 100 lines of node-0
just logs 0 500                 # Last 500 lines of node-0
just logs-follow                # Follow all logs (blocks)
just logs-follow 0              # Follow node-0 logs (blocks)
just shell                      # Enter node-0 shell (interactive)
just shell 1                    # Enter node-1 shell (interactive)
just exec 0 ls /opt/automq/bin  # Run a single command in node-0
just build-image                # Rebuild Docker image only
just wait                       # Wait until all nodes healthy (auto-called by start)

just logs outputs to stdout — pipe it freely:

bash
just logs 0 | grep ERROR
just logs 0 500 | grep WARN
just logs 0 | head -20          # oldest entries
just logs 0 | tail -20          # newest entries

Topic Operations

bash
just topic-list                                     # List all topics
just topic-create my-topic --partitions 16          # Create topic
just topic-describe my-topic                        # Describe topic

# Producer — reads from stdin, supports --node N and all Kafka producer flags
just produce my-topic                               # Interactive (type messages, Ctrl-D to end)
echo "hello" | just produce my-topic               # Pipe a single message
just produce my-topic --node 1                      # Produce via node-1

# Produce key:value messages
printf 'k1:v1\nk2:v2\n' | just produce my-topic \
  --property parse.key=true \
  --property "key.separator=:"

# Specify client-id (use --producer-property, not --client-id)
echo "hello" | just produce my-topic \
  --producer-property client.id=my-producer

# Consumer — supports --node N and all Kafka consumer flags
just consume my-topic --from-beginning              # Consume from beginning
just consume my-topic --node 1                      # Consume via node-1
just consume my-topic --max-messages 10             # Stop after 10 messages

# Print keys with custom separator (special characters passed safely)
just consume my-topic --from-beginning \
  --property print.key=true \
  --property "key.separator=:"

# Specify client-id (use --consumer-property, not --client-id)
just consume my-topic --from-beginning \
  --consumer-property client.id=my-consumer

Consumer Group Operations

bash
just group-list                                     # List all consumer groups
just group-describe my-group                        # Show offsets and lag per partition
just group-members my-group                         # Show active members and assignments
just group-reset my-group my-topic --to-earliest    # Reset to earliest offset
just group-reset my-group my-topic --to-latest      # Reset to latest offset
just group-reset my-group my-topic --to-offset 100  # Reset to specific offset

Cluster Operations

bash
just broker-list                # List brokers and supported API versions
just broker-config              # Describe all broker configs (cluster-wide)
just broker-config 0            # Describe broker 0 configs only

Bin (direct passthrough)

Run any Kafka bin/ script inside a node container. Commands run via docker exec, so localhost:9092 refers to the broker inside the container.

bash
just bin kafka-topics.sh --bootstrap-server localhost:9092 --list
just bin kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type brokers
just bin --node 1 kafka-broker-api-versions.sh --bootstrap-server localhost:9092

# One-shot shell command in a container
just exec 0 ls /opt/automq/bin
just exec 0 cat /config/node-0.properties

Perf

bash
# -1 = unlimited throughput
just perf-produce my-topic --num-records 100000 --record-size 1024 --throughput -1
just perf-consume my-topic --messages 100000

Diagnostics

Arthas (Java runtime diagnostics)

bash
just arthas         # Attach to node-0 Kafka process (interactive)
just arthas 1       # Attach to node-1 (interactive)

# Non-interactive: run a single Arthas command and exit
just arthas-exec 0 version
just arthas-exec 0 'dashboard -n 1'       # one snapshot of CPU/memory/threads
just arthas-exec 0 'thread -n 5'          # top 5 busy threads
just arthas-exec 0 'jvm'                  # JVM info (heap, GC, classloader)
just arthas-exec 0 'sc kafka.server.*'    # search loaded classes

# Note: streaming commands like watch/trace require active traffic to produce
# output and will block until -n limit is reached. Use just arthas for those.

JMX

JMX exposed on node-0 at port 9999 via jmxterm (pre-installed).

bash
just jmx -e 'domains'                                                              # List all JMX domains
just jmx -e 'beans -d kafka.server'                                                # List beans in domain
just jmx -e 'info -b kafka.server:name=BrokerState,type=KafkaServer'              # View bean attributes
just jmx -e 'get -b kafka.server:name=BrokerState,type=KafkaServer Value'         # Broker state (3 = running)
just jmx -e 'get -b kafka.server:name=MessagesInPerSec,type=BrokerTopicMetrics Count'
just jmx --node 1 -e 'get -b kafka.server:name=BrokerState,type=KafkaServer Value'

JDK tools

bash
just shell      # Enter node-0 shell — jstack, jmap, jcmd available
just shell 1    # Enter node-1 shell

To change JVM heap size, edit HEAP_OPTS in justfile (default: -Xms256m -Xmx256m).

IDEA Remote Debugging

JDWP is enabled on all nodes at startup. Port mapping: node-0 → 5005, node-1 → 5006, node-2 → 5007, ...

bash
# In IDEA: Run → Edit Configurations → + → Remote JVM Debug → localhost:5005
just start
# Then attach debugger

Chaos Testing

Inject faults using tc netem (network delay/loss) and iptables (traffic blocking). NET_ADMIN capability is already set in docker-compose.yml.

Always run just chaos-reset when done — clears all tc rules, iptables rules, and unpauses MinIO.

Node network

bash
just chaos-delay                    # 200ms delay on node-0
just chaos-delay 500 node-1         # 500ms delay on node-1
just chaos-loss                     # 5% packet loss on node-0
just chaos-loss 10 node-1           # 10% packet loss on node-1

S3 (MinIO) faults

bash
# Targeted: only affects traffic between the specified node and MinIO
just chaos-s3-delay 500 node-0      # 500ms latency to S3 on node-0
just chaos-s3-loss 10 node-0        # 10% packet loss to S3 on node-0

# Global
just chaos-s3-down                  # Pause MinIO (S3 completely unavailable)
just chaos-s3-up                    # Resume MinIO

# Partition (iptables DROP)
just chaos-s3-partition node-1      # Isolate node-1 from S3
just chaos-s3-partition-reset       # Restore S3 connectivity only

Node-to-node partition

bash
just chaos-partition node-0 node-1  # Bidirectional isolation (iptables DROP)
just chaos-partition-reset          # Restore node connectivity only

Reset & status

bash
just chaos-reset                    # Remove ALL chaos rules (tc + iptables + unpause MinIO)
just chaos-status                   # Show active rules per node and MinIO state

Scenarios

S3 slow response

bash
just chaos-s3-delay 500 node-0
just produce my-topic
just chaos-reset

S3 outage

bash
just chaos-s3-down
just logs
just chaos-s3-up

Single node S3 partition

bash
just chaos-s3-partition node-1
just logs 1
just chaos-s3-partition-reset

Controller network partition

bash
just start 3
just chaos-partition node-0 node-1
just logs
just chaos-partition-reset

Combined faults

bash
just start 3
just chaos-delay 200 node-0
just chaos-s3-loss 10 node-0
just produce my-topic
just chaos-reset

Features

Append feature names to just start. Multiple features can be combined.

bash
just start 1 telemetry              # Single node + OTel metrics
just start 3 tabletopic             # 3-node + Iceberg table topic
just start 5 zerozone analytics     # 5-node + zone router + query engines
FeatureWhat it enablesExtra services
tabletopicIceberg table topic, REST catalog, schema registrySchema Registry (:8081), Iceberg REST (:8181)
zerozoneZone router, auto AZ assignment (round-robin: az-0/1/2)— (config only)
telemetryOTel metrics export via OTLP HTTP — edit config/features/telemetry.properties to set collector endpoint before use— (config only)
analyticsSpark + Trino for querying Iceberg tablesSpark (:8888), Trino (:8090)

Namespace (Multi-Instance Isolation)

Run multiple DevKit instances on the same machine without port conflicts. Each namespace gets a unique COMPOSE_PROJECT_NAME and port offset.

Requires separate git worktrees — each worktree has its own devkit/ directory and .devkit/compose.env, so each instance is fully isolated. You cannot run multiple namespaces from the same devkit directory.

bash
# In worktree A:
just set-ns lag-test 1    # ID=1 → ports offset by +100
just start 3

# In worktree B:
just set-ns perf 2        # ID=2 → ports offset by +200
just start

# Revert to default ports
just clear-ns
  • Name: 1-8 chars, must start with a letter, only [a-z0-9-]
  • ID: 1-9, determines port offset (ID * 100)
  • Config is stored in .devkit/compose.env (gitignored)

Port Allocation

NodeKafkaControllerJDWPJMX
09092909350059999
119092190935006
229092290935007
339092390935008
449092490935009

Other services:

Configuration

5-layer system — later layers override earlier ones. Each layer is a template rendered via envsubst before Docker starts. Final merged config: .devkit/config/node-N.properties.

LayerFileVariablesPurpose
1config/defaults.propertiesS3 varsS3 endpoints, KRaft basics
2config/role/server.properties or broker.propertiesS3 vars, NODE_IDListeners, process roles
3config/role/node.propertiesNODE_ID, CLUSTER_ID, VOTERS, BOOTSTRAPKRaft identity
4config/features/*.propertiesAll vars + AZ_NAMEFeature-specific settings
5config/custom.propertiesverbatim (no substitution)Highest priority overrides

Variables

S3 variables (defined in justfile, Layer 1/2/4):

VariableDefaultDescription
S3_REGIONus-east-1S3 region
S3_ENDPOINThttp://minio:9000S3 endpoint URL
S3_DATA_BUCKETautomq-dataData bucket
S3_OPS_BUCKETautomq-opsOps bucket
S3_PATH_STYLEtruePath-style access (required for MinIO)
S3_ACCESS_KEYadminS3 access key (exported as KAFKA_S3_ACCESS_KEY to containers)
S3_SECRET_KEYpasswordS3 secret key (exported as KAFKA_S3_SECRET_KEY to containers)

Per-node variables (computed per node, Layer 2/3/4):

VariableExampleDescription
NODE_ID0, 1, 2Node index (0-based)
CLUSTER_IDabc-123Stable UUID, persisted in .devkit/cluster-id
VOTERS0@node-0:9093,1@node-1:9093KRaft controller voter list
BOOTSTRAPnode-0:9093,node-1:9093Controller bootstrap servers
AZ_NAMEaz-0, az-1, az-2Round-robin by NODE_ID % 3 (used by zerozone)

Customization examples

Use real AWS S3 — edit justfile:

just
S3_REGION      := "ap-northeast-1"
S3_ENDPOINT    := "https://s3.ap-northeast-1.amazonaws.com"
S3_DATA_BUCKET := "my-automq-data"
S3_OPS_BUCKET  := "my-automq-ops"
S3_PATH_STYLE  := "false"
S3_ACCESS_KEY  := "AKIA..."
S3_SECRET_KEY  := "..."

Change listeners (e.g. SASL) — edit config/role/server.properties:

properties
process.roles=broker,controller
listeners=SASL_PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=SASL_PLAINTEXT://node-${NODE_ID}:9092,CONTROLLER://node-${NODE_ID}:9093

Add broker settings — create config/custom.properties (gitignored):

properties
num.partitions=4
log.retention.hours=24

Add a new feature — create config/features/my-feature.properties:

properties
my.setting=value
my.s3.path=s3://${S3_DATA_BUCKET}/my-prefix
my.node.label=node-${NODE_ID}
bash
just start 1 my-feature