devkit/README.md
DevKit is AutoMQ's local development environment. One-command Kafka cluster with MinIO S3 storage, JDWP debugging, chaos testing, and optional feature stacks.
Core principle: all commands run via `just`. Kafka commands execute inside containers; use the shortcuts below or `just bin` from the host.
Supported node counts: 1, 3, 4, 5. Node numbering starts at 0.
Prerequisites: `just` (`brew install just`) and `envsubst` (`brew install gettext` on macOS).

# First time: build image + code, then start
just start-build # single node
just start-build 3 # 3-node cluster
# After first build
just start # single node
just start 3 # 3-node cluster
just start 3 tabletopic # 3-node + Iceberg table topic
MinIO (S3 storage)
└── mc (bucket init, one-shot)
└── AutoMQ Nodes
├── node 0-2: controller + broker (KRaft voters)
└── node 3+: broker only
Node roles by count: in a 4-node cluster, nodes 0-2 run as combined controller+broker (KRaft voters) and node 3 is broker-only.
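The role rule can be sketched in plain shell (illustrative only; the actual assignment is done by the compose templates):

```shell
# First three nodes (or all of them, if the cluster is smaller) are
# controller+broker KRaft voters; any further nodes are broker-only.
N=4  # cluster size
i=0
while [ "$i" -lt "$N" ]; do
  if [ "$i" -lt 3 ]; then
    echo "node-$i: controller+broker"
  else
    echo "node-$i: broker"
  fi
  i=$((i + 1))
done
```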
Optional feature stacks: tabletopic → Schema Registry + Iceberg REST Catalog, analytics → Spark + Trino.
| What changed | Command |
|---|---|
| Java code + Dockerfile | `just start-build` |
| Java code only | `just restart-build` |
| Config files only | `just restart` |
| Nothing | `just start [N]` |
just start # Start (skip build), single node
just start 3 tabletopic # Start 3-node + features
just start-build # Build image & code, then start
just stop # Stop containers (keep data)
just clean # Stop + remove all volumes
just restart # Restart containers (no rebuild)
just restart-build # Rebuild code & image, then restart
just status # Show container status
just logs # Last 100 lines of all nodes
just logs 0 # Last 100 lines of node-0
just logs 0 500 # Last 500 lines of node-0
just logs-follow # Follow all logs (blocks)
just logs-follow 0 # Follow node-0 logs (blocks)
just shell # Enter node-0 shell (interactive)
just shell 1 # Enter node-1 shell (interactive)
just exec 0 ls /opt/automq/bin # Run a single command in node-0
just build-image # Rebuild Docker image only
just wait # Wait until all nodes healthy (auto-called by start)
just logs outputs to stdout — pipe it freely:
just logs 0 | grep ERROR
just logs 0 500 | grep WARN
just logs 0 | head -20 # oldest entries
just logs 0 | tail -20 # newest entries
just topic-list # List all topics
just topic-create my-topic --partitions 16 # Create topic
just topic-describe my-topic # Describe topic
# Producer — reads from stdin, supports --node N and all Kafka producer flags
just produce my-topic # Interactive (type messages, Ctrl-D to end)
echo "hello" | just produce my-topic # Pipe a single message
just produce my-topic --node 1 # Produce via node-1
# Produce key:value messages
printf 'k1:v1\nk2:v2\n' | just produce my-topic \
--property parse.key=true \
--property "key.separator=:"
# Specify client-id (use --producer-property, not --client-id)
echo "hello" | just produce my-topic \
--producer-property client.id=my-producer
# Consumer — supports --node N and all Kafka consumer flags
just consume my-topic --from-beginning # Consume from beginning
just consume my-topic --node 1 # Consume via node-1
just consume my-topic --max-messages 10 # Stop after 10 messages
# Print keys with custom separator (special characters passed safely)
just consume my-topic --from-beginning \
--property print.key=true \
--property "key.separator=:"
# Specify client-id (use --consumer-property, not --client-id)
just consume my-topic --from-beginning \
--consumer-property client.id=my-consumer
just group-list # List all consumer groups
just group-describe my-group # Show offsets and lag per partition
just group-members my-group # Show active members and assignments
just group-reset my-group my-topic --to-earliest # Reset to earliest offset
just group-reset my-group my-topic --to-latest # Reset to latest offset
just group-reset my-group my-topic --to-offset 100 # Reset to specific offset
just broker-list # List brokers and supported API versions
just broker-config # Describe all broker configs (cluster-wide)
just broker-config 0 # Describe broker 0 configs only
Run any Kafka bin/ script inside a node container. Commands run via docker exec, so localhost:9092 refers to the broker inside the container.
just bin kafka-topics.sh --bootstrap-server localhost:9092 --list
just bin kafka-configs.sh --bootstrap-server localhost:9092 --describe --entity-type brokers
just bin --node 1 kafka-broker-api-versions.sh --bootstrap-server localhost:9092
# One-shot shell command in a container
just exec 0 ls /opt/automq/bin
just exec 0 cat /config/node-0.properties
# -1 = unlimited throughput
just perf-produce my-topic --num-records 100000 --record-size 1024 --throughput -1
just perf-consume my-topic --messages 100000
just arthas # Attach to node-0 Kafka process (interactive)
just arthas 1 # Attach to node-1 (interactive)
# Non-interactive: run a single Arthas command and exit
just arthas-exec 0 version
just arthas-exec 0 'dashboard -n 1' # one snapshot of CPU/memory/threads
just arthas-exec 0 'thread -n 5' # top 5 busy threads
just arthas-exec 0 'jvm' # JVM info (heap, GC, classloader)
just arthas-exec 0 'sc kafka.server.*' # search loaded classes
# Note: streaming commands like watch/trace require active traffic to produce
# output and will block until -n limit is reached. Use just arthas for those.
JMX exposed on node-0 at port 9999 via jmxterm (pre-installed).
just jmx -e 'domains' # List all JMX domains
just jmx -e 'beans -d kafka.server' # List beans in domain
just jmx -e 'info -b kafka.server:name=BrokerState,type=KafkaServer' # View bean attributes
just jmx -e 'get -b kafka.server:name=BrokerState,type=KafkaServer Value' # Broker state (3 = running)
just jmx -e 'get -b kafka.server:name=MessagesInPerSec,type=BrokerTopicMetrics Count'
just jmx --node 1 -e 'get -b kafka.server:name=BrokerState,type=KafkaServer Value'
just shell # Enter node-0 shell — jstack, jmap, jcmd available
just shell 1 # Enter node-1 shell
To change the JVM heap size, edit `HEAP_OPTS` in `justfile` (default: `-Xms256m -Xmx256m`).
JDWP is enabled on all nodes at startup. Port mapping: node-0 → 5005, node-1 → 5006, node-2 → 5007, ...
# In IDEA: Run → Edit Configurations → + → Remote JVM Debug → localhost:5005
just start
# Then attach debugger
Inject faults using tc netem (network delay/loss) and iptables (traffic blocking). NET_ADMIN capability is already set in docker-compose.yml.
Always run just chaos-reset when done — clears all tc rules, iptables rules, and unpauses MinIO.
just chaos-delay # 200ms delay on node-0
just chaos-delay 500 node-1 # 500ms delay on node-1
just chaos-loss # 5% packet loss on node-0
just chaos-loss 10 node-1 # 10% packet loss on node-1
# Targeted: only affects traffic between the specified node and MinIO
just chaos-s3-delay 500 node-0 # 500ms latency to S3 on node-0
just chaos-s3-loss 10 node-0 # 10% packet loss to S3 on node-0
# Global
just chaos-s3-down # Pause MinIO (S3 completely unavailable)
just chaos-s3-up # Resume MinIO
# Partition (iptables DROP)
just chaos-s3-partition node-1 # Isolate node-1 from S3
just chaos-s3-partition-reset # Restore S3 connectivity only
just chaos-partition node-0 node-1 # Bidirectional isolation (iptables DROP)
just chaos-partition-reset # Restore node connectivity only
just chaos-reset # Remove ALL chaos rules (tc + iptables + unpause MinIO)
just chaos-status # Show active rules per node and MinIO state
# Scenario: slow S3 on node-0, observe produce latency, then clean up
just chaos-s3-delay 500 node-0
just produce my-topic
just chaos-reset

# Scenario: full S3 outage, watch node reactions in the logs, then resume
just chaos-s3-down
just logs
just chaos-s3-up

# Scenario: isolate node-1 from S3, inspect its logs, restore connectivity
just chaos-s3-partition node-1
just logs 1
just chaos-s3-partition-reset

# Scenario: network partition between two nodes (requires a multi-node cluster)
just start 3
just chaos-partition node-0 node-1
just logs
just chaos-partition-reset

# Scenario: combined faults, network delay plus S3 packet loss on node-0
just start 3
just chaos-delay 200 node-0
just chaos-s3-loss 10 node-0
just produce my-topic
just chaos-reset
Append feature names to just start. Multiple features can be combined.
just start 1 telemetry # Single node + OTel metrics
just start 3 tabletopic # 3-node + Iceberg table topic
just start 5 zerozone analytics # 5-node + zone router + query engines
| Feature | What it enables | Extra services |
|---|---|---|
tabletopic | Iceberg table topic, REST catalog, schema registry | Schema Registry (:8081), Iceberg REST (:8181) |
zerozone | Zone router, auto AZ assignment (round-robin: az-0/1/2) | — (config only) |
telemetry | OTel metrics export via OTLP HTTP — edit config/features/telemetry.properties to set collector endpoint before use | — (config only) |
analytics | Spark + Trino for querying Iceberg tables | Spark (:8888), Trino (:8090) |
Run multiple DevKit instances on the same machine without port conflicts. Each namespace gets a unique COMPOSE_PROJECT_NAME and port offset.
Requires separate git worktrees — each worktree has its own devkit/ directory and .devkit/compose.env, so each instance is fully isolated. You cannot run multiple namespaces from the same devkit directory.
# In worktree A:
just set-ns lag-test 1 # ID=1 → ports offset by +100
just start 3
# In worktree B:
just set-ns perf 2 # ID=2 → ports offset by +200
just start
# Revert to default ports
just clear-ns
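The port arithmetic behind namespace IDs, as a sketch (the +100-per-ID rule comes from the examples above; which ports actually shift is defined by the compose templates):

```shell
# Namespace ID 2 → mapped host ports shift by +200
NS_ID=2
OFFSET=$((NS_ID * 100))
echo "node-0 Kafka: $((9092 + OFFSET)), JDWP: $((5005 + OFFSET))"
# → node-0 Kafka: 9292, JDWP: 5205
```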
Namespace names match `[a-z0-9-]`; the port offset is ID * 100. The namespace setting is stored in `.devkit/compose.env` (gitignored).

| Node | Kafka | Controller | JDWP | JMX |
|---|---|---|---|---|
| 0 | 9092 | 9093 | 5005 | 9999 |
| 1 | 19092 | 19093 | 5006 | — |
| 2 | 29092 | 29093 | 5007 | — |
| 3 | 39092 | 39093 | 5008 | — |
| 4 | 49092 | 49093 | 5009 | — |
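The table follows a simple pattern: Kafka and controller host ports advance by 10000 per node, JDWP by 1. A sketch of the arithmetic:

```shell
NODE=3
echo "kafka=$((9092 + NODE * 10000)) controller=$((9093 + NODE * 10000)) jdwp=$((5005 + NODE))"
# → kafka=39092 controller=39093 jdwp=5008
```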
Other services: MinIO (S3), plus the feature stacks: Schema Registry (:8081), Iceberg REST Catalog (:8181), Spark (:8888), Trino (:8090).
5-layer system — later layers override earlier ones. Each layer is a template rendered via envsubst before Docker starts. Final merged config: .devkit/config/node-N.properties.
| Layer | File | Variables | Purpose |
|---|---|---|---|
| 1 | config/defaults.properties | S3 vars | S3 endpoints, KRaft basics |
| 2 | config/role/server.properties or broker.properties | S3 vars, NODE_ID | Listeners, process roles |
| 3 | config/role/node.properties | NODE_ID, CLUSTER_ID, VOTERS, BOOTSTRAP | KRaft identity |
| 4 | config/features/*.properties | All vars + AZ_NAME | Feature-specific settings |
| 5 | config/custom.properties | verbatim (no substitution) | Highest priority overrides |
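Last-wins merging over key=value layers can be sketched like this (illustrative; the file names here are made up, not the real layer files):

```shell
# Concatenate layers in order; keep the last value seen for each key
tmp=$(mktemp -d)
printf 'num.partitions=1\nlog.retention.hours=168\n' > "$tmp/layer1.properties"
printf 'num.partitions=4\n' > "$tmp/layer5.properties"
cat "$tmp/layer1.properties" "$tmp/layer5.properties" \
  | awk -F= '{v[$1] = $0} END {for (k in v) print v[k]}' | sort
# → log.retention.hours=168
# → num.partitions=4
```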
S3 variables (defined in justfile, Layer 1/2/4):
| Variable | Default | Description |
|---|---|---|
S3_REGION | us-east-1 | S3 region |
S3_ENDPOINT | http://minio:9000 | S3 endpoint URL |
S3_DATA_BUCKET | automq-data | Data bucket |
S3_OPS_BUCKET | automq-ops | Ops bucket |
S3_PATH_STYLE | true | Path-style access (required for MinIO) |
S3_ACCESS_KEY | admin | S3 access key (exported as KAFKA_S3_ACCESS_KEY to containers) |
S3_SECRET_KEY | password | S3 secret key (exported as KAFKA_S3_SECRET_KEY to containers) |
Per-node variables (computed per node, Layer 2/3/4):
| Variable | Example | Description |
|---|---|---|
NODE_ID | 0, 1, 2 | Node index (0-based) |
CLUSTER_ID | abc-123 | Stable UUID, persisted in .devkit/cluster-id |
VOTERS | 0@node-0:9093,1@node-1:9093 | KRaft controller voter list |
BOOTSTRAP | node-0:9093,node-1:9093 | Controller bootstrap servers |
AZ_NAME | az-0, az-1, az-2 | Round-robin by NODE_ID % 3 (used by zerozone) |
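How VOTERS and AZ_NAME follow from the node count, sketched in shell (the patterns match the examples in the table above; this is not the actual justfile code):

```shell
# Build the voter list for a 3-node cluster: one "id@host:port" entry per controller
N=3
VOTERS=""
i=0
while [ "$i" -lt "$N" ]; do
  VOTERS="${VOTERS:+$VOTERS,}$i@node-$i:9093"
  i=$((i + 1))
done
echo "VOTERS=$VOTERS"
# AZ_NAME is round-robin over three zones
echo "AZ_NAME for node 4: az-$((4 % 3))"
# → VOTERS=0@node-0:9093,1@node-1:9093,2@node-2:9093
# → AZ_NAME for node 4: az-1
```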
Use real AWS S3 — edit justfile:
S3_REGION := "ap-northeast-1"
S3_ENDPOINT := "https://s3.ap-northeast-1.amazonaws.com"
S3_DATA_BUCKET := "my-automq-data"
S3_OPS_BUCKET := "my-automq-ops"
S3_PATH_STYLE := "false"
S3_ACCESS_KEY := "AKIA..."
S3_SECRET_KEY := "..."
Change listeners (e.g. SASL) — edit config/role/server.properties:
process.roles=broker,controller
listeners=SASL_PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=SASL_PLAINTEXT://node-${NODE_ID}:9092,CONTROLLER://node-${NODE_ID}:9093
Add broker settings — create config/custom.properties (gitignored):
num.partitions=4
log.retention.hours=24
Add a new feature — create config/features/my-feature.properties:
my.setting=value
my.s3.path=s3://${S3_DATA_BUCKET}/my-prefix
my.node.label=node-${NODE_ID}
just start 1 my-feature
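Template rendering substitutes `${VAR}` placeholders before the broker starts. A stand-in using a shell here-document, which expands variables the same way `envsubst` does for the real templates:

```shell
NODE_ID=1
S3_DATA_BUCKET=automq-data
cat <<EOF
my.s3.path=s3://${S3_DATA_BUCKET}/my-prefix
my.node.label=node-${NODE_ID}
EOF
# → my.s3.path=s3://automq-data/my-prefix
# → my.node.label=node-1
```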