Redpanda Connect is a stream processor that moves data between a wide range of sources and sinks, with support for hydration, enrichment, transformation, and filtering along the way.
That includes a rich set of change-data-capture (CDC) connectors — for Postgres, MySQL, MongoDB, Oracle, MSSQL, and more — so database changes can flow through your pipelines as first-class events.
It uses Bloblang for mapping, runs as a single static binary or container image, and is easy to operate and monitor.
input:
  gcp_pubsub:
    project: foo
    subscription: bar
pipeline:
  processors:
    - mapping: |
        root.message = this
        root.meta.link_count = this.links.length()
        root.user.age = this.user.age.number()
output:
  redis_streams:
    url: tcp://TODO:6379
    stream: baz
    max_in_flight: 20
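To make the mapping concrete, here is a minimal Go sketch of the same three assignments applied to a single JSON document. It is an illustration in plain Go, not how Bloblang is implemented: the applyMapping helper and the sample payload are hypothetical, and the number conversion is simplified to JSON numbers and numeric strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// applyMapping is a hypothetical helper that mirrors the three assignments in
// the mapping above for one JSON document.
func applyMapping(in []byte) ([]byte, error) {
	var this map[string]any
	if err := json.Unmarshal(in, &this); err != nil {
		return nil, err
	}

	// root.meta.link_count = this.links.length()
	links, ok := this["links"].([]any)
	if !ok {
		return nil, fmt.Errorf("links: expected an array")
	}

	// root.user.age = this.user.age.number()
	user, ok := this["user"].(map[string]any)
	if !ok {
		return nil, fmt.Errorf("user: expected an object")
	}
	var age float64
	switch v := user["age"].(type) {
	case float64:
		age = v
	case string:
		parsed, err := strconv.ParseFloat(v, 64)
		if err != nil {
			return nil, fmt.Errorf("user.age: %w", err)
		}
		age = parsed
	default:
		return nil, fmt.Errorf("user.age: cannot convert %T to a number", v)
	}

	root := map[string]any{
		"message": this, // root.message = this
		"meta":    map[string]any{"link_count": len(links)},
		"user":    map[string]any{"age": age},
	}
	return json.Marshal(root)
}

func main() {
	in := []byte(`{"links":["a","b"],"user":{"age":"37"}}`)
	out, err := applyMapping(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In this sketch the original document is kept whole under message, the link count is derived from the array length, and a string age such as "37" is coerced to a number. A document without a links array is rejected with an error rather than silently mapped.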
Linux:
curl -LO https://github.com/redpanda-data/redpanda/releases/latest/download/rpk-linux-amd64.zip
unzip rpk-linux-amd64.zip -d ~/.local/bin/
macOS (Homebrew):
brew install redpanda-data/tap/redpanda
Docker:
docker pull docker.redpanda.com/redpandadata/connect
See the getting started guide for more options.
To run a pipeline from a config file:
rpk connect run ./config.yaml
With Docker:
# From a config file
docker run --rm -v /path/to/your/config.yaml:/connect.yaml docker.redpanda.com/redpandadata/connect run
# With inline overrides
docker run --rm -p 4195:4195 docker.redpanda.com/redpandadata/connect run \
  -s "input.type=http_server" \
  -s "output.type=kafka" \
  -s "output.kafka.addresses=kafka-server:9092" \
  -s "output.kafka.topic=redpanda_topic"
The catalog includes AWS (DynamoDB, Kinesis, S3, SQS, SNS), Azure (Blob, Queue, Table), GCP (Pub/Sub, Cloud Storage, BigQuery), Kafka, NATS (JetStream, Streaming), NSQ, MQTT, AMQP 0.9.1 (RabbitMQ), AMQP 1, Redis, Cassandra, Elasticsearch, HDFS, HTTP (server, client, websockets), MongoDB, and SQL (MySQL, PostgreSQL, ClickHouse, MSSQL), with many more listed in the components documentation.
Delivery guarantees can be a tricky subject. Redpanda Connect processes and acknowledges messages using an in-process transaction model with no disk-persisted state, so when it's connecting at-least-once sources and sinks it can guarantee at-least-once delivery — even through crashes, disk corruption, or other server faults.
That's the default, with no caveats, which keeps deployment and scaling straightforward.
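The acknowledgement flow described above can be sketched in a few lines of Go. This is a simplified model, not Redpanda Connect's actual implementation: the source, sink, and deliver names are hypothetical, and the source is an in-memory queue that keeps a message until it is acknowledged. The point it tries to show is that the acknowledgement goes upstream only after the downstream write succeeds, so a failure in between leads to a retry instead of a lost message.

```go
package main

import (
	"errors"
	"fmt"
)

// source is a hypothetical at-least-once input: a message stays at the head
// of the queue until it is explicitly acknowledged.
type source struct{ pending []string }

func (s *source) read() (string, bool) {
	if len(s.pending) == 0 {
		return "", false
	}
	return s.pending[0], true
}

// ack removes the head message; it is only called after a successful write.
func (s *source) ack() { s.pending = s.pending[1:] }

// sink is a hypothetical output whose first `failures` writes are rejected.
type sink struct {
	failures int
	written  []string
}

func (k *sink) write(msg string) error {
	if k.failures > 0 {
		k.failures--
		return errors.New("sink unavailable")
	}
	k.written = append(k.written, msg)
	return nil
}

// deliver moves every message from src to dst, acknowledging upstream only
// once the downstream write has succeeded. No progress is stored anywhere
// else. It returns the number of write attempts made.
func deliver(src *source, dst *sink) int {
	attempts := 0
	for {
		msg, ok := src.read()
		if !ok {
			return attempts
		}
		attempts++
		if err := dst.write(msg); err != nil {
			continue // not acked, so the same message is read again
		}
		src.ack()
	}
}

func main() {
	src := &source{pending: []string{"a", "b", "c"}}
	dst := &sink{failures: 2}
	attempts := deliver(src, dst)
	fmt.Println(dst.written, attempts) // prints: [a b c] 5
}
```

In this model nothing is lost when the sink rejects a write, because the unacknowledged message is simply read again. The reverse case is why the guarantee is at-least-once rather than exactly-once: if a write succeeded but the process stopped before the acknowledgement, the source would redeliver and the sink would see a duplicate.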
Two HTTP endpoints are exposed for orchestration probes:
- /ping: liveness probe; always returns 200.
- /ready: readiness probe; returns 200 once both input and output are connected, otherwise 503.

Redpanda Connect exposes metrics to StatsD, Prometheus, a JSON HTTP endpoint, and other backends.
OpenTelemetry traces are emitted natively, so you can visualize what's happening inside a pipeline end-to-end.
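As an illustration of how an orchestrator or script might consume the /ping and /ready probes described above, the following Go sketch queries both endpoints and reports the two states. The checkHealth helper and the newFakeConnect stand-in server are hypothetical and exist only so the example runs without a live instance. Against a real deployment you would pass the address of the instance's HTTP server instead, for example http://localhost:4195 if port 4195 is published as in the Docker commands in this README.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// health holds the result of one probe round.
type health struct {
	Live  bool // /ping answered 200
	Ready bool // /ready answered 200
}

// checkHealth is a hypothetical helper that queries both probe endpoints.
// Any transport error or non-200 status counts as "not OK".
func checkHealth(baseURL string) health {
	client := &http.Client{Timeout: 2 * time.Second}
	ok := func(path string) bool {
		resp, err := client.Get(baseURL + path)
		if err != nil {
			return false
		}
		defer resp.Body.Close()
		return resp.StatusCode == http.StatusOK
	}
	return health{Live: ok("/ping"), Ready: ok("/ready")}
}

// newFakeConnect starts a stand-in server that imitates the documented
// status codes: /ping is always 200, /ready is 200 or 503.
func newFakeConnect(connected bool) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		if connected {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeConnect(false)
	defer srv.Close()
	fmt.Printf("%+v\n", checkHealth(srv.URL)) // prints: {Live:true Ready:false}
}
```

Keeping the two results separate matches the usual probe split: a failing liveness check is a reason to restart the process, while a failing readiness check only means the input or output is not connected yet.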
Redpanda Connect ships with tooling for configuration discovery, debugging, and organization — see the configuration guide.
Requires a currently supported Go version:
git clone git@github.com:redpanda-data/connect
cd connect
task build:all
Components that link against external C libraries (for example zmq4) aren't included by default. To pull them in, set the x_benthos_extra build tag:
# With go
go install -tags "x_benthos_extra" github.com/redpanda-data/connect/v4/cmd/redpanda-connect@latest
# With task
TAGS=x_benthos_extra task build:all
This tag may change or be split into more granular tags in future releases. If the required system libraries aren't installed, the build will fail with an error like ld: library not found for -lzmq.
A multi-stage Dockerfile builds a minimal scratch-based image:
task docker:all
docker run --rm \
  -v /path/to/your/config.yaml:/config.yaml \
  -v /tmp/data:/data \
  -p 4195:4195 \
  docker.redpanda.com/redpandadata/connect run /config.yaml
Writing your own plugins in Go is straightforward — check out the API docs and the example plugin repository for reference implementations.
Redpanda Connect uses golangci-lint for linting and gofumpt for formatting. You can configure your editor to run gofumpt automatically; the gofumpt README has setup instructions for common editors.
task fmt # format the codebase
task lint # lint the codebase
task test # unit and template tests
Contributions are welcome. Before opening a pull request, please make sure it has been:
- tested with task test
- linted with task lint
- formatted with task fmt

Most integration tests spin up Docker containers, so they're skipped by task test. You can run them individually with:
go test -run "^Test.*Integration.*$" ./internal/impl/<connector directory>/...
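The -run flag takes a regular expression that is matched against test function names. As a rough illustration of which names the pattern above selects, the sketch below applies the same expression with Go's regexp package. The test names are made up for the example, and go test also splits the pattern on slashes to match subtests, which is not modeled here.

```go
package main

import (
	"fmt"
	"regexp"
)

// integrationPattern is the expression passed to -run in the command above.
var integrationPattern = regexp.MustCompile(`^Test.*Integration.*$`)

// selected reports whether a top-level test name matches the pattern.
func selected(name string) bool {
	return integrationPattern.MatchString(name)
}

func main() {
	// Hypothetical test names, for illustration only.
	for _, name := range []string{
		"TestIntegrationRedis",
		"TestKafkaIntegrationBatching",
		"TestConfigParsing",
	} {
		fmt.Printf("%-30s %v\n", name, selected(name))
	}
}
```

Any test whose name starts with Test and contains Integration is selected, which is why integration tests need that word in their function name to be picked up by the command.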