=============================
ScyllaDB CDC Source Connector
=============================

.. toctree::
   :hidden:

   scylla-cdc-source-connector-quickstart

ScyllaDB CDC Source Connector is a source connector that captures row-level changes in the tables of a ScyllaDB cluster. It is a Debezium connector, compatible with Kafka Connect (Kafka 2.6.0 or later) and built on top of the scylla-cdc-java library. The source code of the connector is available on `GitHub <https://github.com/scylladb/scylla-cdc-source-connector>`_.

The connector reads the CDC log for the specified tables and produces a Kafka message for each row-level ``INSERT``, ``UPDATE``, or ``DELETE`` operation. It can split reading of the CDC log across multiple processes: it can start a separate Kafka Connect task for reading each :doc:`Vnode of the ScyllaDB cluster </architecture/ringarchitecture/index>`, allowing for high throughput. You can limit the number of started tasks with the ``tasks.max`` property.

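
As an illustration of where ``tasks.max`` fits, the following minimal sketch builds a connector configuration and the Kafka Connect REST request that would register it. Only ``tasks.max`` comes from the text above. The connector class name, the ``scylla.*`` property names, the cluster and table names, and the worker URL are assumptions to verify against the connector's README and your own deployment. The sketch only constructs the request and does not send it.

.. code-block:: python

   import json
   import urllib.request


   def build_connector_config(cluster_name, contact_points, tables, tasks_max):
       """Return a Kafka Connect connector config as a dict of strings."""
       if tasks_max < 1:
           raise ValueError("tasks_max must be at least 1")
       return {
           # Assumed connector class and scylla.* property names; check the README.
           "connector.class": "com.scylladb.cdc.debezium.connector.ScyllaConnector",
           "scylla.name": cluster_name,
           "scylla.cluster.ip.addresses": ",".join(contact_points),
           "scylla.table.names": ",".join(tables),
           # Upper bound on the number of Kafka Connect tasks the connector starts.
           "tasks.max": str(tasks_max),
       }


   def build_create_request(connect_url, connector_name, config):
       """Build (but do not send) the POST /connectors request for Kafka Connect."""
       payload = json.dumps({"name": connector_name, "config": config}).encode("utf-8")
       return urllib.request.Request(
           url=connect_url.rstrip("/") + "/connectors",
           data=payload,
           headers={"Content-Type": "application/json"},
           method="POST",
       )


   # Hypothetical values, for illustration only.
   config = build_connector_config(
       cluster_name="MyScyllaCluster",
       contact_points=["127.0.0.1:9042"],
       tables=["ks.my_table"],
       tasks_max=4,
   )
   request = build_create_request("http://localhost:8083", "scylla-cdc-source", config)
   print(request.full_url)
   print(json.dumps(config, indent=2))

Passing ``request`` to ``urllib.request.urlopen`` would submit it to a running Kafka Connect worker. A lower ``tasks.max`` trades throughput for fewer concurrent readers.
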
ScyllaDB CDC Source Connector seamlessly handles schema changes and topology changes (adding or removing nodes in the ScyllaDB cluster). The connector is fault-tolerant: it retries reading data from ScyllaDB in case of failure. It periodically saves its current position in the ScyllaDB CDC log using Kafka Connect offset tracking (the interval is configurable with the ``offset.flush.interval.ms`` parameter). If the connector is stopped, it can resume reading from the previously saved offset. ScyllaDB CDC Source Connector has at-least-once semantics, so after a restart some changes may be delivered more than once.

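
The link between periodic offset saving and at-least-once delivery can be illustrated with a toy model. This is not the connector's implementation. The ``run_until`` function and the event names are hypothetical, and the position is saved after a fixed number of events rather than on a timer (as ``offset.flush.interval.ms`` does), purely to keep the example deterministic.

.. code-block:: python

   def run_until(events, start, stop, flush_every, delivered):
       """Deliver events[start:stop] and return the last saved position.

       The position is saved only after every `flush_every` delivered events,
       loosely mimicking a periodic offset flush.
       """
       saved = start
       for position in range(start, stop):
           delivered.append(events[position])
           done = position + 1
           if (done - start) % flush_every == 0:
               saved = done  # checkpoint: everything before `done` is recorded
       return saved


   events = ["e0", "e1", "e2", "e3", "e4", "e5", "e6"]
   delivered = []

   # First run stops unexpectedly after delivering 5 events (e0..e4);
   # the last checkpoint was taken after 3 events.
   saved = run_until(events, start=0, stop=5, flush_every=3, delivered=delivered)

   # Restart: resume from the saved position, so e3 and e4 are delivered again.
   run_until(events, start=saved, stop=len(events), flush_every=3, delivered=delivered)

   duplicates = sorted({e for e in delivered if delivered.count(e) > 1})
   print(saved)       # 3
   print(duplicates)  # ['e3', 'e4']

No event is lost, but the events delivered after the last checkpoint are delivered twice. Consumers of the generated Kafka messages should therefore apply changes idempotently.
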
The connector has the following capabilities:

* Replication of row-level changes using :doc:`ScyllaDB CDC </features/cdc/cdc-intro>`. The connector replicates the following operations: ``INSERT``, ``UPDATE``, ``DELETE`` (single row deletes)

The connector has the following limitations:

* Only row-level operations are replicated (``INSERT``, ``UPDATE``, ``DELETE``) - partition deletes and row range deletes are not replicated
* No support for collection types (``LIST``, ``SET``, ``MAP``) and UDT - columns with those types are omitted from generated messages

The following documents will help you get started with ScyllaDB CDC Source Connector:

* :doc:`ScyllaDB CDC Source Connector Quickstart <scylla-cdc-source-connector-quickstart>`