Thank you for contributing and committing to maintain your Airbyte destination connector 🥂
This document outlines the minimum expectations for a partner-certified destination. We strongly recommend that partners use the relevant CDK, but we also want to support developers who need to develop in a different language. This document covers concepts implicitly built into our CDKs for that use case.
Partner Certified Destination: A destination which is fully supported by the maintainers of the platform that is being loaded to. These connectors are not guaranteed by Airbyte directly, but instead the maintainers of the connector contribute fixes and improvements to ensure a quality experience for Airbyte users. Partner destinations are noted as such with a special “Partner” badge on the Integrations page, distinguishing them from other community maintained connectors on the Marketplace.
Bulk Destinations: A destination which accepts tables and columns as input, files, or otherwise unconstrained content. The majority of bulk destinations are tabular and database-like (warehouses, data lakes, databases), but they may also include file or blob destinations. The defining characteristic of bulk destinations is that they accept data in the shape of the source (i.e. the tables, columns, or content change little from the source's representation). These destinations can usually hold large amounts of data, and are the fastest to load.
Publish Destinations: A publish-type destination, often called a “reverse ETL” destination, loads data to an external service or API. These destinations may be “picky”, having specific schema requirements for incoming streams. Common publish-type use cases include: publishing data to a REST API, publishing data to a messaging endpoint (e.g. email, push notifications, etc.), and publishing data to an LLM vector store. Specific examples include: Destination-Pinecone, Destination-Vectara, and Destination-Weaviate. These destinations can usually hold only finite amounts of data, and are slower to load.
- Create a public GitHub repo/project to be shared with Airbyte and its users.
- Monitor a Slack channel for communications directly from the Airbyte Support and Development teams.
- Respect a 3-business-day maximum first response time to customer inquiries or bug reports.
- Maintain >=95% first-sync success and >=95% overall sync success on your destination connector. Note: config_errors are not counted against this metric.
- Adhere to a regular update cadence: either track the relevant Airbyte-managed CDK, or commit to updating your connector to meet any new platform requirements at least once every 6 months.
- Audit important bugs and solve major problems within a reasonable timeframe.
- Validate that the connector uses HTTPS and secure-only access to customer data.
We won’t call out every requirement of the Airbyte Protocol (link), but below are important requirements that are specific to Destinations and/or specific to Airbyte 1.0 Destinations.
- Destinations must capture state messages from sources, and must emit those state messages to STDOUT only after all preceding records have been durably committed to the destination.
- Destinations must append record counts to the Source’s state message before emitting. (New for Airbyte 1.0)
- State messages should be emitted with no gap longer than 15 minutes.
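The state-handling rules above can be sketched as follows. This is a hedged illustration, not the CDK's implementation: the `destinationStats.recordCount` field is modeled on the Airbyte Protocol, but the helper name and message shape should be checked against the current protocol schema.

```python
import json

def emit_state_with_counts(source_state: dict, committed_records: int) -> dict:
    """Illustrative helper: annotate a source STATE message with the number
    of records durably committed since the previous state. Only emit the
    result AFTER all preceding records have been flushed to the destination."""
    state = dict(source_state)  # do not mutate the source's original message
    state["destinationStats"] = {"recordCount": float(committed_records)}
    return {"type": "STATE", "state": state}

# After a successful flush/commit, print the annotated message to STDOUT:
envelope = emit_state_with_counts({"data": {"cursor": "2024-01-01"}}, 1200)
print(json.dumps(envelope))
```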
- Syncs should always be re-runnable without negative side effects. For instance, if the table is loaded multiple times, the destination should dedupe records according to the provided primary key information if and when available.
- If deduping is disabled, then loads should either fully replace or append to destination tables, according to the user-provided setting in the configured catalog.
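The idempotency rules above can be sketched as a small dispatcher. The sync-mode names mirror destination sync modes from the configured catalog, but the returned SQL strategies are illustrative, not a prescribed implementation:

```python
def plan_load(sync_mode: str, primary_key) -> str:
    """Pick a re-runnable load strategy from the configured catalog.

    - append_dedup with a primary key -> upsert, so re-running a sync
      cannot produce duplicate rows
    - overwrite -> fully replace the destination table
    - otherwise -> plain append
    """
    if sync_mode == "append_dedup" and primary_key:
        return "MERGE ON (" + ", ".join(primary_key) + ")"
    if sync_mode == "overwrite":
        return "TRUNCATE + INSERT"
    return "INSERT"
```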
Bulk Destinations should handle metadata and logging of exceptions in a consistent manner.
Note: Because Publish Destinations have little control over table structures, these constraints do not apply to Publish or Reverse-ETL Destinations. For instance, they do not apply to vector store destinations.
- Columns should include all top-level field declarations.
- Tables should always include the following Airbyte metadata columns: `_airbyte_meta`, `_airbyte_extracted_at`, and `_airbyte_raw_id`.
- Bulk Destinations must utilize `_airbyte_meta.changes[]` to record in-flight fixes or changes.
- Bulk Destinations must accept new columns arriving from the source. (“Schema Evolution”)
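A minimal sketch of the metadata-column and schema-evolution requirements, generating DDL that always carries the three Airbyte metadata columns and adds any columns that newly appear in the source. The SQL dialect and type names are illustrative assumptions:

```python
# Every table carries these, regardless of the source schema (types illustrative).
AIRBYTE_METADATA_COLUMNS = {
    "_airbyte_raw_id": "VARCHAR(36)",
    "_airbyte_extracted_at": "TIMESTAMP",
    "_airbyte_meta": "JSON",
}

def create_table_ddl(table: str, source_columns: dict) -> str:
    """Top-level source fields plus the mandatory Airbyte metadata columns."""
    cols = {**source_columns, **AIRBYTE_METADATA_COLUMNS}
    body = ", ".join(f"{name} {dtype}" for name, dtype in cols.items())
    return f"CREATE TABLE {table} ({body})"

def schema_evolution_ddl(table: str, existing: set, incoming: dict) -> list:
    """Schema evolution: accept columns that newly arrive from the source."""
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {dtype}"
        for name, dtype in incoming.items()
        if name not in existing
    ]
```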
All destinations are required to adhere to standard configuration practices for connectors. These requirements include, but are not limited to, the following:
- SPEC output should include RegEx validation rules for configuration parameters. These will be used in the Airbyte Platform UI to pre-validate user inputs, and provide appropriate guidance to users during setup.
- CHECK operation should consider all configuration inputs and produce reasonable error messages for the most common configuration errors.
- SPEC should be properly annotated with "airbyte_secret": true in the config requirements. This informs the Airbyte Platform that values should not be echoed to the screen during user input, and it ensures that secrets are properly handled as such when storing and retrieving settings in the backend.
- AllowedHosts - limiting which APIs/IPs this connector can communicate with.
- Every attempt should be made to ensure data does not lose fidelity during transit and that syncs do not fail due to data type mapping issues.
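For example, a SPEC fragment might combine a JSON Schema `pattern` rule (which the platform UI can use for pre-validation) with `airbyte_secret` (so the value is masked and stored as a secret). The property names and regex below are illustrative, and the platform-side check is only mimicked here with Python's `re`:

```python
import re

# Illustrative SPEC properties fragment, not a complete connector spec.
SPEC_PROPERTIES = {
    "host": {
        "type": "string",
        "pattern": r"^[A-Za-z0-9.-]+$",  # drives UI pre-validation
    },
    "api_key": {
        "type": "string",
        "airbyte_secret": True,  # masked in the UI, stored as a secret
    },
}

def prevalidate(config: dict) -> list:
    """Mimic the RegEx pre-validation the platform applies before CHECK runs."""
    errors = []
    for name, schema in SPEC_PROPERTIES.items():
        pattern = schema.get("pattern")
        if pattern and not re.fullmatch(pattern, config.get(name, "")):
            errors.append(f"{name} does not match {pattern}")
    return errors
```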
Note: Publish-type destinations may be excluded from some or all of the below rules, if they are constrained to use predefined types. In these cases, the destination should aim to fail early so the user can reconfigure their source before causing any data corruption or data inconsistencies from partially-loaded datasets.
- Data types should be at least as large as needed to store incoming data.
- Floats should be handled with the maximum possible size for floating point numbers, e.g. a double precision floating point type.
- Decimals should be handled with the largest-possible precision and scale, generally DECIMAL(38, 9), even if declared as float in the source catalog.
- Destinations should always have a “failsafe” type they can use in case the source type is not known, e.g. anyOf(string, object). In the case that a good type cannot be chosen, we should fall back to either string types or variable/variant/json types.
- Any errors must be logged by the destination using an approved protocol. Silent errors are not permitted, but we bias towards not failing an entire sync when other valid records are able to be written. Only if errors cannot be logged using an approved protocol must the destination fail, and it should raise the error to the attention of the user and the platform.
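The typing guidance above can be sketched as a mapping table with a variant/JSON failsafe. The destination type names are illustrative assumptions, not a mandated mapping:

```python
# Illustrative JSON-schema-type -> destination-type mapping.
TYPE_MAP = {
    "string": "VARCHAR",
    "integer": "BIGINT",
    "number": "DOUBLE PRECISION",  # widest float, per the guidance above
    "boolean": "BOOLEAN",
    "object": "JSON",
    "array": "JSON",
}

def destination_type(json_schema_type) -> str:
    """Map a source type to a destination type, with a JSON failsafe."""
    # anyOf-style unions and unknown types fall back to a variant/JSON type
    if isinstance(json_schema_type, (list, tuple, set)):
        return "JSON"
    return TYPE_MAP.get(json_schema_type, "JSON")
```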
- Bulk Destinations: Errors should be recorded along with the record data, in the _airbyte_meta column, under the _airbyte_meta.changes key.
- Publish Destinations: In the absence of another specific means of communicating an issue to the user, the destination must fail if it is not able to write data to the destination platform. (Additional approved logging protocols may be added in the future for publish-type destinations, for instance dead letter queues, destination-specific state artifacts, and/or other durable storage media which could be configured by the user.)
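For bulk destinations, recording an in-flight change instead of failing the whole sync might look like the following sketch. The change-entry fields are modeled on `_airbyte_meta.changes`, but their exact names and allowed values should be checked against the Airbyte Protocol:

```python
import json
import uuid
from datetime import datetime, timezone

def to_row(record: dict, changes: list) -> dict:
    """Illustrative: attach Airbyte metadata to a record, recording any
    in-flight fixes under _airbyte_meta.changes rather than failing the sync."""
    return {
        **record,
        "_airbyte_raw_id": str(uuid.uuid4()),
        "_airbyte_extracted_at": datetime.now(timezone.utc).isoformat(),
        "_airbyte_meta": json.dumps({"changes": changes}),
    }

# A value that overflowed the destination type is nulled, and the change is logged:
row = to_row(
    {"id": 1, "amount": None},
    [{"field": "amount", "change": "NULLED", "reason": "DESTINATION_FIELD_SIZE_LIMITATION"}],
)
```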