Airbyte CI CLI

What is it?

airbyte-ci is a command line interface to run CI/CD pipelines. The goal of this CLI is to offer developers a tool to run these pipelines locally and in a CI context with the same guarantees. It can prevent the unnecessary commit -> push cycles developers typically go through when they want to test their changes against a remote CI. This is made possible by Dagger, a CI/CD engine relying on Docker BuildKit to provide reproducible builds. Our pipelines are declared with Python code; the main entrypoint is here. This documentation should be helpful for both local and CI use of the CLI: we power connector testing in the CI with this same CLI.

How to install

Requirements

  • A running Docker engine with version >= 20.10.23
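As a quick pre-flight check, the minimum Docker version above can be verified with a short script. This is a minimal sketch and not part of airbyte-ci: the helper names `parse_version`, `meets_minimum`, and `installed_docker_server_version` are hypothetical, and it assumes the engine version string starts with three dot-separated integers, as in `20.10.23`.

```python
import re
import subprocess

# Minimum engine version taken from the requirement above.
MINIMUM_DOCKER_VERSION = (20, 10, 23)


def parse_version(raw: str) -> tuple:
    """Extract the leading major.minor.patch integers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", raw.strip())
    if match is None:
        raise ValueError(f"Unrecognized Docker version: {raw!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(raw: str, minimum: tuple = MINIMUM_DOCKER_VERSION) -> bool:
    """Return True if the given version string satisfies the minimum."""
    return parse_version(raw) >= minimum


def installed_docker_server_version() -> str:
    """Ask the local Docker CLI for the engine (server) version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `meets_minimum(installed_docker_server_version())` on a machine with a running Docker engine tells you whether the requirement is met. Versions are compared as integer tuples, so `20.10.9` is correctly treated as older than `20.10.23`.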

Install or Update

The recommended way to install airbyte-ci is using the Makefile.

sh
# from the root of the airbyte repository
make tools.airbyte-ci.install

Setting up connector secrets access

If you plan to use Airbyte CI to run CAT (Connector Acceptance Tests), we recommend setting up GSM access so that Airbyte CI can pull remote secrets from GSM. For setup instructions, see the CI Credentials package (which Airbyte CI uses under the hood) README's Get GSM Access instructions.

Updating the airbyte-ci tool

To update airbyte-ci, run the following command:

sh
airbyte-ci update

or if that fails, you can reinstall it with the following command:

sh
# from the root of the airbyte repository
make tools.airbyte-ci.install

Checking the airbyte-ci install

To check that airbyte-ci is installed correctly, run the following command:

sh
make tools.airbyte-ci.check

Cleaning the airbyte-ci install

To clean the airbyte-ci install, run the following command:

sh
make tools.airbyte-ci.clean

Disabling telemetry

We collect anonymous usage data to help improve the tool. If you would like to disable this, you can set the AIRBYTE_CI_DISABLE_TELEMETRY environment variable to true.
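If you launch airbyte-ci from a script, the variable can be set for that invocation only. The following is a minimal sketch; the helper name `env_without_telemetry` is hypothetical and not part of airbyte-ci.

```python
import os


def env_without_telemetry(base_env=None) -> dict:
    """Return a copy of the environment with airbyte-ci telemetry disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["AIRBYTE_CI_DISABLE_TELEMETRY"] = "true"
    return env
```

The result can be passed to a subprocess call, for example `subprocess.run(["airbyte-ci", "connectors", "list"], env=env_without_telemetry(), check=True)`, which leaves the parent process environment untouched.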

Installation for development

Pre-requisites

  • Poetry >= 1.1.8
  • Python >= 3.10

Installation

If you are developing on pipelines, we recommend installing airbyte-ci with poetry:

bash
cd airbyte-ci/connectors/pipelines/
poetry install
poetry env activate
cd ../../

Alternatively, you can install airbyte-ci with pipx so that the entrypoint is available in your PATH:

bash
make tools.airbyte-ci.install

However, this will not automatically install the dependencies for the local dependencies of airbyte-ci, or respect the lockfile.

It's often best to use the poetry steps instead.

Running Tests

From airbyte-ci/connectors/pipelines:

bash
poetry run pytest tests

You can also run a subset of tests:

bash
poetry run pytest pipelines/models/steps.py

More options, such as running tests by keyword matching, are available. See the pytest CLI documentation for all the available options.

Checking Code Format (Pipelines)

bash
poetry run ruff check pipelines

Commands reference

At this point you can run airbyte-ci commands.

<a id="airbyte-ci-command-group"></a>airbyte-ci command group

The main command group option has sensible defaults. In local use cases you're not likely to pass options to the airbyte-ci command group.

Options

| Option | Default value | Mapped environment variable | Description |
| --- | --- | --- | --- |
| --yes/--y | False | | Agrees to all prompts. |
| --yes-auto-update/--no-auto-update | True | | Agrees to the auto update prompts. |
| --enable-update-check/--disable-update-check | True | | Turns on the update check feature |
| --enable-dagger-run/--disable-dagger-run | --enable-dagger-run | | Disables the Dagger terminal UI. |
| --is-local/--is-ci | --is-local | | Determines the environment in which the CLI runs: local environment or CI environment. |
| --git-branch | The checked out git branch name | CI_GIT_BRANCH | The git branch on which the pipelines will run. |
| --git-revision | The current branch head | CI_GIT_REVISION | The commit hash on which the pipelines will run. |
| --diffed-branch | master | | Branch to which the git diff will happen to detect new or modified files. |
| --gha-workflow-run-id | | | GHA CI only - The run id of the GitHub action workflow |
| --ci-context | manual | | The current CI context: manual for manual run, pull-request, nightly_builds, master |
| --pipeline-start-timestamp | Current epoch time | CI_PIPELINE_START_TIMESTAMP | Start time of the pipeline as epoch time. Used for pipeline run duration computation. |
| --show-dagger-logs/--hide-dagger-logs | --hide-dagger-logs | | Flag to show or hide the dagger logs. |

<a id="connectors-command-subgroup"></a>connectors command subgroup

Available commands:

  • airbyte-ci connectors test: Run tests for one or multiple connectors.
  • airbyte-ci connectors build: Build docker images for one or multiple connectors.
  • airbyte-ci connectors publish: Publish a connector to Airbyte's DockerHub.

Options

| Option | Multiple | Default value | Mapped Environment Variable | Description |
| --- | --- | --- | --- | --- |
| --use-remote-secrets/--use-local-secrets | False | | | If --use-remote-secrets, connectors configuration will be pulled from Google Secret Manager. Requires the GCP_GSM_CREDENTIALS environment variable to be set with a service account with permission to read GSM secrets. If --use-local-secrets the connector configuration will be read from the local connector secrets folder. If this flag is not used and a GCP_GSM_CREDENTIALS environment variable is set remote secrets will be used, local secrets will be used otherwise. |
| --name | True | | | Select a specific connector for which the pipeline will run. Can be used multiple times to select multiple connectors. The expected name is the connector technical name. e.g. source-pokeapi |
| --support-level | True | | | Select connectors with a specific support level: community, certified. Can be used multiple times to select multiple support levels. |
| --metadata-query | False | | | Filter connectors by the data field in the metadata file using a simpleeval query. e.g. 'data.ab_internal.ql == 200' |
| --use-local-cdk | False | False | | Build with the airbyte-cdk from the local repository. This is useful for testing changes to the CDK. |
| --language | True | | | Select connectors with a specific language: python, low-code, java. Can be used multiple times to select multiple languages. |
| --modified | False | False | | Run the pipeline on only the modified connectors on the branch or previous commit (depends on the pipeline implementation). Archived connectors are ignored. |
| --concurrency | False | 5 | | Control the number of connector pipelines that can run in parallel. Useful to speed up pipelines or control their resource usage. |
| --metadata-change-only/--not-metadata-change-only | False | --not-metadata-change-only | | Only run the pipeline on connectors with changes on their metadata.yaml file. |
| --enable-dependency-scanning/--disable-dependency-scanning | False | --disable-dependency-scanning | | When enabled the dependency scanning will be performed to detect the connectors to select according to a dependency change. |
| --docker-hub-username | | | DOCKER_HUB_USERNAME | Your username to connect to DockerHub. Required for the publish subcommand. |
| --docker-hub-password | | | DOCKER_HUB_PASSWORD | Your password to connect to DockerHub. Required for the publish subcommand. |
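The resolution rule described for --use-remote-secrets/--use-local-secrets can be read as a small decision function. The sketch below only restates that rule and is not the CLI's actual implementation: the function name `resolve_use_remote_secrets` is hypothetical, and raising an error when remote secrets are requested without credentials is an assumption (the documentation only says the variable is required).

```python
import os
from typing import Mapping, Optional


def resolve_use_remote_secrets(
    flag: Optional[bool], env: Mapping[str, str] = os.environ
) -> bool:
    """Decide whether remote (GSM) secrets should be used.

    flag is True for --use-remote-secrets, False for --use-local-secrets,
    and None when neither flag was passed.
    """
    has_gsm_credentials = bool(env.get("GCP_GSM_CREDENTIALS"))
    if flag is None:
        # No explicit choice: remote secrets only if credentials are present.
        return has_gsm_credentials
    if flag and not has_gsm_credentials:
        # Assumption: fail early rather than fall back to local secrets.
        raise ValueError("--use-remote-secrets requires GCP_GSM_CREDENTIALS")
    return flag
```

In practice this means that exporting GCP_GSM_CREDENTIALS is enough to switch a plain `airbyte-ci connectors ... test` run to remote secrets, and --use-local-secrets is the way to opt out while the variable is set.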

<a id="connectors-list-command"></a>connectors list command

Retrieve the list of connectors satisfying the provided filters.

Examples

List all connectors:

airbyte-ci connectors list

List all connectors and write the output to a file: airbyte-ci connectors list --output=connectors.json

List certified connectors:

airbyte-ci connectors --support-level=certified list

List connectors changed on the current branch:

airbyte-ci connectors --modified list

List connectors with a specific language:

airbyte-ci connectors --language=python list

List connectors with multiple filters:

airbyte-ci connectors --language=low-code --support-level=certified list

<a id="connectors-test-command"></a>connectors test command

Run a test pipeline for one or multiple connectors.

Examples

Test a single connector: airbyte-ci connectors --name=source-pokeapi test

Test multiple connectors: airbyte-ci connectors --name=source-pokeapi --name=source-bigquery test

Test certified connectors: airbyte-ci connectors --support-level=certified test

Test connectors changed on the current branch: airbyte-ci connectors --modified test

Run only the acceptance tests on the modified connectors, restricted to their full refresh tests: airbyte-ci connectors --modified test --only-step="acceptance" --acceptance.-k=test_full_refresh

What it runs

mermaid
flowchart TD
    entrypoint[[For each selected connector]]
    subgraph static ["Static code analysis"]
      qa[Run QA checks]
      sem["Check version follows semantic versioning"]
      incr["Check version is incremented"]
      metadata_validation["Run metadata validation on metadata.yaml"]
      sem --> incr
    end
    subgraph tests ["Tests"]
        build[Build connector docker image]
        unit[Run unit tests]
        integration[Run integration tests]
        pyairbyte_validation[Python CLI smoke tests via PyAirbyte]
        cat[Run connector acceptance tests]
        secret[Load connector configuration]

        unit-->secret
        unit-->build
        secret-->integration
        secret-->cat
        secret-->pyairbyte_validation
        build-->integration
        build-->cat
    end
    entrypoint-->static
    entrypoint-->tests
    report["Build test report"]
    tests-->report
    static-->report

Options

| Option | Multiple | Default value | Description |
| --- | --- | --- | --- |
| --skip-step/-x | True | | Skip steps by id e.g. -x unit -x acceptance |
| --only-step/-k | True | | Only run specific steps by id e.g. -k unit -k acceptance |
| --fail-fast | False | False | Abort after any tests fail, rather than continuing to run additional tests. Use this setting to confirm a known bug is fixed (or not), or when you only require a pass/fail result. |
| --code-tests-only | True | False | Skip any tests not directly related to code updates. For instance, metadata checks, version bump checks, changelog verification, etc. Use this setting to help focus on code quality during development. |
| --concurrent-cat | False | False | Make CAT tests run concurrently using pytest-xdist. Be careful about source or destination API rate limits. |
| --&lt;step-id&gt;.&lt;extra-parameter&gt;=&lt;extra-parameter-value&gt; | True | | You can pass extra parameters for specific test steps. More details in the extra parameters section below |
| --ci-requirements | False | | |

Note:

  • The above options are implemented for Java connectors but may not be available for Python connectors. If an option is not supported, the pipeline will not fail but instead the 'default' behavior will be executed.
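To make the effect of --skip-step and --only-step concrete, here is a minimal sketch of step filtering by id. It is an illustration only, not the pipeline's real selection logic: the function name `select_steps` is hypothetical, and treating the two options as mutually exclusive is an assumption.

```python
from typing import Iterable, List


def select_steps(
    all_steps: List[str], skip: Iterable[str] = (), only: Iterable[str] = ()
) -> List[str]:
    """Filter step ids the way -x/--skip-step and -k/--only-step describe."""
    skip_ids, only_ids = set(skip), set(only)
    if skip_ids and only_ids:
        # Assumption: combining both options is rejected.
        raise ValueError("Use either skip or only, not both")
    if only_ids:
        return [step for step in all_steps if step in only_ids]
    return [step for step in all_steps if step not in skip_ids]


steps = ["unit", "integration", "acceptance"]
print(select_steps(steps, skip=["unit", "acceptance"]))  # -x unit -x acceptance
print(select_steps(steps, only=["acceptance"]))          # -k acceptance
```

The original step order is preserved in both cases, so filtering never reorders the pipeline.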

Extra parameters

You can pass extra parameters to the following steps:

  • unit
  • integration
  • acceptance

This allows you to override the default parameters of these steps. For example, you can run only the test_read test of the acceptance test suite with: airbyte-ci connectors --name=source-pokeapi test --acceptance.-k=test_read

Here the -k parameter is passed to the pytest command running acceptance tests. Please keep in mind that the extra parameters are not validated by the CLI: if you pass an invalid parameter, the failure will only surface late, during the pipeline execution.
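The `--<step-id>.<extra-parameter>=<extra-parameter-value>` convention can be illustrated with a small parser. This is a sketch of the naming scheme only, not the CLI's own parsing code; `parse_extra_params` and `STEPS_WITH_EXTRA_PARAMS` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Steps that accept extra parameters, as listed above.
STEPS_WITH_EXTRA_PARAMS = {"unit", "integration", "acceptance"}


def parse_extra_params(
    args: Iterable[str],
) -> Dict[str, List[Tuple[str, Optional[str]]]]:
    """Group --<step-id>.<param>[=<value>] arguments by step id."""
    params = defaultdict(list)
    for arg in args:
        if not arg.startswith("--") or "." not in arg:
            continue  # a regular option such as --only-step=acceptance
        step_id, _, rest = arg[2:].partition(".")
        if step_id not in STEPS_WITH_EXTRA_PARAMS:
            continue  # a dot inside some other option value
        name, separator, value = rest.partition("=")
        params[step_id].append((name, value if separator else None))
    return dict(params)


print(parse_extra_params(["--only-step=acceptance", "--acceptance.-k=test_read"]))
```

Note that the sketch does no validation of the parameter itself, which mirrors the behavior described above: an invalid pytest option would only fail once the step actually runs.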

<a id="connectors-build-command"></a>connectors build command

Run a build pipeline for one or multiple connectors and export the built docker image to the local docker host. It is mainly intended for local use.

Build a single connector: airbyte-ci connectors --name=source-pokeapi build

Build a single connector with a custom image tag: airbyte-ci connectors --name=source-pokeapi build --tag=my-custom-tag

Build a single connector for multiple architectures: airbyte-ci connectors --name=source-pokeapi build --architecture=linux/amd64 --architecture=linux/arm64

You will get:

  • airbyte/source-pokeapi:dev-linux-amd64
  • airbyte/source-pokeapi:dev-linux-arm64
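The image names above follow a visible pattern: the tag, then the platform with its slash replaced by a dash. A minimal sketch of that naming, assuming the same pattern also applies to custom tags (the helper `multi_arch_image_name` is hypothetical):

```python
def multi_arch_image_name(connector_name: str, architecture: str, tag: str = "dev") -> str:
    """Build the local image name for one platform of a multi-architecture build."""
    return f"airbyte/{connector_name}:{tag}-{architecture.replace('/', '-')}"


for arch in ("linux/amd64", "linux/arm64"):
    print(multi_arch_image_name("source-pokeapi", arch))
```
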

Build multiple connectors: airbyte-ci connectors --name=source-pokeapi --name=source-bigquery build

Build certified connectors: airbyte-ci connectors --support-level=certified build

Build connectors changed on the current branch: airbyte-ci connectors --modified build

What it runs

For Python and Low Code connectors:

mermaid
flowchart TD
    arch(For each platform amd64/arm64)
    connector[Build connector image]
    load[Load to docker host with :dev tag, current platform]
    spec[Get spec]
    arch-->connector-->spec--"if success"-->load

For Java connectors:

mermaid
flowchart TD
    arch(For each platform amd64/arm64)
    distTar[Gradle distTar task run]
    base[Build integration base]
    java_base[Build integration base Java]
    normalization[Build Normalization]
    connector[Build connector image]

    arch-->base-->java_base-->connector
    distTar-->connector
    normalization--"if supports normalization"-->connector

    load[Load to docker host with :dev tag]
    spec[Get spec]
    connector-->spec--"if success"-->load

Options

| Option | Multiple | Default value | Description |
| --- | --- | --- | --- |
| --architecture/-a | True | Local platform | Defines for which architecture(s) the connector image will be built. |
| --tag | False | dev | Image tag for the built image. |

<a id="connectors-publish-command"></a>connectors publish command

Run a publish pipeline for one or multiple connectors. It is mainly intended for CI use to release a connector update.

Examples

Publish all connectors modified in the head commit: airbyte-ci connectors --modified publish

Options

| Option | Required | Default | Mapped environment variable | Description |
| --- | --- | --- | --- | --- |
| --pre-release/--main-release | False | --pre-release | | Whether to publish the pre-release or the main release version of a connector. Defaults to pre-release. For main release you have to set the credentials to interact with the GCS bucket. |
| --spec-cache-gcs-credentials | False | | SPEC_CACHE_GCS_CREDENTIALS | The service account key to upload files to the GCS bucket hosting spec cache. |
| --spec-cache-bucket-name | False | | SPEC_CACHE_BUCKET_NAME | The name of the GCS bucket where specs will be cached. |
| --metadata-service-gcs-credentials | False | | METADATA_SERVICE_GCS_CREDENTIALS | The service account key to upload files to the GCS bucket hosting the metadata files. |
| --metadata-service-bucket-name | False | | METADATA_SERVICE_BUCKET_NAME | The name of the GCS bucket where metadata files will be uploaded. |
| --slack-webhook | False | | SLACK_WEBHOOK | The Slack webhook URL to send notifications to. |
| --slack-channel | False | | SLACK_CHANNEL | The Slack channel name to send notifications to. |
| --ci-requirements | False | | | Output the CI requirements as a JSON payload. It is used to determine the CI runner to use. |
| --python-registry-token | False | | PYTHON_REGISTRY_TOKEN | The API token to authenticate with the registry. For pypi, the pypi- prefix needs to be specified |
| --python-registry-url | False | https://upload.pypi.org/legacy/ | PYTHON_REGISTRY_URL | The python registry to publish to. Defaults to main pypi |
| --python-registry-check-url | False | https://pypi.org/pypi | PYTHON_REGISTRY_CHECK_URL | The python registry url to check whether a package is published already |
| --promote-release-candidate | False | False | | Promote the release candidate version of selected connectors as main version. |
| --rollback-release-candidate | False | False | | Rollback the release candidate version of the selected connectors. |


What it runs

mermaid
flowchart TD
    validate[Validate the metadata file]
    check[Check if the connector image already exists]
    build[Build the connector image for all platform variants]
    publish_to_python_registry[Publish the connector package to the python registry if enabled]
    upload_spec[Upload connector spec to the spec cache bucket]
    push[Push the connector image to DockerHub, with platform variants]
    pull[Pull the connector image from DockerHub to check SPEC can be run and the image layers are healthy]
    upload_metadata[Upload its metadata file to the metadata service bucket]

    validate-->check-->build-->upload_spec-->publish_to_python_registry-->push-->pull-->upload_metadata

Python registry publishing

If remoteRegistries.pypi.enabled in the connector metadata is set to true, the connector will be published to the python registry. To do so, the --python-registry-token and --python-registry-url options are used to authenticate with the registry and publish the connector. If the current version of the connector is already published to the registry, the publish will be skipped (the --python-registry-check-url is used for the check).

On a pre-release, the connector will be published as a .dev<N> version.

The remoteRegistries.pypi.packageName field holds the name of the package to publish. It should be set to airbyte-source-<package name>. Certified Python connectors are required to have PyPI publishing enabled.

An example remoteRegistries entry in a connector metadata.yaml looks like this:

yaml
remoteRegistries:
  pypi:
    enabled: true
    packageName: airbyte-source-pokeapi
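The decision described in this section (publish only when the registry entry is enabled, skip when the version already exists) can be sketched as follows. This is an illustration under assumptions, not the pipeline's implementation: the helper names are hypothetical, the input is the already-parsed mapping that contains the `remoteRegistries` key, and the `<check-url>/<package>/<version>/json` layout is that of the public PyPI JSON API, which the publish step may or may not use in exactly this form.

```python
from typing import Any, Mapping, Optional

# Default value of --python-registry-check-url from the options table.
DEFAULT_CHECK_URL = "https://pypi.org/pypi"


def pypi_package_name(metadata: Mapping[str, Any]) -> Optional[str]:
    """Return the package name if PyPI publishing is enabled, else None."""
    pypi = metadata.get("remoteRegistries", {}).get("pypi", {})
    if pypi.get("enabled") is True:
        return pypi.get("packageName")
    return None


def version_check_url(
    package_name: str, version: str, check_url: str = DEFAULT_CHECK_URL
) -> str:
    """Build the URL whose existence indicates the version is already published."""
    return f"{check_url.rstrip('/')}/{package_name}/{version}/json"


metadata = {
    "remoteRegistries": {
        "pypi": {"enabled": True, "packageName": "airbyte-source-pokeapi"}
    }
}
package = pypi_package_name(metadata)
if package is not None:
    # The version number here is a made-up example value.
    print(version_check_url(package, "0.1.0"))
```
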

<a id="connectors-up-to-date"></a>connectors up-to-date command

Meant to be run on a cron schedule.

Actions:

  • Set the latest base image version on selected connectors
  • Run poetry update on selected connectors
  • Bump the connector version and update the changelog
  • Open a PR with the changes, set auto-merge label on it.

Usage: airbyte-ci connectors up-to-date [OPTIONS]

Options:
  --no-bump       Don't bump the version or changelog.
  --dep TEXT      Give a specific set of `poetry add` dependencies to update. For
                  example: --dep airbyte-cdk==0.80.0 --dep pytest@^6.2
  --open-reports  Auto open reports in the browser.
  --create-prs    Create pull requests for each updated connector.
  --auto-merge    Set the auto-merge label on created PRs.
  --help          Show this message and exit.

Examples

Get source-openweather up to date. If there are changes, bump the version and add to changelog:

  • airbyte-ci connectors --name=source-openweather up-to-date: upgrades main dependencies
  • airbyte-ci connectors --name=source-openweather up-to-date --create-prs: make a pull request for it
  • airbyte-ci connectors --name=source-openweather up-to-date --no-bump: don't change the version or changelog

<a id="connectors-bump-version"></a>connectors bump-version command

Bump the version of the selected connectors. A placeholder will be added to the changelog file for the new entry PR number. Use the connectors pull-request command to create a PR; it will update the changelog entry with the PR number.

Examples

Bump source-openweather: airbyte-ci connectors --name=source-openweather bump-version patch "<changelog-entry>"

Arguments

| Argument | Description |
| --- | --- |
| BUMP_TYPE | major, minor, patch, rc, or version:&lt;explicit-version&gt; |
| CHANGELOG_ENTRY | The changelog entry that will get added to the connector documentation |

Options

| Option | Description |
| --- | --- |
| --pr-number | Explicitly set the PR number in the changelog entry, a placeholder will be set otherwise. |
| --rc | Bump the version by the specified bump type and append the release candidate suffix. |
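For readers unfamiliar with the bump types, the sketch below shows what major, minor, patch, and version:<explicit-version> mean for a plain MAJOR.MINOR.PATCH version. It is a simplified illustration with a hypothetical function name, not the command's implementation. The rc bump type and the --rc option are deliberately left out, because the exact release candidate suffix format is not described here.

```python
def bump_version(current: str, bump_type: str) -> str:
    """Apply a bump type to a plain MAJOR.MINOR.PATCH version string."""
    if bump_type.startswith("version:"):
        # An explicit version replaces the current one as-is.
        return bump_type[len("version:"):]
    major, minor, patch = (int(part) for part in current.split("."))
    if bump_type == "major":
        return f"{major + 1}.0.0"
    if bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unsupported bump type in this sketch: {bump_type}")


print(bump_version("1.2.3", "patch"))
print(bump_version("1.2.3", "version:2.0.0"))
```
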

<a id="connectors-upgrade-cdk"></a>connectors upgrade-cdk command

Updates the CDK version of the selected connectors. For Python connectors, sets the airbyte-cdk dependency in pyproject.toml and refreshes the lockfile, updating only essential dependencies.

Examples

  • airbyte-ci connectors --language=python upgrade-cdk -> Updates all python connectors to the caret range of the latest version.
  • airbyte-ci connectors --name=source-openweather upgrade-cdk "3.0.0" -> Pins source-openweather to version 3.0.0
  • airbyte-ci connectors --modified upgrade-cdk "<4" -> Updates all modified connectors to the highest available version of major version 3.x.x

Arguments

| Argument | Description |
| --- | --- |
| CDK_VERSION | CDK version constraint to set (default to ^{most_recent_patch_version}) |

Notes

When using < (less than) or > (greater than) for the CDK_VERSION argument, it must be wrapped in quotation marks ("<3"). Otherwise the shell (zsh or bash) will interpret these characters as redirection operators.
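When the command line is assembled by a script rather than typed by hand, the standard library can take care of the quoting. A minimal sketch using Python's `shlex` module, with the command taken from the examples above:

```python
import shlex

# Argument list as it would be passed to the process, with no shell quoting yet.
command = ["airbyte-ci", "connectors", "--modified", "upgrade-cdk", "<4"]

# shlex.join quotes any argument containing shell metacharacters such as "<".
print(shlex.join(command))
# prints: airbyte-ci connectors --modified upgrade-cdk '<4'
```

Passing the same list directly to `subprocess.run(command)` avoids the issue entirely, since no shell is involved and the `<` character is never interpreted as a redirection.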

<a id="connectors-migrate-to-base-image"></a>connectors migrate-to-base-image command

Make a connector using a Dockerfile migrate to the base image by:

  • Removing its Dockerfile
  • Updating its metadata to use the latest base image version
  • Updating its documentation to explain the build process
  • Bumping by a patch version

Examples

Migrate source-openweather to use the base image: airbyte-ci connectors --name=source-openweather migrate-to-base-image

<a id="connectors-migrate-to-poetry"></a>connectors migrate-to-poetry command

Migrate connectors to the poetry package manager.

Examples

Migrate source-openweather to use poetry:

  • airbyte-ci connectors --name=source-openweather migrate-to-poetry
  • airbyte-ci connectors --name=source-openweather migrate-to-poetry --changelog --bump patch

<a id="connectors-migrate-to-inline-schemas"></a>connectors migrate-to-inline-schemas command

Migrate .json schemas into manifest.yaml files, when present.

Usage: airbyte-ci connectors migrate-to-inline-schemas [OPTIONS]

Options:
  --report  Auto open report browser.
  --help    Show this message and exit.

Examples

Migrate source-quickbooks to use inline schemas: airbyte-ci connectors --name=source-quickbooks migrate-to-inline-schemas

<a id="connectors-pull-request"></a>connectors pull-request command

Makes a pull request for all changed connectors. If the branch already exists, it will update the existing one.

Usage: airbyte-ci connectors pull-request [OPTIONS]

Options:
  -m, --message TEXT          Commit message and pull request title and
                              changelog (if enabled).  [required]
  -b, --branch_id TEXT        update a branch named <branch_id>/<connector-
                              name> instead generating one from the message.
                              [required]
  --report                    Auto open report browser.
  --title TEXT                Title of the PR to be created or edited
                              (optional - defaults to message or no change).
  --body TEXT                 Body of the PR to be created or edited (optional
                              - defaults to empty or not change).
  --help                      Show this message and exit.

Examples

Make a PR for all changes, bump the version and make a changelog in those PRs. They will be on the branch ci_update/round2/<connector-name>: airbyte-ci connectors --modified pull-request -m "upgrading connectors" -b ci_update/round2

Do it just for a few connectors: airbyte-ci connectors --name source-aha --name source-quickbooks pull-request -m "upgrading connectors" -b ci_update/round2

You can also set or change the title or body of the PR: airbyte-ci connectors --name source-aha --name source-quickbooks pull-request -m "upgrading connectors" -b ci_update/round2 --title "New title" --body "full body\n\ngoes here"

<a id="connectors-list-command"></a>connectors generate-erd command

Generates a couple of files and publishes a new ERD to dbdocs. The generated files are:

  • <source code_directory>/erd/discovered_catalog.json: the catalog used to generate the estimated relations and the dbml file
  • <source code_directory>/erd/estimated_relationships.json: the output of the LLM trying to figure out the relationships between the different streams
  • <source code_directory>/erd/source.dbml: the file used to upload the ERDs to dbdocs

Pre-requisites:

  • The config file used to discover the catalog should be available in <source code_directory>/secrets/config.json

Create initial diagram workflow or on connector's schema change

Steps

  • Ensure the pre-requisites mentioned above are met
  • Run DBDOCS_TOKEN=<token> GENAI_API_KEY=<api key> airbyte-ci connectors --name=<source name> generate-erd
  • Create a PR with files <source code_directory>/erd/estimated_relationships.json and <source code_directory>/erd/source.dbml for documentation purposes

Expected Outcome

  • The diagram is available in dbdocs
  • <source code_directory>/erd/estimated_relationships.json and <source code_directory>/erd/source.dbml are updated on master

On manual validation

Steps

  • If it does not exist, create the file <source code_directory>/erd/confirmed_relationships.json with the following format, where:
    • relations describes the relationships that we know exist
    • false_positives describes the relationships the LLM found that we know do not exist
{
    "streams": [
        {
            "name": <stream_name>,
            "relations": {
                <stream_name property>: "<target stream>.<target stream column>"
            },
            "false_positives": {
                <stream_name property>: "<target stream>.<target stream column>"
            }
        },
        <...>
    ]
}
  • Ensure the pre-requisites mentioned above are met
  • Run DBDOCS_TOKEN=<token> airbyte-ci connectors --name=<source name> generate-erd -x llm_relationships
  • Create a PR with files <source code_directory>/erd/confirmed_relationships.json and <source code_directory>/erd/source.dbml for documentation purposes
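Since the template above is not valid JSON until the placeholders are filled in, it can help to generate the file programmatically. The following is a minimal sketch under assumptions: the helper names are hypothetical, and the stream and column names (`orders`, `customers`, and so on) are made-up example values, not taken from any real connector.

```python
import json
from pathlib import Path
from typing import Dict, Mapping


def build_confirmed_relationships(streams: Mapping[str, Mapping[str, Dict[str, str]]]) -> dict:
    """Build the confirmed_relationships.json payload from a per-stream mapping."""
    return {
        "streams": [
            {
                "name": name,
                "relations": dict(entry.get("relations", {})),
                "false_positives": dict(entry.get("false_positives", {})),
            }
            for name, entry in streams.items()
        ]
    }


def write_confirmed_relationships(erd_dir: Path, streams) -> Path:
    """Write the payload to <erd_dir>/confirmed_relationships.json."""
    path = erd_dir / "confirmed_relationships.json"
    path.write_text(json.dumps(build_confirmed_relationships(streams), indent=4) + "\n")
    return path


# Hypothetical example: one confirmed relation and one rejected LLM suggestion.
example = {
    "orders": {
        "relations": {"customer_id": "customers.id"},
        "false_positives": {"status": "customers.status"},
    }
}
print(json.dumps(build_confirmed_relationships(example), indent=4))
```

`write_confirmed_relationships` expects the erd directory to exist already. Each key under relations or false_positives is a property of the stream, and each value has the "<target stream>.<target stream column>" form shown in the template.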

Options

| Option | Required | Default | Mapped environment variable | Description |
| --- | --- | --- | --- | --- |
| --skip-step/-x | False | | | Skip steps by id e.g. -x llm_relationships -x publish_erd |

<a id="format-subgroup"></a>format command subgroup

airbyte-ci format is no longer available. To format code in this repository, we're using pre-commit. Assuming pre-commit is installed, pre-commit run will run the formatters for you.

<a id="poetry-subgroup"></a>poetry command subgroup

Available commands:

  • airbyte-ci poetry publish

Options

| Option | Required | Default | Mapped environment variable | Description |
| --- | --- | --- | --- | --- |
| --package-path | True | | | The path to the python package to execute a poetry command on. |

Examples

  • Publish a python package: airbyte-ci poetry --package-path=path/to/package publish --publish-name=my-package --publish-version="1.2.3" --python-registry-token="..." --python-registry-url="http://host.docker.internal:8012/"

<a id="format-check-command"></a>publish command

This command publishes poetry packages (using pyproject.toml) or python packages (using setup.py) to a python registry.

For poetry packages, the package name and version can be taken from the pyproject.toml file or be specified as options.

Options

| Option | Required | Default | Mapped environment variable | Description |
| --- | --- | --- | --- | --- |
| --publish-name | False | | | The name of the package. Not required for poetry packages that define it in the pyproject.toml file |
| --publish-version | False | | | The version of the package. Not required for poetry packages that define it in the pyproject.toml file |
| --python-registry-token | True | | PYTHON_REGISTRY_TOKEN | The API token to authenticate with the registry. For pypi, the pypi- prefix needs to be specified |
| --python-registry-url | False | https://upload.pypi.org/legacy/ | PYTHON_REGISTRY_URL | The python registry to publish to. Defaults to main pypi |

<a id="metadata-validate-command-subgroup"></a>metadata command subgroup

Available commands:

  • airbyte-ci metadata deploy orchestrator

<a id="metadata-upload-orchestrator"></a>metadata deploy orchestrator command

This command deploys the metadata service orchestrator to production. The DAGSTER_CLOUD_METADATA_API_TOKEN environment variable must be set.

Example

airbyte-ci metadata deploy orchestrator

What it runs

mermaid
flowchart TD
    test[Run orchestrator tests] --> deploy[Deploy orchestrator to Dagster Cloud]

<a id="tests-command"></a>tests command

This command runs the poe tasks declared in the [tool.airbyte-ci] section of our internal poetry packages. Feel free to check out this Pydantic model to see the list of available options in the [tool.airbyte-ci] section.

You can find the list of internal packages here

Options

| Option | Required | Multiple | Description |
| --- | --- | --- | --- |
| --poetry-package-path/-p | False | True | Poetry packages path to run the poe tasks for. |
| --modified | False | False | Run poe tasks of modified internal poetry packages. |
| --ci-requirements | False | False | Output the CI requirements as a JSON payload. It is used to determine the CI runner to use. |

Examples

You can pass multiple --poetry-package-path options to run poe tasks.

For example, to run Poe tasks on the modified internal packages of the current branch: airbyte-ci test --modified

<a id="migrate-to-manifest-only-command"></a>migrate-to-manifest-only command

This command migrates valid connectors to the manifest-only format. It contains two steps:

  1. Check: Validates whether a connector is a candidate for the migration. If not, the operation will be skipped.
  2. Migrate: Strips out all unnecessary files/folders, leaving only the root-level manifest, metadata, icon, and acceptance/integration test files. Unwraps the manifest (references and $parameters) so it's compatible with Connector Builder.

Examples

bash
airbyte-ci connectors --name=source-pokeapi migrate-to-manifest-only
airbyte-ci connectors --language=low-code migrate-to-manifest-only

Changelog

VersionPRDescription
5.5.0#64164Remove the MetadataValidation step from the airbyte-ci pipeline. This is now done via a shell script.
5.4.0#64135Delete the base_images sub-package. Connector base images are now built using Dockerfiles
5.3.0#61598Add trackable commit text and github-native auto-merge in up-to-date, auto-merge, rc-promote, and rc-rollback
5.2.5#60325Update slack team to oc-extensibility-critical-systems
5.2.4#59724Fix components mounting and test dependencies for manifest-only unit tests
5.1.0#53238Add ability to opt out of version increment checks via metadata flag
5.0.1#52664Update Python version requirement from 3.10 to 3.11.
4.49.4#52104Stream Gradle task output to the step logger
5.0.0#52647Removed migration and formatting commands.
4.49.3#52102Load docker image to local docker host for java connectors
4.49.2#52090Re-add custom task parameters in GradleTask
4.49.1#52087Wire the --enable-report-auto-open correctly for connector tests
4.49.0#52033Run gradle as a subprocess and not via Dagger
4.48.9#51609Fix ownership of shared cache volume for non root connectors
4.48.8#51582Fix typo in migrate-to-inline-schemas command
4.48.7#51579Give back the ownership of /tmp to the original user on finalize build
4.48.6#51577Run finalize build scripts as root
4.48.5#49827Bypasses CI checks for promoted release candidate PRs.
4.48.4#51003Install git in the build / test connector container when --use-cdk-ref is passed.
4.48.3#50988Remove deprecated --no-update flag from poetry commands
4.48.2#50871Speed up connector modification detection.
4.48.1#50410Java connector build: give ownership of built artifacts to the current image user.
4.48.0#49960Deprecate airbyte-ci format command
4.47.0#49832Build java connectors from the base image declared in metadata.yaml.
4.46.5#49835Fix connector language discovery for projects with Kotlin Gradle build scripts.
4.46.4#49462Support Kotlin Gradle build scripts in connectors.
4.46.3#49465Fix --use-local-cdk on rootless connectors.
4.46.2#49136Fix failed install of python components due to non-root permissions.
4.46.1#49146Update crane image address as the one we were using has been deleted by the maintainer.
4.46.0#48790Add unit tests step for manifest-only connectors
4.45.3#48927Fix bug in determine_changelog_entry_comment
4.45.2#48868Fix ownership issues while using --use-local-cdk
4.45.1#48872Make the connectors list command write its output to a JSON file.
4.45.0#48866Adds --rc option to bump-version command
4.44.2#48725up-to-date: specific changelog comment for base image upgrade to rootless.
4.44.1#48836Manifest-only connector build: give ownership of copied file to the current user.
4.44.0#48818Use local CDK or CDK ref for manifest only connector build.
4.43.1#48824Allow uploading CI reports to GCS with fewer permissions set.
4.43.0#36545Switch to airbyte user when available in Python base image.
4.42.2#48404Include advanced_auth in spec migration for manifest-only pipeline
4.42.1#47316Connector testing: skip incremental acceptance test when the connector is not released.
4.42.0#47386Version increment check: make sure consecutive RC remain on the same version.
4.41.9#47483Fix build logic used in up-to-date to support any connector language.
4.41.8#47447Use cache_ttl for base image registry listing in up-to-date.
4.41.7#47444Remove redundant --ignore-connector error from up-to-date. --metadata-query can be used instead.
4.41.6#47308Connector testing: skip incremental acceptance test when the connector is not released.
4.41.5#47255Fix DisableProgressiveRollout following Dagger API change.
4.41.4#47203Fix some with_exec and entrypoint usage following Dagger upgrade
4.41.3#47189Fix up-to-date which did not export doc to the right path
4.41.2#47185Fix the bump version command which did not update the changelog.
4.41.1#46914Upgrade to Dagger 0.13.3
4.41.0#46914Rework the connector rollback pipeline for progressive rollout
4.40.0#46380The bump-version command now allows the rc bump type.
4.39.0#46696Bump PyAirbyte dependency and replace airbyte-lib-validate-source CLI command with new validate command
4.38.0#46380connectors up-to-date now supports manifest-only connectors!
4.37.0#46380Include custom components file handling in manifest-only migrations
4.36.2#46278Fixed a bug in RC rollout and promote not taking semaphore
4.36.1#46274airbyte-ci format js respects .prettierrc and .prettierignore
4.36.0#44877Implement --promote/rollback-release-candidate in connectors publish.
4.35.6#45632Add entry to format file ignore list (destination-*/expected-spec.json)
4.35.5#45672Fix docs mount during publish
4.35.4#42584Mount connector directory to metadata validation
4.35.3#45393Resolve symlinks in SimpleDockerStep.
4.35.2#45360Updated dependencies.
4.35.1#45160Remove deps.toml dependency for java connectors.
4.35.0#44879Mount components.py when building manifest-only connector image
4.34.2#44786Pre-emptively skip archived connectors when searching for modified files
4.34.1#44557Conditionally propagate parameters in manifest-only migration
4.34.0#44551connectors publish does not push the latest tag when the current version is a release candidate.
4.33.1#44465Ignore version check if only erd folder is changed
4.33.0#44377Upload connector SBOM to metadata service bucket on publish.
4.32.5#44173Bug fix for live tests' --should-read-with-state handling.
4.32.4#44025Ignore third party connectors on publish.
4.32.3#44118Improve error handling in live tests.
4.32.2#43970Make connectors publish early exit if no connectors are selected.
4.32.1#41642Avoid transient publish failures by increasing POETRY_REQUESTS_TIMEOUT and setting retries on PublishToPythonRegistry.
4.32.0#43969Add an --ignore-connector option to up-to-date
4.31.5#43934Track deleted files when generating pull-request
4.31.4#43724Do not send slack message on connector pre-release.
4.31.3#43426Ignore archived connectors on connector selection from modified files.
4.31.2#43433Fix 'changed_file' indentation in 'pull-request' command
4.31.1#43442Resolve type check failure in bump version
4.31.0#42970Add explicit version set to bump version
4.30.1#43386Fix 'format' command usage bug in airbyte-enterprise.
4.30.0#42583Updated dependencies
4.29.0#42576New command: migrate-to-manifest-only
4.28.3#42046Trigger connector tests on doc change.
4.28.2#43297migrate-to-inline_schemas removes unused schema files and empty schema dirs.
4.28.1#42972Add airbyte-enterprise support for format command
4.28.0#42849Couple selection of strict-encrypt variants (and vice versa)
4.27.0#42574Live tests: run from connectors test pipeline for connectors with sandbox connections
4.26.1#42905Rename the docker cache volume to avoid using the corrupted previous volume.
4.26.0#42849Send publish failures messages to #connector-publish-failures
4.25.4#42463Add validation before live test runs
4.25.3#42437Upgrade-cdk: Update to work with Python connectors using poetry
4.25.2#42077Live/regression tests: add status check for regression test runs
4.25.1#42410Live/regression tests: disable approval requirement on forks
4.25.0#42044Live/regression tests: add support for selecting from a subset of connections
4.24.3#42040Always send regression test approval status check; skip on auto-merge PRs.
4.24.2#41676Send regression test approval status check when skipped.
4.24.1#41642Use the AIRBYTE_GITHUB_REPO environment variable to run airbyte-ci in other repos.
4.24.0#41627Require manual regression test approval for certified connectors
4.23.1#41541Add support for submodule use-case.
4.23.0#39906Add manifest only build pipeline
4.22.0#41623Make airbyte-ci run on private forks.
4.21.1#41029up-to-date: mount local docker config to Syft to pull private images and benefit from increased DockerHub rate limits.
4.21.0#40547Make bump-version accept a --pr-number option.
4.20.3#40754Accept and ignore additional args in migrate-to-poetry pipeline
4.20.2#40709Fix use of GH token.
4.20.1#40698Add live tests evaluation mode options.
4.20.0#38816Add command for running all live tests (validation + regression).
4.19.0#39600Productionize the up-to-date command
4.18.3#39341Fix --use-local-cdk option: change no-deps to force-reinstall
4.18.2#39483Skip IncrementalAcceptanceTests when AcceptanceTests succeed.
4.18.1#39457Make slugify consistent with live-test
4.18.0#39366Implement IncrementalAcceptance tests to only fail CI on community connectors when there's an Acceptance tests regression.
4.17.0#39321Bust the java connector build cache flow to get fresh yum packages on a daily basis.
4.16.0#38772Add pipeline to replace usage of AirbyteLogger.
4.15.7#38772Fix regression test connector image retrieval.
4.15.6#38783Fix a variable access error with repo_dir in the bump-version command.
4.15.5#38732Update metadata deploy pipeline to 3.10
4.15.4#38646Make airbyte-ci able to test external repos.
4.15.3#38645Fix typo preventing correct secret mounting on Python connectors integration tests.
4.15.2#38628Introduce ConnectorTestContext to avoid trying to fetch connector secrets in the PublishContext.
4.15.1#38615Do not eagerly fetch connector secrets.
4.15.0#38322Introduce a SecretStore abstraction to fetch connector secrets from metadata files.
4.14.1#38582Fixed bugs in up-to-date flags, pull-request version change logic.
4.14.0#38281Conditionally run test suites according to connectorTestSuitesOptions in metadata files.
4.13.3#38221Add dagster cloud dev deployment pipeline options
4.13.2#38246Remove invalid connector test step options.
4.13.1#38020Add auto_merge as an internal package to test.
4.13.0#32715Tag connector metadata with git info
4.12.7#37787Remove requirements on dockerhub credentials to run QA checks.
4.12.6#36497Add airbyte-cdk to list of poetry packages for testing
4.12.5#37785Set the --yes-auto-update flag to True by default.
4.12.4#37786(fixed 4.12.2): Do not upload dagger log to GCP when no credentials are available.
4.12.3#37783Revert 4.12.2
4.12.2#37778Do not upload dagger log to GCP when no credentials are available.
4.12.1#37765Relax the required env var to run in CI and handle their absence gracefully.
4.12.0#37690Pass custom CI status name in connectors test
4.11.0#37641Updates to run regression tests in GitHub Actions.
4.10.5#37641Reintroduce changes from 4.10.0 with a fix.
4.10.4#37641Temporarily revert changes from version 4.10.0
4.10.3#37615Fix KeyError when running migrate-to-poetry
4.10.2#37614Fix UnboundLocalError: local variable 'add_changelog_entry_result' referenced before assignment in migrate-to-base-image
4.10.1#37622Temporarily disable regression tests in CI
4.10.0#37616Improve modified files comparison when the target branch is from a fork.
4.9.0#37440Run regression tests with airbyte-ci connectors test
4.8.0#37404Accept a git-repo-url option on the airbyte-ci root command to checkout forked repo.
4.7.4#37485Allow java connectors to be written in kotlin.
4.7.3#37101Pin PyAirbyte version.
4.7.2#36962Re-enable connector dependencies upload on publish.
4.7.1#36961Temporarily disable python connectors dependencies upload until we find a schema the data team can work with.
4.7.0#36892Upload Python connectors dependencies list to GCS on publish.
4.6.5#36722Fix incorrect pipeline names
4.6.4#36480Bust the Gradle Task cache if a new CDK version was released
4.6.3#36527Handle extras as well as groups in airbyte ci test [poetry packages]
4.6.2#36220Allow using migrate-to-base-image without PULL_REQUEST_NUMBER
4.6.1#36319Fix ValueError related to PR number in migrate-to-poetry
4.6.0#35583Implement the airbyte-ci connectors migrate-to-poetry command.
4.5.4#36206Revert poetry cache removal during nightly builds
4.5.3#34586Extract connector changelog modification logic into its own class
4.5.2#35802Fix bug with connectors bump-version command
4.5.1#35786Declare live_tests as an internal poetry package.
4.5.0#35784Format command supports kotlin
4.4.0#35317Augment java connector reports to include full logs and junit test results
4.3.2#35536Make QA checks run correctly on *-strict-encrypt connectors.
4.3.1#35437Do not run QA checks on publish, just MetadataValidation.
4.3.0#35438Optionally disable telemetry with environment variable.
4.2.4#35325Use connectors_qa for QA checks and remove redundant checks.
4.2.3#35322Declare connectors_qa as an internal package for testing.
4.2.2#35364Fix connector tests following gradle changes in #35307.
4.2.1#35204Run poetry check before poetry install on poetry package install.
4.2.0#35103Java 21 support.
4.1.4#35039Fix bug which prevented gradle test reports from being added.
4.1.3#35010Use poetry install --no-root in the builder container.
4.1.2#34945Only install main dependencies when running poetry install.
4.1.1#34430Speed up airbyte-ci startup (and airbyte-ci format).
4.1.0#34923Include gradle test reports in HTML connector test report.
4.0.0#34736Run poe tasks declared in internal poetry packages.
3.10.4#34867Remove connector ops team
3.10.3#34836Add check for python registry publishing enabled for certified python sources.
3.10.2#34044Add pypi validation testing.
3.10.1#34756Enable connectors tests in draft PRs.
3.10.0#34606Allow configuration of separate check URL to check whether package exists already.
3.9.0#34606Allow configuration of python registry URL via environment variable.
3.8.1#34607Improve gradle dependency cache volume protection.
3.8.0#34316Expose Dagger engine image name in --ci-requirements and add --ci-requirements to the airbyte-ci root command group.
3.7.3#34560Simplify Gradle task execution framework by removing local maven repo support.
3.7.2#34555Override secret masking in some very specific special cases.
3.7.1#34441Support masked secret scrubbing for java CDK v0.15+
3.7.0#34343allow running connector upgrade_cdk for java connectors
3.6.1#34490Fix inconsistent dagger log path typing
3.6.0#34111Add python registry publishing
3.5.3#34339only do minimal changes on a connector version_bump
3.5.2#34381Bind a sidecar docker host for airbyte-ci test
3.5.1#34321Upgrade to Dagger 0.9.6.
3.5.0#33313Pass extra params after Gradle tasks.
3.4.2#34301Pass extra params after Gradle tasks.
3.4.1#34067Use dagster-cloud 1.5.7 for deploy
3.4.0#34276Introduce --only-step option for connector tests.
3.3.0#34218Introduce --ci-requirements option for client defined CI runners.
3.2.0#34050Connector test steps can take extra parameters
3.1.3#34136Fix issue where dagger excludes were not being properly applied
3.1.2#33972Remove secrets scrubbing hack for --is-local and other small tweaks.
3.1.1#33979Fix AssertionError on report existence again
3.1.0#33994Log more context information in CI.
3.0.2#33987Fix type checking issue when running --help
3.0.1#33981Fix issues with deploying dagster, pin pendulum version in dagster-cli install
3.0.0#33582Upgrade to Dagger 0.9.5
2.14.3#33964Reintroduce mypy with fixes for AssertionError on publish and missing report URL on connector test commit status.
2.14.2#33954Revert mypy changes
2.14.1#33956Exclude pnpm lock files from auto-formatting
2.14.0#33941Enable in-connector normalization in destination-postgres
2.13.1#33920Report different sentry environments
2.13.0#33784Make airbyte-ci test able to run any poetry command
2.12.0#33313Add upgrade CDK command
2.11.0#32188Add -x option to connector test to allow for skipping steps
2.10.12#33419Make ClickPipelineContext handle dagger logging.
2.10.11#33497Consider nested .gitignore rules in format.
2.10.10#33449Add generated metadata models to the default format ignore list.
2.10.9#33370Fix bug that broke airbyte-ci test
2.10.8#33249Exclude git ignored files from formatting.
2.10.7#33248Fix bug which broke airbyte-ci connectors tests when optional DockerHub credentials env vars are not set.
2.10.6#33170Remove Dagger logs from console output of format.
2.10.5#33097Improve format performance, exit with 1 status code when fix changes files.
2.10.4#33206Add "-y/--yes" Flag to allow preconfirmation of prompts
2.10.3#33080Fix update failing due to SSL error on install.
2.10.2#33008Fix local connector build.
2.10.1#32928Fix BuildConnectorImages constructor.
2.10.0#32819Add --tag option to connector build.
2.9.0#32816Add --architecture option to connector build.
2.8.1#32999Improve Java code formatting speed
2.8.0#31930Move pipx install to airbyte-ci-dev, and add auto-update feature targeting binary
2.7.3#32847Improve --modified behaviour for pull requests.
2.7.2#32839Revert changes in v2.7.1.
2.7.1#32806Improve --modified behaviour for pull requests.
2.7.0#31930Merge airbyte-ci-internal into airbyte-ci
2.6.0#31831Add airbyte-ci format commands, remove connector-specific formatting check
2.5.9#32427Re-enable caching for source-postgres
2.5.8#32402Set Dagger Cloud token for airbyters only
2.5.7#31628Add ClickPipelineContext class
2.5.6#32139Test coverage report on Python connector UnitTest.
2.5.5#32114Create cache mount for /var/lib/docker to store images in dind context.
2.5.4#32090Do not cache docker login.
2.5.3#31974Fix latest CDK install and pip cache mount on connector install.
2.5.2#31871Deactivate PR comments, add HTML report links to the PR status when its ready.
2.5.1#31774Add a docker configuration check on airbyte-ci startup.
2.5.0#31766Support local connectors secrets.
2.4.0#31716Enable pre-release publish with local CDK.
2.3.1#31748Use AsyncClick library instead of base Click.
2.3.0#31699Support optional concurrent CAT execution.
2.2.6#31752Only authenticate when secrets are available.
2.2.5#31718Authenticate the sidecar docker daemon to DockerHub.
2.2.4#31535Improve gradle caching when building java connectors.
2.2.3#31688Fix failing CheckBaseImageUse step when not running on PR.
2.2.2#31659Support builds on x86_64 platform
2.2.1#31653Fix CheckBaseImageIsUsed failing on non certified connectors.
2.2.0#30527Add a new check for python connectors to make sure certified connectors use our base image.
2.1.1#31488Improve airbyte-ci start time with Click Lazy load
2.1.0#31412Run airbyte-ci from anywhere in the airbyte project
2.0.4#31487Allow for third party connector selections
2.0.3#31525Refactor folder structure
2.0.2#31533Pip cache volume by python version.
2.0.1#31545Reword the changelog entry when using migrate-to-base-image.
2.0.0#31424Remove airbyte-ci connectors format command.
1.9.4#31478Fix running tests for connector-ops package.
1.9.3#31457Improve the connector documentation for connectors migrated to our base image.
1.9.2#31426Concurrent execution of java connectors tests.
1.9.1#31455Fix None docker credentials on publish.
1.9.0#30520New commands: bump-version, upgrade_base_image, migrate-to-base-image.
1.8.0#30520New commands: bump-version, upgrade_base_image, migrate-to-base-image.
1.7.2#31343Bind Pytest integration tests to a dockerhost.
1.7.1#31332Disable Gradle step caching on source-postgres.
1.7.0#30526Implement pre/post install hooks support.
1.6.0#30474Test connector inside their containers.
1.5.1#31227Use python 3.11 in amazoncorretto-based gradle containers, run 'test' gradle task instead of 'check'.
1.5.0#30456Start building Python connectors using our base images.
1.4.6#31087Throw error if airbyte-ci tools is out of date
1.4.5#31133Fix bug when building containers using with_integration_base_java_and_normalization.
1.4.4#30743Add --disable-report-auto-open and --use-host-gradle-dist-tar to allow gradle integration.
1.4.3#30595Add --version and version check
1.4.2#30595Remove directory name requirement
1.4.1#30595Load base migration guide into QA Test container for strict encrypt variants
1.4.0#30330Add support for pyproject.toml as the preferred entry point for a connector package
1.3.0#30461Add --use-local-cdk flag to all connectors commands
1.2.3#30477Fix a test regression introduced in the previous version.
1.2.2#30438Add workaround to always stream logs properly with --is-local.
1.2.1#30384Java connector test performance fixes.
1.2.0#30330Add --metadata-query option to connectors command
1.1.3#30314Stop patching gradle files to make them work with airbyte-ci.
1.1.2#30279Fix correctness issues in layer caching by making atomic execution groupings
1.1.1#30252Fix redundancies and broken logic in GradleTask, to speed up the CI runs.
1.1.0#29509Refactor the airbyte-ci test command to run tests on any poetry package.
1.0.0#28000Remove release stages in favor of support level from airbyte-ci.
0.5.0#28000Run connector acceptance tests with dagger-in-dagger.
0.4.7#29156Improve how we check existence of requirements.txt or setup.py file to not raise early pip install errors.
0.4.6#28729Use keyword args instead of positional argument for optional parameter in Dagger's API
0.4.5#29034Disable Dagger terminal UI when running publish.
0.4.4#29064Make connector modified files a frozen set.
0.4.3#29033Disable dependency scanning for Java connectors.
0.4.2#29030Make report path always have the same prefix: airbyte-ci/.
0.4.1#28855Improve the selected connectors detection for connectors commands.
0.4.0#28947Show Dagger Cloud run URLs in CI
0.3.2#28789Do not consider empty reports as successful.
0.3.1#28938Handle 5 status code on MetadataUpload as skipped
0.3.0#28869Enable the Dagger terminal UI on local airbyte-ci execution
0.2.3#28907Make dagger-in-dagger work for airbyte-ci tests command
0.2.2#28897Sentry: Ignore error logs without exceptions from reporting
0.2.1#28767Improve pytest step result evaluation to prevent false negative/positive.
0.2.0#28857Add the airbyte-ci tests command to run the test suite on any airbyte-ci poetry package.
0.1.1#28858Increase the max duration of Connector Package install to 20mn.
0.1.0Alpha version not in production yet. All the commands described in this doc are available.

More info

This project is owned by the Connectors Operations team. We share project updates and remaining stories before its release to production in this EPIC.

Troubleshooting

Commands

make tools.airbyte-ci.check

This command checks if the airbyte-ci command is appropriately installed.

make tools.airbyte-ci.clean

This command removes the airbyte-ci command from your system.

Common issues

airbyte-ci is not found

If you get the following error when running airbyte-ci:

bash
$ airbyte-ci
zsh: command not found: airbyte-ci

It means that the airbyte-ci command is not in your PATH.

Try running the following command for hints on how to fix this:

bash
make tools.airbyte-ci.check

When in doubt, it can be best to run

bash
make tools.airbyte-ci.clean

Then reinstall the CLI with

bash
make tools.airbyte-ci.install

Development

airbyte-ci is not found

To fix this, you can either:

  • Ensure that airbyte-ci is installed with pipx. Run pipx list to check if airbyte-ci is installed.
  • Run pipx ensurepath to add the pipx binary directory to your PATH.
  • Add the pipx binary directory to your PATH manually. The pipx binary directory is usually ~/.local/bin.
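The PATH check described in the list above can be sketched as a small shell helper. This is only an illustration: `path_contains_dir` is a hypothetical function name, not part of airbyte-ci or pipx, and `~/.local/bin` is assumed to be the pipx binary directory as stated above.

```bash
# Hypothetical helper: return 0 if the directory given as $1 is listed in the
# PATH-like string given as $2 (defaults to the current $PATH).
path_contains_dir() {
  dir="$1"
  path_value="${2:-$PATH}"
  case ":${path_value}:" in
    *":${dir}:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: pipx installs binaries in ~/.local/bin (the usual location).
if path_contains_dir "$HOME/.local/bin"; then
  echo "pipx binary directory is in PATH"
else
  echo "pipx binary directory is missing from PATH; try: pipx ensurepath"
fi
```

If the second message is printed, running `pipx ensurepath` and opening a new shell is usually enough.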

python3.10 not found

If you get the following error when running pipx install --editable --force --python=python3.10 airbyte-ci/connectors/pipelines/:

bash
$ pipx install --editable --force --python=python3.10 airbyte-ci/connectors/pipelines/
Error: Python 3.10 not found on your system.

It means that you don't have Python 3.10 installed on your system.

To fix this, you can either:

  • Install Python 3.10 with pyenv. Run pyenv install 3.10 to install the latest Python 3.10 release.
  • Install Python 3.10 with your system package manager. For instance, on Ubuntu you can run sudo apt install python3.10.
  • Ensure that Python 3.10 is in your PATH. Run which python3.10 to check if Python 3.10 is installed and in your PATH.
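Before retrying the pipx install, it can help to confirm which interpreter the shell actually resolves. The sketch below is a minimal example; `find_interpreter` is a hypothetical helper name introduced here for illustration.

```bash
# Hypothetical helper: print the resolved path of the command named in $1,
# or return a non-zero status if it cannot be found in PATH.
find_interpreter() {
  command -v "$1" 2>/dev/null
}

if interpreter_path="$(find_interpreter python3.10)"; then
  echo "python3.10 found at: ${interpreter_path}"
else
  echo "python3.10 not found in PATH; install it with pyenv or your package manager"
fi
```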

Any type of pipeline failure

First you should check that the version of the CLI you are using is the latest one. You can check the version of the CLI with the --version option:

bash
$ airbyte-ci --version
airbyte-ci, version 0.1.0

and compare it with the version in the pyproject.toml file:

bash
$ grep version airbyte-ci/connectors/pipelines/pyproject.toml
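The comparison between the installed version and the one declared in pyproject.toml can also be scripted. The following is a rough sketch, not an official check. It assumes the `--version` output has the `airbyte-ci, version X.Y.Z` form shown above and that pyproject.toml contains a line starting with `version = "X.Y.Z"`. The helper names `installed_version` and `declared_version` are hypothetical.

```bash
# Hypothetical helper: read "airbyte-ci, version X.Y.Z" on stdin, print X.Y.Z.
installed_version() {
  sed -n 's/^airbyte-ci, version \(.*\)$/\1/p'
}

# Hypothetical helper: read a pyproject.toml on stdin and print the value of
# the first line of the form: version = "X.Y.Z"
declared_version() {
  sed -n 's/^version *= *"\([^"]*\)".*$/\1/p' | head -n 1
}

PYPROJECT="airbyte-ci/connectors/pipelines/pyproject.toml"

# Only compare when both the CLI and the file are available.
if command -v airbyte-ci >/dev/null 2>&1 && [ -f "$PYPROJECT" ]; then
  installed="$(airbyte-ci --version | installed_version)"
  declared="$(declared_version < "$PYPROJECT")"
  if [ "$installed" = "$declared" ]; then
    echo "airbyte-ci is up to date (${installed})"
  else
    echo "installed: ${installed}, declared in pyproject.toml: ${declared}"
  fi
else
  echo "run this from the root of the airbyte repository with airbyte-ci installed"
fi
```

A mismatch suggests the installed CLI is stale and should be updated or reinstalled before investigating the pipeline failure further.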

If you get any type of pipeline failure, you can run the pipeline with the --show-dagger-logs option to get more information about the failure.

bash
$ airbyte-ci --show-dagger-logs connectors --name=source-pokeapi test

and when in doubt, you can reinstall the CLI with the --force option:

bash
$ pipx reinstall pipelines --force