airbyte-ci/connectors/connectors_insights/README.md
Connectors Insights is a Python project that generates insights from analysis of our connector code. It uses Poetry for dependency management and packaging.
The project generates the following artifacts:
insights.json: Contains general insights and metadata about the connectors.
sbom.json: Contains the Software Bill of Materials (SBOM). Produced by Syft.
To install the project and its dependencies, ensure you have Poetry installed, then run:
poetry install
The Connectors Insights project provides a command-line interface (CLI) to generate the artifacts. Below is the command to run the CLI:
# From airbyte root directory
connectors-insights generate --output-directory <path-to-local-output-dir> --gcs-uri=gs://<bucket>/<key-prefix> --connector-directory airbyte-integrations/connectors/ --concurrency 2 --rewrite
generate: The command to generate the artifacts.
-o, --output-directory: Specifies the local directory where the generated artifacts will be saved. In this example, it is the placeholder <path-to-local-output-dir>.
-g, --gcs-uri: The Google Cloud Storage (GCS) URI prefix where the artifacts will be uploaded. In the form: gs://<bucket>/<key-prefix>.
-d, --connector-directory: The directory containing the connectors to be analyzed. In this example, it is airbyte-integrations/connectors/.
-c, --concurrency: Sets the level of concurrency for the generation process. In this example, it is set to 2.
--rewrite: If provided, artifacts that already exist are rewritten.
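The options above can also be assembled programmatically, for instance when calling the CLI from a script. The following is a minimal sketch, not part of the project: `build_generate_command` is a hypothetical helper, and `/tmp/insights` and `gs://my-bucket/connector-insights` are placeholder values. It only uses the flags shown in the command above and checks that the GCS URI has the documented `gs://<bucket>/<key-prefix>` form.

```python
from __future__ import annotations

import shlex
from urllib.parse import urlparse


def build_generate_command(
    output_directory: str,
    connector_directory: str = "airbyte-integrations/connectors/",
    gcs_uri: str | None = None,
    concurrency: int = 2,
    rewrite: bool = False,
) -> list[str]:
    """Assemble the argv list for `connectors-insights generate` (hypothetical helper)."""
    if concurrency < 1:
        raise ValueError("concurrency must be a positive integer")

    command = [
        "connectors-insights",
        "generate",
        "--output-directory", output_directory,
        "--connector-directory", connector_directory,
        "--concurrency", str(concurrency),
    ]

    if gcs_uri is not None:
        parsed = urlparse(gcs_uri)
        # Expect the documented form gs://<bucket>/<key-prefix>.
        if parsed.scheme != "gs" or not parsed.netloc:
            raise ValueError(f"expected gs://<bucket>/<key-prefix>, got {gcs_uri!r}")
        command.append(f"--gcs-uri={gcs_uri}")

    if rewrite:
        command.append("--rewrite")

    return command


# Placeholder values; replace them with your own directory and bucket.
command = build_generate_command(
    output_directory="/tmp/insights",
    gcs_uri="gs://my-bucket/connector-insights",
    rewrite=True,
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the airbyte root directory, assuming the `connectors-insights` executable is on the `PATH` after `poetry install`.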
To generate the artifacts and save them both locally and to GCS, you can use the following command:
connectors-insights generate --output-directory <path-to-local-output-dir> --gcs-uri=gs://<bucket>/<key-prefix> --connector-directory airbyte-integrations/connectors/ --concurrency 2 --rewrite
This command will generate insights.json and sbom.json files, saving them to the specified local directory and uploading them to the specified GCS URI if --gcs-uri is passed.
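Once the artifacts are on disk, they can be inspected with a few lines of Python. This is a minimal sketch under stated assumptions: the layout of the output directory and the schema of insights.json are not described here, so the hypothetical `load_insights` helper searches recursively for any file named insights.json and returns the parsed JSON as-is. The demo writes a made-up file (`source-example`, `example_key`) to a temporary directory only so the snippet runs on its own.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Any


def load_insights(output_directory: str) -> dict[str, Any]:
    """Map the path of each insights.json under output_directory to its parsed content."""
    results: dict[str, Any] = {}
    # rglob is used because the directory layout is an unknown here.
    for path in sorted(Path(output_directory).rglob("insights.json")):
        with path.open(encoding="utf-8") as handle:
            results[str(path)] = json.load(handle)
    return results


# Demo with fabricated placeholder data, not real connector insights.
with tempfile.TemporaryDirectory() as tmp:
    fake_file = Path(tmp) / "source-example" / "insights.json"
    fake_file.parent.mkdir(parents=True)
    fake_file.write_text(json.dumps({"example_key": "example_value"}), encoding="utf-8")

    loaded = load_insights(tmp)
    print(f"found {len(loaded)} insights file(s)")
```

In practice, point `load_insights` at the directory passed to `--output-directory`.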
This CLI currently runs nightly in GitHub Actions. The workflow can be found in .github/workflows/connector_insights.yml.
Update Python version requirement from 3.10 to 3.11.
Fix permissions issue when installing pylint in connector container.
Update dagger to 0.13.3.
Use SBOM from the connector registry (SPDX format) instead of generating SBOM in the connector insights.
Bugfix: Ignore CI on master report if it's not accessible.
Skip manifest inferred insights when the connector does not have a manifest.yaml file.
Add manifest_uses_parameters, manifest_uses_custom_components, and manifest_custom_components_classes insights.
Do not generate insights for *-scaffold-* and *-strict-encrypt connectors.
Share .docker/config.json with syft to benefit from increased DockerHub rate limit.