documentation/integrations/other/airbyte.md
Airbyte is an open-source ETL platform designed to help you sync data from a wide range of sources into your desired destinations. With its extensive library of connectors, Airbyte allows users to build scalable data pipelines effortlessly. This guide will walk you through the process of integrating Airbyte with QuestDB, enabling efficient storage and querying of data in a high-performance time-series database.
Before you begin, ensure you have the following:
Airbyte uses a user-friendly interface for configuration, allowing you to easily set up your sources and destinations. Once Airbyte is running, navigate to the Airbyte dashboard at http://localhost:8000.
localhost or your QuestDB server IP8812adminAirbyte allows you to test your configuration after entering the details. Click on Test Connection to ensure that the connection to the source and destination is successful. Once confirmed, you can proceed to create a connection between the source and destination.
In this section, we will demonstrate how to set up Airbyte to extract data from a PostgreSQL source and send it to QuestDB.
{
"sourceType": "postgresql",
"configuration": {
"host": "your_postgres_host",
"port": 5432,
"database": "your_database",
"username": "your_username",
"password": "your_password",
"ssl": false,
"table_name": "your_table"
}
}
SELECT * FROM your_table LIMIT 10;
After configuring your PostgreSQL source, the next step is to configure the destination to send the extracted data to QuestDB.
{
"destinationType": "questdb",
"configuration": {
"host": "localhost",
"port": 8812,
"database": "your_database",
"username": "admin",
"password": "",
"table": "your_table",
"batch_size": 1000
}
}
[
{
"column1": "value1",
"column2": 123,
"column3": "2023-10-19T12:00:00Z"
},
{
"column1": "value2",
"column2": 456,
"column3": "2023-10-19T12:05:00Z"
}
]
Once you have both the source and destination configured, you can create a connection between them.
{
"sourceId": "your_postgres_source_id",
"destinationId": "your_questdb_destination_id",
"syncMode": "full_refresh",
"schedule": {
"units": 1,
"timeUnit": "hours"
}
}
Once the sync process is initiated, you can monitor its status directly in the Airbyte dashboard.
{
"status": "COMPLETED",
"records_transferred": 150,
"start_time": "2023-10-19T12:00:00Z",
"end_time": "2023-10-19T12:10:00Z",
"errors": []
}
If you encounter issues during the sync process, consider the following debugging steps:
Access the QuestDB Web Console:
Navigate to the QuestDB Web Console at http://127.0.0.1:9000/. Once you're on the console, check the Schema Navigator in the top left. You should see two new tables:
cpu (generated from inputs.cpu)mem (generated from inputs.mem)Run a Query on the cpu Table:
To verify that data is being correctly written to the cpu table, type the following query in the editor and click RUN:
SELECT * FROM cpu;
Inspect the Results: After running the query, the results should display columns for each metric collected by the Telegraf plugin for monitoring CPU usage, such as:
usage_idleusage_userusage_systemusage_iowaitHere's an example of what the results may look like:
| timestamp | usage_idle | usage_user | usage_system | usage_iowait |
|----------------------|------------|------------|--------------|--------------|
| 2024-10-19T12:00:00Z | 60 | 30 | 10 | 0 |
| 2024-10-19T12:05:00Z | 58 | 31 | 9 | 2 |
mem Table:
Similarly, you can check the mem table by running the following query:SELECT * FROM mem;
This will display memory usage statistics collected by Telegraf, which might include:
totalavailableusedfree| timestamp | total | available | used | free |
|----------------------|------------|-----------|-------------|-------------|
| 2024-10-19T12:00:00Z | 8000 | 3000 | 4000 | 1000 |
| 2024-10-19T12:05:00Z | 8000 | 2900 | 4100 | 1000 |
To ensure a smooth and efficient integration between Airbyte and QuestDB, consider the following best practices:
# Example Airbyte configuration for incremental sync
sync_mode: incremental
-- Example SQL for creating a table in QuestDB
CREATE TABLE my_table (
id INT,
name STRING,
created_at TIMESTAMP
) timestamp(created_at);
# Example Airbyte connection configuration with retries
retries: 3
Monitor Resource Utilization
Query Optimization
-- Example SQL for creating an index
CREATE INDEX ON my_table (name);
-- Example SQL for counting records after load
SELECT COUNT(*) FROM my_table;
If you do not see the expected tables or data in QuestDB:
This guide outlines the integration of Airbyte with QuestDB, enabling seamless data ingestion from various sources into QuestDB for efficient querying and analytics. Key topics covered include:
For further details and resources, refer to the following links: