docs/integrations/data-integrations/ckan.mdx
This handler facilitates integration with CKAN, an open-source data catalog platform for managing and publishing open data. CKAN organizes datasets and stores data in its DataStore. To retrieve data from CKAN, the CKAN API must be used.
Before proceeding, ensure the following prerequisites are met:
The CKAN handler is included with MindsDB by default, so no additional installation is required.
To use the CKAN handler, provide the URL of the CKAN instance you want to connect to. You can do this by setting the CKAN_URL environment variable. The example below instead passes the URL directly as the url parameter when creating the data source:
CREATE DATABASE ckan_datasource
WITH ENGINE = 'ckan',
PARAMETERS = {
"url": "https://your-ckan-instance-url.com",
"api_key": "your-api-key-if-required"
};
NOTE: Some CKAN instances require an API token. You can create one in the CKAN user panel and pass it as the api_key parameter.
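For orientation, the sketch below shows how a client of the CKAN Action API commonly attaches such a token: it is sent in the Authorization header of each request. This is not the handler's implementation. The helper name build_ckan_request and the placeholder URL, resource ID, and key are assumptions for illustration, and the request is only constructed, never sent.

```python
import urllib.parse
import urllib.request


def build_ckan_request(base_url, action, params=None, api_key=None):
    """Build (but do not send) a GET request for a CKAN Action API call.

    Hypothetical helper for illustration; not part of MindsDB or CKAN.
    """
    # CKAN exposes its actions under /api/3/action/<action_name>.
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, method="GET")
    # Attach the token only when one was supplied.
    if api_key:
        request.add_header("Authorization", api_key)
    return request


# Placeholder values mirroring the CREATE DATABASE example above.
request = build_ckan_request(
    "https://your-ckan-instance-url.com",
    "datastore_search",
    params={"resource_id": "your-resource-id", "limit": 5},
    api_key="your-api-key-if-required",
)
print(request.full_url)
```

Leaving api_key unset produces a request without the Authorization header, which corresponds to omitting the key for instances that do not require one.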
The CKAN handler provides three main tables:
- datasets: Lists all datasets in the CKAN instance.
- resources: Lists metadata for all resources across all packages.
- datastore: Allows querying individual datastore resources.

List all datasets:
SELECT * FROM `your-datasource`.datasets;
List all resources:
SELECT * FROM `your-datasource`.resources;
Query a specific datastore resource:
SELECT * FROM `your-datasource`.datastore WHERE resource_id = 'your-resource-id-here';
Replace your-resource-id-here with the actual resource ID you want to query.
The CKAN handler supports automatic pagination when querying datastore resources. This allows you to retrieve large datasets without worrying about API limits.
You can still use the LIMIT clause to cap the number of rows returned by the query. For example:
SELECT * FROM ckan_datasource.datastore
WHERE resource_id = 'your-resource-id-here'
LIMIT 1000;
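How the handler paginates internally is not described here. As a rough illustration of the general technique, the sketch below pages through a resource with the limit and offset parameters that CKAN's datastore_search action accepts, and stops early when a row cap (analogous to LIMIT) is reached. The names fetch_all_records, fetch_page, and fake_fetch_page are hypothetical, and the fake page source stands in for real API calls so the example runs offline.

```python
def fetch_all_records(fetch_page, resource_id, page_size=100, max_rows=None):
    """Collect records page by page using limit/offset.

    fetch_page(resource_id, limit, offset) is a hypothetical callable that
    returns the "result" object of a datastore_search response, i.e. a dict
    containing a "records" list.
    """
    records = []
    offset = 0
    while max_rows is None or len(records) < max_rows:
        # Never request more rows than are still needed.
        if max_rows is None:
            limit = page_size
        else:
            limit = min(page_size, max_rows - len(records))
        page = fetch_page(resource_id, limit, offset)["records"]
        records.extend(page)
        offset += len(page)
        # A short (or empty) page means the resource is exhausted.
        if len(page) < limit:
            break
    return records


# Stand-in for a real CKAN call so the sketch runs without network access.
FAKE_ROWS = [{"_id": i} for i in range(1, 251)]


def fake_fetch_page(resource_id, limit, offset):
    return {"records": FAKE_ROWS[offset:offset + limit], "total": len(FAKE_ROWS)}


all_rows = fetch_all_records(fake_fetch_page, "your-resource-id-here")
limited_rows = fetch_all_records(
    fake_fetch_page, "your-resource-id-here", max_rows=120
)
print(len(all_rows), len(limited_rows))
```

Passing the page source in as a function keeps the loop independent of any particular HTTP client, so the same logic can be exercised against a real instance or a test double.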