Back to Superset

Importing and Exporting Datasources

docs/admin_docs/configuration/importing-exporting-datasources.mdx

2021.41.04.1 KB
Original Source

Importing and Exporting Datasources

The superset cli allows you to import and export datasources from and to YAML. Datasources include databases. The data is expected to be organized in the following hierarchy:

text
├──databases
|  ├──database_1
|  |  ├──table_1
|  |  |  ├──columns
|  |  |  |  ├──column_1
|  |  |  |  ├──column_2
|  |  |  |  └──... (more columns)
|  |  |  └──metrics
|  |  |     ├──metric_1
|  |  |     ├──metric_2
|  |  |     └──... (more metrics)
|  |  └── ... (more tables)
|  └── ... (more databases)

Exporting Datasources to YAML

You can print your current datasources to stdout by running:

bash
superset export_datasources

To save your datasources to a ZIP file run:

bash
superset export_datasources -f <filename>

By default, default (null) values will be omitted. Use the -d flag to include them. If you want back references to be included (e.g. a column to include the table id it belongs to) use the -b flag.

Alternatively, you can export datasources using the UI:

  1. Open Sources -> Databases to export all tables associated to a single or multiple databases. (Tables for one or more tables)
  2. Select the items you would like to export.
  3. Click Actions -> Export to YAML
  4. If you want to import an item that you exported through the UI, you will need to nest it inside its parent element, e.g. a database needs to be nested under databases a table needs to be nested inside a database element.

In order to obtain an exhaustive list of all fields you can import using the YAML import run:

bash
superset export_datasource_schema

As a reminder, you can use the -b flag to include back references.

Importing Datasources

In order to import datasources from a ZIP file, run:

bash
superset import_datasources -p <path / filename>

The optional username flag -u sets the user used for the datasource import. The default is 'admin'. Example:

bash
superset import_datasources -p <path / filename> -u 'admin'

Legacy Importing Datasources

From older versions of Superset to current version

When using Superset version 4.x.x to import from an older version (2.x.x or 3.x.x) importing is supported as the command legacy_import_datasources and expects a JSON or directory of JSONs. The options are -r for recursive and -u for specifying a user. Example of legacy import without options:

bash
superset legacy_import_datasources -p <path or filename>

From older versions of Superset to older versions

When using an older Superset version (2.x.x & 3.x.x) of Superset, the command is import_datasources. ZIP and YAML files are supported and to switch between them the feature flag VERSIONED_EXPORT is used. When VERSIONED_EXPORT is True, import_datasources expects a ZIP file, otherwise YAML. Example:

bash
superset import_datasources -p <path or filename>

When VERSIONED_EXPORT is False, if you supply a path all files ending with yaml or yml will be parsed. You can apply additional flags (e.g. to search the supplied path recursively):

bash
superset import_datasources -p <path> -r

The sync flag -s takes parameters in order to sync the supplied elements with your file. Be careful this can delete the contents of your meta database. Example:

bash
superset import_datasources -p <path / filename> -s columns,metrics

This will sync all metrics and columns for all datasources found in the <path /filename> in the Superset meta database. This means columns and metrics not specified in YAML will be deleted. If you would add tables to columns,metrics those would be synchronised as well.

If you don’t supply the sync flag (-s) importing will only add and update (override) fields. E.g. you can add a verbose_name to the column ds in the table random_time_series from the example datasets by saving the following YAML to file and then running the import_datasources command.

yaml
databases:
- database_name: main
  tables:
  - table_name: random_time_series
    columns:
    - column_name: ds
      verbose_name: datetime