Data Import And Export Jsw9 - Prisma1

import Warning from 'components/Markdown/Warning'

export const meta = { title: "Data Import & Export", position: 40, }

Data import

Data to be imported needs to adhere to the Normalized Data Format (NDF). As of today, the conversion from any concrete data source (like MySQL, MongoDB or Firebase) to NDF must be performed manually. In the future, the Prisma CLI will support importing from these data sources directly.

Here is a general overview of the data import process:

+--------------+                    +----------------+                       +------------+
| +--------------+                  |                |                       |            |
| |            | |                  |                |                       |            |
| | SQL        | |  (1) transform   |      NDF       |  (2) chunked upload   |   Prisma   |
| | MongoDB    | | +--------------> |                | +-------------------> |            |
| | JSON       | |                  |                |                       |            |
| |            | |                  |                |                       |            |
+--------------+ |                  +----------------+                       +------------+
  +--------------+

As mentioned above, step 1 has to be performed manually. Step 2 can then be done by either using the raw import API or the prisma import command from the CLI.

To view the current state of supported transformations in the CLI and submit a vote for the one you need, you can check out this GitHub issue.

When uploading files in NDF, you need to provide the import data split across three different kinds of files:

nodes: Data for individual nodes (i.e. databases records)
lists: Data for a list of nodes
relations: Data for related nodes

You can upload an unlimited number of files for each of these types, but it's recommended that each file should be at most 1 MB large. Otherwise you might run into timeouts.

Important considerations

General constraints

When importing data, the id field of a node can be at most 25 characters long.

Idempotency

Note that import operations are not idempotent. This means running an import always adds data to your service. It never updates existing nodes. This means importing the same dataset multiple times will lead to undefined behaviour.

For example, importing a node with the same id more than once will lead to undefined behaviour and likely break your service!

Data Validation

The import API does not perform any validation checks on the data to be imported. When using the CLI to import data, basic validation checks are executed.

Importing invalid data leads to undefined behaviour and might break your service! As the service maintainer, you are responsible to ensure the validity of the imported data.

Tip: A good way to ensure valid data when importing is inspecting the data of a previous export on a service with an identical datamodel.

Data import with the CLI

The Prisma CLI offers the prisma import command. It accepts one option:

--data (short: -d): A file path to a directory containing the data to be imported (this can either be regular or a zipped directory)

Under the hood, the CLI uses the import API that's described in the next section. However, using the CLI provides some major benefits:

uploading multiple files at once (rather than having to upload each file individually)
leveraging the CLI's authentication mechanism (i.e. you don't need to manually send your authentication token)
ability to pause and resume an ongoing import
import from various data sources like MySQL, MongoDB or Firebase (not available yet)

Input format

When importing data using the CLI, the files containing the data in NDF need to be located in directories called after their type: nodes, lists and relations.

NDF files are JSON files following a specific structure, so each file containing import data needs to end on .json. When placed in their respective directories (nodes, lists or relations), the .json-files need to be numbered incrementally, starting with 1, e.g. 1.json. The file name can be prepended with any number of zeros, e.g. 01.json or 0000001.json.

Example

Consider the following file structure defining a Prisma service:

.
├── data
│   ├── lists
│   │   ├── 0001.json
│   │   ├── 0002.json
│   │   └── 0003.json
│   ├── nodes
│   │   ├── 0001.json
│   │   └── 0002.json
│   └── relations
│       └── 0001.json
├── datamodel.prisma
└── prisma.yml

data contains the files data to be imported. Further, all files ending on .json are adhering to NDF. To import the data from these files, you can simply run the following command in the terminal:

prisma import --data data

Data import using the raw import API

The raw import API is exposed under the /import path of your service's HTTP endpoint. For example:

http://localhost:4466/my-app/dev/import
https://eu1.prisma.sh/my-app/prod/import

One request can upload JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP Authorization header of the request!

Here is an example curl command for uploading some JSON data (of NDF type nodes):

curl 'http://localhost:4466/my-app/dev/import' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
-d '{"valueType":"nodes","values":[{"_typeName":"Model0","id":"0","a":"test","b":0,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model1","id":"1","a":"test","b":1},{"_typeName":"Model2","id":"2","a":"test","b":2,"createdAt":"2017-11-29 14:35:13"},{"_typeName":"Model0","id":"3","a":"test","b":3},{"_typeName":"Model3","id":"4","a":"test","b":4,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model3","id":"5","a":"test","b":5},{"_typeName":"Model3","id":"6","a":"test","b":6},{"_typeName":"Model4","id":"7"},{"_typeName":"Model4","id":"8","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"},{"_typeName":"Model5","id":"9","string":"test","int":4,"boolean":true,"dateTime":"1015-11-29 14:35:13","float":13.333,"createdAt":"2017-11-29 14:35:13","updatedAt":"2017-11-29 14:35:13"}]}' \
-sSv

The generic version for curl (using placeholders) would look as follows:

curl '__SERVICE_ENDPOINT__/import' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer __JWT_AUTH_TOKEN__' \
-d '{"valueType":"__NDF_TYPE__","values": __DATA__ }' \
-sSv

Data export

Exporting data can be done either using the CLI or the raw export API. In both cases, the downloaded data is formatted in JSON and adheres to the Normalized Data Format (NDF). As the exported data is in NDF, it can directly be imported into a service with an identical schema. This can be useful when test data is needed for a service, e.g. in a dev stage.

Data export with the CLI

The Prisma CLI offers the prisma export command. It accepts one option:

--export-path (short: -e): A file path to a .zip-directory which will be created by the CLI and where the exported data is stored

Under the hood, the CLI uses the export API that's described in the next section. However, using the CLI provides some major benefits:

leveraging the CLI's authentication mechanism (i.e. you don't need to manually send your authentication token)
writing downloaded data directly to file system
cursor management in case multiple requests are needed to export all application data (when doing this manually you need to send multiple requests and adjust the cursor upon each)

Output format

The data is exported in NDF and will be placed in three directories that are named after the different NDF types: nodes, lists and relations.

Data export using the raw export API

The raw export API is exposed under the /export path of your service's HTTP endpoint. For example:

http://localhost:4466/my-app/dev/export
https://database.prisma.sh/my-app/prod/export

One request can download JSON data (in NDF) of at most 10 MB in size. Note that you need to provide your authentication token in the HTTP Authorization header of the request!

The endpoint expects a POST request where the body contains JSON with the following contents:

json

{
  "fileType": "nodes",
  "cursor": {
    "table": 0,
    "row": 0,
    "field": 0,
    "array": 0
  }
}

The values in cursor describe the offsets in the database from where on data should be exported. Note that each response for an export request will return a new cursor with either of two states:

Terminated (not full): If all the values for table, row, field and array are returned as -1 it means the export has completed.
Non-terminated (_full): If any of the values for table, row, field or array is different from -1, it means the maximum size of 10 MB for this response has been reached. If this happens, you can use the returned cursor values as the input for your next export request.

Example

Here is an example curl command for uploading some JSON data (of NDF type nodes):

curl 'http://localhost:4466/my-app/dev/export' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJPbmxpbmUgSldUIEJ1aWxkZXIiLCJpYXQiOjE1MTM1OTQzMTEsImV4cCI6MTU0NTEzMDMxMSwiYXVkIjasd3d3LmV4YW1wbGUuY29tIiwic3ViIjoianJvY2tldEBleGFtcGxlLmNvbSIsIkdpdmVuTmFtZSI6IkpvaG5ueSIsIlN1cm5hbWUiOiJSb2NrZXQiLCJFbWFpbCI6Impyb2NrZXRAZXhhbXBsZS5jb20iLCJSb2xlIjpbIk1hbmFnZXIiLCJQcm9qZWN0IEFkbWluaXN0cmF0b3IiXX0.L7DwH7vIfTSmuwfxBI82D64DlgoLBLXOwR5iMjZ_7nI' \
-d '{"fileType":"nodes","cursor":{"table":0,"row":0,"field":0,"array":0}}' \
-sSv

The generic version for curl (using placeholders) would look as follows:

curl '__SERVICE_ENDPOINT__/export' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer __JWT_AUTH_TOKEN__' \
-d '{"fileType":"__NDF_TYPE__","cursor": {"table":__TABLE__,"row":__ROW__,"field":__FIELD__,"array":__ARRAY__}} }' \
-sSv

Normalized Data Format

The Normalized Data Format (NDF) is used as an intermediate data format for import and export in Prisma services. NDF describes a specific structure for JSON.

NDF value types

When using the NDF, data is split across three different "value types":

Nodes: Contains data for the scalar fields of nodes
Lists: Contains data for list fields of nodes
Relations: Contains data to connect two nodes via a relation by their relation fields

Structure

The structure for a JSON document in NDF is an object with the following two keys:

valueType: Indicates the value type of the data in the document (this can be either "nodes", "lists" or "relations")
values: Contains the actual data (adhering to the value type) as an array

The examples in the following are based on this datamodel:

graphql

type User {
  id: String! @unique
  firstName: String!
  lastName: String!
  hobbies: [String!]!
  partner: User
}

Nodes

In case the valueType is "nodes", the structure for the objects inside the values array is as follows:

{
  "valueType": "nodes",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarField1>": ANY, "<scalarField2>": ANY, ..., "<scalarFieldN>": ANY },
    ...
  ]
}

The notations expresses that the fields _typeName and id are of type string. _typeName refers to the name of the SDL type from your datamodel. The <scalarFieldX>-placeholders will be the names of the scalar fields of that SDL type.

For example, the following JSON document can be used to import the scalar values for two User nodes:

json

{
  "valueType": "nodes",
  "values": [
    {"_typeName": "User", "id": "johndoe", "firstName": "John", "lastName": "Doe"},
    {"_typeName": "User", "id": "sarahdoe", "firstName": "Sarah", "lastName": "Doe"}
  ]
}

Lists

In case the valueType is "lists", the structure for the objects inside the values array is as follows:

{
  "valueType": "lists",
  "values": [
    { "_typeName": STRING, "id": STRING, "<scalarListField>": [ANY] },
    ...
  ]
}

The notations expresses that the fields _typeName and id are of type string. _typeName refers to the name of the SDL type from your datamodel. The <scalarListField>-placeholder is the name of the of the list fields of that SDL type. Note that in contrast to the scalar list fields, each object can only values only for one field.

For example, the following JSON document can be used to import the values for the hobbies list field of two User nodes:

json

{
  "valueType": "lists",
  "values": [
    {"_typeName": "User", "id": "johndoe", "hobbies": ["Fishing", "Cooking"]},
    {"_typeName": "User", "id": "sarahdoe", "hobbies": ["Biking", "Coding"]}
  ]
}

Relations

In case the valueType is "relations", the structure for the objects inside the values array is as follows:

{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": STRING, "id": STRING, "fieldName": STRING },
      { "_typeName": STRING, "id": STRING, "fieldName": STRING }
    ],
    ...
  ]
}

The notations expresses that the fields _typeName, id and fieldName are of type string.

_typeName refers to a name of an SDL type from your datamodel. The <relationField>-placeholder is the name of the of the relation field of that SDL type. Since the goal of the relation data is to connect two nodes via a relation, each element inside the values array by itself is a pair (written as an array which always contains exactly two elements) rather than a single object as was the case for "nodes" and "lists".

For example, the following JSON document can be used to create a relation between two User nodes via the partner relation field:

json

{
  "valueType": "relations",
  "values": [
    [
      { "_typeName": "User", "id": "johndoe", "fieldName": "partner" },
      { "_typeName": "User", "id": "sarahdoe", "fieldName": "partner" }
    ]
  ]
}