In CVAT, you can use Amazon S3, Azure Blob Storage, Backblaze B2, and Google Cloud Storage to import and export image datasets for your tasks.
To create a bucket, do the following:
Create an AWS account.
Go to the Amazon S3 console, and select Create bucket.
Specify the name and region of the bucket. You can also copy the settings of another bucket by selecting Choose bucket.
Enable Block all public access. To access the bucket, you will use an access key ID and a secret access key.
Select Create bucket.
A new bucket will appear on the list of buckets.
{{% alert title="Note" color="primary" %}} The manifest file is optional. {{% /alert %}}
You need to upload data for annotation and the manifest.jsonl file.
Prepare data. For more information, refer to how to prepare the dataset.
Open the bucket and select Upload.
Drag the manifest file and the image folder onto the page and select Upload.
To add access permissions, do the following:
Go to IAM and select Add users.
Set User name and enable Access key - programmatic access.
Select Next: Permissions.
Select Create group, enter the group name.
Use search to find and select:
(Optional) Add tags for the user and go to the next page.
Save Access key ID and Secret access key.
For more information, consult Creating an IAM user in your AWS account.
To learn how to grant public access to the bucket, consult Configuring block public access settings for your S3 buckets.
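
If you would rather scope access to a single bucket than rely on an account-wide policy, a custom IAM policy is one option. The sketch below builds such a policy document in Python. The function name `read_only_bucket_policy` and the bucket name are hypothetical, and the two actions shown are only a plausible minimum for read-only use; your setup may need more.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading is granted on the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-cvat-bucket" is a placeholder bucket name.
    print(json.dumps(read_only_bucket_policy("my-cvat-bucket"), indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the group in place of a broader policy.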
To attach storage, do the following:
Fill in the following fields:
| CVAT | Amazon S3 |
|---|---|
| Display name | Preferred display name for your storage. |
| Description | (Optional) Add a description of the storage. |
| Provider | From the drop-down list, select Amazon S3. |
| Bucket name | Name of the Bucket. |
| Authentication type | Depends on the bucket setup: |
After filling in all the fields, select Submit.
{{% alert title="Note" color="primary" %}} The manifest file is optional. {{% /alert %}}
To prepare the manifest file, do the following:
You can configure credentials by running `aws configure`. You will need to enter the Access Key ID, the Secret Access Key, and the region:

    aws configure
    Access Key ID: <your Access Key ID>
    Secret Access Key: <your Secret Access Key>

Copy the content of the bucket to a folder on your computer:

    aws s3 cp <s3://bucket-name> <yourfolder> --recursive

After copying the files, you can create a manifest file as described in {{< ilink "/docs/dataset_management/dataset_manifest" "prepare manifest file section" >}}:

    python <cvat repository>/utils/dataset_manifest/create.py --output-dir <yourfolder> <yourfolder>

When the manifest file is ready, upload it to the S3 bucket:
If you granted the user read and write permissions, run:

    aws s3 cp <yourfolder>/manifest.jsonl <s3://bucket-name>

If the user has read-only permissions, upload the file through the browser instead: open the bucket, select Upload, drag the manifest file onto the page, and select Upload.
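
The download, manifest, and upload steps above can be collected in one place. The following is a minimal sketch that only builds and prints the three commands; it does not run them. The function name `manifest_workflow_commands` and the example arguments are hypothetical placeholders.

```python
import shlex
from typing import List


def manifest_workflow_commands(bucket: str, folder: str, cvat_repo: str) -> List[List[str]]:
    """Return the three commands from the steps above as argument lists."""
    manifest_script = f"{cvat_repo}/utils/dataset_manifest/create.py"
    return [
        # 1. Copy the bucket content to a local folder.
        ["aws", "s3", "cp", f"s3://{bucket}", folder, "--recursive"],
        # 2. Create the manifest next to the downloaded data.
        ["python", manifest_script, "--output-dir", folder, folder],
        # 3. Upload the manifest (needs write permissions on the bucket).
        ["aws", "s3", "cp", f"{folder}/manifest.jsonl", f"s3://{bucket}"],
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own bucket, folder, and repo path.
    for cmd in manifest_workflow_commands("bucket-name", "yourfolder", "cvat"):
        print(shlex.join(cmd))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` later without shell-quoting problems.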
Backblaze B2 is an S3-compatible cloud storage service. It can be used in CVAT by selecting Amazon S3 as the provider and specifying the Backblaze B2 endpoint (for example, https://s3.us-west-004.backblazeb2.com).
To create a B2 bucket, do the following:
The new bucket will appear in your buckets list.
{{% alert title="Note" color="primary" %}} The manifest file is optional. {{% /alert %}}
You need to upload data for annotation and optionally the manifest.jsonl file.
Alternatively, you can use the Backblaze CLI or any S3-compatible tool like the AWS CLI with B2 endpoints.
To access your B2 bucket from CVAT, you need to create Application Keys:
{{% alert title="Warning" color="warning" %}} Store your applicationKey securely. It will only be displayed once during creation and cannot be recovered later. {{% /alert %}}
For more information, consult B2 Application Keys.
To attach B2 storage to CVAT, do the following:
Fill in the following fields:
| CVAT field | Backblaze B2 value |
|---|---|
| Display name | Preferred display name for your storage. |
| Description | (Optional) Add a description of the storage. |
| Provider | From the drop-down list, select Amazon S3 (Backblaze B2 is S3-compatible). |
| Bucket name | Name of your B2 bucket. |
| Authentication type | Select Key ID and secret access key pair. |
| Access key ID | Enter the keyID from your B2 Application Key. |
| Secret access key | Enter the applicationKey from your B2 Application Key. |
| Endpoint URL | Required for B2: Enter your B2 S3 endpoint URL (for example, https://s3.us-west-004.backblazeb2.com). You can find the endpoint in your bucket details page. |
| Prefix | (Optional) Use to limit CVAT to a specific folder within the bucket. |
| Manifests | (Optional) Select + Add manifest and specify a manifest file name such as manifest.jsonl. |
{{% alert title="Important" color="primary" %}}
When using Backblaze B2, you must specify the Endpoint URL field
with your B2 S3-compatible endpoint (e.g., https://s3.us-west-004.backblazeb2.com).
This tells CVAT to connect to Backblaze instead of Amazon S3.
{{% /alert %}}
After filling in all the fields, select Submit.
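
Because a missing or mistyped endpoint is the easiest mistake to make with B2, it can help to check the values before entering them. The sketch below is a hypothetical helper, not part of CVAT: the class `B2StorageSettings` mirrors the form fields from the table above, and the host check assumes endpoints of the form `s3.<region>.backblazeb2.com`, as in the example URL.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class B2StorageSettings:
    """Hypothetical container mirroring the CVAT form fields for B2."""

    display_name: str
    bucket_name: str
    access_key_id: str       # keyID from the B2 Application Key
    secret_access_key: str   # applicationKey from the B2 Application Key
    endpoint_url: str        # required for B2
    prefix: str = ""         # optional
    manifest: str = ""       # optional, for example manifest.jsonl

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means nothing was found."""
        issues = []
        parsed = urlparse(self.endpoint_url)
        host = parsed.hostname or ""
        if parsed.scheme != "https":
            issues.append("endpoint URL should start with https://")
        if not (host.startswith("s3.") and host.endswith(".backblazeb2.com")):
            issues.append("endpoint host does not look like s3.<region>.backblazeb2.com")
        for name in ("display_name", "bucket_name", "access_key_id", "secret_access_key"):
            if not getattr(self, name):
                issues.append(f"{name} is empty")
        return issues


if __name__ == "__main__":
    settings = B2StorageSettings(
        display_name="My B2 storage",
        bucket_name="my-bucket",
        access_key_id="<keyID>",
        secret_access_key="<applicationKey>",
        endpoint_url="https://s3.us-west-004.backblazeb2.com",
    )
    print(settings.problems())
```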
To create a bucket, do the following:
Create a Google account and log in to it.
On the Google Cloud page, select Start Free, then enter the required data and accept the terms of service.
{{% alert title="Note" color="primary" %}} Google requires a payment method, so you will need a bank card to complete step 2. {{% /alert %}}
Create a Bucket with the following parameters:
- Set a default class > Standard.
- Enforce public access prevention on this bucket > Uniform (default).
- None.

You will be forwarded to the bucket.
{{% alert title="Note" color="primary" %}} The manifest file is optional. {{% /alert %}}
You need to upload data for annotation and the manifest.jsonl file.
To access Google Cloud Storage, get the Project ID from the cloud resource manager page, then follow the instructions below for your preferred type of access.
For authenticated access you need to create a service account and key file.
To create a service account:
To create a key:
For more information about keys, consult Learn more about creating keys.
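
Before uploading a downloaded key file to CVAT, you may want to confirm that it looks like a service account key. The sketch below is a hypothetical sanity check: it only verifies that the text is JSON and that a few fields commonly present in service account key files exist. It does not validate the key with Google.

```python
import json

# Fields assumed to be present in a service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text: str) -> list:
    """Return a list of problems found in the JSON text of a key file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append("field 'type' is not 'service_account'")
    return problems


if __name__ == "__main__":
    # "key.json" is a placeholder path to the downloaded key file.
    with open("key.json", encoding="utf-8") as handle:
        print(check_service_account_key(handle.read()) or "looks OK")
```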
To configure anonymous access:
allUsers, select roles: Cloud Storage Legacy > Storage Legacy Bucket Reader.

Now you can attach the Google Cloud Storage bucket to CVAT.
To attach storage, do the following:
Fill in the following fields:
| CVAT | Google Cloud Storage |
|---|---|
| Display name | Preferred display name for your storage. |
| Description | (Optional) Add a description of the storage. |
| Provider | From the drop-down list, select Google Cloud Storage. |
| Bucket name | Name of the bucket. You can find it on the storage browser page. |
| Authentication type | Depends on the bucket setup: |
After filling in all the fields, select Submit.
To create a bucket, do the following:
Create a Microsoft Azure account and log in to it.
Go to the Azure portal, hover over the resource, and in the pop-up window, select Create.
Enter a name for the group and select Review + create, then check the entered data and select Create.
Go to the resource groups page, navigate to the group that you created and select Create resources.
On the marketplace page, use search to find Storage account.
Select Storage account and, on the next page, select Create.
On the Basics tab, fill in the following fields:
On the Advanced page, fill in the following fields:
On the Networking tab, fill in the following fields:
If you want to change public access, enable Public access from all networks.
Select Next>Data protection.
You do not need to change anything in the other tabs unless you need a specific setup.
Select Review and wait for the data to load.
Select Create. Deployment will start.
After deployment is over, select Go to resource.
To create a container, do the following:
Go to the containers section and, on the top menu, select +Container.
Enter the name of the container.
(Optional) In the Public access level drop-down, select the access type.
Note: this field will be inactive if you disabled Allow enabling public access on containers.
You need to upload data for annotation and the manifest.jsonl file.
Prepare data. For more information, refer to how to prepare the dataset.
Go to the container and select Upload.
Select Browse for files and select images.
{{% alert title="Note" color="primary" %}} If the images are in a folder, specify the folder in Advanced settings > Upload to folder. {{% /alert %}}
Select Upload.
Use the SAS token or connection string to grant secure access to the container.
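
A connection string bundles several values into one line of `key=value` pairs separated by semicolons. If you only have a connection string, the following sketch shows one way to pull out individual parts, such as the account name. The function name `parse_connection_string` and the sample string are hypothetical; the sample key is not a real credential.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split an Azure Storage connection string into its key/value parts."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: keys and SAS values may contain "=".
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = value
    return parts


if __name__ == "__main__":
    sample = (
        "DefaultEndpointsProtocol=https;AccountName=demoaccount;"
        "AccountKey=abc123==;EndpointSuffix=core.windows.net"
    )
    print(parse_connection_string(sample)["AccountName"])
```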
To configure the credentials:
For personal use, you can use the Access Key from your storage account in the CVAT SAS Token field.
To get the Access Key:
To attach storage, do the following:
Fill in the following fields:
| CVAT | Azure |
|---|---|
| Display name | Preferred display name for your storage. |
| Description | (Optional) Add a description of the storage. |
| Provider | From the drop-down list, select Azure Blob Storage. |
| Container name | Name of the cloud storage container. |
| Authentication type | Depends on the container setup. <br>Account name and SAS token: <ul><li>Account name: enter the storage account name.</li><li>SAS token: located in the Shared access signature section of your storage account.</li></ul>Anonymous access: Allow enabling public access on containers must be enabled. |
| Prefix | (Optional) Used to filter data from the bucket. By setting a default prefix, you ensure that only data from a specific folder in the cloud is used in CVAT. This will affect which files you see when creating a task with cloud data. |
| Manifests | (Optional) Select + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl. |
After filling in all the fields, select Submit.
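
To illustrate what the Prefix field does, the sketch below filters a list of object names the way a prefix would. It only demonstrates the effect and is not CVAT's implementation; the function name `visible_keys` and the file names are hypothetical.

```python
def visible_keys(keys, prefix=""):
    """Return the object names that start with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


if __name__ == "__main__":
    container_content = [
        "images/cat_001.jpg",
        "images/dog_001.jpg",
        "archive/old_001.jpg",
        "manifest.jsonl",
    ]
    # With the prefix "images/", only the first two files remain visible.
    print(visible_keys(container_content, "images/"))
```

An empty prefix leaves every file visible, which matches leaving the field blank.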
For example, to create a manifest for The Oxford-IIIT Pet Dataset, run:

    python <cvat repository>/utils/dataset_manifest/create.py --output-dir <your_folder> <your_folder>

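
After the script finishes, a quick check that the resulting file is well-formed JSON Lines can catch truncated or hand-edited files before upload. This is a minimal sketch with a hypothetical function name, `check_jsonl`; it confirms only that each non-empty line parses as JSON and does not validate CVAT's manifest schema.

```python
import json


def check_jsonl(text: str) -> list:
    """Return (line_number, error) pairs for lines that are not valid JSON."""
    errors = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, str(exc)))
    return errors


if __name__ == "__main__":
    # "manifest.jsonl" is assumed to be in the current directory.
    with open("manifest.jsonl", encoding="utf-8") as handle:
        problems = check_jsonl(handle.read())
    print(problems or "every line is valid JSON")
```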