Blob

Blobs are a common abstraction for storing unstructured data on Cloud storage services and accessing it via HTTP. This guide shows how to work with blobs in the Go CDK.

<!--more-->

The blob package supports operations like reading and writing blobs (using standard io package interfaces), deleting blobs, and listing blobs in a bucket.

Subpackages contain driver implementations of blob for various services, including Cloud and on-prem solutions. You can develop your application locally using fileblob, then deploy it to multiple Cloud providers with minimal initialization reconfiguration.

Opening a Bucket {#opening}

The first step in interacting with unstructured storage is to instantiate a portable *blob.Bucket for your storage service.

The easiest way to do so is to use blob.OpenBucket and a service-specific URL pointing to the bucket, making sure you "blank import" the driver package to link it in.

```go
import (
	"gocloud.dev/blob"
	_ "gocloud.dev/blob/<driver>"
)
...
bucket, err := blob.OpenBucket(context.Background(), "<driver-url>")
if err != nil {
	return fmt.Errorf("could not open bucket: %v", err)
}
defer bucket.Close()
// bucket is a *blob.Bucket; see usage below
...
```

See [Concepts: URLs][] for general background and the [guide below][] for URL usage for each supported service.

Alternatively, if you need fine-grained control over the connection settings, you can call the constructor function in the driver package directly.

```go
import "gocloud.dev/blob/<driver>"
...
bucket, err := <driver>.OpenBucket(...)
...
```

You may find the wire package useful for managing your initialization code when switching between different backing services.

See the [guide below][] for constructor usage for each supported service.

[Concepts: URLs]: {{< ref "/concepts/urls.md" >}}
[guide below]: {{< ref "#services" >}}

Prefixed Buckets {#prefix}

You can wrap a *blob.Bucket to always operate on a subfolder of the bucket using blob.PrefixedBucket:

{{< goexample "gocloud.dev/blob.ExamplePrefixedBucket" >}}

Alternatively, you can configure the prefix directly in the blob.OpenBucket URL:

{{< goexample "gocloud.dev/blob.Example_openFromURLWithPrefix" >}}

Single Key Buckets {#singlekey}

You can wrap a *blob.Bucket to always operate on a single key using blob.SingleKeyBucket:

{{< goexample "gocloud.dev/blob.ExampleSingleKeyBucket" >}}

Alternatively, you can configure the single key directly in the blob.OpenBucket URL:

{{< goexample "gocloud.dev/blob.Example_openFromURLWithSingleKey" >}}

The resulting bucket will ignore the key parameter to its functions, and always refer to the single key. This can be useful to allow configuration of a specific "file" via a single URL.

List functions will not work on single key buckets.

Using a Bucket {#using}

Once you have opened a bucket for the storage provider you want, you can store and access data from it using the standard Go I/O patterns described below. Other operations like listing and reading metadata are documented in the blob package documentation.

Writing Data to a Bucket {#writing}

To write data to a bucket, you create a writer, write data to it, and then close the writer. Closing the writer commits the write to the provider, flushing any buffers, and releases any resources used while writing, so you must always check the error of Close.

The writer implements io.Writer, so you can use any functions that take an io.Writer like io.Copy or fmt.Fprintln.

{{< goexample src="gocloud.dev/blob.ExampleBucket_NewWriter" imports="0" >}}

In some cases, you may want to cancel an in-progress write to avoid the blob being created or overwritten. A typical reason for wanting to cancel a write is encountering an error in the stream your program is copying from. To abort a write, you cancel the Context you pass to the writer. Again, you must always Close the writer to release the resources, but in this case you can ignore the error because the write's failure is expected.

{{< goexample src="gocloud.dev/blob.ExampleBucket_NewWriter_cancel" imports="0" >}}

Reading Data from a Bucket {#reading}

Once you have written data to a bucket, you can read it back by creating a reader. The reader implements io.Reader, so you can use any functions that take an io.Reader like io.Copy or io.ReadAll. You must always close a reader after using it to avoid leaking resources.

{{< goexample src="gocloud.dev/blob.ExampleBucket_NewReader" imports="0" >}}

Many storage providers provide efficient random-access to data in buckets. To start reading from an arbitrary offset in the blob, use NewRangeReader.

{{< goexample src="gocloud.dev/blob.ExampleBucket_NewRangeReader" imports="0" >}}

Deleting Blobs {#deleting}

You can delete blobs using the Bucket.Delete method.

{{< goexample src="gocloud.dev/blob.ExampleBucket_Delete" imports="0" >}}

Other Usage Samples

Supported Storage Services {#services}

Google Cloud Storage {#gcs}

Google Cloud Storage (GCS) URLs in the Go CDK closely resemble the URLs you would see in the gsutil CLI.

blob.OpenBucket will use Application Default Credentials; if you have authenticated via gcloud auth application-default login, it will use those credentials. See Application Default Credentials to learn about authentication alternatives, including using environment variables.

{{< goexample "gocloud.dev/blob/gcsblob.Example_openBucketFromURL" >}}

Full details about acceptable URLs can be found under the API reference for gcsblob.URLOpener.

GCS Constructor {#gcs-ctor}

The gcsblob.OpenBucket constructor opens a GCS bucket. You must first create a *gcp.HTTPClient, an HTTP client that sends requests authorized by Google Cloud Platform credentials. (You can reuse the same client for any other API that takes in a *gcp.HTTPClient.) You can find functions in the gocloud.dev/gcp package to set this up for you.

{{< goexample "gocloud.dev/blob/gcsblob.ExampleOpenBucket" >}}

S3 {#s3}

S3 URLs in the Go CDK closely resemble the URLs you would see in the AWS CLI. You should specify the region query parameter to ensure your application connects to the correct region.

blob.OpenBucket will create an AWS Config using the AWS SDK V2; see AWS V2 Config to learn more.

Full details about acceptable URLs can be found under the API reference for s3blob.URLOpener.

{{< goexample "gocloud.dev/blob/s3blob.Example_openBucketFromURL" >}}

S3 Constructor {#s3-ctor}

The s3blob.OpenBucket constructor opens an S3 bucket. You must first create an AWS Config with the same region as your bucket:

{{< goexample "gocloud.dev/blob/s3blob.ExampleOpenBucket" >}}

S3-Compatible Servers {#s3-compatible}

The Go CDK can also interact with S3-compatible storage servers that recognize the same REST HTTP endpoints as S3, like Minio, Ceph, or SeaweedFS. You can change the endpoint by changing the Endpoint field on the *aws.Config you pass to s3blob.OpenBucket. If you are using blob.OpenBucket, you can switch endpoints by adding query parameters to the S3 URL, like so:

```go
// ctx is assumed to be a context.Context already in scope.
bucket, err := blob.OpenBucket(ctx, "s3://mybucket?"+
	"endpoint=my.minio.local:8080&"+
	"disable_https=true&"+
	"s3ForcePathStyle=true")
```

See aws.V2ConfigFromURLParams for more details on supported URL options for S3.

Azure Blob Storage {#azure}

Azure Blob Storage URLs in the Go CDK allow you to identify Azure Blob Storage containers when opening a bucket with blob.OpenBucket. Go CDK uses the environment variables AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_KEY, and AZURE_STORAGE_SAS_TOKEN, among others, to configure the credentials.

{{< goexample "gocloud.dev/blob/azureblob.Example_openBucketFromURL" >}}

Full details about acceptable URLs can be found under the API reference for azureblob.URLOpener.

Azure Blob Constructor {#azure-ctor}

The azureblob.OpenBucket constructor opens an Azure Blob Storage container. azureblob operates on Azure Storage Block Blobs. You must first create an Azure Service Client before you can open a container.

{{< goexample "gocloud.dev/blob/azureblob.ExampleOpenBucket" >}}

Local Storage {#local}

The Go CDK provides blob drivers for storing data in memory and on the local filesystem. These are primarily intended for testing and local development, but may be useful in production scenarios where an NFS mount is used.

Local storage URLs take the form of either mem:// or file:/// URLs. Memory URLs are always mem:// with no other information and always create a new bucket. File URLs convert slashes to the operating system's native file separator, so on Windows, C:\foo\bar would be written as file:///C:/foo/bar.

```go
import (
	"gocloud.dev/blob"
	_ "gocloud.dev/blob/fileblob"
	_ "gocloud.dev/blob/memblob"
)

// ...

bucket1, err := blob.OpenBucket(ctx, "mem://")
if err != nil {
	return err
}
defer bucket1.Close()

bucket2, err := blob.OpenBucket(ctx, "file:///path/to/dir")
if err != nil {
	return err
}
defer bucket2.Close()
```

Local Storage Constructors {#local-ctor}

You can create an in-memory bucket with memblob.OpenBucket:

{{< goexample "gocloud.dev/blob/memblob.ExampleOpenBucket" >}}

You can use a local filesystem directory with fileblob.OpenBucket:

{{< goexample "gocloud.dev/blob/fileblob.ExampleOpenBucket" >}}