docs/src/main/sphinx/object-storage/file-system-s3.md
Trino includes a native implementation to access Amazon S3 and compatible storage systems with a catalog using the Delta Lake, Hive, Hudi, or Iceberg connectors. While Trino is designed to support S3-compatible storage systems, only AWS S3 and MinIO are tested for compatibility. For other storage systems, perform your own testing and consult your vendor for more information.
Enable the native implementation with `fs.native-s3.enabled=true` in your
catalog properties file.
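
As an illustration, a catalog properties fragment that enables the native implementation might look like the following minimal sketch. The region and endpoint values are placeholders chosen for this example, and connector-specific settings are omitted; adjust them to your deployment.

```properties
# Fragment of a catalog properties file (connector settings omitted).
# Enable the native S3 file system implementation.
fs.native-s3.enabled=true

# Placeholder values, assumed for illustration only.
s3.region=us-east-1
s3.endpoint=https://s3.us-east-1.amazonaws.com
```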
Use the following properties to configure general aspects of S3 file system support:
:::{list-table}
:widths: 40, 60
:header-rows: 1

* - Property
  - Description
* - `fs.native-s3.enabled`
  - Defaults to `false`. Set to `true` to use S3 and enable all other
    properties.
* - `s3.endpoint`
  -
* - `s3.region`
  -
* - `s3.cross-region-access`
  - Defaults to `false`.
* - `s3.path-style-access`
  -
* - `s3.storage-class`
  - Defaults to `STANDARD`. Other allowed values are: `STANDARD_IA`,
    `INTELLIGENT_TIERING`, `REDUCED_REDUNDANCY`, `ONEZONE_IA`, `GLACIER`,
    `DEEP_ARCHIVE`, `OUTPOSTS`, `GLACIER_IR`, `SNOW`, `EXPRESS_ONEZONE`.
* - `s3.signer-type`
  - Allowed values are `AwsS3V4Signer`, `Aws4Signer`, `AsyncAws4Signer`,
    `Aws4UnsignedPayloadSigner`, `EventStreamAws4Signer`.
* - `s3.canned-acl`
  - Defaults to `NONE`, which has the same effect as `PRIVATE`. If the files
    are to be uploaded to an S3 bucket owned by a different AWS user, the
    canned ACL may be set to one of the following: `PRIVATE`, `PUBLIC_READ`,
    `PUBLIC_READ_WRITE`, `AUTHENTICATED_READ`, `BUCKET_OWNER_READ`, or
    `BUCKET_OWNER_FULL_CONTROL`.
* - `s3.sse.type`
  - Defaults to `NONE` for no encryption. Other valid values are `S3` for
    encryption by S3 managed keys, `KMS` for encryption with a key from the
    AWS Key Management Service (KMS), and `CUSTOMER` for encryption with a
    customer-provided key from `s3.sse.customer-key`. Note that S3
    automatically uses SSE, so `NONE` and `S3` are equivalent. S3-compatible
    systems might behave differently.
* - `s3.sse.kms-key-id`
  -
* - `s3.sse.customer-key`
  - Used with `s3.sse.type` set to `CUSTOMER`.
* - `s3.streaming.part-size`
  - Values between `5MB` and `256MB` are valid. Defaults to `32MB`.
* - `s3.requester-pays`
  - Defaults to `false`.
* - `s3.max-connections`
  - Defaults to `500`.
* - `s3.connection-ttl`
  -
* - `s3.connection-max-idle-time`
  -
* - `s3.socket-connect-timeout`
  -
* - `s3.socket-timeout`
  -
* - `s3.tcp-keep-alive`
  - Defaults to `false`.
* - `s3.http-proxy`
  -
* - `s3.http-proxy.secure`
  - Set to `true` to enable HTTPS for the proxy server.
* - `s3.http-proxy.username`
  -
* - `s3.http-proxy.password`
  -
* - `s3.http-proxy.non-proxy-hosts`
  -
* - `s3.http-proxy.preemptive-basic-auth`
  - Defaults to `false`.
* - `s3.retry-mode`
  - Defaults to `LEGACY`. Other allowed values are `STANDARD` and `ADAPTIVE`.
    The `STANDARD` mode includes a standard set of errors that are retried.
    `ADAPTIVE` mode includes the functionality of `STANDARD` mode with
    automatic client-side throttling.
* - `s3.max-error-retries`
  - Defaults to `20`.
* - `s3.use-web-identity-token-credentials-provider`
  - Set to `true` to only use the web identity token credentials provider,
    instead of the default providers chain. This can be useful when running
    Trino on Amazon EKS and using IAM roles for service accounts (IRSA).
    Defaults to `false`.
* - `s3.application-id`
  - Identifier included in the `User-Agent` header for all requests sent to
    S3. Defaults to `Trino`.
:::

Use the following properties to configure the authentication to S3 with access and secret keys, STS, or an IAM role:
:::{list-table}
:widths: 40, 60
:header-rows: 1

* - Property
  - Description
* - `s3.aws-access-key`
  -
* - `s3.aws-secret-key`
  -
* - `s3.sts.endpoint`
  -
* - `s3.sts.region`
  -
* - `s3.iam-role`
  -
* - `s3.role-session-name`
  - Defaults to `trino-filesystem`.
* - `s3.external-id`
  -
:::

Trino supports flexible security mapping for S3, allowing for separate credentials or IAM roles for specific users or S3 locations. The IAM role for a specific query can be selected from a list of allowed roles by providing it as an extra credential.
Each security mapping entry may specify one or more match criteria. If multiple criteria are specified, all criteria must match. The following match criteria are available:
- `user`: Regular expression to match against username. Example: `alice|bob`
- `group`: Regular expression to match against any of the groups that the user
  belongs to. Example: `finance|sales`
- `prefix`: S3 URL prefix. You can specify an entire bucket or a path within a
  bucket. The URL must start with `s3://` but also matches for `s3a` or `s3n`.
  Example: `s3://bucket-name/abc/xyz/`

The security mapping must provide one or more configuration settings:
- `accessKey` and `secretKey`: AWS access key and secret key. This overrides
  any globally configured credentials, such as access key or instance credentials.
- `iamRole`: IAM role to use if no user-provided role is specified as an
  extra credential. This overrides any globally configured IAM role. This role
  is allowed to be specified as an extra credential, although specifying it
  explicitly has no effect.
- `roleSessionName`: Optional role session name to use with `iamRole`. This can only
  be used when `iamRole` is specified. If `roleSessionName` includes the string
  `${USER}`, then the `${USER}` portion of the string is replaced with the
  current session's username. If `roleSessionName` is not specified, it defaults
  to `trino-session`.
- `allowedIamRoles`: IAM roles that are allowed to be specified as an extra
  credential. This is useful because a particular AWS account may have permissions
  to use many roles, but a specific user should only be allowed to use a subset
  of those roles.
- `kmsKeyId`: ID of KMS-managed key to be used for client-side encryption.
- `allowedKmsKeyIds`: KMS-managed key IDs that are allowed to be specified as an extra
  credential. If the list contains `*`, then any key can be specified via extra credential.
- `sseCustomerKey`: The customer-provided key (SSE-C) for server-side encryption.
- `allowedSseCustomerKey`: The SSE-C keys that are allowed to be specified as an extra
  credential. If the list contains `*`, then any key can be specified via extra credential.
- `endpoint`: The S3 storage endpoint server. This optional property can be used
  to override S3 endpoints on a per-bucket basis.
- `region`: The S3 region to connect to. This optional property can be used
  to override S3 regions on a per-bucket basis.

The security mapping entries are processed in the order listed in the JSON configuration.
Therefore, specific mappings must be specified before less specific mappings.
For example, the mapping list might have URL prefix `s3://abc/xyz/` followed by
`s3://abc/` to allow different configuration for a specific path within a bucket
than for other paths within the bucket. You can specify the default configuration
by not including any match criteria for the last entry in the list.

In addition to the preceding rules, the default mapping can contain the optional
`useClusterDefault` boolean property set to `true` to use the default S3 configuration.
It cannot be used with any other configuration settings.

If no mapping entry matches and no default is configured, access is denied.

The configuration JSON is read from a file via `s3.security-mapping.config-file`
or from an HTTP endpoint via `s3.security-mapping.config-uri`.

Example JSON configuration:

```json
{
  "mappings": [
    {
      "prefix": "s3://bucket-name/abc/",
      "iamRole": "arn:aws:iam::123456789101:role/test_path"
    },
    {
      "user": "bob|charlie",
      "iamRole": "arn:aws:iam::123456789101:role/test_default",
      "allowedIamRoles": [
        "arn:aws:iam::123456789101:role/test1",
        "arn:aws:iam::123456789101:role/test2",
        "arn:aws:iam::123456789101:role/test3"
      ]
    },
    {
      "prefix": "s3://special-bucket/",
      "accessKey": "AKIAxxxaccess",
      "secretKey": "iXbXxxxsecret"
    },
    {
      "prefix": "s3://regional-bucket/",
      "iamRole": "arn:aws:iam::123456789101:role/regional-user",
      "endpoint": "https://bucket.vpce-1a2b3c4d-5e6f.s3.us-east-1.vpce.amazonaws.com",
      "region": "us-east-1"
    },
    {
      "prefix": "s3://encrypted-bucket/",
      "kmsKeyId": "kmsKey_10"
    },
    {
      "user": "test.*",
      "iamRole": "arn:aws:iam::123456789101:role/test_users"
    },
    {
      "group": "finance",
      "iamRole": "arn:aws:iam::123456789101:role/finance_users"
    },
    {
      "iamRole": "arn:aws:iam::123456789101:role/default"
    }
  ]
}
```

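
To load a mapping file like the preceding example, the catalog needs security mapping turned on and pointed at the JSON source. The following fragment is a minimal sketch using the security mapping properties listed in the table that follows. The file path and the refresh interval are hypothetical values chosen for illustration.

```properties
# Fragment of a catalog properties file (connector settings omitted).
fs.native-s3.enabled=true

# Security mapping must be enabled for the other mapping properties to apply.
s3.security-mapping.enabled=true

# Hypothetical location of the JSON mapping file.
s3.security-mapping.config-file=/etc/trino/s3-security-mapping.json

# Assumed refresh interval; without this property the file is not re-read.
s3.security-mapping.refresh-period=5m
```

To read the mapping from an HTTP endpoint instead, `s3.security-mapping.config-uri` would take the place of `s3.security-mapping.config-file`.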
:::{list-table} Security mapping properties
:header-rows: 1

* - Property
  - Description
* - `s3.security-mapping.enabled`
  - Defaults to `false`. Must be set to `true` for all other properties to be
    used.
* - `s3.security-mapping.config-file`
  -
* - `s3.security-mapping.config-uri`
  -
* - `s3.security-mapping.json-pointer`
  -
* - `s3.security-mapping.iam-role-credential-name`
  -
* - `s3.security-mapping.kms-key-id-credential-name`
  -
* - `s3.security-mapping.sse-customer-key-credential-name`
  -
* - `s3.security-mapping.refresh-period`
  - Duration value, see [](prop-type-duration). By default, the configuration
    is not refreshed.
* - `s3.security-mapping.colon-replacement`
  -
:::

(fs-legacy-s3-migration)=
Trino includes legacy Amazon S3 support to use with a catalog using the Delta Lake, Hive, Hudi, or Iceberg connectors. Upgrading existing deployments to the current native implementation is recommended. Legacy support is deprecated and will be removed.
To migrate a catalog to use the native file system implementation for S3, make the following edits to your catalog configuration:
- Add the `fs.native-s3.enabled=true` catalog configuration property.

:::{list-table}
:widths: 35, 35, 65
:header-rows: 1

* - Legacy property
  - Native property
  - Notes
* - `hive.s3.aws-access-key`
  - `s3.aws-access-key`
  -
* - `hive.s3.aws-secret-key`
  - `s3.aws-secret-key`
  -
* - `hive.s3.iam-role`
  - `s3.iam-role`
  - See also `s3.role-session-name` in preceding sections for more role
    configuration options.
* - `hive.s3.external-id`
  - `s3.external-id`
  -
* - `hive.s3.endpoint`
  - `s3.endpoint`
  - Add the `https://` prefix to make the value a correct URL.
* - `hive.s3.region`
  - `s3.region`
  -
* - `hive.s3.sse.enabled`
  -
  - `s3.sse.type` set to the default value of `NONE` is equivalent to
    `hive.s3.sse.enabled=false`.
* - `hive.s3.sse.type`
  - `s3.sse.type`
  -
* - `hive.s3.sse.kms-key-id`
  - `s3.sse.kms-key-id`
  -
* - `hive.s3.upload-acl-type`
  - `s3.canned-acl`
  -
* - `hive.s3.streaming.part-size`
  - `s3.streaming.part-size`
  -
* - `hive.s3.proxy.host`, `hive.s3.proxy.port`
  - `s3.http-proxy`
  - Combine host and port into one value, for example `localhost:8888`.
* - `hive.s3.proxy.protocol`
  - `s3.http-proxy.secure`
  - Set to `TRUE` to enable HTTPS.
* - `hive.s3.proxy.non-proxy-hosts`
  - `s3.http-proxy.non-proxy-hosts`
  -
* - `hive.s3.proxy.username`
  - `s3.http-proxy.username`
  -
* - `hive.s3.proxy.password`
  - `s3.http-proxy.password`
  -
* - `hive.s3.proxy.preemptive-basic-auth`
  - `s3.http-proxy.preemptive-basic-auth`
  -
* - `hive.s3.sts.endpoint`
  - `s3.sts.endpoint`
  -
* - `hive.s3.sts.region`
  - `s3.sts.region`
  -
* - `hive.s3.max-error-retries`
  - `s3.max-error-retries`
  - See also `s3.retry-mode` in preceding sections for more retry behavior
    configuration options.
* - `hive.s3.connect-timeout`
  - `s3.socket-connect-timeout`
  -
* - `hive.s3.connect-ttl`
  - `s3.connection-ttl`
  - See also `s3.connection-max-idle-time` in preceding sections for more
    connection keep-alive options.
* - `hive.s3.socket-timeout`
  - `s3.socket-timeout`
  - See also `s3.tcp-keep-alive` in preceding sections for more socket
    connection keep-alive options.
* - `hive.s3.max-connections`
  - `s3.max-connections`
  -
* - `hive.s3.path-style-access`
  - `s3.path-style-access`
  -
* - `hive.s3.signer-type`
  - `s3.signer-type`
  -
:::

Remove the following legacy configuration properties if they exist in your catalog configuration:
- `hive.s3.storage-class`
- `hive.s3.signer-class`
- `hive.s3.staging-directory`
- `hive.s3.pin-client-to-current-region`
- `hive.s3.ssl.enabled`
- `hive.s3.sse.enabled`
- `hive.s3.kms-key-id`
- `hive.s3.encryption-materials-provider`
- `hive.s3.streaming.enabled`
- `hive.s3.max-client-retries`
- `hive.s3.max-backoff-time`
- `hive.s3.max-retry-time`
- `hive.s3.multipart.min-file-size`
- `hive.s3.multipart.min-part-size`
- `hive.s3-file-system-type`
- `hive.s3.user-agent-prefix`
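
The migration steps can be illustrated with a before-and-after sketch of a single catalog file. All host names and values here are hypothetical, the legacy lines are shown commented out for comparison only, and the legacy value formats are assumptions. Only the property name mappings come from the preceding table and list.

```properties
# Before (legacy properties, commented out for comparison):
# hive.s3.endpoint=s3.example.com
# hive.s3.region=us-east-1
# hive.s3.proxy.host=localhost
# hive.s3.proxy.port=8888
# hive.s3.max-connections=200
# hive.s3.ssl.enabled=true

# After (native implementation):
fs.native-s3.enabled=true
# The endpoint gains the https:// prefix so that it is a correct URL.
s3.endpoint=https://s3.example.com
s3.region=us-east-1
# Proxy host and port are combined into a single property.
s3.http-proxy=localhost:8888
s3.max-connections=200
# hive.s3.ssl.enabled is on the removal list and has no replacement line.
```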