docs/src/main/sphinx/connector/sqlserver.md
The SQL Server connector allows querying and creating tables in an external Microsoft SQL Server database. This can be used to join data between different systems like SQL Server and Hive, or between two different SQL Server instances.
To connect to SQL Server, you need:
The connector can query a single database on a given SQL Server instance. Create
a catalog properties file that specifies the SQL server connector by setting the
connector.name to sqlserver.
For example, to access a database as example, create the file
etc/catalog/example.properties. Replace the connection properties as
appropriate for your setup:
connector.name=sqlserver
connection-url=jdbc:sqlserver://<host>:<port>;databaseName=<databaseName>;encrypt=false
connection-user=root
connection-password=secret
The connection-url defines the connection information and parameters to pass
to the SQL Server JDBC driver. The supported parameters for the URL are
available in the SQL Server JDBC driver documentation.
The connection-user and connection-password are typically required and
determine the user credentials for the connection, often a service user. You can
use {doc}secrets </security/secrets> to avoid actual values in the catalog
properties files.
(sqlserver-tls)=
The JDBC driver, and therefore the connector, automatically use Transport Layer Security (TLS) encryption and certificate validation. This requires a suitable TLS certificate configured on your SQL Server database host.
If you do not have the necessary configuration established, you can disable
encryption in the connection string with the encrypt property:
connection-url=jdbc:sqlserver://<host>:<port>;databaseName=<databaseName>;encrypt=false
Further parameters like trustServerCertificate, hostNameInCertificate,
trustStore, and trustStorePassword are details in the TLS section of
SQL Server JDBC driver documentation.
The SQL Server connector can only access a single SQL Server database within a single catalog. Thus, if you have multiple SQL Server databases, or want to connect to multiple SQL Server instances, you must configure multiple instances of the SQL Server connector.
To add another catalog, simply add another properties file to etc/catalog
with a different name, making sure it ends in .properties. For example,
if you name the property file sales.properties, Trino creates a
catalog named sales using the configured connector.
The SQL Server connector supports additional catalog properties to configure the behavior of the connector and the issues queries to the database.
:::{list-table} :widths: 45, 55 :header-rows: 1
sqlserver.snapshot-isolation.disabledfalse, which means that snapshot
isolation is enabled.
:::(sqlserver-fte-support)=
The connector supports {doc}/admin/fault-tolerant-execution of query
processing. Read and write operations are both supported with any retry policy.
The SQL Server connector provides access to all schemas visible to the specified
user in the configured database. For the following examples, assume the SQL
Server catalog is example.
You can see the available schemas by running SHOW SCHEMAS:
SHOW SCHEMAS FROM example;
If you have a schema named web, you can view the tables
in this schema by running SHOW TABLES:
SHOW TABLES FROM example.web;
You can see a list of the columns in the clicks table in the web database
using either of the following:
DESCRIBE example.web.clicks;
SHOW COLUMNS FROM example.web.clicks;
Finally, you can query the clicks table in the web schema:
SELECT * FROM example.web.clicks;
If you used a different name for your catalog properties file, use
that catalog name instead of example in the above examples.
(sqlserver-type-mapping)=
Because Trino and SQL Server each support types that the other does not, this
connector {ref}modifies some types <type-mapping-overview> when reading or
writing data. Data types may not map the same way in both directions between
Trino and the data source. Refer to the following sections for type mapping in
each direction.
The connector maps SQL Server types to the corresponding Trino types following this table:
:::{list-table} SQL Server type to Trino type mapping :widths: 30, 30, 40 :header-rows: 1
BITBOOLEANTINYINTSMALLINTTINYINT is actually unsigned TINYINTSMALLINTSMALLINTINTEGERINTEGERBIGINTBIGINTDOUBLE PRECISIONDOUBLEREALREALDECIMAL[(p[, s])], NUMERIC[(p[, s])]DECIMAL(p, s)CHAR[(n)]CHAR(n)1 <= n <= 8000NCHAR[(n)]CHAR(n)1 <= n <= 4000VARCHAR[(n | max)], NVARCHAR[(n | max)]VARCHAR(n)1 <= n <= 8000, max = 2147483647TEXTVARCHAR(2147483647)NTEXTVARCHAR(1073741823)VARBINARY[(n | max)]VARBINARY1 <= n <= 8000, max = 2147483647DATEDATETIME[(n)]TIME(n)0 <= n <= 7DATETIME2[(n)]TIMESTAMP(n)0 <= n <= 7SMALLDATETIMETIMESTAMP(0)DATETIMEOFFSET[(n)]TIMESTAMP(n) WITH TIME ZONE0 <= n <= 7
:::The connector maps Trino types to the corresponding SQL Server types following this table:
:::{list-table} Trino type to SQL Server type mapping :widths: 30, 30, 40 :header-rows: 1
BOOLEANBITTINYINTTINYINT[0, 127]SMALLINTSMALLINTINTEGERINTEGERBIGINTBIGINTREALREALDOUBLEDOUBLE PRECISIONDECIMAL(p, s)DECIMAL(p, s)VARBINARYVARBINARY(max)DATEDATETIME(n)TIME(n)0 <= n <= 7TIMESTAMP(n)DATETIME2(n)0 <= n <= 7
:::Complete list of SQL Server data types.
(sqlserver-numeric-mapping)=
For SQL Server FLOAT[(n)]:
n is not specified maps to Trino Double1 <= n <= 24 maps to Trino REAL24 < n <= 53 maps to Trino DOUBLE(sqlserver-character-mapping)=
For Trino CHAR(n):
1 <= n <= 4000 maps SQL Server NCHAR(n)n > 4000 maps SQL Server NVARCHAR(max)For Trino VARCHAR(n):
1 <= n <= 4000 maps SQL Server NVARCHAR(n)n > 4000 maps SQL Server NVARCHAR(max)(sqlserver-sql-support)=
The connector provides read access and write access to data and metadata in SQL Server. In addition to the globally available and read operation statements, the connector supports the following features:
(sqlserver-insert)=
(sqlserver-update)=
(sqlserver-delete)=
(sqlserver-alter-table)=
(sqlserver-procedures)=
(sqlserver-table-functions)=
The connector provides specific {doc}table functions </functions/table> to
access SQL Server.
(sqlserver-query-function)=
query(varchar) -> tableThe query function allows you to query the underlying database directly. It
requires syntax native to SQL Server, because the full query is pushed down and
processed in SQL Server. This can be useful for accessing native features which
are not implemented in Trino or for improving query performance in situations
where running a query natively may be faster.
For example, query the example catalog and select the top 10 percent of
nations by population:
SELECT
*
FROM
TABLE(
example.system.query(
query => 'SELECT
TOP(10) PERCENT *
FROM
tpch.nation
ORDER BY
population DESC'
)
);
(sqlserver-procedure-function)=
procedure(varchar) -> tableThe procedure function allows you to run stored procedures on the underlying
database directly. It requires syntax native to SQL Server, because the full query
is pushed down and processed in SQL Server. In order to use this table function set
sqlserver.stored-procedure-table-function-enabled to true.
:::{note}
The procedure function does not support running StoredProcedures that return multiple statements,
use a non-select statement, use output parameters, or use conditional statements.
:::
:::{warning} This feature is experimental only. The function has security implication and syntax might change and be backward incompatible. :::
The follow example runs the stored procedure employee_sp in the example catalog and the
example_schema schema in the underlying SQL Server database:
SELECT
*
FROM
TABLE(
example.system.procedure(
query => 'EXECUTE example_schema.employee_sp'
)
);
If the stored procedure employee_sp requires any input
append the parameter value to the procedure statement:
SELECT
*
FROM
TABLE(
example.system.procedure(
query => 'EXECUTE example_schema.employee_sp 0'
)
);
The connector includes a number of performance improvements, detailed in the following sections.
(sqlserver-table-statistics)=
The SQL Server connector can use {doc}table and column statistics </optimizer/statistics> for {doc}cost based optimizations </optimizer/cost-based-optimizations>, to improve query processing performance
based on the actual data in the data source.
The statistics are collected by SQL Server and retrieved by the connector.
The connector can use information stored in single-column statistics. SQL Server Database can automatically create column statistics for certain columns. If column statistics are not created automatically for a certain column, you can create them by executing the following statement in SQL Server Database.
CREATE STATISTICS example_statistics_name ON table_schema.table_name (column_name);
SQL Server Database routinely updates the statistics. In some cases, you may want to force statistics update (e.g. after defining new column statistics or after changing data in the table). You can do that by executing the following statement in SQL Server Database.
UPDATE STATISTICS table_schema.table_name;
Refer to SQL Server documentation for information about options, limitations and additional considerations.
(sqlserver-pushdown)=
The connector supports pushdown for a number of operations:
join-pushdownlimit-pushdowntopn-pushdown{ref}Aggregate pushdown <aggregation-pushdown> for the following functions:
avgcountmaxminsumstddevstddev_popstddev_sampvariancevar_popvar_sampThe connector supports pushdown of predicates on VARCHAR and NVARCHAR
columns if the underlying columns in SQL Server use a case-sensitive collation.
The following operators are pushed down:
=<>INNOT INTo ensure correct results, operators are not pushed down for columns using a case-insensitive collation.
(sqlserver-bulk-insert)=
You can optionally use the bulk copy API to drastically speed up write operations.
Enable bulk copying and a lock on the destination table to meet minimal logging requirements.
The following table shows the relevant catalog configuration properties and their default values:
:::{list-table} Bulk load properties :widths: 30, 60, 10 :header-rows: 1
sqlserver.bulk-copy-for-write.enabledbulk_copy_for_write.falsesqlserver.bulk-copy-for-write.lock-destination-tablebulk_copy_for_write_lock_destination_table. Setting is only used when
bulk-copy-for-write.enabled=true.false
:::Limitations:
You can specify the data compression policy for SQL Server tables
with the data_compression table property. Valid policies are NONE, ROW or PAGE.
Example:
CREATE TABLE example_schema.scientists (
recordkey VARCHAR,
name VARCHAR,
age BIGINT,
birthday DATE
)
WITH (
data_compression = 'ROW'
);