plugins/inputs/sqlserver/README.md
This plugin provides metrics for your SQL Server instance. Recorded metrics are lightweight and use Dynamic Management Views supplied by SQL Server.
[!NOTE] This plugin supports SQL Server versions supported by Microsoft (see lifecycle dates), Azure SQL Databases (Single), Azure SQL Managed Instances, Azure SQL Elastic Pools and Azure Arc-enabled SQL Managed Instances.
⭐ Telegraf v0.10.1 🏷️ datastore 💻 all
Plugins support additional global and plugin configuration settings for tasks such as modifying metrics, tags, and fields, creating aliases, and configuring plugin ordering. See CONFIGURATION.md for more details.
This plugin supports secrets from secret-stores for the servers option.
See the secret-store documentation for more details on how
to use them.
# Read metrics from Microsoft SQL Server
[[inputs.sqlserver]]
## Specify instances to monitor with a list of connection strings.
## All connection parameters are optional.
## By default, the host is localhost, listening on default port, TCP 1433.
## For Windows, the user is the currently running AD user (SSO).
## See https://github.com/microsoft/go-mssqldb for detailed connection
## parameters, in particular, tls connections can be created like so:
## "encrypt=true;certificate=<cert>;hostNameInCertificate=<SqlServer host fqdn>"
servers = [
"Server=192.168.1.10;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;",
]
## Timeout for query execution operation
## Note that the timeout for queries is per query not per gather.
## 0 value means no timeout
# query_timeout = "0s"
## Authentication method
## valid methods: "connection_string", "AAD"
# auth_method = "connection_string"
## ClientID is the client ID of the user assigned identity of the VM
## that should be used to authenticate to the Azure SQL server.
# client_id = ""
## "database_type" enables a specific set of queries depending on the database type. If specified, it replaces azuredb = true/false and query_version = 2
## In the config file, the sql server plugin section should be repeated each with a set of servers for a specific database_type.
## Possible values for database_type are - "SQLServer" or "AzureSQLDB" or "AzureSQLManagedInstance" or "AzureSQLPool" or "AzureArcSQLManagedInstance"
database_type = "SQLServer"
## A list of queries to include. If not specified, all the below listed queries are used.
include_query = []
## A list of queries to explicitly ignore.
exclude_query = ["SQLServerAvailabilityReplicaStates", "SQLServerDatabaseReplicaStates"]
## Force using the deprecated ADAL authentication method instead of the recommended
## MSAL method. Setting this option is not recommended and only exists for backward
## compatibility.
# use_deprecated_adal_authentication = false
## Queries enabled by default for database_type = "SQLServer" are -
## SQLServerPerformanceCounters, SQLServerWaitStatsCategorized, SQLServerDatabaseIO, SQLServerProperties, SQLServerMemoryClerks,
## SQLServerSchedulers, SQLServerRequests, SQLServerVolumeSpace, SQLServerCpu, SQLServerAvailabilityReplicaStates, SQLServerDatabaseReplicaStates,
## SQLServerRecentBackups
## Queries enabled by default for database_type = "AzureSQLDB" are -
## AzureSQLDBResourceStats, AzureSQLDBResourceGovernance, AzureSQLDBWaitStats, AzureSQLDBDatabaseIO, AzureSQLDBServerProperties,
## AzureSQLDBOsWaitstats, AzureSQLDBMemoryClerks, AzureSQLDBPerformanceCounters, AzureSQLDBRequests, AzureSQLDBSchedulers
## Queries enabled by default for database_type = "AzureSQLManagedInstance" are -
## AzureSQLMIResourceStats, AzureSQLMIResourceGovernance, AzureSQLMIDatabaseIO, AzureSQLMIServerProperties, AzureSQLMIOsWaitstats,
## AzureSQLMIMemoryClerks, AzureSQLMIPerformanceCounters, AzureSQLMIRequests, AzureSQLMISchedulers
## Queries enabled by default for database_type = "AzureSQLPool" are -
## AzureSQLPoolResourceStats, AzureSQLPoolResourceGovernance, AzureSQLPoolDatabaseIO, AzureSQLPoolOsWaitStats,
## AzureSQLPoolMemoryClerks, AzureSQLPoolPerformanceCounters, AzureSQLPoolSchedulers
## Queries enabled by default for database_type = "AzureArcSQLManagedInstance" are -
## AzureSQLMIDatabaseIO, AzureSQLMIServerProperties, AzureSQLMIOsWaitstats,
## AzureSQLMIMemoryClerks, AzureSQLMIPerformanceCounters, AzureSQLMIRequests, AzureSQLMISchedulers
## Following are old config settings
## You may use them only if you are using the earlier flavor of queries; however, it is recommended to use
## the new mechanism of identifying the database_type and thereby using its corresponding queries
## Optional parameter, setting this to 2 will use a new version
## of the collection queries that break compatibility with the original
## dashboards.
## Version 2 is compatible with SQL Server 2012 and later versions, and with Azure SQL Database
# query_version = 2
## If you are using AzureDB, setting this to true will gather resource utilization metrics
# azuredb = false
## Toggling this to true will emit an additional metric called "sqlserver_telegraf_health".
## This metric tracks the count of attempted queries and successful queries for each SQL instance specified in "servers".
## The purpose of this metric is to assist with identifying and diagnosing any connectivity or query issues.
## This setting/metric is optional and is disabled by default.
# health_metric = false
## Possible queries across different versions of the collectors
## Queries enabled by default for specific Database Type
## database_type = AzureSQLDB by default collects the following queries
## - AzureSQLDBWaitStats
## - AzureSQLDBResourceStats
## - AzureSQLDBResourceGovernance
## - AzureSQLDBDatabaseIO
## - AzureSQLDBServerProperties
## - AzureSQLDBOsWaitstats
## - AzureSQLDBMemoryClerks
## - AzureSQLDBPerformanceCounters
## - AzureSQLDBRequests
## - AzureSQLDBSchedulers
## database_type = AzureSQLManagedInstance by default collects the following queries
## - AzureSQLMIResourceStats
## - AzureSQLMIResourceGovernance
## - AzureSQLMIDatabaseIO
## - AzureSQLMIServerProperties
## - AzureSQLMIOsWaitstats
## - AzureSQLMIMemoryClerks
## - AzureSQLMIPerformanceCounters
## - AzureSQLMIRequests
## - AzureSQLMISchedulers
## database_type = AzureSQLPool by default collects the following queries
## - AzureSQLPoolResourceStats
## - AzureSQLPoolResourceGovernance
## - AzureSQLPoolDatabaseIO
## - AzureSQLPoolOsWaitStats
## - AzureSQLPoolMemoryClerks
## - AzureSQLPoolPerformanceCounters
## - AzureSQLPoolSchedulers
## database_type = SQLServer by default collects the following queries
## - SQLServerPerformanceCounters
## - SQLServerWaitStatsCategorized
## - SQLServerDatabaseIO
## - SQLServerProperties
## - SQLServerMemoryClerks
## - SQLServerSchedulers
## - SQLServerRequests
## - SQLServerVolumeSpace
## - SQLServerCpu
## - SQLServerRecentBackups
## and following as optional (if mentioned in the include_query list)
## - SQLServerAvailabilityReplicaStates
## - SQLServerDatabaseReplicaStates
## Maximum number of open connections to the database, 0 allows the driver to decide.
# max_open_connections = 0
## Maximum number of idle connections in the connection pool, 0 allows the driver to decide.
# max_idle_connections = 0
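The comments in the sample configuration say that the plugin section should be repeated, each with the servers for one database_type. A minimal sketch of that layout follows, assuming one on-premises instance and one Azure SQL Database. The host name sql01.example.internal is a hypothetical placeholder, and every option shown is taken from the sample above.

```toml
# One [[inputs.sqlserver]] section per database_type.
# Host names, database names and credentials are placeholders.
[[inputs.sqlserver]]
  servers = [
    "Server=sql01.example.internal;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;",
  ]
  database_type = "SQLServer"
  # Same exclusions as the sample configuration above.
  exclude_query = ["SQLServerAvailabilityReplicaStates", "SQLServerDatabaseReplicaStates"]

[[inputs.sqlserver]]
  servers = [
    "Server=<Azure_SQL_Server_Name>.database.windows.net;Port=1433;Database=<Azure_SQL_Database_Name>;User Id=<user>;Password=<pw>;app name=telegraf;log=1;",
  ]
  database_type = "AzureSQLDB"
```

Keeping each database type in its own section should let include_query and exclude_query name only the queries that exist for that type.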
For available options in the servers DSN check the driver documentation.
The plugin supports the named-pipe and LPC protocol on Windows AMD64 and i386 for connections. On other platforms those protocols are not available. See the protocol configuration section of the driver documentation on how to specify the protocols.
You have to create a login on every SQL Server instance or Azure SQL Managed Instance you want to monitor, using the following script:
USE master;
GO
CREATE LOGIN [telegraf] WITH PASSWORD = N'mystrongpassword';
GO
GRANT VIEW SERVER STATE TO [telegraf];
GO
GRANT VIEW ANY DEFINITION TO [telegraf];
GO
For Azure SQL Database, you require the View Database State permission and can create a user with a password directly in the database.
CREATE USER [telegraf] WITH PASSWORD = N'mystrongpassword';
GO
GRANT VIEW DATABASE STATE TO [telegraf];
GO
For Azure SQL Elastic Pool, follow these instructions to collect metrics. On the master logical database, create a SQL login 'telegraf' and assign it to the server-level role ##MS_ServerStateReader##.
CREATE LOGIN [telegraf] WITH PASSWORD = N'mystrongpassword';
GO
ALTER SERVER ROLE ##MS_ServerStateReader##
ADD MEMBER [telegraf];
GO
Elastic pool metrics can be collected from any database in the pool if a user
for the telegraf login is created in that database. For collection to work,
this database must remain in the pool, and must not be renamed. If you plan
to add/remove databases from this pool, create a separate database for
monitoring purposes that will remain in the pool.
[!NOTE] To avoid duplicate monitoring data, do not collect elastic pool metrics from more than one database in the same pool.
GO
CREATE USER [telegraf] FOR LOGIN telegraf;
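Once the user exists in the chosen database, the plugin can be pointed at that single database. The following is a sketch only, assuming a dedicated monitoring database that stays in the pool. The database name telegraf_monitoring is hypothetical, and the options used are those from the sample configuration.

```toml
[[inputs.sqlserver]]
  # Connect to exactly one database in the pool to avoid duplicate pool metrics.
  # "telegraf_monitoring" is a hypothetical database name.
  servers = [
    "Server=<Azure_SQL_Server_Name>.database.windows.net;Port=1433;Database=telegraf_monitoring;User Id=telegraf;Password=<pw>;app name=telegraf;log=1;",
  ]
  database_type = "AzureSQLPool"
```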
For Service SID authentication to SQL Server (Windows service installations only), check the howto document. In an administrative command prompt, configure the telegraf service for use with a service SID:
sc.exe sidtype "telegraf" unrestricted
To create the login for the telegraf service run the following script:
USE master;
GO
CREATE LOGIN [NT SERVICE\telegraf] FROM WINDOWS;
GO
GRANT VIEW SERVER STATE TO [NT SERVICE\telegraf];
GO
GRANT VIEW ANY DEFINITION TO [NT SERVICE\telegraf];
GO
Remove the User Id and Password keywords from the connection string in your config file to use Windows authentication.
[[inputs.sqlserver]]
servers = ["Server=192.168.1.10;Port=1433;app name=telegraf;log=1;",]
To set a connection timeout, add the dial timeout parameter to the connection string in your config file.
servers = [
"Server=192.168.1.10;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;dial timeout=30",
]
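The connection string timeout and the plugin's query_timeout option appear to cover different phases: the first applies when the driver connects, the second applies to each query. A sketch that sets both follows. The 30-second values are illustrative assumptions, not recommendations.

```toml
[[inputs.sqlserver]]
  servers = [
    # "dial timeout" is a driver connection-string parameter.
    "Server=192.168.1.10;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;dial timeout=30",
  ]
  # Applies to each query individually, not to the whole gather cycle.
  # The default of "0s" means no timeout.
  query_timeout = "30s"
```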
Azure SQL Database instances support two main methods of authentication: SQL authentication and AAD authentication. The recommended practice is to use AAD authentication when possible as it is a more modern authentication protocol, allows for easier credential and role management and can eliminate the need to include passwords in connection strings.
If more than one managed identity is assigned to the VM, you need to specify the
client_id of the identity you wish to use to authenticate with the SQL Server.
Please check SQL Server driver documentation for available options.
AAD based auth is currently only supported for Azure SQL Database and Azure SQL Managed Instance but not for SQL Server. To use MSI configure "system-assigned managed identity" for Azure resources on the Monitoring VM (the VM connecting to the SQL server/database) using the Azure portal. Create a user with the name of the Monitoring VM as the principal on the database being monitored using the below script. This might require allow-listing the client machine's IP address (from where the below SQL script is being run) on the SQL Server resource.
In case of multiple assigned identities on one VM you can use the parameter
user_assigned_id to specify the client_id.
EXECUTE ('IF EXISTS(SELECT * FROM sys.database_principals WHERE name = ''<Monitoring_VM_Name>'')
BEGIN
DROP USER [<Monitoring_VM_Name>]
END')
EXECUTE ('CREATE USER [<Monitoring_VM_Name>] FROM EXTERNAL PROVIDER')
EXECUTE ('GRANT VIEW DATABASE STATE TO [<Monitoring_VM_Name>]')
On the SQL Server resource of the database(s) being monitored, go to the "Firewalls and Virtual Networks" tab and allowlist the monitoring VM IP address. On the Monitoring VM, update the telegraf config file with the database connection string in the following format. The connection string provides only the server and database name, with no password, since the VM's system-assigned managed identity is used for authentication. The auth_method option must be set to "AAD".
servers = [
"Server=<Azure_SQL_Server_Name>.database.windows.net;Port=1433;Database=<Azure_SQL_Database_Name>;app name=telegraf;log=1;",
]
auth_method = "AAD"
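When the VM has more than one managed identity, the client_id option from the sample configuration selects which one to use. A sketch under that assumption follows; the client ID value is a placeholder to be replaced with the ID of your user-assigned identity.

```toml
[[inputs.sqlserver]]
  servers = [
    "Server=<Azure_SQL_Server_Name>.database.windows.net;Port=1433;Database=<Azure_SQL_Database_Name>;app name=telegraf;log=1;",
  ]
  auth_method = "AAD"
  # Client ID of the user-assigned identity to authenticate with (placeholder).
  client_id = "<client_id_of_user_assigned_identity>"
  database_type = "AzureSQLDB"
```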
To provide backwards compatibility, this plugin supports two versions of metrics queries.
[!NOTE] Version 2 queries are not backwards compatible with the old queries. Any dashboards or queries based on the old query format will not work with the new format. The version 2 queries only report raw metrics, no math has been done to calculate deltas. To graph this data you must calculate deltas in your dashboarding software.
[!CAUTION] The query_version option was deprecated in Telegraf v1.16. All future development will be under configuration option database_type.
The original metrics queries provide:
- sys.dm_os_performance_counters
- sys.dm_os_wait_stats
- sys.dm_os_memory_clerks
- sys.dm_io_virtual_file_stats
- sys.dm_io_virtual_file_stats
- sys.dm_io_virtual_file_stats
- sys.databases
- sys.dm_os_volume_stats
- sys.dm_os_ring_buffers

If you are using the original queries all stats have the following tags:

- servername: hostname:instance
- type: type of stats to easily filter measurements

[!CAUTION] The query_version option was deprecated in Telegraf v1.16. All future development will be under configuration option database_type.
The new (version 2) metrics provide:
Database IO: IO stats from sys.dm_io_virtual_file_stats.
Memory Clerk: Memory clerk breakdown from sys.dm_os_memory_clerks, most
clerks have been given a friendly name.
Performance Counters: A select list of performance counters from
sys.dm_os_performance_counters. Some of the
important metrics included:
Server properties: Number of databases in all possible states (online, offline, suspect, etc.), cpu count, total physical memory, available physical memory, SQL Server service uptime, SQL Server SPID, and SQL Server version. For Azure SQL, relevant properties such as tier, number of vCores, and memory are included.
Wait stats: Wait time in ms, number of waiting tasks, resource wait time, signal wait time, max wait time in ms, wait type, and wait category. The waits are categorized using the same categories used in Query Store.
Schedulers: This captures sys.dm_os_schedulers.
SqlRequests: This captures a snapshot of sys.dm_exec_requests
and sys.dm_exec_sessions that gives you running
requests as well as wait types and blocking sessions.
Telegraf's monitoring request is omitted unless it is
a heading blocker. Also includes sleeping sessions
with open transactions.
VolumeSpace: uses sys.dm_os_volume_stats to get total, used and
occupied space on every disk that contains a data or
log file. (Note that even if enabled it won't get any
data from Azure SQL Database or SQL Managed Instance).
It is pointless to run this with high frequency
(e.g., every 10s), but it won't cause any problem.
Cpu: uses the buffer ring (sys.dm_os_ring_buffers) to
get CPU data, the table is updated once per minute.
(Note that even if enabled it won't get any data from
Azure SQL Database or SQL Managed Instance).
In order to allow tracking on a per statement basis this query produces a unique tag for each query. Depending on the database workload, this may result in a high cardinality series. Reference the FAQ for tips on managing series cardinality.
Azure Managed Instances

- sys.server_resource_stats
- sys.dm_instance_resource_governance

Azure SQL Database in addition to other stats

- sys.dm_db_wait_stats
- sys.dm_user_db_resource_governance
- sys.dm_db_resource_stats

These are metrics for Azure SQL Database (single database) and are very similar to version 2 but split out for maintenance reasons, better ability to test, differences in DMVs:
- sys.dm_io_virtual_file_stats, including resource governance time, RBPEX, IO for Hyperscale.
- sys.dm_os_memory_clerks.
- sys.dm_user_db_resource_governance
- sys.dm_os_performance_counters, including cloud specific counters for SQL Hyperscale.
- sys.dm_db_wait_stats: number of waiting tasks, resource wait time, signal wait time, max wait time in ms, wait type, and wait category. The waits are categorized using the same categories used in Query Store. These waits are collected only as of the end of a statement and for a specific database only.
- sys.dm_os_wait_stats: number of waiting tasks, resource wait time, signal wait time, max wait time in ms, wait type, and wait category. The waits are categorized using the same categories used in Query Store. These waits are collected as they occur and instance wide.
- sys.dm_exec_sessions and sys.dm_exec_requests. Telegraf's monitoring request is omitted unless it is a heading blocker.
- sys.dm_os_schedulers snapshots.

These are metrics for Azure SQL Managed Instance. They are very similar to version 2 but split out for maintenance reasons, better ability to test, differences in DMVs:
- sys.dm_io_virtual_file_stats, including resource governance time, RBPEX, IO for Hyperscale.
- sys.dm_os_memory_clerks.
- sys.dm_instance_resource_governance
- sys.dm_os_performance_counters, including cloud specific counters for SQL Hyperscale.
- sys.dm_os_wait_stats: number of waiting tasks, resource wait time, signal wait time, max wait time in ms, wait type, and wait category. The waits are categorized using the same categories used in Query Store. These waits are collected as they occur and instance wide.
- sys.dm_exec_sessions and sys.dm_exec_requests. Telegraf's monitoring request is omitted unless it is a heading blocker.
- sys.dm_os_schedulers snapshots.

These are metrics for Azure SQL to monitor resource usage at the Elastic Pool level. These metrics require additional permissions to be collected; check the additional setup section in this documentation.
- sys.dm_resource_governor_resource_pools_history_ex.
- sys.dm_user_db_resource_governance.
- sys.dm_io_virtual_file_stats.
- sys.dm_os_wait_stats.
- sys.dm_os_memory_clerks.
- sys.dm_os_performance_counters. Note: Performance counters where the cntr_type column value is 537003264 are already returned with a percentage format between 0 and 100. For other counters, please check sys.dm_os_performance_counters documentation.
- sys.dm_os_schedulers snapshots.

For database_type = SQLServer:

- sys.dm_io_virtual_file_stats
- sys.dm_os_memory_clerks, most clerks have been given a friendly name.
- sys.dm_os_performance_counters. Some of the important metrics included:
- sys.dm_os_schedulers.
- sys.dm_exec_requests and sys.dm_exec_sessions that gives you running requests as well as wait types and blocking sessions.
- sys.dm_os_volume_stats to get total, used and occupied space on every disk that contains a data or log file. (Note that even if enabled it won't get any data from Azure SQL Database or SQL Managed Instance). It is pointless to run this with high frequency (e.g., every 10s), but it won't cause any problem.
- sys.dm_os_ring_buffers to get CPU data; the table is updated once per minute. (Note that even if enabled it won't get any data from Azure SQL Database or SQL Managed Instance).
- sys.dm_hadr_availability_replica_states for a High Availability / Disaster Recovery (HADR) setup
- sys.dm_hadr_database_replica_states for a High Availability / Disaster Recovery (HADR) setup
- msdb.dbo.backupset
- sys.dm_tran_persistent_version_store_stats for databases with Accelerated Database Recovery enabled

The guiding principle is that all data collected from the same primary DMV ends up in the same measurement irrespective of database_type.
- sqlserver_database_io - Used by AzureSQLDBDatabaseIO, AzureSQLMIDatabaseIO, SQLServerDatabaseIO, DatabaseIO given the data is from sys.dm_io_virtual_file_stats
- sqlserver_waitstats - Used by WaitStatsCategorized, AzureSQLDBOsWaitstats, AzureSQLMIOsWaitstats
- sqlserver_server_properties - Used by SQLServerProperties, AzureSQLDBServerProperties, AzureSQLMIServerProperties, ServerProperties
- sqlserver_memory_clerks - Used by SQLServerMemoryClerks, AzureSQLDBMemoryClerks, AzureSQLMIMemoryClerks, MemoryClerk
- sqlserver_performance - Used by SQLServerPerformanceCounters, AzureSQLDBPerformanceCounters, AzureSQLMIPerformanceCounters, PerformanceCounters
- sys.dm_os_schedulers - Used by SQLServerSchedulers, AzureSQLDBServerSchedulers, AzureSQLMIServerSchedulers

The following Performance counter metrics can be used directly, with no delta calculations:
Version 2 queries have the following tags:
- sql_instance: Physical host and instance name (hostname:instance)
- database_name: For Azure SQLDB, database_name denotes the name of the Azure SQL Database as server name is a logical construct.

All collection versions (version 1, version 2, and database_type) support an
optional plugin health metric called sqlserver_telegraf_health. This metric
tracks if connections to SQL Server are succeeding or failing. Users can
leverage this metric to detect if their SQL Server monitoring is not working
as intended.
In the configuration file, toggling health_metric to true will enable
collection of this metric. By default, this value is set to false and
the metric is not collected. The health metric emits one record for each
connection specified by servers in the configuration file.
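As a sketch of the setting described above, the following section enables the health metric for two connections. The second server address is a hypothetical placeholder. With this configuration the plugin would be expected to emit one sqlserver_telegraf_health record per entry in servers.

```toml
[[inputs.sqlserver]]
  servers = [
    "Server=192.168.1.10;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;",
    # Hypothetical second instance, shown only to illustrate one record per connection.
    "Server=192.168.1.11;Port=1433;User Id=<user>;Password=<pw>;app name=telegraf;log=1;",
  ]
  database_type = "SQLServer"
  # Disabled by default; set to true to emit sqlserver_telegraf_health.
  health_metric = true
```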
The health metric emits the following tags:
- sql_instance - Name of the server specified in the connection string. This value is emitted as-is in the connection string. If the server could not be parsed from the connection string, a constant placeholder value is emitted
- database_name - Name of the database or (initial catalog) specified in the connection string. This value is emitted as-is in the connection string. If the database could not be parsed from the connection string, a constant placeholder value is emitted

The health metric emits the following fields:
- attempted_queries - Number of queries that were attempted for this connection
- successful_queries - Number of queries that completed successfully for this connection
- database_type - Type of database as specified by database_type. If database_type is empty, the QueryVersion and AzureDB fields are concatenated instead

If attempted_queries and successful_queries are not equal for
a given connection, some metrics were not successfully gathered for
that connection. If successful_queries is 0, no metrics were successfully
gathered.
sqlserver_cpu_other_process_cpu{host="servername",measurement_db_type="SQLServer",sql_instance="SERVERNAME:INST"} 9
sqlserver_performance{counter="Log File(s) Size (KB)",counter_type="65792",host="servername",instance="instance_name",measurement_db_type="SQLServer",object="MSSQL$INSTANCE_NAME:Databases",sql_instance="SERVERNAME:INSTANCE_NAME"} 1.048568e+06