# Statsd Input Plugin
This service plugin gathers metrics from a Statsd server.
Introduced in: Telegraf v0.2.0 · Tags: applications · Supported OS: all
This plugin is a service input. Normal plugins gather metrics determined by the interval setting. Service plugins start a service to listen and wait for metrics or events to occur. Service plugins have two key differences from normal plugins:

- The global or plugin-specific `interval` setting may not apply
- The CLI options `--test`, `--test-wait`, and `--once` may not produce output for this plugin

Plugins support additional global and plugin configuration settings for tasks such as modifying metrics, tags, and fields, creating aliases, and configuring plugin ordering. See CONFIGURATION.md for more details.
## Configuration

```toml
# Statsd Server
[[inputs.statsd]]
  ## Protocol, must be "tcp", "udp4", "udp6" or "udp" (default=udp)
  protocol = "udp"

  ## MaxTCPConnection - applicable when protocol is set to tcp (default=250)
  max_tcp_connections = 250

  ## Enable TCP keep alive probes (default=false)
  tcp_keep_alive = false

  ## Specifies the keep-alive period for an active network connection.
  ## Only applies to TCP sockets and will be ignored if tcp_keep_alive is false.
  ## Defaults to the OS configuration.
  # tcp_keep_alive_period = "2h"

  ## Address and port to host UDP listener on
  service_address = ":8125"

  ## The following configuration options control when telegraf clears its cache
  ## of previous values. If set to false, then telegraf will only clear its
  ## cache when the daemon is restarted.
  ## Reset gauges every interval (default=true)
  delete_gauges = true
  ## Reset counters every interval (default=true)
  delete_counters = true
  ## Reset sets every interval (default=true)
  delete_sets = true
  ## Reset timings & histograms every interval (default=true)
  delete_timings = true

  ## Enabling aggregation temporality adds a temporality=delta or
  ## temporality=cumulative tag, and a start_time field recording the start
  ## time of the metric accumulation.
  ## You should use this when using the OpenTelemetry output.
  # enable_aggregation_temporality = false

  ## Percentiles to calculate for timing & histogram stats.
  percentiles = [50.0, 90.0, 99.0, 99.9, 99.95, 100.0]

  ## separator to use between elements of a statsd metric
  metric_separator = "_"

  ## Parses extensions to statsd in the datadog statsd format
  ## currently supports metrics, datadog tags, events, and service checks.
  ## http://docs.datadoghq.com/guides/dogstatsd/
  datadog_extensions = false

  ## Parses distributions metric as specified in the datadog statsd format
  ## https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition
  datadog_distributions = false

  ## Keep or drop the container id as tag. Included as optional field
  ## in DogStatsD protocol v1.2 if source is running in Kubernetes
  ## https://docs.datadoghq.com/developers/dogstatsd/datagram_shell/?tab=metrics#dogstatsd-protocol-v12
  datadog_keep_container_tag = false

  ## Statsd data translation templates, more info can be read here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/TEMPLATE_PATTERN.md
  # templates = [
  #     "cpu.* measurement*"
  # ]

  ## Number of UDP messages allowed to queue up, once filled,
  ## the statsd server will start dropping packets
  allowed_pending_messages = 10000

  ## Number of worker threads used to parse the incoming messages.
  # number_workers_threads = 5

  ## Number of timing/histogram values to track per-measurement in the
  ## calculation of percentiles. Raising this limit increases the accuracy
  ## of percentiles but also increases the memory usage and cpu time.
  percentile_limit = 1000

  ## Maximum socket buffer size in bytes, once the buffer fills up, metrics
  ## will start dropping. Defaults to the OS default.
  # read_buffer_size = 65535

  ## Max duration (TTL) for each metric to stay cached/reported without being updated.
  # max_ttl = "10h"

  ## Sanitize name method
  ## By default, telegraf will pass names directly as they are received.
  ## However, upstream statsd now sanitizes names, which can be enabled by
  ## using the "upstream" method option. This option will replace white space
  ## with '_', replace '/' with '-', and remove characters not matching
  ## 'a-zA-Z_\-0-9\.;='.
  # sanitize_name_method = ""

  ## Replace dots (.) with underscore (_) and dashes (-) with
  ## double underscore (__) in metric names.
  # convert_names = false

  ## Convert all numeric counters to float
  ## Enabling this ensures that counters and gauges are both emitted as floats.
  # float_counters = false

  ## Emit timings `metric_<name>_count` field as float, the same as all other
  ## histogram fields
  # float_timings = false

  ## Emit sets as float
  # float_sets = false
```
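With the UDP listener configured on port 8125 as above, a quick way to verify that it receives data is to send it a test datagram. A minimal Python sketch (the metric names and address are illustrative, not part of the plugin):

```python
import socket

def send_statsd(metric: str, host: str = "127.0.0.1", port: int = 8125) -> int:
    """Send a single statsd datagram over UDP; returns the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(metric.encode("ascii"), (host, port))

# A gauge, a sampled counter, and a timing, in statsd wire format.
for line in ("users.current:32|g", "deploys.test:1|c|@0.1", "load.time:320|ms"):
    send_statsd(line)
```

Since UDP is connectionless, the send succeeds whether or not telegraf is running; check telegraf's output plugin to confirm the metric arrived.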
The statsd plugin is a special type of plugin which runs a backgrounded statsd listener service while telegraf is running.
The format of the statsd messages is based on the format described in the original Etsy statsd implementation. In short, the telegraf statsd listener will accept:
```
users.current.den001.myapp:32|g  <- standard
users.current.den001.myapp:+10|g <- additive
users.current.den001.myapp:-10|g

deploys.test.myservice:1|c      <- increments by 1
deploys.test.myservice:101|c    <- increments by 101
deploys.test.myservice:1|c|@0.1 <- with sample rate, increments by 10

users.unique:101|s
users.unique:101|s
users.unique:102|s <- would result in a count of 2 for users.unique

load.time:320|ms
load.time.nanoseconds:1|h
load.time:200|ms|@0.1 <- sampled 1/10 of the time

load.time:320|d
load.time.nanoseconds:1|d
load.time:200|d|@0.1 <- sampled 1/10 of the time
```

It is possible to omit repetitive names and merge individual stats into a single line by separating them with additional colons:
```
users.current.den001.myapp:32|g:+10|g:-10|g
deploys.test.myservice:1|c:101|c:1|c|@0.1
users.unique:101|s:101|s:102|s
load.time:320|ms:200|ms|@0.1
```

This also allows for mixed types in a single line:
```
foo:1|c:200|ms
```

The string `foo:1|c:200|ms` is internally split into two individual metrics, `foo:1|c` and `foo:200|ms`, which are added to the aggregator separately.
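The splitting and sample-rate handling described above can be sketched roughly as follows (a simplified illustration, not the plugin's actual parser):

```python
def parse_statsd_line(line: str):
    """Split a (possibly multi-value) statsd line into (name, value, type, rate) tuples."""
    name, _, rest = line.partition(":")
    metrics = []
    for chunk in rest.split(":"):
        parts = chunk.split("|")  # e.g. ["1", "c", "@0.1"]
        value, mtype = parts[0], parts[1]
        rate = float(parts[2][1:]) if len(parts) > 2 else 1.0
        metrics.append((name, value, mtype, rate))
    return metrics

# "foo:1|c:200|ms" splits into two independent metrics.
print(parse_statsd_line("foo:1|c:200|ms"))
# A counter sampled at 1/10 is scaled back up: 1 / 0.1 -> increment of 10.
name, value, mtype, rate = parse_statsd_line("deploys.test.myservice:1|c|@0.1")[0]
print(round(float(value) / rate))
```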
In order to take advantage of InfluxDB's tagging system, we have made a couple additions to the standard statsd protocol. First, you can specify tags in a manner similar to the line-protocol, like this:
```
users.current,service=payroll,region=us-west:32|g
```
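Splitting such a bucket into its name and tags can be sketched as follows (illustrative only, not the plugin's implementation):

```python
def split_bucket(bucket: str):
    """Split 'name,k=v,...' (the part before the value) into a name and a tag dict."""
    name, *pairs = bucket.split(",")
    tags = dict(pair.split("=", 1) for pair in pairs)
    return name, tags

name, tags = split_bucket("users.current,service=payroll,region=us-west")
# name == "users.current"; tags == {"service": "payroll", "region": "us-west"}
```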
Meta:

- tags: `metric_type=<gauge|set|counter|timing|histogram>`

Outputted measurements will depend entirely on the measurements that the user sends, but here is a brief rundown of what you can expect to find from each metric type:
- **Counters**: counters are running totals; they are reset every interval when `delete_counters = true`.
- **Sets**: sets count the number of unique values passed to a key, e.g. `users:<user_id>|s`. No matter how many times the same user_id is sent, the count will only increase by 1.
- **Timings & Histograms**: each stat produces the following fields:
  - `statsd_<name>_lower`: The lower bound is the lowest value statsd saw for that stat during that interval.
  - `statsd_<name>_upper`: The upper bound is the highest value statsd saw for that stat during that interval.
  - `statsd_<name>_mean`: The mean is the average of all values statsd saw for that stat during that interval.
  - `statsd_<name>_median`: The median is the middle of all values statsd saw for that stat during that interval.
  - `statsd_<name>_stddev`: The stddev is the sample standard deviation of all values statsd saw for that stat during that interval.
  - `statsd_<name>_sum`: The sum is the sample sum of all values statsd saw for that stat during that interval.
  - `statsd_<name>_count`: The count is the number of timings statsd saw for that stat during that interval. It is not averaged.
  - `statsd_<name>_percentile_<P>`: The Pth percentile is a value x such that P% of all the values statsd saw for that stat during that time period are below x. The most common value that people use for P is 90; this is a great number to try to optimize.

When `datadog_extensions` is enabled, the plugin also supports Datadog service checks in the format:
```
_sc|<name>|<status>|d:<timestamp>|h:<hostname>|#<tag_key_1>:<tag_value_1>|m:<message>
```
- `<name>` - service check name (required)
- `<status>` - 0=OK, 1=Warning, 2=Critical, 3=Unknown (required)
- `d:<timestamp>` - optional Unix timestamp
- `h:<hostname>` - optional hostname override
- `#<tags>` - optional tags (same format as metrics)
- `m:<message>` - optional message

Example:
```sh
echo "_sc|my.service.check|0|#env:prod|m:Service is healthy" | nc -u -w1 127.0.0.1 8125
```
Service checks produce a metric with measurement name `statsd_service_check`:
Tags:

- `check_name`: The service check name
- `source`: Hostname (from the `h:` field or the default)

Fields:

- `status` (int): Status code (0-3)
- `status_text` (string): "ok", "warning", "critical", or "unknown"
- `message` (string): Optional message from the `m:` field

The plugin supports specifying templates for transforming statsd buckets into InfluxDB measurement names and tags. The templates have a measurement keyword, which can be used to specify parts of the bucket that are to be used in the measurement name. Other words in the template are used as tag names. For example, the following template:
```toml
templates = [
    "measurement.measurement.region"
]
```
would result in the following transformation:
```
cpu.load.us-west:100|g
=> cpu_load,region=us-west 100
```
Users can also filter the template to use based on the name of the bucket, using glob matching, like so:
```toml
templates = [
    "cpu.* measurement.measurement.region",
    "mem.* measurement.measurement.host"
]
```
which would result in the following transformation:
```
cpu.load.us-west:100|g
=> cpu_load,region=us-west 100

mem.cached.localhost:256|g
=> mem_cached,host=localhost 256
```
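The filtering and transformation above can be sketched as follows (a simplified illustration using `fnmatch` for the glob; the real behavior is implemented by Telegraf's template parser and supports more keywords):

```python
from fnmatch import fnmatch

def apply_templates(bucket_name: str, templates: list[str], separator: str = "_"):
    """Pick the first template whose glob filter matches, then map bucket parts
    to measurement pieces and tags. Simplified sketch of the documented behavior."""
    parts = bucket_name.split(".")
    for template in templates:
        pattern, _, layout = template.partition(" ")
        if not layout:                 # template with no filter part matches everything
            pattern, layout = "*", pattern
        if not fnmatch(bucket_name, pattern):
            continue
        measurement, tags = [], {}
        for part, word in zip(parts, layout.split(".")):
            if word == "measurement":
                measurement.append(part)
            else:
                tags[word] = part
        return separator.join(measurement), tags
    return bucket_name, {}

templates = [
    "cpu.* measurement.measurement.region",
    "mem.* measurement.measurement.host",
]
print(apply_templates("cpu.load.us-west", templates))
# -> ('cpu_load', {'region': 'us-west'})
```

Note how the `metric_separator` setting (here `_`) joins the `measurement` parts, matching the `cpu_load` result shown above.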
Consult the Template Patterns documentation for additional details.