.. _cephfs-top:

CephFS provides a top(1)-like utility that displays various Ceph Filesystem metrics
in real time. cephfs-top is a curses-based Python script that uses the stats
plugin in Ceph Manager to fetch and display metrics.

Ceph Filesystem clients periodically forward various metrics to Ceph Metadata Servers (MDSs). Each active MDS forwards its respective set of metrics to MDS rank zero, where the metrics are aggregated and forwarded to Ceph Manager.

Metrics are divided into two categories: global and per-mds. Global metrics represent a set of metrics for the filesystem as a whole (e.g., client read latency), whereas per-mds metrics are for a particular MDS rank (e.g., number of subtrees handled by an MDS).

.. note:: Currently, only global metrics are tracked.

The stats plugin is disabled by default and should be enabled via::

  $ ceph mgr module enable stats

Once enabled, Ceph Filesystem metrics can be fetched via::

  $ ceph fs perf stats

The output format is JSON and contains fields as follows:

- version: Version of stats output
- global_counters: List of global performance metrics
- counters: List of per-mds performance metrics
- client_metadata: Ceph Filesystem client metadata
- global_metrics: Global performance counters
- metrics: Per-MDS performance counters (currently, empty) and delayed ranks

.. note:: delayed_ranks is the set of active MDS ranks that are reporting stale metrics.
          This can happen in cases such as a (temporary) network issue between MDS rank zero
          and other active MDSs.

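
As a rough illustration of consuming this output, the sketch below parses a JSON document and reports any delayed ranks. Only the top-level field names are taken from the list above. The sample payload, its values, the nesting of ``delayed_ranks`` under ``metrics``, and the helper name ``summarize`` are assumptions made for the example, not a specification of the real output.

```python
import json

# Hypothetical, trimmed-down payload. Only the top-level field names come from
# the documentation; the values and nested layout are assumed for illustration.
SAMPLE = """
{
  "version": 1,
  "global_counters": ["cap_hit", "read_latency"],
  "counters": [],
  "client_metadata": {},
  "global_metrics": {},
  "metrics": {"delayed_ranks": [1]}
}
"""

EXPECTED_FIELDS = (
    "version", "global_counters", "counters",
    "client_metadata", "global_metrics", "metrics",
)


def summarize(raw):
    """Parse the JSON text and report missing fields and delayed ranks."""
    stats = json.loads(raw)
    missing = [name for name in EXPECTED_FIELDS if name not in stats]
    # Assumption: delayed ranks are reported as a list under "metrics".
    delayed = stats.get("metrics", {}).get("delayed_ranks", [])
    return {"missing_fields": missing, "delayed_ranks": list(delayed)}


summary = summarize(SAMPLE)
print(summary)
```

In practice the input text would come from the output of ``ceph fs perf stats`` rather than from an embedded sample.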
Metrics can be fetched for a particular client and/or for a set of active MDSs. To fetch metrics for a particular client (e.g., for client-id: 1234)::

  $ ceph fs perf stats --client_id=1234

To fetch metrics only for a subset of active MDSs (e.g., MDS rank 1 and 2)::

  $ ceph fs perf stats --mds_rank=1,2

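
When these commands are driven from a script, the argument list can be assembled programmatically. The following is a minimal sketch: the helper name ``perf_stats_cmd`` is hypothetical, and it uses only the two options shown above. It builds the argument list without running anything, so it can be inspected before being handed to something like ``subprocess.run``.

```python
def perf_stats_cmd(client_id=None, mds_ranks=None):
    """Build the argument list for `ceph fs perf stats` with optional filters."""
    cmd = ["ceph", "fs", "perf", "stats"]
    if client_id is not None:
        # int() rejects anything that is not a plain numeric client id.
        cmd.append(f"--client_id={int(client_id)}")
    if mds_ranks:
        # Ranks are passed as a comma-separated list, e.g. 1,2.
        ranks = ",".join(str(int(rank)) for rank in mds_ranks)
        cmd.append(f"--mds_rank={ranks}")
    return cmd


print(" ".join(perf_stats_cmd(client_id=1234)))
print(" ".join(perf_stats_cmd(mds_ranks=[1, 2])))
```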
The cephfs-top utility relies on the stats plugin to fetch performance metrics and display them in a
top(1)-like format. cephfs-top is available as part of the cephfs-top package.

By default, cephfs-top uses client.fstop user to connect to a Ceph cluster::

  $ ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r'
  $ cephfs-top


cephfs-top displays the following fields:

chit : Cap hit
    Percentage of file capability hits over total number of caps

dlease : Dentry lease
    Percentage of dentry leases handed out over the total dentry lease requests

ofiles : Opened files
    Number of opened files

oicaps : Pinned caps
    Number of pinned caps

oinodes : Opened inodes
    Number of opened inodes

rtio : Total size of read IOs
    Number of bytes read in input/output operations generated by all processes

wtio : Total size of write IOs
    Number of bytes written in input/output operations generated by all processes

raio : Average size of read IOs
    Mean of number of bytes read in input/output operations generated by all processes over total IO done

waio : Average size of write IOs
    Mean of number of bytes written in input/output operations generated by all processes over total IO done

rsp : Read speed
    Speed of read IOs with respect to the duration since the last refresh of clients

wsp : Write speed
    Speed of write IOs with respect to the duration since the last refresh of clients

rlatavg : Average read latency
    Mean value of the read latencies

rlatsd : Standard deviation (variance) for read latency
    Dispersion of the metric for the read latency relative to its mean

wlatavg : Average write latency
    Mean value of the write latencies

wlatsd : Standard deviation (variance) for write latency
    Dispersion of the metric for the write latency relative to its mean

mlatavg : Average metadata latency
    Mean value of the metadata latencies

mlatsd : Standard deviation (variance) for metadata latency
    Dispersion of the metric for the metadata latency relative to its mean

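
To make the speed and latency fields more concrete, the sketch below shows one way such values could be derived from raw samples. It is not how cephfs-top or the stats plugin computes them. The function names, the sample numbers, and the choice of the population standard deviation are all assumptions made for illustration.

```python
import statistics


def io_speed(prev_bytes, cur_bytes, elapsed_seconds):
    """Bytes per second transferred between two refreshes (rsp/wsp style)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (cur_bytes - prev_bytes) / elapsed_seconds


def latency_summary(samples):
    """Return (mean, population standard deviation) of latency samples."""
    return statistics.mean(samples), statistics.pstdev(samples)


# Hypothetical numbers: 2,000,000 bytes read over a 2-second refresh window.
read_speed = io_speed(prev_bytes=1_000_000, cur_bytes=3_000_000, elapsed_seconds=2.0)

# Hypothetical latency samples in milliseconds.
lat_avg, lat_sd = latency_summary([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

print(read_speed, lat_avg, lat_sd)
```

A small standard deviation relative to the mean indicates that latencies are clustered tightly around the average, while a large one points to occasional slow operations.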

To use a non-default user (other than client.fstop) use::

  $ cephfs-top --id <name>

By default, cephfs-top connects to the cluster named ceph. To use a non-default cluster name::

  $ cephfs-top --cluster <cluster>

cephfs-top refreshes stats every second by default. To choose a different refresh interval use::

  $ cephfs-top -d <seconds>

The refresh interval should be a positive integer.

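
A constraint like this is commonly enforced at argument-parsing time. The sketch below shows the general technique with Python's ``argparse``. It is an illustration only and does not reproduce cephfs-top's own option handling. The program name ``top-like-demo`` and the helper ``positive_int`` are hypothetical.

```python
import argparse


def positive_int(text):
    """argparse type callback that accepts only integers greater than zero."""
    value = int(text)  # a ValueError here is reported by argparse as a usage error
    if value <= 0:
        raise argparse.ArgumentTypeError(f"{text!r} is not a positive integer")
    return value


parser = argparse.ArgumentParser(prog="top-like-demo")
parser.add_argument("-d", dest="delay", type=positive_int, default=1,
                    help="refresh interval in seconds (positive integer)")

args = parser.parse_args(["-d", "5"])
print(args.delay)
```

Validating in the type callback keeps the check next to the option definition and lets the parser produce the error message.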
To dump the metrics to stdout without creating a curses display use::

  $ cephfs-top --dump

To dump the metrics of the given filesystem to stdout without creating a curses display use::

  $ cephfs-top --dumpfs <fs_name>


The following interactive commands are available while cephfs-top is running:

m : Filesystem selection
    Displays a menu of filesystems for selection.

s : Sort field selection
    Designates the sort field. 'cap_hit' is the default.

l : Client limit
    Sets the limit on the number of clients to be displayed.

r : Reset
    Resets the sort field and limit value to the default.

q : Quit
    Exits the utility if you are at the home screen (all filesystem info); otherwise, returns to the home screen.

The metrics display can be scrolled using the Arrow Keys, PgUp/PgDn, Home/End and mouse.
Sample screenshot running cephfs-top with 2 filesystems:
.. image:: cephfs-top.png

.. note:: The minimum compatible Python version for cephfs-top is 3.6.0. cephfs-top is supported on RHEL 8, Ubuntu 18.04, CentOS 8, and later releases of these distributions.