doc/tracing.md
Bitcoin Core includes statically defined tracepoints to allow for more observability during development, debugging, code review, and production usage. These tracepoints make it possible to keep track of custom statistics and enable detailed monitoring of otherwise hidden internals. They have little to no performance impact when unused.
eBPF and USDT Overview
======================

                ┌──────────────────┐            ┌──────────────┐
                │ tracing script   │            │ bitcoind     │
                │==================│      2.    │==============│
                │  eBPF  │ tracing │      hooks │              │
                │  code  │ logic   │      into┌─┤►tracepoint 1─┼───┐ 3.
                └────┬───┴──▲──────┘          ├─┤►tracepoint 2 │   │ pass args
            1.       │      │ 4.              │ │ ...          │   │ to eBPF
    User    compiles │      │ pass data to    │ └──────────────┘   │ program
    Space    & loads │      │ tracing script  │                    │
    ─────────────────┼──────┼─────────────────┼────────────────────┼───
    Kernel           │      │                 │                    │
    Space       ┌──┬─▼──────┴─────────────────┴────────────┐       │
                │  │  eBPF program                         │◄──────┘
                │  └───────────────────────────────────────┤
                │ eBPF kernel Virtual Machine (sandboxed)  │
                └──────────────────────────────────────────┘

1. The tracing script compiles the eBPF code and loads the eBPF program into a kernel VM
2. The eBPF program hooks into one or more tracepoints
3. When the tracepoint is called, the arguments are passed to the eBPF program
4. The eBPF program processes the arguments and returns data to the tracing script
The Linux kernel can hook into the tracepoints during runtime and pass data to sandboxed eBPF programs running in the kernel. These eBPF programs can, for example, collect statistics or pass data back to user-space scripts for further processing.
The two main eBPF front-ends with support for USDT are bpftrace and
BPF Compiler Collection (BCC). BCC is used for complex tools and daemons and
bpftrace is preferred for one-liners and shorter scripts. Examples for both can
be found in contrib/tracing.
The currently available tracepoints are listed here.
net

net:inbound_message

Is called when a message is received from a peer over the P2P network. Passes information about our peer, the connection and the message as arguments.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. pointer to C-style String (max. length 20 characters)
5. uint64
6. pointer to unsigned chars (i.e. bytes)

Note: The message is passed to the tracepoint in full. However, due to space limitations in the eBPF kernel VM, it might not be possible to pass the message to user space in full. Messages longer than 32kb might be cut off. This can be detected in tracing scripts by comparing the message size to the length of the passed message.
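
The size comparison described in the note could look roughly like this on the tracing-script side. This is a minimal sketch, not code taken from contrib/tracing: the function name `is_truncated` and the sample events are assumptions made for illustration. `reported_size` stands for the message size argument and `captured` for the bytes that actually reached user space.

```python
def is_truncated(reported_size: int, captured: bytes) -> bool:
    """Return True if fewer bytes reached user space than the tracepoint reported."""
    return len(captured) < reported_size


# Hypothetical sample events: (reported message size, captured payload).
events = [
    (4, b"ping"),                 # complete message
    (100_000, b"\x00" * 32_768),  # payload cut off by the capture buffer
]

for reported_size, captured in events:
    status = "truncated" if is_truncated(reported_size, captured) else "complete"
    print(f"{reported_size} bytes reported, {len(captured)} captured: {status}")
```

A script can use such a check to skip or flag messages whose payload should not be parsed further.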

net:outbound_message

Is called when a message is sent to a peer over the P2P network. Passes information about our peer, the connection and the message as arguments.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. pointer to C-style String (max. length 20 characters)
5. uint64
6. pointer to unsigned chars (i.e. bytes)

Note: The message is passed to the tracepoint in full. However, due to space limitations in the eBPF kernel VM, it might not be possible to pass the message to user space in full. Messages longer than 32kb might be cut off. This can be detected in tracing scripts by comparing the message size to the length of the passed message.

net:inbound_connection

Is called when a new inbound connection is opened to us. Passes information about the peer and the number of inbound connections including the newly opened connection.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. uint32 (1 = IPv4, 2 = IPv6, 3 = Onion, 4 = I2P, 5 = CJDNS). See Network enum in netaddress.h.
5. uint64 including the newly opened inbound connection.

net:outbound_connection

Is called when a new outbound connection is opened by us. Passes information about the peer and the number of outbound connections including the newly opened connection.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. uint32 (1 = IPv4, 2 = IPv6, 3 = Onion, 4 = I2P, 5 = CJDNS). See Network enum in netaddress.h.
5. uint64 including the newly opened outbound connection.

net:evicted_inbound_connection

Is called when an inbound connection is evicted by us. Passes information about the evicted peer and the time at connection establishment.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. uint32 (1 = IPv4, 2 = IPv6, 3 = Onion, 4 = I2P, 5 = CJDNS). See Network enum in netaddress.h.
5. uint64.

net:misbehaving_connection

Is called when a connection is misbehaving. Passes the peer id and a reason for the peer's misbehavior.
Arguments passed:
1. int64.
2. pointer to C-style String (max. length 128 characters).

net:closed_connection

Is called when a connection is closed. Passes information about the closed peer and the time at connection establishment.
Arguments passed:
1. int64
2. pointer to C-style String (normally up to 68 characters)
3. pointer to C-style String (max. length 20 characters)
4. uint32 (1 = IPv4, 2 = IPv6, 3 = Onion, 4 = I2P, 5 = CJDNS). See Network enum in netaddress.h.
5. uint64.

validation

validation:block_connected

Is called after a block is connected to the chain. Can, for example, be used
to benchmark block connections together with -reindex.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. int32
3. uint64
4. int32
5. uint64
6. uint64

utxocache

The following tracepoints cover the in-memory UTXO cache. UTXOs are, for example,
added to and removed (spent) from the cache when we connect a new block.
Note: Bitcoin Core uses temporary clones of the main UTXO cache
(chainstate.CoinsTip()). For example, the RPCs generateblock and
getblocktemplate call TestBlockValidity(), which applies the UTXO set
changes to a temporary cache. Similarly, mempool consistency checks, which are
frequent on regtest, also apply the UTXO set changes to a temporary cache.
Changes to the main UTXO cache and to temporary caches trigger the tracepoints.
We can't tell if a temporary cache or the main cache was changed.

utxocache:flush

Is called after the in-memory UTXO cache is flushed.
Arguments passed:
1. int64
2. uint32. It's an enumerator class with values
   0 (NONE), 1 (IF_NEEDED), 2 (PERIODIC), 3 (FORCE_FLUSH), 4 (FORCE_SYNC)
3. uint64
4. uint64
5. bool

utxocache:add

Is called when a coin is added to a UTXO cache. This can be a temporary UTXO cache too.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. uint32
3. uint32
4. int64
5. bool

utxocache:spent

Is called when a coin is spent from a UTXO cache. This can be a temporary UTXO cache too.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. uint32
3. uint32
4. int64
5. bool

utxocache:uncache

Is called when a coin is purposefully unloaded from a UTXO cache. This happens, for example, when we load a UTXO into a cache while trying to accept a transaction that turns out to be invalid. The loaded UTXO is uncached to avoid filling our UTXO cache with irrelevant UTXOs.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. uint32
3. uint32
4. int64
5. bool

coin_selection

coin_selection:selected_coins

Is called when SelectCoins completes.
Arguments passed:
1. pointer to C-style string
2. pointer to C-style string
3. int64
4. int64
5. int64

coin_selection:normal_create_tx_internal

Is called when the first CreateTransactionInternal completes.
Arguments passed:
1. pointer to C-style string
2. CreateTransactionInternal succeeded as bool
3. int64
4. int32

coin_selection:attempting_aps_create_tx

Is called when CreateTransactionInternal is called the second time for the optimistic
Avoid Partial Spends selection attempt. This is used to determine whether the next
tracepoints called are for the Avoid Partial Spends solution, or a different transaction.
Arguments passed:
1. pointer to C-style string

coin_selection:aps_create_tx_internal

Is called when the second CreateTransactionInternal with Avoid Partial Spends enabled completes.
Arguments passed:
1. pointer to C-style string
2. bool
3. CreateTransactionInternal succeeded as bool
4. int64
5. int32

mempool

mempool:added

Is called when a transaction is added to the node's mempool. Passes information about the transaction.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. int32
3. int64

mempool:removed

Is called when a transaction is removed from the node's mempool. Passes information about the transaction.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. pointer to C-style String (max. length 9 characters)
3. int32
4. int64
5. uint64

mempool:replaced

Is called when a transaction in the node's mempool is replaced by another. Passes information about the replaced and replacement transactions.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. int32
3. int64
4. uint64
5. pointer to unsigned chars (i.e. 32 bytes in little-endian)
6. int32
7. int64
8. bool indicating whether argument 5 is a transaction ID or a package hash (true if it's a transaction ID)

Note: When a replacement transaction or package replaces multiple existing transactions in the mempool, the tracepoint is called once for each replaced transaction, with the data of the replacement transaction or package being the same in each call.
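
The 32-byte hashes arrive in little-endian byte order, while transaction IDs and block hashes are conventionally displayed as hex with the byte order reversed. A tracing script might convert them as sketched below. The helper name `hash_to_hex` and the sample bytes are assumptions for illustration only.

```python
def hash_to_hex(raw: bytes) -> str:
    """Convert a 32-byte little-endian hash into the usual display hex string."""
    if len(raw) != 32:
        raise ValueError(f"expected 32 bytes, got {len(raw)}")
    # Reverse the byte order, then hex-encode.
    return raw[::-1].hex()


# Hypothetical hash bytes as they might be read from a tracepoint argument.
raw_hash = bytes(range(32))
print(hash_to_hex(raw_hash))
```

Applying the same helper to both the replaced and the replacement hash makes it easy to group the repeated calls described in the note by their shared replacement ID.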

mempool:rejected

Is called when a transaction is not permitted to enter the mempool. Passes information about the rejected transaction.
Arguments passed:
1. pointer to unsigned chars (i.e. 32 bytes in little-endian)
2. pointer to C-style String (max. length 118 characters)

Use the TRACEPOINT macro to add a new tracepoint. If not yet included, include
util/trace.h (defines the tracepoint macros) with #include <util/trace.h>.
Each tracepoint needs a context and an event. Please use snake_case and
try to make sure that the tracepoint names make sense even without detailed
knowledge of the implementation details. You can pass zero to twelve arguments
to the tracepoint. Each tracepoint also needs a global semaphore. The semaphore
gates the tracepoint arguments from being processed if we are not attached to
the tracepoint. Add a TRACEPOINT_SEMAPHORE(context, event) with the context
and event of your tracepoint in the top-level namespace at the beginning of
the file. Do not forget to update the tracepoint list in this document.
For example, the net:outbound_message tracepoint in src/net.cpp takes six
arguments:

    // src/net.cpp
    TRACEPOINT_SEMAPHORE(net, outbound_message);
    …
    void CConnman::PushMessage(…) {
      …
      TRACEPOINT(net, outbound_message,
          pnode->GetId(),
          pnode->m_addr_name.c_str(),
          pnode->ConnectionTypeAsString().c_str(),
          sanitizedType.c_str(),
          msg.data.size(),
          msg.data.data()
      );
      …
    }

If needed, an extra if (TRACEPOINT_ACTIVE(context, event)) {...} check can be
used to prepare somewhat expensive arguments right before the tracepoint. While
the tracepoint arguments are only prepared when we attach something to the
tracepoint, argument preparation should never hang the process. Hashing and
serialization of data structures are probably fine; a sleep(10s) is not.

    // An example tracepoint with an expensive argument.
    TRACEPOINT_SEMAPHORE(example, gated_expensive_argument);
    …
    if (TRACEPOINT_ACTIVE(example, gated_expensive_argument)) {
        // Only computed while something is attached to the tracepoint.
        auto expensive_argument = expensive_calculation();
        TRACEPOINT(example, gated_expensive_argument, expensive_argument);
    }

Tracepoints need a clear motivation and use case. The motivation should outweigh the impact on, for example, code readability. There is no point in adding tracepoints that don't end up being used.
When adding a new tracepoint, provide an example. Examples can show the use case and help reviewers test that the tracepoint works as intended. The examples can be kept simple but should give others a starting point when working with the tracepoint. See existing examples in contrib/tracing/.
Tracepoints should have a semi-stable API. Users should be able to rely on the tracepoints for scripting. This means tracepoints need to be documented, and the argument order ideally should not change. If there is an important reason to change argument order, make sure to document the change and update the examples using the tracepoint.
Keep the eBPF Virtual Machine limits in mind. eBPF programs receiving data from the tracepoints run in a sandboxed Linux kernel VM. This VM has a limited stack size of 512 bytes. Check if it makes sense to pass larger amounts of data, for example, with a tracing script that can handle the passed data.
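
To get a rough feeling for the 512-byte limit, the size of a candidate event struct can be estimated before writing the eBPF program. The sketch below uses Python's `ctypes` with the native alignment of the host, which only approximates what an eBPF compiler would produce. The struct names and field layout are hypothetical, and the 32768-byte payload is an assumption based on the 32kb figure mentioned for P2P messages.

```python
import ctypes

EBPF_STACK_LIMIT = 512  # bytes, the stack size stated above


class PeerEvent(ctypes.Structure):
    """Hypothetical metadata-only event; field sizes are illustrative."""
    _fields_ = [
        ("peer_id", ctypes.c_int64),
        ("peer_addr", ctypes.c_char * 68),
        ("conn_type", ctypes.c_char * 20),
        ("msg_type", ctypes.c_char * 20),
        ("msg_size", ctypes.c_uint64),
    ]


class PeerEventWithPayload(ctypes.Structure):
    """Hypothetical event that also carries up to 32768 payload bytes."""
    _fields_ = [
        ("meta", PeerEvent),
        ("payload", ctypes.c_ubyte * 32768),
    ]


def fits_on_stack(struct_type) -> bool:
    """Return True if the struct is no larger than the eBPF stack limit."""
    return ctypes.sizeof(struct_type) <= EBPF_STACK_LIMIT


for struct_type in (PeerEvent, PeerEventWithPayload):
    size = ctypes.sizeof(struct_type)
    print(f"{struct_type.__name__}: {size} bytes, fits: {fits_on_stack(struct_type)}")
```

A struct that does not fit is a hint that the tracing script needs another way to handle the data, or that the tracepoint should pass less of it.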
bpftrace argument limit

While tracepoints can have up to 12 arguments, bpftrace scripts currently only
support reading from the first six arguments (arg0 through arg5) on x86_64.
bpftrace currently lacks real support for handling and printing binary data,
like block header hashes and txids. When a tracepoint passes more than six
arguments, string and integer arguments should preferably be placed in the
first six argument fields. Binary data can be placed in later arguments. BCC
supports reading from all 12 arguments.
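
The ordering guideline can be checked mechanically when designing a tracepoint with more than six arguments. The following sketch is purely illustrative: the helper name, the kind labels ("string", "integer", "binary") and the two sample layouts are assumptions, not part of Bitcoin Core.

```python
from typing import List, Sequence

BPFTRACE_READABLE_ARGS = 6  # arg0 through arg5 on x86_64, per the text above


def args_hidden_from_bpftrace(arg_kinds: Sequence[str]) -> List[int]:
    """Return 0-based positions of string/integer arguments beyond the first six."""
    return [
        index
        for index, kind in enumerate(arg_kinds)
        if index >= BPFTRACE_READABLE_ARGS and kind in ("string", "integer")
    ]


# Hypothetical seven-argument tracepoint layouts.
binary_last = ["integer", "string", "string", "string", "integer", "integer", "binary"]
binary_first = ["binary", "integer", "string", "string", "string", "integer", "integer"]

print(args_hidden_from_bpftrace(binary_last))   # prints []
print(args_hidden_from_bpftrace(binary_first))  # prints [6]
```

An empty result means every string and integer argument sits in a position that bpftrace can read.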
Generally, strings should be passed into the TRACEPOINT macros as pointers to
C-style strings (a null-terminated sequence of characters). For C++
std::strings, c_str() can be used. It's recommended to document the
maximum expected string size if known.
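
On the receiving side, a tracing script typically copies such a string into a fixed-size buffer, which is why a documented maximum length is useful. A minimal sketch of decoding that buffer in user space follows. The helper name `decode_c_string` and the sample buffer are assumptions for illustration.

```python
def decode_c_string(buf: bytes, max_len: int) -> str:
    """Decode a NUL-terminated string from the first max_len bytes of a buffer."""
    # Keep only the bytes before the first NUL terminator, if there is one.
    data = buf[:max_len].split(b"\x00", 1)[0]
    # Replace undecodable bytes instead of raising inside a tracing loop.
    return data.decode("utf-8", errors="replace")


# Hypothetical 20-byte connection type buffer followed by unrelated bytes.
buffer = b"outbound-full-relay\x00leftover"
print(decode_c_string(buffer, 20))
```

Passing the documented maximum length as `max_len` keeps the script from reading past the field even when the terminator is missing.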
Multiple tools can list the available tracepoints in a bitcoind binary with
USDT support.
To list probes in Bitcoin Core, use info probes in gdb:

    $ gdb ./build/bin/bitcoind
    …
    (gdb) info probes
    Type Provider   Name             Where              Semaphore          Object
    stap net        inbound_message  0x000000000014419e 0x0000000000d29bd2 /build/bin/bitcoind
    stap net        outbound_message 0x0000000000107c05 0x0000000000d29bd0 /build/bin/bitcoind
    stap validation block_connected  0x00000000002fb10c 0x0000000000d29bd8 /build/bin/bitcoind
    …

readelf

The readelf tool can be used to display the USDT tracepoints in Bitcoin Core.
Look for the notes with the description NT_STAPSDT.

    $ readelf -n ./build/bin/bitcoind | grep NT_STAPSDT -A 4 -B 2
    Displaying notes found in: .note.stapsdt
      Owner                 Data size	Description
      stapsdt              0x0000005d	NT_STAPSDT (SystemTap probe descriptors)
        Provider: net
        Name: outbound_message
        Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000d29bd0
        Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx
    …

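
The readelf notes can also be post-processed, for example to check a probe's argument count and sizes against the documentation. The sketch below parses the sample output shown above. It assumes the usual SystemTap SDT argument notation `SIZE@LOCATION`, in which a negative size marks a signed value. The regular expression and function name are illustrative and only tested against this sample.

```python
import re

# Sample text copied from the readelf output above.
SAMPLE = """\
  stapsdt              0x0000005d	NT_STAPSDT (SystemTap probe descriptors)
    Provider: net
    Name: outbound_message
    Location: 0x0000000000107c05, Base: 0x0000000000579c90, Semaphore: 0x0000000000d29bd0
    Arguments: -8@%r12 8@%rbx 8@%rdi 8@192(%rsp) 8@%rax 8@%rdx
"""

PROBE_RE = re.compile(
    r"Provider:\s*(?P<provider>\S+)\s+"
    r"Name:\s*(?P<name>\S+)\s+"
    r"Location:.*?\n\s*Arguments:[ \t]*(?P<args>.*)"
)


def parse_probes(text: str) -> list:
    """Extract tracepoint names and argument descriptors from readelf -n output."""
    probes = []
    for match in PROBE_RE.finditer(text):
        args = []
        for token in match.group("args").split():
            size, _, location = token.partition("@")
            args.append({
                "bytes": abs(int(size)),
                "signed": int(size) < 0,  # negative size means signed
                "location": location,
            })
        name = f"{match.group('provider')}:{match.group('name')}"
        probes.append({"tracepoint": name, "args": args})
    return probes


for probe in parse_probes(SAMPLE):
    print(probe["tracepoint"], len(probe["args"]), "arguments")
    for index, arg in enumerate(probe["args"]):
        sign = "signed" if arg["signed"] else "unsigned"
        print(f"  arg{index}: {arg['bytes']} bytes, {sign}, at {arg['location']}")
```

For the sample, the first argument is reported as a signed 8-byte value, which matches the int64 documented as the first argument of net:outbound_message.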
tplist

The tplist tool is provided by BCC (see Installing BCC). It displays kernel
tracepoints or USDT probes and their formats (for more information, see the
tplist usage demonstration). There are slight binary naming differences
between distributions. For example, on
Ubuntu the binary is called tplist-bpfcc.

    $ tplist -l ./build/bin/bitcoind -v
    b'net':b'outbound_message' [sema 0xd29bd0]
      1 location(s)
      6 argument(s)
    …