# Benchmarking cut

## Performance profile

In normal use cases, a significant amount of the total execution time of `cut` is spent performing I/O. When invoked with the `-f` option (cut fields), some CPU time is spent detecting fields (in `Searcher::next`). Beyond that, a small amount of CPU time is spent breaking the input stream into lines.

## How to

When fixing bugs or adding features, you might want to compare performance before and after your code changes.

- `hyperfine` can be used to accurately measure and compare the total execution time of one or more commands.

  ```shell
  cargo build --release --package uu_cut

  hyperfine -w3 "./target/release/cut -f2-4,8 -d' ' input.txt" "cut -f2-4,8 -d' ' input.txt"
  ```

  You can put those two commands in a shell script to make sure you don't forget to rebuild after making changes.
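  Such a script might look like the following sketch (the script name, warmup count, and `cut` arguments here are assumptions; adjust the invocation to match the path you are measuring):

  ```shell
  #!/bin/sh
  # bench.sh -- hypothetical wrapper script (name and options are assumptions).
  # Rebuild first so the benchmark always measures the latest code, then
  # compare uutils cut against the system cut on the same input file.
  set -eu
  INPUT=${1:-input.txt}
  if [ -f Cargo.toml ] && command -v hyperfine >/dev/null 2>&1; then
    cargo build --release --package uu_cut
    hyperfine -w3 \
      "./target/release/cut -f2-4,8 -d' ' $INPUT" \
      "cut -f2-4,8 -d' ' $INPUT"
  else
    echo "run this from the coreutils repository root with hyperfine installed"
  fi
  ```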

When optimizing or fixing performance regressions, it can be useful to see how many times each function is called and how much time it takes.

- `cargo flamegraph` generates flame graphs from function-level metrics it records using `perf` or `dtrace`.

  ```shell
  cargo flamegraph --bin cut --package uu_cut -- -f1,3-4 input.txt > /dev/null
  ```

## What to benchmark

There are four different performance paths in `cut` to benchmark:

- Byte ranges (`-c`/`--characters` or `-b`/`--bytes`), e.g. `cut -c 2,4,6-`
- Byte ranges with an output delimiter, e.g. `cut -c 4- --output-delimiter=/`
- Fields (`-f`), e.g. `cut -f -4`
- Fields with an output delimiter, e.g. `cut -f 7-10 --output-delimiter=:`
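Before benchmarking, it can help to sanity-check each path on a tiny sample. The sample line below is arbitrary, and `--output-delimiter` requires GNU or uutils `cut` (it is not in POSIX `cut`):

```shell
# Exercise each of the four code paths on one arbitrary sample line.
line='a b c d e f g h i j'
printf '%s\n' "$line" | cut -c 2,4,6-                           # byte ranges
printf '%s\n' "$line" | cut -c 4- --output-delimiter=/          # byte ranges + output delimiter
printf '%s\n' "$line" | cut -d' ' -f -4                         # fields
printf '%s\n' "$line" | cut -d' ' -f 7-10 --output-delimiter=:  # fields + output delimiter
```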

Choose a test input file with a large number of lines so that program startup time does not significantly affect the benchmark.
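One way to generate such an input (the file name, line count, and field layout are arbitrary choices):

```shell
# Generate a million lines of ten space-separated fields as benchmark input.
awk 'BEGIN { for (i = 1; i <= 1000000; i++) print i, i, i, i, i, i, i, i, i, i }' > input.txt
```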