# Benchmarking cut

## Performance profile

In normal use cases, a significant amount of the total execution time of `cut` is spent performing I/O. When invoked with the `-f` option (cut fields), some CPU time is spent detecting fields (in `Searcher::next`). Beyond that, a small amount of CPU time is spent breaking the input stream into lines.

## How to

When fixing bugs or adding features, you might want to compare performance before and after your code changes.

- `hyperfine` can be used to accurately measure and compare the total execution time of one or more commands.

  ```shell
  cargo build --release --package uu_cut

  hyperfine -w3 "./target/release/cut -f2-4,8 -d' ' input.txt" "cut -f2-4,8 -d' ' input.txt"
  ```

  You can put those two commands in a shell script to make sure you don't forget to rebuild after making changes.
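  Such a script might look like the following sketch (the script name, warmup count, and `cut` arguments here are assumptions; adjust the invocation to match the path you are measuring):

  ```shell
  #!/bin/sh
  # bench.sh -- hypothetical wrapper script (name and options are assumptions).
  # Rebuild first so the benchmark always measures the latest code, then
  # compare uutils cut against the system cut on the same input file.
  set -eu
  INPUT=${1:-input.txt}
  if [ -f Cargo.toml ] && command -v hyperfine >/dev/null 2>&1; then
    cargo build --release --package uu_cut
    hyperfine -w3 \
      "./target/release/cut -f2-4,8 -d' ' $INPUT" \
      "cut -f2-4,8 -d' ' $INPUT"
  else
    echo "run this from the coreutils repository root with hyperfine installed"
  fi
  ```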

When optimizing or fixing performance regressions, it can be useful to see how many times each function is called and how much time it takes.

- `cargo flamegraph` generates flame graphs from function-level metrics it records using `perf` or `dtrace`.

  ```shell
  cargo flamegraph --bin cut --package uu_cut -- -f1,3-4 input.txt > /dev/null
  ```

## What to benchmark

There are four different performance paths in `cut` to benchmark:

- Byte ranges (`-c`/`--characters` or `-b`/`--bytes`), e.g. `cut -c 2,4,6-`
- Byte ranges with an output delimiter, e.g. `cut -c 4- --output-delimiter=/`
- Fields (`-f`), e.g. `cut -f -4`
- Fields with an output delimiter, e.g. `cut -f 7-10 --output-delimiter=:`
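Before benchmarking, it can help to sanity-check each path on a tiny sample. The sample line below is arbitrary, and `--output-delimiter` requires GNU or uutils `cut` (it is not in POSIX `cut`):

```shell
# Exercise each of the four code paths on one arbitrary sample line.
line='a b c d e f g h i j'
printf '%s\n' "$line" | cut -c 2,4,6-                           # byte ranges
printf '%s\n' "$line" | cut -c 4- --output-delimiter=/          # byte ranges + output delimiter
printf '%s\n' "$line" | cut -d' ' -f -4                         # fields
printf '%s\n' "$line" | cut -d' ' -f 7-10 --output-delimiter=:  # fields + output delimiter
```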

Choose a test input file with a large number of lines so that program startup time does not significantly affect the benchmark.
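One way to generate such an input (the file name, line count, and field layout are arbitrary choices):

```shell
# Generate a million lines of ten space-separated fields as benchmark input.
awk 'BEGIN { for (i = 1; i <= 1000000; i++) print i, i, i, i, i, i, i, i, i, i }' > input.txt
```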