This crate contains Unicode utilities adapted for working with non-contiguous bytes (such as a rope).
Much of the content of this repo is generated automatically by scripts from Unicode data files.
This file is the result of some archaeology: documentation on how to rebuild the various files was missing, and I am attempting to reconstruct it.
Constructing the various tables requires the corresponding data files. These are
available through the components of the Unicode standard
directory, for a given Unicode version. In particular, we require
LineBreak.txt.
Place this file in a directory of its own; I use data.
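For orientation, here is a minimal sketch of how the data lines in LineBreak.txt can be read. It is not the code in tools/mk_tables.py; the function name `parse_line_break` and the inline sample are illustrative assumptions. It relies only on the general layout of Unicode data files: a code point or `first..last` range, a semicolon, the property value, and an optional `#` comment.

```python
from typing import Iterable, Iterator, Tuple


def parse_line_break(lines: Iterable[str]) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, line_break_class) for each data line.

    Hypothetical helper for illustration; not part of this crate's tooling.
    """
    for raw in lines:
        # Drop the trailing comment, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        code_points, prop = (field.strip() for field in line.split(";", 1))
        if ".." in code_points:
            first, last = code_points.split("..", 1)
        else:
            first = last = code_points  # single code point
        yield int(first, 16), int(last, 16), prop


# A few lines in the style of LineBreak.txt (illustrative, not a full file).
sample = """\
# sample data
0009;BA # Cc       <control-0009>
000A;LF # Cc       <control-000A>
0041..005A;AL # Lu  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
"""

ranges = list(parse_line_break(sample.splitlines()))
for first, last, prop in ranges:
    print(f"U+{first:04X}..U+{last:04X} {prop}")
```

With a real download, the same function should accept an open file handle, for example `open("data/LineBreak.txt", encoding="utf-8")`, assuming the data directory described above.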
src/tables.rs is generated by the script at tools/mk_tables.py,
and can be rebuilt with:
$ python3 tools/mk_tables.py data > src/tables.rs
where data is the path to the created data directory.
The unit tests in src/lib.rs are also generated by this script: run it once
with the --tests flag and once with the --tests-str flag (two separate
invocations), then copy the output of each into the bodies of these tests.