Back to Chromium

README

components/url_formatter/spoof_checks/top_domains/README.md

150.0.7829.21.6 KB
Original Source

Top Domains Utilities

  • domains.list

    A top domain list, one per line. Used as an input to make_top_domain_skeletons. See http://go/chrome-top-domains-update for update instructions.

    This list can contain ASCII and unicode domains. Unicode domains should not be encoded in punycode.

  • domains.skeletons

    The checked-in output of make_top_domain_skeletons. Processed during the build to generate domains-trie-inc.cc, which is used by idn_spoof_checker.cc. This must be regenerated as follows if ICU is updated, since skeletons can differ across ICU versions:

    $ ninja -C $build_outdir make_top_domain_skeletons $ $build_outdir/make_top_domain_skeletons

  • test_domains.list

    A list of domains to use in IDNToUnicode test instead of the actual top domain list. Manually edited to match what's in IDNToUnicode test.

  • test_domains.skeletons

    Generated output of test_domains.list along with domains.skeletons by make_top_domain_skeletons.

  • top_domain_generator.cc

    Generates the Huffman encoded Trie containing a map of skeletons to top domains. For now, the skeletons must be ASCII. Unicode domains are supported but they are written as punycode to the trie.

  • make_top_domain_list_variables.cc / top_bucket_domains.h

    make_top_domain_list_variables.cc is run at compile time to generate information about the top bucket domains (currently, skeletons and keywords are created from these domains). This information is then embedded directly into the chrome binary, and can be accessed via the variables in the top_bucketdomains namespace.