README - Chromium — ContextQMD

tools/gritsettings README

This directory contains several files that apply globally to the Chrome resource generation system (which uses GRIT - see tools/grit).

resource_ids.spec: This file is used to assign fake start IDs for resources and strings used by Chromium. This is done to ensure that resource ids are unique across all .grd files. If you are adding a new .grd file, please add a new entry to this file. Detailed instructions are found below.

translation_expectations.pyl: Specifies which grd files should be translated and into which languages they should be translated. Used by the internal translation process.

startup_resources_[platform].txt: These files provide a pre-determined resource id ordering that will be used by GRIT when assigning resources ids. The goal is to have the resource loaded during Chrome startup be ordered first in the .pak files, so that fewer page faults are suffered during Chrome start up. To update or generate one of these files, follow these instructions:

Build a Chrome official release build and launch it with command line: --print-resource-ids and save the output to a file (e.g. res.txt).
Generate the startup_resources_[platform].txt via the following command (you can redirect its output to the new file location):

tools/grit/grit/format/gen_predetermined_ids.py res_ids.txt out/gn

In the above command, res_ids.txt is the file produced in step 1 and out/gn is you Chrome build directory where you compiled Chrome. The output of the command can be added as a new startup_resource_[platform]
If this is a new file, modify tools/grit/grit_rule.gni to set its path via grit_predetermined_resource_ids_file for the given platform.

Updating resource_ids.spec

The [###] integers in resource_ids.spec are fake start IDs, whose relative ordering determine relationship among .grd entries. Actual resource IDs are dynamically generated, with ID usage requirements from .grd files if static, or injected using "META": {"sizes": {...}} if generated.

To add a new .grd entry, first find the location to add it:

Find the matching section (chrome/app/, chrome/browser/, etc.).
Preserve alphabetical order within a section.

Simple case: Sufficient most of the time

This applies if the new entry is not at the beginning or end of sections, and nearby .grd entries have distinct-looking fake start IDs without "META": {"join": 2} oddity. For example, adding foo/bar/ccc.grd here:

  "foo/bar/aaa.grd": {
    "includes": [1140],
    "structures": [1160],
  },
  "foo/bar/bbb.grd": {
    "includes": [1180],
  },
  "foo/bar/ddd.grd": {
    "includes": [1200],
  },

The best location is between bbb.grd and ddd.grd. Each resource type (say, includes and messages) must have a sub-entry with fake start IDs. The values select should be intermediate to neighboring ones, and sorted, e.g.:

  "foo/bar/aaa.grd": {
    "includes": [1140],
    "structures": [1160],
  },
  "foo/bar/bbb.grd": {
    "includes": [1180],
  },
  "foo/bar/ccc.grd": {  # NEW [comment for illustration only]
    "includes": [1185],
    "messages": [1190],
  },
  "foo/bar/ddd.grd": {
    "includes": [1200],
  },

Done!

Special case: Crowded fake start IDs

If fake start IDs get too crowded, then (from this directory) run

python3 ../grit/grit.py update_resource_ids -i resource_ids.spec --fake > temp
mv temp resource_ids.spec

before (and/or after) your edit to make room. After this, your CLs will have many diffs in fake start IDs, but that's okay since relative orderings are preserved.

Special case: Generated .grd files

If your .grd file is generated, then in general it's unavailable (e.g., due to build config) for ID assignment to detect how many IDs to reserve. You will need to provide a bound on resources for each resource type ({includes, messages, structures}) to reserve, specified using "META": {"sizes": {}}. For example, in

  "foo/bar/aaa.grd": {
    "includes": [1140],
    "structures": [1160],
  },
  "foo/bar/bbb.grd": {
    "includes": [1180],
  },
  "<(SHARED_INTERMEDIATE_DIR)/foo/bar/ccc.grd": {  # NEW
    "META": {"sizes": {"includes": [20], "messages": [50],}},
    "includes": [1185],
    "messages": [1190],
  },
  "foo/bar/ddd.grd": {
    "includes": [1200],
  },

we reserved 20 IDs for <include> resources, and 50 for <message> resources. Hopefully you'll only need to do this once, but if in the future the generated resources exceeds the bound, then you'll need to update resource_ids.spec.

Special case: Add at end of split

Suppose you desired location is preceded by repeated values ("splits") and followed by "META": {"join": ##}, e.g.:

  "foo/bar/bbb.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/bbb_testing.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_testing_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/ddd.grd": {
    "META": {"join": 2}
    "includes": [1200],
  },

This means:

{bbb.grd, bbb_helper.grd} is one group.
{bbb_testing.grd, bbb_testing_helper.grd} is another group.
The two groups are mutually exclusive, and IDs can be reused.
ddd.grd is used by everyone; it "joins" the mutually exclusive branches.

If you're adding foo/bar/ccc.grd, and it's unrelated to bbb*.grd then the required update would be:

  "foo/bar/bbb.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/bbb_testing.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_testing_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/ccc.grd": {  # NEW
    "META": {"join": 2}
    "includes": [1190],
  },
  "foo/bar/ddd.grd": {  # "META": {"join": 2} got moved.
    "includes": [1200],
  },

If you're adding a new entry to the start or end of sections, pay attention to potential splits across sections. In particular, note that chrome/ and ios/chrome/ sections are very long splits that finally get joined by the "everything else" section!

Special case: Add new alternative to existing split

Using the same example as above: If you're adding a new alternative, say, foo/bar/bbb_demo.grd then the change should be:

  "foo/bar/bbb.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/bbb_demo.grd": {  # NEW, with 1160 to specify alternative.
    "includes": [1160],
  },
  "foo/bar/bbb_testing.grd": {
    "includes": [1160],
  },
  "foo/bar/bbb_testing_helper.grd": {
    "includes": [1180],
  },
  "foo/bar/ddd.grd": {
    "META": {"join": 3}  # 2 became 3.
    "includes": [1200],
  },

Special case: Creating split

Starting from the simple case again:

  "foo/bar/aaa.grd": {
    "includes": [1140],
    "structures": [1160],
  },
  "foo/bar/bbb.grd": {
    "includes": [1180],
  },
  "foo/bar/ddd.grd": {
    "includes": [1200],
  },

If you want to add mutually-exclusive foo/bar/ccc.grd and foo/bar/ccc_testing.grd, then you'd create a split by repeating a fake start ID, then update the succeeding entry to join these splits:

  "foo/bar/aaa.grd": {
    "includes": [1140],
    "structures": [1160],
  },
  "foo/bar/bbb.grd": {
    "includes": [1180],
  },
  "foo/bar/ccc.grd": {  # NEW
    "includes": [1190],
  },
  "foo/bar/ccc_testing.grd": {  # NEW, reusing number to indicate split.
    "includes": [1190],
  },
  "foo/bar/ddd.grd": {
    "META": {"join": 2},  # Add join.
    "includes": [1200],
  },

Details on resource_ids.spec

For simplicity, let's disregard that .grd files can have different types of resources ({includes, messages, structures}). Henceforth, each x.grd specifies count(x.grd) to specify the number of globally distinct resource IDs (integers from 0 to 32767) it needs.

Automatic resource ID assignment computes for each x.grd, a start ID start(x.grd), which indicates that resource IDs start(x.grd), start(x.grd) + 1, ..., start(x.grd) + count(x.grd) - 1 are reserved for x.grd.

Complication arise from the following:

Start IDs among .grd files must follow a preassigned order for determinism (i.e., no "first-come, first-serve" assignment).
Generated .grd files' count() are in general not known a priori. By contrast, static .grd files in the source can have count() determined by parsing.
Depending on build config, some .grd files may be excluded. To avoid wasteful large gaps in resource ID space, mutually exclusive .grd files can share resource IDs.

The (former) resource_ids file was introduced to coordinate resource ID assignment. The file lists every (static and generated) .grd file across all build configs, and specifies start IDs for lookup. This allows resource targets to be built independently.

However, resource_ids required frequent curation. Moreover, adding new resources to a .grd entry (even with extra IDs reserved for buffering) can "bump" start IDs of adjacent entries. This process can cascade downward, leading to multiple updates. The workflow was tedious and error-prone.

Automatic start ID assignment was later introduced. It utilizes the following:

For static .grd files, count() can be easily parsed.
For generated .grd files, a bound on count() can be specified as extra data, stored in new field ("META": {"size": {...}}).
The order of elements in resource_ids along with preassigned start IDs encode some implicit "ordering structure" that specifies:
- Ordering of start IDs among .grd files.
- Mutual exclusion among groups of .grd files.

resource_ids is still used, but it's now generated from resource_id.spec, which is a new "specification" file that's largely similar to resource_ids, featuring the following:

.grd entries and order are kept.
{"META": {"size": {...}} is used to specify size bounds of generated .grd files.
Start IDs are replaced with "fake start IDs" to capture "ordering structure".

It remains to clarify what "ordering structure" is, and how to represent it with fake start IDs.

Suppose that all .grd files are used, then start ID generation is straightforward: Visit .grd files sequentially, then evaluate start() as cumulative sums of count(). We call this a "chain", which conceptually resembles some "series" connection looking like A <-- B <-- C.

In reality, multiple chains exist and .grd files in different chains can be mutually exclusive, allowing ID reuse. Conceptually these resemble "parallel" connections looking like:

A <-- B1 <-- C1 <-- D
|                   |
+ <-- B2 <-- C2 <-- +

Here, A is a split and D is a join. The vertical bars | and + merely stretches vertices downward.

So we're getting into graph theory territory. It turns out that a suitable representation is subtle. To avoid redundant and confusing artifacts, we need to be more precise:

Each X.grd file is represented by a directed edge weighed by count(X.grd), embodying a constraint (more on this later).
Vertices are variables representing potential start() ID values, possibly shared among multiple .grd files owing to mutual exclusivity.
Define sentinel vertex S as the minimal resource ID available. Abstractly We can see it as 0, though in practice 400 is used.
Define sentinel vertex 'T = 32768' as the (invalid) terminal resource ID.

Consider the following graph generation rules:

Start from S <-- T. This depicts a S.grd file with start(S.grd) = S, and that is (trivially) constrained by start(S.grd) + count(S.grd) <= T.
Series op: Transform A <-- C into A <-- B <-- C. Meaning: Initially A.grd (old edge) has start(A.grd) = A assigned. C can be T, or the start ID of some other .grd file(s). IDs are constrained by A + count(A.grd) <= C. The transformation adds B.grd (new edge) with start ID start(B.grd) = B. The old constraint (edge) is replaced by 2 new constraints A + count(A.grd) <= B and B + count(B.grd) <= C.
Parallel op: Transform A <-- B into A <== B, i.e., create a double edge. Meaning: Initially A1.grd (old edge) has start(A1.grd) = A assigned. B can be T, or the start ID of some other .grd file(s). IDs are constrained by A + count(A1.grd) <= B. The transformation adds A2.grd (new edge) that's mutually exclusive with A1.grd, and can therefore share the same start value A, i.e., start(A2.grd) = A. The old constraint remains, and we're adding a new one A + count(A2.grd) <= B. The combined effect of the constraints is A + max(count(A1.grd), count(A2.grd)) <= B.

Composing rule 1 with arbitrary copies of rule {2, 3] (each targeting an existing edge, while adding a new one from a new .grd file) results in a series-parallel graph with one source T, one sink S, and no cycles. This graph exactly represents "ordering structure". Computationally it represents assigning each .grd file's start ID to some to a variables (vertices), subject to constraints (edges weighted by count()).

Given the above graph, solving the assignment problem is easy (omitted). The challenge is to represent the series-parallel graph in resource_ids.spec. Some possibilities:

Explicitly define vertices, then list .grd files as edges (each with 2 vertices): It turns out this has "too much freedom", and can lead to errors (e.g., cycles), requiring validation. Also, how do we name vertices?
Use bracketing: Series op is associative; parallel op is associative and commutative (over isomorphism). Therefore we can represent series op with a list (order important) and parallel op as a set (order unimportant). The graph becomes an alternating-nested list and set (noting that list of lists and set of sets can be flattened)! For example, [a,{b,[c,{d,e},f]},{g,h,i},j] would represent (using # for vertices):

S <-(a)- # <----------(b)---------- # <-(g)- # <-(j)- T
         |                          |        |
         + <-(c)- # <-(d)- # <-(f)- + <-(h)- +
                  |        |        |        |
                  + <-(e)- +        + <-(i)- +

We might do this in the future -- but this would be a drastic change.

Use fake start IDs: This is the choice taken, since it mostly maintains the old "API" of updating numbers in a file.

To encode a series-parallel graph as a sequence of fake start IDs, first we note that chains formed by repeated series ops can be represented by increasing fake start IDs. For example: [0, 1, 2, 3, 4] matches:

S <-0-- # <-1-- # <-2-- # <-3-- # <-4-- T

Note: The numbers are not to be confused with count() as edge weights.

Next, repeated or decreasing fake start IDs can be thought of "rewinding" to the past, which create splits (like time travel) that parallel ops can use. This can also be done in a hierarchical fashion. For example, [1, 2, 4, 2, 3, 4, 5, 6, 4, 7, 7, 2, 3, 4, 3] can create

S <-1-- # <-2-- # <-4--
        |
        + <-2-- # <-3-- # <-4-- # <-5-- # <-6--
        |               |
        |               + <-4-- # <-7--
        |                       |
        |                       + <-7--
        |
        + <-2-- # <-3-- # <-4--
                |
                # <-3--

The tree building algorithm uses a stack: Increasing values are pushed; repeated or decreasing values cause pops until a match is found, to which linkage can be applied.

Therefore fake start IDs can represent a tree. However, parallel op also requires "joins". To do this, we'll need to add "join" to literally "tie up loose ends". Let's consider a smaller example: [1, 2, 2, 3, 4]. The resulting tree is:

S <-1-- # <-2--
        |
        + <-2-- # <-3-- # <-4--

If "join" is placed at 3, we'd get [1, 2, 2, 3*, 4], which represents:

S <-1-- # <-2-- # <-3-- # <-4-- T
        |       |
        + <-2-- +

If "join" is placed at 4, we'd get [1, 2, 2, 3, 4*], which represents:

S <-1-- # <-----2------ # <-4-- T
        |               |
        + <-2-- # <-3-- +

Note that the end sentinel T can be thought to absorb all unjoined branches. Therefore the original tree [1, 2, 2, 3, 4] actually represents:

S <-1-- # <---------2---------- T
        |                       |
        + <-2-- # <-3-- # <-4-- +

Next, joins have a multiplicity, so since more than 2 branches can be joined at the same node. For example, [1, 2, 2, 3, 3, 4*, 5*, 5, 5, 6**] (with * representing a join) results in:

S <-1-- # <---------2---------- # <-5-- # <-6-- T
        |                       |       |
        + <-2-- # <-3-- # <-4-- + <-5-- +
                |       |       |       |
                + <-3-- +       + <-5-- +

In fact, this is just an earlier example we've seen.

Finally, note that the running count of joints, i.e., *, must be less than or equal to the running count of splits.