docs/design/managed-config.md
Author: Matt Stark
Status: Pending implementation
The design doc Secure JJ Config introduces a mechanism,
metadata.binpb, through which information about a repository / workspace can
be stored. It also discusses how allowing an external user to have control over
your config is a security risk.
There is a need for a repository to impose requirements upon users. Examples of these things include, but are not limited to:
It should be fairly obvious that there is a strong benefit to doing so. However, controlling a user’s config is sufficient to get root access to their machine, so we require a mechanism more complex than just blindly loading the config file.
This is currently achieved by projects such as chromium by instructing the user
to symlink .jj/repo/config.toml to a file in the repository. This has several
drawbacks:
The managed configuration will be read from $REPO/.config/jj/config.toml. This
is intentionally designed to be very similar to $HOME/.config/jj/config.toml
for the user configuration.
Any data stored here will be added to the metadata.binpb that was created
in secure config (that is, we will add additional fields to
the Metadata struct). This will ensure that we don't suffer from the "zip file
problem".
We will add the following fields to metadata:
enum TrustLevel {
// There's no "optional" in protos. It just returns the zero value.
UNSET = 0;
// The user wishes to completely ignore the managed config file.
IGNORED = 1;
// In trusted mode, we directly read from the managed config file.
// This presents a security risk, so the user is expected to only do this
// for a repo they trust.
TRUSTED = 2;
// In notify mode, the user is expected to manually replicate any changes
// they want from the managed config to the repo config.
NOTIFY = 3;
// In review mode, the user has to review every config individually.
REVIEW = 4;
}
message Metadata {
// previous fields
// The trust level associated with this repo.
TrustLevel trust_level = 2;
// When trust_level is review, the following fields are used.
set<Hash> approved_configs = 3;
set<Hash> rejected_configs = 4;
// This is a can of worms and thus out of scope of the MVP.
// There's no one correct way to choose the last approved.
// Sometimes, the user wants be the last appproved config in the workspace.
// At other times, the user wants be the last appproved config in the repo.
string last_approved_config = 5;
}
Note that although the Metadata struct is stored for both the repo and the
workspace, this will only be used for the repository, not the workspace.
This is because trust is associated with repositories, not workspaces. If I were to add an approved config in one workspace, for example, there is no reason I wouldn't trust it in another.
Everything here only becomes relevant when a repo has a managed config. If it does not have a managed config, we skip everything here.
If a repository has a managed config and unset trust, we first use
ui.can_prompt() to determine whether we are in a TUI.
IGNORE), and that they can run jj config managed --ignore/notify/trust in a terminal to configure this.Nice and easy. We just completely ignore the managed config file.
Nice and easy. We just read the managed config file.
Roughly speaking, if mtime(repo_config_file) > mtime(managed_config), then
the config is up to date.
If it isn't up to date, we print a warning. The user is then expected to update their repo config with any options they want from the managed config, which will stop the warning from printing as the mtime has changed.
This notify option has a rather painful UX when it comes to keeping it in sync (particularly for a user who constantly switches between different branches synced to different versions), but I choose it for several reasons. The first is that the vast majority of users will simply select to blindly trust the repo. The only people who will choose this option are the very security conscious, and this is by far the most secure mechanism. Secondly, it is so much simpler than an approval based mechanism, as we don’t need to worry about things such as workspaces being synced to different places. It provides far fewer edge cases.
In the future we may consider improving on the UX, doing things such as printing
a diff since the last seen config (per last_approved_config), but that isn't
a part of the MVP.
This is, by far, the most difficult option. There is no doubt that this provides value to some users, but the precise details of how this should behave is unclear (and much harder to implement than the other options).
The biggest problem with this is that there is not a clear "correct" set of semantics. If a config is not approved / rejected, we have 3 options:
ui.can_prompt() returning false
ui.can_prompt() does not mean that we should prompt
Consider the following examples
vim .config/jj/config.tomljj git fetch. There has been an update to the managed config.jj rebase -b @ -d trunk(). The managed config is now not
approved, and thus the next jj command will have the config disabled.Depending on what has changed, a variety of things could happen:
trunk() could run
jj rebase -b <revset> -d trunk()
immutable_heads() = ... could run something
like jj rebase -b mutable() -d trunk()
pre-upload hooks (which don't yet exist,
but are being planned) runs jj gerrit upload and bypasses them entirely.Thanks to the magic of jj undo, everything except the upload is recoverable.
What makes them a problem, however, is the fact that a user would see the
message "please approve /reject the config" and not suspect that the command
has failed.
If we use best-effort instead, we are likely to run into a different set of issues.
The first is that it's no longer clear which config is being used. If I:
Now which config is used?
The second issue is that things may be incorrect. If foo.py was renamed to
bar.py and the reference was changed in config.toml, for example, the
config is going to break. This is not as large an issue as disabling config,
IMO, as changes will usually result in an obvious error.
The vast majority of users will not use this option, as very few users are security-conscious, working on a repo they don't trust, and working on a repo with managed config.
Due to:
I propose that the "review" option not be put in the MVP, and and instead be added in later.
These options will also be able to be manually set via jj config managed --ignore/notify/trust.
This is the trickiest part of the proposal. Consider the following workflow:
jj new main@origin
jj ...
jj new stable-branch@origin
There are some edge cases we need to consider:
stable-branch may have existed before the config was addedstable-branch may have a different copy of the managed configThe naive assumption would be that you want to read the config from @, as the
config will always match the version of the code you're using. However, it turns
out that some things want to refer to @, while others want to read the config
from trunk().
Consider several different use cases:
clang-format --foo, but the option --foo was
deprecated in the latest version of clang-format
trunk()$repo/formatter --foo, but the option --foo
was deprecated in the latest version of formatter
@formatter --line-length=80
@trunk()trunk()trunk()@All in all, you can see a general pattern.
@trunk()This is problematic with trunk() because if you add the --reorder-hooks and
then checkout stable-branch it will incorrectly attempt to reorder imports
[fix.tools.rustfmt]
command = ["rustfmt", "--reorder-imports"]
In practice, it is highly unlikely that a formatter config would be written that
way. Far more likely, you would see an entry in config.toml like:
[fix.tools.rustfmt]
command = ["rustfmt"]
.rustfmt.toml:
reorder_imports = true
This simply works out of the box, since the formatter is reading the config from
@'s .rustfmt.toml
As long as the formatter is in-repo, we can just write a wrapper script which does this for us.
[fix.tools.rustfmt]
command = ["rustfmt.py"]
rustfmt.py:
os.execv(["rustfmt", "--reorder-imports"])
If you write a script for which the API keeps changing, eg. you add / remove flags to it, you can do something like this:
[aliases]
upload = ["util", "exec", "--", "bash", "-c", "python3", "-c", "$(jj file show -r 'trunk()' upload.py)"]
@ and trunk() are the only two reasonable candidates as places to read from,
IMO. I personally believe that if only one option is available, trunk() would
be much more appropriate, for the reasons specified above.
However, @pmetzger has pointed out that in a future world where Git isn’t the backend, this decision may come back to bite us (as build tools may be checked in to the build).
For now, we will only support reading from the working copy (@), but in the
future, we may consider supporting other revisions.
To achieve this, there are several methods. I'll list some of them, but they don't need review, as they are out of scope of this proposal.
The simplest that comes to mind is a --managed-config-from-revision flag or
config option.
Alternatively, a more general approach could be to create some kind of include
directive in config.toml files, like so (this has already been discussed and
rejected before, so it may not be viable).
includes = [
{
path = "other_config.toml",
revision = "trunk()",
},
]
The tradeoffs are:
jj sparse --clear --add lib or
because the file matches .gitignore and is not yet tracked.
.config/jj in jj sparseReading from the filesystem seems strictly superior, given the mitigation
strategy available, so we will go ahead with that for now. If we ever decide to
support reading from trunk(), however, we will need to support reading from
the repo.