proposals/p0911.md
Programs need to be able to select between multiple different paths of execution and multiple different values. In a rich expression language, developers expect to be able to do this within a subexpression of some overall expression.
C-family languages provide a cond ? value1 : value2 operator.
cond and value2 are undelimited, and it's often unclear to developers how much
of the adjacent expressions is part of the conditional expression. For example:
int n = has_thing1 && cond ? has_thing2 : has_thing3 && has_thing4;
// ... is parsed as ...
int n = (has_thing1 && cond) ? has_thing2 : (has_thing3 && has_thing4);
value1 and value2 are parsed with different rules:
cond ? f(), g() : h(), i();
// ... is parsed as ...
(cond ? f(), g() : h()), i();
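Both parsing rules can be checked directly in C++. The following is a minimal sketch for illustration only; the helper names (Parsed, Expected, Run, CheckParsing) and the tracing bodies given to f, g, h and i are assumptions made for this example, not part of the proposal.

```cpp
#include <cassert>
#include <string>

// The first example as C++ actually groups it:
// (has_thing1 && cond) ? has_thing2 : (has_thing3 && has_thing4).
bool Parsed(bool has_thing1, bool cond, bool has_thing2, bool has_thing3,
            bool has_thing4) {
  return has_thing1 && cond ? has_thing2 : has_thing3 && has_thing4;
}

// A grouping that a reader might have expected instead.
bool Expected(bool has_thing1, bool cond, bool has_thing2, bool has_thing3,
              bool has_thing4) {
  return has_thing1 && (cond ? has_thing2 : has_thing3) && has_thing4;
}

// Records which of f, g, h, i ran, in call order.
std::string trace;
int f() { trace += 'f'; return 0; }
int g() { trace += 'g'; return 0; }
int h() { trace += 'h'; return 0; }
int i() { trace += 'i'; return 0; }

// Runs the second example and returns the observed call order.
std::string Run(bool cond) {
  trace.clear();
  cond ? f(), g() : h(), i();
  return trace;
}

void CheckParsing() {
  // The two groupings disagree for this input.
  assert(Parsed(false, true, true, true, true));
  assert(!Expected(false, true, true, true, true));
  // i() runs on both paths, so it is not part of the conditional.
  assert(Run(true) == "fgi");
  assert(Run(false) == "hi");
}
```

Calling CheckParsing() exercises both cases. The comma in value1 stays inside the conditional, while the comma after value2 ends it.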
?: is not customizable. Instead, C++ invented a std::common_type trait that
models what the result of ?: should have been.
Rust allows most statements to be used as expressions, with if statements
being an important case of this: Use(if cond { v1 } else { v2 }).
This has a number of behaviors that would be surprising to developers coming
from C++ and C, such as a final ; in a {...} making a semantic
difference.
The expression semantics leak into the statement semantics. For example, Rust rejects:
fn f() {}
fn g() -> i32 { 0 }
fn main() {
    if true { f() } else { g() };
    return;
}
... because the two arms of the if don't have the same type.
We have already decided that we do not want Carbon to treat statements such as
if as being expressions without some kind of syntactic distinction.
Provide a conditional expression with the syntax:
if cond then value1 else value2
then is a new keyword introduced for this purpose.
This syntax can be chained like if statements:
Print(if guess < value
      then "Too low!"
      else if guess > value
      then "Too high!"
      else "Correct!")
Unlike with if statements, this doesn't require a special rule.
An if expression can be used as a top-level expression, or within parentheses
or a comma-separated list such as a function call. Such expressions have low
precedence, so they cannot be used as the operand of any operator, with the
exception of assignment (if assignment is treated as an operator). They can,
however, appear in other contexts where an arbitrary expression is permitted,
for example as the operand of return, the initializer of a variable, or even
as the condition of another if expression or if statement.
// Error, can't use `if` here.
var v: i32 = 1 * if cond then 2 else 3 + 4;
value2 extends as far to the right as possible:
var v: i32 = if cond then 2 else 3 + 4;
is the same as
var v: i32 = if cond then 2 else (3 + 4);
not
var v: i32 = (if cond then 2 else 3) + 4;
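For comparison, the third operand of the C++ ?: operator also extends as far to the right as possible, so the analogous C++ expression groups the same way. The sketch below uses illustrative function names that do not appear in the proposal.

```cpp
#include <cassert>

// Groups as cond ? 2 : (3 + 4), mirroring the rule for value2.
int Unparenthesized(bool cond) { return cond ? 2 : 3 + 4; }

// The other grouping has to be requested with explicit parentheses.
int Parenthesized(bool cond) { return (cond ? 2 : 3) + 4; }

void CheckGrouping() {
  assert(Unparenthesized(true) == 2);
  assert(Unparenthesized(false) == 7);
  assert(Parenthesized(true) == 6);
  assert(Parenthesized(false) == 7);
}
```

The two functions differ only when cond is true, which is the case where the trailing + 4 is or is not part of the else operand.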
The intent is that an if expression is used to produce a value, not only for
its side-effects. If only the side-effects are desired, an if statement should
be used instead. Because value2 extends as far to the right as possible, if an
if expression appeared at the start of a statement, its value could never be
used:
if cond then value1 else value2;
For this reason and to avoid the need for lookahead or disambiguation, an if
keyword appearing at the start of a statement is always interpreted as beginning
an if statement and never as beginning an if expression.
The if ... then ... else syntax should be easier to format automatically in an
unsurprising way than a ?: syntax because it is clear that the then and else
keywords should be wrapped to the start of a new line when wrapping the
overall conditional expression.
Making the value2 portion as long as possible gives a simple rule that it
seems feasible for every Carbon developer to remember. This rule is expected
to be unsurprising both due to using the same rule for value1 and value2, and
because it means that if consistently behaves like a very low precedence
prefix operator.
Using the if keyword for flow control makes the distinction between flow
control and linear computation clearer.
The formatting of a multi-line if expression is improved by having then and
else keywords of the same length.
We could provide no conditional expression, and instead ask people to use a different mechanism to achieve this functionality. Some options include:
Use an if statement:
var v: Result;
if (cond) {
  v = value1;
} else {
  v = value2;
}
Use(v);
Use(cond.Select(value1, value2));
Use(cond.LazySelect($(value1), $(value2)));
Use an if statement in a lambda:
Use(${ if (cond) { return value1; } else { return value2; } });
The above assumes a placeholder $(...) syntax for a single-expression lambda,
and a ${...} syntax for a lambda with statements as its body.
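The options above differ in whether both operands are evaluated. As a rough C++ analogue, with C++ lambdas standing in for the placeholder $(...) and ${...} syntax, the sketch below counts evaluations. Select, LazySelect, Expensive, ViaLambda and the counter are hypothetical names chosen for this example; they are not Carbon or C++ library APIs.

```cpp
#include <cassert>

int evaluations = 0;

// Stand-in for an operand whose evaluation we want to observe.
int Expensive(int v) {
  ++evaluations;
  return v;
}

// Eager selection: both arguments are evaluated before the call.
template <typename T>
T Select(bool cond, T value1, T value2) {
  if (cond) return value1;
  return value2;
}

// Lazy selection: only the chosen callable is invoked.
template <typename F1, typename F2>
auto LazySelect(bool cond, F1 value1, F2 value2) -> decltype(value1()) {
  if (cond) return value1();
  return value2();
}

// An if statement wrapped in an immediately invoked lambda.
int ViaLambda(bool cond) {
  return [&] {
    if (cond) {
      return Expensive(1);
    } else {
      return Expensive(2);
    }
  }();
}

int EagerEvaluations(bool cond) {
  evaluations = 0;
  Select(cond, Expensive(1), Expensive(2));
  return evaluations;
}

int LazyEvaluations(bool cond) {
  evaluations = 0;
  LazySelect(cond, [] { return Expensive(1); }, [] { return Expensive(2); });
  return evaluations;
}

void CheckSelection() {
  assert(EagerEvaluations(true) == 2);  // both operands evaluated
  assert(LazyEvaluations(true) == 1);   // only the selected operand
  evaluations = 0;
  assert(ViaLambda(false) == 2);
  assert(evaluations == 1);
}
```

In this sketch, only the lazy and lambda forms leave the unselected operand unevaluated, as a built-in conditional expression would.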
Advantages:
Disadvantages:
We could use the C cond ? value1 : value2 syntax.
Advantages:
Disadvantages:
The : token is already in use in name binding; using it as part of a
conditional expression would be confusing.
The ? token is likely to be desirable for use in optional unwrapping and
error handling.
We could use
if (cond) value1 else value2
instead of
if cond then value1 else value2
Note that we cannot avoid parentheses in this formulation without risking syntactic ambiguities.
Advantages:
Resembles an if statement, albeit one with unbraced operands.
[...] if expressions:
Print(if (guess < value)
        "Too low!"
      else if (guess > value)
        "Too high!"
      else
        "Correct!")
Print(if guess < value
      then "Too low!"
      else if guess > value
      then "Too high!"
      else "Correct!")
Print(if guess < value
        then "Too low!"
      else if guess > value
        then "Too high!"
      else "Correct!")
Disadvantages:
The else would presumably be wrapped onto a line by itself, wasting vertical
space, whereas then and else, when paired, can both comfortably precede their
values on the same line; consider
F(if (cond)
    value1
  else
    value2)
F(if cond
  then value1
  else value2)
Invites confusion between if statements and if expressions by resembling an
if statement but not matching the semantics.
[...] if statements optional.
We could use:
if (cond) then value1 else value2
However, it's not clear that there is value in requiring both parentheses and a
new keyword. It also seems jarring that this so closely resembles an if
statement but adds a then keyword that the if statement lacks.
We could allow an if expression to appear anywhere a parenthesized expression
can appear, and retain the rule that value2 extends as far to the right as
possible.
Advantages:
[...] ?: construct in C++.
[...] 1 + (if cond then 2 else 3).
Disadvantages:
[...] value2 ends in some cases, and violates precedence rules.
[...] if at the start of a statement and ambiguity when parsing value2.
We could allow an if expression to appear anywhere a parenthesized expression
can appear, and parse the value1 and value2 as if they appeared in place of
the if expression:
var n: i32 = 1 + if cond then 2 * 3 else 4 * 5 + 6;
// ... is interpreted as ...
var n: i32 = (1 + (if cond then (2 * 3) else (4 * 5))) + 6;
// Error: expected `else` but found `+ 4`.
var m: i32 = 1 + if cond then 2 * 3 + 4 else 5 + 6;
Advantages:
Disadvantages:
[...] else ends, and discovering this requires looking backwards to before
the if.
[...] if statement for each precedence level. Also, those productions will
result in grammar ambiguities that will need to be resolved.
Suppose we have two types where implicit conversions in both directions are possible:
class A {}
class B {}
impl A as ImplicitAs(B) { ... }
impl B as ImplicitAs(A) { ... }
By default, an expression if cond then {} as A else {} as B would be
ambiguous. If the author of A or B wishes to change this behavior:
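If the common type should be A, then impl A as CommonTypeWith(B) must be
provided specifying the common type is A.
If the common type should be B, then impl B as CommonTypeWith(A) must be
provided specifying the common type is B.
If the common type should be some other type C, then two impls need to be
provided: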
impl A as CommonTypeWith(B) { let Result:! Type = C; }
impl B as CommonTypeWith(A) { let Result:! Type = C; }
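Readers coming from C++ may find an analogy useful here. C++ rejects ?: when each operand type converts implicitly to the other, and a std::common_type specialization for one operand order does not apply to the other order. The sketch below is only an analogy under those C++ rules, not a model of CommonTypeWith; the types A, B, C and the HasConditional helper are hypothetical.

```cpp
#include <type_traits>
#include <utility>

struct B;
struct A {
  A() = default;
  A(const B&) {}  // implicit conversion B -> A
};
struct B {
  B() = default;
  B(const A&) {}  // implicit conversion A -> B
};
struct C {
  C(const A&) {}
  C(const B&) {}
};

// Detects whether `cond ? T : U` is a valid expression.
template <typename T, typename U, typename = void>
struct HasConditional : std::false_type {};
template <typename T, typename U>
struct HasConditional<
    T, U,
    std::void_t<decltype(false ? std::declval<T>() : std::declval<U>())>>
    : std::true_type {};

// One specialization per operand order, both naming C.
namespace std {
template <>
struct common_type<A, B> {
  using type = C;
};
template <>
struct common_type<B, A> {
  using type = C;
};
}  // namespace std

// Conversions in both directions make the conditional ambiguous.
static_assert(!HasConditional<A, B>::value, "A/B conditional is ambiguous");
// With both specializations, the common type is C in either order.
static_assert(std::is_same<std::common_type_t<A, B>, C>::value, "");
static_assert(std::is_same<std::common_type_t<B, A>, C>::value, "");
```

The sketch assumes a C++17 compiler for std::void_t.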
We could change the rules so instead, in any of the above cases, implementing
either A as CommonTypeWith(B) or B as CommonTypeWith(A) would suffice.
Advantages:
Disadvantages:
The blanket impl of CommonTypeWith in terms of ImplicitAs would get this
special treatment, but other blanket impls would not.
[...] ImplicitAs is treated specially here.
The case where two impls are required is a corner case. It's somewhat
uncommon for implicit conversions to be possible in both directions between
two types. In those cases, it's even less common for there to be a clear best
"common type". And even then, most of the time the common type will be one
of the two types being unified.
From a more abstract perspective: the process of finding a common type involves
asking each type to implicitly convert to the destination type that it thinks is
best, and then failing if both sides didn't convert to the same type. If A
implicitly converts to B and the other way around, then both sides of this
process should be overridden in order to get both types to implicitly convert to
C instead.
Carbon doesn't formally have a notion of lvalue or rvalue yet; this notion is expected to be added by #821: Values, variables, pointers, and references. In any case, we certainly intend to distinguish between expressions that represent values and expressions that represent locations where values could appear. We therefore need to decide whether a conditional expression can ever be in the latter category. For example:
var a: String;
var b: String;
var c: bool;
// Valid?
(if c then a else b) = "Hello";
We could permit this, as C++ does. For example, we could say:
If both value1 and value2 are lvalues then
if cond then value1 else value2 is rewritten to
*(if cond then &value1 else &value2) if those pointer types have a common type.
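For reference, this is how the existing C++ behavior looks. The sketch shows that a C++ conditional with two lvalue operands of the same type can be assigned through, and that it names the same object as the pointer-based spelling. The function names are illustrative assumptions, not part of the proposal.

```cpp
#include <cassert>
#include <string>

// C++ permits assignment through a conditional whose operands are both
// lvalues of the same type.
void AssignSelected(bool cond, std::string& a, std::string& b) {
  (cond ? a : b) = "Hello";
}

// The pointer-based spelling, selecting an address and dereferencing it.
void AssignSelectedViaPointers(bool cond, std::string& a, std::string& b) {
  *(cond ? &a : &b) = "Hello";
}

// The conditional designates one of the operands rather than a copy of it.
bool SelectsOriginalObject(bool cond, std::string& a, std::string& b) {
  return &(cond ? a : b) == (cond ? &a : &b);
}

void CheckLvalueConditional() {
  std::string a, b;
  AssignSelected(true, a, b);
  assert(a == "Hello" && b.empty());
  AssignSelectedViaPointers(false, a, b);
  assert(b == "Hello");
  assert(SelectsOriginalObject(true, a, b));
  assert(SelectsOriginalObject(false, a, b));
}
```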
The other reason we might want to consider this alternative is performance. In
C++, this code avoids making a std::string copy:
std::string a;
std::string b;
std::string c;
bool cond;
// ...
bool equal = c == (cond ? a : b);
... by treating the conditional expression as an lvalue of type std::string
rather than as a prvalue. However, in Carbon, following #821, we would expect
that the equivalent of a prvalue of type std::string would not necessarily
imply that a copy is made. Rather, Carbon's equivalent of prvalues would
represent either a set of instructions to initialize a value (as in C++), or the
location of some existing value that we are temporarily "borrowing".
With that in mind:
Advantages:
Disadvantages:
[...] & somewhere anyway; given the choice between an lvalue conditional:
F(&(if cond then a else b));
F(if cond then &a else &b);
[...] if expression -- the constraints would depend not only on operand
types, but also on value category, and may result in a hard to express
constraint such as "either T* and U* have a common type or T and U have a
common type".
This should be revisited if the direction in #821 changes substantially from the assumptions described above.
There are some known issues with the way that the extensibility mechanism works in this proposal. It is hoped that extensions to Carbon's generics mechanism will provide simple ways to resolve these issues. This design should be revisited once those mechanisms are available.
We provide both CommonTypeWith, as an extension point, and CommonType, as a
constraint. It would be preferable to provide only a single name that functions
both as the extension point and as a constraint, but we don't have a good way to
automatically make impls symmetric and avoid impl cycles if we use only one
interface.
Inconsistent CommonType implementations diagnosed late
Example:
class A {}
class B {}
impl A as CommonTypeWith(B) where .Result = A {}
impl B as CommonTypeWith(A) where .Result = B {}
fn F(a: A, b: B) -> auto { return if true then a else b; }
The definition of function F is rejected, because A and B have no
(consistent) common type. It would be preferable to reject the impl
definitions.
impl ordering depends on operand order
Example:
class A(T:! Type) {}
class B(T:! Type) {}
interface Fungible {}
impl A(T:! Type) as Fungible {}
impl B(T:! Type) as Fungible {}
// #1
impl A(T:! Type) as CommonTypeWith(U:! Fungible) where .Result = A(T) {}
// #2
impl B(T:! Type) as CommonTypeWith(A(T)) where .Result = T {}
fn F(a: A(i32), b: B(i32)) -> auto { return if true then a else b; }
Here, reversed #2 is a better match than #1, because it matches both A(?) and
B(?), so #2 should be considered the best-matching impl. However, we never
compare reversed #2 against non-reversed #1. Instead, we look for:
impl A(i32) as SymmetricCommonTypeWith(B(i32)), which selects #1 as being
better than the blanket impl that reverses operand order.
impl B(i32) as SymmetricCommonTypeWith(A(i32)), which selects #2 as being
better than the blanket impl that reverses operand order.
So we decide that the if in F is ambiguous, even though there is a unique
best CommonTypeWith match. If either #1 or #2 is written with the operand
order reversed, then F would be accepted.