proposals/p5434.md
ref parameters, arguments, returns and val returnsbound parametersrefreturned varDeref interfaceref, only pointersref bindings in the fields of classesref introducervar as a top-level statement introducerref as a type qualifierbound would change the default return to valreturn var with compound return formsref parameters allow aliasinglet to mark value returns instead of val=> infers form, not just typeref instead of var or the default. It
will bind to reference argument expressions in the caller and produces a
reference expression in the callee.
ref binding can't be rebound to a different object.addr, and is not restricted to the self parameter.ref binding, like a value binding, can't be used in fields of
classes or structs.self ref parameters are
also marked with ref.ref, val, or var.
These control the category of the call expression invoking the function, and
how the return expression is returned.
bound.ref binding is nocapture and noalias.bound.Reference bindings have come up multiple times:
addr self: Self*;var, see
issue #5250
and
proposal #5164;self and addr self, and would be a
point of friction in local pattern matching in the future; andReference returns have also come up before, particularly to support operators
such as indexing [...] and other functions that should produce a reference
expression. It is desirable, though, that this not introduce new memory unsafety
concerns, due to returning a reference to something with insufficient lifetime.
In addition, we have been interested in adding other return mechanisms that support returning values in registers in cases that our current convention won't.
addr keyword on mutating methods to get a self with a pointer type
was introduced in
proposal #722: "Nominal classes and methods".ref bindings to Carbon, paralleling reference expressions"
supports adding ref bindings to Carbon.noalias attribute
is used to mark a pointer as being aliased in only limited ways to enable
optimization. Also see
LLVM's pointer aliasing rules.nocapture attribute,
which has become
captures(none) and captures(ret: address, provenance),
which is governed by
pointer capture rules.clang::lifetimebound attribute
to mark parameters that may be referenced by the return value, in order to
detect some classes of use-after-free memory-safety bugs.ref bindingsWe introduce a new keyword ref. This may be added to a : binding to mark it
as binding to a reference expression, as in:
fn F(ptr: i32*) {
// A reference binding `x`.
let ref x: i32 = *ptr;
// Use of `x` is a reference expression that
// refers to the same object as `*ptr`.
Assert(&x == ptr);
// Equivalent to `*ptr += 1;`.
x += 1;
}
fn G() {
var y: i32 = 2;
F(&y);
Assert(y == 3);
}
The use of the name (x in the example) of a ref binding forms a durable
reference expression. We ensure that reference expressions formed by way of
reference bindings do not dangle. A ref binding may only bind to a durable
reference expression or an expression that can be converted to one. The bound
durable reference expression must outlive the ref binding.
The address of a ref bound name gives the address of the bound object, so
&x == ptr above. The reference itself does not have an address, and unlike a
pointer can't be rebound to reference a different object.
We remove addr, and use instead use ref for the self parameter when an
object is required. Note that the type will change from Self* to Self in
this case. In the future, we might
re-add addr back if needed.
class C {
// ❌ No longer valid.
fn OldMethod[addr self: Self*]() {
// Previously would dereference `self` in
// the body of the method.
self->x += 3;
}
// ✅ Now valid.
fn NewMethod[ref self: Self]() {
// Now `self` is a reference expression,
// and is not dereferenced.
self.x += 3;
}
// ✅ Other uses are unchanged.
fn Get[self: Self]() -> i32 {
return self.x;
}
var x: i32;
}
Potentially abbreviating the syntax further (to allow ref self as a short form
of ref self: Self) is left as future work.
The ref modifier is allowed on : bindings that are not:
var pattern,class type, orfn AddTwoToRef(ref x: i32) {
x += 1;
let ref y: i32 = x;
y += 1;
}
// Equivalent to:
fn AddTwoToRef(ref x: i32) {
x += 1;
let y_ptr: i32* = &x;
*y_ptr += 1;
}
We add support for ref and var in a
struct pattern when using
the shorthand a: T syntax for .a = a: T:
let {var a: i32, ref b: i32} = ...;
// Now equivalent to:
let {.a = var a: i32, .b = ref b: i32} = ...;
Note: This takes us one step closer to
{ambiguity. Previously we could distinguish between a struct literal/pattern and a non-empty block with only up to two tokens of lookahead (the struct cases start with.or_or identifier followed by:, and the block cases don't). Now we have things like:carbonfn F() -> X { var a: i32 = 0; }... where we're getting incrementally closer to ambiguity. We've got a few more steps before we get there, though, since we don't have an
X{...}expression yet, andvar ...is only allowed in struct patterns rather than struct expressions. So we're still fine, but this is cutting down our options for future syntactic expansion a little.
The ref modifier is forbidden on the bindings in class or struct type
fields.
var outer_size: i32 = 123;
class Invalid {
// ❌ Invalid. We don't currently have runtime `let` bindings in classes,
// or `ref` on `var`s, but the intent is to not have `ref` bindings as fields.
let ref invalid_ref_field: i32 = outer_size;
}
// ❌ Invalid.
var invalid_struct_type_field:
{ref .invalid: i32} = {.invalid = outer_size};
In a function argument list, arguments to non-self ref parameters are also
marked with ref. Continuing the example:
var z: i32 = 3;
AddTwoToRef(ref z);
Assert(z == 5);
// No `ref` though on the `self` argument.
var c: C = {.x = 4};
c.NewMethod();
Assert(c.Get() == 7);
Note: It is important that this restriction is syntactic, not just semantic, because it means that
refis never the first token of a full expression, and so we know without lookahead that arefin a pattern context must be the start of a binding pattern, not the start of an expression pattern.
Normally an argument to a non-ref parameter should not be marked ref, but it
is allowed in a generic context where the parameter may sometimes be ref.
Expression operators will mostly not take ref parameters, with these
exceptions:
&;[...]; and..The statement operators now use ref instead of pointers:
= and +=.++ and -- operators.Even in these cases, the arguments will not be marked with ref at the call
site. (Generally the ref parameter is the self parameter, and so wouldn't be
marked. The exception is BindToRef, but we don't want to mark its argument
with ref.)
As an experiment, we are saying a pointer formed by taking the address of a
ref bound name is LLVM-captures(none) and LLVM-noalias. This means that
while a ref parameter could be passed into a function by address, the
restrictions also allow a "move-in-move-out" approach (once we define the move
operation), assuming it is not marked bound. The intent
here is to leave the door open to a calling convention using registers and less
indirection for small-enough objects.
This means that the following code is invalid:
fn F(ref a: i32, ref b: i32) -> bool;
fn G() -> bool {
var v: i32 = 1;
return F(ref v, ref v);
}
Enforcing this restriction will be part of the memory safety story. Until then, doing this is erroneous behavior. This means that the compiler won't use those LLVM attributes unless the compiler can itself prove that the restrictions hold.
ref and val returnsThe return of a function can optionally be marked ref or val. These control
the category of the call expression invoking the function, and how the return
expression is returned.
var global: i32 = 2;
fn ReturnRef() -> ref i32 {
// ❌ Invalid: return 2;
// ✅ Valid: return a reference expression with
// sufficient lifetime.
return global;
}
// Call `ReturnRef` and use the resulting reference.
ReturnRef() += 3;
Assert(global == 5);
// Result of `ReturnRef` can be bound using a `ref`
// binding.
fn AddFive() {
let ref r: i32 = ReturnRef();
r += 5;
}
AddFive();
Assert(global == 10);
fn ReturnVal() -> val i32 {
return 2;
}
// ReturnVal() is a value expression.
let l: i32 = ReturnVal();
// Returning an initializing expression is the
// default.
fn ReturnDefault() -> i32 {
return 2;
}
// ReturnDefault() is an initializing expression.
var j: i32 = ReturnDefault();
// Use `var` to explicitly specify returning an
// initializing expression.
fn ReturnVar() -> var i32 {
return 2;
}
// `ReturnVar()` is the same as `ReturnDefault()`.
-> ref T is a durable reference expression.
The generated code for that function will return the address of a T
object.-> val T is a value expression. The function
will return the value representation of T. Since values have no address,
the value representation may be returned in registers.-> T is unchanged. It is an
initializing expression, returning in place or by copy depending on the
initializing representation of T. This is the same behavior as -> var T.auto as the return type is unchanged, but now supports an
optional ref, val, or var between the -> and auto. -> auto
continues to return an initializing expression, as does -> var auto.
-> val auto returns a value expression, and -> ref auto returns a
durable reference expression.=> to specify a return continues to return an initializing
expression, as before. See
this relevant alternative considered.A function may have multiple returns, each with their own marker, by using a tuple or struct compound return form.
fn TupleReturn() -> (val bool, ref i32, C) {
return (true, global, {.x = 3});
}
fn StructReturn()
-> {.a: val bool,
.b: ref i32,
.c: C} {
return {.a = true,
.b = global,
.c = {.x = 3}};
}
If the return of a function may reference the storage of one or more parameters
to the function, those parameters must be marked bound. This allows the
compiler to diagnose if the function's return is used after the lifetime of the
bound parameter ends. The semantics of bound are intended to match the
clang::lifetimebound attribute.
fn Member(bound ref c: C) -> ref i32 {
return c.x;
}
// Lifetime of a pointer includes the lifetime
// of what it points to.
fn Deref(bound p: i32*) -> ref i32 {
return *p;
}
fn Both(bound pc: C*) -> ref i32 {
return p->x;
}
fn Invalid1() -> ref i32 {
var x: i32 = 4;
// ❌ Error: returning reference to `x`
// whose lifetime ends when this function
// returns.
return x;
}
fn Invalid2() -> ref i32 {
var c: C = {.x = 1};
// ❌ Error: returning reference bound to `c`
// whose lifetime ends when this function
// returns.
return Member(c)
}
The address of a bound ref parameter is the
LLVM attribute captures(ret: address, provenance) instead of
captures(none).
The intent is that we would encourage using references instead of pointers when possible. Their benefits are related to their limitations, so to get those benefits we should use them when a use is restricted enough to be within those limitations.
Mirroring the tuple and struct pattern forms, we also support tuple and struct return forms.
// `->` begins a "return form"
fn F() -> <return-form>;
// Within any return form, if the first token is
// `val`, `ref`, `var`, `(`, or `{`, it is not
// treated as type expression:
// Value return, with type as specified
fn Val() -> val <type-expr>
// Reference return, with type as specified
fn Ref() -> ref <type-expr>
// Initializing return, with type as specified
fn Var() -> var <type-expr>
// Tuple compound return, with a list of
// return forms.
fn TupleCompound() -> ( <return-form>, ... )
// Tuple return, with a list of type
// expressions. Used if all members of the
// list are type expressions.
fn Tuple() -> ( <type-expr>, ... )
// Struct compound return, with a mapping from
// designators to return forms.
fn StructCompound()
-> { .<id>: <return-form>, ... }
// Struct return, used if all of the members
// are type expressions.
fn Struct() -> { .<id>: <type-expr>, ... }
// Otherwise, implicit `var` means returns
// an initializing expression.
fn Other() -> <type-expr>
Note that in the absence of val, ref, and var keywords, the implicit var
is placed in the outermost position, minimizing the number of primitive forms
returned. So fn F() -> (i32, i32) means fn F() -> var (i32, i32) not
fn F() -> (var i32, var i32). Generally the var is left off if not required,
and so will be rare in return forms, to minimize confusion with val.
fn TupleReturn(...)
-> (val bool, ref i32, C);
// Equivalent to:
// -> (val bool, ref i32, var C);
let (a: bool, ref b: i32, var c: C)
= TupleReturn(...);
fn StructReturn(...)
-> {.a: val bool,
.b: ref i32,
.c: C};
// Equivalent to:
// -> {.a: val bool,
// .b: ref i32,
// .c: var C};
// Binds to the names `x`, `y`, `z`:
let {.a = x: bool,
.b = ref y: i32,
.c = var z: C} = StructReturn(...);
// Binds to the names `a`, `b`, `c`:
let {a: bool,
ref b: i32,
var c: C} = StructReturn(...);
// Above two can be mixed, binding to
// names `a`, `y`, `z`.
let {a: bool,
.b = ref y: i32
.c = var z: C} = StructReturn(...);
Only types are allowed after a -> val, -> ref, or -> var, not a compound
return form. Examples:
// Returns a tuple of type
// `(bool, f32, C, i32)`.
fn OneTupleReturn(...)
-> (bool, f32, C, i32);
// Returns a compound tuple form
fn CompoundReturn(...)
-> (bool, val f32);
// Equivalent to:
// -> (var bool, val f32);
// ❌ Invalid, can't specify `ref` inside
// of `val`.
fn Invalid(...) -> val (bool, ref f32);
The compound return forms may be nested, as in:
fn CompoundInParens(...)
-> ({.a: bool, .b: val f32}, C, ref i32);
// Equivalent to:
// -> ({.a: var bool, .b: val f32}, var C, ref i32);
let ({.a = var x: bool, .b = val y: f32},
var c: C, ref d: i32) = CompoundInParens(...);
// or without renaming:
let ({var a: bool, b: f32},
var c: C, ref d: i32) = CompoundInParens(...);
// Contrast with a compound tuple form containing
// a struct type (not compound):
fn StructInParens(...)
-> ({.a: bool, .b: f32}, C, ref i32);
// Equivalent to:
// -> (var {.a: bool, .b: f32}, var C, ref i32);
fn CompoundInBraces(...)
-> {.a: bool, .b: (val f32, C), .c: ref i32};
// Equivalent to:
-> {.a: var bool, .b: (val f32, var C), .c: ref i32};
let {a: bool,
.b = (x: f32, var y: C),
ref c: i32} = ParensInBraces(...);
// Contrast with a compound struct form containing
// a tuple type (not compound):
fn TupleInBraces(...)
-> {.a: bool, .b: (f32, C), .c: ref i32};
// Equivalent to:
-> {.a: var bool, .b: var (f32, C), .c: ref i32};
This feature is intended to support cases like enumerate that will want to
return a value for the index but a reference to the element of the sequence
being enumerated.
Since a ref binding may only bind to a durable reference expression, it can't
be used to bind the result of a function returning an initializing expression.
However, if the initializing expression is bound to a var, any nested patterns
are reference binding patterns bound to the subobject, following
proposal #5164: "Updates to pattern matching for objects".
For example:
fn F() -> (bool, (C, i32));
let (b: bool, var (c: C, i: i32)) = F();
is equivalent to:
fn F() -> (bool, (C, i32));
let (b: bool, var v: (C, i32)) = F();
let ref c: C = v.0;
let ref i: i32 = v.1;
Note that ref is disallowed inside var since that would be redundant.
Mutation of objects with a non-copy-value representation in an active value binding ("borrowed objects") is erroneous behavior.
This was discussed 2025-05-22 and then made the subject of leads issue #5524.
The Carbon compiler should not optimize on erroneous behavior, ever, unless the compiler literally proves that it does not occur, ever. In which case, we don't need any license to start optimizing on this, as it falls under "as-if".
The fact that undefined behavior ("UB", cppreference, wikipedia) provides "as-if" without a proof is precisely the risk of using UB for any semantics, and why we don't use it here and elsewhere we use erroneous behavior.
bound parametersIt is erroneous behavior to return something that references a local object that
won't live once the function returns, even if it is a parameter marked bound.
Local objects include local variables, temporary objects, and var parameters.
fn Invalid1() -> i32* {
var x: i32 = 4;
// ❌ Invalid.
return &x;
}
fn Invalid2(bound x: i32) -> ref i32 {
var y: i32 = x;
// ❌ Invalid.
return y;
}
// ✅ Valid
fn Valid1(bound p: i32*) -> ref i32 {
return *p;
}
fn Invalid3(bound var x: i32) -> ref i32 {
// ❌ Invalid: lifetime of `var` parameter
// ends when function returns.
return x;
}
class ReturnMember {
// ✅ Valid
fn ValidRef[bound ref self: Self]() -> ref i32 {
return self.m;
}
// ❌ Invalid: can't return reference to value.
fn InvalidVal[bound self: Self]() -> ref i32 {
return self.m;
}
// ❌ Invalid: `var self` lifetime ends.
fn InvalidVar[bound var self: Self]()
-> ref i32 { return self.m; }
var m: i32;
}
class DerefPointerMember {
// ✅ Valid
fn ValidRef[bound ref self: Self]() -> ref i32 {
return *self.pm;
}
// ✅ Valid
fn ValidVal[bound self: Self]() -> ref i32 {
return *self.pm;
}
// ✅ Valid
fn ValidVar[bound var self: Self]()
-> ref i32 { return *self.pm; }
var pm: i32*;
}
Otherwise, bound parameters and global variables are the sources of storage
that can be referenced by a return, but need not be referenced, particularly not
on every code path.
// Result references `r` if `b` is true, and `p`
// otherwise. Valid as long as both are marked `bound`.
fn Conditional(b: bool, bound ref r: C, bound p: C*)
-> ref C {
if (b) {
return r;
} else {
return *p;
}
}
The parameters of functions defined in an interface may also be marked as
bound. The impl of that interface for a type can omit occurrences of bound
from the interface, but cannot add new ones.
interface I {
fn F[bound ref self: Self]
(a: Self, bound b: Self, bound c: Self*)
-> ref Self;
}
impl C1 as I {
// ✅ Valid: matches interface
fn F[bound ref self: Self]
(a: Self, bound b: Self, bound c: Self*)
-> ref Self;
}
impl C2 as I {
// ✅ Valid: proper subset of `bound` params
fn F[ref self: Self]
(a: Self, bound b: Self, c: Self*)
-> ref Self;
}
impl C2 as I {
// ❌ Invalid: `a` is not bound in `I.F`.
fn F[ref self: Self]
(bound a: Self, b: Self, c: Self*)
-> ref Self;
}
Like [[clang::lifetimebound]] in C++, bound does not affect semantics or
calling conventions, just what code is legal. This helps avoid mismatches
between typechecking against the signatures in an interface when the impl
functions are different. Exception: the question of whether bound affects the
lifetime of temporaries is future work.
Note that all combinations of a val/ref/default return can be bound to a
value/ref/var parameter. Examples:
fn RefToVal(bound ref x: C) -> val D { return x.d; }
fn ValToRef(bound y: C) -> ref D { return *y.ptr; }
fn VarToRef(bound var p: i32*) -> ref i32 { return *p; }
fn VarToDefault(bound var p: i32*) -> i32* { return p; }
For full safety, we need each bound parameter to be immutable for the duration
of the lifetime of the returned result. However, the objective for now is only
matching [[clang::lifetimebound]], which has the goal of preventing some
classes of bugs, not full memory safety. We will reconsider this with the memory
safety design.
Clang's lifetimebound attribute also only applies to the immediately pointed
to objects (by pointers or reference parameters, or pointers or reference
subobjects of an aggregate parameter). We suggest a simpler, transitive model
here that is more restrictive but should be compatible. That said, pinning down
the exact and firm semantics of bound, especially in these complex cases, is
deferred to the full memory safety design as well.
refThe address of a ref binding is noalias and either captures(none) or
captures(ret: address, provenance), depending on whether the binding is marked
bound.
noalias means like C restrict; you can't observe mutations through
aliases; mutation through a restricted pointer is not observable through
another pointercaptures(none) means there is no transitive escape: you can pass a
nocapture pointer to another nocapture function, but you can't store to
memory or returncaptures(ret: address, provenance): is like captures(none) but may be
referenced by a return.The combination of noalias and captures(none) semantics are the minimum for
the "move-in-move-out" optimization. But this condition is hard to check, so
safe code will use a stricter criteria. Unsafe code will be required to adhere
to just the noalias restrictions, but will not be checked (except possibly by
a sanitizer at runtime). The details here will be tackled as part of the memory
safety design.
Optimizations will only be performed based on information that is enforced or checked by the compiler, so these attributes won't be passed to LLVM unless their requirements can be established. This avoids introducing undefined behavior, which we particularly don't want to do in situations where C++ doesn't.
The goal of these rules it to nudge us towards function boundaries that don't constructively create aliasing in their API boundary and don't capture pointers unnecessarily.
These restrictions are experimental, and we should keep track of everything we end up needing to do to work around these restrictions so any reconsideration can be properly informed.
We expect this to improve interop and migration by allowing significantly more interface similarity between Carbon and C++. Previously, many things in C++ that used references on interface boundaries would be forced to switch to pointers. This adds ergonomic friction both at a basic level because of the forced change but also a deeper level because it will make it significantly harder to see the parallel usage across the boundary between C++ and Carbon. With reference bindings, the vast majority of this dissonance will be removed.
This does create a migration concern, raised in
open discussion on 2025-05-01,
that the nocapture and noalias modifiers don't match C++ restrictions,
particularly on the this parameter that we are going to require migrate to
ref self. We may have to add back in addr to allow a different pointer type
for those cases.
Future work: Addressing how we model the various kinds of C++ references that Carbon code may need to interact with is something we are actively considering and will be tackled in a future proposal.
Much like value/val and var bindings, ref binding and the new return forms
are are part of the type system, but only through expression categories,
patterns (function parameters and so on), and returns. Specifically, we don't
expect them to be part of the object types in Carbon. Like value bindings, we
retain a great deal of implementation flexibility around layout, and the
specifics of how they are lowered.
This specifically means we will need to incorporate ref bindings into the
Call interface and we will be adding complexity there that will need to be
handled by overloading. The changes to the Call interface is future work, and
overloading, once we add support, will need to carry additional complexity to
handle ref.
returned varThe rule is: returned var may only be used when there is a single atomic
return form, and it is the default var category.
// ✅ Allowed
fn F(...) -> V {
returned var v: V = ...;
// ...
return var;
}
fn F(...) -> {var .a : T} {
// ❌ Invalid: composite form
returned var ret: T = ...;
// ...
}
fn F(...) -> val T {
// ❌ Invalid: value return
returned var ret: T = ...;
// ...
}
We can revisit and expand this later if this does not handle use cases we would like to support.
Deref interfaceTo support customization of the prefix-* dereferencing operator, we introduce
the Deref interface.
interface Deref {
let Result:! type;
fn Op[bound ref self: Self]() -> ref Result;
}
final impl forall [T:! type] T* as Deref {
where Result = T;
fn Op[bound self: Self]() -> ref T
= "builtin.deref";
}
Then *p is rewritten to p.(Deref.Op)(), and p->m is rewritten to
p.(Deref.Op)().m. For example, this might be used by a smart pointer:
class SmartPtr(T:! type) {
fn Make(p: T*) -> Self { return {.ptr = p}; }
impl as Deref {
where Result = T;
fn Op[bound ref self: Self]() -> ref Result {
return *self.ptr;
}
}
private var ptr: T*;
}
Proposal #2274: "Subscript syntax and semantics"
added the interfaces used to support indexing with the subscripting operator
[...]. We change these in the following ways:
addr self parameters are changed to bound ref self, to allow the
result to reference the self object.At method returns by val.Addr methods are renamed Ref and return a reference instead of a
pointer that is automatically dereferenced.This proposal's PR makes those changes to the indexing design.
The member binding interface used for reference expressions from proposal #3720 can now be changed to use references instead of pointers.
Before:
// For a reference expression `x` with type `T`
// and an expression `y` of type `U`, `x.(y)` is
// `*y.((U as BindToRef(T)).Op)(&x)`
interface BindToRef(T:! type) {
extend impl as Bind(T);
fn Op[self: Self](p: T*) -> Result*;
}
After:
// For a reference expression `x` with type `T`
// and an expression `y` of type `U`, `x.(y)` is
// `y.((U as BindToRef(T)).Op)(x)`
interface BindToRef(T:! type) {
extend impl as Bind(T);
fn Op[self: Self](bound ref p: T) -> ref Result;
}
Similarly, the BindToValue interface is changed to use a val/value return,
potentially avoiding a copy of large objects.
Before:
interface BindToValue(T:! type) {
extend Bind(T);
fn Op[self: Self](x: T) -> Result;
}
After:
interface BindToValue(T:! type) {
extend Bind(T);
fn Op[self: Self](bound x: T) -> val Result;
}
A ref return can be used to expose the state of an object in a way that can be
mutated:
class Four {
fn Get[self: Self](i: i32) -> i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
fn GetMut[bound ref self: Self](i: i32) -> ref i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
private var m: array(i32, 4);
}
var x: HasMember = {.m = (0, 2, 4, 6)};
x.GetMut(2) += 1;
fn Check(y: Four) {
Assert(y.Get(2) == 5);
}
Check(x);
Future work: this will in the future often be done with an overloaded method, as in:
class Four {
overload Access {
fn [bound ref self: Self](i: i32) -> ref i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
fn [self: Self](i: i32) -> i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
}
private var m: array(i32, 4);
}
var x: HasMember = {.m = (0, 2, 4, 6)};
x.Access(2) += 1;
fn Check(y: Four) {
Assert(y.Access(2) == 5);
}
Check(x);
This may be a common enough use case that we will want to introduce a dedicated syntax:
class HasMember {
fn Access[bound ref? self: Self](i: i32) -> ref? i32 {
Assert(i >= 0 and i < 4);
return self.m[i];
}
private var m: array(i32, 4);
}
Not a change by this proposal, but note that our existing rules will require the
type in a ref binding to be complete in situations where it would not need to
be if you were using a value binding with a pointer type instead. We may need to
change this in the future to match C++ which treats reference types more like
pointer types for completeness purposes.
After this change, a ref binding to type T will require T to be complete
in the same situations that other bindings to type T require T to be
complete.
Purely as a change in syntax, the way to specify that
the value representation of a class
uses a pointer is changed from writing const Self * to const ref.
For safety, we need bindings and returns that reference storage to only be used while that storage remains valid. When the referenced storage is owned by a temporary, we have a choice to either control the lifetime of the temporary or diagnose when the lifetime of the temporary is insufficient. Deciding on our policy is future work.
Note that in many cases we can explicitly provide storage in a variable instead
of referencing a temporary. For example, using
var x: ... = ReturnsATemporary(); instead of
let ref x: ... = ReturnsATemporary();. This won't apply in all situations,
though, such as temporaries that are reachable transitively through pointers.
ref bindings in lambdasWe have already identified
future work to support reference captures in lambdas as part of proposal #3848.
This might be a reason to support ref bindings as fields of objects, with all
the restrictions that comes with that.
We still need to determine how references and the other return types interact
with effects, like Optional, errors, co-routines, and so on. For example, we
don't want to give up the benefits of being able to directly return a reference
when a function has an error path.
It is unclear if this will mean putting references into the object type system,
but we may be able to handle this with additional types or the ability to
customize return representations. For example, we might have an alternate
version of the Optional type that holds a reference:
class OptionalRef(T:! type) {
fn Make(bound ref r: T) -> Self {
return {.p = &r};
}
fn MakeEmpty() -> Self {
return {.p = Optional(T*).MakeEmpty()};
}
fn HasValue[self: Self]() -> bool {
return p.HasValue();
}
fn Get[bound ref self: Self]() -> ref Result {
Assert(self.HasValue());
return *self.p.Get();
}
private var p: Optional(T*);
}
More precise lifetime tracking will be considered with the memory safety design.
For example, the bound approach does not distinguish different components of a
compound return, or different parts of a parameter object that might have
different lifetimes.
We plan to support references to compile-time state when executing a function at compile time. That will be part of a future proposal.
Call or other interfacesFor now, ref is not represented in the Call interface introduced in
proposal #2875: Functions, function types, and function calls.
This will be tackled together in a future proposal with other aspects of
bindings not represented by the type, such as var and compile-time, along with
being generic across these aspects of bindings.
Having more support for multiple returns from a function opens the question of how to do different things with the different returns. We may want a syntax for saying some of the returns are bound to new names, and some are used in assignments to existing variables. One possibility would be to have some pattern syntax for re-initializing an existing object, as in:
fn F() -> (-> bool, ->T);
fn G() {
var x: T = ...;
Consume(~x);
let (b: bool, init x) = F();
// Continue to use `x`...
}
This was discussed in the #syntax channel on Discord on 05-19-2025.
addrThere are two reasons we might restore addr as an alternative to ref for the
implicit self parameter:
self that we want to disallow for ref
bindings generally but need an escape hatch for migrated C++ code that
relies on these patterns.self parameter to use one of the pointer semantics we create
as part of the upcoming memory safety design, that can't be achieved with
ref.This proposal tries to advance these Carbon goals:
ref parameters instead of
pointers.ref bindings and returns avoid the ceremony of round-tripping through
pointers.bound markings on parameters to allow safety equivalent to Clang's
[[clang::lifetimebound]] when returning a reference.These ideas were discussed in open discussion on:
They were also discussed in the #pointers-and-references channel in Discord starting 2025-05-05, and #syntax on 2025-05-14.
ref, only pointersThe rationale to add ref instead of staying with pointers was discussed on
2025-05-01.
In addition to the motivating problems given in
the "Problem" section, that discussion included some additional
depth to the reasons to add reference bindings.
There is a tension between wanting to have mutating expressions and only having pointers. You need some concept like a reference in order to mutate an object with an expression. The question is how small a box the references are restricted to, and where the line is drawn. C has lvalues, which contain references but are restricted to a quite small box. Reference bindings specifically are about keeping a small box around references while still adding enough expressivity to support our use cases. We have started with a model similar to C, but it fell down when it comes down to composition. Decomposing an expression into pieces loses the tools the expression provided to you. The missing tool for that was reference bindings.
We saw how much we were leaning on value bindings. The asymmetry between having value binding but not reference bindings when have value expressions and reference expressions was creating pressure. For example, when accessing members of an object, we had to escape to pointers in that operator.
One downside of this change is that before indirections were more visible in the code.
Also, this does fundamentally mean that we now have another kind of "pointer", potentially adding complexity to any memory-safety story. However, this ship already sailed to some extent with value bindings. Fundamentally, bindings are allowed to have pointer-like semantics from a lifetime perspective, and so will need to be considered as a pointer-like thing as we build out lifetime safety.
If we removed pointers after adding references, we would need something rebindable for assignable objects. The viable path forward without separate pointers and references is to have something rebindable like pointers but automatically dereferenced like references, which is the approach Rust takes. See this comment on issue #5261.
One of the features of a reference is what it cannot do, and we would have to remove those restrictions to be able to satisfy the pointer use cases with references.
ref bindings in the fields of classesA type with reference binding fields would need a lot of restrictions since reference bindings are not assignable. We did not see enough motivation to put references into objects, given the complexity that it would introduce, so we are keeping references out of types for now. This could change to support lambda reference captures.
This question was discussed on 2025-05-07.
We decided that the marking is not about lifetime, but ability to mutate. A
val may reference an object in a similar way to a ref, restricting
operations on the original object, but we are not going to mark vals since
those restrictions are enforced by the compiler. We thought the ability to
mutate, though, was something important enough to highlight to readers of the
code, even at the expense of extra work for the writer.
Swift inout parameters are
marked at caller with an & before the argument.
Jordon Rose has published
a regret that
they didn't use inout to mark the argument instead of &.
On the other hand, not marking is not known to be a source of bugs.
This is a "try it and see how well it works" sort of decision.
ref introducerFor now, we don't believe let ref to be so common as to need a shorter way to
write, unlike what we do for var. This was considered in
leads issue #5523,
which provided this rationale:
I feel like this would often be used for non-local mutation due to it fundamentally deporting mutable value semantics and instead having reference semantics. Unlike the local mutation, that seems more worthwhile to have incentives around minimizing.
However, this seems the easiest of all to revisit later if we discover that the added verbosity in practice is costing more than is worth any improvements from explicitly flagging mutable reference semantics, or if we find code is reaching for antipatterns due to the incentive.
In addition,
this comment on leads issue #5522
argued that it would be more consistent for ref and val to only apply to
bindings, and not introduce patterns, like let and var.
This was considered in leads issue #5523, which provided this rationale:
While it may be an obvious point of orthogonality, I think it adds choice without sufficient motivation, and even having that choice does add some complexity to the language.
It also seems like we could add this later if there is sufficient demand when we have larger usage experience body to pull from with the rest of the Carbon language. Currently, the affordance that feels more natural to me is what we have.
I think we're happy to see motivating use cases and revisit this. At the moment, we've just not seen motivating use cases -- everything has seemed a bit too contrived.
var as a top-level statement introducerThis was considered in leads issue #5523, which provided this rationale:
Locals are important, frequent, and frequently mutable. I don't think forcing varying locals to go through
let varfor orthogonality aids readability enough to offset the verbosity cost of added keywords on a reasonably common pattern.I'm still currently in the position that locally owned objects being mutable should not be "discouraged" or "disincentivized" by the language. And I think adding artificial incentives to try and avoid needing a mutable local variable would either have no effect beyond verbosity, or if it did have effect, it wouldn't be a net positive effect due to code being written in a less straightforward manner in order to avoid mutation.
To be clear, this is based on intuition and judgement based on my experience, not in any way based on data or specific motivating examples. I can imagine data or evidence or even a new perspective changing my position here, but so far the discussion we've had haven't done that.
ref as a type qualifierThe big concern is that any effect that is represented by a type, like
Optional or Result, will want to compose with reference returns. This could
be done by allowing ref to create an object type that could be used as a
parameter to those, as in Optional(ref T), but
we are trying to avoid going down that path.
We have future work to tackle this problem
specifically.
There was also
a concern that we might need ref types to represent argument lists with tuples,
but tuples already can't represent var or compile-time parameters. We have
other plans for this, instead of trying to stretch tuples to encompass these use
cases.
We also noted that including references in the type system led to a number of inconsistencies in C++, such as no there not being references of references.
bound would change the default return to valWe considered saying that bound would change the default return to use the
->val return convention. This was discussed on
2025-05-01
and
2025-05-08.
The idea is that val is expected to be efficient, so we should encourage using
it, but we can't always use val, since some types have a reference value
representation, but bound alleviates that concern.
Once we realized that bound is relevant for all return conventions, we
reconsidered that approach, since has a number of concerns:
impl of it when removing bound.[[clang::lifetimebound]] don't change
calling conventions, only what code is valid.Going with an approach where less depends on bound makes sense for now, since
we are going to reconsider these issues as part of our upcoming memory safety
work.
We also considered other conventions for returning from functions, in a comment on #5434, on 2025-05-08 and on 2025-05-12, most notably:
->, but always using the "in place"
convention where the caller allocates storage and provides the callee with
its address, the callee initializes the storage at that address, and the
caller is responsible for destroying after the return.bound var parameter, that caller is then responsible for
destroying. A call to this function is reference expression, but with
additional responsibility to destroy.There were also some variations on what the conditions for returning in registers using the default return convention.
We seriously considered "var without storage", but the fact that it couldn't reliably be used to initialize a variable, particularly in the middle of an object, meant it did not seem valuable enough to include.
It seemed more valuable to support the "in place" return convention. That return
form allows you to guarantee knowing the address of the object being
constructed, and was a good match for returned var. However, we realized that
var declarations shouldn't always be associated with in-memory storage, in
particular for types that may be trivially moved. For example, a var parameter
with the C++ type std::unique_ptr<T> should be passed in registers. A function
returning a std::unique_ptr<T> in place would not be as efficient as returning
it by moving it into registers.
return var with compound return formsWe considered various syntax options on 2025-05-12, but none of them seemed good enough to justify inclusion at this time:
fn F(...) -> (ref R, val L, V) {
// No longer a `var` being returned. Ideally these
// shouldn't have to be initialized together.
returned ??? (ref r: R, val l: L, var v: V) = ...?
return var;
}
// We could restrict to one `var` return component,
// but this is a lot of machinery for a small increase
// in expressiveness and applicability.
fn F(...) -> (ref R, val L, V) {
returned var v: V = ...;
let l: L = ...;
return (*r, l, var);
}
fn F(...) -> (ref R, val L, V) {
// These don't have the right category, and ideally
// shouldn't have to be initialized together.
returned var ret: (R, L, V) = ...?
return var;
}
fn F(...) -> (ref R, val L, V) {
returned var (_, _, var v: V) = <what goes here?>;
}
There was another approach we considered for returned var originally:
fn F(...) -> (ref R, val L, v1: V1, var v2: V2) {
// ...
// Must use the same names for the `var` (implicit or explicit) returns
// with bound names.
return (r, l, v1, v2);
}
But this had downsides that still apply:
V1 and V2 to have unformed states. Otherwise, v1 and v2
would need be initialized when they are declared.return var.Our current approach handles our main use case for returned var: factory
functions.
We could support an "only vars" approach in the future if we want:
fn F(...) -> (var V1, var V2, var V3) {
returned (var v1: V1, var v2: V2, var v3: V3) = ...;
// ...
return var;
}
We considered other options for the syntax of compound return forms on
2025-05-13,
#syntax in Discord on 2025-05-14,
and
2025-05-14.
The option of omitting the -> in each component did not distinguish tuples
from tuple return forms sufficiently:
-> (ref i32, var i32)
-> (bool, ref i32)
// Is this a single return of a tuple, or a triple
// return using the default return convention?
-> (i32, i32, i32)
We also considered an approach where compound return forms would start with
->?, but this raised concerns about what the meaning of that syntax would be
and whether we want to expose users to that in cases we might be able to avoid
it.
The original proposed syntax used an arrow -> in each component of a compound
form.
fn TupleReturn(...)
-> (->val bool, ->ref i32, -> C);
fn StructReturn(...)
-> {->val .a: bool,
->ref .b: i32,
-> .c: C};
This avoided ambiguity, but was verbose and visually noisy. An alternative was
suggested in
discussion on 2025-06-13.
This alternative used a default of compound returns for paren (...) and
brace {...} expressions, which you could opt out of by using one of the
three category keywords such as var to introduce a type expression that would
not be considered a compound return.
This had the downside that -> T would be interpreted as -> var T even if T
was a tuple type like (i32, i64). However, textually substituting in
(i32, i64) in for T to get -> (i32, i64) would instead be interpreted as
-> (var i32, var i64).
To overcome this problem, in discussion on 2025-06-16, we switched to the approach from this proposal.
ref parameters allow aliasingIf requiring ref parameters to be noalias ends up being too restrictive, we
could instead have the "move-in-move-out" optimization be done only when the
compiler can prove it safe. One strategy would be to generate an alternate
version of the function that is only used in the cases where the noalias
conditions can be shown to hold statically.
let to mark value returns instead of valThis proposal initially used let instead of val to mark
immutable value returns. However, let is used, in
Carbon and other languages, primarily to bind names. In Carbon, the default
binding is a value binding, but that was not considered a close enough to
connect the let keyword to value semantics.
There was also a concern about reusing let in multiple contexts to mean
different things, and having separate keywords that were only used to mark the
category of the binding was deemed better separation of concerns.
We considered making a parallel change to use init instead of var, but this
had some problems:
init keyword would be particularly expensive, because C++ code
commonly use that word in APIs.This question was considered in leads issue #5522.
=> infers form, not just typeThere was support for the idea that the => return syntax introduced in
proposal #3848
should deduce the form of the return, not just its type. This was discussed in
#lambdas on 2025-05-20.
However, trying to infer the expression category from the category of the
expression after the => runs into the problem of this often requiring the
parameters to be marked bound. A more complicated rule, for example using
whether any parameter is marked bound, could be adopted in the future, if the
simple rule proves inadequate.
Alternatively, the compiler could infer which parameters should be marked
bound in this case. That is something to consider with the memory safety
design.