proposals/p7016.md
self syntax and adding static member variablesself in either parameter listself syntax in the deduced parameter list []class modifier for non-instance member variables and functionsstatic for non-instance member functionsstatic for package- and namespace-scope variablesmethod introducerUpdate the syntax for class (and interface/impl) methods to move self into
the parameter parentheses () and make the type in its binding optional
(defaulting to Self). Introduce the static keyword for non-instance member
variables to indicate static storage. Reflects the decision in leads issue
#6931.
The placement of self in the parameter list has been a repeated source of
debate around the syntax design of Carbon -- grouping it with deduced parameters
like types is surprising for some readers and nearly unprecedented in modern
programming languages. As this is an especially pervasive and impactful aspect
of Carbon's syntax given the prevalence of methods, it is important to revisit
the syntax here and make a durable decision on how to approach this part of
Carbon's syntax.
There is also a problem that we don't have a syntax for representing non-instance data members. Especially as we look at more C++ interop and more migrations from Carbon to C++ it is important to have a clear and idiomatic syntax for these constructs.
Key questions this proposal should address include:
self (in [] versus ()).Leads discussed various syntax options for self placement and non-instance
members in issue
#6931.
We propose the following updates to the syntax for class members:
self to the () parameter list: Instance methods declare self
as the first parameter in the explicit parameter list. This aligns methods
with a model where method calls are conceptually sugar for function calls
with an explicit object argument.self type optional: Optionally allow omitting the type of self,
which results in the type being exactly Self. Example:
fn MyMethod(self).static for non-instance data members: Use static var for data
members that are part of the type rather than part of instances of the type.self
parameter declared within a class are considered non-instance member
functions without any extra syntax.Treating methods as functions with an explicit self parameter means they can
be called using standard function call syntax (for example,
MyClass.Method(obj)), not just method call syntax (obj.Method()).
While methods can be invoked as regular functions, method syntax (obj.Method)
continues to form a "bound member function" that adapts obj. This allows the
call to be resolved appropriately even when obj requires conversion to match
the type of self. Similarly, we expect obj.(Class.Method)(args) to follow
the same model. Fundamentally, our goal is that this does not change the
instance binding model.
Member variables are any vars declared within a class context (or in the
future, potentially an interface or impl context).
Instance member variables, or fields, are unchanged in the current design.
This proposal adds support for non-instance member variables, or static member
variables, using the static modifier keyword on the var declaration. Static
member variables are associated with the type itself rather than instances of
the type, and have static storage duration.
class Widget {
// Non-instance member variable.
static var count: i32;
// Non-instance member function.
fn ResetCount() { count = 0; }
}
This does not apply to package- and namespace-scope variables (if we support
them) because static storage duration is the default in those contexts. It is
currently limited to classes, but if we allow variables with static storage
duration in other contexts that don't have a strong default of that storage,
like interfaces and impl declarations, they should use the same syntax.
We propose retiring the terminology "class function". It is confusing when used
in contexts like an interface (which is not a class) or where "class" implies
other semantics. Instead, we consider all functions declared within a class,
interface, or impl to be member functions. Member functions are either
methods (taking a self parameter) or non-instance member functions.
In practice, we most often are referring to member functions broadly, regardless of whether they are methods or not, or referring very specifically to methods. Because referring to the broad set happens much more than referring narrowly to non-instance member functions, we prioritize terminology that builds on a simple broad term. We also don't expect this to compound with other adjectives like "associated" as those will already imply "member", and so would be associated functions for the broad group and associated methods for the common narrow group.
Non-instance member functions don't use any special syntax. They don't take a
self parameter and so are regular functions. For example:
class Point {
var x: i32;
var y: i32;
// No `self` parameter, so it's a regular, non-instance function.
fn Create() -> Self {
return {.x = 0, .y = 0};
}
}
fn F(p: Point) {
var p2: Point = Point.Create();
}
Instance member functions, or methods, declare self as the first parameter
in the () list. Specifying the type of self is optional and defaults to
: Self.
class Point {
var x: i32;
var y: i32;
// Explicit type in the `self` binding.
fn GetX(self: Point) -> i32 { return self.x; }
// Omitted type in the `self` binding (defaults to Point).
fn GetY(self) -> i32 { return self.y; }
// By-reference `self` binding.
fn SetX(ref self, new_x: i32) { self.x = new_x; }
}
Moving self into the parentheses reinforces the conceptual model that method
calls are sugar for function calls with an explicit object argument.
Specifically, obj.Method(args...) is equivalent to
Type.Method(obj, args...).
This adopts a partial-application model which unifies methods and functions.
A function is an "instance method" if and only if its first parameter is named
self. It can always be called directly with the first argument passed to the
self parameter. When Method names a function with a self parameter, the
syntax obj.Method is just syntactic sugar for partially applying the obj to
the self parameter, which can then be called as a function by passing
arguments for the remaining parameters.
For non-instance member functions, the absence of a self parameter is
sufficient to distinguish them from methods.
Because methods are modeled as functions with an explicit self parameter, they
can be called as ordinary functions using their qualified names, with the self
argument as the first element of the explicit parameter list:
fn F(p: Point) {
// Standard method call syntax:
var x1: i32 = p.GetX();
// Called as an ordinary function:
var x2: i32 = Point.GetX(p);
}
This proposal effectively advances Carbon's goals by focusing on:
self to the explicit parameter list within parentheses aligns
Carbon's syntax with essentially all modern programming languages that use
an explicit object parameter, notably including C++23 itself, Python, and
Rust. It also reduces the visual cost of non-generic methods by removing the
need for an additional delimiter kind in their declaration. Making the
: Self part optional further reduces clutter in the most common cases and
matches the syntax used in other languages like Rust.static on storage (member variables) provides a crisp definition
that works unambiguously across classes, interfaces, and impl blocks.self in either parameter listWe considered two approaches to modeling self that separated it from
parameters of any kind, implicit or explicit.
We could place self in an independent position with a different set of
delimiters such as:
```carbon
class Point {
var x: i32;
var y: i32;
// `self` as a separate component, likely in between the implicit and
// explicit parameter lists.
fn Create[T:! type]<self>(x: T) -> Self {
return {.x = x as i32, .y = 0};
}
}
```
We could make self implicit, similar to C++.
We have generally been happy with self being explicit instead of implicit and
so to an extent we didn't deeply consider (2) as that wasn't part of the problem
we set out to solve. We're still comfortable with the rationale about this
aspect of the syntax from p0722.
We did consider (1) but were unable to find a syntax that felt compelling. Taking a new balanced delimiter is an especially difficult and expensive choice for the syntax. And without that, most syntaxes felt verbose for something that we expect to be extremely common and a natural "default".
self syntax in the deduced parameter list []We considered placing self in the deduced parameter list (for example,
fn F[self: Self](...)), as this has been the historical syntax in Carbon.
However, users strongly associate the implicit parameter list in []s with
deduced, compile-time parameters, but the self parameter is a runtime
argument passed explicitly by the caller. While we have syntactic distinctions
such as the ! and the self keyword, the contextual collision still adds some
cognitive load for some readers. We do imagine potential runtime implicit
parameters in the future, but these are expected to be rare enough to be easier
to accommodate and to leverage distinguishing syntax.
In contrast to these challenges, putting self in the explicit parameter list
() aligns with the semantics of self as being an otherwise-normal parameter.
Last but not least, most other languages with explicit object parameters put
them at the start of the parameter list inside ()s (see for example Python and
Rust). Being consistent with how object parameters are modeled in other
languages is expected to reduce confusion and surprise for readers of Carbon.
This is especially nice here, as we're asking C++ programmers to move from
implicit object parameters to explicit object parameters -- aligning with
other languages' explicit object parameter syntax hopefully minimizes that cost.
There was some concern that removing self from the []s would result in
readers assuming that the []s can only contain type parameters, similar to
how Python works. However, that concern didn't carry as much weight as the other
considerations for the leads.
Ultimately, the leads preferred the placement in the ()s and the associated
tradeoffs with that syntax.
class modifier for non-instance member variables and functionsWe considered using class var and class fn to mark non-instance member
variables and functions. However, class is already an introducer in Carbon.
Using it as a modifier adds an overloaded meaning. This meaning would also
diverge from usage in other languages. For example, Swift uses class versus
static to distinguish between class and struct methods, and has complex
virtual dispatch rules that aren't part of the Carbon design. These members also
appear in interfaces and impl blocks where the class keyword would be mildly
surprising.
The decision to use static at all in Carbon is relatively controversial. It
has a history in C++ of being used for a sprawling and seemingly ever-growing
set of things, in addition to the already subtle meanings inherited from C.
We chose static for non-instance member variables because of widely Carbon's
major peer languages (notably C++, Rust, Swift, and Java) use the keyword
static to mark class member variables with static storage duration. Reusing
this spelling avoids a surprising divergence from the languages our users are
most likely to be familiar with, and provides a clear, specific meaning focused
on storage duration. This aligns with common definitions of
static variables focusing on
lifetime contrasted with scope.
A key aspect of the leads being comfortable with this direction was having the storage implication of the keyword be fundamental to any usage of it rather than diluting that meaning with uses in other contexts.
Given that the static keyword did come with all of these concerns, we
considered a number of alternatives and the specific rationale for not selecting
those is recorded here.
sharedWe considered using shared var to mark non-instance member variables. However,
this term is heavily associated with concurrency and thread safety (for example,
shared ownership, shared pointers). Reserving it for class members would collide
with its use for future concurrency features (see
concurrency control design
and
shared memory in CUDA).
The risk of confusion outweighed the benefits of identifying member variables that are conceptually shared across instances.
globalWe considered using global var to mark non-instance member variables. However,
global is most widely used to refer to scope (global scope), whereas these
members are explicitly scoped to a type (or potentially an interface). The most
common distinction is that of "global" implicating scope or access, while
"static" more consistently describes the storage model.
static for non-instance member functionsWe considered applying static to both member variables and functions (for
example, static fn F()), for consistency with C++, Java, C#, and Swift.
However, functions do not have instances or storage in the same way data members
do. The absence of a self parameter is sufficient to distinguish instance
methods from non-instance functions, making the keyword redundant for functions.
Reusing static here would stretch or dilute its meaning in Carbon. Not using
it lets us keep the meaning narrow, and very specific to storage. It also helps
emphasize the underlying unification that we have for member functions but not
for data members: methods are member functions, and can even be called as such
when a viable object is explicitly passed to the self parameter. The only
thing special about methods is that they additionally allow the method call
syntax.
static for package- and namespace-scope variablesWe considered whether to require or allow the static keyword on package- and
namespace-scope variables (if we support them).
The storage of these variables should be described as "static" the same way as it is for static member variables. However, we propose omitting the keyword in those contexts because static storage is the strong default. In contrast, the strong default for class members is instance storage, so a keyword is needed to opt out of that default.
method introducerWe considered using a distinct introducer like method F() instead of
fn F(self). However, we prefer a uniform fn introducer for all functions,
relying on the presence of the self parameter to identify methods. This
provides greater consistency across all different kinds of functions, including
lambdas. It also embraces the unification of methods and functions.