`field` keyword working group notes for June 26, 2024

Working group met today at the request of the LDM to look into potential options on how to mesh the field keyword and Nullable Reference Types in a complimentary fashion.

Through our own investigations and discussions with the community we've identified many areas where we would like the field keyword to be available where arguments can be made that the backing field's type should be nullable, even if the property's type is not. A non-exhaustive list of these cases is at: https://gist.github.com/jnm2/df690bc7278371d735c546c57e787b6e

In particular, we would like users to take idiomatic code cases that already use a field+property, and move those over to using field, ideally with no extra effort. In other words, we'd like them to be able to remove their field, and replace references to that field in the property with field without any extra cruft or mess.

There are several approaches we can consider, with varying user experience outcomes depending on the approach and the specific example we are looking at. These working group notes will outline the 5 potential paths we believe have merit, along with some small examinations of interesting outcomes. A followup document will go into these in depth, showing the outcome of each path against every use case we're aware of.

Clarifications

All of the examples deal with a property signature of the form:

T Prop { ...field... } // or
T Prop => ...field...;

Where T is not known to be a value type. For example, a reference type like string, or an unconstrained type parameter. Importantly, none of this document applies to when T is a value type (like int). In that case, the backing field's type is exactly the same as the property type. We do not want the runtime representation of the two member's types to be different.

For all followup examples, we'll just use the T Prop { ...field... } form as T Prop => ...field...; is already equivalent to T Prop { get { return ...field...; } }

Similarly, the examples will often use _field when translating the above form to an actual field+property pair. This does not mean that the field's name is actually _field. It is just a convenient way to indicate it without clutter.

Approach A. Backing field has the same type as the property type.

Approach A1. Ship as is with no new rules/understanding of this case.

With this approach:

T Prop { ...field... }

// Would be the same as
T _field;
T Prop { ..._field... }

This is, by far, the simplest approach we can take. It is also safe to do in C#13, as we can relax this approach in a post launch if we want. This approach is also likely one of the easiest to explain to users, and how users may immediately think about this feature space from the start.

However, there are repercussions for how certain use cases will work here. For example, if you had:

string Prop { get => field ?? ""; private set; } // or
string Prop { get => field ??= ComputeInitialValue(); }

You would get a warning in each constructor that the backing field _field was not initialized through all code paths. This would then require the user to have to suppress the warning, or write something like the following:

string Prop { get => field ??= ComputeInitialValue(); } = null!; // or

// in constructor:
this.Prop = null!;

So, while simple, migrating a trivial, idiomatic case like:

string? _prop;
string Prop { get => _prop ??= ComputeInitialValue(); }

would involve a non-trivial transformation, with potentially annoying suppressions.

Approach A2. Control backing field warnings with attributes.

With this approach:

[field: MaybeNull] string Prop { ...field... }

// Would be the same as
[MaybeNull] string _field;
string Prop { ..._field... }

This has the benefit of now both suppressing any warnings about the field being null in any constructor, while also providing appropriate warnings if field was not used safely. For example:

[field: MaybeNull] string Prop { get => field; } // will appropriately warn
[field: MaybeNull] string Prop { get => field ??= ComputeValue(); } // will appropriately not warn.

Note: the above approach would also allow for the case where the setter wanted to reset the value back to null by using AllowNull. Specifically:

// will appropriately not warn.
[field: MaybeNull, AllowNull] string Prop { get => field ??= ComputeValue(); set => field = value is "" ? null : value.Trim(); }

Like with A1 though, this whole approach comes with the annoying nuisance of having to annotate the field in this fashion to avoid warnings in the constructor. Even if it seems like something the language/compiler could understand.

Approach B. Backing field has the nullable version of the property type.

Approach B1. Use existing rules about initializing an auto-prop from a constructor.

With this approach:

string Prop { ...field... }

// Would be the same as the following for the purpose of analyzing the accessors
string? _field;
string Prop { ..._field... }

However, like with auto-properties today, one would still be required to initialize the property in the constructor, even if not necessary. So:

public C()
{
    // would warn, the auto-prop is not considered initialized
    Console.WriteLine(this.Prop);
}

Whereas:

public C()
{
    this.Prop = "";
    // would no longer warn, the auto-prop is considered initialized
    Console.WriteLine(this.Prop);
}

Note that this analysis is not accurate. Consider a setter that resets the backing field back to null:

string Prop { get; set => field = value is "" ? null : value }

public C()
{
    this.Prop = "";
    // would not warn, even though you could get null. 
    Console.WriteLine(this.Prop);
}

The benefit of this approach is the simplicity. But false negatives might unfortunately emerge for some of these cases.

Approach B2. Perform flow analysis between constructors and accessors.

This approach is the most extensive, and would prove to have the most precise understanding of nullable flow between a constructor and the auto-prop accessor.

The general idea would be perform nullable analysis of:

T Prop { get => ...field...; set => ...field... } = optional_initializer;

public C()
{
    // ... constructor statements ...
}

as if it had been written like so:

public C()
{
    T? _field = default;
    _field = optional_initializer; // if present

    // ... constructor statements ...
    // ... any calls to this.Prop become calls to Prop_get and Prop_set ...

    while (true)
    {
        if (rand())
            Prop_get();
        else
            Prop_set(...non-null-value...)
    }

    T Prop_get() { ...getter body... }
    void Prop_set(T value) { ...setter body... }
}

This looks a bit daunting at first, but comes with a simple general intuition:

The language/compiler already support precise NRT flow analysis of locals and local functions within a method.
The backing field is rethought of as just a nullable local field.
If initialized at the property declaration site, there is an upfront assignment to that backing field solely for the purpose of updating the initial flow state.
The normal constructor statements are analyzed, with any calls to the auto-props in the type now being calls to the local functions corresponding to the accessors. This ensures precise NRT analysis at every call site.
After the constructor statements finish, the property is analyzed again in a fashion that simulates how the getter/setter can be called randomly in any order after the constructor concludes.

With this formalization, precise NRT analysis just "falls out". For example, if the user had:

string Prop { get => field ?? ""; private set; } // or
string Prop { get => field ??= ComputeInitialValue(); }

then they would have to do nothing else (no need for any updating of the constructor). Similarly, the following would be known to be safe without warnings:

string Prop { get => field; set => field = value.Trim(); }

public C()
{
    this.Prop = "";
}

However, this would properly report a warning:

// warning that `field` in the getter might be null.
string Prop { get => field; set => field = value is "" ? null : value.Trim(); }

public C()
{
    this.Prop = "";
}

Everything is precise here because we are performing real NRT analysis across the constructor statements, and for after the constructor finishes up.

The downside here though is complexity of implementation and potential computational costs having to perform this analysis for each constructor in the type. It would also be critical for diagnostic messages to clearly state both where a null-ref issue might exist, and from what constructor or post-constructor path was potentially causing it.

Approach C. Backing field's type may or may not be the nullable version of the property's type.

The final approach we are exploring is one where the backing field is conditionally nullable. The intuition here is to first do an initial analysis pass of the get-accessor starting with the assumption that the backing field is nullable. If the conclusion of that analysis is that no exit points of the accessor return a maybe-null expression, then the backing field will be considered nullable. Otherwise, it will be considered non-nullable, and the existing language rules will take effect.

With this formalization, the following cases work without the need to do anything in the constructor:

string Prop { get => field ?? ""; private set; } // or
string Prop { get => field ??= ComputeInitialValue(); }

With both properties, the initial analysis is done with the backing field being string?. The analysis proves that the get accessor never returns a maybe-null expression. As such, it's "safe" to have a nullable backing field. And because it is safe, we don't need a constructor initializer.

However, if you had:

// this will appropriate warn if the constructor or initializer
// does not assign to the property as the backing field is `string`
// and thus must be assigned before the constructors exit.
string Prop { get => field; set => field = value.Trim(); }

This seems like a very simple, and limited, form of flow analysis that can make many cases work nicely without difficulty.

Next Steps

Compiler to give rough estimates on the above approaches.
Rough specs for these options.
Take every example we have and check the experience of each option against them.

`field` keyword working group notes for June 26, 2024

field keyword working group notes for June 26, 2024

Clarifications

Approach A. Backing field has the same type as the property type.

Approach A1. Ship as is with no new rules/understanding of this case.

Approach A2. Control backing field warnings with attributes.

Approach B. Backing field has the nullable version of the property type.

Approach B1. Use existing rules about initializing an auto-prop from a constructor.

Approach B2. Perform flow analysis between constructors and accessors.

Approach C. Backing field's type may or may not be the nullable version of the property's type.

Next Steps

`field` keyword working group notes for June 26, 2024