Nullability improvements working group meeting for November 1st, 2022

`[MemberNotNullWhenComplete]`

https://github.com/dotnet/csharplang/discussions/5657

Users are requesting the ability to specify postconditions which are only fulfilled after a task-returning method is awaited. We're seeing a relatively high amount of engagement (upvotes and thumbs up) on this compared to many other issues in the nullable space.

A community member observed that there are very few usages of [MemberNotNull] on async methods in open source code, which bodes well for the possibility of breaking the existing behavior if we think it's the right thing for our users. We also think that the existing behavior on async methods is just broken. Currently, when an async method throws an exception, we treat it as fulfilling the postcondition, much like for a non-async method. This is because we think the caller won't actually be able to use whatever variable was expected to be not-null after the call--thus, nothing to warn about.

However, in reality, when an async method throws an exception, it is packaged up into a Task and returned to the caller. It is not thrown until the task is awaited. Because pretty much any code can throw an exception, no async method is able to reliably fulfill nullability postconditions as currently designed. This is similar to an issue we recently fixed in definite assignment.

SharpLab

public class C
{
    string? item;
    
    [MemberNotNull(nameof(item))]
    public async Task MAsync()
    {
        Helper();
        item = "a";
        await Task.Yield(); // no warnings
    }

    public static void Helper() => throw new Exception();
    
    public static void Main()
    {
        C c = new C();
        var t = c.MAsync(); // exception is thrown indirectly through Helper and included in the returned Task
        c.item.ToString(); // no warning, NRE at runtime
    }
}

We think the most viable option to fix the issue here is to change the existing meaning of [MemberNotNull], as well as other postconditions like [NotNull], on Task-returning methods (methods which return await-able things), such that the postconditions are only applied if the call is immediately awaited. The only safe alternative would be to warn if a nullable postcondition attribute is used at all on an async method, which would be unfortunate.

Thus, the new "happy path" usage would look something like the following:

public class C
{
    string? item;
    
    [MemberNotNull(nameof(item))]
    public async Task MAsync()
    {
        await Task.Yield(); // no warning
        item = "a";
    }

    public static void Main()
    {
        C c = new C();
        await c.MAsync();
        c.item.ToString(); // ok
    }
}

The requirement to change behavior at call sites for any "awaitable" method call is because async is treated as an implementation detail, and it's not possible to determine whether a method from metadata is async or not. We'd prefer to have uniformity in usage across both source and metadata.

The requirement to immediately await rather than being able to store the task in a variable and await it later, for example, is because we won't be able to be sure about the ordering of effects in such cases:

var task = c.MAsync();
c.item = null;
await task;
c.item.ToString(); // ok?

Although we're not sure such methods would occur in real code, we also considered the behavior of an API like the following:

namespace System
{
    public static class File
    {
        public static async Task<bool> ExistsAsync([NotNull] string? path); // throws an exception if path is null
    }
}

public class C
{
    public static async Task M(string? path)
    {
        if (await File.ExistsAsync(path))
        {
            path.ToString(); // ok
        }
    }
}

Conclusion: Overall, we are mildly favorable toward this feature request. We think it's possible to design the behavior our users actually want at a reasonable implementation cost--though if further examination reveals that cost is larger than expected, we may not be able to prioritize it. When we can, we'll prepare a full proposal for the behavior we want for the full language design meeting to consider.

Defaultable value types

https://github.com/dotnet/csharplang/discussions/5337

The "hero scenario" for this feature is:

// ImmutableArray
ImmutableArray<int> arrOpt = default;
foreach (var item in arr) // no warning, NRE at runtime
{
    Console.Write(item);
}

It would be nice if the language could understand when non-nullable reference type fields within structs could be null, and whether members of the struct are safe to use as a result.

However, we did decide early on in the nullable reference types design to leave structs as a gap, just like we did with new string[length]--the language provides no assistance in determining whether an array of non-nullable reference types contains all non-null values.

It's particularly unappealing to introduce new syntax and attributes just to support this. We're curious if the situation could be improved without actually adding those--for example, how far could we take things by just tracking flow states for structs which contain non-nullable reference types, and slightly extending the usage/meaning of existing nullability attributes. However, it's not clear if we would reach a complete and usable enough state through such a strategy, especially when considering the space of possible behaviors for properties when accessed on a default receiver--the property could throw an exception, return the default value of the property type, return a non-null "placeholder" value, and so on. It's not clear what set of behaviors would meet the "table stakes" of usability while also not requiring a large amount of conceptual and implementation overhead.

Conclusion: We think it's understood at this point that users must use caution when working with structs, and especially when they expose structs in libraries. This fact, combined with the relatively low level of engagement/enthusiasm on the discussion, leads us to think that this feature would not be a good use of our resources at this time.

LINQ

https://github.com/dotnet/roslyn/issues/37468

We're seeing a fairly high upvote rate on issues tracking LINQ usability with nullable. The scenario where people are experiencing the most pain is:

IEnumerable<Widget> M(IEnumerable<string?> items)
{
    return items
        .Where(i => i != null)
        .Select(i => Widget.Parse(i)); // false warning: converting 'i' to non-nullable
}

class Widget
{
    public static Widget Parse(string item) => throw null!;
}

We found that the equivalent scenario using query syntax appears to work at first glance. However, it is actually suppressing nullable warnings that we would expect. The query variable has oblivious nullability and its state is not tracked across operations in the query. It feels like the compiler should be able to flow nullable state information from the where to the select clause in examples like the following, but we haven't yet had the opportunity to make it work. Ideally, whatever solution we adopt to improve the LINQ experience will have parity between both the call-chaining and query forms.

IEnumerable<Widget> M(IEnumerable<string?> items)
{
    return from i in items
        where i != null
        select Widget.Parse(i); // ok
}

IEnumerable<Widget> M(IEnumerable<string?> items)
{
    return from i in items
        select Widget.Parse(i); // should warn, but doesn't
}

We had a few thoughts and concerns when considering expanding on the support here:

LINQ is extensible. You can define your own operators, and you can also reimplement the LINQ "suite" on non-IEnumerable containers while getting support for the query syntax. We would probably want to ensure that any usage of the "query pattern" would also benefit from any special support we add here. We would have to assume that the implementation adheres to the standard expectations for those operators.

It might be OK to just target the "90% case" versus providing full and rich support over all the standard operators. We also think this use case is valuable enough to introduce new attributes if necessary.

We considered whether it would be good enough to have a special case just for Where(x => x != null), where the return value is somehow inferred to be IEnumerable<NonNullThing> and that type information flows into subsequent usages. It seems unfortunate to "miss" even slightly more complex cases like the following:

IEnumerable<C2> Transform(IEnumerable<C1> items)
{
    return items
        .Where(i => i.field1 != null)
        .Select(i => new C2 { field2 = i.field1 }); // would still warn
}

class C1
{
    public string? field1;
}

class C2
{
    public required string field2;
}

We also think it could be useful to somehow flow the state of collection elements across statements, like when using the Any or All operators:

List<string?> items = ...;
if (items.All(i => i != null))
{
    UseNotNullItems(items);
}

However, much like the immediately awaited Task scenario, it seems difficult to know how to "invalidate" the state of a collection once it is tested in this manner--for example, if someone calls items.Add(null) on a mutable collection after checking .All(). We might be able to reach a worthwhile level of tracking in such scenarios, but it's not clear.

Conclusion: We're interested in improving support here. We will hold a dedicated meeting for this topic in order to more thoroughly explore the problem and decide what level of support we want to provide here.

Follow-up from last meeting

After our last meeting we continued to think about the Task covariance issue, and have concluded that we should update the language documentation to capture the suppress nullable conversion warnings use case for the ! suppression operator. Currently, we think this is the full extent that we want to invest in this specific area.