Nullability improvements working group meeting for November 7th, 2022

Our focus today was on improving nullable analysis of LINQ queries.

Special handling of .Where(x => x != null)

We considered a solution where we specially handle null tests on the collection element itself. For example:

cs
IEnumerable<string> M(IEnumerable<string?> items)
{
    return items.Where(x => x != null); // no warning.
}

We think we could introduce a rule where the type of a Where() invocation expression changes depending on the predicate that is passed in. (Specifically, the type argument of the return type would change its top-level nullability.) However, we are not sure this solution is generally useful enough.

It has also been suggested on multiple occasions to add a .WhereNotNull() LINQ operator to the core library. This suggestion has been rejected by the runtime team. We found a number of drawbacks to this solution, including the fact that query expressions would need to be updated to use this operator, and that query providers would need to be updated to understand the operator.
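
For reference, an operator of this kind is easy to write as a user-defined extension. The following is a minimal sketch, assuming a hypothetical `EnumerableExtensions` class. It is not a core library API, and only a reference-type overload is shown.

```cs
#nullable enable
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Hypothetical user-defined operator; System.Linq does not ship this method.
    // It changes only the top-level nullability of the element type (T? -> T).
    public static IEnumerable<T> WhereNotNull<T>(this IEnumerable<T?> source)
        where T : class
    {
        foreach (var item in source)
        {
            if (item is not null)
            {
                yield return item; // 'item' is known to be non-null here
            }
        }
    }
}
```

With a helper like this, calling `items.WhereNotNull()` on an `IEnumerable<string?>` is typed as `IEnumerable<string>`, but a `where` clause in a query expression still binds to `Where()`, which is one of the drawbacks noted above.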

Field tracking within collection elements

We think scenarios like the following are common and important to handle:

cs
IEnumerable<HasNotNullable> M1(IEnumerable<HasNullable> enumerable)
{
    return enumerable
        .Where(x => x.Item != null)
        .Select(x => new HasNotNullable(x.Item)); // ok
}

record HasNullable(string? Item);
record HasNotNullable(string Item);

Note that this wouldn't be handled by an approach which only updates the top-level nullable state, i.e. by adding a .WhereNotNull() operator to the core library.
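
For context, here is a sketch of what this pattern typically requires today. It assumes the currently shipped compiler reports a possible-null argument warning on `x.Item` inside the `Select()` lambda, so the warning is suppressed with the null-forgiving operator. The `Workaround` class and the `M1Today` name are made up for illustration.

```cs
#nullable enable
using System.Collections.Generic;
using System.Linq;

public record HasNullable(string? Item);
public record HasNotNullable(string Item);

public static class Workaround
{
    // Hypothetical name; same query as M1, written for today's analysis.
    public static IEnumerable<HasNotNullable> M1Today(IEnumerable<HasNullable> enumerable)
    {
        return enumerable
            .Where(x => x.Item != null)
            // The null test above is not visible to this lambda, so '!' is
            // used to suppress the warning on 'x.Item'.
            .Select(x => new HasNotNullable(x.Item!));
    }
}
```

A top-level operator such as `.WhereNotNull()` does not help here, because the elements themselves are never null; only their `Item` property is.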

We want to support a complete and coherent mental model for the analysis of these queries as much as possible. Users should be able to think of the above query as being morally equivalent to the following iterator method:

cs
IEnumerable<HasNotNullable> M2(IEnumerable<HasNullable> enumerable)
{
    foreach (var x in enumerable)
    {
        if (x.Item != null)
        {
            yield return new HasNotNullable(x.Item); // ok
        }
    }
}

We also think that the filtering performed by operators like .Where() should be consumable by custom downstream operators:

cs
enumerable
    .Where(x => x.Item != null)
    // The state of 'x.Item' is still propagated, although it seems like the compiler
    // can't make a hard assumption that this lambda is operating on the collection element.
    .SelectAsArray(x => new HasNotNullable(x.Item));
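
`SelectAsArray` here stands in for any user-defined operator. A minimal sketch of such an operator, assuming a hypothetical `CustomOperators` class, might look like the following. Nothing in the signature tells the compiler that `selector` is invoked on the elements of `source`, which is the difficulty noted in the comment above.

```cs
#nullable enable
using System;
using System.Collections.Generic;

public static class CustomOperators
{
    // Hypothetical custom operator, shown only to illustrate the shape of a
    // downstream consumer; it is not part of System.Linq.
    public static TResult[] SelectAsArray<T, TResult>(
        this IEnumerable<T> source, Func<T, TResult> selector)
    {
        var results = new List<TResult>();
        foreach (var item in source)
        {
            results.Add(selector(item)); // selector runs once per element
        }
        return results.ToArray();
    }
}
```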

We think this can be accomplished by associating a slot with a type argument, and when values of that type are materialized, we would copy the state from that slot in as the initial state of those values. For example:

cs
// after visiting the 'Where()' invocation, we associate the type 'HasNullable' with a slot representing the when-true branch of 'x.Item != null' in the lambda.
enumerable.Where(x => x.Item != null)
    // Parameter 'x' has type 'HasNullable', and 'HasNullable' has an associated slot. We copy this slot as the initial state of the parameter.
    .Select(x => new HasNotNullable(x.Item));

We'd like to avoid thinking of this as a bona fide type system addition, for example one where unspeakable types are inferred and flow through to represent the combination of field nullable states. As much as possible, we want to leverage the analogy of assigning variables whose field states were already modified:

cs
var x = GetHasNullable();
x.Item = "not null";
var y = x;
y.Item.ToString(); // no warning in currently shipped compiler

Currently, we are still at the stage of heavily handwaving questions about the lifetimes of such slots and how to propagate this state only to variables which are actually associated with the tests from filtering lambdas.

cs
enumerable.Where(x => x.Item != null)
    // Should still warn, since we didn't test the return value of 'GetHasNullable()'.
    .Select(x => new HasNotNullable(GetHasNullable().Item));

More consideration is needed here. We'll try to organize our thoughts and explore this approach in more detail when we meet again.

Identifying filtering operators

We considered how to identify when a method should get the "Where() method treatment". For example, operators like First(), Last(), SkipWhile() and TakeWhile() would also want to propagate collection element nullable state based on the lambda argument.

A few basic options we considered for identifying these methods include:

  1. Look specifically for System.Linq.Enumerable.Where(), System.Linq.Enumerable.First(), etc., and don't include user-defined operators or operator "suites" on different collections.
  2. Look for any method which is callable on an enumerable receiver and which matches the naming pattern, i.e. MyExtensions.Where<T>(this MyCollection<T>, Func<T, bool>) would be automatically opted in to the new analysis behavior. This slightly resembles the translation of query expressions.
  3. Introduce an attribute which indicates a method output is being filtered by a predicate.
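
To make option 2 concrete, here is a sketch of a user-defined operator that would be picked up by name and shape alone. `MyCollection<T>`, its members, and the `Demo` class are assumptions made for illustration. The query expression shows the resemblance mentioned above: the `where` clause binds to `MyExtensions.Where` purely by pattern.

```cs
#nullable enable
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerable collection type.
public sealed class MyCollection<T> : IEnumerable<T>
{
    private readonly List<T> _items = new List<T>();
    public void Add(T item) => _items.Add(item);
    public IEnumerator<T> GetEnumerator() => _items.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class MyExtensions
{
    // Matches the 'Where' name and shape, so option 2 would opt it in automatically.
    public static MyCollection<T> Where<T>(this MyCollection<T> source, Func<T, bool> predicate)
    {
        var result = new MyCollection<T>();
        foreach (var item in source)
        {
            if (predicate(item))
            {
                result.Add(item);
            }
        }
        return result;
    }
}

public static class Demo
{
    public static MyCollection<string?> NonNull(MyCollection<string?> source)
    {
        // Translated to source.Where(x => x != null); System.Linq is not imported,
        // so the only candidate is MyExtensions.Where.
        return from x in source where x != null select x;
    }
}
```

Note that the return type is still `MyCollection<string?>` today; under option 2 the new analysis would apply to this method without its author asking for it.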

We sketched out a few possible names and shapes for "filter" attributes here:

cs
public static IEnumerable<T> Where<[NotNullWhenLambdaFlowStateNotNull]T>(this IEnumerable<T> enumerable, Predicate<T> predicate);

public static IEnumerable<T> Where<T>(this IEnumerable<T> enumerable, [Filter] Predicate<T> predicate);
// or [NullFilter]?

[return: Filter(nameof(predicate))]
public static IEnumerable<T> Where<T>(this IEnumerable<T> enumerable, Predicate<T> predicate);

At this point, we think that filtering methods should onboard deliberately to the new analysis behavior, and we want user-defined operators, perhaps ones with names or shapes we didn't anticipate, to be able to participate where possible. Therefore, we are leaning towards introducing a filtering attribute of some kind. When we meet again we will present and consider some more concrete designs for such an attribute.
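
As a rough illustration of the attribute direction, the sketch below declares a placeholder `FilterAttribute` and applies it to a user-defined operator whose name matches no LINQ method. The attribute name, its targets, and the `KeepIf` operator are all assumptions. No shipped compiler recognizes this attribute, so today it compiles but has no effect on nullable analysis.

```cs
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical attribute: the name, targets, and meaning are placeholders.
[AttributeUsage(AttributeTargets.Parameter, AllowMultiple = false)]
public sealed class FilterAttribute : Attribute
{
}

public static class UserOperators
{
    // A user-defined operator with an unanticipated name. Under the attribute
    // approach, marking 'predicate' is what would opt it in to the new analysis.
    public static IEnumerable<T> KeepIf<T>(
        this IEnumerable<T> source, [Filter] Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}
```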