C# Language Design Meeting for August 16th, 2023

QOTD

"We're done voting so can I finally start making cracks about 2B or not 2B?"
"That is the question!"

Agenda

Ref-safety scope for collection expressions
Experimental attribute

Discussion

Ref-safety scope for collection expressions

https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/LDM-questions-2023-08-15.md#ref-safety

When a collection expression is targeted to a ref struct type, such as Span<T> and ReadOnlySpan<T>, what should be the ref-safety scope - the "lifetime" - of the created value?

Take code like this:

Span<int> s = [x, y, z];

There is a fundamental trade-off between efficiency and usefulness. If we make the scope local then s can't escape, and we can e.g. allocate on the stack. If we make the scope global, then it will be allowed to escape, in which case we need to allocate it on the heap.

If we decide on global scope, there is an option to try to be smart about this, deducing from usage that the value does not in fact escape and still allocating on the stack in those cases: we'd effectively be inferring the scope from usage.

So this leaves us with the following options:

Try to be smart and infer scope from usage
Decide scope from declaration alone A. Make the scope always local B. Make the scope always global

Option 1 seems immediately attractive, as it supposedly minimizes the set of situations where this becomes an issue. However, it really means global scope with potential optimizations. How can a user reason about whether the optimizations kick in? How can they trust that a subtle code change doesn't suddenly lead to heap allocations? In that sense it is almost the worst of both worlds.

An analyzer could help catch when this leads to allocations. It would in practice have to be implemented in the compiler, even if it isn't presented that way, because of the complexity of the analysis.

If we do go down the path of optimizing implicitly, then people will justifiably take a dependency on the performance profile of that, so we would be bound by our behavior here even if we didn't specify it. We could only ever improve, not deteriorate.

For option 2 we need to look at what a developer can do if we make the "wrong" choice for them:

If we pick local and they want global, they would have to explicitly allocate it as an array rather than a span (using an array creation expression new [] { ... } or casting a collection expression to an array type (int[])[ ... ]), then implicitly convert to the span type in question. This leads to less than elegant code, and may be hard to discover the need for.
If we pick global and they want local, in the example of the local variable s above, they could explicitly add scoped to the variable declaration. However, in other scenarios, e.g. when passing the collection expression as a method argument, there is no place to put scoped. They would need to factor it out to a local variable or something to achieve that.

Both are pretty inconvenient. How to choose?

When dealing with span types, it feels like people generally expect optimal behavior, i.e., no allocations. They would like to be told if they can't get that, and make any allocations explicit. This certainly points in the direction of option 2B.

The fact that the scope of a ref struct variable is currently set based on what it's initialized to is surprising to many people who aren't used to spans, but that is something you just have to get used to. If we said collection expressions assigned to ref structs are always local scope, the performance community would just say okay.

It does feel like it is relatively rare that ref struct arguments escape, so 2B would be a good default. However, the syntactic choices for when you need to go against the default are unappetizing. The array creation expression approach also has subtly different type inference from collection expressions! Perhaps we could have a new keyword (global? new?) that you could put in front of a collection expression to force globalness? That could be added later if this turns out to be a bigger problem than we expect.

Conclusion

2B: always local. We'll keep an eye on how inconvenient this choice becomes in practice, and consider better options for switching to global scope in the future if necessary.

Experimental attribute

https://github.com/dotnet/csharplang/blob/2c86608e9326845a94438fd512cb3df54e1170fd/proposals/csharp-12.0/experimental-attribute.md

The intent of the [Experimental] attribute is to signal that an API is experimental, and force clients to explicitly opt in to the risks of that dependency, e.g. the risk of breaking changes.

It's mostly the same as obsolete in behavior, with the difference that when you're in the context of an Experimental attribute it turns off warnings about the use of experimental things inside. Also it can be provided at the assembly/module level to apply to a whole library.

The attribute takes a diagnostic id as a parameter, so that it's not all or nothing: You can turn on or off on a per-experimental-API level. That is expected to typically be done in project files, not using pragmas in code. It would be used equally for BCL features and third party libraries.

A library author that depends on an experimental feature should either shield their consumers from breaking changes in the feature, or in turn advertise the attribute to their users.

Unlike [Obsolete] the expectation is that you would eventually remove [Experimental] from an API. Are there thoughts on how a client can clean up their suppression when that happens? It's probably hard to tool, because of the combinations of versions that would go into it.

As a potential slight hole, if you somehow get your hands on a value of an experimental type without mentioning the type itself, you get to use it without warning. This is especially observable through extension methods, which are often usable without their containing type being explicitly mentioned: If the containing type is marked experimental but not the extension method itself, the consumer will then not get a warning.

What do assembly-level attributes mean?

Should assembly-level use of [Experimental] apply recursively to members and nested types, or just to top-level types?

Options:

apply to both types and members
only apply to types (including nested types)
only apply to top-level types

Conclusion

We prefer option 1: both types and members.

Warnings or errors

At the language level, the diagnostics about use of experimental APIs cannot be errors, since errors cannot be silenced. However, it's desirable that these warnings are promoted to error severity by default, as unintended dependency on an experimental API is quite bad.

For other attributes, such promotion to error can be done in e.g. an editor config file, but because this one trades in an open-ended set of custom diagnostic IDs for each experimental API, it seems infeasible to list them all in a file.

Could we have an overarching diagnostic ID that refers collectively to all experimental features? If so, we can use existing tools to get the desired severity.

Conclusion

Keep the language-level diagnostic a warning and use an overarching diagnostic ID to allow changing the severity of all experimental API use.