meetings/2023/LDM-2023-09-18.md
https://github.com/dotnet/csharplang/issues/5354
https://github.com/dotnet/csharplang/issues/7542
Our first topic today was a discussion around whether we can or should call .ctor(int capacity) constructors for collection types, when available. This is a bit different from other
method-based patterns that we've used in the past, as we would need to look for the actual parameter name capacity; our previous examples in this space do not look for parameter names,
or even require the parameter to be named at all. However, we definitely have some examples of where not looking for the name would be a problem for existing collection types.
System.Collections.Concurrent.BlockingCollection<T>, for example, has a single-int constructor that does not do what the user would want here. There's some concern, though, that even
checking for capacity isn't good enough. For example, the BCL has a type called System.Net.CookieContainer with a constructor with a single int parameter capacity. This type isn't
constructible with a collection expression, as it's not IEnumerable and doesn't implement the new collection builder pattern, but it's concerning how close it is to being adversely affected
by this change.
There are a few options we could take to mitigate this. The first is some sort of opt-out attribute, as we've discussed before, 2. We could also take an opt-in approach instead, where type authors need to add some sort of attribute to their types or constructors in order to allow them to participate in this pattern. However, there's a tension with this change around how much we want to change/can change from older-style collection initializers. On the one hand, most of the time that users will hit this optimization, it'll be because they're using a type that hasn't adopted the new builder pattern. For those types, this is more likely to mean that it wouldn't have seen any modifications to work with the new feature, so would an opt-in approach actually help anyone? Potentially, but certainly less than if the optimization was more broadly applied.
Ultimately, we don't think we're at the right point in the release cycle to make a large assumption about capacity at this time. We'll look at special casing some well-known BCL types that
won't be moving to the collection builder pattern, but won't make broad assumptions in the C# 12 timeframe. We'll look at it as part of the collection expression changes we're planning for
C# 13.
We'll optimize well-known BCL types, not all types at this time.
Second, we looked at whether or not the compiler needs to guarantee that intermediate buffers are avoided for cases with spreads where we need to evaluate the spread to get the length. This
could cause the compiler to need a large number of temp variables to ensure that left-to-right semantics are preserved for the expression. We boiled this down to a set of questions about the
principles that we want the language spec to declare, given this example: [a, b, c, .. d, e, .. f]
a, b, and c are evaluated before d is evaluated?
d is enumerated before e or f is evaluated at all?
f unless we fully enumerate d.
We'd need to either have temporary storage for the elements of d, or couldn't presize the collection.f has been evaluated?
StringBuilder.AppendFormatted to change evaluation orders slightly.
We have seen a bit of user confusion around that, but not big. And, importantly, that was changing existing code. This change would not affect existing code, only new code.We unanimously felt that we need to at least guarantee LTR ordering of the expressions themselves. It seems perfectly reasonable that someone might have side-effecting expressions inside a later
expression that would affect an earlier expression if run out of order, especially given our strong legacy around LTR semantics in C#. We're more conflicted on requiring enumeration of the
collection, however; it feels particularly weird that a collection that exposes a Length would have an enumerator that would have a side effect on a later element in the containing collection
expression. We also feel comfortable not specifying exactly when the collection expression constructor occurs within the collection expression; we do have prior art here with interpolated strings,
and this isn't nearly as concerning a case given the explicit step users need to take to move to the new semantics. Given all of this, we think we're fine with requiring that collection expressions
guarantee the ordering of expression evaluation, but not exactly when spreads will be enumerated, or exactly when the constructor will be called. We are also comfortable not requiring the compiler
to avoid all intermediate collections for large numbers of temps; the compiler is free to do what it can determine is best for the scenario. If it heuristically believes that it would be better
to allocate a temp array on the heap, it is free to do so.
Compiler is free to use temp buffers if it believes it should, and is not required to enumerate spreads before evaluating arguments following a spread. It is required to ensure that expressions are otherwise evaluated from left-to-right.