Back to Csharplang

Collection literals working group meeting for October 6th, 2022

meetings/working-groups/collection-literals/CL-2022-10-06.md

latest7.5 KB
Original Source

Collection literals working group meeting for October 6th, 2022

This is the first meeting of the collection literals working group. The proposal is currently located at https://github.com/dotnet/csharplang/issues/5354.

Agenda

  • Overview of the current proposal and its goals
  • Discovering an initial consensus on open questions from which to make recommendations
  • Fleshing out a design for dictionary literals

Collection literals discussion

Natural types

There was consensus that there should be a natural type and that it should be List<T>. It’s not uncommon today for there to be more modifications following a collection initializer. The kicking-the-tires scenario was important to us; the experiences of trying both var x = [a]; and subsequently x.Add(...); are things we think will be formative first impressions of the feature, and we don’t want them to feel confusing or broken.

We think that List<T> will only be an additional allocation beyond T[] or ImmutableArray<T>. List<T> can expose an init void Init(T[]) method which takes ownership of the array. As a consequence, the list capacity would be initialized to exactly match the item count.

In a similar vein, we think unknown-length collection literals such as [..enumerable] should work with fixed-length collections such as ImmutableArray<T>. This is a reasonable scenario, and it will make the feature feel confusing or broken if the compiler doesn't allow it. We will look for a codegen strategy to enable this.

Eager or lazy enumeration when the target type is IEnumerable<T>

Collection literals create an instance of the concrete target type, whether that's ImmutableArray<T> or HashSet<T>. If the target type is IEnumerable<T>, we could imagine the expectation that a lazily-enumerating generator will be produced so that [..a, ..b] behaves exactly like a.Concat(b). However, we think it's clear and explicit enough that the collection literal [...] always creates a new collection. This proposal is not about list comprehensions.

Other observations

  1. We think collection literal syntax should be on par with or better than the runtime performance of the current methods of initializing collections. For example, we need to compare the codegen for (ImmutableArray<T>)[..enumerableOfT] to ImmutableArray.CreateRange<T>.

  2. We think [..a] will be a common way to do a shallow clone from one collection type to another.

  3. The repetition in init void Init(...) looks strange. init void Construct(...) was suggested as an alternative name for the compiler to bind to. This would parallel Deconstruct.

  4. If the language added let, we could consider making the natural type of let x = [a]; be ImmutableArray similar to Swift.

  5. The Init(...) method could more naturally be static and return the constructed instance, rather than an instance method, except that it would not be possible to write extension methods to extend a fixed-length collection type so that collection literals can be used with it. This is a concession to the language's current lack of static extension methods.

Dictionary literals discussion

We liked the dictionary literal having the natural type Dictionary<TKey, TValue>, assuming List<T> as the natural type of a collection literal. It should feel parallel, so if the collection literal natural type was ImmutableArray<T>, it would be ImmutableDictionary<TKey, TValue>. We noted that there are not many framework types to choose between, unlike with collection literals.

Syntactic options

We considered the following syntaxes for dictionary literals:

cs
var dict1 = { "key1": "value1", "key2": "value2" };

var dict2 = [ "key1": "value1", "key2": "value2" ];

var dict3 = [ ["key1"] = "value1", ["key2"] = "value2" ];

var dict4 = [ "key1" => "value1", "key2" => "value2" ];

Of the square bracket syntaxes, we agreed on the dict2 style as being minimalist and the most familiar coming from other languages.

  • Curly braces are too overloaded. We don’t this to be visually confusing with block expressions or impede that space.
  • [ ["a"] =... starts out visually similar to a collection literal containing another collection literal.
  • Fat arrows could make it confusing to disambiguate lists of lambdas.

Like with collection literals, we like the idea of making pattern matching resemble construction:

cs
var dict2 = [ "key1": "value1", "key2": "value2" ];

if (dict5 is [ "key1": var value ])
{

There’s an existing community proposal for indexer patterns which would combine indexer access with property patterns: https://github.com/dotnet/csharplang/discussions/4889. This would be more general than dictionaries but would include dictionaries. The community proposal has the syntax { [a]: b } and would enable patterns such as { a.b: c, [d].e: f }, and so on. This would suggest yet another syntax permutation:

cs
var dict5 = { ["key1"] = "value1", ["key2"] = "value2" };

if (dict5 is { ["key1"]: var value })
{

Open question: which way do we resolve our favored construction syntax with indexer pattern matching syntax?

The two existing dictionary initialization syntaxes have different behaviors for duplicate keys. One calls Add, which typically throws if the key already exists: { { a, b }, { a, c } } The other calls the indexer setter, which typically overwrites if the key already exists: { [a] = b, [a] = c }. This raises the question of whether dictionary literals should replace or throw. We leaned towards replacing rather than throwing. The throw behavior could annoyingly block a valid and intended scenario. Also, dictionary spread can be seen as a cousin of with expressions: obj with { prop = a } is a lot like [..dict, prop: a].

Swift has [:] to denote an empty dictionary. It would be rare to find this useful for C# dictionary literals: [] would also denote an empty dictionary when the target type is a dictionary type rather than a collection type, and neither [] nor [:] could be used when there is no target type, e.g. var x = [];. One place where [:] could be useful is to disambiguate between overloads when there is a collection type in one overload and a dictionary type in another overload. We doubt this motivates the syntax; there are other ways to disambiguate.

Dictionary literal implementation strategy

While planning dictionary literals, we need to propose a strategy for how they will initialize dictionaries. Extrapolating from the approach in the collection literals strategy provided a starting point:

cs
// Or ReadOnlySpan<>, just like with collection literals
init void Init(KeyValuePair<TKey, TValue>[] values)

Even without a new Init(...) method, a similar extrapolation could work for mutable dictionaries by calling the Add method.

Like with collection literals, we should make sure the runtime performance of dictionary literals is on par with existing methods of construction.

Future agenda

The next meeting is currently planned for October 14th, 2022.

  • Continuing on dictionary literals
  • Implementation strategies
    • Initializing fixed-length collections from unknown-length collection literals
    • Unresolved question 2 from the proposal:

      Stack allocations for huge collections might blow the stack. Should the compiler have a heuristic for placing this data on the heap? Should the language be unspecified to allow for this flexibility? We should follow what the spec/impl does for params Span<T>.

  • If there is time, exploring other open questions from the proposal