meetings/working-groups/ref-improvements/REF-2022-11-11.md
RFRS: ref field to ref struct
All the discussion on RFRS is tied to how we eventually fully support TypedReference (remove it from restricted types list). The reason is that TypedReference can support storing anything. It can even store a ref TypedReference value which makes it equivalent to a type that has RFRS.
The language does not necessarily need to support this full complexity though. One simplification is that we only limit the arguments passed to __makeref to be those we logically support on ref struct fields today. That would allow ref struct values, or ref to normal struct, but not ref to ref struct. This means IL could construct instances that violate C# lifetime rules but that would be labeled as simply unsupported.
However until we know how RFRS support will occur it's difficult to move forward with TypedReference support for fear we'd end up painting ourselves into a corner.
When discussing ref and the associated lifetime challenges it helps to have an explicit notation syntax to fall back on for discussions. That will be done by using the 'a notation. The letter following ' is the lifetime name and it can be applied to any value or ref in code. By default named lifetimes have no relationship to each other but one can be created by adding lifetime constraints at the method body.
For example here is how the use of scoped is expressed in lifetime notation
Span<int> Read(scoped ref int i) {
...
}
'b Span<int> Read('a ref 'b int i) {
}
In this case scoped becomes the 'a lifetime. That enforces the behavior of scoped because the lifetime for returns is 'b. There is no provided relationship between 'a and 'b hence no conversion and it cannot be returned.
The return only vs. containing method scopes can be explained using the following example:
ref struct RS { }
Span<int> Create(ref RS i) {
...
}
'b Span<int> Create('b ref 'a RS i) {
where 'b <= 'a
}
By establishing a relationship between 'a and 'b it allows for both the value and the ref to be returned from the method. It also prevents cyclic assignment issues because the relationship does not allow for it from a lifetime ('b could be smaller than 'a).
The immediate issue this causes is it creates layers of lifetimes in our types. Consider concretely:
ref struct RS {
ref 'a Span<byte> Field;
}
The value of the field has a lifetime that is independent of the containing instance. Typically when we look at a value there are two lifetimes to consider: the safe to escape and ref safe to escape. In the case of RS though there is also the lifetime of Field.
void E() {
// Only two lifetimes here: the value and the ref
Span<int> span = ...;
// Three lifetimes: value, ref and value of Field
RS rs = ...;
}
These lifetimes can now be arbitrarily deep. So a given struct can have N different lifetimes associated with it that we have to manage. As the types get more fields this gets more complex because the lifetimes aren't necessarily linear either:
ref struct RS {
ref Span<byte> Field1;
ref Span<byte> Field2;
}
There are two different ways we could think about this:
// Option 1: simple but limiting
ref struct RS {
ref 'a Span<byte> Field1;
ref 'a Span<byte> Field2;
}
// Option 2: flexible but now we have a tree
ref struct RS {
ref 'a Span<byte> Field1;
ref 'b Span<byte> Field2;
}
These problems mostly manifest in the method arguments must match rule. This is the rule where we look at all of the ref going into the method and ensure that they can never be cross assigned in a way that would cause an unsafe reference to the stack.
void Swap(ref Span<int> x, ref Span<int> y) {
...
}
void Use(ref Span<int> span) {
Span<int> local = stackalloc int[42];
Swap(ref span, ref local);
}
The MAMM rules are intricate but can be simplified to the following:
Every
refargument to a method be assignable from every otherrefargument in terms of lifetimes.
Before there were just two layers of lifetimes, now there are basically infinite and they all have to match. The problem with ref fields to ref struct is now virtually everything is a MAMM nightmare. Even an instance method on a struct is dealing with MAMM insanity.
The only way for MAMM to work in this scenario is for all the lifetimes involved to be equivalent. That really defeats the purpose of adding them.
The consensus is that to fully support RFRS would likely require us to have full lifetime notations in the language. For example putting the lifetimes on scoped modifier. That would allow for example scoped<a>, scoped<b>, etc ...
At the same time, we don't have the supporting scenarios at this time to justify that work. It's possible that features like params ReadOnlySpan<T> will produce these.
Much of the complexity of RFRS comes from the fields being as flexible as a ref parameter. The ability to ref re-assign, assign new values, etc ... drive the complexity particularly around MAMM. Perhaps if we limited their capabilities we could satisfy key scenarios without jumping off the complexity cliff. For example if we forced them to be readonly ref, ref scoped, etc ... would that help?
The ref scoped is likely the most plausible approach here. The key is whether we can define its meaning in such a way that it remains useful while keeping complexity low. The rules would also need to apply to ref scoped parameters too which increase the scenarios a bit.
The explicit decision to not support RFRS for now should open the door for us exploring ref struct as generic arguments and implementing interfaces. There are existing scenarios that will sufficiently help drive this feature.
Need more due diligence to be certain that we don't close the door on RFRS in the future.
readonly ref, or ref scoped can be defined with sufficient restrictions, etc ... Do any of these both allow the scenarios that matter without forcing off the complexity cliff of full lifetime notations.ref struct generics + interfaces without cutting off our ability to support RFRS at a later time