docs/compilers/CSharp/CodeGen Differences.md
Code generation differences compared to previous compilers.
NOTE - The form and shape of the IL stream as produced by the compiler is not a public contract! The compiler is only responsible for producing IL that is semantically equivalent to the C# source code. Any dependency on the exact form of the output may be broken by new versions or even bug fixes.
What is the purpose of this document?
This document gives a rough list of codegen changes. It may not be comprehensive, since the changes are a moving target and some of them are too subtle or insignificant to mention. This document is always a work in progress.
The list has two sections: Release codegen and Debug codegen. They are separate because the changes are typically different in nature, since they are motivated by different goals. Generally, Release codegen strives to be the most efficient and compact representation of the source's semantics, while Debug codegen values debuggability. When efficiency and debuggability are in conflict, Release and Debug make different choices.
== Release (optimized)
• The async codegen was for the most part redone in Roslyn.
• Iterators - the general principle is the same, but there were some minor refinements in the state machine.
• Lambdas had some minor changes in caching strategy and a change in representation of non-lifting lambdas. For caching, the old compiler generated something like
if (cacheField == null)
{
cacheField = {allocate new lambdaDelegate};
}
return cacheField;
We now generate something close to
return cacheField ?? (cacheField = {allocate new lambdaDelegate});
• The string switch codegen in Roslyn is completely new. Roslyn does not use dictionaries, which avoids allocations and a potentially huge penalty when a string switch is executed for the first time. Instead, Roslyn uses a private function that maps strings to hash codes, followed by a numeric switch. In some sense this is a partial inlining of the former technique that used static dictionaries.
• Array initializers are slightly more compact: in most cases the extra temporary for the array instance is avoided and dup is used instead. (I think VB always did this, but this is new for C#.)
• Leaves from a nested try in many cases no longer cascade through the outer regions and instead leave directly to the outermost region. (VB always did this; C# now does too, since this part of codegen lives in a shared library, and the result is also more compact.) Branch threading in general may handle a few more cases than the old compiler did. The leave-to-leave case is probably the most noticeable.
• Numeric switches –
• Some unnecessary locals can be eliminated or kept on the stack when compiling with /o+. This is not new, the old compiler did this as well, but the implementation of this optimization has changed. There are a few cases where the old compiler was “smarter” since it did this optimization earlier (and caused numerous inconveniences to later stages). Generally the new approach handles more scenarios, so in some cases you may notice more dups and fewer locals.
Cascading optimizations – note that it is possible for one optimization to enable another that may not be by itself new.
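To make the string switch item in the list above more concrete, here is a rough C# sketch of an ordinary string switch next to a hand-written approximation of the "hash, then numeric switch" shape. It is only an illustration. The names StringSwitchSketch, ComputeHash, Classify and ClassifyLowered are hypothetical, and the 32-bit FNV-1a hash is an assumption made for the sketch. The actual helper, and the conditions under which the compiler picks this strategy, are implementation details that may change.

```csharp
static class StringSwitchSketch
{
    // Source form: an ordinary switch over strings.
    public static int Classify(string s)
    {
        switch (s)
        {
            case "a": return 1;
            case "b": return 2;
            default: return 0;
        }
    }

    // Hypothetical stand-in for the private hash helper.
    // Assumption for this sketch: 32-bit FNV-1a over the UTF-16 code units.
    private static uint ComputeHash(string s)
    {
        uint hash = 0;
        if (s != null)
        {
            hash = 2166136261u;
            foreach (char c in s)
            {
                hash = unchecked((c ^ hash) * 16777619u);
            }
        }
        return hash;
    }

    // Approximate lowered form: a numeric switch on the hash, followed by a
    // string comparison to rule out hash collisions. No dictionary is allocated.
    public static int ClassifyLowered(string s)
    {
        switch (ComputeHash(s))
        {
            case 0xE40C292Cu: // FNV-1a hash of "a"
                if (s == "a") return 1;
                break;
            case 0xE70C2DE5u: // FNV-1a hash of "b"
                if (s == "b") return 2;
                break;
        }
        return 0;
    }
}
```

Both methods are meant to agree on every input: a matching string takes its case, and anything else (including null) falls through to the default. The string comparison after the hash match is what keeps the lowered form semantically equivalent to the source when two different strings happen to share a hash.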
== Debug (not optimized)
TBW