docs/design/coreclr/profiling/davbr-blog-archive/Tail call JIT conditions.md
This blog post originally appeared on David Broman's blog on 6/20/2007.
Here are the full details I received from Grant Richins and Fei Chen when I asked how the JIT decides whether to employ the tail call optimization. Note that these statements apply to the JITs as they were when Grant and Fei looked through the code base, and are prone to change at whim. You must not take dependencies on this behavior. Use this information for your own personal entertainment only.
First, Grant talked about the 64-bit JITs (one for x64, one for ia64):
For the 64-bit JIT, we tail call whenever we’re allowed to. Here’s what prevents us from tail calling (in no particular order):
We inline the call instead (we never inline recursive calls to the same method, but we will tail call them).
The call/callvirt/calli is followed by something other than nop or ret IL instructions.
The caller or callee returns a value type.
The caller and callee return different types.
The caller is synchronized (MethodImplOptions.Synchronized).
The caller is a shared generic method.
The caller has imperative security (a call to Assert, Demand, Deny, etc.).
The caller has declarative security (custom attributes).
The caller is varargs.
The callee is varargs.
The runtime forbids the JIT to tail call. (There are various reasons the runtime may disallow tail calling, such as caller / callee being in different assemblies, the call going to the application's entrypoint, any conflicts with usage of security features, and other esoteric cases.)
The IL did not have the tail. prefix and we are not optimizing (the profiler and debugger control this).
The IL did not have the tail. prefix and the caller had a localloc instruction (think alloca or dynamic stack allocation).
The caller is getting some GS security cookie checks.
The IL did not have the tail. prefix and a local or parameter has had its address taken (ldarga or ldloca).
The caller is the same as the callee and the runtime disallows inlining.
The callee is invoked via stub dispatch (i.e., via intermediate code that's generated at runtime to optimize certain types of calls).
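As a rough illustration, the checklist above can be modeled as a predicate over a description of the call site. The Python sketch below is a toy model, not the JIT's code: the `CallSite` class, its field names, and `tail_call_blockers` are all hypothetical, and the order of the checks carries no meaning. It only separates the conditions that block a tail call regardless of the tail. prefix from the ones that apply only when the prefix is absent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CallSite:
    """Hypothetical flags, roughly one per condition in the list above."""
    has_tail_prefix: bool = False
    optimizing: bool = True
    inlined: bool = False
    following_opcodes: Tuple[str, ...] = ("ret",)  # IL after the call
    caller_return_type: str = "object"
    callee_return_type: str = "object"
    returns_value_type: bool = False       # caller or callee
    caller_synchronized: bool = False
    caller_shared_generic: bool = False
    caller_has_security: bool = False      # imperative or declarative
    uses_varargs: bool = False             # caller or callee
    runtime_forbids: bool = False
    caller_has_localloc: bool = False
    caller_has_gs_cookie: bool = False
    address_taken: bool = False            # ldarga or ldloca in the caller
    self_call_inlining_disallowed: bool = False
    via_stub_dispatch: bool = False


def tail_call_blockers(site: CallSite) -> List[str]:
    """Return the reasons this toy model would refuse to tail call."""
    bad_follow = any(op not in ("nop", "ret") for op in site.following_opcodes)
    checks = [
        (site.inlined, "call was inlined instead"),
        (bad_follow, "call is followed by something other than nop or ret"),
        (site.returns_value_type, "caller or callee returns a value type"),
        (site.caller_return_type != site.callee_return_type,
         "caller and callee return different types"),
        (site.caller_synchronized, "caller is synchronized"),
        (site.caller_shared_generic, "caller is a shared generic method"),
        (site.caller_has_security, "caller uses security features"),
        (site.uses_varargs, "caller or callee is varargs"),
        (site.runtime_forbids, "runtime forbids the tail call"),
        (site.caller_has_gs_cookie, "caller gets GS cookie checks"),
        (site.self_call_inlining_disallowed,
         "self call and runtime disallows inlining"),
        (site.via_stub_dispatch, "callee is invoked via stub dispatch"),
    ]
    # These three conditions only matter when the IL lacks the tail. prefix.
    if not site.has_tail_prefix:
        checks += [
            (not site.optimizing, "no tail. prefix and not optimizing"),
            (site.caller_has_localloc,
             "no tail. prefix and caller uses localloc"),
            (site.address_taken,
             "no tail. prefix and a local or parameter had its address taken"),
        ]
    return [reason for blocked, reason in checks if blocked]
```

In this model, `tail_call_blockers(CallSite(caller_has_localloc=True))` reports one blocker, while the same call site with `has_tail_prefix=True` reports none, mirroring the three list items that are qualified with "did not have the tail. prefix".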
For x64 we have these additional restrictions:
For ia64 we have this additional restriction:
If all of those conditions are satisfied, we will perform a tail call. Also note that for verifiability, if the code uses a “tail.” prefix, the subsequent call opcode must be immediately followed by a ret opcode (no intermediate nops or prefixes are allowed, although there might be additional prefixes between the “tail.” prefix and the actual call opcode).
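The verifiability rule in the previous paragraph can be sketched as a scan over a method's opcode names. This is a simplified illustration under stated assumptions, not a real IL verifier: opcodes are plain strings with operands omitted, and `tail_calls_are_verifiable` is a hypothetical helper name. The prefix set lists the standard IL prefix opcodes.

```python
from typing import Sequence

# Standard IL prefix opcodes; operands are ignored in this sketch.
PREFIXES = {"tail.", "constrained.", "readonly.", "volatile.", "unaligned.", "no."}
CALLS = {"call", "callvirt", "calli"}


def tail_calls_are_verifiable(opcodes: Sequence[str]) -> bool:
    """Check that every tail.-prefixed call is immediately followed by ret."""
    i, n = 0, len(opcodes)
    while i < n:
        if opcodes[i] != "tail.":
            i += 1
            continue
        # Additional prefixes may sit between tail. and the call opcode.
        j = i + 1
        while j < n and opcodes[j] in PREFIXES:
            j += 1
        if j >= n or opcodes[j] not in CALLS:
            return False
        # No nop or prefix is allowed between the call and the ret.
        if j + 1 >= n or opcodes[j + 1] != "ret":
            return False
        i = j + 2
    return True
```

For example, the sequence `["tail.", "constrained.", "callvirt", "ret"]` passes this check, while `["tail.", "call", "nop", "ret"]` fails it because of the intervening nop. A sequence with no “tail.” prefix passes trivially, since the rule only constrains prefixed calls.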
Fei has this to add about the 32-bit JIT:
I looked at the code briefly and here are the cases I saw where tailcall is disallowed: