docs/_docs/internals/backend.md
The code for the JVM backend is split up by functionality and assembled in
GenBCode.scala. This file defines class GenBCode, the compiler phase.
The workflow is split into CodeGen.scala Scala compilation context aware responsible for emitting bytecode,
and PostProcessor.scala which can be used for parallelized, context agnostic processing. In Scala 2 PostProcessor,
was responsible for performing bytecode optimization, e.g. inlining method calls. In Scala 3 it is only used for writing
Class files and Tasty to disk.
class CodeGen.Impl -[defines]--> PlainClassBuilder
| |
[extends] [extends]
| |
BCodeSyncAndTry ----------------> SyncAndTryBuilder
| |
BCodeBodyBuilder ----------------> PlainBodyBuilder
| |
BCodeSkelBuilder ----------------> PlainSkelBuilder
| / | \
BCodeHelpers ----------------> BCClassGen BCAnnotGen ... (more components)
| \
| \-------------> helper methods
| \------------> JMirrorBuilder, JAndroidBuilder (uses some components, e.g. BCInnerClassGen)
| \-----------> `backendUtils`: utility for bytecode related ops, contains mapping for supported classfile version
|
BCodeIdiomatic ----------------> utilities for code generation, e.g. genPrimitiveArithmetic
\--------------> `bTypes`: maps and fields for common BTypes
\-------------> `int`: synchronized interface between PostProcessor and compiltion ctx
The BTypes.scala class contains the BType class and predefined BTypes
Compiler creates a GenBCode Phase, calls runOn(compilationUnits),
which calls run(context). This:
compilationUnits using same instance of Context:
bTypes, used by CodeGen and PostProcessro, defined in BCodeIdiomatic (BType maps, common BTypes like StringRef)backendInterface: - proxy to Context specific operationscodeGen: CodeGen - uses backendInterface, bTypes, initializes instance of DottyPrimitives and defines JMirrorBuilder instance and implements bytecode generation flow (maps primitive members, like int.+, to bytecode instructions)fontendAccess - synchronized PostProcessor interface to compiler settings, reporting and GenBCode context (e.g. list of entrypoints)postProcessor - compilation context agnostic module dedicated to parallel processing of produced bytecode. Currently used only for writing Tasty and Class files. Defines backendUtils and classfileWritercodeGen.genUnit(ctx.compilation) which returns structure with generated definitions (both Class files and Tasty)postProcessorUpon calling codeGen.genUnit it:
PlainClassBuilder instance for each generated TypeDef and creates ASM ClassNodePostProcessor is later:
ClassNode with collected serializable lambdasThe architecture of GenBCode is the same as in Scalac. It can be partitioned
into weakly coupled components (called "subsystems" below):
Queues mediate between processors, queues don't know what each processor does.
The first queue contains AST trees for compilation units, the second queue contains ASM ClassNodes, and finally the third queue contains byte arrays, ready for serialization to disk.
Currently the queue subsystem is all sequential, but as can be seen in
http://magarciaepfl.github.io/scala/ the above design enables overlapping (a.1)
building of ClassNodes, (a.2) intra-method optimizations, and (a.3)
serialization to disk.
This subsystem is described in detail in GenBCode.scala
The previous bytecode emitter goes to great lengths to reason about bytecode-level types in terms of Symbols.
GenBCode uses BType as a more direct representation. A BType is immutable, and
a value class (once the rest of GenBCode is merged from
http://magarciaepfl.github.io/scala/ ).
Whether value class or not, its API is the same. That API doesn't reach into
the type checker. Instead, each method on a BType answers a question that can
be answered based on the BType itself. Sounds too simple to be good? It's a
good building block, that's what it is.
The internal representation of a BType is based on what the JVM uses: internal
names (e.g. Ljava/lang/String; ) and method descriptors; as defined in the JVM
spec (that's why they aren't documented in GenBCode, just read the JVM 8 spec).
All things BType can be found in BCodeGlue.scala
Bytecode can be emitted one opcode at a time, but there are recurring patterns that call for a simpler API.
For example, when emitting a load-constant, a dedicated instruction exists for emitting load-zero. Similarly, emitting a switch can be done according to one of two strategies.
All these utilities are encapsulated in file BCodeIdiomatic.scala. They know
nothing about the type checker (because, just between us, they don't need to).
So that (c) can remain oblivious to what AST trees contain, some bookkeepers are needed:
To understand how it's built, see:
final def exemplar(csym0: Symbol): Tracked = { ... }
Details in BTypes.scala
In the spirit of BCodeIdiomatic, utilities are added in BCodeHelpers for
emitting:
It's done by PlainClassBuilder(see CodeGen.scala).