docs/_spec/TODOreference/other-new-features/safe-initialization.md
Scala 3 implements experimental safe initialization check, which can be enabled by the compiler option -Wsafe-init.
The design and implementation of the initialization checker is described in the paper Safe object initialization, abstractly [3].
To get a feel of how it works, we first show several examples below.
Given the following code snippet:
abstract class AbstractFile:
def name: String
val extension: String = name.substring(4)
class RemoteFile(url: String) extends AbstractFile:
val localFile: String = s"${url.##}.tmp" // error: usage of `localFile` before it's initialized
def name: String = localFile
The checker will report:
-- Warning: tests/init/neg/AbstractFile.scala:7:4 ------------------------------
7 | val localFile: String = s"${url.##}.tmp" // error: usage of `localFile` before it's initialized
| ^
| Access non-initialized field value localFile. Calling trace:
| -> val extension: String = name.substring(4) [ AbstractFile.scala:3 ]
| -> def name: String = localFile [ AbstractFile.scala:8 ]
Given the code below:
object Trees:
class ValDef { counter += 1 }
class EmptyValDef extends ValDef
val theEmptyValDef = new EmptyValDef
private var counter = 0 // error
The checker will report:
-- Warning: tests/init/neg/trees.scala:5:14 ------------------------------------
5 | private var counter = 0 // error
| ^
| Access non-initialized field variable counter. Calling trace:
| -> val theEmptyValDef = new EmptyValDef [ trees.scala:4 ]
| -> class EmptyValDef extends ValDef [ trees.scala:3 ]
| -> class ValDef { counter += 1 } [ trees.scala:2 ]
Given the code below:
abstract class Parent:
val f: () => String = () => this.message
def message: String
class Child extends Parent:
val a = f()
val b = "hello" // error
def message: String = b
The checker reports:
-- Warning: tests/init/neg/features-high-order.scala:7:6 -----------------------
7 | val b = "hello" // error
| ^
|Access non-initialized field value b. Calling trace:
| -> val a = f() [ features-high-order.scala:6 ]
| -> val f: () => String = () => this.message [ features-high-order.scala:2 ]
| -> def message: String = b [ features-high-order.scala:8 ]
We establish the following design goals:
By reasonable usage, we include the following use cases (but not restricted to them):
this and outer this during initializationthis and outer this during initializationTo achieve the goals, we uphold the following fundamental principles: stackability, monotonicity, scopability and authority.
Stackability means that all fields of a class are initialized at the end of the class body. Scala enforces this property in syntax by demanding that all fields are initialized at the end of the primary constructor, except for the language feature below:
var x: T = _
Control effects such as exceptions may break this property, as the following example shows:
class MyException(val b: B) extends Exception("")
class A:
val b = try { new B } catch { case myEx: MyException => myEx.b }
println(b.a)
class B:
throw new MyException(this)
val a: Int = 1
In the code above, the control effect teleport the uninitialized value wrapped in an exception. In the implementation, we avoid the problem by ensuring that the values that are thrown must be transitively initialized.
Monotonicity means that the initialization status of an object should not go backward: initialized fields continue to be initialized, a field points to an initialized object may not later point to an object under initialization. As an example, the following code will be rejected:
trait Reporter:
def report(msg: String): Unit
class FileReporter(ctx: Context) extends Reporter:
ctx.typer.reporter = this // ctx now reaches an uninitialized object
val file: File = new File("report.txt")
def report(msg: String) = file.write(msg)
In the code above, suppose ctx points to a transitively initialized
object. Now the assignment at line 3 makes this, which is not fully
initialized, reachable from ctx. This makes field usage dangerous,
as it may indirectly reach uninitialized fields.
Monotonicity is based on a well-known technique called heap monotonic typestate to ensure soundness in the presence of aliasing [1]. Roughly speaking, it means initialization state should not go backwards.
Scopability means that there are no side channels to access to partially constructed objects. Control effects like coroutines, delimited control, resumable exceptions may break the property, as they can transport a value upper in the stack (not in scope) to be reachable from the current scope. Static fields can also serve as a teleport thus breaks this property. In the implementation, we need to enforce that teleported values are transitively initialized.
The three principles above contribute to local reasoning about initialization, which means:
An initialized environment can only produce initialized values.
For example, if the arguments to an new-expression are transitively
initialized, so is the result. If the receiver and arguments in a method call
are transitively initialized, so is the result.
Local reasoning about initialization gives rise to a fast initialization checker, as it avoids whole-program analysis.
The principle of authority goes hand-in-hand with monotonicity: the principle of monotonicity stipulates that initialization states cannot go backwards, while the principle of authority stipulates that the initialization states may not go forward at arbitrary locations due to aliasing. In Scala, we may only advance initialization states of objects in the class body when a field is defined with a mandatory initializer or at local reasoning points when the object becomes transitively initialized.
There are three fundamental abstractions for initialization states of objects:
In the initialization checker, the abstraction Warm is refined to handle inner
classes and multiple constructors:
C, where the immediate outer of C is V, the constructor is ctor and constructor arguments are Vs.The initialization checker checks each concrete class separately. The abstraction ThisRef
represents the current object under initialization:
C under initialization.The initialization state of the current object is stored in the abstract heap as an
abstract object. The abstract heap also serves as a cache for the field values
of warm objects. Warm and ThisRef are "addresses" of the abstract objects stored
in the abstract heap.
Two more abstractions are introduced to support functions and conditional expressions:
Fun(e, V, C): An abstract function value where e is the code, V is the
abstract value for this inside the function body and the function is located
inside the class C.
Refset(Vs): A set of abstract values Vs.
A value v is effectively hot if any of the following is true:
v is Hot.v is ThisRef and all fields of the underlying object are assigned.v is Warm[C] { ... } and
C does not contain inner classes; andv encounters no initialization errors and the method return value is effectively hot; andv is effectively hot.v is Fun(e, V, C) and calling the function encounters no errors and the
function return value is effectively hot.ThisRef) is effectively hot.An effectively hot value can be regarded as transitively initialized thus can be safely leaked via method arguments or as RHS of reassignment. The initialization checker tries to promote non-hot values to effectively hot whenenver possible.
With the established principles and design goals, the following rules are imposed:
The field access e.f or method call e.m() is illegal if e is cold.
A cold value should not be used.
The field access e.f is invalid if e has the value ThisRef and f is not initialized.
In an assignment o.x = e, the expression e must be effectively hot.
This is how monotonicity is enforced in the system. Note that in an
initialization val f: T = e, the expression e may point to a non-hot
value.
Arguments to method calls must be effectively hot.
Escape of this in the constructor is commonly regarded as an anti-pattern.
However, passing non-hot values as argument to another constructor is allowed, to support creation of cyclic data structures. The checker will ensure that the escaped non-initialized object is not used, i.e. calling methods or accessing fields on the escaped object is not allowed.
An exception is for calling synthetic applys of case classes. For example,
the method call Some.apply(e) will be interpreted as new Some(e), thus
is valid even if e is not hot.
Another exception to this rule is parametric method calls. For example, in
List.apply(e), the argument e may be non-hot. If that is the case, the
result value of the parametric method call is taken as cold.
Method calls on hot values with effectively hot arguments produce hot results.
This rule is assured by local reasoning about initialization.
Method calls on ThisRef and warm values will be resolved statically and the
corresponding method bodies are checked.
In a new expression new p.C(args), if the values of p and args are
effectively hot, then the result value is also hot.
This is assured by local reasoning about initialization.
In a new expression new p.C(args), if any value of p and args is not
effectively hot, then the result value takes the form Warm[C] { outer = Vp, args = Vargs }. The initialization code for the class C is checked again to make
sure the non-hot values are used properly.
In the above, Vp is the widened value of p --- the widening happens if p
is a warm value Warm[D] { outer = V, args } and we widen it to
Warm[D] { outer = Cold, args }.
The variable Vargs represents values of args with non-hot values widened
to Cold.
The motivation for the widening is to finitize the abstract domain and ensure termination of the initialization check.
The scrutinee in a pattern match and the values in return and throw statements must be effectively hot.
The analysis takes the primary constructor of concrete classes as entry points. It follows the constructors of super classes, which might be defined in another project. The analysis takes advantage of TASTy for analyzing super classes defined in another project.
The crossing of project boundary raises a concern about modularity. It is well-known in object-oriented programming that superclass and subclass are tightly coupled. For example, adding a method in the superclass requires recompiling the child class for checking safe overriding.
Initialization is no exception in this respect. The initialization of an object essentially involves close interaction between subclass and superclass. If the superclass is defined in another project, the crossing of project boundary cannot be avoided for soundness of the analysis.
Meanwhile, inheritance across project boundary has been under scrutiny and the
introduction of open classes mitigate the concern here.
For example, the initialization check could enforce that the constructors of
open classes may not contain method calls on this or introduce annotations as
a contract.
The feedback from the community on the topic is welcome.
Occasionally you may want to suppress warnings reported by the
checker. You can either write e: @unchecked to tell the checker to
skip checking for the expression e, or you may use the old trick:
mark some fields as lazy.