docs/coding-guidelines/api-guidelines/nullability.md
As of C# 8 and improved in C# 9, the C# language provides an opt-in feature that allows the compiler to track reference type nullability in order to catch potential null dereferences. We have adopted that feature set across .NET's libraries, working up from the bottom of the stack, and we continue to do so for all new library development.
We strive to get annotations correct the "first time" and do due diligence in an attempt to do so. However, we acknowledge that we are likely to need to augment and change some annotations in the future.
Any such additions or changes to annotations can impact the warnings consuming code receives if that code has opted-in to nullability analysis and warnings. Even so, for at least the foreseeable future we may still do so, and we will be very thoughtful about when and how we do.
Nullability annotations are considered to represent intent: they represent the nullability contract for the member. Any deviation from that intent on the part of an implementation should be considered an implementation bug, and the compiler will help to minimize the chances of such bugs via its flow analysis and nullability warnings. At the same time, it's important to recognize that the validation performed by the compiler isn't perfect; it can have both false positive warnings (suggesting that something may be null even when it isn't) and false negatives (not warning when something that may be null is dereferenced). The compiler cannot guarantee that an API declared as returning a non-nullable reference never returns null, just as it can't validate that an implementation declared as accepting nulls always behaves correctly when given them. When deciding how to annotate APIs, it's important then to consider the desired contract rather than the current implementation; in other words, prefer to first annotate the API surface area the way that's desired, and only then work to address any warnings in the codebase, rather than driving the API surface area annotations based on where those warnings lead.
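To make the "annotate the contract, not the implementation" point concrete, here is a minimal sketch. `TextUtilities.Capitalize` is a hypothetical API, not one from the libraries; the intended contract is that the argument is required, so it is annotated non-nullable, while the runtime validation stays in place for callers who aren't using nullability analysis:

```csharp
#nullable enable
using System;

public static class TextUtilities
{
    // Hypothetical API: the intended contract is "text is required", so the
    // parameter is annotated string, not string?. The runtime check stays:
    // callers without nullability analysis enabled get no compiler warning
    // and can still pass null.
    public static string Capitalize(string text)
    {
        if (text is null)
            throw new ArgumentNullException(nameof(text));

        return text.Length == 0
            ? text
            : char.ToUpperInvariant(text[0]) + text.Substring(1);
    }
}
```

The annotation expresses intent; the `ArgumentNullException` enforces it at run time for the consumers the compiler can't help.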
When annotating, keep the implementation's runtime behavior unchanged:

- If an implementation validates an argument and throws when `null` is passed in, that dictates that the argument should be non-nullable.
- Methods that previously checked an argument for `null` and threw an `ArgumentNullException` if it was `null` should continue to do so, even if the parameter is defined as non-nullable. Many consumers will not have nullability checking enabled, either because they're using a language other than C#, they're using an older version of C#, they simply haven't opted-in to nullability analysis, they've explicitly suppressed warnings, or the compiler isn't able to detect the misuse.
- Avoid adding null-propagation as part of annotating (e.g. changing `some.Method()` to `some?.Method()`). Any such changes should be thoroughly analyzed and reviewed as a bug fix. In some cases, such changes may be warranted, but they're a red flag that should be scrutinized heavily.

The majority of reference type usage in our APIs is fairly clear as to whether it should be nullable or not. For parameters, these general guidelines cover the majority of cases:
- A parameter should be non-nullable if the implementation checks the argument for `null` and throws an `Argument{Null}Exception` if `null` is passed in for that parameter (whether explicitly in that same method or implicitly as part of some method it calls), such that there's no way `null` could be passed in and have the method return successfully.
- A parameter should be non-nullable if the implementation doesn't explicitly check for `null` but instead will always end up dereferencing the `null` and throwing a `NullReferenceException`.
- A parameter should be non-nullable if the documentation states that an exception results from a `null` argument.
- A parameter should be nullable if the documentation states that the method accepts `null`.
- A parameter should be nullable if the implementation checks for `null` and does something other than throw. This may include normalizing the input, e.g. treating `null` as `string.Empty`.
- A parameter should be nullable if it's optional with a default value of `null`.
- We annotate the `string message` and `Exception innerException` arguments to `Exception`-derived types as nullable. Additional reference type arguments to `Exception`-derived types should, in general, also be nullable unless not doing so is required for compatibility.
- Even if the documentation states that `null` isn't accepted, if the implementation explicitly checks for, normalizes, and accepts a `null` input, the parameter should be defined nullable.

However, there are some gray areas that require case-by-case analysis to determine intent. In particular, if a parameter isn't validated nor sanitized nor documented regarding `null`, but in some cases is simply ignored such that a `null` doesn't currently cause any problems, several factors should be considered when determining whether to annotate it as nullable:
- Is `null` ever passed in our own code bases? If yes, it likely should be nullable.
- Is `null` ever passed in prominent 3rd-party code bases? If yes, it likely should be nullable.
- Is `null` likely to be interpreted as a default / nop placeholder by callers? If yes, it likely should be nullable.
- Is `null` accepted by other methods in a similar area or that have a similar purpose in the same code base? If yes, it likely should be nullable.
- If the implementation doesn't acknowledge `null` and just happens to still work if `null` is passed, and if the API's purpose wouldn't make sense if `null` were used, the parameter likely should be non-nullable.

Things are generally easier when looking at return values (and `out` parameters), as those can largely be driven by what the API's implementation is capable of:
- A return type (or `out` parameter) should be nullable if `null` may be returned under any circumstance.
- For reference type fields exposed from structs, even though technically `default(Struct)` could be used to create a default instance, and thus its exposed fields could be `null` and thus they should be nullable, if such a use is considered invalid we shouldn't use it for annotation purposes. Let's consider two examples on opposite ends of the spectrum. First, `System.Threading.CancellationToken`. `CancellationToken` has a `CancellationTokenSource` field, and `default(CancellationToken)` is considered a perfectly valid instance to use; as a result, its `CancellationTokenSource` field must be nullable. But second, let's consider `System.Reflection.InterfaceMapping`. `InterfaceMapping` is a simple wrapper around four public reference type fields, and so technically they should all be nullable. But it's considered invalid to just use a `default(InterfaceMapping)`; no one ever does it, as you only ever get an `InterfaceMapping` back from some reflection method calls that produce proper instances, and those instances will have all of these fields initialized to non-null. Making them nullable, while technically correct, would harm the 100% correct use case in favor of the 0% invalid use case.

Annotating one of our return types as non-nullable is equivalent to documenting a guarantee that it will never return `null`. Violations of that guarantee are bugs to be fixed in the implementation. However, there is a huge gap here, in the form of overridable members.
For virtual members, annotating a return type as non-nullable places a requirement on all overrides to meet those same guarantees, just as any other documented behavior of a virtual applies to all overrides, whether those stated guarantees can be enforced by the compiler or not. An override that doesn't abide by these guarantees has a bug. For existing virtual APIs that have already documented a guarantee about a non-nullable return, it's expected that the return type will be annotated as non-nullable, and derived types must continue to respect that guarantee, albeit now with the compiler's assistance.
However, for existing virtual APIs that do not have any such strong guarantee documented but where the intent was for the return value to be non-null, it is a grayer area. The most accurate return type would be T?, whereas the intent-based return type would be T. For T?, the pros are that it accurately reflects that nulls may emerge, but at the expense of consumers that know a null will never emerge having to use ! or some other suppression when dereferencing. For T, the pros are that it accurately conveys the intent to overriders and allows consumers to avoid needing any form of suppression, but ironically at the expense of potential increases in occurrences of NullReferenceExceptions due to consumers then not validating the return type to be non-null and not being able to trust in the meaning of a method returning non-nullable. As such, there are several factors to consider when deciding which return type to use for an existing virtual/abstract/interface method:
- How common is it in practice for overrides to actually return `null`?
- How common is it for consumers to use the member via the type that defines it, rather than via a derived type that could tighten the contract to `T` from `T?`?
- How common is it for consumers to directly dereference the result, rather than passing it to something that accepts a `T?`?

`Object.ToString` is arguably the most extreme case. Answering the above questions:
- It's fairly common for `ToString` overrides to return `null` in some cases (we've found examples in dotnet/runtime, dotnet/roslyn, NuGet/NuGet.Client, dotnet/aspnetcore, and so on). One of the most prevalent conditions for this is types that just return the value in a `string` field which may contain its default value of `null`, in particular for structs where a ctor may not have even had a chance to run and validate an input. Guidance in the docs suggests that `ToString` shouldn't return `null` or `string.Empty`, but even the docs don't follow that guidance.
- Some consumption of `ToString` does happen via the base `object.ToString`, but many `ToString` uses are actually on derived types. This is particularly true when working in a code base that both defines a type and consumes its `ToString`.
- It's relatively rare for the result of an `Object.ToString` call (made on the base) to be directly dereferenced. It's much more common to pass it to another method that accepts `string?`, such as `String.Concat`, `String.Format`, `Console.WriteLine`, logging utilities, and so on. And while we advocate that `ToString` results shouldn't be assumed to be in a particular machine-readable format and parsed, it's certainly the case that code bases do, such as using `Substring` on the result; but in such cases, the caller needs to understand the format of what's being rendered, which generally means they're working with a derived type rather than calling through the base `Object.ToString`.

As such, for now, we've annotated `Object.ToString` as returning `string?`. We can re-evaluate this decision as we get more experience with consumers of the feature.
In contrast, we've annotated Exception.Message (which is also virtual) as being non-nullable, even though technically a derived class could override it to return null, because doing so is so rare that we couldn't find any meaningful examples where null is returned.
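The struct scenario described above, where a `ToString` override can observe a field's default `null` value, can be sketched as follows. `NamedValue` is a hypothetical type invented for illustration:

```csharp
#nullable enable
using System;

// default(NamedValue) never runs the constructor, so _name is null in that
// case even though it is declared as non-nullable string. This is the
// pattern that makes object.ToString's string? annotation the accurate one.
public readonly struct NamedValue
{
    private readonly string _name;

    public NamedValue(string name) =>
        _name = name ?? throw new ArgumentNullException(nameof(name));

    // Accurate annotation: returns null for a default instance.
    public override string? ToString() => _name;
}
```

A consumer aware of the `string?` annotation would write something like `value.ToString() ?? "<unnamed>"` rather than dereferencing the result directly.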
- Implement `IComparable<T?>` instead of `IComparable<T>` on a reference type `T`.
- Implement `IEquatable<T?>` instead of `IEquatable<T>` on a reference type `T`.
e.g. `public sealed class Version : IComparable<Version?>, IEquatable<Version?>, ...`
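A small sketch of what such an implementation looks like in practice; `Badge` is a hypothetical type, standing in for `Version` above:

```csharp
#nullable enable
using System;

// Implementing the interfaces over the nullable type argument lets callers
// compare against null without warnings; null compares unequal to any
// instance and orders before it.
public sealed class Badge : IComparable<Badge?>, IEquatable<Badge?>
{
    public int Id { get; }
    public Badge(int id) => Id = id;

    public int CompareTo(Badge? other) => other is null ? 1 : Id.CompareTo(other.Id);
    public bool Equals(Badge? other) => other is not null && Id == other.Id;

    public override bool Equals(object? obj) => Equals(obj as Badge);
    public override int GetHashCode() => Id;
}
```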
The C# compiler respects a set of attributes that impact its flow analysis. We make use of these attributes when simple annotation is insufficient to fully express the contract of a method.
- Annotate methods that never return (e.g. `ThrowHelper.ThrowArgumentException`) or that always exit the process (e.g. `Environment.Exit`) with `[DoesNotReturn]`.
- Annotate generic `out` arguments on `Try` methods with `[MaybeNullWhen(false)]`. In general, such arguments may be `null` when the method returns `false` because the method was unable to fetch/compute the desired data. If the consumer defined the generic type as being non-nullable, then such attribution will highlight that the `out` argument may be `null` when the method returns `false` but will be non-null when the method returns `true`. If the consumer defined the generic type as being nullable, then the attribute simply ends up being a nop, because a value of that generic type could always be `null` (e.g. if nulls are allowed as values in a dictionary, where a successful `TryGetValue` call returning `true` could reasonably produce `null`).
- Annotate non-generic reference type `out` arguments on `Try` methods as being nullable and with `[NotNullWhen(true)]` (e.g. `TryCreate([NotNullWhen(true)] out Semaphore? semaphore)`). For non-generic arguments, the argument should be nullable because a failed call will generally store `null` into the argument, and it should be `[NotNullWhen(true)]` because a successful call will store non-null.
- Annotate `ref` arguments that are guaranteed to be non-null upon exit (e.g. lazy initialization helpers) with `[NotNull]`.
- Annotate a property whose getter never returns `null` but whose setter allows `null` as being non-nullable but also `[AllowNull]`.
- Annotate a property whose getter may return `null` but whose setter throws for `null` as being nullable but also `[DisallowNull]`.
- Apply `[NotNullWhen(true)]` to nullable arguments of `Try` methods that will definitively be non-null if the method returns `true`. For example, if `Int32.TryParse(string? s)` returns `true`, `s` is known to not be `null`, and so the method should be `public static bool TryParse([NotNullWhen(true)] string? s, out int result)`.
- Apply `[NotNullWhen(true)]` to overrides of `public virtual bool Equals(object? obj)`, except in the extremely rare circumstance where a non-null instance may compare equally to `null` (one example of where it's not valid is on `Nullable<T>`).
- Apply `[NotNullIfNotNull(string)]` to a nullable `ref` argument that will be non-null upon exit whenever another argument is non-null, passing that other argument's name as the string. Example: `public void Exchange([NotNullIfNotNull("value")] ref object? location, object? value);`.
- Apply `[return: NotNullIfNotNull(string)]` to a method that will not return `null` whenever a given argument is non-null, passing that argument's name as the string. Example: `[return: NotNullIfNotNull("name")] public string? FormatName(string? name);`.
- Apply `[MemberNotNull(string fieldName)]` to a helper method which initializes member field(s), passing in the field name. Example: `[MemberNotNull("_buffer")] private void InitializeBuffer()`. This will help to avoid spurious warnings at call sites that call the initialization method and then proceed to use the specified field. Note that there are two constructors for `MemberNotNull`: one that takes a single string, and one that takes a `params string[]`. When the number of fields initialized is small (e.g. <= 3), it's preferable to use multiple `[MemberNotNull(string)]` attributes on the method rather than one `[MemberNotNull(string, string, string, ...)]` attribute, as the latter is not CLS-compliant and will likely require `#pragma warning disable` and `#pragma warning restore` around the line to suppress warnings.
- Avoid `[MaybeNull]`, not because it's problematic, but because there's almost always a better option, such as `T?` (as of this writing, in all of dotnet/runtime there are only 7 occurrences of `[MaybeNull]`). One example of where it's applicable is `AsyncLocal<T>.Value`: `[DisallowNull]` can't be used here, because `null` is valid if `T` is nullable, and `T?` shouldn't be used because `Value` shouldn't be set to `null` if `T` isn't nullable. Another is in the relatively rare case where a public or protected field is exposed, may begin life as `null`, but shouldn't be explicitly set to `null`.

Code reviews for enabling the nullability warnings are particularly interesting in that they often differ significantly from general code reviews. Typically, a code reviewer focuses only on the code actually being modified (e.g. the lines highlighted by the code diffing tool); however, enabling the nullability feature has a much broader impact, in that it effectively changes the meaning of every reference type use in the codebase (or, more specifically, in the scope at which the nullability warning context was applied). So, for example, if you turn on nullability for a whole file (`#nullable enable` at the top of the file) and then touch no other lines in the file, every method that accepts a `string` is now accepting a non-nullable `string`; whereas previously passing in `null` to that argument would be fine, now the compiler will warn about it, and to allow nulls, the argument must be changed to `string?`. This means that enabling nullability checking requires reviewing all exposed APIs in that context, regardless of whether they were modified or not, as the contract exposed by the API may have been implicitly modified.
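Several of the attribute patterns described above can be sketched together in one small example. `Cache`, `TryLookup`, and `_buffer` are hypothetical names invented for illustration:

```csharp
#nullable enable
using System.Diagnostics.CodeAnalysis;

public sealed class Cache
{
    private string? _buffer;

    // [MemberNotNull] tells the compiler _buffer is non-null once this runs,
    // avoiding spurious warnings at call sites.
    [MemberNotNull(nameof(_buffer))]
    private void InitializeBuffer() => _buffer = string.Empty;

    // A Try method: the nullable input gets [NotNullWhen(true)] because a
    // true return implies it wasn't null, and the non-generic reference type
    // out argument is nullable plus [NotNullWhen(true)] because a failed call
    // stores null while a successful call stores non-null.
    public static bool TryLookup([NotNullWhen(true)] string? key,
                                 [NotNullWhen(true)] out string? value)
    {
        if (key is null)
        {
            value = null;
            return false;
        }

        value = key.ToUpperInvariant();
        return true;
    }

    public int BufferLength()
    {
        InitializeBuffer();
        return _buffer.Length; // no warning: [MemberNotNull] vouches for _buffer
    }
}
```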
A code review for enabling nullability generally involves three passes:
Review all implementation changes made in the code. Except when explicitly fixing a bug (which should be rare), the annotations employed for nullability should have zero impact on the generated IL (other than potentially some added attributes in the metadata). The most common changes are:
- Adding `?` to reference type parameters and locals. These inform the compiler that nulls are allowed. For locals, they evaporate entirely at compile time. For parameters, they impact the `[Nullable(...)]` attributes emitted into the metadata by the compiler, but have no effect on the implementation IL.
- Adding `!` to reference type usage. These essentially suppress the null warning, telling the compiler to treat the expression as if it's non-null. These evaporate at compile time.
- Adding `Debug.Assert(reference != null);` statements. These inform the compiler that the mentioned reference is non-null, which will cause the compiler to factor that in and have the effect of suppressing subsequent warnings on that reference (until the flow analysis suggests that could change). As with any `Debug.Assert`, these evaporate at compile time in release builds (where `DEBUG` isn't defined).
Almost any other change has the potential to change the IL, which should not be necessary for the feature. In particular, it's common for `?`s on dereferences to sneak in, e.g. changing `someVar.SomeMethod()` to `someVar?.SomeMethod()`; that is a change to the IL, and should only be employed when there's an actual known bug that's important to fix, as otherwise we're incurring unnecessary cost. Similarly, it's easy to accidentally add `?` to value types, which has a significant impact, changing the `T` to a `Nullable<T>`, and should be avoided.
Any `!`s added that should have been unnecessary and are required due to either a compiler issue or due to lack of expressibility about annotations should have a `// TODO-NULLABLE: http://link/to/relevant/issue` comment added on the same line.
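The IL-neutral suppression techniques from this first review pass can be sketched side by side. `Worker` is a hypothetical type invented for illustration:

```csharp
#nullable enable
using System.Diagnostics;

public sealed class Worker
{
    private string? _name; // may be null until Start has run

    public void Start(string name) => _name = name;

    public int NameLengthAsserted()
    {
        // Informs flow analysis that _name is non-null here; evaporates in
        // release builds (where DEBUG isn't defined), so the IL is unchanged.
        Debug.Assert(_name != null);
        return _name.Length;
    }

    public int NameLengthSuppressed()
    {
        // The null-forgiving operator suppresses the warning and also
        // evaporates at compile time; no IL impact.
        return _name!.Length;
    }
}
```

Both helpers compile without nullability warnings, yet the generated IL is identical to the un-annotated original, which is exactly what the reviewer should verify.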
Review the API changes explicitly made. These are the ones that show up in the diff. They should be reviewed to validate that they make sense from a contract perspective. Do we expect/allow nulls everywhere a parameter reference type was augmented with `?`? Do we potentially return nulls everywhere a return type was augmented with `?`? Was anything else changed that could be an accidental breaking change (e.g. a value type parameter getting annotated to become a `Nullable<T>` instead of a `T`)?
Review all other exported APIs (e.g. `public` and `protected` on public types) for all reference types in both return and parameter positions. Anything that wasn't changed to be `?` is now defined as non-nullable. For parameters, that means consuming code will now get a harsh warning if it tries to pass `null`, and thus these should be changed if `null` actually is allowed / expected. For returns, it means the API will never return `null`; if it might return `null` in some circumstance, the API should be changed to return `?`. This is the most time-consuming and tedious part of the review.
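As a closing sketch of what this third pass catches, consider a hypothetical API (`UserDirectory.FindUser`, invented for illustration) that the diff never touched. Enabling nullability silently declared it as never returning `null`, which its implementation can't guarantee, so the reviewer changes the return type:

```csharp
#nullable enable
using System.Collections.Generic;

public sealed class UserDirectory
{
    private readonly Dictionary<int, string> _users = new();

    public void Add(int id, string name) => _users[id] = name;

    // Before review: public string FindUser(int id) - implicitly non-nullable
    // once the file opted in. The lookup can miss, so the accurate contract
    // is string?.
    public string? FindUser(int id) =>
        _users.TryGetValue(id, out string? name) ? name : null;
}
```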