docs/contributing/Localization In Compiler Tests.md

Localization in Compiler Tests

The compiler tests are structured so that they can be run on a machine using any locale. This serves both as a tool to ensure the compiler does not have any localization errors and as a way to ensure that developers from any locale can contribute to the project. The ability to build and execute tests should not be limited to English-speaking developers.

Our infrastructure runs the compiler test suite on es-ES machines for both .NET Core and .NET Framework. This helps us ensure that compiler tests can be authored and executed on non-English machines without issues.

How it works

To ensure that our tests run on any locale, the vast majority of them follow the same general approach.

  1. All expected baselines are expressed in invariant culture. For example, decimal values are written with a period (.) as the decimal separator, not a comma (,).
  2. All generated baselines, usually produced via Console.WriteLine inside CompileAndVerify, must produce output in invariant culture.
  3. Diagnostic tests use the VerifyDiagnostics helpers with en-US values for the expected messages. The implementation of VerifyDiagnostics requests the diagnostic message in the en-US culture. Forcing a consistent culture, no matter the current culture of the system, helps ensure the compiler has no localization issues in the product.

The largest source of friction that developers encounter when authoring tests is passing values to Console.WriteLine that have locale-dependent representations, such as double and decimal values.

csharp
decimal d = 1.2m; // the m suffix is required for a decimal literal
Console.WriteLine(d); // formatted with the current culture

The above code will print 1.2 on en-US machines but 1,2 on es-ES machines. To fix this issue, the value needs to be explicitly formatted with invariant culture.

csharp
using System.Globalization;
...

decimal d = 1.2m; // the m suffix is required for a decimal literal
Console.WriteLine(d.ToString(CultureInfo.InvariantCulture));

This will consistently print 1.2.
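The same concern applies when several values are combined into one string, because string interpolation also formats with the current culture. The following is a minimal sketch, not taken from the compiler test suite, that shows two standard library options: FormattableString.Invariant and string.Format with CultureInfo.InvariantCulture. The variable names are illustrative, and setting CurrentCulture to es-ES is only there to simulate a Spanish machine; it assumes culture data for es-ES is available on the host.

```csharp
using System;
using System.Globalization;

// Assumption for illustration: simulate an es-ES machine on any host.
CultureInfo.CurrentCulture = CultureInfo.GetCultureInfo("es-ES");

decimal price = 1.2m;
double ratio = 0.5;

// Culture-sensitive: plain interpolation formats with CurrentCulture,
// so the decimal separator depends on the machine.
string local = $"{price} {ratio}";

// Culture-invariant: both calls format with InvariantCulture.
string viaInvariant = FormattableString.Invariant($"{price} {ratio}");
string viaFormat = string.Format(CultureInfo.InvariantCulture, "{0} {1}", price, ratio);

Console.WriteLine(local);
Console.WriteLine(viaInvariant); // 1.2 0.5
Console.WriteLine(viaFormat);    // 1.2 0.5
```

Either invariant form keeps a generated baseline stable across locales; which one reads better depends on how many values the test prints.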

Exceptions

There are a few cases where it is very difficult to avoid locale dependent values. These include:

  1. When a diagnostic includes a message from a thrown exception.
  2. When a diagnostic includes a message from the underlying OS.
  3. When a message is generated by a tool such as msbuild.

In those cases the message is likely to be locale dependent. Tests that have these values should be marked with [ConditionalFact(typeof(IsEnglishLocal))] so they run on en-US machines only.

Infrastructure

Windows

The Spanish Windows machines used in our infrastructure execute with CurrentCulture set to es-ES but CurrentUICulture set to en-US. Roughly speaking, this configuration means that string formatting uses es-ES while resource lookups use en-US. This setup doesn't fully test our code base, so our test infrastructure forces CurrentUICulture to match CurrentCulture when the two differ.

This is done in the TestBase constructor, which all compiler test classes inherit from. When CurrentUICulture does not equal CurrentCulture, the constructor saves the original CurrentUICulture and sets CurrentUICulture to CurrentCulture. The original value is restored in Dispose so the change is scoped to the lifetime of each test instance. This ensures that on a Windows machine configured with CurrentCulture = es-ES and CurrentUICulture = en-US, resource string lookups also happen in es-ES, fully exercising the localization paths.
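The save-and-restore pattern described above can be sketched as follows. This is a simplified illustration and not the actual TestBase source: the class names CultureScopedTestBase and SampleTests are hypothetical, and the two assignments at the top only simulate the Windows configuration so the effect is visible on any host with es-ES and en-US culture data.

```csharp
using System;
using System.Globalization;

// Assumption for illustration: simulate CurrentCulture = es-ES, CurrentUICulture = en-US.
CultureInfo.CurrentCulture = CultureInfo.GetCultureInfo("es-ES");
CultureInfo.CurrentUICulture = CultureInfo.GetCultureInfo("en-US");

using (new SampleTests())
{
    // While the test instance is alive, resource lookups use es-ES.
    Console.WriteLine(CultureInfo.CurrentUICulture.Name);
}

// After Dispose, the original UI culture (en-US) is back.
Console.WriteLine(CultureInfo.CurrentUICulture.Name);

// Hypothetical stand-in for a test base class.
public abstract class CultureScopedTestBase : IDisposable
{
    // Non-null only when the constructor changed the UI culture.
    private readonly CultureInfo? _originalUICulture;

    protected CultureScopedTestBase()
    {
        if (!CultureInfo.CurrentUICulture.Equals(CultureInfo.CurrentCulture))
        {
            _originalUICulture = CultureInfo.CurrentUICulture;
            CultureInfo.CurrentUICulture = CultureInfo.CurrentCulture;
        }
    }

    public virtual void Dispose()
    {
        // Restore only if the constructor made a change.
        if (_originalUICulture is not null)
        {
            CultureInfo.CurrentUICulture = _originalUICulture;
        }
    }
}

// Hypothetical test class that inherits the culture scoping.
public sealed class SampleTests : CultureScopedTestBase
{
}
```

Tying the change to the constructor and Dispose fits test frameworks that create one instance of the test class per test, so each test starts from the forced culture and leaves the thread as it found it.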

Linux

The Spanish Linux machines used in our infrastructure run on Ubuntu with the LC_ALL environment variable set to es_ES.UTF-8. On .NET running on Linux, LC_ALL determines both CurrentCulture and CurrentUICulture. This means that unlike the Windows configuration, both string formatting and resource lookups will use es-ES. Because both culture values are already the same, the test infrastructure does not need to force CurrentUICulture to CurrentCulture on these machines.