# Golden Data test framework
The Golden Data test framework provides the ability to run and manage tests that produce output which is verified by comparing it to the checked-in, known-valid output. Any difference results in test failure, and either the code or the expected output has to be updated.
Golden Data tests excel at bulk diffing of failed test outputs and bulk accepting of new test outputs.
Tests MUST produce text output that is diffable and can be inspected in the pull request.
Tests MUST produce output that is deterministic and repeatable, including when running on different platforms, just as with ASSERT_EQ.
Tests SHOULD produce output that changes incrementally in response to incremental test or code changes.
Multiple test variations MAY be bundled into a single test. This is recommended when testing the same feature with different inputs: it helps reviewers by grouping similar outputs together, and it also reduces the number of output files. (See the variation example below.)
Changes to a test fixture or test code that affect a non-trivial number of test outputs MUST be done in a separate pull request from production code changes.
Tests in the same suite SHOULD share fixtures when appropriate. This reduces the cost of adding new tests to the suite. Changes to the fixture can only affect the expected outputs from that fixture, and those outputs can be updated in bulk.
Tests in different suites SHOULD NOT reuse/share fixtures, as changes to a shared fixture can affect a large number of expected outputs. There are exceptions to that rule, and tests in different suites MAY reuse/share fixtures if:
Tests SHOULD print both the inputs and the outputs of the tested code. This makes it easy for reviewers to verify that the expected outputs are indeed correct, since the input and output appear next to each other. Otherwise, finding the input used to produce a new output may not be practical, and that input might not even be included in the diff.
When resolving merge conflicts on the expected output files, one of the approaches below SHOULD be used:
Expected test outputs SHOULD be reused across tightly-coupled test suites. Suites are tightly coupled if:
Tests SHOULD use different output files for legitimate and expected output differences between those suites.
Examples:
AVOID manually modifying expected output files; those files are considered auto-generated. Instead, run the tests and then copy the generated output as the new expected output file. See the "How to diff and accept new test outputs" section for instructions.
Each golden data test should produce a text output that will be verified later. The output format must be text, but otherwise the test author can choose the most appropriate format (plain text, JSON, BSON, YAML, or mixed). If a test consists of multiple variations, each variation should be clearly separated from the others.
Note: test output is usually write-only. It is OK to focus on just writing serialization/printing code, without needing to provide deserialization/parsing code.
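For example, a write-only printer can be as simple as an operator<< overload. This is a minimal sketch; MyResult and its fields are hypothetical, and only the stream-based printing pattern comes from this framework:

```cpp
#include <ostream>

// Hypothetical result type produced by the code under test.
struct MyResult {
    int matchedDocs;
    int scannedDocs;
};

// Write-only serialization: the golden framework only compares printed text,
// so no parser for this format is ever needed.
std::ostream& operator<<(std::ostream& os, const MyResult& r) {
    return os << "{matched: " << r.matchedDocs << ", scanned: " << r.scannedDocs << "}";
}
```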
When the actual test output differs from the expected output, the test framework fails the test, logs both outputs, and also writes the expected and actual outputs to files (under the configured output root) that can be inspected later.
::mongo::unittest::GoldenTestConfig - Provides a way to configure a test suite (or suites); defines where the expected output files are located in the source repo.
::mongo::unittest::GoldenTestContext - Provides an output stream where tests should write their outputs; verifies that output against the expected output in the source repo.
See: golden_test.h
Before running bazel test, set up the golden test framework as described in the Setup section below. This ensures that the C++ test outputs are written to a location where buildscripts/golden_test.py can find them, so that the diff and accept functions work as expected.
Example:
#include "mongo/unittest/golden_test.h"
GoldenTestConfig myConfig("src/mongo/my_expected_output");
TEST(MySuite, MyTest) {
GoldenTestContext ctx(myConfig);
ctx.outStream() << "print something here" << std::endl;
ctx.outStream() << "print something else" << std::endl;
}
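With this config, the expected output for MySuite/MyTest is checked into the source repo under src/mongo/my_expected_output (the directory passed to GoldenTestConfig above); the file layout within that directory is chosen by the framework.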
Example of bundling multiple variations into a single test:

```cpp
// Prints both the input and the output of the code under test, so that
// reviewers see them next to each other in the golden file.
// runCodeUnderTest is the (hypothetical) function being tested.
template <typename T>
void runVariation(GoldenTestContext& ctx, const std::string& variationName, const T& input) {
    ctx.outStream() << "VARIATION " << variationName << std::endl;
    ctx.outStream() << "input: " << input << std::endl;
    ctx.outStream() << "output: " << runCodeUnderTest(input) << std::endl;
    ctx.outStream() << std::endl;
}

TEST_F(MySuiteFixture, MyFeatureATest) {
    GoldenTestContext ctx(myConfig);
    runVariation(ctx, "variation 1", "some input testing A #1");
    runVariation(ctx, "variation 2", "some input testing A #2");
    runVariation(ctx, "variation 3", "some input testing A #3");
}

TEST_F(MySuiteFixture, MyFeatureBTest) {
    GoldenTestContext ctx(myConfig);
    runVariation(ctx, "variation 1", "some input testing B #1");
    runVariation(ctx, "variation 2", "some input testing B #2");
    runVariation(ctx, "variation 3", "some input testing B #3");
    runVariation(ctx, "variation 4", "some input testing B #4");
}
```
Also see self-test: golden_test_test.cpp
Use the buildscripts/golden_test.py command-line tool to manage test outputs. This includes listing the outputs of recent test runs, diffing actual outputs against expected outputs, and accepting new outputs.
buildscripts/golden_test.py requires a one-time workstation setup.
Note: this setup is only required to use buildscripts/golden_test.py itself. It is NOT required to just run the Golden Data tests when not using buildscripts/golden_test.py.
Use the buildscripts/golden_test.py built-in setup command to initialize the default config for your current platform.
Instructions for Linux
Run the buildscripts/golden_test.py setup utility:
buildscripts/golden_test.py setup
Instructions for Windows
Run the buildscripts/golden_test.py setup utility. You may be asked for a password when not running in a "Run as administrator" shell.
c:\python\python310\python.exe buildscripts/golden_test.py setup
This is the same config that would be set up by the automatic setup. It uses a unique subfolder for each test run (the default).
Instructions for Linux/macOS:
Create ~/.golden_test_config.yml with the following contents:

```yaml
outputRootPattern: /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%
diffCmd: git diff --no-index "{{expected}}" "{{actual}}"
```
Update .bashrc / .zshrc:

export GOLDEN_TEST_CONFIG_PATH=~/.golden_test_config.yml

Alternatively, modify /etc/environment or other configuration if needed by a debugger, IDE, etc.
Instructions for Windows:
Create %LocalAppData%\.golden_test_config.yml with the following contents:

```yaml
outputRootPattern: 'C:\Users\Administrator\AppData\Local\Temp\test_output\out-%%%%-%%%%-%%%%-%%%%'
diffCmd: 'git diff --no-index "{{expected}}" "{{actual}}"'
```
Add the GOLDEN_TEST_CONFIG_PATH=%LocalAppData%\.golden_test_config.yml environment variable:
runas /profile /user:administrator "setx GOLDEN_TEST_CONFIG_PATH %LocalAppData%\.golden_test_config.yml"
List the outputs of recent test runs:
$> buildscripts/golden_test.py list
Diff the outputs of the most recent test run:
$> buildscripts/golden_test.py diff
This will run the diffCmd that was specified in the config file.
Accept the outputs of the most recent test run:
$> buildscripts/golden_test.py accept
This will copy all actual test outputs from that test run into the source repo as the new expected outputs.
Get the expected and actual output paths for the most recent test run:
$> buildscripts/golden_test.py get
Get the root output path for the most recent test run:
$> buildscripts/golden_test.py get_root
Get all available commands and options:
$> buildscripts/golden_test.py --help
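A typical iteration might look like the following, using only the commands above (the bazel target name is a hypothetical placeholder):

$> bazel test //path/to:my_golden_test
$> buildscripts/golden_test.py diff
$> buildscripts/golden_test.py accept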
Some tests run in multiple passthroughs or build variants, so they have multiple expected output files.
Whenever such a test is updated, all of its expected files should be updated together, e.g.:
buildscripts/golden_test.py --verbose clean-run-accept jstests/query_golden/NAME_OF_TEST.js
This option uses resmoke.py find-suites to determine the passthrough suites a test belongs to and
runs them.
If the test is found to belong only to the query_golden_classic passthrough, it is assumed that it can have multiple expected results due to being run under multiple build variants with different internalQueryFrameworkControl settings, so the test will be run with various values of internalQueryFrameworkControl.
Example (Linux/macOS):
# Show the test run expected and actual folders:
$> cat test.log | grep "^{" | jq -s -c -r '.[] | select(.id == 6273501 ) | .attr.expectedOutputRoot + " " +.attr.actualOutputRoot ' | sort | uniq
# Run the recursive diff
$> diff -ruN --unidirectional-new-file --color=always <expected_root> <actual_root>
Parse the logs and find the expected and actual outputs for each failed test.
Example (Linux/macOS):
# Find all expected and actual outputs of tests that have failed
$> cat test.log | grep "^{" | jq -s '.[] | select(.id == 6273501 ) | .attr.testPath,.attr.expectedOutput,.attr.actualOutput'
The Golden Data test config file is a YAML file with the following schema:
```yaml
outputRootPattern:
  type: String
  optional: true
  description:
    Root path pattern that will be used to write expected and actual test outputs for all tests
    in the test run.
    If not specified, a temporary folder location will be used.
    The path pattern string may use '%' characters in the last part of the path. '%' characters
    in the last part of the path will be replaced with random lowercase hexadecimal digits.
  examples: /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%
            /var/tmp/test_output

diffCmd:
  type: String
  optional: true
  description:
    Shell command to diff a single golden test run output.
    The {{expected}} and {{actual}} variables should be used and will be replaced with the
    expected and actual output folder paths, respectively.
    This property is not used to decide whether the test passes or fails; it is only used to
    display differences once we've decided that a test failed.
  examples: git diff --no-index "{{expected}}" "{{actual}}"
            diff -ruN --unidirectional-new-file --color=always "{{expected}}" "{{actual}}"
```