docs/antithesis/README.md
Antithesis is a third-party vendor with an environment that can perform network fuzzing. We can
upload images containing docker-compose.yml files, which represent various MongoDB topologies, to
the Antithesis Docker registry. Antithesis runs docker-compose up from these images to spin up
the corresponding multi-container application in its environment and run a test suite. While the
test suite runs, Antithesis performs network fuzzing on the topology and generates a report
identifying bugs. See
https://github.com/mongodb/mongo/wiki/Testing-MongoDB-with-Antithesis for an example of how we
use Antithesis today.
The base_images directory contains the building blocks for creating a MongoDB test topology.
These images are uploaded to the Antithesis Docker registry nightly by the
antithesis image build and push function.
The mongo-binaries image contains the latest mongo, mongos and mongod binaries. It can be used to
start a mongod instance, start a mongos instance, or execute mongo commands. This is the main
building block for creating the System Under Test topology.
The workload image contains the latest mongo binary as well as the resmoke test runner. The workload
container is not part of the actual topology. The purpose of a workload container is to execute
mongo commands to complete the topology setup, and to run a test suite on an existing topology
like so:
buildscripts/resmoke.py run --suite antithesis_concurrency_sharded_with_stepdowns_and_balancer
Every topology must have one workload container.
Note: During the workload image build, evergreen/antithesis_image_build_and_push.sh runs, which
generates "antithesis compatible" test suites and prefixes their names with antithesis_. These are
the test suites that can run in Antithesis and are available from within the workload container.
A topology image assembles the files necessary for spinning up the corresponding topology. It
consists of a docker-compose.yml, a logs directory, a scripts directory and a data
directory. If the image is structured properly, you should be able to copy the files and
directories from it and run docker-compose up to set up the desired topology.
Example of what buildscripts/resmokelib/testing/docker_cluster_image_builder.py generates:
FROM scratch
COPY docker-compose.yml /
ADD scripts /scripts
ADD logs /logs
ADD data /data
ADD debug /debug
All topology images are built and uploaded to the Antithesis Docker registry during the
antithesis image build and push task. Some of these directories, such as /data and /logs, are
created by the evergreen/antithesis_image_build_and_push.sh script.
Note: These images serve solely as a filesystem containing all necessary files for a topology,
so they use FROM scratch.
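As a rough sketch of what "structured properly" means here, the following Python helper checks a directory for the four entries named above. The function and constant names are illustrative assumptions, not part of the repository, and /debug is deliberately left out because the description lists only docker-compose.yml, logs, scripts and data.

```python
from pathlib import Path

# Entries the text above lists for a topology image.
REQUIRED_FILES = ("docker-compose.yml",)
REQUIRED_DIRS = ("logs", "scripts", "data")


def missing_topology_entries(root):
    """Return the required files and directories that are absent under root."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means the copied files should be enough to attempt docker-compose up; anything else names what still needs to be created.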
The docker-compose.yml file describes how to construct the corresponding topology using the
mongo-binaries and workload images.
Example from buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml:
version: '3.0'
services:
  configsvr1:
    container_name: configsvr1
    hostname: configsvr1
    image: mongo-binaries:evergreen-latest-master
    volumes:
      - ./logs/configsvr1:/var/log/mongodb/
      - ./scripts:/scripts/
      - ./data/configsvr1:/data/configdb/
    command: /bin/bash /scripts/configsvr_init.sh
    networks:
      antithesis-net:
        ipv4_address: 10.20.20.6
        # Set an IPv4 address of 10.20.20.130 or higher
        # for the container to be ignored by the fault injector
  configsvr2: ...
  configsvr3: ...
  database1:
    container_name: database1
    hostname: database1
    image: mongo-binaries:evergreen-latest-master
    volumes:
      - ./logs/database1:/var/log/mongodb/
      - ./scripts:/scripts/
      - ./data/database1:/data/db/
    command: /bin/bash /scripts/database_init.sh Shard1
    networks:
      antithesis-net:
        ipv4_address: 10.20.20.3
        # Set an IPv4 address of 10.20.20.130 or higher
        # for the container to be ignored by the fault injector
  database2: ...
  database3: ...
  database4: ...
  database5: ...
  database6: ...
  mongos:
    container_name: mongos
    hostname: mongos
    image: mongo-binaries:evergreen-latest-master
    volumes:
      - ./logs/mongos:/var/log/mongodb/
      - ./scripts:/scripts/
    command: python3 /scripts/mongos_init.py
    depends_on:
      - "database1"
      - "database2"
      - "database3"
      - "database4"
      - "database5"
      - "database6"
      - "configsvr1"
      - "configsvr2"
      - "configsvr3"
    networks:
      antithesis-net:
        ipv4_address: 10.20.20.9
        # The subnet provided here is an example
        # An alternative subnet can be used
  workload:
    container_name: workload
    hostname: workload
    image: workload:evergreen-latest-master
    volumes:
      - ./logs/workload:/var/log/resmoke/
      - ./scripts:/scripts/
    command: python3 /scripts/workload_init.py
    depends_on:
      - "mongos"
    networks:
      antithesis-net:
        ipv4_address: 10.20.20.130
        # The subnet provided here is an example
        # An alternative subnet can be used
networks:
  antithesis-net:
    driver: bridge
    ipam:
      config:
        - subnet: 10.20.20.0/24
Each container must have a command in docker-compose.yml that runs an init script. The init
script belongs in the scripts directory, which is included as a volume. The command should be
set like so: /bin/bash /scripts/[script_name].sh or python3 /scripts/[script_name].py. This is
a requirement for the topology to start up properly in Antithesis.
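A quick way to sanity-check a command value against this convention is sketched below. The function name and the regular expression are assumptions made for illustration and are not part of the repository; trailing arguments are accepted because database1 passes Shard1 to its init script.

```python
import re

# Matches "/bin/bash /scripts/<name>.sh" or "python3 /scripts/<name>.py",
# optionally followed by arguments for the script.
INIT_COMMAND = re.compile(
    r"^(?:/bin/bash /scripts/[\w.-]+\.sh|python3 /scripts/[\w.-]+\.py)(?: .*)?$"
)


def is_valid_init_command(command):
    """Return True if the compose command runs an init script from /scripts."""
    return INIT_COMMAND.match(command.strip()) is not None
```

For example, is_valid_init_command("python3 /scripts/mongos_init.py") is accepted, while a command that starts mongod directly is rejected.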
When creating mongod or mongos instances, route the logs with
--logpath /var/log/mongodb/mongodb.log and mount that directory as a volume, as in database1.
This enables us to easily retrieve logs if a bug is detected by Antithesis.
The ipv4_address should be set to 10.20.20.130 or higher if you do not want that container to
be affected by network fuzzing. For instance, you would likely not want the workload container
to be affected by network fuzzing, as shown in the example above.
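The address rule can be expressed as a small check. This is a sketch under two assumptions: the example subnet 10.20.20.0/24 is in use, and the .130 threshold is treated as a host offset within that subnet. The names below are hypothetical.

```python
import ipaddress

# Example subnet from the compose file; an alternative subnet can be used.
SUBNET = ipaddress.ip_network("10.20.20.0/24")
# Per the text above: hosts at .130 or higher are ignored by the fault injector.
FIRST_EXEMPT_HOST = 130


def is_exempt_from_fuzzing(address, subnet=SUBNET, first_exempt_host=FIRST_EXEMPT_HOST):
    """Return True if the container address is excluded from network fuzzing."""
    ip = ipaddress.ip_address(address)
    if ip not in subnet:
        raise ValueError(f"{address} is not in {subnet}")
    return int(ip) - int(subnet.network_address) >= first_exempt_host
```

With the example topology, the workload container at 10.20.20.130 is exempt, while mongos at 10.20.20.9 is subject to fuzzing.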
Use the evergreen-latest-master tag for all images. The
evergreen/antithesis_image_build_and_push.sh script updates this tag automatically when needed.
Take a look at buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py to see
how to use util methods from buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py
to set up the desired topology. You can also use simple shell scripts, as in the case of
buildscripts/antithesis/topologies/sharded_cluster/scripts/database_init.sh. Init scripts
must never exit, so that the underlying container stays alive. You can use an infinite while
loop in Python scripts or tail -f /dev/null in shell scripts.
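A minimal sketch of the keep-alive pattern for a Python init script follows. It does not reproduce mongos_init.py or utils.py; run_init, its parameters, and the max_iterations escape hatch (present only so the loop can be exercised in a test) are hypothetical.

```python
import time


def run_init(setup, sleep=time.sleep, interval_seconds=60, max_iterations=None):
    """Run setup once, then block so the container's main process never exits."""
    setup()
    iterations = 0
    # With max_iterations=None (the default) this loop runs forever.
    while max_iterations is None or iterations < max_iterations:
        sleep(interval_seconds)
        iterations += 1
    return iterations
```

In a real init script, setup would perform the topology configuration and run_init(setup) would be the last statement, so the container stays up after setup completes.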
Add new Antithesis tasks with care to ensure we are using our limited resources efficiently.
Create a new task extending the antithesis_task_template, tagged with antithesis, passing the specified suite to the antithesis image build and push task. See other examples to get started.
If you provide the evergreen parameter schedule_antithesis_tests to your evergreen patch, then once the Antithesis images are built in that patch, we send Antithesis an API request to run your newly created images for an hour. You will be emailed the report when the run finishes.
Important Note: This will happen for every Antithesis task you schedule in your patch. Please do not schedule more than 1 or 2 tasks with this parameter at a time, or it will use up a lot of our allocated Antithesis testing time.
evergreen patch --param schedule_antithesis_tests=true
Antithesis constantly runs your resmoke suite with one random test from the suite at a time.
We support this out-of-the-box with most resmoke suites that use python fixtures.
This is very similar to how tests run in evergreen.
Your antithesis tasks in evergreen will default to this if the antithesis_test_composer_dir var is not specified on the task.
Antithesis offers a resource called Test Composer to run "test templates" against our clusters. Test Composer enables autonomous testing by letting you define templates that guide Antithesis in generating thousands of test cases across multiple system states. Your evergreen tasks will automatically use Test Composer if the antithesis_test_composer_dir var is specified in the task, as shown in the example below.
Test Composer uses an opinionated framework based on naming conventions to detect and run tests. Unlike traditional example-based testing, Test Composer templates tell Antithesis how to handle parallelism, test length, command order, and fault injection to explore your system's behavior comprehensively.
MongoDB's Test Composer implementations are located in buildscripts/antithesis/test_composer/. The setup still uses a resmoke suite to determine cluster configuration, but test execution is controlled by Test Composer commands rather than running jstests directly.
Test commands must be executable and placed directly under /opt/antithesis/test/v1/<test_dir>/. They follow the naming convention <prefix>_<command>, where the prefix determines the command's behavior. Our evergreen tasks handle building the images and putting the tests in the correct place for you.
Driver commands run during fault injection periods. At least one driver or anytime command is required.
parallel_driver_<command>: Can run concurrently with other parallel drivers (including itself)
singleton_driver_<command>: Runs as the only driver command in a history branch
serial_driver_<command>: Runs when no other driver commands are active
The first_, eventually_ and finally_ commands run in the absence of faults.
first_<command>: Optional setup command that runs once before any driver commands
eventually_<command>: Runs after driver commands start. Kills all drivers and stops faults, creating a new branch
finally_<command>: Like eventually, but only runs after all driver commands complete naturally
anytime_<command>: Can run at any time after first command, even during singleton/serial drivers
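The naming convention can be illustrated with a short parser. This is only a sketch of the rules listed above, not how Test Composer itself detects commands; the function names are hypothetical, and stripping the file extension from the command name is an assumption made for readability.

```python
import os

# Prefixes listed above; anything else (for example helper_ files) is not a command.
KNOWN_PREFIXES = (
    "parallel_driver",
    "singleton_driver",
    "serial_driver",
    "first",
    "eventually",
    "finally",
    "anytime",
)


def split_command_name(filename):
    """Return (prefix, command) for a recognized test command, else None."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    for prefix in KNOWN_PREFIXES:
        if stem.startswith(prefix + "_") and len(stem) > len(prefix) + 1:
            return prefix, stem[len(prefix) + 1:]
    return None


def has_required_command(filenames):
    """Check the rule that at least one driver or anytime command exists."""
    for name in filenames:
        parsed = split_command_name(name)
        if parsed and (parsed[0].endswith("_driver") or parsed[0] == "anytime"):
            return True
    return False
```

A template containing only first_setup.sh and a helper file would fail has_required_command, which matches the requirement stated above.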
This template runs parallel JavaScript operations against MongoDB with built-in retry logic for network failures.
Commands:
Shared Logic: commands.js provides retry mechanisms for network errors and connection helpers.
Key Features:
Retries on network errors: MongoNetworkError, MongoServerSelectionError, RetryableWriteError
Connection string helper: /scripts/print_connection_string.sh
This template runs resmoke tests with randomization, adapting existing test infrastructure for Test Composer.
Commands:
Both use random seeds and shuffling: --seed $(od -vAn -N4 -tu4 < /dev/urandom) --shuffle --sanityCheck
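For readers less familiar with od, the seed expression reads four bytes from /dev/urandom and prints them as one unsigned 32-bit integer. A Python equivalent might look like the sketch below; the helper names are hypothetical, and only the three flags shown above are used.

```python
import os
import sys


def random_seed():
    """Read 4 bytes from the OS entropy source as an unsigned 32-bit integer."""
    # od -tu4 interprets the bytes in host byte order, hence sys.byteorder.
    return int.from_bytes(os.urandom(4), sys.byteorder)


def randomization_args(seed=None):
    """Build the randomization flags, generating a seed when none is given."""
    seed = random_seed() if seed is None else seed
    return ["--seed", str(seed), "--shuffle", "--sanityCheck"]
```

Passing an explicit seed makes a run reproducible, which can help when re-running a failing order locally.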
Create a test directory: buildscripts/antithesis/test_composer/<your_template_name>/
Write test commands: Create executable scripts with appropriate prefixes:
#!/usr/bin/env bash
# buildscripts/antithesis/test_composer/<template>/parallel_driver_mytest.sh
# Your test logic here
# This can run in parallel with other parallel_driver commands
Make scripts executable: chmod +x buildscripts/antithesis/test_composer/<template>/*.sh
Helper files: Use helper_ prefix or subdirectories for shared code - these are ignored by Test Composer
Start with singleton_driver to adapt existing tests, then evolve to parallel/serial commands
To use Test Composer instead of normal resmoke testing, set the antithesis_test_composer_dir variable in your Evergreen task:
- <<: *antithesis_task_template
  name: antithesis_resmoke_suite_with_test_template
  tags: ...
  commands:
    ...
    - func: "antithesis image build and push"
      vars:
        suite: concurrency_sharded_replication_with_balancer_and_config_transitions_and_add_remove_shard # Still used for cluster topology
        resmoke_args: >- # any args that change the cluster topology can still be used
          --runAllFeatureFlagTests
        antithesis_test_composer_dir: basic_js_commands # Directory name under buildscripts/antithesis/test_composer/
If you are interested in leveraging Antithesis, feel free to reach out to #ask-devprod-correctness or #server-testing on Slack.