How to Use Antithesis


Context

Antithesis is a third-party vendor whose environment can perform network fuzzing. We can upload images containing docker-compose.yml files, which represent various MongoDB topologies, to the Antithesis Docker registry. Antithesis runs docker-compose up from these images to spin up the corresponding multi-container application in their environment and run a test suite. Network fuzzing is performed on the topology while the test suite runs, and Antithesis generates a report identifying bugs. See https://github.com/mongodb/mongo/wiki/Testing-MongoDB-with-Antithesis for an example of how we use Antithesis today.

Base Images

The base_images directory consists of the building blocks for creating a MongoDB test topology. These images are uploaded to the Antithesis Docker registry nightly during the antithesis image build and push function.

mongo_binaries

This image contains the latest mongo, mongos and mongod binaries. It can be used to start a mongod instance, mongos instance or execute mongo commands. This is the main building block for creating the System Under Test topology.

workload

This image contains the latest mongo binary as well as the resmoke test runner. The workload container is not part of the actual topology. Its purpose is to execute mongo commands to complete the topology setup and to run a test suite on an existing topology, like so:

shell
buildscripts/resmoke.py run --suite antithesis_concurrency_sharded_with_stepdowns_and_balancer

Every topology must have 1 workload container.

Note: During the workload image build, evergreen/antithesis_image_build_and_push.sh runs and generates "antithesis compatible" test suites, prefixing their names with antithesis_. These are the test suites that can run in Antithesis and are available from within the workload container.

Dockerfile

This assembles an image with the necessary files for spinning up the corresponding topology. It consists of a docker-compose.yml, a logs directory, a scripts directory and a data directory. If the image is structured properly, you should be able to copy the files and directories from it and run docker-compose up to set up the desired topology.

Example of what buildscripts/resmokelib/testing/docker_cluster_image_builder.py generates:

Dockerfile
FROM scratch
COPY docker-compose.yml /
ADD scripts /scripts
ADD logs /logs
ADD data /data
ADD debug /debug

All topology images are built and uploaded to the Antithesis Docker registry during the antithesis image build and push task. Some of these directories, such as /data and /logs, are created by the evergreen/antithesis_image_build_and_push.sh script.

Note: These images serve solely as a filesystem containing all the files a topology needs, which is why they use FROM scratch.

docker-compose.yml

This describes how to construct the corresponding topology using the mongo-binaries and workload images.

Example from buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml:

yml
version: '3.0'

services:
        configsvr1:
                container_name: configsvr1
                hostname: configsvr1
                image: mongo-binaries:evergreen-latest-master
                volumes:
                  - ./logs/configsvr1:/var/log/mongodb/
                  - ./scripts:/scripts/
                  - ./data/configsvr1:/data/configdb/
                command: /bin/bash /scripts/configsvr_init.sh
                networks:
                        antithesis-net:
                                ipv4_address: 10.20.20.6
                                # Set an IPv4 address of 10.20.20.130 or higher
                                # for the container to be ignored by the fault injector
                                #

        configsvr2: ...
        configsvr3: ...
        database1:
                container_name: database1
                hostname: database1
                image: mongo-binaries:evergreen-latest-master
                volumes:
                  - ./logs/database1:/var/log/mongodb/
                  - ./scripts:/scripts/
                  - ./data/database1:/data/db/
                command: /bin/bash /scripts/database_init.sh Shard1
                networks:
                        antithesis-net:
                                ipv4_address: 10.20.20.3
                                # Set an IPv4 address of 10.20.20.130 or higher
                                # for the container to be ignored by the fault injector
                                #
        database2: ...
        database3: ...
        database4: ...
        database5: ...
        database6: ...
        mongos:
                container_name: mongos
                hostname: mongos
                image: mongo-binaries:evergreen-latest-master
                volumes:
                  - ./logs/mongos:/var/log/mongodb/
                  - ./scripts:/scripts/
                command: python3 /scripts/mongos_init.py
                depends_on:
                        - "database1"
                        - "database2"
                        - "database3"
                        - "database4"
                        - "database5"
                        - "database6"
                        - "configsvr1"
                        - "configsvr2"
                        - "configsvr3"
                networks:
                        antithesis-net:
                                ipv4_address: 10.20.20.9
                                # The subnet provided here is an example
                                # An alternative subnet can be used
        workload:
                container_name: workload
                hostname: workload
                image: workload:evergreen-latest-master
                volumes:
                  - ./logs/workload:/var/log/resmoke/
                  - ./scripts:/scripts/
                command: python3 /scripts/workload_init.py
                depends_on:
                        - "mongos"
                networks:
                        antithesis-net:
                                ipv4_address: 10.20.20.130
                                # The subnet provided here is an example
                                # An alternative subnet can be used
networks:
        antithesis-net:
                driver: bridge
                ipam:
                        config:
                        - subnet: 10.20.20.0/24

Each container must have a command in docker-compose.yml that runs an init script. The init script belongs in the scripts directory, which is included as a volume. The command should be set like so: /bin/bash /scripts/[script_name].sh or python3 /scripts/[script_name].py. This is a requirement for the topology to start up properly in Antithesis.
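
The command requirement above can be checked mechanically. The following is a minimal sketch, not part of the repository: it assumes the compose file has already been parsed into a plain dictionary (for example by a YAML loader), and the function name and regular expression are illustrative only.

```python
import re

# Accepted forms, taken from the paragraph above:
#   /bin/bash /scripts/<script_name>.sh [args...]
#   python3 /scripts/<script_name>.py [args...]
INIT_COMMAND = re.compile(
    r"^(/bin/bash /scripts/[\w.-]+\.sh|python3 /scripts/[\w.-]+\.py)(\s.*)?$"
)


def services_missing_init_command(compose: dict) -> list:
    """Return the names of services whose command does not run an init script."""
    missing = []
    for name, service in compose.get("services", {}).items():
        command = service.get("command", "")
        if not isinstance(command, str) or not INIT_COMMAND.match(command):
            missing.append(name)
    return missing


# Hypothetical, already-parsed compose content mirroring the example above,
# plus one deliberately broken service that has no command.
example = {
    "services": {
        "configsvr1": {"command": "/bin/bash /scripts/configsvr_init.sh"},
        "database1": {"command": "/bin/bash /scripts/database_init.sh Shard1"},
        "mongos": {"command": "python3 /scripts/mongos_init.py"},
        "broken": {"image": "mongo-binaries:evergreen-latest-master"},
    }
}
print(services_missing_init_command(example))
```

A check like this could run before the topology image is built, so that a missing init script is caught locally rather than in Antithesis.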

When creating mongod or mongos instances, route the logs with --logpath /var/log/mongodb/mongodb.log and mount a volume for that directory, as database1 does. This lets us easily retrieve logs if Antithesis detects a bug.
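
As a rough illustration of the log routing, an init script written in Python might assemble the mongod command line as below. This is a sketch under assumptions: only the --logpath value and the volume layout come from this document, while the helper names and the remaining flags (standard mongod options) are chosen for illustration and are not taken from the repository's scripts.

```python
LOG_PATH = "/var/log/mongodb/mongodb.log"  # log path recommended above


def mongod_command(dbpath: str, port: int = 27017) -> list:
    """Build a mongod argument list that writes its log where the volume is mounted."""
    return [
        "mongod",
        "--dbpath", dbpath,
        "--port", str(port),
        "--bind_ip_all",  # accept connections from the other containers
        "--logpath", LOG_PATH,
    ]


def log_volume(container_name: str) -> str:
    """Volume entry that exposes the container's log directory on the host side."""
    return f"./logs/{container_name}:/var/log/mongodb/"


print(" ".join(mongod_command("/data/db")))
print(log_volume("database1"))
```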

The ipv4_address should be set to 10.20.20.130 or higher if you do not want that container to be affected by network fuzzing. For instance, you would likely not want the workload container to be affected by network fuzzing, as shown in the example above.
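
The address rule can be expressed as a small check. This sketch assumes the example subnet 10.20.20.0/24 from the compose file above; the function name is hypothetical, and the threshold is the 10.20.20.130 value stated in this document.

```python
import ipaddress

SUBNET = ipaddress.ip_network("10.20.20.0/24")  # example subnet from the compose file
FAULT_EXEMPT_START = ipaddress.ip_address("10.20.20.130")


def is_exempt_from_fault_injection(address: str) -> bool:
    """True if a container at this address is ignored by the fault injector."""
    ip = ipaddress.ip_address(address)
    if ip not in SUBNET:
        raise ValueError(f"{address} is outside {SUBNET}")
    return ip >= FAULT_EXEMPT_START


print(is_exempt_from_fault_injection("10.20.20.130"))  # workload container
print(is_exempt_from_fault_injection("10.20.20.9"))    # mongos container
```

If an alternative subnet is used, both constants would need to change together.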

Use the evergreen-latest-master tag for all images. If needed, the tag is updated automatically by evergreen/antithesis_image_build_and_push.sh.

scripts

Take a look at buildscripts/antithesis/topologies/sharded_cluster/scripts/mongos_init.py to see how to use util methods from buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py to set up the desired topology. You can also use simple shell scripts, as in the case of buildscripts/antithesis/topologies/sharded_cluster/scripts/database_init.sh. These init scripts must not exit, so that the underlying container stays alive. You can use an infinite while loop for Python scripts or tail -f /dev/null for shell scripts.
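
For a Python init script, the keep-alive loop might look like the sketch below. It is an illustration rather than the code in mongos_init.py: the function name is hypothetical, and the max_iterations parameter exists only so the loop can be exercised in a test.

```python
import time
from typing import Optional


def keep_alive(poll_seconds: float = 60.0, max_iterations: Optional[int] = None) -> int:
    """Block so the container's main process never exits.

    With max_iterations=None (the default) this loops forever, which is what
    an init script wants after it has finished setting up the topology.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        time.sleep(poll_seconds)
        iterations += 1
    return iterations


# In a real init script, the topology setup would run first and the script
# would then end with a bare keep_alive() call that never returns.
print(keep_alive(poll_seconds=0, max_iterations=3))
```

Sleeping inside the loop avoids spinning the CPU while the container idles.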

How do I create a new topology for Antithesis testing?

This should be done with care to ensure we are using our limited resources efficiently.

Create a new task that extends antithesis_task_template, is tagged with antithesis, and passes the desired suite to the antithesis image build and push function. See other examples to get started.

How do I test my suite in antithesis?

If you pass the Evergreen parameter schedule_antithesis_tests to your Evergreen patch, then once the Antithesis images have been built in that patch, we send Antithesis an API request to run your newly created images for an hour. You will be emailed the report when the run finishes.

Important Note: This happens for every Antithesis task you schedule in your patch. Please do not schedule more than 1 or 2 tasks with this parameter at a time, or it will use up a lot of our allocated Antithesis testing time.

evergreen patch --param schedule_antithesis_tests=true

Types of testing in antithesis

Normal resmoke testing

Antithesis continuously runs your resmoke suite, one random test from the suite at a time. We support this out-of-the-box with most resmoke suites that use Python fixtures. This is very similar to how tests run in Evergreen. Your Antithesis tasks in Evergreen default to this mode if the antithesis_test_composer_dir var is not specified on the task.

Test Composer

Antithesis offers a resource called Test Composer to run "test templates" against our clusters. Test Composer enables autonomous testing by letting you define templates that guide Antithesis in generating thousands of test cases across multiple system states. Your Evergreen tasks will automatically use Test Composer if the antithesis_test_composer_dir var is specified in the task, as shown in the example below.

What is Test Composer?

Test Composer uses an opinionated framework based on naming conventions to detect and run tests. Unlike traditional example-based testing, Test Composer templates tell Antithesis how to handle parallelism, test length, command order, and fault injection to explore your system's behavior comprehensively.

Test Composer Structure in MongoDB

MongoDB's Test Composer implementations are located in buildscripts/antithesis/test_composer/. The setup still uses a resmoke suite to determine cluster configuration, but test execution is controlled by Test Composer commands rather than running jstests directly.

Test Command Types

Test commands must be executable and placed directly under /opt/antithesis/test/v1/<test_dir>/. Our Evergreen tasks handle building the images and putting the tests in the correct place for you. Command names follow the convention <prefix>_<command>, where the prefix determines the command's behavior.

Driver Commands

Run during fault injection periods. At least one driver or anytime command is required.

  • parallel_driver_<command>: Can run concurrently with other parallel drivers (including itself)

  • singleton_driver_<command>: Runs as the only driver command in a history branch

    • Example: singleton_driver_resmoke.sh - Runs a single random resmoke test
    • Use for: Porting existing integration tests, running complete workloads without interference
  • serial_driver_<command>: Runs when no other driver commands are active

    • Example: serial_driver_resmoke.sh - Runs resmoke tests sequentially
    • Use for: Full failover operations, validation steps that require quiescence

Quiescent Commands

Run in the absence of faults.

  • first_<command>: Optional setup command that runs once before any driver commands

    • Use for: Data initialization, schema setup, bootstrapping
  • eventually_<command>: Runs after driver commands start. Kills all drivers and stops faults, creating a new branch

    • Use for: Testing eventual consistency, availability after recovery, final state validation
    • Note: Include retry loops for service availability
  • finally_<command>: Like eventually, but only runs after all driver commands complete naturally

    • Use for: Testing subtle invariants, final consistency checks

Advanced Commands

  • anytime_<command>: Can run at any time after the first command, even during singleton/serial drivers
    • Use for: Continuous invariant checks, monitoring, low-consistency availability checks
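
The naming convention can be illustrated with a small classifier. This is only an approximation of how prefixes map to behavior, written for illustration; it is not Test Composer's own detection logic, and the phase labels and function names are assumptions.

```python
from pathlib import Path
from typing import Optional, Tuple

# Prefixes described in the lists above, mapped to the section they appear in.
PREFIXES = {
    "parallel_driver_": "driver",
    "singleton_driver_": "driver",
    "serial_driver_": "driver",
    "first_": "quiescent",
    "eventually_": "quiescent",
    "finally_": "quiescent",
    "anytime_": "advanced",
}


def classify(filename: str) -> Optional[Tuple[str, str, str]]:
    """Return (prefix, phase, command), or None if no known prefix matches."""
    stem = Path(filename).stem  # drop the extension, e.g. ".sh"
    for prefix, phase in PREFIXES.items():
        if stem.startswith(prefix) and len(stem) > len(prefix):
            return prefix.rstrip("_"), phase, stem[len(prefix):]
    return None


def has_required_command(filenames) -> bool:
    """At least one driver or anytime command is required."""
    kinds = [classify(name) for name in filenames]
    return any(k is not None and k[1] in ("driver", "advanced") for k in kinds)


print(classify("singleton_driver_resmoke.sh"))
print(has_required_command(["first_setup.sh", "notes.txt"]))
```

A check like has_required_command could flag a template directory that contains only setup or validation commands and would therefore never drive any load.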

MongoDB Test Composer Examples

Example 1: basic_js_commands Template

This template runs parallel JavaScript operations against MongoDB with built-in retry logic for network failures.

Commands:

Shared Logic: commands.js provides retry mechanisms for network errors and connection helpers.

Key Features:

  • Automatic retry on MongoNetworkError, MongoServerSelectionError, RetryableWriteError
  • Random test data generation
  • Connection string discovery via /scripts/print_connection_string.sh

Example 2: random_resmoke Template

This template runs resmoke tests with randomization, adapting existing test infrastructure for Test Composer.

Commands:

Both use random seeds and shuffling: --seed $(od -vAn -N4 -tu4 < /dev/urandom) --shuffle --sanityCheck
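
For readers less familiar with the od invocation: it reads four bytes from /dev/urandom and prints them as one unsigned 32-bit integer. A Python sketch of the same idea follows; the function names are hypothetical, and only the three resmoke flags come from the line above.

```python
import os
import sys
from typing import Optional


def random_seed() -> int:
    """Read 4 random bytes as an unsigned 32-bit integer in native byte order."""
    return int.from_bytes(os.urandom(4), sys.byteorder)


def randomized_resmoke_args(seed: Optional[int] = None) -> list:
    """Build the randomization flags shown above, drawing a fresh seed if none is given."""
    if seed is None:
        seed = random_seed()
    return ["--seed", str(seed), "--shuffle", "--sanityCheck"]


print(randomized_resmoke_args())
```

Passing an explicit seed makes a particular shuffle repeatable when investigating a failure.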

Creating a New Test Template

  1. Create a test directory: buildscripts/antithesis/test_composer/<your_template_name>/

  2. Write test commands: Create executable scripts with appropriate prefixes:

    bash
    #!/usr/bin/env bash
    # buildscripts/antithesis/test_composer/<template>/parallel_driver_mytest.sh
    
    # Your test logic here
    # This can run in parallel with other parallel_driver commands
    
  3. Make scripts executable: chmod +x buildscripts/antithesis/test_composer/<template>/*.sh

  4. Helper files: Use helper_ prefix or subdirectories for shared code - these are ignored by Test Composer
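
The first three steps can also be scripted. The sketch below is a hypothetical helper, not something that exists in the repository: it creates the template directory, writes a placeholder command, and sets the executable bits, which is the equivalent of chmod +x. It runs against a temporary directory here so that nothing in a real checkout is touched.

```python
import os
import stat
import tempfile
from pathlib import Path

SCRIPT_BODY = """#!/usr/bin/env bash
# Your test logic here
# This can run in parallel with other parallel_driver commands
"""


def scaffold_template(repo_root: Path, template: str,
                      command: str = "parallel_driver_mytest.sh") -> Path:
    """Create the template directory and one executable test command inside it."""
    template_dir = repo_root / "buildscripts" / "antithesis" / "test_composer" / template
    template_dir.mkdir(parents=True, exist_ok=True)
    script = template_dir / command
    script.write_text(SCRIPT_BODY)
    # Equivalent of chmod +x: add the execute bit for user, group and other.
    mode = script.stat().st_mode
    script.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return script


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_template(Path(tmp), "my_template")
    print(created.relative_to(tmp), os.access(created, os.X_OK))
```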

Best Practices

  • Retry logic: Always include retry mechanisms for network and transient errors (see commands.js for examples)
  • Add randomization: The more randomization your tests contain, the more paths Antithesis can explore. Antithesis controls and can reproduce the randomization, so when it finds an interesting path it can explore it further.
  • Start simple: Begin with a singleton_driver to adapt existing tests, then evolve to parallel/serial commands
  • Idempotency: Design tests to handle being killed and restarted at any time
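
To make the retry advice concrete, here is a minimal Python sketch of a retry wrapper with exponential backoff and jitter. It is not the logic in commands.js: TransientError is a stand-in for whatever network or server-selection errors your driver raises, and every name in the block is hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for retryable driver errors such as network failures."""


def with_retries(operation, attempts: int = 5, base_delay: float = 0.1,
                 sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Double the delay each attempt and add jitter to avoid lockstep retries.
            sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))


# Usage: an operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_insert():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("simulated network fault")
    return "ok"


print(with_retries(flaky_insert, sleep=lambda seconds: None))
```

Because a command can be killed at any point, the wrapped operation should itself be safe to repeat, for example by writing documents under a deterministic key rather than appending blindly.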

Configuring Test Composer in Evergreen

To use Test Composer instead of normal resmoke testing, set the antithesis_test_composer_dir variable in your Evergreen task:

yaml
  - <<: *antithesis_task_template
    name: antithesis_resmoke_suite_with_test_template
    tags: ...
    commands:
      ...
      - func: "antithesis image build and push"
        vars:
          suite: concurrency_sharded_replication_with_balancer_and_config_transitions_and_add_remove_shard # Still used for cluster topology
          resmoke_args: >- # any args that change the cluster topology can still be used
            --runAllFeatureFlagTests
          antithesis_test_composer_dir: basic_js_commands  # Directory name under buildscripts/antithesis/test_composer/

Additional Resources

If you are interested in leveraging Antithesis, feel free to reach out to #ask-devprod-correctness or #server-testing on Slack.