testing-guidelines/quality-strategy.md
A very brief, mildly aspirational quality strategy document. This isn't where we are today, but it is where we want to get to.
This document does not explain how to set up an environment or run the tests locally, nor does it cover the best practices we try to follow when writing tests; that information lives in the contributing guide.
The purposes of all testing activities on Gradio fit one of the following objectives:
Testing is always a tradeoff. We can't cover everything unless we want to spend all of our time writing and running tests, so we should focus on a few key areas.
We should focus not on code coverage but on test coverage, based on the criteria below:
Our tests will broadly fall into one of three categories:
Static quality checks are generally very fast to run and do not require building the code base. They also provide the least value. These tests would be things like linting, typechecking, and formatting.
While they offer little in terms of testing functionality, they align very closely with objectives 4 and 5, as they generally help to keep the codebase in good shape and offer very fast feedback. Such checks are almost free from an authoring point of view, as fixes can mostly be automated (either via scripts or editor integrations).
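As a hypothetical illustration (the function below is not taken from the Gradio codebase), a type annotation lets a checker such as mypy or pyright reject a bad call before any code runs, which is what makes this category of check so fast:

```python
def greet(name: str) -> str:
    """Return a greeting; the annotation documents the contract."""
    return "Hello, " + name

# A type checker would flag a call like `greet(42)` as an error
# without executing anything, giving near-instant feedback.
print(greet("Gradio"))
```

Because nothing is executed, static checks scale to the whole codebase on every commit.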
These tests generally test either isolated pieces of code or test the relationship between parts of the code base. They sometimes test functionality or give indications of working functionality but never offer enough confidence to rely on them solely.
These tests are usually either unit or integration tests. They are generally quick to write (especially unit tests) and run, and offer a moderate amount of confidence. They align closely with Objectives 2 and 3, and a little with Objective 1.
Tests of this kind should probably make up the bulk of our handwritten tests.
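A minimal sketch of what a handwritten unit test in this category might look like, assuming pytest-style bare asserts; `format_label` is a hypothetical helper invented for the example, not a real Gradio function:

```python
def format_label(text: str) -> str:
    """Hypothetical helper: normalise a component label."""
    return text.strip().capitalize()

def test_format_label_strips_and_capitalises():
    # A unit test like this is quick to write and run, and gives
    # moderate confidence that an isolated piece of code behaves.
    assert format_label("  hello world ") == "Hello world"

test_format_label_strips_and_capitalises()
```

In practice a test runner such as pytest would discover and run `test_`-prefixed functions automatically rather than requiring an explicit call.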
These tests give by far the most confidence because they exercise the actual functionality of the software, running the entire application exactly as a user would.
This aligns very closely with objective 1 but significantly impacts objective 5, as these tests are costly both to write and to run. Despite their value, because of these costs we should try to get as much as we can out of the other test types, reserving functional testing for complex use cases and end-to-end journeys.
Tests in this category could be browser-based end-to-end tests, accessibility tests, or performance tests. They are sometimes called acceptance tests.
We currently use the following tools:
All operating systems refer to the current runner variants supported by GitHub Actions.
All unspecified version segments (x) refer to the latest release.
| Software | Version(s) | Operating System(s) |
|---|---|---|
| Python | 3.10.x | ubuntu-latest, windows-latest |
| Node | 18.x.x | ubuntu-latest |
| Browser | playwright-chrome-x | ubuntu-latest |
Tests need to be executed in a number of environments and at different stages of the development cycle in order to be useful. The requirements for tests are as follows:
For instructions on how to write and run tests see the contributing guide.
As we formalise our testing strategy and bring our tests up to standard (and keep them there), it is important that we have some principles for managing defects as they occur or are reported. For now we have one very simple rule: