rfcs/hspec-test-suite.md
authors: Phil Freeman (paf31), Abby Sassel (sassela)
The recent work on new backends has cast a light on some problems which substantially slow down new feature development:
It is prohibitively expensive, in terms of developer effort, to write integration tests for alternative (non-Postgres) backends, or to port existing Postgres tests to other backends.
The existing test infrastructure does not support cross-backend feature testing, such as generalized joins.
The current method of running integration tests in a local development environment and in CI (BuildKite) is inconsistent and convoluted.
Project goal: Make it as easy as possible for developers to write high-quality tests for new features, and run them locally and in CI.
Produce a Haskell-based test suite which can initially complement, and eventually replace, the testing power of the current pytest test suite for alternative backend and cross-backend features.
<!-- how does this differ from code coverage? -->The testing power is the ability to identify incorrect code via a failing test.
A test group is an indivisible unit of the test suite. Each test group runs an ordered sequence of tests. Test groups with only a single test are allowed and preferred.
How will we know when we've achieved the project goal?
TestGraphQLQueryBasicMySQL is executable locally against a MySQL backend.
TestGraphQLQueryBasicCommon is executable locally against all currently supported backends (Postgres, Citus, SQL Server, BigQuery and MySQL) or a subset thereof.
RemoteResourceSpec is executable locally against one or more Postgres backends.
This GitHub issue is kept up to date with progress.
A checkpoint does not necessarily correspond to a single PR. Organise PRs as you judge sensible, but prefer smaller PRs where they are functional and self-contained.
We will scope this to something very simple for now: basic queries (where, order by, filter and offset) but aim to be as comprehensive and deliberate as possible. Incidentally, this will also be the first set of features we will want to be able to test for any new backend. We will extend this subset of features in the same order as that in which we prioritize work on new backends.
TestM can be reused, adapted, or replaced.
TestGraphQLQueryBasicMySQL, executable locally against MySQL.
TestGraphQLQueryBasicCommon, executable locally against all backends, and run in CI.
RemoteResourceSpec, executable locally against Postgres and run in CI.
TestGraphQLQueryBasicMySQL.
TestGraphQLQueryBasicCommon.
RemoteResourceSpec.
What design decisions have been debated that are worth documenting?
Should the test suite spin up the services it needs, or expect necessary services to be running?
Should we stick to the golden test approach from the pytest suite?
If this project is successful, we will work to prioritize adding other “feature areas” to this new test suite. The ideal scenario would be to replace as much as possible of pytest over the next 6 to 12 months. It will have to remain for tests which don’t fit into this new framework, but its surface area should be greatly reduced.
There will be two axes along which we will extend things: test cases (partitioned into groups), and backends. We need to make it possible for developers to work along each of these axes:
Adding a new test case to an existing group for all existing backends
Adding a new test group and implementing test support for all (or a subset of) existing backends
Adding a new backend and implementing support for some existing test groups
We can visualize the space of tests as follows:
The test architecture should reflect this structure.
The x-axis represents the various backends. Each backend should implement the following:
The ability to provision or connect to a test database
The ability to list all supported test groups
The ability to set up the test database schema, which should cover all of the supported test groups
A blank metadata structure which provides the test database as a data source with a standard name
The ability to run a test transactionally, with roll-back of data between test cases
This might be sensibly implemented in a new BackendTest type class.
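As a rough illustration of the capabilities listed above, a `BackendTest` class could look something like the following. This is only a sketch: the method names, the placeholder `Metadata`, `SourceName` and `TestGroupName` types, the `"default"` source name, and the toy `InMemory` backend are all assumptions made for the example, not part of the existing code base.

```haskell
import Control.Exception (finally)
import Data.IORef (IORef, newIORef, readIORef, writeIORef, modifyIORef)

-- Hypothetical placeholder types; a real suite would use the server's own types.
newtype TestGroupName = TestGroupName String deriving (Eq, Show)
newtype SourceName = SourceName String deriving (Eq, Show)
newtype Metadata = Metadata [SourceName] deriving (Eq, Show)

-- One instance per backend; 'conn' is a handle to that backend's test database.
class BackendTest conn where
  -- | Provision a test database, or connect to one that is already running.
  provision :: IO conn
  -- | List the test groups this backend supports.
  supportedGroups :: conn -> [TestGroupName]
  -- | Set up the schema needed by all supported test groups.
  setupSchema :: conn -> IO ()
  -- | Blank metadata exposing the test database under a standard source name.
  blankMetadata :: conn -> Metadata
  -- | Run an action and roll back any data changes afterwards.
  transactionally :: conn -> IO a -> IO a

-- Toy backend for illustration only: an in-memory list of rows.
newtype InMemory = InMemory (IORef [String])

instance BackendTest InMemory where
  provision = InMemory <$> newIORef []
  supportedGroups _ = [TestGroupName "BasicQueries"]
  setupSchema (InMemory ref) = writeIORef ref []
  blankMetadata _ = Metadata [SourceName "default"]
  -- "Rollback" here just restores a snapshot, even if the action throws.
  transactionally (InMemory ref) action = do
    snapshot <- readIORef ref
    action `finally` writeIORef ref snapshot
```

A real instance for Postgres or MySQL would replace the snapshot trick with a database transaction that is rolled back after each test case.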
The y-axis represents the various test groups.
Test groups should be roughly aligned with product features. For example, basic queries and remote schemas might be two product feature areas which necessitate the creation of new test groups.
Each group should implement the following:
A function which modifies the blank metadata structure to add its own necessary metadata
This function might fail, because it might have preconditions on the input metadata. E.g. a test group might set up permissions on a table, but assume that permissions are undefined as a precondition. Composing two such groups could result in failure.
A failure to install metadata additions should result in a test failure.
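The failing-precondition case described above can be sketched as a function from metadata to `Either`. The simplified `Metadata` representation, the `MetadataAddition` synonym, and the `setPermission` and `applyAdditions` helpers are hypothetical names chosen for this example.

```haskell
import Control.Monad (foldM)

type TableName = String

newtype Permission = Permission String
  deriving (Eq, Show)

-- Hypothetical, heavily simplified metadata: each table has at most one permission.
newtype Metadata = Metadata [(TableName, Maybe Permission)]
  deriving (Eq, Show)

newtype MetadataError = PermissionAlreadyDefined TableName
  deriving (Eq, Show)

-- A test group's metadata addition: it may fail if its preconditions do not hold.
type MetadataAddition = Metadata -> Either MetadataError Metadata

-- Set a permission on a table, assuming none is defined yet (the precondition).
setPermission :: TableName -> Permission -> MetadataAddition
setPermission table perm (Metadata tables) =
  case lookup table tables of
    Just (Just _) -> Left (PermissionAlreadyDefined table)
    _ -> Right (Metadata ((table, Just perm) : filter ((/= table) . fst) tables))

-- Apply several additions in order, stopping at the first failure.
applyAdditions :: [MetadataAddition] -> MetadataAddition
applyAdditions additions blank =
  foldM (\metadata addition -> addition metadata) blank additions
```

With these definitions, applying `setPermission "author" (Permission "select")` twice to `Metadata []` yields `Left (PermissionAlreadyDefined "author")`, which the test runner would report as a test failure.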
Given these backends and test groups, the basic test plan looks like this:
For each backend B (in parallel):
Provision a test database for B
Create a blank metadata structure for B
For each test group G which B claims to support:
Apply the metadata additions for G to the metadata
Replace the metadata on the server with this generated metadata
Run all of the test cases in group G
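The plan above can be sketched as a small driver. Everything here is illustrative: the `Backend` and `TestGroup` records, the `Outcome` type, and the `runBackend` and `runAll` functions are hypothetical, and the sketch assumes each group's additions are applied to the blank metadata independently rather than accumulated across groups. Parallelism uses only `forkIO` and `MVar` from `base`; a real implementation would likely want proper exception handling, since a thread that throws here would leave its `MVar` empty.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical placeholder for the server metadata.
newtype Metadata = Metadata [String] deriving (Eq, Show)

data TestGroup = TestGroup
  { groupName :: String
  , addMetadata :: Metadata -> Either String Metadata -- may fail on a precondition
  , testCases :: [IO Bool]                            -- True means the case passed
  }

data Backend = Backend
  { backendName :: String
  , provisionDb :: IO ()
  , blank :: Metadata
  , supports :: String -> Bool          -- does the backend support this group?
  , replaceMetadata :: Metadata -> IO () -- push metadata to the server
  }

data Outcome
  = MetadataFailed String -- installing the metadata additions failed
  | Ran Int Int           -- passed cases, total cases
  deriving (Eq, Show)

-- Run every supported group against a single backend, in order.
runBackend :: [TestGroup] -> Backend -> IO [(String, String, Outcome)]
runBackend groups b = do
  provisionDb b
  forM (filter (supports b . groupName) groups) $ \g ->
    case addMetadata g (blank b) of
      Left err -> pure (backendName b, groupName g, MetadataFailed err)
      Right metadata -> do
        replaceMetadata b metadata
        results <- sequence (testCases g)
        pure (backendName b, groupName g, Ran (length (filter id results)) (length results))

-- Run all backends in parallel and collect results in backend order.
runAll :: [TestGroup] -> [Backend] -> IO [(String, String, Outcome)]
runAll groups backends = do
  vars <- forM backends $ \b -> do
    var <- newEmptyMVar
    _ <- forkIO (runBackend groups b >>= putMVar var)
    pure var
  concat <$> mapM takeMVar vars
```

A metadata failure is recorded as an outcome of its own, matching the requirement that a failure to install metadata additions counts as a test failure rather than silently skipping the group.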
DB-to-DB Joins Test Suite tracking GitHub issue
This change unblocks testing of DB-to-DB joins, which was untenable with our existing Python test suite. Although the PR includes exploratory work and overlaps somewhat with this RFC’s proposal, it’s minimally-scoped to support testing of DB-to-DB joins as soon as possible. Some or all of this work may be refactored or deleted depending on the outcome of this RFC and related work.
Within that PR, Vamshi described the expected DB-to-DB relationship behaviour and a proposal for designing new DB-to-DB joins tests.
Feedback from DB-to-DB joins testing effort
Test.Hspec.Wai's WaiSession and a reader called TestM to carry around some postgres config. this limits us to request/response testing, and doesn't give us much access to the guts of the running server. if it were me, i might go full YesodExample style and add a new type with an Example instance and expose helpers to encapsulate both request/response testing and examining the server state. this also gives you better type inferences and error messages
scripts/dev.sh - these tests are currently in a separate module tree from the other haskell tests (i believe intentionally?), so dev.sh doesn't appear to know about them. probably want to rectify that
testCaseFamily :: SourceMetadata backend -> m (SourceMetadata backend) looks reasonable, but we're currently doing:
withMetadata
:: ((SourceMetadata ('Postgres 'Vanilla), Application) -> IO b)
-> IO b
Ask Joe/IR team for more detail if needed.
Ask Joe/IR team for more detail if needed.
Ask Vamshi for more detail if needed.