baml_language/sdk_tests/README.md
The BAML programming language allows users to generate an SDK in their host language of choice with bindings to all their BAML types and functions. These are the end-to-end (e2e) tests for those generated SDKs; they test both the sdkgen logic and the underlying FFI interfaces.
# Run every SDK test target across all fixtures.
cargo nextest run -p sdk_test_python_pydantic2 -p sdk_test_typescript_node
# Run SDK tests for a specific generator.
cargo nextest run -p sdk_test_python_pydantic2
cargo nextest run -p sdk_test_typescript_node
# Or run only the pytest/vitest test for a single fixture.
cargo nextest run -p sdk_test_python_pydantic2 function_calls::pytest
cargo nextest run -p sdk_test_typescript_node function_calls::vitest
SDK tests are designed to be run using cargo nextest run and will fail in surprising ways if run using cargo test. Specifically, cargo nextest run is designed to pick up changes in the BAML FFI layers and SDK generators. See DEVELOPMENT.md for more details.
cargo nextest does not pass extra arguments through to pytest or vitest.
To apply test filters to pytest/vitest, first run nextest to set up the
fixture's generated/ directory, then run the host-language test directly:
# pytest: run tests matching a keyword expression.
cargo nextest run -p sdk_test_python_pydantic2 function_calls::pytest
(cd sdk_tests/crates/python_pydantic2/function_calls/generated && uv run pytest -v -k optional_args)
# vitest: run tests matching a test-name pattern.
cargo nextest run -p sdk_test_typescript_node function_calls::vitest
(cd sdk_tests/crates/typescript_node/function_calls/generated && pnpm exec vitest run -t optional_args)
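
The two-step flow above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the names RUNNERS, generated_dir, nextest_cmd, host_test_cmd and run_filtered are hypothetical, and it assumes it is run from the directory that contains sdk_tests/. It only reuses the commands and paths shown above.

```python
import subprocess
from pathlib import Path

# Hypothetical helper, not part of the repo. Commands and paths mirror the
# examples above; run it from the directory that contains sdk_tests/.
RUNNERS = {
    # generator -> (nextest test suffix, host-language command, filter flag)
    "python_pydantic2": ("pytest", ["uv", "run", "pytest", "-v"], "-k"),
    "typescript_node": ("vitest", ["pnpm", "exec", "vitest", "run"], "-t"),
}


def generated_dir(generator: str, fixture: str) -> Path:
    """Directory that nextest populates for one generator/fixture pair."""
    return Path("sdk_tests") / "crates" / generator / fixture / "generated"


def nextest_cmd(generator: str, fixture: str) -> list[str]:
    """Step 1: the nextest invocation that sets up generated/."""
    suffix = RUNNERS[generator][0]
    return ["cargo", "nextest", "run", "-p", f"sdk_test_{generator}",
            f"{fixture}::{suffix}"]


def host_test_cmd(generator: str, test_filter: str) -> list[str]:
    """Step 2: the pytest/vitest invocation with a filter applied."""
    _, prefix, flag = RUNNERS[generator]
    return [*prefix, flag, test_filter]


def run_filtered(generator: str, fixture: str, test_filter: str) -> int:
    """Run both steps and return the host-language runner's exit code."""
    # Step 1: the exit code is ignored because this sketch only needs the
    # setup side effect (a populated generated/ directory).
    subprocess.run(nextest_cmd(generator, fixture), check=False)
    workdir = generated_dir(generator, fixture)
    if not workdir.is_dir():
        raise FileNotFoundError(f"{workdir} was not created by nextest")
    # Step 2: run pytest/vitest inside generated/ with the filter applied.
    return subprocess.run(host_test_cmd(generator, test_filter), cwd=workdir).returncode
```

Under these assumptions, run_filtered("python_pydantic2", "function_calls", "optional_args") would reproduce the pytest example above.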
Each SDK is implemented in two parts: an FFI to provide core runtime bindings and an SDK generator to generate typed bindings.
sdk_test_python_pydantic2 provides coverage for:
- sdks/python/rust/bridge_python
- sdks/python/src/baml_core
- sdks/python/rust/sdkgen_python_pydantic2

sdk_test_typescript_node provides coverage for:
- sdks/nodejs/bridge_nodejs
- sdks/nodejs/sdkgen_typescript_node

There are two dimensions for SDK tests: generators and fixtures.
There is one Rust crate per SDK generator:
- sdk_test_python_pydantic2 for the python/pydantic2 generator
- sdk_test_typescript_node for the typescript/node generator

Each crate fans out over every fixture in sdk_tests/fixtures/.
Each fixture is a single baml_src/ tree that contains .baml source
for testing different aspects of each SDK.
Host-language test code for each generated SDK/fixture lives in the
corresponding customizable/ tree.
sdk_tests/
|-- fixtures/  # generator-agnostic input only -- baml_src/ and nothing else
|   |-- function_calls/baml_src/  # .baml source (input to every generator)
|   |-- llm_functions/baml_src/
|   `-- type_shapes/baml_src/
`-- crates/  # one crate per generator target; per-fixture content nested inside
    |-- python_pydantic2/
    |   |-- Cargo.toml  # name = "sdk_test_python_pydantic2"
    |   |-- function_calls/
    |   |   |-- customizable/  # tracked: *.py -- symlinked into generated/
    |   |   `-- generated/  # gitignored: build output
    |   |       |-- baml_sdk/  # codegen output
    |   |       |-- pyproject.toml  # name = "sdk-tests-python-pydantic2-docstrings-etc"
    |   |       |-- .venv/  # uv sync output
    |   |       `-- *.py  # symlinked from ../customizable/
    |   |-- llm_functions/
    |   |   |-- customizable/
    |   |   `-- generated/  # same shape
    |   `-- type_shapes/
    |       |-- customizable/
    |       `-- generated/
    `-- typescript_node/
        |-- Cargo.toml  # name = "sdk_test_typescript_node"
        |-- function_calls/
        |   |-- customizable/  # tracked: *.test.ts -- copied into generated/
        |   `-- generated/  # gitignored: build output
        |       |-- baml_sdk/  # empty until sdkgen_typescript_node lands
        |       |-- package.json  # name = "sdk-tests-nodejs-typescript-docstrings-etc"
        |       |-- tsconfig.json
        |       |-- node_modules/  # pnpm install output
        |       `-- *.test.ts  # copied from ../customizable/
        |-- llm_functions/
        |   |-- customizable/
        |   `-- generated/
        `-- type_shapes/
            |-- customizable/
            `-- generated/
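
The generator-by-fixture layout can also be expressed as a small lookup table. This is an illustrative sketch only: the names GENERATORS, FIXTURES, layout and MATRIX are hypothetical and exist nowhere in the repository, while the paths follow the tree shown here.

```python
from itertools import product
from pathlib import PurePosixPath

# The two dimensions of the test matrix, as listed in this README.
GENERATORS = ["python_pydantic2", "typescript_node"]
FIXTURES = ["function_calls", "llm_functions", "type_shapes"]

ROOT = PurePosixPath("sdk_tests")


def layout(generator: str, fixture: str) -> dict:
    """Paths involved in testing one fixture with one generator."""
    crate = ROOT / "crates" / generator
    return {
        "crate_name": f"sdk_test_{generator}",
        # Generator-agnostic .baml input, shared by every generator.
        "input": ROOT / "fixtures" / fixture / "baml_src",
        # Tracked host-language tests for this generator/fixture pair.
        "tests": crate / fixture / "customizable",
        # Gitignored build output for this generator/fixture pair.
        "output": crate / fixture / "generated",
    }


# One entry per (generator, fixture) pair.
MATRIX = {(g, f): layout(g, f) for g, f in product(GENERATORS, FIXTURES)}
```

Each fixture's baml_src/ appears once as input but fans out to one customizable/ and one generated/ directory per generator.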
- Fixture directories (under fixtures/ and under each crates/<generator>/): lowercase snake case (function_calls, llm_functions, type_shapes). The same name appears in both trees: fixtures/<F>/baml_src/ is the input; crates/<G>/<F>/ is the output for one generator.
- Generator directories (under crates/): lowercase snake case matching the generator key (python_pydantic2, typescript_node).
- Crate names: sdk_test_<generator>, one per generator.

To add a new fixture:

1. Run mkdir -p sdk_tests/fixtures/<name>/baml_src/ and drop .baml files in. Nothing else goes under sdk_tests/fixtures/<name>/; it is the generator-agnostic input only.
2. Add a <name>/customizable/ directory under the generator's crate containing the host-language tests, e.g. sdk_tests/crates/python_pydantic2/<name>/customizable/test_main.py and/or sdk_tests/crates/typescript_node/<name>/customizable/main.test.ts.
3. Run cargo nextest run -p sdk_test_python_pydantic2 <name>:: (nextest fires setup.sh to sync the new fixture's venv).

No code edits are needed in build.rs or src/lib.rs: the fixture list is discovered at build time from sdk_tests/fixtures/ and emitted into the generated test scaffold.
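
The idea of discovering the fixture list from the directory tree can be pictured with a short sketch. The real discovery happens in the crates' Rust build step and may differ; this Python version is an illustration only. The function names discover_fixtures and demo are hypothetical, and treating "a subdirectory that contains baml_src/" as the definition of a fixture is an assumption.

```python
import tempfile
from pathlib import Path


def discover_fixtures(fixtures_root: Path) -> list[str]:
    """Return sorted names of subdirectories that contain a baml_src/ directory.

    Assumption of this sketch: a fixture is any direct child of the
    fixtures root that has a baml_src/ directory inside it.
    """
    return sorted(
        entry.name
        for entry in fixtures_root.iterdir()
        if (entry / "baml_src").is_dir()
    )


def demo() -> list[str]:
    """Build a throwaway fixtures tree and run discovery against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "sdk_tests" / "fixtures"
        for name in ("type_shapes", "function_calls"):
            (root / name / "baml_src").mkdir(parents=True)
        # A directory without baml_src/ is skipped by the discovery.
        (root / "not_a_fixture").mkdir()
        return discover_fixtures(root)
```

Because the list is derived from the directory tree, adding a directory is enough to add a fixture, which matches the "no code edits" note above.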