fern/01-guide/04-baml-basics/testing-functions.mdx
You can test your BAML functions in the VSCode Playground by adding a test snippet into a BAML file:
```baml
enum Category {
  Refund
  CancelOrder
  TechnicalSupport
  AccountIssue
  Question
}

function ClassifyMessage(input: string) -> Category {
  client GPT4Turbo
  prompt #"
    ... truncated ...
  "#
}

test Test1 {
  functions [ClassifyMessage]
  args {
    // input is the first argument of ClassifyMessage
    input "Can't access my account using my usual login credentials, and each attempt results in an error message stating 'Invalid username or password.' I have tried resetting my password using the 'Forgot Password' link, but I haven't received the promised password reset email."
  }
  // 'this' is the output of the function
  @@assert( {{ this == "AccountIssue" }})
}
```

{" "}

<div class="resizer"> <iframe class="resized" src="https://promptfiddle.com/embed?id=testing_functions" height="640" style="border: none;" resize="both" overflow="auto" msallowfullscreen ></iframe> </div>

See more interactive examples
The BAML playground will give you a starting snippet to copy that will match your function signature.

<Warning> BAML doesn't use colons `:` between key-value pairs except in function parameters. </Warning>

<hr />

## Complex object inputs

Objects are injected as dictionaries:
```baml
class Message {
  user string
  content string
}

// The parameter type must match the class name declared above (Message)
function ClassifyMessage(messages: Message[]) -> Category {
  ...
}

test Test1 {
  functions [ClassifyMessage]
  args {
    messages [
      {
        user "hey there"
        // multi-line string using the #"..."# syntax
        content #"
          You can also add a multi-line
          string with the hashtags
          Instead of ugly json with \n
        "#
      }
    ]
  }
}
```
For a function that takes an image as input, like so:
```baml
function MyFunction(myImage: image) -> string {
  client GPT4o
  prompt #"
    Describe this image: {{myImage}}
  "#
}
```
You can define test cases using image files, URLs, or base64 strings.

<Tabs>
<Tab title="File" language="baml">
<Warning> Committing a lot of images into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of images, please read [GitHub's size limit documentation][github-large-files] and consider setting up [large file storage][github-lfs]. </Warning>

```baml
test Test1 {
  functions [MyFunction]
  args {
    myImage {
      file "../path/to/image.png"
    }
  }
}
```

Image files must be somewhere in `baml_src/`.
</Tab>
</Tabs>

When the image is given as base64, the `media_type` field is ignored if `base64` is a data URL.
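The URL and base64 forms can be sketched in the same way as the file form. This is a minimal sketch, assuming the `url`, `base64`, and `media_type` field names; the test names, the URL, and the base64 payload are placeholders for illustration, not real assets.

```baml
// URL form: the address below is a placeholder.
test ImageFromUrl {
  functions [MyFunction]
  args {
    myImage {
      url "https://example.com/path/to/image.png"
    }
  }
}

// Base64 form: replace the placeholder with real encoded data.
test ImageFromBase64 {
  functions [MyFunction]
  args {
    myImage {
      base64 "<base64-encoded image data>"
      media_type "image/png"
    }
  }
}
```

The URL and base64 forms keep binary assets out of the repository, which sidesteps the repository-size concern raised for the file form.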
For a function that takes audio as input, like so:
```baml
function MyFunction(myAudio: audio) -> string {
  client GPT4o
  prompt #"
    Describe this audio: {{myAudio}}
  "#
}
```
You can define test cases using audio files, URLs, or base64 strings.

<Tabs>
<Tab title="File" language="baml">
<Warning> Committing a lot of audio files into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of audio, please read [GitHub's size limit documentation][github-large-files] and consider setting up [large file storage][github-lfs]. </Warning>

```baml
test Test1 {
  functions [MyFunction]
  args {
    myAudio {
      file "../path/to/audio.mp3"
    }
  }
}
```

Audio files must be somewhere in `baml_src/`.
</Tab>
</Tabs>

When the audio is given as base64, the `media_type` field is ignored if `base64` is a data URL.
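An audio test that points at a URL might look like the sketch below. It assumes a `url` field analogous to `file`; the test name and the address are placeholders.

```baml
// URL form for an audio argument; the address is a placeholder.
test AudioFromUrl {
  functions [MyFunction]
  args {
    myAudio {
      url "https://example.com/path/to/audio.mp3"
    }
  }
}
```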
For a function that takes a PDF as input, like so:

```baml
function MyFunction(myPdf: pdf) -> string {
  client GPT4o
  prompt #"
    Summarize this Pdf: {{myPdf}}
  "#
}
```
You can define test cases using PDF files, URLs, or base64 strings.

<Tabs>
<Tab title="File" language="baml">
<Warning> Committing a lot of PDF files into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of PDFs, please read [GitHub's size limit documentation][github-large-files] and consider setting up [large file storage][github-lfs]. </Warning>

```baml
test Test1 {
  functions [MyFunction]
  args {
    myPdf {
      file "../path/to/document.pdf"
    }
  }
}
```

PDF files must be somewhere in `baml_src/`.
</Tab>
</Tabs>

When the PDF is given as base64, the `media_type` field is ignored if `base64` is a data URL.
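A base64 PDF test might look like the sketch below. It assumes the `base64` and `media_type` field names; the test name and the payload are placeholders.

```baml
// Base64 form for a PDF argument; replace the placeholder with real encoded data.
test PdfFromBase64 {
  functions [MyFunction]
  args {
    myPdf {
      base64 "<base64-encoded PDF data>"
      media_type "application/pdf"
    }
  }
}
```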
For a function that takes a video as input, like so:
```baml
function MyFunction(myVideo: video) -> string {
  client GPT4o
  prompt #"
    Describe this video: {{myVideo}}
  "#
}
```
You can define test cases using video files, URLs, or base64 strings.

<Tabs>
<Tab title="File" language="baml">
<Warning> Committing large video files into your repository can make it slow to clone and pull your repository. If you expect to commit >500MiB of videos, please read [GitHub's size limit documentation][github-large-files] and consider setting up [large file storage][github-lfs]. </Warning>

```baml
test Test1 {
  functions [MyFunction]
  args {
    myVideo {
      file "../path/to/video.mp4"
    }
  }
}
```

Video files must be somewhere in `baml_src/`.
</Tab>
</Tabs>

When the video is given as base64, the `media_type` field is ignored if `base64` is a data URL.
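A video test that points at a URL might look like the sketch below. It assumes a `url` field analogous to `file`; the test name and the address are placeholders.

```baml
// URL form for a video argument; the address is a placeholder.
test VideoFromUrl {
  functions [MyFunction]
  args {
    myVideo {
      url "https://example.com/path/to/video.mp4"
    }
  }
}
```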
Test blocks in BAML code may contain checks and asserts. These attributes behave similarly to value-level Checks and Asserts, with several additional variables available in the context of the Jinja expressions you can write in a test:

- The `_` variable contains the fields `result`, `checks` and `latency_ms`.
- The `this` variable refers to the value computed by the test, and is shorthand for `_.result`.
- `_.checks.$NAME` can refer to the NAME of any earlier check that was run in the same test block. By referring to prior checks, you can build compound checks and asserts, for example asserting that all checks of a certain type passed.

The following example illustrates how each of these features can be used to validate a test result.
```baml
test MyTest {
  functions [EchoString]
  args {
    input "example input"
  }
  @@check( nonempty, {{ this|length > 0 }} )
  @@check( small_enough, {{ _.result|length < 1000 }} )
  @@assert( {{ _.checks.nonempty and _.checks.small_enough }})
  @@assert( {{ _.latency_ms < 1000 }})
}
```
`@@check` and `@@assert` behave differently:

- A `@@check` represents a property of the test result that should either be manually checked or checked by a subsequent stage in the test. Multiple `@@check` predicates can fail without causing a hard failure of the test.
- An `@@assert` represents a hard guarantee. The first failing assert will halt the remainder of the checks and asserts in this particular test.

For more information about the syntax used inside `@@check` and `@@assert` attributes, see Checks and Asserts.
Classes and enums marked with the `@@dynamic` attribute can be modified in tests using the `type_builder` and `dynamic` blocks.

The `type_builder` block can contain new types scoped to the parent test block, as well as `dynamic` blocks that act as modifiers for dynamic classes or enums.
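As a rough illustration of the two kinds of entries a `type_builder` block can hold, consider the sketch below. The `Resume` class, the `Experience` class, the `ExtractResume` function, and the test name are all hypothetical names chosen for this example, and the `dynamic class` form is an assumption about the current syntax that may differ between BAML versions.

```baml
// Hypothetical class marked @@dynamic so that tests can extend it.
class Resume {
  name string
  skills string[]
  @@dynamic
}

// Hypothetical function; the GPT4o client name is borrowed from the examples above.
function ExtractResume(from_text: string) -> Resume {
  client GPT4o
  prompt #"
    Extract the resume details from this text:
    {{ from_text }}

    {{ ctx.output_format }}
  "#
}

test ReturnDynamicClassTest {
  functions [ExtractResume]
  type_builder {
    // New type, scoped to this test block only.
    class Experience {
      title string
      company string
    }
    // Modifier: adds a property to the dynamic Resume class for this test.
    dynamic class Resume {
      experience Experience[]
    }
  }
  args {
    from_text #"
      Jane Doe
      Software Engineer at Example Corp
      Skills: Python, Rust
    "#
  }
}
```

Because the additions live inside the test, the `Resume` class used outside this test is unaffected.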
{" "}

<div class="resizer"> <iframe class="resized" src="https://promptfiddle.com/embed?id=dynamic_types" height="640" style="border: none;" resize="both" overflow="auto" msallowfullscreen ></iframe> </div>

While the VSCode playground is excellent for interactive development and debugging, you can also run your tests from the command line using the BAML CLI:

```bash
# Run all tests
baml-cli test

# Run tests for a specific function
baml-cli test -i "ClassifyMessage::"

# Run tests in parallel with custom concurrency
baml-cli test --parallel 5

# List available tests without running them
baml-cli test --list
```
See the CLI Test Reference for complete documentation of all available options, filtering capabilities, and output formats.
When deploying to production, you may want to reduce the size of your generated `baml_client` by excluding test blocks. Use the `--no-tests` flag with the `generate` command:

```bash
baml-cli generate --no-tests
```
This strips test blocks from the inlined BAML code in the generated client, reducing bundle size without affecting runtime functionality. Your BAML functions will continue to work normally; only the embedded test definitions are removed.
See the CLI Generate Reference for more details.