beps/docs/proposals/BEP-002-match/README.md
The match expression in BAML provides exhaustive, type-safe pattern matching over union types, enums, and literal values. It enables declarative handling of different data shapes while ensuring at compile time that all cases are covered.
LLM responses in BAML frequently use union types and nullable fields. Currently, handling these requires verbose if-else chains that are error-prone and don't guarantee exhaustiveness:
function Process(result: LlmResult) -> string {
if (result == null) {
return "No result"
}
if (result instanceof Success) {
if (result.score >= 0.9) {
return "High confidence"
}
return "Low confidence"
}
if (result instanceof Failure) {
return "Failed: " + result.reason
}
// What if we forget a case? No compiler help.
}
Pattern matching solves this with:
Problem: In a bare pattern like match(x) { Type1 => ... }, it's ambiguous whether Type1 is a type name or a variable/value.
Solution: Type patterns always use the name: TypeExpr syntax:
match (x) {
s: Success => "got success: " + s.data // s is bound, type is Success
_: Failure => "got failure" // _ is bound but discarded
value1 => "literal match" // value1 is a literal or catch-all
}
This makes parsing unambiguous and aligns with TypeScript's type annotation syntax.
_ is a Binding, Not a Special KeywordThe underscore _ is a valid binding name. Like any other binding, it captures the matched value. However, the value is dropped later in the pipeline (not accessible in the arm body).
match (result) {
_: Success => "success" // _ bound to Success, but dropped
other => "other: " + other // other bound and usable
}
Important: There is no special default keyword. Any untyped binding (including _ or any identifier) acts as a catch-all because it matches any value.
A pattern without : TypeExpr matches anything and binds the scrutinee:
match (x) {
_: int => "integer"
_: string => "string"
other => "something else: " + other // catch-all, binds to 'other'
}
// Or using _ as catch-all (value discarded):
match (x) {
_: int => "integer"
_ => "not an integer" // catch-all, value discarded
}
The type expression after : supports full generality — unions, type aliases, parenthesized groups:
// All equivalent:
x: int | bool
x: (int | bool)
x: (int) | (bool)
// Complex unions:
result: Success | Failure
code: 200 | 201 | 204
cmd: "start" | "stop"
When binding a union pattern, the bound variable retains the exact union type, not a collapsed/widened type:
class Success { data string }
class Failure { reason string }
class Pending { eta int }
match (result) {
x: Success | Failure => {
// x has type `Success | Failure`, NOT some supertype
handle(x)
}
_: Pending => "pending"
}
This preserves precision: you know exactly which types x could be.
Exhaustiveness is required. An untyped binding (catch-all) satisfies exhaustiveness for all remaining cases:
type Result = Success | Failure | null
// Exhaustive via explicit patterns:
match (result) {
_: Success => "ok"
_: Failure => "error"
null => "nothing"
}
// Exhaustive via catch-all:
match (result) {
_: Success => "ok"
_ => "not success" // covers Failure and null
}
// NOT exhaustive — compile error:
match (result) {
_: Success => "ok"
_: Failure => "error"
// Error: 'null' not handled
}
Future enhancement: Warn when a catch-all covers multiple cases, to encourage explicit handling.
match_expr := 'match' '(' expr ')' '{' match_arm+ '}'
match_arm := pattern guard? '=>' arm_body
pattern := binding_pattern | literal_pattern | union_pattern
binding_pattern := IDENT (':' type_expr)?
literal_pattern := 'null' | 'true' | 'false' | INTEGER | FLOAT | STRING
union_pattern := (literal_pattern | enum_variant) ('|' (literal_pattern | enum_variant))*
enum_variant := IDENT '.' IDENT
guard := 'if' expr
arm_body := expr | block_expr
type_expr := ... (existing type expression grammar)
| Pattern | Matches | Binding | Example |
|---|---|---|---|
name | Anything | name bound to scrutinee | other => use(other) |
_ | Anything | Discarded | _ => "fallback" |
name: Type | Values of type T | name bound (narrowed) | s: Success => s.data |
_: Type | Values of type T | Discarded | _: Failure => "failed" |
null | null value | None | null => "nothing" |
true / false | Boolean literal | None | true => "yes" |
42 / 3.14 | Numeric literal | None | 200 => "OK" |
"foo" | String literal | None | "start" => "starting" |
Enum.Variant | Enum variant (value equality) | None | Status.Active => "active" |
A | B | Union of literals/enum variants | None | Status.Active | Status.Pending => ... |
x: A | B | Union of types (not values) | x bound | x: Success | Failure => ... |
Note: The
Typeafter:must be an actual type (class, primitive, type alias), not an enum variant. Enum variants are values and must be matched directly without:binding.
The : in name: TypeExpr binds tighter than |:
x: int | bool // parsed as x: (int | bool)
x: (int | bool) // same as above
x: (int) | (bool) // same as above
class Success { data string, score float }
class Failure { reason string, code int }
type Result = Success | Failure | null
function Process(result: Result) -> string {
return match (result) {
null => "No result"
s: Success => "Got: " + s.data
f: Failure => "Error: " + f.reason
}
}
Guards add conditions that must be true for the arm to match:
function Classify(result: Result) -> string {
return match (result) {
null => "none"
s: Success if s.score >= 0.9 => "excellent: " + s.data
s: Success if s.score >= 0.7 => "good: " + s.data
s: Success => "marginal: " + s.data
f: Failure if f.code >= 500 => "server error: " + f.reason
f: Failure => "client error: " + f.reason
}
}
Important: Guards do not contribute to exhaustiveness. A guarded pattern s: Success if cond does not cover all Success values. You must have an unguarded fallback.
Scope note: This is a simplification for the current proposal. Future versions may support smarter exhaustiveness analysis that recognizes complementary guards (e.g.,
if x > 0andif x <= 0) as covering all cases.
enum Status { Active, Inactive, Pending, Archived }
function Describe(s: Status) -> string {
return match (s) {
Status.Active => "User is active"
Status.Inactive => "User is inactive"
Status.Pending => "Awaiting approval"
Status.Archived => "User archived"
}
}
If you later add Status.Deleted, the compiler will error on this match — forcing you to handle the new case.
type HttpSuccess = 200 | 201 | 204
type HttpError = 400 | 404 | 500
function DescribeStatus(code: int) -> string {
return match (code) {
200 | 201 => "Success with content"
204 => "Success, no content"
400 | 404 => "Client error"
500 => "Server error"
_ => "Unknown status: " + code
}
}
enum Status { Active, Inactive, Pending }
function IsActionable(s: Status) -> bool {
return match (s) {
Status.Active | Status.Pending => true
Status.Inactive => false
}
}
type Primitive = string | int | bool
type Complex = User | Image
type Any = Primitive | Complex | null
function Categorize(val: Any) -> string {
return match (val) {
null => "nothing"
p: Primitive => "primitive value" // p has type string | int | bool
c: Complex => "complex object" // c has type User | Image
}
}
class Request { auth: ApiKey | OAuth | null, endpoint: string }
function Authorize(req: Request) -> string {
return match (req.auth) {
null => "No auth for " + req.endpoint
a: ApiKey => match (a.key) {
k: string if k.startsWith("prod_") => "Production key"
_ => "Dev key"
}
o: OAuth if o.expires > now() => "Valid OAuth"
_: OAuth => "Expired OAuth"
}
}
When an arm needs multiple statements, use a block. The last expression is the result:
match (status) {
_: Error => {
log("Error occurred")
metrics.increment("errors")
"Failed" // return value
}
_ => "OK"
}
The compiler performs exhaustiveness analysis to ensure all possible values are handled.
s: T if cond covers only a subset of T_ or name) covers everything not yet matchedtype T = A | B | C
// OK: all explicit
match (x) {
_: A => "a"
_: B => "b"
_: C => "c"
}
// OK: catch-all covers B and C
match (x) {
_: A => "a"
_ => "not a"
}
// ERROR: C not covered
match (x) {
_: A => "a"
_: B => "b"
}
// ERROR: unreachable arm (B already covered by catch-all)
match (x) {
_: A => "a"
_ => "other"
_: B => "b" // unreachable!
}
Within a matched arm, bound variables have their type narrowed:
type T = string | int | null
match (x) {
s: string => s.length() // s is string, not string | int | null
n: int => n + 1 // n is int
null => 0
}
match (x) {
a: A => use(a) // a is A
a: B => use(a) // a is B (different a, shadows previous)
}
// a is not accessible here
| Feature | Rust | Python 3.10+ | TypeScript | BAML |
|---|---|---|---|---|
| Syntax | match x { ... } | match x: case ... | N/A | match (x) { ... } |
| Type patterns | Some(x) | case Some(x) | N/A | x: Type |
| Binding | x @ pattern | case x | N/A | x: Type or x |
| Guards | if cond | if cond | N/A | if cond |
| Exhaustiveness | Enforced | Optional | N/A | Enforced |
| Literal unions | 1 | 2 | 3 | case 1 | 2 | 3 | N/A | 1 | 2 | 3 |
Key BAML choices:
: for type binding (familiar to TS/JS developers)=> for arm bodies (familiar to JS arrow functions)default keyword; catch-all is just an untyped bindingThe following features are explicitly deferred for future consideration:
// NOT YET SUPPORTED:
match (user) {
User { name: "Admin" } => "admin"
User { name, age } => name + " is " + age
}
Destructuring introduces complexity around:
{ field } vs { field? })// NOT SUPPORTED:
match (x) {
_: A | _: B => "a or b" // can't have multiple binding patterns
}
// INSTEAD, use type union:
match (x) {
_: A | B => "a or b" // A | B is a type expression
}
@ Binding (Bind Whole + Destructure)// NOT SUPPORTED:
match (x) {
s @ Success { data } => use(s, data)
}
Add parse_match_expr to the parser, producing:
MATCH_EXPR node containing:
MATCH_ARM nodesEach MATCH_ARM contains:
Desugar match to equivalent if-else chain with instanceof checks:
// Source:
match (x) {
s: Success if s.score > 0.9 => "high"
s: Success => "low"
_: Failure => "failed"
}
// Desugared (conceptually):
{
let $scrut = x
if ($scrut instanceof Success) {
let s = $scrut
if (s.score > 0.9) {
"high"
} else {
"low"
}
} else if ($scrut instanceof Failure) {
"failed"
} else {
// unreachable if exhaustive
}
}
These are subtle points that implementers must handle correctly:
_ is NOT a KeywordUnlike Rust where _ is a special pattern, in BAML _ is just an identifier that happens to be dropped. The lexer should emit it as a WORD token, not a special token. The "drop" behavior is handled later in the pipeline (name resolution or codegen), not in parsing.
// These are parsed identically at the syntax level:
_ => "fallback"
other => "fallback"
// The difference is semantic: _ is dropped, other is usable
After :, we parse a type expression, not a value expression. This means:
s: Success — Success is resolved as a type (class name)s: 200 | 201 — 200 and 201 are literal typess: Success | Failure — union of class typesWithout the :, we have value patterns:
200 — matches the integer value 200Status.Active — matches the enum variant (via value equality)other — binds to any valueImportant distinction: Enum variants like Status.Active are values, not types. You match them directly as value patterns:
// CORRECT: Enum variant as value pattern (no binding needed)
Status.Active => "active"
Status.Active | Status.Pending => "actionable"
// INCORRECT: Cannot use enum variant after `:` — it's not a type!
// x: Status.Active => ... // ERROR: Status.Active is a value, not a type
To match an enum with binding, match the entire enum type and use guards:
enum Status { Active, Inactive, Pending }
match (s) {
// Match specific variants as values (no binding):
Status.Active => "active"
Status.Inactive => "inactive"
Status.Pending => "pending"
}
// Or match the enum type with binding and use guards:
match (s) {
x: Status if x == Status.Active => "active: " + x
_: Status => "other status"
}
This is critical for the exhaustiveness checker:
match (x) {
s: Success if s.score > 0.9 => "high"
// ERROR: Success not exhaustively covered!
}
Even though s: Success appears, the guard makes it partial. The checker must track "guarded coverage" separately from "total coverage."
The scrutinee must be evaluated exactly once and stored:
match (expensiveCall()) {
_: A => ...
_: B => ...
}
// Must compile to:
let $scrut = expensiveCall()
if ($scrut instanceof A) { ... }
else if ($scrut instanceof B) { ... }
// NOT:
if (expensiveCall() instanceof A) { ... } // Wrong: multiple calls!
When x: A | B matches, x has type A | B, not a supertype:
class Success { data string }
class Failure { reason string }
class Pending { eta int }
match (result) {
x: Success | Failure => {
// x has type: Success | Failure
// NOT some broader type
}
_: Pending => "pending"
}
This requires the type checker to construct the exact union type from the pattern.
Order matters. The compiler should error on unreachable arms:
match (x) {
_ => "catch all"
_: Success => "never reached" // ERROR: unreachable
}
But overlapping typed patterns are fine (first wins):
match (x) {
_: A => "a" // matches A
_: A | B => "ab" // only matches B (A already handled)
}
These questions from earlier drafts have been resolved:
| Question | Resolution |
|---|---|
default vs _ for catch-all? | No special keyword; any untyped binding is catch-all |
Guard keyword: if or when? | if — familiar to all |
Should | allow multiple patterns? | No; | is always a type/value union within one pattern |
| Require parens in type unions? | No; precedence is clear (x: A | B = x: (A | B)) |