In this tutorial, you'll learn how to create a simple but effective spam classifier using BAML and OpenAI's GPT models. By the end, you'll have a working classifier that can distinguish between spam and legitimate messages.
First, let's define what our classification output should look like. Create a new file called spam_classifier.baml and add the following schema:
enum MessageType {
  SPAM
  NOT_SPAM
}
This schema defines a simple classification with two possible labels: SPAM or NOT_SPAM.
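To illustrate what a closed, two-label schema buys you, here is a minimal Python sketch that mirrors the enum by hand. It is not the code BAML generates, and `parse_label` is a hypothetical helper introduced only for this example. It shows that any value outside the two labels is rejected rather than silently accepted.

```python
from enum import Enum


class MessageType(Enum):
    """Hand-written mirror of the BAML enum (not BAML-generated code)."""

    SPAM = "SPAM"
    NOT_SPAM = "NOT_SPAM"


def parse_label(raw: str) -> MessageType:
    """Map a raw label string to MessageType, rejecting anything else.

    Hypothetical helper: normalizes case, surrounding whitespace, and
    spaces or hyphens between words before the lookup.
    """
    normalized = raw.strip().upper().replace(" ", "_").replace("-", "_")
    try:
        return MessageType(normalized)
    except ValueError:
        # Re-raise with a message that names the offending input.
        raise ValueError(f"unknown label: {raw!r}") from None
```

With this sketch, `parse_label(" spam ")` resolves to `MessageType.SPAM`, while a string such as `"maybe"` raises `ValueError`. BAML performs its own parsing of model output, so treat this only as a picture of the constraint the enum expresses.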
Next, we'll create a function that uses OpenAI's gpt-5-mini model to classify text. Add this to your spam_classifier.baml file:
function ClassifyText(input: string) -> MessageType {
  client "openai/gpt-5-mini"
  prompt #"
    Classify the message.
    {{ ctx.output_format }}
    {{ _.role("user") }}
    {{ input }}
  "#
}
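Let's break down what the ClassifyText function does. It takes a string input and returns a MessageType, so a successful call yields one of the two labels. It sends the request to the gpt-5-mini model through the openai/gpt-5-mini client. In the prompt, {{ ctx.output_format }} inserts instructions describing the expected output, and {{ _.role("user") }} marks what follows it, here the message itself, as the user turn.
To ensure our classifier works correctly, let's add some test cases: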
test BasicSpamTest {
  functions [ClassifyText]
  args {
    input "Buy cheap watches now! Limited time offer!!!"
  }
}

test NonSpamTest {
  functions [ClassifyText]
  args {
    input "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?"
  }
}
This is what it looks like in the BAML Playground:
Now that you have your classifier set up, try it with your own examples. Here are some messages you can test:
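If you later want to check the classifier from a script rather than the Playground, one possible approach is a small accuracy harness. The sketch below is only an illustration under stated assumptions: `keyword_stub` is a hypothetical stand-in for a real call to ClassifyText (it is a keyword check, not an LLM), and `accuracy` is a helper invented for this example. The two messages and their expected labels come from the test cases above.

```python
from enum import Enum
from typing import Callable, List, Tuple


class MessageType(Enum):
    """Hand-written mirror of the BAML enum (not BAML-generated code)."""

    SPAM = "SPAM"
    NOT_SPAM = "NOT_SPAM"


def keyword_stub(text: str) -> MessageType:
    """Hypothetical stand-in for ClassifyText; a keyword check, not an LLM."""
    spam_markers = ("buy", "cheap", "offer")
    lowered = text.lower()
    if any(marker in lowered for marker in spam_markers):
        return MessageType.SPAM
    return MessageType.NOT_SPAM


def accuracy(
    classify: Callable[[str], MessageType],
    examples: List[Tuple[str, MessageType]],
) -> float:
    """Return the fraction of examples whose predicted label matches."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(1 for text, expected in examples if classify(text) is expected)
    return correct / len(examples)


# The two messages used by BasicSpamTest and NonSpamTest.
EXAMPLES = [
    ("Buy cheap watches now! Limited time offer!!!", MessageType.SPAM),
    (
        "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?",
        MessageType.NOT_SPAM,
    ),
]

score = accuracy(keyword_stub, EXAMPLES)
```

Because `accuracy` accepts any callable that maps a string to a `MessageType`, the stub could be swapped for a wrapper around the real classifier without changing the harness.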
While the spam classifier demonstrates single-label classification (where each input belongs to exactly one category), many real-world problems require multiple labels. Let's build a support ticket classifier that can assign multiple relevant categories to each ticket.
Create a new file called ticket_classifier.baml and define the possible ticket categories as an enum:
enum TicketLabel {
  ACCOUNT
  BILLING
  GENERAL_QUERY
}

class TicketClassification {
  labels TicketLabel[]
}
Notice how this schema differs from our spam classifier: it still uses an enum to define the valid labels, but it adds a TicketClassification class whose labels field is an array (TicketLabel[]), allowing multiple labels per ticket.
Add the classification function to your ticket_classifier.baml file:
function ClassifyTicket(ticket: string) -> TicketClassification {
  client "openai/gpt-5-mini"
  prompt #"
    You are a support agent at a tech company. Analyze the support ticket and select all applicable labels.
    {{ ctx.output_format }}
    {{ _.role("user") }}
    {{ ticket }}
  "#
}
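To picture the shape of a multi-label result, the following Python sketch mirrors the TicketClassification schema by hand and validates a JSON reply against it. This is an assumption-heavy illustration: the JSON shape (`{"labels": [...]}`) is inferred from the class definition, `parse_classification` is a hypothetical helper, and BAML's own parsing and generated types may differ.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List


class TicketLabel(Enum):
    """Hand-written mirror of the BAML enum (not BAML-generated code)."""

    ACCOUNT = "ACCOUNT"
    BILLING = "BILLING"
    GENERAL_QUERY = "GENERAL_QUERY"


@dataclass
class TicketClassification:
    labels: List[TicketLabel]


def parse_classification(raw_json: str) -> TicketClassification:
    """Parse an assumed JSON reply into validated, de-duplicated labels."""
    data = json.loads(raw_json)
    if not isinstance(data, dict) or not isinstance(data.get("labels"), list):
        raise ValueError("expected an object with a 'labels' array")
    labels: List[TicketLabel] = []
    for name in data["labels"]:
        label = TicketLabel(name)  # raises ValueError for unknown labels
        if label not in labels:  # keep first occurrence, preserve order
            labels.append(label)
    return TicketClassification(labels=labels)
```

Keeping the labels as a validated list means downstream code can rely on every entry being a known `TicketLabel`, with repeated labels collapsed.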
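Key differences between ClassifyTicket and the spam classifier: it returns a TicketClassification (a list of labels) rather than a single enum value, and its prompt sets a support-agent context and asks the model to select all applicable labels.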
Add test cases that cover both single-label and multi-label scenarios:
test ClassifyTicketSingleLabel {
  functions [ClassifyTicket]
  args {
    ticket "I need help resetting my password"
  }
}

test ClassifyTicketMultiLabel {
  functions [ClassifyTicket]
  args {
    ticket "My account is locked and I can't access my billing information"
  }
}
This is what it looks like in the BAML Playground:
Test the multi-label classifier with these examples:
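When checking multi-label results in a script, it helps to compare sets of labels rather than single values. The sketch below is one possible way to score a prediction, not something the tutorial prescribes: `exact_match` and `jaccard` are hypothetical helpers, and the enum is a hand-written mirror of TicketLabel rather than BAML-generated code.

```python
from enum import Enum
from typing import Iterable


class TicketLabel(Enum):
    """Hand-written mirror of the BAML enum (not BAML-generated code)."""

    ACCOUNT = "ACCOUNT"
    BILLING = "BILLING"
    GENERAL_QUERY = "GENERAL_QUERY"


def exact_match(
    predicted: Iterable[TicketLabel], expected: Iterable[TicketLabel]
) -> bool:
    """True when both label collections contain the same labels, in any order."""
    return set(predicted) == set(expected)


def jaccard(
    predicted: Iterable[TicketLabel], expected: Iterable[TicketLabel]
) -> float:
    """Size of the intersection divided by size of the union (partial credit)."""
    p, e = set(predicted), set(expected)
    if not p and not e:
        return 1.0  # convention chosen here: two empty sets agree fully
    return len(p & e) / len(p | e)
```

For instance, if you assume the locked-account ticket should be labeled ACCOUNT and BILLING, a prediction containing only ACCOUNT fails `exact_match` but scores 0.5 with `jaccard`. That expected labeling is an assumption for illustration, not a recorded model output.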