Pydantic is a popular data validation library for Python, used by most -- if not all -- LLM frameworks, such as Instructor.
BAML also uses Pydantic. The BAML Rust compiler can generate Pydantic models from your .baml files. But that's not all the compiler does -- it also takes care of fixing common LLM parsing issues, supports more data types, handles retries, and reduces the amount of boilerplate code you have to write.
Let's dive into how Pydantic is used and its limitations.
At first glance, Pydantic makes it easy to get structured output from an LLM:
```python
from typing import List, Union

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class Resume(BaseModel):
    name: str
    skills: List[str]

def create_prompt(input_text: str) -> str:
    PROMPT_TEMPLATE = f"""Parse the following resume and return a structured representation of the data in the schema below.

Resume:
---
{input_text}
---

Schema:
{Resume.model_json_schema()['properties']}

Output JSON:
"""
    return PROMPT_TEMPLATE

def extract_resume(input_text: str) -> Union[Resume, None]:
    prompt = create_prompt(input_text)
    chat_completion = client.chat.completions.create(
        model="gpt-5", messages=[{"role": "system", "content": prompt}]
    )
    output = chat_completion.choices[0].message.content
    if output:
        return Resume.model_validate_json(output)
    return None
```
That's pretty good, but now we want to add an Education model to the Resume model. We add the following code:
```diff
 ...
+class Education(BaseModel):
+    school: str
+    degree: str
+    year: int

 class Resume(BaseModel):
     name: str
     skills: List[str]
+    education: List[Education]

 def create_prompt(input_text: str) -> str:
     additional_models = ""
+    if "$defs" in Resume.model_json_schema():
+        additional_models += f"\nUse these other schema definitions as well:\n{Resume.model_json_schema()['$defs']}"
     PROMPT_TEMPLATE = f"""Parse the following resume and return a structured representation of the data in the schema below.

 Resume:
 ---
 {input_text}
 ---

 Schema:
 {Resume.model_json_schema()['properties']}
+{additional_models}

 Output JSON:
 """.strip()
     return PROMPT_TEMPLATE
 ...
```
A little ugly, but still readable... But managing all these prompt strings can make your codebase disorganized very quickly.
Then you realize the LLM sometimes outputs some text before giving you the JSON, like this:

```text
The output is:
{
    "name": "John Doe",
    ... // truncated for brevity
}
```
So you add a regex that extracts everything between braces:
```python
import re

def extract_resume(input_text: str) -> Union[Resume, None]:
    prompt = create_prompt(input_text)
    chat_completion = client.chat.completions.create(
        model="gpt-5", messages=[{"role": "system", "content": prompt}]
    )
    output = chat_completion.choices[0].message.content
    if output:
        # Extract the JSON block using a regex (greedy, so nested
        # objects are captured up to the last closing brace)
        json_match = re.search(r"\{.*\}", output, re.DOTALL)
        if json_match:
            json_output = json_match.group(0)
            return Resume.model_validate_json(json_output)
    return None
```
Next you realize you actually want an array of resumes, but `Resume.model_validate_json` only parses a single model, so you have to add another wrapper:
```python
class ResumeArray(BaseModel):
    resumes: List[Resume]
```
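As an aside, Pydantic v2's `TypeAdapter` can validate a bare top-level list without a wrapper model, though you still end up maintaining two code paths. A minimal sketch:

```python
from typing import List

from pydantic import BaseModel, TypeAdapter

class Resume(BaseModel):
    name: str
    skills: List[str]

# TypeAdapter validates a bare top-level list, no wrapper model needed
adapter = TypeAdapter(List[Resume])
resumes = adapter.validate_json('[{"name": "John Doe", "skills": ["Python"]}]')
```

The wrapper class is still useful when you want a named schema to show the LLM in the prompt.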
Now you need to change the rest of your code to handle different models. That's good long-term, but it is more boilerplate you have to write, test, and maintain.
Next, you notice the LLM sometimes outputs a single resume {...}, and sometimes an array [{...}]...
You must now change your parser to handle both cases:
```python
import json

def extract_resume(input_text: str) -> Union[List[Resume], None]:
    prompt = create_prompt(input_text)  # Also requires changes
    chat_completion = client.chat.completions.create(
        model="gpt-5", messages=[{"role": "system", "content": prompt}]
    )
    output = chat_completion.choices[0].message.content
    if output:
        # Extract a JSON object *or* array using a regex
        json_match = re.search(r"[\[{].*[\]}]", output, re.DOTALL)
        if json_match:
            parsed = json.loads(json_match.group(0))
            if isinstance(parsed, list):
                # Bare array of resumes: [{...}, {...}]
                return [Resume.model_validate(item) for item in parsed]
            if "resumes" in parsed:
                # The wrapper shape we asked for: {"resumes": [...]}
                return ResumeArray.model_validate(parsed).resumes
            # A single bare resume: {...}
            return [Resume.model_validate(parsed)]
    return None
```
You could retry the call against the LLM to fix the issue, but that costs you precious seconds and tokens, so handling this corner case manually is often the only practical option.
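If you do go the retry route, that's yet more boilerplate. A minimal sketch of a retry wrapper (the `with_retries` helper and the backoff numbers are hypothetical, not part of any library here):

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def with_retries(
    call: Callable[[], Optional[T]],
    max_attempts: int = 3,
    backoff_s: float = 1.0,
) -> Optional[T]:
    """Re-invoke `call` until it returns a value, with exponential
    backoff. Every retry is another LLM round trip: more latency,
    more tokens billed."""
    for attempt in range(max_attempts):
        try:
            result = call()
            if result is not None:
                return result
        except Exception:
            if attempt == max_attempts - 1:
                raise
        time.sleep(backoff_s * 2 ** attempt)
    return None
```

Usage would look like `with_retries(lambda: extract_resume(text))`.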
Sidenote: at this point your prompt looks like this:

```text
JSON Schema:
{'name': {'title': 'Name', 'type': 'string'}, 'skills': {'items': {'type': 'string'}, 'title': 'Skills', 'type': 'array'}, 'education': {'anyOf': [{'$ref': '#/$defs/Education'}, {'type': 'null'}]}}

Use these other JSON schema definitions as well:
{'Education': {'properties': {'degree': {'title': 'Degree', 'type': 'string'}, 'major': {'title': 'Major', 'type': 'string'}, 'school': {'title': 'School', 'type': 'string'}, 'year': {'title': 'Year', 'type': 'integer'}}, 'required': ['degree', 'major', 'school', 'year'], 'title': 'Education', 'type': 'object'}}
```
Sometimes even GPT-4 outputs a structure like the one below -- technically valid JSON, so OpenAI's "JSON mode" won't save you, but not the shape you asked for:
```text
{
    "name": {
        "title": "Name",
        "type": "string",
        "value": "John Doe"
    },
    "skills": {
        "items": {
            "type": "string",
            "values": [
                "Python",
                "JavaScript",
                "React"
            ]
    ... // truncated for brevity
```
(this is an actual result from GPT-4 before some more prompt engineering)
All you really want is a prompt like the one below -- far fewer tokens, and far less room for confusion:
```text
Parse the following resume and return a structured representation of the data in the schema below.

Resume:
---
John Doe
Python, Rust
University of California, Berkeley, B.S. in Computer Science, 2020
---

JSON Schema:
{
  "name": string,
  "skills": string[],
  "education": {
    "school": string,
    "degree": string,
    "year": integer
  }[]
}

Output JSON:
```
Ahh, much better. That's roughly 80% fewer tokens and a simpler prompt, for the same results. (See also Microsoft's TypeChat, which uses a similar schema format based on TypeScript types.)
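Nothing stops you from generating that compact format yourself, but it's yet another helper to write and maintain. A rough sketch of what such a renderer might look like (`compact_schema` is a hypothetical helper covering only scalars, lists, and nested models):

```python
from typing import List, Type, get_args, get_origin

from pydantic import BaseModel

def compact_schema(model: Type[BaseModel], indent: int = 0) -> str:
    """Render a Pydantic model as a compact, TypeScript-like schema.
    Hypothetical helper: handles scalars, lists, and nested models only."""
    pad = "  " * indent
    lines = [pad + "{"]
    for name, field in model.model_fields.items():
        lines.append(f'{pad}  "{name}": {_render(field.annotation, indent + 1)},')
    lines[-1] = lines[-1].rstrip(",")  # no trailing comma on the last field
    lines.append(pad + "}")
    return "\n".join(lines)

def _render(tp, indent: int) -> str:
    scalars = {str: "string", int: "integer", float: "number", bool: "boolean"}
    if tp in scalars:
        return scalars[tp]
    if get_origin(tp) is list:
        return _render(get_args(tp)[0], indent) + "[]"
    if isinstance(tp, type) and issubclass(tp, BaseModel):
        return compact_schema(tp, indent).lstrip()
    return "any"
```

For the `Resume` model above, `compact_schema(Resume)` produces the compact schema shown in the prompt, but unions, optionals, enums, and field descriptions would all need more code.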
But we digress, let's get back to the point. You can see how this can get out of hand quickly, and how Pydantic wasn't really made with LLMs in mind. We haven't gotten around to adding resilience like retries, or falling back to a different model in the event of an outage. There's still a lot of wrapper code to write.
There are other core limitations. Say you want to do a classification task using Pydantic. An Enum is a great fit for modelling this.
Assume this is our prompt:
```text
Classify the company described in this text into the best
of the following categories:

Text:
---
{some_text}
---

Categories:
- Technology: Companies involved in the development and production of technology products or services
- Healthcare: Includes companies in pharmaceuticals, biotechnology, medical devices.
- Real estate: Includes real estate investment trusts (REITs) and companies involved in real estate development.

The best category is:
```
Since we have descriptions, we need to generate a custom enum we can use to build the prompt:
```python
from enum import Enum

class FinancialCategory(Enum):
    technology = (
        "Technology",
        "Companies involved in the development and production of technology products or services.",
    )
    ...
    real_estate = (
        "Real Estate",
        "Includes real estate investment trusts (REITs) and companies involved in real estate development.",
    )

    def __init__(self, category, description):
        self._category = category
        self._description = description

    @property
    def category(self):
        return self._category

    @property
    def description(self):
        return self._description
```
We add a class method to load the right enum from the LLM output string:
```python
    @classmethod
    def from_string(cls, category: str) -> "FinancialCategory":
        for c in cls:
            if c.category == category:
                return c
        raise ValueError(f"Invalid category: {category}")
```
Update the prompt to use the enum descriptions:
```python
def categories_and_descriptions() -> str:
    # Return (rather than print) the lines, so they can be
    # interpolated into the prompt
    return "\n".join(
        f"{category.category}: {category.description}"
        for category in FinancialCategory
    )

def create_prompt(text: str) -> str:
    PROMPT_TEMPLATE = f"""Classify the company described in this text into the best
of the following categories:

Text:
---
{text}
---

Categories:
{categories_and_descriptions()}

The best category is:
"""
    return PROMPT_TEMPLATE
```
And then we use it in our AI function:
```python
def classify_company(text: str) -> Union[FinancialCategory, None]:
    prompt = create_prompt(text)
    chat_completion = client.chat.completions.create(
        model="gpt-5", messages=[{"role": "system", "content": prompt}]
    )
    output = chat_completion.choices[0].message.content
    if output:
        # Use our helper function!
        return FinancialCategory.from_string(output)
    return None
```
Things get hairy when you want to change your types.

`str(category)` saves `FinancialCategory.healthcare` into your DB, but your parser only recognizes "Healthcare", so you'll need even more boilerplate if you ever want to programmatically analyze your data. Libraries like Instructor remove a good amount of this boilerplate, but you're still hand-maintaining the prompts, parsers, and glue code around them.
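One common workaround (an illustration, not the only approach) is a `str`-valued enum, at the cost of moving the descriptions into a separate lookup table:

```python
from enum import Enum

class FinancialCategory(str, Enum):
    # The enum value is the same human-readable string the LLM
    # returns, so parsing and DB serialization agree
    technology = "Technology"
    healthcare = "Healthcare"
    real_estate = "Real Estate"

# Descriptions now have to live outside the enum
CATEGORY_DESCRIPTIONS = {
    FinancialCategory.healthcare: "Includes companies in pharmaceuticals, biotechnology, medical devices.",
}

parsed = FinancialCategory("Healthcare")  # round-trips cleanly
```

That keeps storage and parsing consistent, but you've now split one concept across two definitions that must stay in sync.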
The Boundary toolkit helps you iterate seamlessly compared to Pydantic.
Here's all the BAML code you need to solve the Extract Resume problem from earlier (VSCode prompt preview is shown on the right):
<Note> Here we use a "GPT4" client, but you can use any model. See [client docs](/ref/llm-client-providers/open-ai) </Note>

```baml
class Education {
  school string
  degree string
  year int
}

class Resume {
  name string
  skills string[]
  education Education[]
}

function ExtractResume(resume_text: string) -> Resume {
  client GPT4
  prompt #"
    Parse the following resume and return a structured representation of the data in the schema below.

    Resume:
    ---
    {{ input.resume_text }}
    ---

    Output in this JSON format:
    {{ ctx.output_format }}

    Output JSON:
  "#
}
```
The BAML compiler generates a Python client that imports and calls the function:

```python
from baml_client import baml as b

async def main():
    resume = await b.ExtractResume(resume_text="""John Doe
Python, Rust
University of California, Berkeley, B.S. in Computer Science, 2020""")
    assert resume.name == "John Doe"
```
That's it! No need to write any more code. Since the compiler knows your function signature, it generates a custom deserializer for your unique use case that just works.
Converting the Resume into an array of resumes requires a single line change in BAML (vs having to create array wrapper classes and parsing logic).
In this image we change the types and BAML automatically updates the prompt, parser, and the Python types you get back.
Adding retries or resilience requires just a couple of modifications. And best of all, you can test things instantly, without leaving your VSCode.
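For example, a retry policy is declared next to the client in BAML. The sketch below follows the general shape documented for BAML retry policies; treat the exact field names as an approximation and confirm against the current reference docs:

```baml
retry_policy Exponential {
  max_retries 2
  strategy {
    type exponential_backoff
  }
}

client<llm> GPT4 {
  provider openai
  retry_policy Exponential
  options {
    model gpt-4
  }
}
```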
Pydantic is excellent for data validation, but LLM applications need more than validation -- they need a complete structured extraction solution.
We built BAML because writing a Python library wasn't powerful enough to solve the real challenges of LLM structured extraction.
Get started today with Python, TypeScript, Go, Ruby or other languages.
Our mission is to build the best developer experience for AI engineers working with LLMs. Contact us at [email protected] or join us on Discord to stay in touch with the community and influence the roadmap.