Extracting Action Items from Meeting Transcripts

In this tutorial, you'll learn how to build a BAML function that automatically extracts structured action items from meeting transcripts. By the end, you'll have a working system that can identify tasks, assignees, priorities, subtasks, and dependencies.

Prerequisites

Basic understanding of BAML syntax
An OpenAI API key configured in your environment (the openai client reads it from the OPENAI_API_KEY environment variable by default)

Step 1: Define the Data Models

First, let's define the data structures for our tasks. Create a new BAML file called action_items.baml and add these class definitions:

baml

class Subtask {
  id int
  name string
}

enum Priority {
  HIGH
  MEDIUM
  LOW
}

class Ticket {
  id int
  name string 
  description string
  priority Priority
  assignees string[]
  subtasks Subtask[]
  dependencies int[]
}

These models define:

A Subtask class for breaking down larger tasks
A Priority enum for task urgency levels
A Ticket class that represents a complete task: its name, description, priority, assignees, subtasks, and the IDs of the tasks it depends on
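To make the shape of these models concrete outside BAML, here is a minimal Python sketch that mirrors them with dataclasses. It is a hand-written stand-in for illustration only: the dataclasses and the ticket_from_dict helper are assumptions of this sketch, not part of BAML, and a generated BAML client ships its own types.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Priority(str, Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class Subtask:
    id: int
    name: str


@dataclass
class Ticket:
    id: int
    name: str
    description: str
    priority: Priority
    assignees: List[str] = field(default_factory=list)
    subtasks: List[Subtask] = field(default_factory=list)
    dependencies: List[int] = field(default_factory=list)


def ticket_from_dict(raw: dict) -> Ticket:
    """Build a Ticket from a plain dict (hypothetical helper for this sketch)."""
    return Ticket(
        id=raw["id"],
        name=raw["name"],
        description=raw["description"],
        priority=Priority(raw["priority"]),  # raises ValueError on an unknown level
        assignees=list(raw["assignees"]),
        subtasks=[Subtask(**s) for s in raw["subtasks"]],
        dependencies=list(raw["dependencies"]),
    )


ticket = ticket_from_dict({
    "id": 4,
    "name": "Develop Billing System",
    "description": "Create a new billing system.",
    "priority": "MEDIUM",
    "assignees": ["Bob"],
    "subtasks": [],
    "dependencies": [3],
})
print(ticket.priority.name, ticket.dependencies)  # MEDIUM [3]
```

Because priority is an enum, a value outside HIGH, MEDIUM, and LOW is rejected when the object is built, so it cannot propagate silently.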

Step 2: Create the Task Extraction Function

Next, we'll create a function that uses GPT-4 to analyze meeting transcripts and extract tasks:

baml

function ExtractTasks(transcript: string) -> Ticket[] {
  client "openai/gpt-4"
  prompt #"
    You are an expert at analyzing meeting transcripts and extracting structured action items and tasks.
    Extract all action items, tasks and subtasks from the meeting transcript below.
    For each task:
    - Generate a unique ID
    - Include who is assigned to it
    - Set appropriate priority level
    - Identify subtasks if any
    - Note any dependencies on other tasks

    {{ ctx.output_format }}

    {{ _.role("user") }} {{ transcript }}
  "#
}

This function:

Takes a meeting transcript as input
Returns an array of Ticket objects
Uses GPT-4 to analyze the transcript
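Gives the model explicit extraction instructions in the prompt
Uses {{ ctx.output_format }} to insert a description of the expected Ticket[] output into the prompt
Sends the transcript as a separate user message via {{ _.role("user") }}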

Step 3: Test the Implementation

Let's add test cases to exercise the function. They run ExtractTasks on sample transcripts so you can inspect the extracted tickets. Add these test cases to your BAML file:

baml

test SimpleTranscript {
  functions [ExtractTasks]
  args {
    transcript #"
        Alice: We need to update the website by next week. This is high priority.
        Bob: I can handle that. I'll need Carol's help with the design though.
        Carol: Sure, I can help with the design part.
    "#
  }
}

test ComplexTranscript {
  functions [ExtractTasks]
  args {
    transcript #"
        Alice: Hey team, we have several critical tasks we need to tackle for the upcoming release. First, we need to work on improving the authentication system. It's a top priority.
        Bob: Got it, Alice. I can take the lead on the authentication improvements. Are there any specific areas you want me to focus on?
        Alice: Good question, Bob. We need both a front-end revamp and back-end optimization. So basically, two sub-tasks.
        Carol: I can help with the front-end part of the authentication system.
        Bob: Great, Carol. I'll handle the back-end optimization then.
        Alice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.
        Carol: Is the new billing system already in place?
        Alice: No, it's actually another task. So it's a dependency for the integration task. Bob, can you also handle the billing system?
        Bob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.
        Alice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.
        Carol: I can take that on once the front-end changes for the authentication system are done. So, it would be dependent on that.
        Alice: Sounds like a plan. Let's get these tasks modeled out and get started.
    "#
  }
}

These tests provide:

A simple case with a single high-priority task, two people involved, and one implied subtask (the design work)
A complex case with multiple tasks, priorities, dependencies, and assignees

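Run the tests in the BAML playground to see the parsed result for each case. Model output can vary between runs. Note that dependencies may refer to subtask IDs as well as ticket IDs (for example, the billing ticket depends on subtask 3).

This is the output from the complex test case: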

output.txt

[
  {
    "id": 1,
    "name": "Improve Authentication System",
    "description": "Overhaul the authentication system focusing on both front-end and back-end aspects.",
    "priority": "HIGH",
    "assignees": ["Bob", "Carol"],
    "subtasks": [
      {
        "id": 2,
        "name": "Front-end Revamp"
      },
      {
        "id": 3,
        "name": "Back-end Optimization"
      }
    ],
    "dependencies": []
  },
  {
    "id": 4,
    "name": "Develop Billing System",
    "description": "Create a new billing system which will be integrated with the authentication system.",
    "priority": "MEDIUM",
    "assignees": ["Bob"],
    "subtasks": [],
    "dependencies": [3]
  },
  {
    "id": 5,
    "name": "Integrate Authentication System with Billing System",
    "description": "Integrate the improved authentication system with the new billing system.",
    "priority": "MEDIUM",
    "assignees": ["Bob"],
    "subtasks": [],
    "dependencies": [3, 4]
  },
  {
    "id": 6,
    "name": "Update User Documentation",
    "description": "Update the user documentation to reflect changes in the authentication and billing systems.",
    "priority": "LOW",
    "assignees": ["Carol"],
    "subtasks": [],
    "dependencies": [2, 5]
  }
]
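Since the model generates the IDs itself, it can be worth checking that they are consistent before using the result. The following is a minimal sketch, assuming the output has been loaded as a list of plain dicts; check_ticket_ids is a hypothetical helper name, and the sample data keeps only the fields the check needs.

```python
from collections import Counter


def check_ticket_ids(tickets):
    """Return a list of problems; an empty list means the IDs are consistent."""
    # Tickets and subtasks share one ID space in the output above.
    ids = []
    for ticket in tickets:
        ids.append(ticket["id"])
        ids.extend(sub["id"] for sub in ticket.get("subtasks", []))

    problems = [f"duplicate id {i}" for i, n in Counter(ids).items() if n > 1]

    known = set(ids)
    for ticket in tickets:
        for dep in ticket.get("dependencies", []):
            if dep not in known:
                problems.append(f"ticket {ticket['id']} depends on unknown id {dep}")
            elif dep == ticket["id"]:
                problems.append(f"ticket {ticket['id']} depends on itself")
    return problems


# IDs and dependencies taken from the complex test case output.
tickets = [
    {"id": 1, "subtasks": [{"id": 2}, {"id": 3}], "dependencies": []},
    {"id": 4, "subtasks": [], "dependencies": [3]},
    {"id": 5, "subtasks": [], "dependencies": [3, 4]},
    {"id": 6, "subtasks": [], "dependencies": [2, 5]},
]
print(check_ticket_ids(tickets))  # []
```

An empty list means every dependency points at an existing ticket or subtask and no ID is reused.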

What's Next?

You can enhance this implementation by:

Adding due dates to the Ticket class
Including status tracking for tasks
Adding validation for task dependencies
Implementing custom formatting for the extracted tasks
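As one possible take on dependency validation, the sketch below orders tickets so that each one comes after the tickets it depends on, and reports circular dependencies. It is an illustration under stated assumptions, not part of BAML: execution_order is a hypothetical helper, the tickets are plain dicts, a dependency on a subtask is treated as a dependency on the ticket that owns it, and unknown IDs are skipped. It uses graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(tickets):
    """Return ticket IDs with every ticket listed after the tickets it depends on."""
    # Map every ID (ticket or subtask) to the ticket that owns it.
    owner = {}
    for ticket in tickets:
        owner[ticket["id"]] = ticket["id"]
        for subtask in ticket["subtasks"]:
            owner[subtask["id"]] = ticket["id"]

    # For each ticket, collect the set of tickets that must finish first.
    graph = {
        ticket["id"]: {owner[dep] for dep in ticket["dependencies"] if dep in owner}
        for ticket in tickets
    }
    # static_order() raises CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


# IDs and dependencies taken from the complex test case output.
tickets = [
    {"id": 1, "subtasks": [{"id": 2}, {"id": 3}], "dependencies": []},
    {"id": 4, "subtasks": [], "dependencies": [3]},
    {"id": 5, "subtasks": [], "dependencies": [3, 4]},
    {"id": 6, "subtasks": [], "dependencies": [2, 5]},
]
print(execution_order(tickets))  # [1, 4, 5, 6]

try:
    execution_order([
        {"id": 1, "subtasks": [], "dependencies": [2]},
        {"id": 2, "subtasks": [], "dependencies": [1]},
    ])
except CycleError as err:
    print("circular dependency:", err.args[1])
```

Under this mapping, a ticket that lists one of its own subtasks as a dependency is also reported as circular. Filter those edges out first if that is not what you want.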

Common Issues and Solutions

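If tasks aren't being properly identified, make the prompt more specific, for example by stating what counts as an action item (a commitment with an owner rather than a passing remark)
If priorities aren't being set correctly, add a few examples to the prompt, such as mapping a phrase like "top priority" to HIGH
For long or complex transcripts, you might need to adjust the model parameters (for example, lowering the temperature) or switch to a more capable model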