# Azure AI Content Understanding client library for .NET
Azure AI Content Understanding is a multimodal AI service that extracts semantic content from documents, video, audio, and image files. It transforms unstructured content into structured, machine-readable data optimized for retrieval-augmented generation (RAG) and automated workflows.
Use the client library for Azure AI Content Understanding to:
Source code | Package (NuGet) | API reference documentation | Product documentation
Install the client library for .NET with NuGet:
```dotnetcli
dotnet add package Azure.AI.ContentUnderstanding
```
Before using the Content Understanding SDK, you need to set up a Microsoft Foundry resource and deploy the required models. Content Understanding currently uses Azure OpenAI models such as `gpt-4.1` and `gpt-4.1-mini`, plus the `text-embedding-3-large` embedding model.
Important: You must create your Microsoft Foundry resource in a region that supports Content Understanding. For a list of available regions, see Azure Content Understanding region and language support.
Your resource endpoint will have the form `https://<your-resource-name>.services.ai.azure.com/`.

Important: Grant Required Permissions
After creating your Microsoft Foundry resource, you must grant yourself the Cognitive Services User role to enable API calls for setting default model deployments:
Note: This role assignment is required even if you are the owner of the resource. Without this role, you cannot call the Content Understanding API to configure model deployments for prebuilt and custom analyzers.
Important: The prebuilt and custom analyzers require large language model deployments. You must deploy at least these models before using prebuilt analyzers and custom analyzers:
- `prebuilt-documentSearch`, `prebuilt-imageSearch`, `prebuilt-audioSearch`, and `prebuilt-videoSearch` require `gpt-4.1-mini` and `text-embedding-3-large`
- `prebuilt-invoice` and `prebuilt-receipt` require `gpt-4.1` and `text-embedding-3-large`

To deploy a model:

- Deploy `gpt-4.1`, `gpt-4.1-mini`, and `text-embedding-3-large`.
- Choose a deployment name (for example, `gpt-4.1` for the `gpt-4.1` model). You can use any deployment name you prefer, but you'll need to note it for use later when configuring model deployments.
- Repeat this process for each model required by your prebuilt analyzers.
For more information on deploying models, see Create model deployments in Microsoft Foundry portal.
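If you prefer the Azure CLI over the portal, a deployment can be created along the following lines. This is a sketch: the resource name, resource group, model version, and SKU values are placeholders you must replace with values valid for your subscription and region.

```shell
# Sketch: create a model deployment on a Microsoft Foundry (Azure AI services) resource.
# <resource-name>, <resource-group>, <model-version>, and the SKU values are placeholders.
az cognitiveservices account deployment create \
  --name <resource-name> \
  --resource-group <resource-group> \
  --deployment-name gpt-4.1-mini \
  --model-name gpt-4.1-mini \
  --model-version <model-version> \
  --model-format OpenAI \
  --sku-name GlobalStandard \
  --sku-capacity 10
```

Run the command once per required model, adjusting `--deployment-name` and `--model-name` each time.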
IMPORTANT: This is a one-time setup per Microsoft Foundry resource that maps your deployed models to those required by the prebuilt analyzers and custom models. If you have multiple Microsoft Foundry resources, you need to configure each one separately.
You need to configure the default model mappings in your Microsoft Foundry resource. This can be done programmatically using the SDK. The configuration maps your deployed models (currently gpt-4.1, gpt-4.1-mini, and text-embedding-3-large) to the large language models required by prebuilt analyzers.
To configure model deployments using code, see Sample 00: Configure model deployment defaults for a complete example. The sample shows how to:
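The shape of that configuration call is sketched below. The method name `SetDefaultModelDeployments` and its dictionary parameter are hypothetical placeholders; the actual type and method names come from Sample 00 and may differ.

```C#
using Azure.AI.ContentUnderstanding;
using Azure.Identity;

string endpoint = "<endpoint>";
var client = new ContentUnderstandingClient(new Uri(endpoint), new DefaultAzureCredential());

// Map each model required by the prebuilt analyzers to your own deployment name.
var mappings = new Dictionary<string, string>
{
    ["gpt-4.1"] = "<your-gpt-4.1-deployment-name>",
    ["gpt-4.1-mini"] = "<your-gpt-4.1-mini-deployment-name>",
    ["text-embedding-3-large"] = "<your-text-embedding-3-large-deployment-name>",
};

// Hypothetical method name -- see Sample 00 for the actual API.
client.SetDefaultModelDeployments(mappings);
```

Because this mapping is stored per resource, the call only needs to run once per Microsoft Foundry resource, not once per application.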
In order to interact with the Content Understanding service, you'll need to create an instance of the ContentUnderstandingClient class. To authenticate the client, you need your Microsoft Foundry resource endpoint and credentials. You can use either an API key or Microsoft Entra ID authentication.
The simplest way to authenticate is using DefaultAzureCredential, which supports multiple authentication methods and works well in both local development and production environments:
```C#
// Example: https://your-foundry.services.ai.azure.com/
string endpoint = "<endpoint>";
var credential = new DefaultAzureCredential();
var client = new ContentUnderstandingClient(new Uri(endpoint), credential);
```
You can also authenticate using an API key from your Microsoft Foundry resource:
```C#
// Example: https://your-foundry.services.ai.azure.com/
string endpoint = "<endpoint>";
string apiKey = "<apiKey>";
var client = new ContentUnderstandingClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
```
⚠️ Security Warning: API key authentication is less secure and is only recommended for testing purposes with test resources. For production, use `DefaultAzureCredential` or other secure authentication methods.
To get your API key:
For more information on authentication, see Azure Identity client library.
Content Understanding provides a rich set of prebuilt analyzers that are ready to use without any configuration. These analyzers are powered by knowledge bases of thousands of real-world document examples, enabling them to understand document structure and adapt to variations in format and content.
Prebuilt analyzers are organized into several categories:
Search analyzers return a summary for each content item:

- `prebuilt-documentSearch` - Extracts content from documents (PDF, images, Office documents) with layout preservation, table detection, figure analysis, and structured markdown output. Optimized for RAG scenarios.
- `prebuilt-imageSearch` - Analyzes standalone images and returns a one-paragraph description of the image content. Optimized for image understanding and search scenarios. For images that contain text (including handwritten text), use `prebuilt-documentSearch`.
- `prebuilt-audioSearch` - Transcribes audio content with speaker diarization, timing information, and conversation summaries. Supports multilingual transcription.
- `prebuilt-videoSearch` - Analyzes video content with visual frame extraction, audio transcription, and structured summaries. Provides temporal alignment of visual and audio content and can return multiple segments per video.

Other prebuilt analyzers include `prebuilt-read`, `prebuilt-layout`, `prebuilt-document`, `prebuilt-image`, `prebuilt-audio`, `prebuilt-video`, `prebuilt-documentFieldSchema`, and `prebuilt-documentFields`.

For a complete list of available prebuilt analyzers and their capabilities, see the Prebuilt analyzers documentation.
The API returns different content types based on the input. Both the `DocumentContent` and `AudioVisualContent` classes derive from the `AnalysisContent` class, which provides basic information and a markdown representation. Each derived class adds properties for detailed information:

- `DocumentContent` - For document files (PDF, HTML, images, and Office documents such as Word, Excel, and PowerPoint). Provides basic information such as page count and MIME type, plus detailed information including pages, tables, figures, and paragraphs.
- `AudioVisualContent` - For audio and video files. Provides basic information such as timing (start/end times) and, for video, frame dimensions, plus detailed information including transcript phrases, timing information, and, for video, key frame references.

Content Understanding operations are asynchronous long-running operations. The workflow is:
The SDK provides Operation<T> types that handle polling automatically when using WaitUntil.Completed. For analysis operations, the SDK returns Operation<AnalysisResult> and provides access to the operation ID via the Id property. This operation ID can be used with GetResultFile* and DeleteResult* methods.
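Under these conventions, a typical analysis call looks roughly like the following. This is a sketch: the `Analyze` overload shown (analyzer id plus `BinaryData` input) and the `Contents` property on the result are assumptions based on the text above, so check the samples for the exact signatures.

```C#
using Azure;
using Azure.AI.ContentUnderstanding;
using Azure.Identity;

var client = new ContentUnderstandingClient(
    new Uri("<endpoint>"),
    new DefaultAzureCredential());

// Start the long-running analysis and block until it completes.
// Assumed overload: (WaitUntil, analyzerId, BinaryData).
Operation<AnalysisResult> operation = client.Analyze(
    WaitUntil.Completed,
    "prebuilt-documentSearch",
    BinaryData.FromBytes(File.ReadAllBytes("sample.pdf")));

// The operation id can be reused with GetResultFile* / DeleteResult* methods.
Console.WriteLine($"Operation id: {operation.Id}");

AnalysisResult result = operation.Value;
Console.WriteLine(result.Contents[0].Markdown); // assumed result shape
```

Passing `WaitUntil.Started` instead would return immediately and let you poll the operation yourself.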
- `ContentUnderstandingClient` - The main client for analyzing content, as well as creating, managing, and configuring analyzers
- `AnalysisResult` - Contains the structured results of an analysis operation, including content elements, markdown, and metadata

We guarantee that all client instance methods are thread-safe and independent of each other (guideline). This ensures that the recommendation of reusing client instances is always safe, even across threads.
Client options | Accessing the response | Long-running operations | Handling failures | Diagnostics | Mocking | Client lifetime
<!-- CLIENT COMMON BAR -->

You can familiarize yourself with different APIs using the Samples.
The samples demonstrate:
- `prebuilt-documentSearch`, optimized for RAG (Retrieval-Augmented Generation) applications
- `prebuilt-documentSearch`, `prebuilt-imageSearch`, `prebuilt-audioSearch`, and `prebuilt-videoSearch`
- `prebuilt-invoice`

See the samples directory for complete examples.
Error: "Access denied due to invalid subscription key or wrong API endpoint"
Verify that:

- Your endpoint URL is correct
- Your API key is valid, or your Microsoft Entra ID credentials have the correct permissions

Error: "Model deployment not found" or "Default model deployment not configured"
Error: "Operation failed" or timeout
- Wait for the operation to finish using `WaitUntil.Completed` or manual polling

To enable logging for debugging, configure logging in your application:
```C#
using Azure.Core.Diagnostics;

// Enable console logging
using AzureEventSourceListener listener = AzureEventSourceListener.CreateConsoleLogger();
```
For more information, see Diagnostics samples.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
<!-- LINKS -->