Conductor AI Module

The Conductor AI module provides built-in integration with 12 popular LLM providers and 3 vector databases, enabling AI-powered workflows through simple task definitions -- including chat, embeddings, image generation, audio synthesis, video generation, document generation, and tool calling.

Table of Contents

  • Supported Providers
  • AI Task Types
  • Configuration
  • Environment Variables
  • Docker
  • Sample Workflows
  • Enable/Disable AI Workers
  • Testing
  • License

Supported Providers

LLM Providers

| Provider | Models |
|----------|--------|
| OpenAI | GPT-4o, GPT-4o-mini, DALL-E-3, Sora-2, text-embedding-3-small/large |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus/Sonnet/Haiku, Claude 4 Sonnet |
| Google Gemini | Gemini 1.5/2.0, Veo 2/3, Imagen, text-embedding-004 |
| Azure OpenAI | GPT-4o, GPT-4, GPT-3.5-turbo, text-embedding-ada-002, DALL-E-3 |
| AWS Bedrock | Claude 3.x, Titan, Llama 3.x, amazon.titan-embed-text-v2:0 |
| Mistral AI | Mistral Small/Medium/Large, Mixtral 8x7B, mistral-embed |
| Cohere | Command, Command-R, Command-R+, embed-english-v3.0 |
| Grok | Grok-3, Grok-3-mini |
| Perplexity AI | Sonar, Sonar Pro |
| HuggingFace | Llama 3.x, Mistral 7B, Zephyr |
| Ollama | Llama 3.x, Mistral, Phi, nomic-embed-text (local deployment) |
| Stability AI | SD3.5 Large/Medium, Stable Image Core, Stable Image Ultra |

Vector Database Providers

| Provider | Description |
|----------|-------------|
| PostgreSQL (pgvector) | Postgres with vector extension |
| Pinecone | Managed vector database |
| MongoDB Atlas | MongoDB vector search |

Note: Multiple named instances of these providers can be configured. See Vector Database Configuration for details.

AI Task Types

Overview

| Task Type | Task Name | Description |
|-----------|-----------|-------------|
| Chat Complete | LLM_CHAT_COMPLETE | Multi-turn conversational AI with optional tool calling |
| Text Complete | LLM_TEXT_COMPLETE | Single prompt completion |
| Generate Embeddings | LLM_GENERATE_EMBEDDINGS | Convert text to vector embeddings |
| Image Generation | GENERATE_IMAGE | Generate images from text prompts |
| Audio Generation | GENERATE_AUDIO | Text-to-speech synthesis |
| Video Generation | GENERATE_VIDEO | Generate videos from text/image prompts (async) |
| Index Text | LLM_INDEX_TEXT | Store text with embeddings in vector DB |
| Store Embeddings | LLM_STORE_EMBEDDINGS | Store pre-computed embeddings |
| Search Index | LLM_SEARCH_INDEX | Semantic search using text query |
| Search Embeddings | LLM_SEARCH_EMBEDDINGS | Search using embedding vectors |
| Get Embeddings | LLM_GET_EMBEDDINGS | Retrieve stored embeddings |
| List MCP Tools | LIST_MCP_TOOLS | List tools from MCP server |
| Generate PDF | GENERATE_PDF | Convert markdown to PDF document |
| Call MCP Tool | CALL_MCP_TOOL | Call a tool on MCP server |

LLM_CHAT_COMPLETE

Multi-turn conversational AI with support for tool calling.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name (e.g., openai, anthropic, gemini) |
| model | String | Yes | Model identifier (e.g., gpt-4o, claude-3-5-sonnet-20241022) |
| messages | Array | Yes | Conversation messages with role and message fields |
| temperature | Number | No | Sampling temperature (0.0-2.0, default: 1.0) |
| maxTokens | Integer | No | Maximum tokens in response |
| topP | Number | No | Nucleus sampling parameter |
| stopSequences | Array | No | Sequences that stop generation |
| tools | Array | No | Tool definitions for function calling |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| result | String | Generated response text |
| finishReason | String | Why generation stopped (STOP, TOOL_CALLS, LENGTH) |
| tokenUsed | Integer | Total tokens used |
| promptTokens | Integer | Tokens in the prompt |
| completionTokens | Integer | Tokens in the response |
| toolCalls | Array | Tool invocations (when finishReason is TOOL_CALLS) |

LLM_TEXT_COMPLETE

Single prompt text completion.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name |
| model | String | Yes | Model identifier |
| prompt | String | Yes | Text prompt to complete |
| temperature | Number | No | Sampling temperature |
| maxTokens | Integer | No | Maximum tokens in response |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| result | String | Generated completion text |
| tokenUsed | Integer | Total tokens used |
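
A minimal LLM_TEXT_COMPLETE task sketch using the documented parameters (the provider, model, and prompt values are illustrative):

json
{
  "name": "summarize_text",
  "taskReferenceName": "summarize",
  "type": "LLM_TEXT_COMPLETE",
  "inputParameters": {
    "llmProvider": "openai",
    "model": "gpt-4o-mini",
    "prompt": "Summarize in one sentence: ${workflow.input.text}",
    "temperature": 0.3,
    "maxTokens": 200
  }
}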

LLM_GENERATE_EMBEDDINGS

Convert text to vector embeddings for semantic search.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name |
| model | String | Yes | Embedding model (e.g., text-embedding-3-small) |
| text | String | Yes | Text to embed |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| result | Array<Number> | Vector embedding (e.g., 1536 dimensions for OpenAI) |

GENERATE_IMAGE

Generate images from text prompts.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name (e.g., openai) |
| model | String | Yes | Image model (e.g., dall-e-3) |
| prompt | String | Yes | Image description |
| width | Integer | No | Image width in pixels |
| height | Integer | No | Image height in pixels |
| n | Integer | No | Number of images to generate |
| style | String | No | Style preset (e.g., vivid, natural) |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| url | String | URL to generated image |
| b64_json | String | Base64-encoded image data (if requested) |

GENERATE_AUDIO

Text-to-speech synthesis.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name |
| model | String | Yes | TTS model (e.g., tts-1, tts-1-hd) |
| text | String | Yes | Text to convert to speech |
| voice | String | No | Voice selection (e.g., alloy, echo, nova) |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| media | Array | Media items with location (URL/path) and mimeType |

GENERATE_VIDEO

Generate videos from text or image prompts. This is an async task -- it submits a generation job and polls for completion automatically.

Supported Providers: OpenAI (Sora-2), Google Vertex AI (Veo 2/3)

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| llmProvider | String | Yes | Provider name (openai, vertex_ai, or google_gemini) |
| model | String | Yes | Video model (e.g., sora-2, veo-3) |
| prompt | String | Yes | Text description of the video to generate |
| duration | Integer | No | Duration in seconds (OpenAI: 4, 8, or 12; default: 5) |
| size | String | No | Video dimensions, e.g., 1280x720 (OpenAI) |
| aspectRatio | String | No | Aspect ratio, e.g., 16:9, 9:16 (Gemini) |
| resolution | String | No | Resolution preset: 720p, 1080p (Gemini) |
| style | String | No | Style preset (e.g., cinematic) |
| n | Integer | No | Number of videos to generate (default: 1) |
| inputImage | String | No | URL or base64 image for image-to-video generation |
| negativePrompt | String | No | What to exclude from the video (Gemini) |
| personGeneration | String | No | Person policy: dont_allow, allow_adult (Gemini) |
| generateAudio | Boolean | No | Generate audio with video (Gemini Veo 3+) |
| seed | Integer | No | Seed for reproducibility |
| maxDurationSeconds | Integer | No | Hard limit on video duration |
| maxCostDollars | Float | No | Estimated cost limit |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| media | Array | Generated media items (video MP4 + optional thumbnail) |
| media[].location | String | HTTP URL to the stored video or thumbnail file |
| media[].mimeType | String | MIME type (video/mp4 for video, image/webp for thumbnail) |
| jobId | String | Provider's async job ID |
| status | String | Final status (COMPLETED or FAILED) |
| pollCount | Integer | Number of polling iterations |

Provider-Specific Notes:

  • OpenAI Sora: Supports sora-2 and sora-2-pro models. Valid durations are 4, 8, or 12 seconds. Valid sizes: 1280x720, 720x1280, 1792x1024, 1024x1792. Returns video + webp thumbnail.
  • Google Gemini Veo: Supports veo-2.0-generate-001, veo-3.0, veo-3.1. Use llmProvider as google_gemini or vertex_ai. When using API key, no GCP credentials needed. Veo 3+ supports audio generation.

LLM_INDEX_TEXT

Store text with auto-generated embeddings in a vector database.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| vectorDB | String | Yes | Configured vector database instance name |
| namespace | String | No | Namespace for organization |
| index | String | Yes | Index name |
| embeddingModelProvider | String | Yes | Provider for embeddings |
| embeddingModel | String | Yes | Embedding model name |
| text | String | Yes | Text to index |
| docId | String | No | Document identifier (auto-generated if not provided) |
| metadata | Object | No | Additional metadata to store |

LLM_STORE_EMBEDDINGS

Store pre-computed embeddings in a vector database.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| vectorDB | String | Yes | Configured vector database instance name |
| namespace | String | No | Namespace for organization |
| index | String | Yes | Index name |
| embeddings | Array<Number> | Yes | Pre-computed embedding vector |
| docId | String | Yes | Document identifier |
| metadata | Object | No | Additional metadata |
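
A sketch of storing a vector produced by an upstream LLM_GENERATE_EMBEDDINGS task (the instance name postgres-prod and the embeddings task reference are illustrative):

json
{
  "name": "store_embeddings",
  "taskReferenceName": "store",
  "type": "LLM_STORE_EMBEDDINGS",
  "inputParameters": {
    "vectorDB": "postgres-prod",
    "namespace": "documentation",
    "index": "tech_docs",
    "embeddings": "${embeddings.output.result}",
    "docId": "doc_002",
    "metadata": { "source": "manual" }
  }
}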

LLM_SEARCH_INDEX

Semantic search using a text query (auto-generates embeddings).

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| vectorDB | String | Yes | Configured vector database instance name |
| namespace | String | No | Namespace to search |
| index | String | Yes | Index name |
| embeddingModelProvider | String | Yes | Provider for query embedding |
| embeddingModel | String | Yes | Embedding model name |
| query | String | Yes | Search query text |
| llmMaxResults | Integer | No | Maximum results to return (default: 10) |

LLM_SEARCH_EMBEDDINGS

Search using pre-computed embedding vectors.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| vectorDB | String | Yes | Configured vector database instance name |
| namespace | String | No | Namespace to search |
| index | String | Yes | Index name |
| embeddings | Array<Number> | Yes | Query embedding vector |
| llmMaxResults | Integer | No | Maximum results to return |
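
This pairs naturally with LLM_GENERATE_EMBEDDINGS; a sketch that searches with a query vector produced by an upstream task referenced as embeddings (instance and index names are illustrative):

json
{
  "name": "search_by_vector",
  "taskReferenceName": "vector_search",
  "type": "LLM_SEARCH_EMBEDDINGS",
  "inputParameters": {
    "vectorDB": "postgres-prod",
    "namespace": "documentation",
    "index": "tech_docs",
    "embeddings": "${embeddings.output.result}",
    "llmMaxResults": 5
  }
}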

LLM_GET_EMBEDDINGS

Retrieve stored embeddings by document ID.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| vectorDB | String | Yes | Configured vector database instance name |
| namespace | String | No | Namespace |
| index | String | Yes | Index name |
| docId | String | Yes | Document identifier |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| result | Array<Number> | Stored embedding vector |
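
A minimal retrieval sketch (instance, index, and document names are illustrative):

json
{
  "name": "get_embeddings",
  "taskReferenceName": "fetch_vector",
  "type": "LLM_GET_EMBEDDINGS",
  "inputParameters": {
    "vectorDB": "postgres-prod",
    "namespace": "documentation",
    "index": "tech_docs",
    "docId": "doc_001"
  }
}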

GENERATE_PDF

Convert markdown text to a PDF document. Supports full GitHub Flavored Markdown including headings, tables, code blocks, lists, task lists, blockquotes, images, links, and inline formatting. No external API keys required -- uses built-in Apache PDFBox rendering.

Inputs:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| markdown | String | Yes | - | Markdown text to convert to PDF |
| pageSize | String | No | A4 | Page size: A4, LETTER, LEGAL, A3, A5 |
| marginTop | Number | No | 72 | Top margin in points (72pt = 1 inch) |
| marginRight | Number | No | 72 | Right margin in points |
| marginBottom | Number | No | 72 | Bottom margin in points |
| marginLeft | Number | No | 72 | Left margin in points |
| theme | String | No | default | Style preset: default or compact |
| baseFontSize | Number | No | 11 | Base font size in points |
| outputLocation | String | No | auto | Output URI (e.g., file:///tmp/report.pdf). Defaults to payload store. |
| pdfMetadata | Object | No | - | PDF metadata: title, author, subject, keywords |
| imageBaseUrl | String | No | - | Base URL for resolving relative image paths |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| result.location | String | URI of the generated PDF file |
| result.sizeBytes | Integer | Size of the generated PDF in bytes |
| media | Array | Media items with location and mimeType (application/pdf) |
| finishReason | String | COMPLETED on success |

Supported Markdown Features:

| Feature | Syntax |
|---------|--------|
| Headings | `# H1` through `###### H6` |
| Bold / Italic | `**bold**`, `*italic*`, `***both***` |
| Tables | GFM pipe tables with header row |
| Code blocks | Fenced (`` ``` ``) and indented code blocks |
| Bullet lists | `- item` or `* item` (nested supported) |
| Ordered lists | `1. item` (nested supported) |
| Task lists | `- [x] done`, `- [ ] todo` |
| Blockquotes | `> quoted text` |
| Links | `[text](url)` (rendered as clickable PDF links) |
| Images | `![alt](url)` (HTTP/HTTPS, file://, data: URIs, relative paths) |
| Horizontal rules | `---` |
| Strikethrough | `~~strikethrough~~` |
| Inline code | `` `code` `` |
| Footnotes | `[^1]` references |

LIST_MCP_TOOLS

List available tools from an MCP (Model Context Protocol) server.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| mcpServer | String | Yes | MCP server URL (e.g., http://localhost:3000/mcp) |
| headers | Object | No | HTTP headers for authentication |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| tools | Array | Tool definitions with name, description, and inputSchema |

CALL_MCP_TOOL

Call a specific tool on an MCP server.

Inputs:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| mcpServer | String | Yes | MCP server URL |
| method | String | Yes | Tool name to call |
| headers | Object | No | HTTP headers for authentication |
| * | Any | No | All other parameters passed as tool arguments |

Outputs:

| Field | Type | Description |
|-------|------|-------------|
| content | Array | Result content items with type and text |
| isError | Boolean | Whether the call resulted in an error |

Configuration

Global Configuration

Add to your application.properties or application.yml:

properties
# Enable AI integrations and workers (default: false, must be explicitly enabled)
conductor.integrations.ai.enabled=true

# Payload storage location for large AI inputs/outputs (optional)
conductor.ai.payload-store-location=/tmp/conductor-ai

Note: AI workers are disabled by default. You must set conductor.integrations.ai.enabled=true to enable them.

Vector Database Configuration

Vector databases support multiple named instances. For detailed configuration options and examples, see Vector Database Configuration.

JDBC Configuration

JDBC connections support multiple named instances for the JDBC worker task. For detailed configuration options, migration guide, and examples, see JDBC Configuration.

Provider-Specific Configuration (LLM)

OpenAI

properties
conductor.ai.openai.api-key=${OPENAI_API_KEY}
conductor.ai.openai.base-url=https://api.openai.com/v1
conductor.ai.openai.organization-id=org-xxxxx
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | OpenAI API key |
| base-url | No | https://api.openai.com/v1 | API base URL |
| organization-id | No | - | Organization ID |

Anthropic

properties
conductor.ai.anthropic.api-key=${ANTHROPIC_API_KEY}
conductor.ai.anthropic.base-url=https://api.anthropic.com
conductor.ai.anthropic.version=2023-06-01
conductor.ai.anthropic.beta-version=prompt-caching-2024-07-31
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Anthropic API key |
| base-url | No | https://api.anthropic.com | API base URL |
| version | No | - | API version |
| beta-version | No | - | Beta features (e.g., prompt caching) |
| completions-path | No | - | Custom completions endpoint path |

Google Gemini / Vertex AI

Use llmProvider as either google_gemini or vertex_ai (both resolve to the same provider).

properties
# Option 1: API key (simplest — works for image/video/audio gen)
conductor.ai.gemini.api-key=${GEMINI_API_KEY}

# Option 2: Vertex AI credentials (required for chat completions and embeddings)
conductor.ai.gemini.project-id=${GOOGLE_CLOUD_PROJECT}
conductor.ai.gemini.location=us-central1
conductor.ai.gemini.publisher=google
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | No | - | Gemini API key from Google AI Studio |
| project-id | No | - | GCP project ID (required for chat/embeddings via Vertex AI) |
| location | No | - | GCP region (e.g., us-central1) |
| base-url | No | {location}-aiplatform.googleapis.com:443 | API endpoint |
| publisher | No | - | Model publisher |

Note: When api-key is set, image/video/audio generation uses the Google AI API directly. Chat completions and embeddings require Vertex AI credentials (project-id + Application Default Credentials or service account). Both can be configured simultaneously.

Azure OpenAI

properties
conductor.ai.azureopenai.api-key=${AZURE_OPENAI_API_KEY}
conductor.ai.azureopenai.base-url=${AZURE_OPENAI_ENDPOINT}
conductor.ai.azureopenai.deployment-name=gpt-4o-mini
conductor.ai.azureopenai.user=your-user-id
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Azure OpenAI API key |
| base-url | Yes | - | Azure resource endpoint |
| deployment-name | No | - | Deployment name |
| user | No | - | User identifier for tracking |

AWS Bedrock

properties
conductor.ai.bedrock.access-key=${AWS_ACCESS_KEY_ID}
conductor.ai.bedrock.secret-key=${AWS_SECRET_ACCESS_KEY}
conductor.ai.bedrock.region=us-east-1
# OR use bearer token for AWS SSO/temporary credentials
conductor.ai.bedrock.bearer-token=${AWS_SESSION_TOKEN}
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| access-key | Yes* | - | AWS access key ID |
| secret-key | Yes* | - | AWS secret access key |
| region | No | us-east-1 | AWS region |
| bearer-token | No | - | AWS session token (for temporary credentials) |

* Required unless using bearer token or IAM roles
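
A chat task sketch against Bedrock; the llmProvider value bedrock mirrors the property prefix above, and the Bedrock model identifier shown is an assumption -- use a model ID enabled in your AWS account:

json
{
  "name": "bedrock_chat",
  "taskReferenceName": "chat",
  "type": "LLM_CHAT_COMPLETE",
  "inputParameters": {
    "llmProvider": "bedrock",
    "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [
      { "role": "user", "message": "Hello from Bedrock" }
    ]
  }
}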

Mistral AI

properties
conductor.ai.mistral.api-key=${MISTRAL_API_KEY}
conductor.ai.mistral.base-url=https://api.mistral.ai
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Mistral AI API key |
| base-url | No | https://api.mistral.ai | API base URL |

Cohere

properties
conductor.ai.cohere.api-key=${COHERE_API_KEY}
conductor.ai.cohere.base-url=https://api.cohere.ai
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Cohere API key |
| base-url | No | https://api.cohere.ai | API base URL |

Grok (xAI)

properties
conductor.ai.grok.api-key=${GROK_API_KEY}
conductor.ai.grok.base-url=https://api.x.ai/v1
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Grok API key |
| base-url | No | https://api.x.ai/v1 | API base URL |

Perplexity AI

properties
conductor.ai.perplexity.api-key=${PERPLEXITY_API_KEY}
conductor.ai.perplexity.base-url=https://api.perplexity.ai
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Perplexity API key |
| base-url | No | https://api.perplexity.ai | API base URL |

HuggingFace

properties
conductor.ai.huggingface.api-key=${HUGGINGFACE_API_KEY}
conductor.ai.huggingface.base-url=https://api-inference.huggingface.co/models
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | HuggingFace API token |
| base-url | No | https://api-inference.huggingface.co/models | API base URL |

Ollama (Local)

properties
conductor.ai.ollama.base-url=http://localhost:11434
conductor.ai.ollama.auth-header-name=Authorization
conductor.ai.ollama.auth-header=Bearer token-here
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| base-url | No | http://localhost:11434 | Ollama server URL |
| auth-header-name | No | - | Custom auth header name |
| auth-header | No | - | Custom auth header value |
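
Task definitions look the same for local models; a sketch assuming you have pulled the model with ollama pull llama3:

json
{
  "name": "local_chat",
  "taskReferenceName": "chat",
  "type": "LLM_CHAT_COMPLETE",
  "inputParameters": {
    "llmProvider": "ollama",
    "model": "llama3",
    "messages": [
      { "role": "user", "message": "Explain vector embeddings in one paragraph." }
    ]
  }
}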

Stability AI

properties
conductor.ai.stabilityai.api-key=${STABILITY_API_KEY}
| Property | Required | Default | Description |
|----------|----------|---------|-------------|
| api-key | Yes | - | Stability AI API key |

Supported models: sd3.5-large, sd3.5-large-turbo, sd3.5-medium, sd3-large, sd3-medium, core (Stable Image Core), ultra (Stable Image Ultra). The endpoint is selected automatically based on the model name.
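
A GENERATE_IMAGE sketch for Stability AI; the llmProvider value stabilityai mirrors the property prefix above and is an assumption:

json
{
  "name": "generate_image_sd",
  "taskReferenceName": "sd_image",
  "type": "GENERATE_IMAGE",
  "inputParameters": {
    "llmProvider": "stabilityai",
    "model": "sd3.5-large",
    "prompt": "An isometric illustration of a data pipeline",
    "width": 1024,
    "height": 1024
  }
}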

Environment Variables

The AI module reads from standard environment variables automatically. Set the environment variable for a provider and it will be enabled -- no need to edit properties files.

Quick Reference

| Provider | Environment Variable | Description |
|----------|----------------------|-------------|
| OpenAI | OPENAI_API_KEY | API key from platform.openai.com |
| OpenAI | OPENAI_ORG_ID | Optional organization ID |
| Anthropic | ANTHROPIC_API_KEY | API key from console.anthropic.com |
| Mistral AI | MISTRAL_API_KEY | API key from console.mistral.ai |
| Cohere | COHERE_API_KEY | API key from dashboard.cohere.com |
| Grok / xAI | XAI_API_KEY | API key from x.ai |
| Perplexity | PERPLEXITY_API_KEY | API key from perplexity.ai |
| HuggingFace | HUGGINGFACE_API_KEY | Token from huggingface.co/settings/tokens |
| Stability AI | STABILITY_API_KEY | API key from platform.stability.ai |
| Azure OpenAI | AZURE_OPENAI_API_KEY | API key from Azure portal |
| Azure OpenAI | AZURE_OPENAI_ENDPOINT | Endpoint URL (e.g., https://your-resource.openai.azure.com) |
| Azure OpenAI | AZURE_OPENAI_DEPLOYMENT | Deployment name |
| AWS Bedrock | AWS_ACCESS_KEY_ID | AWS access key |
| AWS Bedrock | AWS_SECRET_ACCESS_KEY | AWS secret key |
| AWS Bedrock | AWS_REGION | AWS region (default: us-east-1) |
| Google Gemini | GEMINI_API_KEY | Gemini API key from Google AI Studio |
| Google Gemini | GOOGLE_CLOUD_PROJECT | GCP project ID (for Vertex AI chat/embeddings) |
| Google Gemini | GOOGLE_CLOUD_LOCATION | GCP region (default: us-central1) |
| Google Gemini | GOOGLE_APPLICATION_CREDENTIALS | Path to service account JSON file |
| Ollama | OLLAMA_HOST | Ollama server URL (default: http://localhost:11434) |

Usage

Linux/macOS:

bash
export OPENAI_API_KEY=sk-your-api-key
export ANTHROPIC_API_KEY=sk-ant-your-api-key
./gradlew bootRun

Windows (PowerShell):

powershell
$env:OPENAI_API_KEY = "sk-your-api-key"
$env:ANTHROPIC_API_KEY = "sk-ant-your-api-key"
.\gradlew bootRun

Note: Explicit property values in application.properties or external configuration files (e.g., conductor.properties) take precedence over environment variables.

Docker

Docker Run

Pass environment variables using -e flags:

bash
docker run -d \
  -p 8080:8080 \
  -e OPENAI_API_KEY=sk-your-api-key \
  -e ANTHROPIC_API_KEY=sk-ant-your-api-key \
  conductor:server

Docker Compose

Create a docker-compose.yml:

yaml
version: '3.8'
services:
  conductor:
    image: conductor:server
    ports:
      - "8080:8080"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - MISTRAL_API_KEY=${MISTRAL_API_KEY}
      # Add other providers as needed

Create a .env file in the same directory:

bash
OPENAI_API_KEY=sk-your-api-key
ANTHROPIC_API_KEY=sk-ant-your-api-key
MISTRAL_API_KEY=your-mistral-key

Run with:

bash
docker-compose up -d

Google Gemini with Docker

Using API key (simplest):

bash
docker run -d \
  -p 8080:8080 \
  -e GEMINI_API_KEY=your-api-key \
  conductor:server

Using Vertex AI credentials (for chat/embeddings):

bash
docker run -d \
  -p 8080:8080 \
  -e GOOGLE_CLOUD_PROJECT=your-project-id \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/config/credentials.json \
  -v /path/to/credentials.json:/app/config/credentials.json:ro \
  conductor:server

When running on GKE with Workload Identity, credentials are provided automatically by the platform.

AWS Bedrock with Docker

Using environment variables:

bash
docker run -d \
  -p 8080:8080 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  -e AWS_REGION=us-east-1 \
  conductor:server

Or mount your AWS credentials directory:

bash
docker run -d \
  -p 8080:8080 \
  -v ~/.aws:/root/.aws:ro \
  conductor:server

Sample Workflows

1. Chat Completion (Conversational AI)

json
{
  "name": "chat_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "chat_task",
      "taskReferenceName": "chat",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "system",
            "message": "You are a helpful assistant."
          },
          {
            "role": "user",
            "message": "What is the capital of France?"
          }
        ],
        "temperature": 0.7,
        "maxTokens": 500
      }
    }
  ]
}

Output:

json
{
  "result": "The capital of France is Paris.",
  "finishReason": "STOP",
  "tokenUsed": 33,
  "promptTokens": 25,
  "completionTokens": 8
}

2. Generate Embeddings

json
{
  "name": "embedding_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_embeddings",
      "taskReferenceName": "embeddings",
      "type": "LLM_GENERATE_EMBEDDINGS",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "text-embedding-3-small",
        "text": "Conductor is an orchestration platform"
      }
    }
  ]
}

Output:

json
{
  "result": [0.123, -0.456, 0.789, ...]  // 1536-dimensional vector
}

3. Image Generation

json
{
  "name": "image_gen_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_image",
      "taskReferenceName": "image",
      "type": "GENERATE_IMAGE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "dall-e-3",
        "prompt": "A futuristic cityscape at sunset",
        "width": 1024,
        "height": 1024,
        "n": 1,
        "style": "vivid"
      }
    }
  ]
}

Output:

json
{
  "url": "https://...",
  "b64_json": "base64-encoded-image-data"
}

4. Audio Generation (Text-to-Speech)

json
{
  "name": "tts_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_audio",
      "taskReferenceName": "audio",
      "type": "GENERATE_AUDIO",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "tts-1",
        "text": "Hello, this is a test of text to speech.",
        "voice": "alloy"
      }
    }
  ]
}

Output:

json
{
  "media": [
    {
      "location": "https://...",
      "mimeType": "audio/mpeg"
    }
  ]
}

5. Semantic Search with Vector DB

json
{
  "name": "semantic_search_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "index_documents",
      "taskReferenceName": "index",
      "type": "LLM_INDEX_TEXT",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "namespace": "documentation",
        "index": "tech_docs",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "text": "Conductor is a workflow orchestration platform",
        "docId": "doc_001"
      }
    },
    {
      "name": "search_documents",
      "taskReferenceName": "search",
      "type": "LLM_SEARCH_INDEX",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "namespace": "documentation",
        "index": "tech_docs",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "query": "workflow orchestration",
        "llmMaxResults": 5
      }
    }
  ]
}

Output:

json
{
  "result": [
    {
      "docId": "doc_001",
      "score": 0.95,
      "text": "Conductor is a workflow orchestration platform"
    }
  ]
}

6. RAG (Retrieval Augmented Generation)

A basic RAG workflow that searches a knowledge base and generates an answer:

json
{
  "name": "rag_workflow",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "search_knowledge_base",
      "taskReferenceName": "search",
      "type": "LLM_SEARCH_INDEX",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "namespace": "kb",
        "index": "articles",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "query": "${workflow.input.question}",
        "llmMaxResults": 3
      }
    },
    {
      "name": "generate_answer",
      "taskReferenceName": "answer",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "anthropic",
        "model": "claude-3-5-sonnet-20241022",
        "messages": [
          {
            "role": "system",
            "message": "Answer based on the following context: ${search.output.result}"
          },
          {
            "role": "user",
            "message": "${workflow.input.question}"
          }
        ],
        "temperature": 0.3
      }
    }
  ]
}

Complete RAG Demo (Index + Search + Answer)

A self-contained workflow that indexes documents, searches them, and generates an answer:

json
{
  "name": "complete_rag_demo",
  "description": "Index documents, search, and generate RAG answer",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "index_doc_1",
      "taskReferenceName": "index_doc_1_ref",
      "type": "LLM_INDEX_TEXT",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "index": "demo_index",
        "namespace": "demo_docs",
        "docId": "intro-001",
        "text": "Conductor is a distributed workflow orchestration engine that runs in the cloud. It allows developers to build complex stateful applications by orchestrating microservices.",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "dimensions": 1536,
        "metadata": { "category": "introduction" }
      }
    },
    {
      "name": "index_doc_2",
      "taskReferenceName": "index_doc_2_ref",
      "type": "LLM_INDEX_TEXT",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "index": "demo_index",
        "namespace": "demo_docs",
        "docId": "features-002",
        "text": "Conductor supports multiple vector databases including PostgreSQL (pgvector), MongoDB Atlas, and Pinecone. It also integrates with LLM providers like OpenAI, Anthropic, and Azure OpenAI.",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "dimensions": 1536,
        "metadata": { "category": "features" }
      }
    },
    {
      "name": "index_doc_3",
      "taskReferenceName": "index_doc_3_ref",
      "type": "LLM_INDEX_TEXT",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "index": "demo_index",
        "namespace": "demo_docs",
        "docId": "config-003",
        "text": "You can configure multiple named instances of the same vector database type for different environments like production, development, and staging.",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "dimensions": 1536,
        "metadata": { "category": "configuration" }
      }
    },
    {
      "name": "search_index",
      "taskReferenceName": "search_ref",
      "type": "LLM_SEARCH_INDEX",
      "inputParameters": {
        "vectorDB": "postgres-prod",
        "index": "demo_index",
        "namespace": "demo_docs",
        "query": "What vector databases does Conductor support?",
        "embeddingModelProvider": "openai",
        "embeddingModel": "text-embedding-3-small",
        "dimensions": 1536,
        "maxResults": 3
      }
    },
    {
      "name": "generate_rag_answer",
      "taskReferenceName": "answer_ref",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "system",
            "message": "You are a technical expert. Answer the question using only the provided context."
          },
          {
            "role": "user",
            "message": "Context:\n${search_ref.output.result}\n\nQuestion: What vector databases does Conductor support?"
          }
        ],
        "temperature": 0.2
      }
    }
  ],
  "outputParameters": {
    "indexed_docs": ["${index_doc_1_ref.output}", "${index_doc_2_ref.output}", "${index_doc_3_ref.output}"],
    "search_results": "${search_ref.output.result}",
    "answer": "${answer_ref.output.result}"
  }
}

Run without input:

bash
curl -X POST 'http://localhost:8080/api/workflow/complete_rag_demo' \
  -H 'Content-Type: application/json' \
  -d '{}'
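
The POST returns the new workflow ID; you can then fetch the execution state and output through the standard Conductor API:

bash
# Start the workflow and capture the returned workflow ID
WORKFLOW_ID=$(curl -s -X POST 'http://localhost:8080/api/workflow/complete_rag_demo' \
  -H 'Content-Type: application/json' \
  -d '{}')

# Inspect the execution status and final output
curl -s "http://localhost:8080/api/workflow/${WORKFLOW_ID}" | jq '.status, .output'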

7. MCP (Model Context Protocol) Tool Integration

MCP allows workflows to interact with external tools and data sources via HTTP/HTTPS or stdio (local) servers.

List Tools from MCP Server

json
{
  "name": "mcp_list_tools_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "list_mcp_tools",
      "taskReferenceName": "list_tools",
      "type": "LIST_MCP_TOOLS",
      "inputParameters": {
        "mcpServer": "http://localhost:3000/mcp"
      }
    }
  ]
}

Output:

json
{
  "tools": [
    {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "inputSchema": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }
  ]
}

The Model Context Protocol supports multiple transport types:

  • Streamable HTTP (default): Standard HTTP/HTTPS endpoints (recommended per MCP spec 2025-11-25)
  • SSE (deprecated): Only used when URL explicitly contains /sse endpoint

Call MCP Tool (HTTP Server)

json
{
  "name": "mcp_weather_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "get_weather",
      "taskReferenceName": "weather",
      "type": "CALL_MCP_TOOL",
      "inputParameters": {
        "mcpServer": "http://localhost:3000/mcp",
        "method": "get_weather",
        "location": "New York",
        "units": "fahrenheit"
      }
    }
  ]
}

Output:

json
{
  "content": [
    {
      "type": "text",
      "text": "Current weather in New York: 72°F, Partly cloudy"
    }
  ],
  "isError": false
}

MCP Server URL Formats:

  • HTTP: http://localhost:3000 (uses Streamable HTTP transport)
  • HTTP/SSE (deprecated): http://localhost:3000/sse
  • HTTP/Streamable: http://localhost:3000/mcp
  • HTTPS: https://api.example.com/mcp

Note: All input parameters except mcpServer, method, and headers are automatically passed as arguments to the MCP tool.

MCP + AI Agent Workflow

Complete example combining MCP tools with LLM for autonomous agent behavior:

json
{
  "name": "mcp_ai_agent_workflow",
    "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "list_available_tools",
      "taskReferenceName": "discover_tools",
      "type": "LIST_MCP_TOOLS",
      "inputParameters": {
        "mcpServer": "http://localhost:3000/mcp"
      }
    },
    {
      "name": "decide_which_tools_to_use",
      "taskReferenceName": "plan",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "anthropic",
        "model": "claude-3-5-sonnet-20241022",
        "messages": [
          {
            "role": "system",
            "message": "You are an AI agent. Available tools: ${discover_tools.output.tools}. User wants to: ${workflow.input.task}"
          },
          {
            "role": "user",
            "message": "Which tool should I use and what parameters? Respond with JSON: {method: string, arguments: object}"
          }
        ],
        "temperature": 0.1,
        "maxTokens": 500
      }
    },
    {
      "name": "execute_tool",
      "taskReferenceName": "execute",
      "type": "CALL_MCP_TOOL",
      "inputParameters": {
        "mcpServer": "http://localhost:3000/mcp",
        "method": "${plan.output.result.method}",
        "arguments": "${plan.output.result.arguments}"
      }
    },
    {
      "name": "summarize_result",
      "taskReferenceName": "summarize",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "message": "Summarize this result for the user: ${execute.output.content}"
          }
        ],
        "maxTokens": 200
      }
    }
  ]
}

Workflow Input:

json
{
  "task": "Get the current weather in San Francisco"
}

Workflow Output:

json
{
  "discover_tools": {
    "tools": [
      {"name": "get_weather", "description": "..."},
      {"name": "calculate", "description": "..."}
    ]
  },
  "plan": {
    "result": {
      "method": "get_weather",
      "arguments": {"location": "San Francisco", "units": "fahrenheit"}
    }
  },
  "execute": {
    "content": [{"type": "text", "text": "72°F, Sunny"}]
  },
  "summarize": {
    "result": "The current weather in San Francisco is 72°F and sunny."
  }
}

8. Video Generation (OpenAI Sora)

json
{
  "name": "video_gen_openai_sora",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_video",
      "taskReferenceName": "sora_video",
      "type": "GENERATE_VIDEO",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "sora-2",
        "prompt": "A slow cinematic aerial shot of a coastal city at golden hour, waves crashing against cliffs",
        "duration": 8,
        "size": "1280x720",
        "n": 1,
        "style": "cinematic"
      }
    }
  ]
}

Output:

json
{
  "media": [
    {
      "location": "/api/media/.../video.mp4",
      "mimeType": "video/mp4"
    },
    {
      "location": "/api/media/.../thumbnail.webp",
      "mimeType": "image/webp"
    }
  ],
  "jobId": "video_abc123...",
  "status": "COMPLETED",
  "pollCount": 14
}

9. Video Generation (Google Gemini Veo)

json
{
  "name": "video_gen_gemini_veo",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_video",
      "taskReferenceName": "veo_video",
      "type": "GENERATE_VIDEO",
      "inputParameters": {
        "llmProvider": "vertex_ai",
        "model": "veo-3",
        "prompt": "A time-lapse of a blooming flower in a sunlit garden, soft bokeh background",
        "duration": 8,
        "aspectRatio": "16:9",
        "resolution": "720p",
        "personGeneration": "dont_allow",
        "generateAudio": true,
        "negativePrompt": "blurry, low quality, text overlay",
        "n": 1
      }
    }
  ]
}

10. Multi-Step Pipeline (Image + Video)

A workflow that generates an image and a video in sequence:

json
{
  "name": "image_to_video_pipeline",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_image",
      "taskReferenceName": "source_image",
      "type": "GENERATE_IMAGE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "dall-e-3",
        "prompt": "A serene mountain lake at dawn with mist rising from the water",
        "width": 1792,
        "height": 1024,
        "n": 1
      }
    },
    {
      "name": "generate_video",
      "taskReferenceName": "animated_video",
      "type": "GENERATE_VIDEO",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "sora-2",
        "prompt": "A serene mountain lake at dawn, gentle ripples spread across the water as mist slowly drifts",
        "duration": 8,
        "size": "1280x720",
        "style": "cinematic"
      }
    }
  ]
}

11. PDF Generation (Markdown to PDF)

Generate a PDF document from markdown content with layout options and metadata:

json
{
  "name": "pdf_generation_workflow",
  "version": 1,
  "schemaVersion": 2,
  "tasks": [
    {
      "name": "generate_pdf",
      "taskReferenceName": "pdf",
      "type": "GENERATE_PDF",
      "inputParameters": {
        "markdown": "# Sales Report\n\n## Summary\n\nTotal revenue: **$5.4M**\n\n| Region | Revenue | Growth |\n|--------|---------|--------|\n| North America | $2.4M | +12% |\n| Europe | $1.8M | +8% |\n\n## Recommendations\n\n1. Expand APAC sales team\n2. Launch enterprise tier in EU\n\n> *Our best quarter yet.*",
        "pageSize": "LETTER",
        "theme": "default",
        "pdfMetadata": {
          "title": "Sales Report - Q4 2025",
          "author": "Conductor Workflow"
        }
      }
    }
  ]
}

Output:

json
{
  "result": {
    "location": "file:///tmp/conductor/wf-123/task-456/abc.pdf",
    "sizeBytes": 12345
  },
  "media": [
    {
      "location": "file:///tmp/conductor/wf-123/task-456/abc.pdf",
      "mimeType": "application/pdf"
    }
  ],
  "finishReason": "COMPLETED"
}

12. LLM-to-PDF Pipeline (Report Generation)

A multi-step workflow that uses an LLM to generate a markdown report and then converts it to PDF:

json
{
  "name": "llm_to_pdf_pipeline",
  "version": 1,
  "schemaVersion": 2,
  "inputParameters": ["topic", "audience"],
  "tasks": [
    {
      "name": "generate_report_markdown",
      "taskReferenceName": "llm_report",
      "type": "LLM_CHAT_COMPLETE",
      "inputParameters": {
        "llmProvider": "openai",
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "system",
            "message": "You are a professional report writer. Generate well-structured markdown reports."
          },
          {
            "role": "user",
            "message": "Write a report about: ${workflow.input.topic}\nAudience: ${workflow.input.audience}"
          }
        ],
        "temperature": 0.7,
        "maxTokens": 2000
      }
    },
    {
      "name": "convert_to_pdf",
      "taskReferenceName": "pdf_output",
      "type": "GENERATE_PDF",
      "inputParameters": {
        "markdown": "${llm_report.output.result}",
        "pageSize": "A4",
        "pdfMetadata": {
          "title": "${workflow.input.topic}",
          "author": "Conductor AI Pipeline"
        }
      }
    }
  ],
  "outputParameters": {
    "reportMarkdown": "${llm_report.output.result}",
    "pdfLocation": "${pdf_output.output.result.location}",
    "pdfSizeBytes": "${pdf_output.output.result.sizeBytes}"
  }
}

Workflow Input:

json
{
  "topic": "Cloud Migration Best Practices",
  "audience": "CTO and engineering leadership"
}

Workflow Output:

json
{
  "reportMarkdown": "# Cloud Migration Best Practices\n\n## Executive Summary\n...",
  "pdfLocation": "file:///tmp/conductor/wf-789/task-012/report.pdf",
  "pdfSizeBytes": 28456
}

13. LLM Tool Calling with MCP Tools

Use LLM_CHAT_COMPLETE with the tools parameter to let the LLM autonomously decide when to call MCP tools. When the LLM needs to use a tool, it returns finishReason: "TOOL_CALLS" with the tool invocations.

LLM Output with Tool Calls

When the LLM decides to call tools, the output looks like this:

json
{
  "result": [],
  "media": [],
  "finishReason": "TOOL_CALLS",
  "tokenUsed": 90,
  "promptTokens": 75,
  "completionTokens": 15,
  "toolCalls": [
    {
      "taskReferenceName": "call_2prFOIfVdwS4BTAi4Z43qPGe",
      "name": "get_weather",
      "type": "MCP_TOOL",
      "inputParameters": {
        "method": "get_weather",
        "location": "Tokyo"
      }
    }
  ]
}

Key Points:

  • finishReason: "TOOL_CALLS" indicates the LLM wants to invoke tools
  • toolCalls array contains all tool invocations with their parameters
  • Each tool call has a unique taskReferenceName for workflow orchestration
  • The configParams.mcpServer in each tool definition specifies the MCP server URL
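
Putting these together, a hedged sketch of the request side with the tools parameter; the exact tool-definition schema is an assumption built from the fields named above (name, description, inputSchema, configParams.mcpServer):

json
{
  "name": "chat_with_tools",
  "taskReferenceName": "agent",
  "type": "LLM_CHAT_COMPLETE",
  "inputParameters": {
    "llmProvider": "openai",
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "message": "What is the weather in Tokyo?" }
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "inputSchema": {
          "type": "object",
          "properties": { "location": { "type": "string" } },
          "required": ["location"]
        },
        "configParams": { "mcpServer": "http://localhost:3000/mcp" }
      }
    ]
  }
}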

Enable/Disable AI Workers

Global Enable/Disable

AI workers are disabled by default for security. Enable them explicitly:

properties
# Enable all AI workers and integrations
conductor.integrations.ai.enabled=true

To disable:

properties
# Disable all AI workers (or simply omit the property)
conductor.integrations.ai.enabled=false

Conditional Provider Registration

Providers are automatically registered only when their API keys are configured. To disable a specific provider, simply remove or comment out its configuration:

properties
# OpenAI will be registered
conductor.ai.openai.api-key=sk-xxx

# Anthropic will NOT be registered (commented out)
# conductor.ai.anthropic.api-key=sk-ant-xxx

Environment-Based Configuration

Use environment variables to control which providers are enabled in different environments:

bash
# Development - use local Ollama
export OLLAMA_BASE_URL=http://localhost:11434
./gradlew bootRun

# Production - use OpenAI and Anthropic
export OPENAI_API_KEY=sk-xxx
export ANTHROPIC_API_KEY=sk-ant-xxx
./gradlew bootRun

Testing

Integration Tests

The module includes integration tests that run against real APIs when credentials are provided via environment variables:

bash
# Run all tests (integration tests skipped if no API keys)
./gradlew :conductor-ai:test

# Run with real OpenAI API
export OPENAI_API_KEY=sk-xxx
./gradlew :conductor-ai:test

# Run without integration tests
env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY ./gradlew :conductor-ai:test
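
To narrow a run to a single test class, Gradle's standard --tests filter applies (the class-name pattern here is illustrative):

bash
# Run only test classes whose names match the pattern
./gradlew :conductor-ai:test --tests '*OpenAi*'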

Test Environment Variables

| Provider | Environment Variable |
|----------|----------------------|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Mistral | MISTRAL_API_KEY |
| Grok | GROK_API_KEY |
| Cohere | COHERE_API_KEY |
| HuggingFace | HUGGINGFACE_API_KEY |
| Perplexity | PERPLEXITY_API_KEY |
| Ollama | OLLAMA_BASE_URL |
| AWS Bedrock | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY |
| Azure OpenAI | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT |
| Gemini Vertex | GOOGLE_CLOUD_PROJECT |

License

Copyright 2026 Conductor Authors. Licensed under the Apache License 2.0.