Back to Ruflo

ADR-014: Conveyor AI Chat System Architecture

ruflo/docs/adr/ADR-014-CHAT-SYSTEM-ARCHITECTURE.md

3.10.540.8 KB
Original Source

ADR-014: Conveyor AI Chat System Architecture

Status: Implemented Date: 2026-01-19 Verified: All Cloud Functions operational including suggestion-agent (see ADR-015) Author: Conveyor AI Team Deciders: Engineering, Product, DevOps, Security Related: ADR-001-EXTENSION-ARCHITECTURE, ADR-004-RUVECTOR-POSTGRES-GCP-DEPLOYMENT, ADR-011-cloud-run-extension-architecture, ADR-012-vitejs-heroui-frontend-stack, ADR-013-hybrid-data-layer-architecture, ADR-015-CLOUD-FUNCTIONS-ARCHITECTURE


Context

Conveyor AI requires a unified chat-based interface to interact with the entire platform ecosystem. Users need to:

  1. Query multiple systems - Access case data, run simulations, query databases, and sync Airtable records through natural language
  2. Execute complex workflows - Chain operations across Cloud Functions (case-manager, simulation-agent, db-query-agent, airtable-agent)
  3. Leverage AI capabilities - Use Gemini 2.5 Pro for intent classification, response generation, and context management
  4. Perform client-side ML - Run RL algorithms locally via WASM modules for low-latency predictions
  5. Integrate with Claude - Expose MCP Server for Claude Code integration and agent orchestration
  6. Real-time collaboration - Support WebSocket connections for live updates and multi-user sessions

The current architecture has isolated Cloud Functions without a unified conversational layer, requiring users to know specific API endpoints and request formats.


Decision

Implement a Chat System Architecture deployed on Google Cloud Run with the following components:

High-Level Architecture

+-----------------------------------------------------------------------------------+
|                           CONVEYOR AI CHAT SYSTEM                                  |
+-----------------------------------------------------------------------------------+
|                                                                                    |
|  +------------------------------------------+  +--------------------------------+  |
|  |           CHAT INTERFACE                 |  |        MCP SERVER              |  |
|  |  (ViteJS + HeroUI + WebSocket Client)   |  |    (Cloud Run Service)         |  |
|  |                                          |  |                                |  |
|  |  +------------------------------------+  |  |  - Claude Code Integration    |  |
|  |  | Chat Input / Message History       |  |  |  - Tool Registry              |  |
|  |  | Command Palette / Quick Actions    |  |  |  - Session Management         |  |
|  |  | WASM Module Runner (shared-ai)     |  |  |  - OAuth 2.0 Token Exchange   |  |
|  |  +------------------------------------+  |  +--------------------------------+  |
|  +------------------------------------------+                                      |
|                         |                                                          |
|                         v                                                          |
|  +-----------------------------------------------------------------------------------+
|  |                          CHAT ORCHESTRATOR (Cloud Run)                            |
|  |  +-------------+  +-----------------+  +----------------+  +----------------+    |
|  |  | Intent      |  | Command         |  | Context        |  | Response       |    |
|  |  | Classifier  |  | Parser          |  | Manager        |  | Generator      |    |
|  |  | (Gemini)    |  | (Regex + NLP)   |  | (Redis/PG)     |  | (Gemini 2.5)   |    |
|  |  +-------------+  +-----------------+  +----------------+  +----------------+    |
|  +-----------------------------------------------------------------------------------+
|                         |                                                          |
|                         v                                                          |
|  +-----------------------------------------------------------------------------------+
|  |                           FUNCTION DISPATCHER                                     |
|  |                                                                                   |
|  |     +----------------+  +----------------+  +----------------+  +-------------+   |
|  |     | case-manager   |  | simulation-    |  | db-query-      |  | airtable-   |   |
|  |     | Cloud Function |  | agent          |  | agent          |  | agent       |   |
|  |     |                |  | Cloud Function |  | Cloud Function |  | Cloud Func  |   |
|  |     +----------------+  +----------------+  +----------------+  +-------------+   |
|  |                                                                                   |
|  +-----------------------------------------------------------------------------------+
|                         |                                                          |
|                         v                                                          |
|  +-----------------------------------------------------------------------------------+
|  |                           DATA LAYER                                              |
|  |  +-------------------+  +-------------------+  +--------------------+             |
|  |  | Cloud SQL         |  | Airtable          |  | Secret Manager     |             |
|  |  | PostgreSQL        |  | (Cases, Clients)  |  | (API Keys, OAuth)  |             |
|  |  | (RuVector + RL)   |  |                   |  |                    |             |
|  |  +-------------------+  +-------------------+  +--------------------+             |
|  +-----------------------------------------------------------------------------------+
|                                                                                    |
+-----------------------------------------------------------------------------------+

Component Architecture

                                    USER
                                      |
                    +----------------+------------------+
                    |                                   |
                    v                                   v
            +---------------+                  +-----------------+
            | Web Interface |                  | Claude Code     |
            | (chat-ui)     |                  | (MCP Protocol)  |
            +-------+-------+                  +--------+--------+
                    |                                   |
                    |     WebSocket (wss://)            |
                    |                                   |
                    v                                   v
            +----------------------------------------------+
            |           chat-orchestrator                   |
            |              (Cloud Run)                      |
            |                                               |
            |  +------------------------------------------+ |
            |  |              MESSAGE ROUTER              | |
            |  |                                          | |
            |  |   /chat  --> Intent Classifier           | |
            |  |   /cmd   --> Command Parser              | |
            |  |   /mcp   --> MCP Protocol Handler        | |
            |  |   /ws    --> WebSocket Manager           | |
            |  +------------------------------------------+ |
            |                                               |
            |  +------------------------------------------+ |
            |  |            GEMINI INTEGRATION            | |
            |  |                                          | |
            |  |   Model: gemini-2.5-pro                  | |
            |  |   Context Window: 1M tokens             | |
            |  |   Function Calling: Enabled             | |
            |  +------------------------------------------+ |
            +----------------------------------------------+
                    |
                    | HTTPS (Internal)
                    |
    +---------------+---------------+---------------+
    |               |               |               |
    v               v               v               v
+--------+   +-----------+   +----------+   +---------+
| case-  |   | simulation|   | db-query |   | airtable|
| manager|   | -agent    |   | -agent   |   | -agent  |
+--------+   +-----------+   +----------+   +---------+

Command Syntax Specification

Command Structure

/<command> [subcommand] [--flag=value] [positional args]

Available Commands

CommandSubcommandDescriptionExample
/caselistList cases with filters/case list --status=active --limit=10
/casegetGet case details/case get CASE-12345
/casecreateCreate new case/case create --type=water_damage --value=75000
/caseupdateUpdate case/case update CASE-12345 --status=settled
/casesearchSemantic search/case search "large commercial fire damage"
/simrunRun Q-Learning/sim run --case-type=water_damage --strategy=aggressive
/simoptimalGet optimal strategy/sim optimal --case-type=fire_damage
/simstatsSimulation statistics/sim stats --period=30d
/dbqueryExecute SQL query/db query "SELECT * FROM cases LIMIT 10"
/dbreportGenerate report/db report --type=case_summary --period=monthly
/dbschemaShow table schema/db schema cases
/airtablesyncSync Airtable data/airtable sync --table=Cases --direction=bidirectional
/airtablequeryQuery Airtable/airtable query --table=Clients --filter="Status=Active"
/airtableupsertUpsert records/airtable upsert --table=Cases --data={...}
/help-Show command help/help case
/status-System health check/status

Natural Language Processing

Updated 2026-01-19: Enhanced with FunctionExecutor service for direct intent-to-function routing.

Users can interact using either explicit /commands OR natural language. The system uses a three-tier processing model:

  1. Explicit Commands (/case list) → Routed directly via CommandRouter
  2. Natural Language Intent ("show me case 123") → Detected via FunctionExecutor.detectIntent() → Executed directly
  3. Complex Queries ("explain the performance trends") → Processed by Gemini AI with function calling

Natural Language Intent Patterns

User InputDetected IntentCloud FunctionExample Response
"Show me case ABC-123"get_casecase-managerCase details
"List all cases"list_casescase-managerPaginated case list
"Search for fire damage cases"search_casescase-managerMatching cases
"What's the case status?"get_case_statscase-managerStatistics
"Summarize case 12345"summarize_caseairtable-agentAI-generated summary
"Run a simulation for disability"run_simulationsimulation-agentRL results
"What's the optimal strategy?"get_optimal_strategysimulation-agentRecommended strategy
"Show simulation stats"get_simulation_statssimulation-agentPerformance metrics
"Forecast revenue"get_forecastdb-query-agentForecast data
"Analyze Airtable data"analyze_airtableairtable-agentTable analysis

FunctionExecutor Architecture

typescript
// services/FunctionExecutor.ts
class FunctionExecutor {
  // Detect intent from natural language
  detectIntent(input: string): FunctionCallResult | null;

  // Execute function call against Cloud Functions
  execute(functionCall: FunctionCallResult): Promise<FunctionExecutionResult>;

  // Process natural language end-to-end
  processNaturalLanguage(input: string): Promise<FunctionExecutionResult | null>;
}

Gemini Function Calling Flow

When natural language doesn't match known patterns, Gemini AI decides which function to call:

User Input → Gemini AI → Function Call Decision → FunctionExecutor → Cloud Function → Result → Gemini → Natural Language Response

This allows complex queries like "What happened with the Johnson case last month?" to be intelligently routed.

Response Format

typescript
interface ChatResponse {
  id: string;                    // Message UUID
  timestamp: string;             // ISO 8601
  type: 'text' | 'data' | 'chart' | 'error';
  content: string;               // Human-readable response
  data?: Record<string, any>;    // Structured data payload
  source: string;                // Originating function
  metadata: {
    tokens_used: number;
    latency_ms: number;
    function_calls: string[];
  };
}

Integration Points

1. Cloud Functions Integration

typescript
// chat-orchestrator/src/dispatcher/FunctionDispatcher.ts
interface CloudFunctionConfig {
  name: string;
  url: string;
  timeout: number;
  retries: number;
}

const CLOUD_FUNCTIONS: Record<string, CloudFunctionConfig> = {
  'case-manager': {
    name: 'case-manager',
    url: 'https://case-manager-hwqrrwrlna-uc.a.run.app',
    timeout: 30000,
    retries: 2,
  },
  'simulation-agent': {
    name: 'simulation-agent',
    url: 'https://simulation-agent-hwqrrwrlna-uc.a.run.app',
    timeout: 60000,
    retries: 1,
  },
  'db-query-agent': {
    name: 'db-query-agent',
    url: 'https://db-query-agent-hwqrrwrlna-uc.a.run.app',
    timeout: 30000,
    retries: 2,
  },
  'airtable-agent': {
    name: 'airtable-agent',
    url: 'https://airtable-agent-hwqrrwrlna-uc.a.run.app',
    timeout: 30000,
    retries: 2,
  },
};

2. Gemini 2.5 Pro Integration

typescript
// chat-orchestrator/src/ai/GeminiClient.ts
import { GoogleGenerativeAI } from '@google/generative-ai';

interface GeminiConfig {
  model: 'gemini-2.5-pro';
  maxOutputTokens: 8192;
  temperature: 0.7;
  topP: 0.95;
  tools: GeminiFunctionTool[];
}

const GEMINI_TOOLS: GeminiFunctionTool[] = [
  {
    name: 'query_cases',
    description: 'Query case data from the case management system',
    parameters: {
      type: 'object',
      properties: {
        action: { type: 'string', enum: ['list', 'get', 'search', 'stats'] },
        caseId: { type: 'string' },
        filters: { type: 'object' },
      },
    },
  },
  {
    name: 'run_simulation',
    description: 'Run RL simulation for case strategy optimization',
    parameters: {
      type: 'object',
      properties: {
        action: { type: 'string', enum: ['run_qlearning', 'get_optimal', 'stats'] },
        caseType: { type: 'string' },
        strategy: { type: 'string' },
      },
    },
  },
  {
    name: 'query_database',
    description: 'Execute database queries and generate reports',
    parameters: {
      type: 'object',
      properties: {
        action: { type: 'string', enum: ['query', 'report', 'schema', 'analytics'] },
        query: { type: 'string' },
        reportType: { type: 'string' },
      },
    },
  },
  {
    name: 'sync_airtable',
    description: 'Synchronize and query Airtable data',
    parameters: {
      type: 'object',
      properties: {
        action: { type: 'string', enum: ['sync', 'query', 'upsert', 'analyze'] },
        tableName: { type: 'string' },
        data: { type: 'object' },
      },
    },
  },
];

3. WASM Module Integration (Client-Side)

typescript
// chat-ui/src/wasm/WasmRunner.ts
import { QLearning, MonteCarlo, MinCut, VectorOps } from '@conveyor/shared-ai';

interface WasmModules {
  qlearning: typeof QLearning;
  montecarlo: typeof MonteCarlo;
  mincut: typeof MinCut;
  vectors: typeof VectorOps;
}

export class WasmRunner {
  private modules: WasmModules;
  private initialized: boolean = false;

  async initialize(): Promise<void> {
    // Load WASM modules from shared-ai package
    this.modules = {
      qlearning: await import('@conveyor/shared-ai/ml/QLearning'),
      montecarlo: await import('@conveyor/shared-ai/simulation/MonteCarlo'),
      mincut: await import('@conveyor/shared-ai/graph/MinCut'),
      vectors: await import('@conveyor/shared-ai/vectors/VectorOps'),
    };
    this.initialized = true;
  }

  async runLocalPrediction(caseData: CaseInput): Promise<StrategyPrediction> {
    // Run Q-Learning locally for instant predictions
    const ql = new this.modules.qlearning.QLearning({
      learningRate: 0.1,
      discountFactor: 0.95,
      explorationRate: 0.1,
    });

    // Load cached Q-table from IndexedDB
    const cachedQTable = await this.loadCachedQTable();
    if (cachedQTable) {
      ql.importQTable(cachedQTable);
    }

    return ql.getOptimalAction(caseData.stateKey);
  }
}

4. MCP Server Integration

typescript
// mcp-server/src/server.ts
import { Server } from '@modelcontextprotocol/sdk/server';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio';

const server = new Server({
  name: 'conveyor-ai-mcp',
  version: '1.0.0',
});

// Register tools for Claude Code integration
server.setRequestHandler('tools/list', async () => ({
  tools: [
    {
      name: 'conveyor_case_query',
      description: 'Query Conveyor AI case management system',
      inputSchema: {
        type: 'object',
        properties: {
          action: { type: 'string' },
          caseId: { type: 'string' },
          filters: { type: 'object' },
        },
      },
    },
    {
      name: 'conveyor_simulation',
      description: 'Run RL simulations for case strategy',
      inputSchema: {
        type: 'object',
        properties: {
          action: { type: 'string' },
          caseType: { type: 'string' },
        },
      },
    },
    {
      name: 'conveyor_analytics',
      description: 'Generate analytics and reports',
      inputSchema: {
        type: 'object',
        properties: {
          reportType: { type: 'string' },
          period: { type: 'string' },
        },
      },
    },
  ],
}));

server.setRequestHandler('tools/call', async (request) => {
  const { name, arguments: args } = request.params;

  // Route to appropriate Cloud Function
  const response = await functionDispatcher.dispatch(name, args);

  return {
    content: [{ type: 'text', text: JSON.stringify(response) }],
  };
});

5. AI Suggestions System (Web Worker + Cloud Function)

The AI Suggestions System provides non-blocking, contextual suggestions in the side panel using a Web Worker architecture with Gemini 3 and Google Search grounding.

+-----------------------------------------------------------------------------------+
|                         AI SUGGESTIONS ARCHITECTURE                                |
+-----------------------------------------------------------------------------------+
|                                                                                    |
|  +------------------+     +-------------------+     +------------------------+     |
|  | SuggestionsPanel |     | useSuggestions    |     | suggestions.worker.ts  |     |
|  | (React Component)|<--->| (React Hook)      |<--->| (Web Worker Thread)    |     |
|  |                  |     |                   |     |                        |     |
|  | - Clickable cards|     | - State mgmt      |     | - Background processing|     |
|  | - Source links   |     | - Worker comms    |     | - Debounced API calls  |     |
|  | - Priority badges|     | - Context updates |     | - Response caching     |     |
|  +------------------+     +-------------------+     +------------------------+     |
|                                                              |                     |
|                                                              | HTTPS               |
|                                                              v                     |
|                                               +---------------------------+        |
|                                               |   suggestion-agent        |        |
|                                               |   (Cloud Function Gen2)   |        |
|                                               |                           |        |
|                                               |  - Gemini 3 Flash/Pro    |        |
|                                               |  - Google Search Grounding|        |
|                                               |  - Structured JSON Output |        |
|                                               +---------------------------+        |
|                                                                                    |
+-----------------------------------------------------------------------------------+

Suggestion Types

TypeIconDescriptionExample
commandCommandPrebuilt command to execute/db report case_summary
insightSparklesData-driven observation"Case volume up 15% this week"
taskTargetActionable next step"Review pending water damage cases"
optimizationTrendingUpPerformance improvement"Batch similar cases for efficiency"
learningBookOpenEducational content"RL strategies for fire damage"

Structured Output Schema

typescript
// suggestion-agent/index.js - Gemini 3 Structured Output
const SUGGESTION_SCHEMA = {
  type: 'object',
  properties: {
    suggestions: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          id: { type: 'string' },
          type: { type: 'string', enum: ['command', 'insight', 'task', 'optimization', 'learning'] },
          title: { type: 'string' },
          description: { type: 'string' },
          command: { type: 'string' },
          priority: { type: 'string', enum: ['high', 'medium', 'low'] },
          relevanceScore: { type: 'number' },
          grounded: { type: 'boolean' },
          sources: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                title: { type: 'string' },
                url: { type: 'string' },
              },
            },
          },
        },
        required: ['id', 'type', 'title', 'description', 'priority', 'relevanceScore'],
      },
    },
    analysis: {
      type: 'object',
      properties: {
        sessionSummary: { type: 'string' },
        topicsDiscussed: { type: 'array', items: { type: 'string' } },
        userIntent: { type: 'string' },
        nextBestAction: { type: 'string' },
      },
    },
    groundingMetadata: {
      type: 'object',
      properties: {
        searchQueries: { type: 'array', items: { type: 'string' } },
        sourcesUsed: { type: 'number' },
      },
    },
  },
  required: ['suggestions', 'analysis'],
};

Google Search Grounding Configuration

typescript
// Enable dynamic grounding with Google Search
import { DynamicRetrievalMode } from '@google/generative-ai';

const modelConfig = {
  model: 'gemini-3-flash-preview', // or 'gemini-3-pro-preview'
  generationConfig: {
    responseMimeType: 'application/json',
    responseSchema: SUGGESTION_SCHEMA,
  },
  tools: [{
    googleSearchRetrieval: {
      dynamicRetrievalConfig: {
        mode: DynamicRetrievalMode.MODE_DYNAMIC,
        dynamicThreshold: 0.3, // Use grounding when confidence < 70%
      },
    },
  }],
};

Web Worker Communication

typescript
// useSuggestions.ts - Hook for worker communication
interface WorkerMessage {
  type: 'analyze' | 'quick' | 'data_driven' | 'cancel';
  context: SessionContext;
  options?: { enableGrounding?: boolean; model?: string };
  requestId?: string;
}

interface WorkerResponse {
  type: 'suggestions' | 'error' | 'status';
  data?: SuggestionsResponse;
  error?: string;
  timing?: { startTime: number; endTime: number; duration: number };
}

Performance Characteristics

MetricValueNotes
Worker Init~100msOne-time module load
Debounce2000msPrevents excessive API calls
Cache TTL60sLocal suggestion caching
API Latency500-2000msDepends on grounding
Max Context15 messagesSent to Cloud Function

6. WebSocket Real-Time Updates

typescript
// chat-orchestrator/src/websocket/WebSocketManager.ts
import { WebSocketServer, WebSocket } from 'ws';

interface WebSocketMessage {
  type: 'subscribe' | 'unsubscribe' | 'message' | 'status';
  channel?: string;
  payload?: any;
}

export class WebSocketManager {
  private wss: WebSocketServer;
  private channels: Map<string, Set<WebSocket>> = new Map();

  constructor(server: http.Server) {
    this.wss = new WebSocketServer({ server, path: '/ws' });
    this.setupHandlers();
  }

  private setupHandlers(): void {
    this.wss.on('connection', (ws, req) => {
      // Authenticate via query param or header
      const token = new URL(req.url, 'http://localhost').searchParams.get('token');
      if (!this.validateToken(token)) {
        ws.close(4001, 'Unauthorized');
        return;
      }

      ws.on('message', (data) => {
        const msg: WebSocketMessage = JSON.parse(data.toString());
        this.handleMessage(ws, msg);
      });
    });
  }

  broadcast(channel: string, payload: any): void {
    const subscribers = this.channels.get(channel);
    if (subscribers) {
      const message = JSON.stringify({ type: 'message', channel, payload });
      subscribers.forEach((ws) => {
        if (ws.readyState === WebSocket.OPEN) {
          ws.send(message);
        }
      });
    }
  }
}

Security Model

Authentication Flow

+--------+     +---------------+     +----------------+     +----------+
|  User  | --> | Google OAuth  | --> | Token Exchange | --> | Chat API |
+--------+     | (Identity     |     | Service        |     |          |
               | Platform)     |     +----------------+     +----------+
               +---------------+           |
                                           v
                               +-----------------------+
                               | Cloud SQL             |
                               | (Session Store)       |
                               +-----------------------+

Security Components

ComponentImplementationPurpose
OAuth 2.0Google Identity PlatformUser authentication
JWT TokensRS256 signed, 1hr expirySession management
API KeysSecret Manager storedService-to-service auth
Rate LimitingCloud Armor + Redis100 req/min per user
Input ValidationZod schemasRequest sanitization
SQL InjectionParameterized queriesDatabase protection
CORSAllowlisted originsCross-origin security
TLS 1.3Cloud Run managedTransport encryption

Secret Manager Keys

Secret NamePurpose
gemini-api-keyGemini 2.5 Pro API access
anthropic-api-keyClaude API (MCP fallback)
oauth-client-secretGoogle OAuth client secret
jwt-signing-keyJWT token signing
redis-urlRate limiting store
ruvector-db-passwordDatabase access

Rate Limiting Strategy

typescript
// chat-orchestrator/src/middleware/rateLimiter.ts
import { RateLimiterRedis } from 'rate-limiter-flexible';

const rateLimiters = {
  chat: new RateLimiterRedis({
    storeClient: redisClient,
    keyPrefix: 'rl:chat',
    points: 100,          // 100 requests
    duration: 60,         // per minute
    blockDuration: 60,    // block for 1 minute if exceeded
  }),

  commands: new RateLimiterRedis({
    storeClient: redisClient,
    keyPrefix: 'rl:cmd',
    points: 50,           // 50 commands
    duration: 60,         // per minute
  }),

  mcp: new RateLimiterRedis({
    storeClient: redisClient,
    keyPrefix: 'rl:mcp',
    points: 200,          // 200 MCP requests
    duration: 60,         // per minute
  }),
};

Authorization Model

typescript
// chat-orchestrator/src/auth/permissions.ts
interface Permission {
  action: string;
  resource: string;
  conditions?: Record<string, any>;
}

const ROLE_PERMISSIONS: Record<string, Permission[]> = {
  admin: [
    { action: '*', resource: '*' },
  ],
  analyst: [
    { action: 'read', resource: 'cases' },
    { action: 'read', resource: 'simulations' },
    { action: 'execute', resource: 'reports' },
    { action: 'read', resource: 'airtable' },
  ],
  operator: [
    { action: 'read', resource: 'cases' },
    { action: 'create', resource: 'cases' },
    { action: 'update', resource: 'cases', conditions: { ownedBy: '${userId}' } },
    { action: 'execute', resource: 'simulations' },
  ],
  viewer: [
    { action: 'read', resource: 'cases' },
    { action: 'read', resource: 'reports' },
  ],
};

Deployment Architecture

Cloud Run Services

ServiceURLMemoryCPUMin/Max Instances
chat-orchestratorchat-orchestrator-*.run.app1024Mi21/10
chat-uichat-ui-*.run.app512Mi10/5
mcp-servermcp-server-*.run.app512Mi11/3

Infrastructure Diagram

+-----------------------------------------------------------------------------------+
|                        GOOGLE CLOUD PLATFORM (new-project-473022)                  |
+-----------------------------------------------------------------------------------+
|                                                                                    |
|  +---------------------------+  +---------------------------+                     |
|  |     Cloud Run Region      |  |     Cloud SQL Region      |                     |
|  |     (us-central1)         |  |     (us-central1)         |                     |
|  |                           |  |                           |                     |
|  |  +---------------------+  |  |  +---------------------+  |                     |
|  |  | chat-orchestrator   |  |  |  | conveyor-ruvector-  |  |                     |
|  |  | (1024Mi, 2 CPU)     |--+--+--| db (PostgreSQL 15)  |  |                     |
|  |  +---------------------+  |  |  +---------------------+  |                     |
|  |                           |  |                           |                     |
|  |  +---------------------+  |  +---------------------------+                     |
|  |  | chat-ui             |  |                                                    |
|  |  | (512Mi, 1 CPU)      |  |  +---------------------------+                     |
|  |  +---------------------+  |  |     Cloud Functions       |                     |
|  |                           |  |     (us-central1)         |                     |
|  |  +---------------------+  |  |                           |                     |
|  |  | mcp-server          |  |  |  +-------------------+    |                     |
|  |  | (512Mi, 1 CPU)      |--+--+--| case-manager      |    |                     |
|  |  +---------------------+  |  |  +-------------------+    |                     |
|  |                           |  |  +-------------------+    |                     |
|  +---------------------------+  |  | simulation-agent  |    |                     |
|                                 |  +-------------------+    |                     |
|  +---------------------------+  |  +-------------------+    |                     |
|  |     Secret Manager        |  |  | db-query-agent    |    |                     |
|  |                           |  |  +-------------------+    |                     |
|  |  gemini-api-key           |  |  +-------------------+    |                     |
|  |  anthropic-api-key        |  |  | airtable-agent    |    |                     |
|  |  oauth-client-secret      |  |  +-------------------+    |                     |
|  |  jwt-signing-key          |  +---------------------------+                     |
|  |  ruvector-db-password     |                                                    |
|  +---------------------------+                                                    |
|                                                                                    |
|  +---------------------------+  +---------------------------+                     |
|  |     Cloud Armor           |  |     Cloud Monitoring      |                     |
|  |     (WAF + DDoS)          |  |     (Metrics + Alerts)    |                     |
|  +---------------------------+  +---------------------------+                     |
|                                                                                    |
+-----------------------------------------------------------------------------------+

Data Flow

Message Processing Pipeline

User Input
    |
    v
+------------------+
| Input Validation |  <-- Zod schema validation
+------------------+
    |
    v
+------------------+
| Rate Limiter     |  <-- Redis-backed rate limiting
+------------------+
    |
    v
+------------------+
| Auth Middleware  |  <-- JWT validation
+------------------+
    |
    v
+------------------+     +------------------+
| Command Parser   | OR  | Intent Classifier|
| (if /command)    |     | (Gemini 2.5 Pro) |
+------------------+     +------------------+
    |                           |
    +-------------+-------------+
                  |
                  v
         +------------------+
         | Context Manager  |  <-- Load conversation history
         +------------------+
                  |
                  v
         +------------------+
         | Function         |  <-- Route to Cloud Function
         | Dispatcher       |
         +------------------+
                  |
    +-------------+-------------+-------------+
    |             |             |             |
    v             v             v             v
+--------+  +-----------+  +----------+  +---------+
| case-  |  | simulation|  | db-query |  | airtable|
| manager|  | -agent    |  | -agent   |  | -agent  |
+--------+  +-----------+  +----------+  +---------+
    |             |             |             |
    +-------------+-------------+-------------+
                  |
                  v
         +------------------+
         | Response         |  <-- Format and enrich response
         | Generator        |
         +------------------+
                  |
                  v
         +------------------+
         | WebSocket        |  <-- Broadcast to subscribers
         | Broadcaster      |
         +------------------+
                  |
                  v
            User Response

Consequences

Positive

  1. Unified Interface - Single chat interface replaces multiple API endpoints and dashboards
  2. Natural Language - Users interact via natural language without learning API syntax
  3. Real-time Updates - WebSocket integration enables live collaboration and notifications
  4. Client-Side ML - WASM modules provide instant predictions without server round-trips
  5. Claude Integration - MCP Server enables Claude Code to orchestrate Conveyor AI workflows
  6. Context Preservation - Conversation history enables follow-up queries and context-aware responses
  7. Extensibility - Command system allows easy addition of new functions

Negative

  1. Complexity - Additional orchestration layer increases system complexity
  2. Latency - AI intent classification adds ~200-500ms to non-command requests
  3. Cost - Gemini API calls add per-token costs (~$0.0025/1K input tokens)
  4. State Management - WebSocket connections require careful handling for scale
  5. Cold Starts - New Cloud Run instances incur ~2-5s startup latency

Risks

RiskLikelihoodImpactMitigation
Gemini API rate limitsMediumHighImplement caching, fallback to local inference
WebSocket connection dropsMediumMediumAutomatic reconnection with exponential backoff
Token exhaustionLowHighBudget alerts, usage monitoring, rate limiting
Intent misclassificationMediumMediumCommand syntax as fallback, confidence thresholds
Data inconsistencyLowMediumTransaction boundaries, idempotent operations

Mitigation Strategies

  • Gemini Caching: Cache frequent intents and responses in Redis (1hr TTL)
  • Fallback Chain: Command parser -> Gemini -> Local inference -> Error message
  • Connection Pooling: Maintain warm connections to Cloud Functions
  • Circuit Breaker: Trip after 5 consecutive failures, retry after 30s
  • Graceful Degradation: Serve cached/stale data when functions unavailable

Implementation Phases

Phase 1: Foundation (Week 1-2) - COMPLETE

  • Create chat-system Cloud Run service (deployed as chat-system-245235083640.us-central1.run.app)
  • Implement command parser with ADR-014 command syntax (/case, /sim, /db, /airtable, /status, /help)
  • Configure Secret Manager integration for API keys
  • Set up WebSocket infrastructure (deferred - using HTTP polling)

Phase 2: AI Integration (Week 3-4) - COMPLETE

  • Integrate Gemini 2.5 Pro for AI responses
  • Implement conversation history and context management
  • Add response generation with streaming support
  • Create function dispatcher (CommandRouter.ts) for Cloud Functions

Phase 3: MCP Server (Week 5) - PLANNED

  • Deploy MCP Server on Cloud Run
  • Register Conveyor AI tools
  • Implement OAuth token exchange
  • Test Claude Code integration

Phase 4: Chat UI (Week 6-7) - COMPLETE

  • Create ViteJS + HeroUI chat interface (HeroUI components, TailwindCSS)
  • Collapsible side panel for metrics (Messages, AI Responses, Commands, Learning)
  • Fixed command input at bottom
  • Chat history modal with search/archive/restore
  • Self-learning hooks integration
  • Integrate WASM modules from shared-ai (deferred)
  • Add offline support with IndexedDB (deferred)

Phase 5: Security & Polish (Week 8) - PARTIAL

  • Implement OAuth 2.0 flow (Google Identity Platform via @shared-api/auth)
  • VITE_SKIP_AUTH for development/testing mode
  • Configure Cloud Armor WAF
  • Add rate limiting
  • Performance optimization (Vite chunking, gzip/brotli compression)

Monorepo Structure (Implemented)

/extensions-cloudrun/
+-- packages/
|   +-- shared-ui/              # IMPLEMENTED - Reusable HeroUI components
|   +-- shared-api/             # IMPLEMENTED - API clients, auth, logger
|   +-- shared-config/          # IMPLEMENTED - Vite base config, TailwindCSS
|   +-- shared-wasm/            # IMPLEMENTED - WASM utilities
+-- apps/
|   +-- sales-pipeline/         # IMPLEMENTED - Sales extension
|   +-- financial-ops/          # IMPLEMENTED - Financial extension
|   +-- hr-compensation/        # IMPLEMENTED - HR extension
|   +-- compliance-legal/       # IMPLEMENTED - Compliance extension
|   +-- customer-success/       # IMPLEMENTED - Customer success extension
|   +-- revenue-ops/            # IMPLEMENTED - Revenue ops extension
|   +-- chat-system/            # IMPLEMENTED - Chat web interface + backend
|       +-- src/
|       |   +-- components/     # ChatWindow, CommandInput, CommandPalette, SuggestionsPanel
|       |   +-- hooks/          # useChat, useChatHistory, useSelfLearning, useSuggestions
|       |   +-- workers/        # suggestions.worker.ts (Web Worker for background AI)
|       |   +-- services/       # GeminiService, CommandRouter
|       |   +-- config/         # systemPrompt, constants
|       +-- nginx.conf          # CSP headers, static serving
|       +-- Dockerfile          # Cloud Run deployment
+-- infrastructure/
    +-- gcp/
        +-- functions/          # IMPLEMENTED - 6 Cloud Functions
            +-- airtable-agent/
            +-- case-manager/
            +-- simulation-agent/
            +-- db-query-agent/
            +-- suggestion-agent/  # NEW - AI suggestions with Gemini 3 + Google Search
            +-- pandadoc-webhook/

References


Document Version: 1.4 Last Updated: 2026-01-19 Implementation Status: Phases 1, 2, 4 Complete; Phase 3, 5 Partial; AI Suggestions System Deployed (suggestion-agent Cloud Function + Web Worker)