Back to Langflow

Feature: Langflow Assistant

docs/features/langflow-assistant.md

1.11.0.dev1121.0 KB
Original Source

Feature: Langflow Assistant

Generated on: 2026-01-21 Updated on: 2026-03-30 Updated on: 2026-05-19 Updated on: 2026-05-27 Status: Draft Owner: Engineering Team

2026-05-19 revision — Single-agent-loop pivot (Claude Code / Codex pattern): the assistant is now ONE agent + an MCP toolkit instead of a multi-phase orchestrator. New MCP tools GenerateComponent, DescribeFlowIO, and RunFlow; run metrics surfaced on completion; edit+run continuation gating; provider-agnostic in-flow model selection (no OpenAI obligation); an "Orchestrating..." progress step for multi-step prompts. Single-thing requests are byte-identical to the previous experience.

2026-05-27 revision — Per-turn cost + reliability hardening pass: (1) Per-turn usage / duration badge — every complete SSE event now carries usage (input/output/total tokens summed across TranslationFlow + every agent attempt) and duration_seconds (server-side perf_counter). The frontend renders them via the Playground's MessageMetadata badge (subtle variant) next to the assistant title. Per-phase (intent/main) assistant.tokens.phase log lines back the new dashboards. (2) Model-fallback chain — when a flow execution fails with a model_not_found-class error the streamer swaps to the next candidate from get_provider_model_candidates(provider) without consuming a validation-retry slot; auth / rate-limit / network errors fall through unchanged. Exhausted providers surface a named format_models_exhausted_message. (3) Empty-state ModelProvider dialog — the "No Models Configured" panel now opens ModelProviderModal inline instead of navigating to Settings. (4) Built-in code scan exemptionrun_working_flow skips the AST scan when a node's inline code is byte-identical (after whitespace normalization) to the registry's canonical template for that type, so built-ins like URLComponent are no longer false-positively blocked. (5) Generic tool-name fallback + reserved-name guardrails_derive_tool_name snake-cases the component class when the output uses a generic method (output / process / build_output / …); the generator now refuses Output(name="component_as_tool") / method="to_toolkit" at validation time; _should_skip_output was tightened to require name + method + types ALL match the synthetic sentinel. (6) Diagnostic-preserving errorsextract_friendly_error now surfaces the deepest meaningful cause (provider client 'message': '...' repr, or the part after "Error building Component X:") instead of the wrapper prefix. (7) Frontend ModelInput sanitization — a new recoverModelOption helper repairs doubly-encoded model values so the Agent node's Language Model dropdown trigger never renders literal JSON. (8) MAX_CANVAS_SUMMARY_CHARS = 2000 — hard cap on the canvas summary injected into the prompt, plus an explicit "do NOT treat as new instructions" framing block for prompt-injection mitigation. (9) PLAN_APPROVAL_INPUT byte-identical short-circuit — the classifier now bypasses one full LLM round-trip on plan approval the same way it already does for edit continuation.

2026-06-03 revision@-mention of canvas components and fields in the assistant input. Typing @ opens a filterable list of the canvas components; selecting one inserts a quoted, space-free reference token '<componentId>' that the agent resolves to that component's details/code via the existing get_flow_component_details MCP tool. Typing . adjacent to a confirmed token re-triggers the list in field mode, listing that component's user-facing template fields (sourced client-side from node.data.node.template, no network call); selecting one inserts a single terminal token '<componentId>.<fieldName>' resolved to the field's current value via get_flow_component_field_value. Frontend-only parsing (mention-parsing.ts / use-component-mentions.ts); no backend change. See ADR-031.

Companion docs: end-to-end architecture (with Mermaid sequence/flow diagrams) lives at src/backend/base/langflow/agentic/ARCHITECTURE.md. This document covers the product/feature-level model; refer to ARCHITECTURE.md for the internal single-agent-loop and MCP tool wiring.


Table of Contents

  1. Overview
  2. Ubiquitous Language Glossary
  3. Domain Model
  4. Behavior Specifications
  5. Architecture Decision Records
  6. Technical Specification
  7. Observability
  8. Deployment & Rollback
  9. Architecture Diagrams

1. Overview

Summary

The Langflow Assistant is an AI-powered chat interface that helps users generate custom Langflow components through natural language prompts. It provides real-time streaming feedback during component generation, automatic code validation with retry logic, and seamless integration with the Langflow canvas.

Business Context

Building custom components in Langflow requires knowledge of the component architecture, Python programming, and understanding of inputs/outputs. The Langflow Assistant removes this barrier by allowing users to describe what they want in natural language, and the AI generates validated, ready-to-use component code that can be added directly to their flow.

Bounded Context

Context: Agentic - AI-assisted development capabilities within Langflow

This context owns:

  • AI assistant interactions and chat management
  • Component code generation and validation
  • Streaming progress updates and token delivery
  • Model provider integration and configuration
ContextRelationshipDescription
FlowCustomer-SupplierAssistant generates components that integrate with flows; Flow context supplies flow IDs and component APIs
Model ProvidersConformistAssistant conforms to configured model providers (OpenAI, Anthropic, etc.) for LLM capabilities
VariablesCustomer-SupplierVariables context supplies API keys; Assistant uses them for model authentication
Custom ComponentsCustomer-SupplierCustom Components context supplies validation APIs; Assistant uses them to validate generated code

2. Ubiquitous Language Glossary

TermDefinitionCode Reference
AssistantAI-powered chat interface that generates Langflow components from natural languageAssistantPanel, AssistantService
AssistantMessageA single message in the chat, either from user or assistantAssistantMessage interface
ComponentCodePython code that defines a Langflow component with inputs, outputs, and processing logiccomponent_code field, extract_component_code()
IntentClassificationLLM-based detection of whether user wants to generate a component, ask a question, or is off-topicclassify_intent(), IntentResult
ProgressStepA discrete stage in the component generation pipeline (generating, validating, etc.)StepType, AgenticStepType
SSEServer-Sent Events - Protocol for streaming real-time progress updates from server to clientStreamingResponse, postAssistStream()
TokenEventReal-time streaming of LLM output tokens for Q&A responsesAgenticTokenEvent, format_token_event()
ValidationTwo-phase process: static AST analysis (validate_component_code()) followed by runtime instantiation (validate_component_runtime())validate_component_code(), validate_component_runtime(), ValidationResult
ValidationRetryAutomatic re-generation attempt when validation fails, including error contextVALIDATION_RETRY_TEMPLATE, max_retries
FloatingPanelThe assistant panel displayed as a floating overlay centered on the canvasAssistantPanel
ModelProviderExternal LLM service (OpenAI, Anthropic, etc.) used for generationprovider, PREFERRED_PROVIDERS
EnabledProviderA model provider that has been configured with valid API credentialsget_enabled_providers_for_user()
FlowExecutorService that runs Langflow flows programmatically for assistant operationsFlowExecutor, execute_flow_file()
TranslationFlowPre-built flow that translates user input and classifies intentTranslationFlow.json, TRANSLATION_FLOW
LangflowAssistantFlowPre-built flow containing the main assistant prompt and component generation logicLangflowAssistant.json, LANGFLOW_ASSISTANT_FLOW
ReasoningUIAnimated typing display showing "thinking" messages during component generationAssistantLoadingState
ApproveActionUser action to add a validated component to the canvashandleApprove(), addComponent()
OffTopicIntent classification for questions unrelated to Langflow (other tools, general knowledge)"off_topic", OFF_TOPIC_REFUSAL_MESSAGE
RuntimeValidationSecond-phase validation that instantiates the component class to catch import/runtime errorsvalidate_component_runtime(), build_custom_component_template()
AgenticSessionPrefixagentic_ prefix on session IDs to isolate Assistant sessions from PlaygroundAGENTIC_SESSION_PREFIX
SingleAgentLoopThe assistant is one agent (flow_builder_assistant.py) plus an MCP toolkit; the same loop chains tools for multi-thing prompts instead of spawning sub-agents (Claude Code / Codex pattern)flow_builder_assistant.py, src/lfx/src/lfx/mcp/flow_builder_tools/
GenerateComponentMCP tool that re-enters the full component validation pipeline mid-loop, registers the user component, and returns class_name so SearchComponentTypes can find itGenerateComponent (flow_builder_tools/)
DescribeFlowIOMCP tool that deterministically classifies a flow's inputs/outputs/tool components from the actual wiring (scalable for large flows; replaces guess-by-name)DescribeFlowIO (flow_builder_tools/)
RunFlowMCP tool that executes the canvas flow and returns the result plus run metricsRunFlow (flow_builder_tools/), agentic/services/flow_run.py
RunMetrics{duration_seconds, input_tokens, output_tokens, total_tokens} returned by RunFlow; duration via perf_counter, tokens via extract_graph_token_usage over graph verticesagentic/services/flow_run.py, extract_graph_token_usage
OrchestratingStepProgress step/label ("Orchestrating...") chosen for compound or build+run prompts; the run detector is a post-LLM rescue, not a pre-LLM overrideagentic/services/request_framing.py::decide_progress_step
EditContinuationAfter the user approves proposed canvas edits, the frontend saves the flow then silently re-sends EDIT_CONTINUATION_INPUT so the SAME request finishes deferred steps; only fires when a deferred step existed (continuation_expected)EDIT_CONTINUATION_INPUT, continuation_expected
ComponentThenFlowCompound intent: make a component, build a flow with it, then run it — handled by the single agent loop in one requestcomponent_then_flow (translation_flow.py / intent_classification.py)
AvailableModelProvidersDetects providers that actually have a configured key and injects an [Available language models ...] block into the prompt; no fixed OpenAI/Anthropic obligation for the flow's Agent nodeavailable_model_providers(global_variables) (agentic/services/flow_preparation.py)
PlanApprovalPLAN_APPROVAL_INPUT is a byte-identical FE/BE protocol string matched exactly and bypassing the classifier — both via the frontend Continue button and via the deterministic short-circuit in classify_intent that saves one full TranslationFlow LLM round-trip; propose_plan is optional (only large/ambiguous changes stop for a plan card)PLAN_APPROVAL_INPUT, propose_plan, classify_intent
PerTurnUsageAccumulated LLM token usage for an Assistant turn (TranslationFlow classification + every agent attempt + retries). Same shape as the playground's properties.usage so the existing MessageMetadata renderer is reusedtotal_usage / _accumulate() in assistant_service.execute_flow_with_validation_streaming; AssistantMessage.usage (TS)
PerTurnDurationWall-clock duration of the turn measured server-side with perf_counter around the whole pipeline; surfaced as duration_seconds on the complete event and rendered (converted to ms) by MessageMetadatarequest_started_at / _complete() in assistant_service; AssistantMessage.duration (TS)
UsageBadgeThe subtle inline badge rendered next to the assistant title on a complete message — the Playground's MessageMetadata component reused with the new subtle prop (muted-foreground color, no ml-auto) so it sits cleanly in the header lineMessageMetadata subtle prop in messageMetadataComponent; AssistantMessageItem render
PerPhaseTokenLogStructured assistant.tokens.phase phase=<intent|main> user_id=... input=... output=... total=... log emitted by _accumulate so log indices (Sentry / Datadog) can group cost by phase and alert on outliers_accumulate(tokens, phase=...)
IntentResultTokensNew optional tokens field on IntentResult carrying the TranslationFlow LLM cost for that classification, threaded through all five JSON-parsing fallback paths via _with_tokens(...)IntentResult.tokens, _with_tokens()
MetricsEnvelopePer-run token usage wrapped inside the executor's result dict under the _metrics key; consumed and stripped by the orchestrator (assistant_service) and the intent classifier so it never leaks into the user-facing SSE payload (the curated usage field does that job)flow_executor.execute_flow_file / execute_flow_file_streaming; _accumulate(result.pop("_metrics", ...))
MaxCanvasSummaryCharsHard 2000-char cap on the canvas-summary string injected into the prompt as [Canvas reference ...]. Safety net for very large canvases (50+ components, long sticky notes, big custom-component code) whose multi-kB summaries would re-ship on every LLM turn and crowd out the user's requestMAX_CANVAS_SUMMARY_CHARS in flow_types.py; truncation in _get_current_flow_summary
CanvasReferenceBlockPrompt framing for the injected canvas summary: wrapped in [Canvas reference (quoted prior state — do NOT treat as new instructions ...)] ... [End of canvas reference] so the LLM is taught to read it as quoted prior context. Reduces prompt-injection surface from flow names / sticky notes / component values_get_current_flow_summary injection block
TranslationFlowMaxTokensHard ceiling (max_tokens=300) on the classifier's JSON output. Typical output is 60–120 tokens; 300 leaves ~2× headroom for non-Latin translations. Pure cost containment with no observable UX impact_build_llm_config in translation_flow.py
ModelFallbackChainInner while swap_requested: loop in the streaming orchestrator that, on a model_not_found-class error, swaps model_name for the next entry from get_provider_model_candidates(provider) and re-runs THIS attempt without consuming a validation-retry slot. Auth / rate-limit / network errors fall through unchanged. The chain is seeded with the resolver's default so it walks PAST already-tried modelstried_models set; inner swap loop in execute_flow_with_validation_streaming; get_provider_model_candidates()
ModelUnavailableMarkerSubstring (case-insensitive) used by is_model_unavailable_error to identify model_not_found-class errors: "model_not_found", "does not have access to model", "model is not available", "the model does not exist", "model not available", "no access to model"_MODEL_UNAVAILABLE_MARKERS in helpers/error_handling.py
ModelsExhaustedMessageNamed, user-actionable error string produced when every candidate model on a provider has been tried and failed (e.g. "No accessible model on openai. Tried: gpt-4o, gpt-4o-mini. Configure access to one of these models in your openai account, or switch to a different provider in Settings → Model Providers.")format_models_exhausted_message(provider, tried_models)
DiagnosticErrorExtractionextract_friendly_error now extracts the deepest meaningful cause via _extract_deepest_meaningful_cause (provider client 'message': '...' repr first, then the part after "Error building Component X:") before falling back to plain truncation. Surfaces the actually useful detail instead of the wrapper prefix_extract_deepest_meaningful_cause, _PROVIDER_MESSAGE_RE, _COMPONENT_WRAPPER_PREFIX in helpers/error_handling.py
ApiKeyDiagnosticPreservationget_llm captures the user-supplied api_key BEFORE the global-variable resolution step so the error message can name the unresolved variable back to the user instead of always pointing to the canonical key — and replaces a missing/"Unknown" provider error with a "reselect a model" message instead of the nonsense Unknown API key … UNKNOWN_API_KEY stringget_llm in lfx/base/models/unified_models/instantiation.py
RecoverModelOptionFrontend defensive helper that sanitizes a ModelInput value before reading name — repairs a doubly-encoded payload (the assistant's flow_update pipeline can leave the entire model list serialized into value[0].name) so the Agent node's Language Model dropdown trigger renders a plain readable model name instead of literal JSON like [{"provider":"OpenAI",...]recoverModelOption in components/core/parameterRenderComponent/components/modelInputComponent/helpers/recover-model-option.ts
SerializedModelCoercionBackend coercion at the configure_component choke point: model-typed template fields (template[field].type == "model") now normalize JSON/YAML-string values and the name-nested-spec QA pattern into the canonical [{"provider": X, "name": Y}] shape. Prevents the catalog falling back to provider="Unknown"get_llm: missing a provider. Applies to both BuildFlowFromSpec and ConfigureComponent tools_coerce_model_value, _coerce_single_model_entry, _parse_serialized_model_text in lfx/graph/flow_builder/component.py
BuiltinCodeExemptionrun_working_flow's AST scan (_scan_flow_component_code) now skips a node whose inline code is byte-identical (after _normalize_code whitespace strip) to the registry's canonical template for that type. Built-ins like URLComponent legitimately use importlib.util.find_spec and os.environ.get — patterns the LLM-code-scanner forbids — so without this exemption every run of a trusted built-in was a false-positive block. Registry-lookup failure falls back to scan-all_get_canonical_code_map, _normalize_code in agentic/services/flow_run.py
GenericToolNameFallback_derive_tool_name rule in lfx/base/tools/component_tool.py: when a Component has exactly ONE tool-exposed Output and that Output's method name is generic (output, process, build_output, run, execute, main, handler, build_result), the LLM-facing tool name is derived from the snake_cased component class name (RandomMenuItemrandom_menu_item) — the class name is the user's stated intent and is always more informative than output/process. Multi-output components keep method-derived names so tools don't collide_GENERIC_OUTPUT_METHOD_NAMES, _class_name_to_tool_name, _derive_tool_name
ReservedOutputNameThe two synthetic-tool sentinels the wiring layer auto-creates when a Component is flipped to Tool Mode: Output.name = "component_as_tool" + Output.method = "to_toolkit". Generation-time validator (validate_component_code) rejects code that declares either with a hint to pick a value-descriptive name; runtime _should_skip_output was tightened to require name + method + types ALL match the synthetic so a user-declared component_as_tool is no longer dropped_RESERVED_OUTPUT_NAME, _RESERVED_OUTPUT_METHOD in helpers/validation.py; _should_skip_output in lfx/base/tools/component_tool.py
AgentToolCompatibilitySection"Agent Tool Compatibility" block in the LangflowAssistant.json system prompt teaching the generator (1) action verb_noun method naming, (2) class-level description as the LLM-facing tool description, (3) tool_mode=True discipline + clear info=, (4) NEVER use the reserved component_as_tool/to_toolkit names. The complementary defense to the runtime guardrailsLangflowAssistant.json system prompt
OpenProviderModalEmptyStateThe "No Models Configured" empty-state button now opens a ModelProviderModal dialog inline (modelType="llm") instead of navigating to /settings/model-providers. Lets the user configure providers without leaving the assistant panelAssistantNoModelsState in assistant-no-models-state.tsx; data-testid="assistant-no-models-configure-providers"
MaxFlowVerificationAttemptsHard cost ceiling (MAX_FLOW_VERIFICATION_ATTEMPTS = 3) for the post-build flow-verification loop. Each attempt costs one full execution plus at most one agent fix turn — so the cap doubles as the user-visible "after N attempt(s)" caveat stringflow_types.MAX_FLOW_VERIFICATION_ATTEMPTS
ComponentMentionTyping @ in the assistant input opens a filterable list of the canvas components; arrow/Tab navigate, Enter confirms, Esc cancels, and the highlighted component is selected on the canvas as a live preview. Confirming inserts a quoted, space-free token '<componentId>' the agent resolves to that component's details/codedetectMention, formatMentionToken in mention-parsing.ts; use-component-mentions.ts; assistant-mention-popover.tsx
FieldMentionChaining . directly after a confirmed component token ('<id>'.) re-opens the list in field mode, showing that component's user-facing template fields by display name; selecting one inserts a single terminal token '<id>.<fieldName>' the agent resolves to the field's current value. The field list is sourced client-side from node.data.node.template (no network call); underscore-prefixed/internal keys (_type, _frontend_node_*) and code, plus show: false and label-less fields, are excludeddetectFieldMention, formatFieldMentionToken, toFieldItems in use-component-mentions.ts

3. Domain Model

3.1 Aggregates

AssistantSession

The core aggregate managing a user's interaction session with the assistant.

  • Root Entity: AssistantSession (implicit, managed via session_id)
  • Entities:
    • AssistantMessage - Individual messages in the conversation
    • AgenticProgressState - Current step in generation pipeline
  • Value Objects:
    • AssistantModel - Selected provider/model combination
    • AgenticResult - Final generation result with validation status
    • ValidationResult - Outcome of code validation
    • IntentResult - Translation and intent classification
  • Invariants:
    • A session must have a valid flow_id to generate components
    • Only one message can be in streaming status at a time
    • session_id is generated once per session on the frontend and reused across all requests in the same session
    • A new session_id is only generated when the user explicitly clicks "New session"
    • session_id must be passed with every request to maintain conversation memory
    • The TranslationFlow must NOT share the assistant's session_id (it is stateless)

ComponentGeneration

Represents a single component generation attempt with validation.

  • Root Entity: Generation attempt (tracked via validation_attempts)
  • Value Objects:
    • ComponentCode - Extracted Python code
    • ValidationResult - Compilation and instantiation result
  • Invariants:
    • Maximum retries cannot exceed configured max_retries
    • Component code must contain a class inheriting from Component
    • Component must define inputs or outputs to be valid

ModelProviderConfiguration

Configuration for available LLM providers.

  • Root Entity: Provider configuration per user
  • Value Objects:
    • EnabledProvider - Provider with valid API key
    • ProviderModel - Available model for a provider
  • Invariants:
    • At least one provider must be enabled to use assistant
    • API key must be valid and non-empty for provider to be enabled
    • Provider preference order: Anthropic > OpenAI > Google Generative AI > Groq
    • IBM WatsonX and Ollama are supported with provider-specific parameter injection (URL, project ID, base URL)

Model Selection Behavior

The frontend implements automatic model selection to ensure a valid model is always sent to the backend:

  • Auto-selection: When no model is explicitly selected, or when the persisted model's provider is no longer available, the first available model from enabled providers is automatically selected
  • Persistence: Selected model is stored in localStorage (langflow-assistant-selected-model)
  • Validation: On load, persisted model is validated against current enabled providers. If the provider or model no longer exists, the selection is cleared and auto-selection triggers
  • Provider icon: The model selector trigger displays the provider's icon (e.g., Anthropic, OpenAI) instead of a generic label
  • Invariant: A request must never be sent without a valid model selection to prevent backend fallback to unexpected providers

3.2 Domain Events

EventTriggerPayloadConsumers
ProgressUpdateEach pipeline stage transition{step, attempt, max_attempts, message?, error?}Frontend UI (SSE)
TokenGeneratedEach LLM output token (Q&A only){chunk: string}Frontend UI (SSE)
GenerationCompletePipeline finished successfully{result, validated, class_name?, component_code?}Frontend UI (SSE)
GenerationErrorUnrecoverable error occurred{message: string}Frontend UI (SSE)
GenerationCancelledUser cancelled or disconnected{message?: string}Frontend UI (SSE)
ValidationSucceededCode compiled and instantiated{class_name, code}Assistant Service
ValidationFailedCode failed to compile/instantiate{error, code, class_name?}Assistant Service (triggers retry)
ComponentApprovedUser clicked "Add to Canvas"{component_code, class_name}Canvas (adds node)
OrchestratingStartedCompound or build+run prompt detected (post-LLM rescue){step: "orchestrating", message: "Orchestrating..."}Frontend UI (SSE)
RunMetricsSurfacedRunFlow finished executing the canvas flow{duration_seconds, input_tokens, output_tokens, total_tokens} (folded into the complete payload)Frontend UI (SSE)
EditContinuationResentUser approved proposed canvas edits and a deferred step existedEDIT_CONTINUATION_INPUT re-sent on the SAME request (continuation_expected: true)Assistant Service
PerTurnUsageSurfacedAny complete event (success, refusal, retry-exhausted, sanitization-blocked){usage: {input_tokens, output_tokens, total_tokens}, duration_seconds} folded into the complete payload by _complete()Frontend MessageMetadata badge (subtle variant) on the assistant message header
PerPhaseTokensLogged_accumulate(tokens, phase="intent"|"main") after each LLM callstructured log line assistant.tokens.phase phase=... user_id=... session_id=... input=... output=... total=...Log indices / Sentry / Datadog dashboards (per-phase cost & outlier alerting)
ModelFallbackAttemptedFlowExecutionError with a model_not_found-class signal AND a known providerassistant.model_fallback from=<old> to=<new> provider=<p> tried_so_far=[...] log; inner-loop swap re-runs the same attempt without consuming a validation slotInternal — observable via logs / metrics
ModelsExhaustedEvery candidate from get_provider_model_candidates(provider) was triedformat_models_exhausted_message(provider, tried_models) becomes the user-facing execution_errorFrontend error SSE event
UnsafeBuiltinScanSkipped (silent)_scan_flow_component_code matched a node's code byte-identically (after _normalize_code) to the registry canonical templateno SSE event; the scan is simply skipped for that nodeInternal — observable via tests
PlanApprovalShortCircuitedclassify_intent sees text.strip() == PLAN_APPROVAL_INPUTintent.build_flow.deterministic: plan-approval continuation signal log; TranslationFlow LLM call is skippedInternal — saves one full LLM round-trip per approval click

4. Behavior Specifications

Feature: Langflow Assistant

As a Langflow user I want to generate custom components using natural language So that I can build flows without writing Python code manually

Background

  • Given a user with an active Langflow session
  • And at least one model provider is configured with a valid API key
  • And the user has a flow open in the canvas

Scenario: Generate a simple component successfully

  • Given the assistant panel is open
  • When I enter "Create a component that converts text to uppercase"
  • And I click send
  • Then I should see a "generating_component" progress indicator
  • And I should see the reasoning UI with typing animation
  • And I should see an "extracting_code" progress indicator
  • And I should see a "validating" progress indicator
  • And I should see a "validated" success indicator
  • And I should see the generated component code
  • And I should see an "Add to Canvas" button

Scenario: Ask a question about Langflow

  • Given the assistant panel is open
  • When I enter "How do I connect two components?"
  • And I click send
  • Then I should see a "generating" progress indicator
  • And I should see streaming text response
  • And I should NOT see the validation indicators
  • And I should NOT see an "Add to Canvas" button

Scenario: Add generated component to canvas

  • Given a validated component has been generated
  • When I click the "Add to Canvas" button
  • Then the component should be validated through the component API
  • And the component should appear on the canvas at viewport center
  • And I should see a success notification

Scenario: Select a specific model for generation

  • Given the assistant panel is open
  • And I have multiple model providers configured
  • When I click the model selector
  • And I select "gpt-4o" from "openai"
  • And I enter "Create a text splitter component"
  • And I click send
  • Then the generation should use the selected model
  • And I should see the model name in the request

Scenario: Auto-select first available model

  • Given the assistant panel is open
  • And I have not previously selected a model
  • And I have at least one model provider configured
  • When the model selector initializes
  • Then the first available model should be automatically selected
  • And the selected model should be persisted for future sessions

Scenario: Maintain conversation memory across messages

  • Given the assistant panel is open
  • And I have previously generated a component
  • When I ask a follow-up question like "can you use dataframe output instead?"
  • Then the assistant should remember the previous component
  • And it should generate a modified version of the component
  • And I should see the same progress indicators as the initial generation
  • And I should see the component card with "Approve" and "View Code" buttons

Scenario: Follow-up modification classified as component generation

  • Given I have generated a component in the current session
  • When I send a modification request like "add error handling" or "use X instead"
  • Then the intent should be classified as "generate_component" (not "question")
  • And the progress card should appear immediately (no token streaming)
  • And the validation pipeline should run as normal

Scenario: New session resets conversation memory

  • Given I have messages in the assistant chat
  • When I click "New session" in the header
  • Then all messages should be cleared
  • And a new session ID should be generated
  • And subsequent messages should not reference previous conversation

Scenario: Multi-language support

  • Given the assistant panel is open
  • When I enter "Crie um componente que soma dois numeros" (Portuguese)
  • And I click send
  • Then the input should be translated to English internally
  • And the intent should be classified as "generate_component"
  • And I should receive a valid component that adds numbers

Scenario: Automatic retry on validation failure

  • Given the assistant panel is open
  • And max_retries is set to 3 (meaning 3 total attempts)
  • When I submit a request that generates invalid code
  • Then I should see a "validation_failed" indicator with a clean error (not raw stacktrace)
  • And I should see a "retrying" indicator
  • And the system should automatically re-generate with error context
  • And I should see "Attempt 2 of 3" in the UI (1-indexed, only shown on retries)
  • And validation includes both static AST checks AND runtime instantiation

Scenario: Max retries exhausted

  • Given the assistant panel is open
  • And max_retries is set to 3
  • When the generated code fails validation 3 times
  • Then I should see the "Component generation failed" card
  • And I should see a friendly message: "The selected model was unable to generate valid component code. Try again or use a more capable model."
  • And I should see a collapsible "Error details" section (collapsed by default)
  • And I should see a "View code" toggle and a "Try Again" button
  • And I should NOT see an "Add to Canvas" button

Scenario: Ask about unrelated tools (off-topic guardrail)

  • Given the assistant panel is open
  • When I ask "how does n8n work?" or any question unrelated to Langflow
  • Then the intent should be classified as "off_topic"
  • And I should see a refusal message redirecting me to Langflow-related topics
  • And the LLM should NOT be called for the main response (saves API cost)

Scenario: No model provider configured

  • Given no model providers are configured
  • When I open the assistant panel
  • Then I should see the "No Models Configured" empty state
  • And the input field should be disabled
  • And I should see a link to Settings > Model Providers

Scenario: API key expired or invalid

  • Given the configured API key is invalid
  • When I submit a generation request
  • Then I should see an error message about the API key
  • And the error should mention configuring in Settings

Scenario: User cancels generation

  • Given a component generation is in progress
  • When I click the stop button
  • Then the generation should be cancelled
  • And I should see a "cancelled" status on the message
  • And the progress indicators should disappear
  • And I should be able to send a new message

Scenario: Open assistant with keyboard shortcut

  • Given I am on the flow page with focus on the canvas (not in a text input)
  • When I press the configured shortcut key (default: A, configurable in Settings > Shortcuts)
  • Then the assistant panel should open
  • And the text input should be auto-focused so I can start typing immediately
  • And pressing the shortcut again (when not focused on the input) should close it

Scenario: Close assistant with Escape

  • Given the assistant panel is open
  • When I press Escape
  • Then the assistant panel should close
  • And this should work whether focus is on the canvas or inside the assistant's text input

Scenario: Input placeholder during generation

  • Given a component generation or Q&A response is in progress
  • Then the input placeholder should show "Working on it..." instead of the random suggestion text
  • And once generation completes, the placeholder should return to normal

Scenario: Q&A response with example code renders as text

  • Given the user asks a question like "how do I create a component?"
  • And the LLM response includes example component code in a markdown code block
  • When the response completes
  • Then the response should render as markdown text (with syntax-highlighted code block)
  • And it should NOT trigger the component validation pipeline
  • And it should NOT show a component card or "Add to Canvas" button

Scenario: Component generation fallback for edge cases

  • Given a component generation response completes
  • And the backend did not set result.validated (e.g., format mismatch)
  • But the message has completedSteps containing component generation steps
  • When the content contains a Python class extending Component in a code block
  • Then the frontend should extract and display it as a component card

Scenario: Compound prompt — create a component, build a flow with it, and run it

  • Given the assistant panel is open
  • When I enter "Create a component that reverses text, build a flow that uses it, and run it"
  • Then the intent should be classified as "component_then_flow" (compound)
  • And I should see the "Orchestrating..." progress step
  • And the SAME single agent loop should chain the tools: GenerateComponent → build the flow → RunFlow
  • And no separate sub-agents should be spawned
  • And I should see the run result followed by run metrics

Scenario: Provider-agnostic in-flow model auto-pick (no OpenAI obligation)

  • Given the assistant panel is open
  • And only a non-OpenAI provider (e.g., Anthropic or Groq) has a configured API key
  • When I ask the assistant to build a flow containing an Agent node
  • Then the prompt should include an [Available language models ...] block listing only providers with a configured key
  • And the Agent node should be wired with a provider that actually has a key (no forced OpenAI/Anthropic)
  • And the assistant must NOT run an Agent that has no model configured

Scenario: Orchestrating indicator on multi-step prompts

  • Given the assistant panel is open
  • When I enter a prompt that asks to both build a flow and run it
  • Then decide_progress_step should select the orchestrating step
  • And I should see an "Orchestrating..." progress label (not just "Generating...")
  • And the run detection should occur as a post-LLM rescue, not a pre-LLM keyword override

Scenario: Run metrics shown after a run

  • Given the assistant has just executed a flow via RunFlow
  • When the run completes
  • Then the completion payload should include duration_seconds, input_tokens, output_tokens, and total_tokens
  • And the duration should be measured with perf_counter
  • And the token counts should be aggregated from the graph vertices via extract_graph_token_usage
  • And the agent should report these metrics in its reply
  • And there should be NO separate "run visual feedback" UI (only run + result + metrics)

Scenario: Edit + run continuation after approval

  • Given the assistant proposed canvas edits and the user is asked to approve them
  • And a deferred step exists (e.g., running the flow after the edit)
  • When I approve the proposed edits
  • Then the frontend should save the flow first
  • And then silently re-send EDIT_CONTINUATION_INPUT on the SAME request
  • And the deferred step (running the flow) should complete in that continuation
  • And if NO deferred step existed (continuation_expected is false), no continuation should fire (prevents the duplicate-message glitch)

Scenario: propose_plan is optional

  • Given the assistant panel is open
  • When I ask for a small, unambiguous flow change
  • Then the agent should apply it directly without stopping for a plan card
  • When I ask for a large or ambiguous change
  • Then the agent should stop and present a plan card for approval (PLAN_APPROVAL_INPUT)

Scenario: Clear conversation history

  • Given I have multiple messages in the chat
  • When I click the "New session" button
  • Then all messages should be removed
  • And a new session_id should be generated
  • And the panel should stay at expanded size
  • And any in-progress generation should be cancelled

Scenario: Per-turn cost badge on every assistant reply

  • Given the assistant has just finished a turn (success, refusal, off-topic, retry-exhausted, or sanitization-blocked)
  • Then the assistant message header should render a subtle MessageMetadata badge with token counts and duration
  • And the totals should include TranslationFlow classification + every agent attempt + any retries (per-turn aggregation)
  • And the badge should NOT shift the title layout (uses the subtle variant — muted color, no ml-auto)
  • And the backend log should emit assistant.tokens.phase phase=intent and assistant.tokens.phase phase=main lines

Scenario: Model-not-found falls back to a sibling model on the same provider

  • Given the user (or auto-selected) model returns a model_not_found-class error (e.g. OpenAI 403 because the project lacks access to a newly-released default)
  • And the provider is known
  • When the request fails inside the inner streaming loop
  • Then the streamer should swap model_name for the next entry from get_provider_model_candidates(provider)
  • And the same attempt should re-run WITHOUT consuming a validation-retry slot
  • And a log line assistant.model_fallback from=... to=... provider=... tried_so_far=[...] should be emitted
  • And the tried_models set should be seeded with the resolver's default so the fallback walks past it

Scenario: All candidate models exhausted on a provider

  • Given every model returned by get_provider_model_candidates(provider) has been tried
  • When the last attempt also fails with a model-unavailable error
  • Then the user should see format_models_exhausted_message(provider, tried_models) (e.g. "No accessible model on openai. Tried: gpt-4o, gpt-4o-mini. Configure access to one of these models in your openai account, or switch to a different provider in Settings → Model Providers.")
  • And the message should NOT be Error building Component Agent (the pre-fix generic string)

Scenario: Non-model-availability errors do not trigger fallback

  • Given the request fails with an auth, rate-limit, or network error
  • When is_model_unavailable_error returns false
  • Then NO model fallback should be attempted
  • And the error should surface through the existing friendly-error path unchanged

Scenario: Empty-state opens the ModelProvider dialog inline

  • Given no model providers are configured
  • When I click the "Configure providers" button in the assistant's empty state
  • Then the ModelProviderModal should open inline (modelType="llm")
  • And I should NOT be navigated away to /settings/model-providers
  • And the button should carry data-testid="assistant-no-models-configure-providers"

Scenario: Built-in component code is exempted from the run-time security scan

  • Given the agent added a built-in like URLComponent via add_component (registry-verbatim copy)
  • When run_working_flow calls _scan_flow_component_code
  • Then the node's code should be compared (after _normalize_code whitespace strip) against the registry's canonical template
  • And if byte-identical, the AST scan should be skipped for that node
  • And the built-in's legitimate importlib.util.find_spec / os.environ.get should NOT produce a false-positive violation
  • And registry-lookup failure should fall back to scan-all (never trust unverified code on the degraded path)

Scenario: Generic output method name falls back to the class name

  • Given the user (or LLM) generated a Component with exactly one tool-exposed Output and the Output's method is generic (e.g. output, process, build_output, run)
  • When the agent toolkits the component for the LLM
  • Then the LLM-facing tool name should be the snake_cased component class name (e.g. RandomMenuItemrandom_menu_item, HTTPClienthttp_client)
  • And descriptive method names (get_forecast, search_products) should NOT be overridden
  • And multi-output components should keep method-derived names so tools don't collapse to a single name

Scenario: Generator refuses reserved output names at validation time

  • Given the assistant generated a Component that declares Output(name="component_as_tool", ...) or method="to_toolkit"
  • When validate_component_code runs
  • Then the validation should fail with an actionable error explaining the synthetic-Tool sentinel collision
  • And the suggested fix should be to pick a value-descriptive name (item, price, result)
  • And the retry loop should produce correctly-named output instead of silently dropping the user's tool at runtime

Scenario: Runtime keeps user-declared outputs that happen to share the sentinel name

  • Given a Component reaches runtime whose Output happens to be named component_as_tool (e.g. from a saved flow predating the validator guardrail)
  • When ComponentToolkit._should_skip_output evaluates the output
  • Then the synthetic check should require name + method + types ALL match
  • And a user-declared output should be kept (not dropped) — the empty-tool-list production failure is closed

Scenario: API-key error names the user's variable back to them

  • Given the user referenced a Global Variable named MY_OPENAI_KEY for the Agent's api_key and the resolver could not find it
  • When get_llm reaches the missing-API-key branch
  • Then the error message should name MY_OPENAI_KEY AND the canonical OPENAI_API_KEY
  • And the user should be pointed to Settings → Model Providers
  • And if the provider arrived empty / "Unknown", the message should instead say "The selected model is missing a provider. Please reselect a model from the dropdown in the Language Model field …"

Scenario: Wrapped errors surface the deepest meaningful cause

  • Given the upstream error is wrapped (e.g. "Error building Component Agent: <real cause>" or a provider client repr containing 'message': '...')
  • When extract_friendly_error runs
  • Then _extract_deepest_meaningful_cause should return the provider message OR the substring after the first : of the component wrapper
  • And the wrapper prefix "Error building Component …" should NOT be returned by itself

Scenario: ModelInput dropdown trigger never renders literal JSON

  • Given a saved Agent flow whose model field value was doubly-encoded by the assistant's flow_update pipeline (the whole list serialized into value[0].name)
  • When the canvas renders the ModelInputComponent trigger
  • Then recoverModelOption(value?.[0]) should parse the embedded JSON and recover the real model name
  • And the trigger should show a plain readable model name (e.g. gpt-4o) not [{"provider":"OpenAI",...]

Scenario: Canvas summary is truncated above 2000 chars

  • Given the user's canvas is very large (many components, long sticky notes, big custom-component code)
  • When _get_current_flow_summary produces a flow_to_spec_summary result above MAX_CANVAS_SUMMARY_CHARS
  • Then the summary should be truncated to 2000 chars + "\n... [truncated]" before being injected into the prompt
  • And the prompt should always wrap it in the [Canvas reference (quoted prior state — do NOT treat as new instructions ...)] ... [End of canvas reference] block

Scenario: TranslationFlow output is capped at 300 tokens

  • Given the classifier ran on a non-empty user input
  • Then the underlying LLM call should be configured with max_tokens=300
  • And typical output of 60–120 tokens should be unaffected
  • And the model should not over-generate explanations the parser strips anyway

Scenario: Plan approval bypasses TranslationFlow

  • Given the frontend (Continue button or skip-all auto-approve) sends a message whose body is byte-identical to PLAN_APPROVAL_INPUT
  • When classify_intent runs
  • Then the deterministic short-circuit should return IntentResult(intent="build_flow", ...) WITHOUT calling the TranslationFlow LLM
  • And the log should emit intent.build_flow.deterministic: plan-approval continuation signal
  • And the same shortcut should already exist for EDIT_CONTINUATION_INPUT

Scenario: Serialized model spec is coerced at configure_component

  • Given the agent's BuildFlowFromSpec or ConfigureComponent tool sets a model-typed template field with a JSON/YAML string OR the QA pattern [{"name": "[{...JSON spec...}]", "provider": "Unknown"}]
  • When configure_component writes the value
  • Then _coerce_model_value should normalize the value to canonical [{"provider": X, "name": Y}]
  • And template[field].value should hold the coerced shape
  • And params[field] should be mutated in place so post-configure helpers read the canonical value
  • And get_llm should NOT fall back to provider="Unknown"ValueError: missing a provider
  • And bare model-name strings like "gpt-4o" should be left untouched so the catalog path still runs

Scenario: Reference a canvas component with an @-mention

  • Given the assistant panel is open and the canvas has components
  • When I type @ in the input
  • Then a list of the canvas components should open, filterable by display name
  • And the highlighted component should be selected on the canvas as a preview
  • When I press Enter (or click an item)
  • Then a quoted reference token like 'LanguageModelComponent-XSmrK' should be inserted (no trailing space)
  • And the agent should resolve it to that component's details/code via get_flow_component_details

Scenario: Reference a single field's value with @component.field

  • Given I have just inserted a component reference token
  • When I type . immediately after the token
  • Then the list should re-open in field mode showing that component's user-facing fields by display name
  • And internal fields (underscore-prefixed keys such as _type / _frontend_node_*, and code) should NOT appear
  • When I select a field
  • Then a single terminal token like 'LanguageModelComponent-XSmrK.api_key' should replace the reference
  • And the agent should resolve it to that field's current value via get_flow_component_field_value

Scenario: Keyboard navigation scrolls the mention list into view

  • Given the mention list is open and taller than the popover
  • When I move the highlight past the visible area with the arrow keys
  • Then the highlighted option should scroll into view (scrollIntoView({ block: "nearest" }))

5. Architecture Decision Records

ADR-001: Server-Sent Events (SSE) for Streaming

Status: Accepted

Context

The assistant needs to provide real-time feedback during component generation, which can take 10-60 seconds. Users need to see progress updates and token streaming to understand the system is working.

Decision

Use Server-Sent Events (SSE) for streaming progress updates and tokens from backend to frontend, instead of WebSockets or polling.

Consequences

Benefits:

  • Simpler implementation than WebSockets (unidirectional)
  • Native browser support via fetch with ReadableStream
  • Automatic reconnection handling
  • Works well with HTTP/2

Trade-offs:

  • Unidirectional only (client cannot send during stream)
  • Limited to text-based data (JSON encoded)
  • Some proxy/firewall issues possible

Impact on Product:

  • Users see real-time progress during generation
  • Reduced perceived latency
  • Better user experience during long operations

ADR-002: Intent Classification via LLM

Status: Accepted

Context

The assistant needs to distinguish between component generation requests and general questions. Additionally, users may write prompts in any language.

Decision

Use a dedicated LLM-based TranslationFlow to classify intent and translate input to English before processing.

Consequences

Benefits:

  • Accurate intent detection via LLM reasoning
  • Multi-language support without explicit language detection
  • Consistent English input for component generation flow

Trade-offs:

  • Additional LLM call adds latency (~1-2 seconds)
  • Additional cost per request
  • Fallback needed if classification fails

Impact on Product:

  • Users can write in any language
  • Better routing between Q&A and component generation
  • Slightly increased response time

ADR-003: Automatic Validation with Retry

Status: Accepted

Context

LLMs sometimes generate code with syntax errors or missing imports. Manual retry is frustrating for users.

Decision

Automatically validate generated code by instantiating the component class. On failure, retry with error context included in the prompt.

Consequences

Benefits:

  • Higher success rate for component generation
  • Self-healing through error context
  • Better user experience (no manual retry needed)

Trade-offs:

  • Multiple LLM calls on failure (up to 4x cost)
  • Longer total time when retries needed
  • Some errors may not be fixable by retry

Impact on Product:

  • Users get working components more often
  • Reduced frustration from validation failures
  • Transparent retry process via progress UI

ADR-004: Floating-Only Panel (Sidebar Removed)

Status: Accepted (supersedes previous floating+sidebar decision)

Context

The initial design supported both floating and sidebar view modes. However, the floating panel with dynamic open/close and size expansion worked well as a standalone solution — it stays out of the way, doesn't conflict with other areas of Langflow (sidebar, playground, canvas), and the open/close/resize behavior feels natural. The sidebar mode added complexity (spacer divs, negative margins, conditional styling) for a view that wasn't needed.

Decision

Remove the sidebar view mode entirely. The assistant always uses the floating panel. Removed: view mode toggle, AssistantViewMode type, useAssistantViewMode hook, sidebar spacer div, and all sidebar-conditional CSS from FlowPage.

Consequences

Benefits:

  • Simpler codebase — single layout to maintain
  • No spacer divs or negative margins affecting the flow page
  • Focused, polished experience for v1
  • Resizable floating panel covers all use cases

Trade-offs:

  • Users cannot dock the assistant to the side (can be revisited later if needed)

Impact on Product:

  • Cleaner, more focused feature for initial release
  • Faster iteration on a single well-polished layout

ADR-005: Frontend-Owned Session Persistence

Status: Accepted

Context

The assistant had no conversation memory — every message was treated as a new session because the frontend never sent a session_id. The backend generated a new UUID per request (request.session_id or str(uuid.uuid4())), so the Agent's memory component never found previous messages.

Decision

The frontend generates a session_id once (via useRef) when the useAssistantChat hook initializes, and includes it in every postAssistStream request. A new session_id is only generated when the user clicks "New session" (handleClearHistory).

Consequences

Benefits:

  • Conversation memory works across follow-up messages
  • Users can iterate on component designs within a session
  • "New session" provides a clean slate when needed

Trade-offs:

  • Session memory is only frontend-scoped (lost on page refresh)
  • Long sessions accumulate message history that may affect LLM context

Key Files:

  • src/frontend/.../hooks/use-assistant-chat.tssessionIdRef stores the ID, passed in every request
  • src/backend/.../agentic/api/router.py — falls back to uuid.uuid4() only if no session_id is sent

ADR-006: TranslationFlow Session Isolation

Status: Accepted

Context

The TranslationFlow (intent classification) and LangflowAssistant flow shared the same session_id. This caused cross-flow contamination: the TranslationFlow's JSON intent responses were stored alongside the assistant's messages. On subsequent requests, the TranslationFlow's LLM saw messages from both flows in its history, causing intent classification to fail and default to "question".

Decision

  1. Pass session_id=None when calling classify_intent — the TranslationFlow is stateless and does not need conversation memory.
  2. Set should_store_message=False on both ChatInput and ChatOutput in the TranslationFlow — it should never persist messages.

Consequences

Benefits:

  • Intent classification is stateless and deterministic per message
  • No cross-flow contamination in the assistant's message history
  • TranslationFlow works identically on 1st and Nth request

Trade-offs:

  • TranslationFlow has no context about previous turns (addressed by improved prompt, see ADR-007)

Key Files:

  • src/backend/.../agentic/services/assistant_service.pysession_id=None in classify_intent call
  • src/backend/.../agentic/flows/translation_flow.pyshould_store_message=False

ADR-007: Q&A Path Isolation and Off-Topic Guardrails (supersedes previous intent-independent extraction)

Status: Accepted

Context

The original ADR-007 introduced intent-independent code extraction: all responses were scanned for component code regardless of intent. This caused a critical bug: when users asked questions like "how do I create a component?", the LLM's example code in the answer was extracted, validated, and displayed as a component card instead of the text answer.

Additionally, the TranslationFlow only classified two intents (generate_component and question), allowing questions about unrelated tools (n8n, Docker, etc.) to pass through as "question" and receive full LLM responses.

Decision

Three changes:

  1. Q&A path isolation — When intent is "question", the backend returns the response immediately as plain text without code extraction/validation. Code extraction only runs for "generate_component" intent.

  2. Off-topic intent — Added "off_topic" as a third intent classification. Questions about other tools, platforms, or unrelated topics are blocked before calling the main LLM, saving API cost and enforcing scope.

  3. Frontend fallback scoping — The frontend only shows a component card for Q&A responses if message.completedSteps contains component generation steps (indicating the backend intended to generate a component). This prevents example code in explanatory answers from being misinterpreted.

Consequences

Benefits:

  • Q&A answers with example code render as markdown, not component cards
  • Off-topic questions are blocked before the expensive LLM call
  • Clear separation between generation and Q&A paths

Trade-offs:

  • If intent classification misses a follow-up modification (classifies as "question"), the component card won't appear. Mitigated by improved TranslationFlow prompt with modification examples.

Key Files:

  • src/backend/.../agentic/services/assistant_service.pyif not is_component_request: yield complete; return
  • src/backend/.../agentic/flows/translation_flow.py — three intents: generate_component, question, off_topic
  • src/frontend/.../components/assistant-message.tsx — fallback scoped by completedSteps

ADR-008: Fixed-Width Zoom Percentage Display

Status: Accepted

Context

The zoom percentage in the canvas controls bar (e.g., "65%", "150%", "200%") caused the entire controls bar to shift width when the zoom changed between values with different character counts. This created a visually distracting layout jump.

Decision

Apply a fixed width (w-11, 44px) with text-center to the zoom percentage display. Reduce the button's outer padding (px-0.5) to remove dead space between the redo icon and the percentage, and add gap-0.5 between the percentage text and the chevron icon.

Key Files:

  • src/frontend/.../canvasControlsComponent/CanvasControlsDropdown.tsx — fixed-width zoom display

ADR-009: GPU-Accelerated Panel Open Transition

Status: Accepted

Context

Opening the assistant panel felt sluggish when there were previous chat messages. The root cause was transition-all duration-300 on the panel container, which forced the browser to transition every CSS property (including height, width, border, shadow) across the entire message DOM on every open/close.

Decision

  1. Replace transition-all with transition-[opacity,transform] — only animate the two properties needed for the fade+slide effect.
  2. Reduce duration-300 to duration-200 for a snappier feel.
  3. Add will-change-[opacity,transform] to hint the browser to GPU-accelerate these properties, avoiding expensive repaints on the message list.

Consequences

Benefits:

  • Panel opens instantly regardless of message count
  • No layout thrashing from transitioning height/width/shadow/border
  • GPU-composited animation avoids main-thread repaints

Trade-offs:

  • Size changes (compact → expanded) are no longer animated (they snap instantly, which is actually preferable)

Key Files:

  • src/frontend/.../assistantPanel/assistant-panel.tsxcontainerClasses transition properties

ADR-010: Two-Phase Validation (Static + Runtime)

Status: Accepted

Context

The original validation (validate_component_code) only performed static AST analysis — syntax, class name extraction, overlapping I/O names, return statements. Code with valid syntax but wrong imports (e.g., from lfx.base import Component instead of from lfx.custom import Component) passed validation, was marked as validated: true, and showed "Add to Canvas". Clicking it failed silently because the /api/v1/custom_component endpoint performed real instantiation.

Decision

Add a second validation phase (validate_component_runtime) that attempts to instantiate the component using Component(_code=code) + build_custom_component_template(). If runtime validation fails, the error is fed back into the retry loop.

Consequences

Benefits:

  • Components marked as validated: true can always be added to canvas
  • Import errors, missing base classes, and runtime issues are caught during generation
  • The retry loop includes runtime error context, improving LLM's ability to self-correct

Trade-offs:

  • Runtime validation executes the generated code (mitigated by the prior scan_code_security AST scan — its denylist was widened to block secret/env exfiltration os.environ/os.getenv/os.putenv, raw file access open()/breakpoint(), and dunder sandbox escapes __subclasses__/__globals__/__builtins__/…; HTTP is intentionally allowed to keep legitimate API components working). Note this is a denylist, not a true sandbox.
  • Slightly slower validation per attempt (~100ms overhead)

Key Files:

  • src/backend/.../agentic/helpers/validation.pyvalidate_component_runtime()
  • src/backend/.../agentic/helpers/code_security.pyscan_code_security() denylist (shared with the run-time gate; see ADR-MCP-040 in langflow-assistant-mcp.mdrun_working_flow re-scans every node's inline code before exec to close the bypass for code that skipped generation-time scanning)
  • src/backend/.../agentic/services/assistant_service.py — calls runtime validation after AST passes

ADR-011: Session ID Isolation from Playground

Status: Accepted

Context

Assistant sessions appeared in the Playground's session list because both used the same MessageTable with the same flow_id. The Assistant's ChatOutput component stored messages with should_store_message=True, and the Playground queried SELECT DISTINCT session_id FROM message WHERE flow_id = ?.

Decision

  1. Prefix all Assistant session IDs with agentic_ on the frontend.
  2. Filter out agentic_-prefixed sessions in the Playground's session query.

Consequences

Benefits:

  • Assistant sessions are completely invisible in the Playground
  • Agent's conversation memory still works (same prefixed session_id across turns)
  • No database schema changes required

Key Files:

  • src/frontend/.../hooks/use-assistant-chat.tsAGENTIC_SESSION_PREFIX
  • src/backend/.../api/v1/monitor.pyWHERE session_id NOT LIKE 'agentic_%'

ADR-012: Configurable Keyboard Shortcut

Status: Accepted

Context

The "A" key shortcut to open the assistant was hardcoded in FlowPage/index.tsx, making it impossible for users to remap or disable via Settings > Shortcuts.

Decision

Register the shortcut in the existing customDefaultShortcuts system with name "AI Assistant" and default key "a". The FlowPage reads the shortcut from useShortcutsStore instead of using a hardcoded string.

Key Files:

  • src/frontend/.../customization/constants.ts"AI Assistant" entry
  • src/frontend/.../stores/shortcuts.tsaiAssistant: "a"
  • src/frontend/.../pages/FlowPage/index.tsx — reads from store

ADR-013: Resilient Intent Classification for Diverse Models

Status: Accepted

Context

Models like IBM granite return non-JSON responses from the TranslationFlow, causing all requests to default to "question" intent. This prevented component generation from ever triggering with these models.

Decision

Add three progressive fallbacks when json.loads() fails:

  1. Extract JSON from markdown code blocks (```json ... ```)
  2. Find JSON objects embedded in surrounding text
  3. Match known intent keywords in plain text ("generate_component", "off_topic")

Key Files:

  • src/backend/.../services/helpers/intent_classification.py_MARKDOWN_JSON_RE, _EMBEDDED_JSON_RE, keyword fallback

ADR-014: Provider-Specific Parameter Injection

Status: Accepted

Context

IBM WatsonX and Ollama require additional parameters beyond API key and model name (WatsonX: URL + project ID; Ollama: base URL). The inject_model_into_flow function only injected the model value into Agent nodes, leaving provider-specific fields empty. This caused authentication failures for WatsonX in the Assistant.

Decision

Thread provider_vars (resolved from database) through flow_executorflow_loaderinject_model_into_flow. The injection function now sets api_key, base_url_ibm_watsonx, project_id (WatsonX) and base_url_ollama (Ollama) on Agent node templates.

Key Files:

  • src/backend/.../services/flow_preparation.pyprovider_fields injection
  • src/backend/.../services/flow_executor.py — passes global_variables as provider_vars

ADR-015: Session Storage in localStorage (Not Database)

Status: Accepted

Context

Users expect assistant session history to persist. A decision was needed on whether to store sessions in the database (like the Playground) or in browser localStorage.

Decision

Session history is stored in browser localStorage (key: langflow-assistant-sessions), limited to 10 sessions. Sessions are serialized/deserialized with progress state stripped and in-flight messages marked as "cancelled".

Consequences

Benefits:

  • Zero backend changes — no new API endpoints or database tables
  • Fast read/write with no network latency
  • Simple implementation matching the frontend-scoped session_id model

Trade-offs:

  • Sessions are lost when browser cache is cleared — users should be aware
  • No cross-device or cross-browser synchronization
  • Limited to 10 sessions (oldest dropped on overflow)
  • Session data is not backed up or recoverable

Important: This is a known limitation. If users report lost sessions, the answer is that assistant sessions are browser-local only. Database persistence can be added as a future enhancement if needed.

Key Files:

  • src/frontend/.../hooks/use-session-history.tssaveCurrentSession(), switchSession(), deleteSession()
  • src/frontend/.../helpers/session-storage.ts — serialization/deserialization
  • src/frontend/.../assistant-panel.constants.tsASSISTANT_SESSIONS_STORAGE_KEY, ASSISTANT_MAX_SESSIONS

ADR-016: Single-Agent-Loop over Multi-Phase Orchestration

Status: Accepted (supersedes the multi-phase orchestrator approach)

Context

The assistant had grown into a multi-phase orchestrator with separate sub-agents per phase. This added coordination overhead, divergent prompts, and made compound requests ("make a component, build a flow with it, and run it") brittle. Tools like Claude Code and Codex demonstrate that a single agent with a good toolkit handles multi-step work more reliably.

Decision

Collapse the assistant into ONE agent (flow_builder_assistant.py) plus an MCP toolkit (src/lfx/src/lfx/mcp/flow_builder_tools/). Single-thing requests behave exactly as before (byte-identical experience). When the user asks for multiple things in one prompt, the SAME single loop chains the tools (orchestration) — no separate sub-agents are spawned.

Consequences

Benefits:

  • One prompt, one loop — simpler to reason about and to evolve
  • Compound requests handled natively by tool chaining
  • No cross-sub-agent state divergence

Trade-offs:

  • The single prompt must encode targeting/self-verification discipline (see ADR-021)
  • Long tool chains accumulate context within one loop

Key Files:

  • src/backend/.../agentic/flows/flow_builder_assistant.py — the single agent
  • src/lfx/src/lfx/mcp/flow_builder_tools/GenerateComponent, DescribeFlowIO, RunFlow
  • src/backend/base/langflow/agentic/ARCHITECTURE.md — end-to-end diagrams

ADR-017: Provider-Agnostic In-Flow Model Selection

Status: Accepted (clarifies/supersedes the provider-preference part of ADR-014 and the §3 "Model Selection Behavior" / ModelProviderConfiguration provider-preference invariant)

Context

The §3 ModelProviderConfiguration invariant and the "Model Selection Behavior" note state a hard preference order "Anthropic > OpenAI > Google Generative AI > Groq" and imply a provider fallback obligation for the in-flow Agent model. In practice, forcing a fixed order (especially an OpenAI obligation) fails when only a different provider has a configured key.

Decision

available_model_providers(global_variables) in agentic/services/flow_preparation.py detects providers that actually have a configured key. The prompt injects an [Available language models ...] block and forbids running an Agent that has no model.

Clarification of the older text (the original text is intentionally not deleted): the assistant's own model is preferred when available; for the flow's Agent node, any provider with a configured key may be used — there is no fixed OpenAI/Anthropic obligation and no fixed provider-preference order.

Consequences

Benefits:

  • Works with any single configured provider (no OpenAI requirement)
  • The Agent node is never wired without a usable model
  • Prompt is grounded in the user's real provider configuration

Trade-offs:

  • Model choice for the flow's Agent is less deterministic across users

Key Files:

  • src/backend/.../agentic/services/flow_preparation.pyavailable_model_providers()

ADR-018: Optional propose_plan

Status: Accepted

Context

Previously the agent stopped for a plan card on essentially every flow build, which slowed down small/unambiguous edits and added an extra approval round-trip.

Decision

propose_plan is now OPTIONAL. The agent only stops for a plan card on large or ambiguous changes; small unambiguous changes are applied directly. Plan approval still uses the byte-identical PLAN_APPROVAL_INPUT protocol string (matched exactly, bypassing the classifier).

Consequences

Benefits:

  • Faster path for routine edits
  • Plan card reserved for cases where it actually adds value

Trade-offs:

  • Heuristic decides "large/ambiguous"; occasional mismatch with user expectation

ADR-019: Run Metrics Extraction

Status: Accepted

Context

After the assistant runs a flow, users want to know how long it took and how many tokens it consumed. Earlier "run visual feedback" UI added complexity without surfacing concrete numbers.

Decision

RunFlow returns {duration_seconds, input_tokens, output_tokens, total_tokens}. Duration is measured with perf_counter; tokens are aggregated via extract_graph_token_usage over the graph vertices (agentic/services/flow_run.py). All "run visual feedback" code was removed — only run + result + metrics remain.

Consequences

Benefits:

  • Accurate, comparable duration (monotonic clock, not wall-clock)
  • Token usage is read from the actual graph vertices, not estimated
  • Simpler surface (no separate run-feedback UI)

Trade-offs:

  • Token counts depend on vertices exposing usage; components that don't report usage contribute zero

Key Files:

  • src/backend/.../agentic/services/flow_run.pyperf_counter duration, extract_graph_token_usage

ADR-020: Edit + Run Continuation Gating

Status: Accepted (extends ADR for assistant edit-approval continuation)

Context

When the user approves proposed canvas edits, deferred steps (e.g., running the flow) must still complete on the SAME request. An earlier implementation re-sent the continuation unconditionally, causing a duplicate-message glitch when there was nothing deferred.

Decision

After approval the frontend saves the flow, then silently re-sends EDIT_CONTINUATION_INPUT (byte-identical FE/BE protocol string, matched exactly, bypassing the classifier) so the same request finishes deferred steps. configure_component direct-apply now runs in the same turn. Continuation only fires if a deferred step actually existed, gated by continuation_expected.

Consequences

Benefits:

  • Deferred run completes seamlessly after edit approval
  • No duplicate assistant message when nothing was deferred

Trade-offs:

  • Frontend and backend must keep the protocol string byte-identical

ADR-021: Real Introspection for User Component Overlay & In-Place Working-Flow ContextVar

Status: Accepted

Context

The user component overlay scaffolded outputs by guessing, causing run crashes like "Attribute build_output not found" and wrong-output scaffolds. Separately, build_flow rebinding the working-flow ContextVar via .set() meant the run engine did not see the canvas ("There is no flow on the canvas to run").

Decision

  1. The overlay now does REAL introspection via build_custom_component_template(Component(_code=code)) (agentic/services/user_components_overlay.py), reflecting the class's real Output method (AST-derived).
  2. build_flow mutates the working-flow ContextVar IN PLACE (never .set() rebind) so the run engine sees the canvas. New ContextVars: agent_run_context (provider/model/api_key_var) and _current_flow_id_var.
  3. The prompt has an explicit targeting rule (act on the right component) plus self-verification after acting.

Consequences

Benefits:

  • Fixes "Attribute build_output not found" and wrong-output scaffolds
  • Run engine reliably sees the freshly built flow
  • Agent acts on the intended component and verifies its own actions

Key Files:

  • src/backend/.../agentic/services/user_components_overlay.py — real introspection
  • src/backend/.../agentic/services/agent_run_context.pyagent_run_context, _current_flow_id_var

ADR-022: Frontend — Replace-Canvas fitView & Edit Diff Card Readability

Status: Accepted

Context

"Replace canvas" left the new flow off-screen, and the flow-edit diff card was hard to read (raw \n, missing "Show more", misaligned Accept/Dismiss).

Decision

  1. Replace canvas performs a proper fitView via a double requestAnimationFrame (apply-flow-update.ts).
  2. The flow-edit diff card renders without raw \n, restores "Show more", and aligns Accept/Dismiss to the GHOST green button pattern used by the other cards.

Consequences

Benefits:

  • Replaced flow is visible and framed immediately
  • Diff card is readable and visually consistent with sibling cards

Key Files:

  • src/frontend/.../apply-flow-update.ts — double-rAF fitView

ADR-023: Per-Turn Usage + Duration Surfaced on Every complete Event

Status: Accepted

Context

The Playground exposes per-message LLM cost (token counts + duration) via the MessageMetadata badge. The Assistant did not — users had no signal for what each turn cost, even when retries / fallbacks doubled or tripled the run. Without that signal there is also no way to alert on outliers.

Decision

  1. execute_flow_with_validation_streaming opens a request_started_at = perf_counter() clock and a total_usage = {input_tokens, output_tokens, total_tokens} dict.
  2. A nested _accumulate(tokens, phase=...) is called after every LLM call — TranslationFlow (phase="intent"), the main agent (phase="main"), and every retry — so the running total reflects the true per-turn cost.
  3. A nested _complete(data) closure replaces every format_complete_event(...) call in the orchestrator (success, off-topic refusal, sanitization-blocked, retry-exhausted, plain Q&A, no-code, etc.) and injects usage + duration_seconds into the payload.
  4. The executor wraps per-run token usage from extract_graph_token_usage(graph) under a _metrics key in the result dict; both the orchestrator's end branch and classify_intent .pop("_metrics", ...) before returning so it never leaks into user-facing text.
  5. Frontend: AssistantMessage grows usage + duration fields; AssistantMessageItem renders MessageMetadata with the new subtle prop (muted-foreground color, no ml-auto) inline next to the assistant title; duration_seconds → milliseconds at the boundary to match the existing renderer's contract.
  6. emit_execution_retry_events takes a complete_event_formatter callback so the retry-exhausted complete event reports its real cost (the orchestrator passes _complete; default is format_complete_event for callers that don't track cost).

Consequences

Benefits:

  • Users see the true per-turn cost (TranslationFlow + every agent attempt + retries), not a partial total
  • Reuses the Playground's renderer (no parallel UI to maintain)
  • usage/duration are part of the complete payload so session-load → rehydrate also shows the badge
  • Per-phase assistant.tokens.phase logs back dashboards and outlier alerts without needing a new metrics pipeline

Trade-offs:

  • Frontend now stores usage/duration on the AssistantMessage (session-storage size grows marginally per message)
  • Retry-exhausted paths must remember to pass the formatter override (defended by the default)

Key Files:

  • src/backend/.../agentic/services/assistant_service.pytotal_usage, _accumulate, _complete
  • src/backend/.../agentic/services/flow_executor.py_metrics envelope
  • src/backend/.../agentic/services/flow_types.pyIntentResult.tokens
  • src/backend/.../agentic/services/helpers/intent_classification.py_with_tokens wrapper threaded through every fallback path
  • src/backend/.../agentic/helpers/streaming_retry.pycomplete_event_formatter param
  • src/frontend/.../components/common/messageMetadataComponent/subtle prop
  • src/frontend/.../components/core/assistantPanel/components/assistant-message.tsx — inline render
  • src/frontend/.../components/core/assistantPanel/hooks/use-assistant-chat.ts — store on message, ms conversion

ADR-024: Model-Fallback Chain for model_not_found-Class Errors

Status: Accepted

Context

OpenAI 403 model_not_found (catalog default not enabled for the user's project) reached the SSE error event as the generic "Error building Component Agent" instead of trying a sibling model the user could actually call. Users had no idea why a sensible default broke and no obvious recovery.

Decision

Add an inner while swap_requested: loop inside the streaming attempt:

  1. tried_models: set[str] is seeded with the resolver's default so the fallback walks past it.
  2. On FlowExecutionError, if is_model_unavailable_error(e.original_error_message) matches a curated denylist of substrings ("model_not_found", "does not have access to model", "model is not available", "the model does not exist", "model not available", "no access to model") AND a provider is known, the orchestrator picks the next candidate from get_provider_model_candidates(provider), logs assistant.model_fallback from=... to=... provider=... tried_so_far=[...], swaps model_name, sets swap_requested = True, and re-runs THIS attempt — without consuming a slot from the outer validation-retry budget.
  3. When every candidate has been tried, format_models_exhausted_message(provider, tried_models) becomes the user-facing execution_error.
  4. Auth / rate-limit / network errors deliberately do NOT match the markers — they would recur on the next model and mask the real problem.
  5. Per-iteration state (result, cancelled, execution_error, has_flow_updates, saw_set_flow, saw_run, last_set_flow, set_flow_applied) is reset at the top of each inner-loop pass so a swap starts cleanly.

Consequences

Benefits:

  • A wrong default model never blocks a turn that another model on the same provider could serve
  • The named "exhausted" message gives the user an actionable next step (request access OR switch provider)
  • Auth / rate-limit / network failures keep their existing semantics

Trade-offs:

  • Multiple LLM calls when the first models fail (cost is bounded by the candidate list length — same provider only)
  • The marker denylist is a heuristic; new wording variants for "model unavailable" must be added explicitly

Key Files:

  • src/backend/.../agentic/services/assistant_service.py — inner swap loop + reset block
  • src/backend/.../agentic/helpers/error_handling.pyis_model_unavailable_error, _MODEL_UNAVAILABLE_MARKERS, format_models_exhausted_message
  • src/backend/.../agentic/services/provider_service.pyget_provider_model_candidates

ADR-025: Built-in Component Code Exemption for the Run-Time Security Gate

Status: Accepted (extends ADR-MCP-040 from langflow-assistant-mcp.md)

Context

run_working_flow AST-scans every node's inline code before exec to close the bypass for component code that skipped the generation-time scanner (ADR-MCP-040). The denylist is tuned for LLM-generated code and correctly blocks secret-env exfiltration, raw file access, and dunder sandbox escapes. But trusted built-ins legitimately use forbidden patterns: URLComponent calls importlib.util.find_spec("langflow") for optional dependency detection and os.environ.get("HTTPS_PROXY") for proxy config. The result was a flood of false-positive "refused to run" errors on flows that only ever contained built-ins the agent had copied from the registry.

Decision

Add a built-in exemption in _scan_flow_component_code:

  1. _get_canonical_code_map() walks the user-aware registry once and builds {component_type: canonical_code}. Registry-lookup failure returns {} so the caller falls back to scan-all (never trust unverified code on the degraded path).
  2. _normalize_code(code) strips per-line trailing whitespace + outer blanks — registry-JSON-flow round-trips occasionally re-line-end the code, and a benign serialization artifact should not force an unnecessary scan.
  3. For each node, if a canonical code exists for its type AND _normalize_code(node_code) == _normalize_code(canonical), skip the scan.
  4. Otherwise (LLM- or user-modified code, or registry lookup failed) scan as before.

Consequences

Benefits:

  • Trusted built-ins run without false-positive blocks
  • The exemption is byte-identity-based (only the canonical registry copy is trusted), so a single character change re-enables the scan
  • Registry failure degrades safely (scan-all)
  • Independent of the validator (code_security) — same denylist, narrower scope

Trade-offs:

  • Adds one registry lookup per run (cached by the existing _load_registry_user_aware machinery)
  • A canonical built-in updated in the registry will momentarily look "modified" in old saved flows until the user re-copies it (still safe — scan runs)

Key Files:

  • src/backend/.../agentic/services/flow_run.py_get_canonical_code_map, _normalize_code, _scan_flow_component_code
  • src/backend/tests/.../test_flow_run_builtin_exemption.py — tests for byte-identical exemption, modified-divergence scan, unknown-type scan-all, registry-lookup error scan-all, whitespace-drift identity

ADR-026: Generic Tool-Name Fallback + Reserved-Name Guardrails

Status: Accepted

Context

Tool-mode components were sometimes wired with output methods whose names carry no semantic signal — output, process, build_output, run. The LLM-facing tool name comes from the method, so the agent saw tools called output / process and either ignored them or called them randomly. Separately, two production failures stemmed from the synthetic-tool sentinel name component_as_tool (and method to_toolkit): (a) the validator did not refuse LLM-generated code that declared Output(name="component_as_tool", ...) — the runtime then dropped the output and the agent got an empty tool list; (b) the runtime's _should_skip_output matched on name alone, so a user-declared component_as_tool was wrongly stripped (2026-05-27 incident).

Decision

  1. Generic-method fallback (lfx/base/tools/component_tool.py): _derive_tool_name(component, output_method, outputs) checks an _GENERIC_OUTPUT_METHOD_NAMES frozenset ("output", "process", "build_output", "run", "execute", "main", "handler", "build_result"). When a Component has exactly ONE tool-exposed output AND the method is generic, the tool name is the snake_cased class name via _class_name_to_tool_name(class_name) (acronym-preserving — HTTPClienthttp_client, S3Buckets3_bucket). Multi-output components keep method-derived names so tools don't collide.
  2. Validator guardrails (agentic/helpers/validation.py): validate_component_code adds two checks — _RESERVED_OUTPUT_NAME = "component_as_tool" and _RESERVED_OUTPUT_METHOD = "to_toolkit" — and returns an actionable ValidationResult(is_valid=False, ...) whose error explains the synthetic collision and suggests a value-descriptive name (item, price, result).
  3. Runtime precision (lfx/base/tools/component_tool.py::_should_skip_output): the synthetic check now requires name + method + types ALL match the synthetic. Anything less is a user-declared output that happens to share the name and must be kept.
  4. Prompt guidance: a new "Agent Tool Compatibility" section in the LangflowAssistant.json system prompt teaches the generator (a) action verb_noun method naming, (b) class-level description as tool description, (c) tool_mode=True discipline with clear info=, (d) NEVER use component_as_tool/to_toolkit.

Consequences

Benefits:

  • The agent gets descriptive tool names even when the LLM picked a generic method
  • LLM-generated Output(name="component_as_tool", ...) is rejected at the retry loop's first turn instead of failing silently at runtime
  • A legitimate user-declared component_as_tool (e.g. from a saved flow predating the validator) is no longer dropped
  • The prompt + validator + runtime form three independent layers of defense

Trade-offs:

  • The class-name fallback is narrow (single-output components only) so multi-output toolkits keep distinct names — descriptive method names from the LLM still beat the class name (no override there)

Key Files:

  • src/lfx/src/lfx/base/tools/component_tool.py_GENERIC_OUTPUT_METHOD_NAMES, _class_name_to_tool_name, _derive_tool_name, tightened _should_skip_output
  • src/backend/.../agentic/helpers/validation.py_RESERVED_OUTPUT_NAME, _RESERVED_OUTPUT_METHOD
  • src/backend/.../agentic/flows/LangflowAssistant.json — "Agent Tool Compatibility" prompt section

ADR-027: Diagnostic-Preserving Friendly Errors + API-Key Variable Naming

Status: Accepted

Context

Two patterns produced misleading errors: (a) Provider client errors arrive as "Error building Component Agent: <real cause>" or as a Python repr containing 'message': '...'. The default colon-split truncation in extract_friendly_error returned the wrapper prefix and discarded the actually useful detail. (b) get_llm's "API key required" error always pointed to the canonical variable name (e.g. OPENAI_API_KEY) even when the user had configured a differently-named Global Variable (e.g. MY_OPENAI_KEY) — and when the provider arrived empty / "Unknown" the message became the nonsense "Unknown API key is required when using Unknown provider … UNKNOWN_API_KEY".

Decision

  1. Diagnostic preservation (extract_friendly_error): a new _extract_deepest_meaningful_cause runs FIRST — it tries _PROVIDER_MESSAGE_RE ('message': '...' repr) then the "Error building Component " wrapper prefix (returning the substring after the first :, length-gated by MIN_MEANINGFUL_PART_LENGTH). Unwrapped errors are byte-identical to the prior behavior.
  2. API-key naming (get_llm): the function captures original_api_key_input BEFORE the global-variable resolver runs. If the resolver fails AND the original input differs from the canonical name, the error message names BOTH ("The variable 'MY_OPENAI_KEY' referenced by the component's api_key field could not be resolved … Configure 'MY_OPENAI_KEY' (or the canonical 'OPENAI_API_KEY') in Settings → Model Providers.").
  3. Empty/Unknown provider guard (get_llm): an empty or "Unknown" provider replaces the API-key error with "The selected model is missing a provider. Please reselect a model from the dropdown in the Language Model field …" — pointed at the actual fix.

Consequences

Benefits:

  • Users see the actionable cause, not the wrapper prefix
  • A misconfigured Global Variable name is surfaced back to the user verbatim
  • The "Unknown" placeholder path is replaced with a usable instruction

Trade-offs:

  • Two new regex / prefix checks run on every friendly-error path; cheap but non-zero
  • The provider-empty branch shadows the variable-naming branch — intentional ordering, but worth keeping in mind for future error types

Key Files:

  • src/backend/.../agentic/helpers/error_handling.py_extract_deepest_meaningful_cause, _PROVIDER_MESSAGE_RE, _COMPONENT_WRAPPER_PREFIX
  • src/lfx/src/lfx/base/models/unified_models/instantiation.pyget_llm API-key + provider guards

ADR-028: Frontend ModelInput Trigger Sanitization (recoverModelOption)

Status: Accepted

Context

The Agent node's Language Model dropdown trigger sometimes displayed a literal stringified JSON value — [{"provider":"OpenAI","name":"gpt-4o",...] — instead of the model name. Root cause: the assistant's flow_update pipeline can produce a doubly-encoded payload where the entire model list is serialized into the first array element's name field. The trigger reads selectedModel?.name and rendered the raw JSON. (PR-12575 Bug 3.)

Decision

Add a defensive recoverModelOption(value?.[0]) step before reading name in ModelInputComponent. The helper:

  1. Detects a stringified array/object in value[0].name (or value[0] itself).
  2. Parses it and recovers the actual model entry.
  3. Returns a ModelOption whose name is a plain readable string.

The recovery is purely defensive — it runs on every trigger render but short-circuits when the value is already canonical.

Consequences

Benefits:

  • The dropdown trigger never shows literal JSON to the user
  • Repairs already-saved bad flows on read without forcing a re-save
  • Complements the backend coercion (ADR-029) — frontend defense for any payload that slipped through

Trade-offs:

  • Runs on every render; trivial cost but worth noting if the trigger ever becomes hot

Key Files:

  • src/frontend/.../parameterRenderComponent/components/modelInputComponent/helpers/recover-model-option.ts
  • src/frontend/.../parameterRenderComponent/components/modelInputComponent/index.tsx — replaces value?.[0] with recoverModelOption(value?.[0])

ADR-029: Serialized Model Spec Coercion at configure_component

Status: Accepted

Context

The agent's BuildFlowFromSpec and ConfigureComponent tools sometimes emit model-typed template field values as a JSON or YAML string instead of the canonical [{"provider": X, "name": Y}] list. A QA-observed pattern stuffs an entire serialized list into the first element's name field: [{"name": "[{...JSON spec...}]", "provider": "Unknown"}]. Without normalization, the raw string lands in template['model'].value, the catalog falls back to provider="Unknown", and get_llm raises ValueError: missing a provider.

Decision

Coerce model values at the single configure_component choke point (lfx/graph/flow_builder/component.py):

  1. _parse_serialized_model_text(text) — tries JSON then YAML, but only on text that looks structured ({, [, - , or key: value + newline). A bare model name like "gpt-4o" is left untouched so the catalog path still runs.
  2. _coerce_single_model_entry(item) — unwraps the nested-serialized-spec-stuffed-into-name pattern.
  3. _coerce_model_value(value) — normalizes the value to canonical list[dict]. Accepts already-canonical lists (each entry still checked for nested serialization), single dicts, and serialized strings.
  4. configure_component invokes the coercion only when template[key].type == "model". Both template[key].value AND params[key] (in place) hold the coerced shape so post-configure helpers (e.g. _mirror_model_value_into_options) read the canonical value.

Consequences

Benefits:

  • Catches the bug at the single tool-write boundary instead of repairing every consumer
  • Bare model names still work (catalog path unchanged)
  • Independent of the frontend recovery (ADR-028) — both layers protect different ingress paths

Trade-offs:

  • The "looks structured" heuristic must keep up with new spec serializations (today: JSON, YAML inline + block)
  • Mutating params in place is necessary for downstream helpers but is a subtle contract

Key Files:

  • src/lfx/src/lfx/graph/flow_builder/component.py_parse_serialized_model_text, _coerce_single_model_entry, _coerce_model_value, configure_component
  • src/lfx/tests/unit/test_build_flow_from_spec.py — coverage for the QA patterns

ADR-030: Empty-State ModelProviderModal Inline Open

Status: Accepted

Context

The "No Models Configured" empty state's "Configure providers" button called navigate("/settings/model-providers"), taking the user out of the assistant panel (and often out of the flow page entirely) just to configure the prerequisite. The Settings page already exposes a ModelProviderModal that can be mounted anywhere.

Decision

Replace the navigate call with an inline <ModelProviderModal> opened via local React state (isProviderModalOpen). The button gets a stable testid (assistant-no-models-configure-providers) for E2E coverage and the modal is rendered with modelType="llm".

Consequences

Benefits:

  • Users configure providers without leaving the assistant
  • The flow page state (selection, viewport) is preserved
  • The button is now E2E-targetable via its dedicated testid

Trade-offs:

  • One more modal mount inside the assistant tree (cheap; lazy-rendered)

Key Files:

  • src/frontend/.../assistantPanel/components/assistant-no-models-state.tsx

ADR-031: @-Mention of Canvas Components and Fields in the Assistant Input

Status: Accepted

Context

The assistant always receives a (capped) canvas summary, but users had no way to point it at a specific component or the current value of a single field without describing it in prose. On large canvases this is ambiguous and burns tokens re-describing what is already on screen.

Decision

Add an @-mention affordance to the assistant input, parsed entirely on the frontend:

  1. @ opens a filterable list of canvas components (detectMention). Confirming inserts a quoted, space-free token '<componentId>' (formatMentionToken).
  2. Typing . adjacent to a confirmed token re-triggers the list in field mode (detectFieldMention), sourcing the component's user-facing fields from node.data.node.template client-side (no network call, via toFieldItems). Confirming inserts one terminal token '<componentId>.<fieldName>' (formatFieldMentionToken).

The agent resolves these tokens with the existing MCP tools get_flow_component_details (component) and get_flow_component_field_value (field) — no backend change required.

Consequences

Benefits:

  • Precise, low-token references to a component or a single field value
  • The field list is instant (client-side template) and exact (keyed off the component id, not a fuzzy name)
  • Reuses the server-side resolution tools that already existed

Trade-offs:

  • The component token no longer auto-appends a trailing space — the . must sit adjacent to the closing quote to chain into a field mention, so users type their own space to continue prose. The terminal field token keeps a trailing space.
  • Only fields with a display_name and show !== false are listed; underscore-prefixed/internal keys (_type, _frontend_node_*) and code are excluded. A valid field that lacks a display_name would not appear.

Key Files:

  • src/frontend/.../assistantPanel/helpers/mention-parsing.tsdetectMention, detectFieldMention, formatMentionToken, formatFieldMentionToken
  • src/frontend/.../assistantPanel/hooks/use-component-mentions.ts — mode state, toFieldItems, confirm + canvas-highlight logic
  • src/frontend/.../assistantPanel/components/assistant-mention-popover.tsx — list rendering (field rows show the name only) and active-item scrollIntoView
  • src/backend/.../agentic/utils/flow_component.pyget_component_details, get_component_field_value (resolution, unchanged)

6. Technical Specification

6.1 Dependencies

TypeNamePurpose
ServiceFlowExecutorExecutes pre-built assistant flows (.py or .json, with .py taking priority)
ServiceProviderServiceDetects configured model providers and retrieves API keys
ServiceVariableServiceRetrieves user's stored API keys from encrypted storage
ServiceValidationServiceCompiles and instantiates component code for validation
External APILLM Provider APIsOpenAI, Anthropic, Azure, Google, IBM WatsonX, Ollama, Groq - for text generation
Librarylfx.runFlow execution engine
Librarylfx.custom.validateComponent class creation and validation
Frontenduse-stick-to-bottomAuto-scroll behavior in chat
Frontend@xyflow/reactCanvas integration for component placement

6.2 API Contracts

POST /api/v1/agentic/assist/stream

Purpose: Generate component or answer question with streaming progress updates

Request:

json
{
  "flow_id": "string - Required. UUID of the current flow",
  "input_value": "string - The user's message/prompt",
  "provider": "string - Optional. Model provider (openai, anthropic, etc.)",
  "model_name": "string - Optional. Specific model name (gpt-4o, claude-3-opus, etc.)",
  "max_retries": "integer - Optional. Total validation attempts (default: 3)",
  "session_id": "string - Required for conversation memory. Prefixed with 'agentic_' to isolate from Playground. Generated once per session by the frontend, reused across all requests. New ID on 'New session' only. Backend falls back to uuid4() if omitted."
}

Response (SSE Stream):

Event: progress

json
{
  "event": "progress",
  "step": "generating_component | generating | extracting_code | validating | validated | validation_failed | retrying | orchestrating",
  "attempt": 0,
  "max_attempts": 3,
  "message": "string - Human-readable status message",
  "error": "string - Optional. Error message for validation_failed",
  "class_name": "string - Optional. Component class name",
  "component_code": "string - Optional. Generated code for validation_failed"
}

Event: token (Q&A only)

json
{
  "event": "token",
  "chunk": "string - Token text"
}

Event: complete

json
{
  "event": "complete",
  "data": {
    "result": "string - Full response text",
    "validated": true,
    "class_name": "UppercaseComponent",
    "component_code": "class UppercaseComponent(Component):...",
    "validation_attempts": 1,
    "continuation_expected": "boolean - Optional. True when a deferred step exists and the frontend should silently re-send EDIT_CONTINUATION_INPUT after saving the flow",
    "usage": {
      "input_tokens": "integer - Optional. Accumulated input tokens for the whole turn (TranslationFlow classification + every agent attempt + retries)",
      "output_tokens": "integer - Optional. Same aggregation rule.",
      "total_tokens": "integer - Optional. Same aggregation rule."
    },
    "duration_seconds": "number - Optional. perf_counter-measured wall-clock duration of the whole turn (server-side around the entire pipeline). Rendered as milliseconds by the frontend MessageMetadata badge.",
    "run_metrics": {
      "duration_seconds": "number - Optional. perf_counter-measured run duration (present when RunFlow executed — distinct from the turn-level duration_seconds above)",
      "input_tokens": "integer - Optional. Aggregated via extract_graph_token_usage over graph vertices",
      "output_tokens": "integer - Optional.",
      "total_tokens": "integer - Optional."
    }
  }
}

Note on usage + duration_seconds — these are surfaced on EVERY complete event (success, refusal, off-topic, retry-exhausted, sanitization-blocked, plain Q&A) via the _complete(data) closure inside execute_flow_with_validation_streaming. The frontend stores them on AssistantMessage.usage / AssistantMessage.duration (ms) and renders the Playground's MessageMetadata badge with the new subtle prop next to the assistant title. The TranslationFlow's per-call usage is included via IntentResult.tokens.

Event: error

json
{
  "event": "error",
  "message": "string - Friendly error message"
}

Event: cancelled

json
{
  "event": "cancelled",
  "message": "string - Optional cancellation reason"
}

GET /api/v1/agentic/check-config

Purpose: Check if assistant is properly configured and return available providers

Request: None (uses authenticated user context)

Response (Success):

json
{
  "configured": true,
  "configured_providers": ["openai", "anthropic"],
  "providers": [
    {
      "name": "openai",
      "configured": true,
      "default_model": "gpt-4o",
      "models": [
        {"name": "gpt-4o", "display_name": "GPT-4o"},
        {"name": "gpt-4-turbo", "display_name": "GPT-4 Turbo"}
      ]
    }
  ],
  "default_provider": "openai",
  "default_model": "gpt-4o"
}

POST /api/v1/agentic/assist

Purpose: Non-streaming version of assist (prefer streaming for better UX)

Request: Same as /assist/stream

Response (Success):

json
{
  "result": "string - Full response",
  "validated": true,
  "class_name": "MyComponent",
  "component_code": "string - Python code",
  "validation_attempts": 1
}

6.3 Error Handling

Error CodeConditionUser MessageRecovery Action
400No provider configured"No model provider is configured. Please configure at least one model provider in Settings."Navigate to Settings > Model Providers
400Provider not available"Provider 'X' is not configured. Available providers: [list]"Select a different provider or configure the requested one
400Missing API key"OPENAI_API_KEY is required for the Langflow Assistant with openai. Please configure it in Settings > Model Providers."Add API key in Settings
400Unknown provider"Unknown provider: X"Use a supported provider
404Flow file not found"Flow file 'X.json' not found"Ensure agentic flows are deployed
500Flow execution errorFriendly error extracted from the actual error (e.g., "Rate limit exceeded. Please wait a moment and try again.")Retry request; check server logs
ValidationErrorCode syntax errorIncludes SyntaxError: ...System auto-retries with error context
ValidationErrorImport errorIncludes ModuleNotFoundError: ...System auto-retries with error context
ValidationErrorMissing Component base"Could not extract class name from code"System auto-retries with hint
NetworkErrorClient disconnected"Request cancelled"User can retry

7. Observability

7.1 Key Metrics

MetricTypeDescriptionAlert Threshold
assistant_requests_totalCounterTotal number of assistant requestsN/A (baseline)
assistant_requests_by_intentCounterRequests segmented by intent (generate_component, question, off_topic)N/A
assistant_generation_duration_secondsHistogramTime from request to completionP95 > 60s
assistant_validation_attemptsHistogramNumber of validation attempts per requestP95 > 2
assistant_validation_success_rateGaugePercentage of validations succeeding on first attempt< 70%
assistant_provider_usageCounterRequests by provider (openai, anthropic, etc.)N/A
assistant_errors_totalCounterTotal errors by type> 10/min
assistant_cancellations_totalCounterUser-initiated cancellations> 20% of requests

7.2 Important Logs

Log LevelEventFieldsWhen
INFOassistant.request.starteduser_id, flow_id, provider, model_name, intentRequest received
INFOassistant.generation.attemptattempt, max_retriesEach generation attempt
INFOassistant.validation.successclass_name, attemptsComponent validated successfully
WARNINGassistant.validation.failederror, attempt, class_nameValidation failed, will retry
ERRORassistant.validation.exhaustederror, attempts, code_snippetMax retries reached
INFOassistant.request.completedduration_ms, validated, attemptsRequest finished
INFOassistant.request.cancelledreason, duration_msUser cancelled
ERRORassistant.flow.errorerror_type, error_message, flow_nameFlow execution failed

7.3 Dashboards

Assistant Usage Dashboard:

  • Request Rate - Requests per minute over time
  • Intent Distribution - Pie chart of generate_component vs question vs off_topic
  • Provider Usage - Bar chart of requests by provider
  • Validation Success Rate - Time series of first-attempt success rate
  • Response Time P50/P95/P99 - Latency distribution

Assistant Health Dashboard:

  • Error Rate by Type - Stacked area chart
  • Provider Availability - Status indicators per provider
  • Validation Failure Reasons - Top error messages
  • Stream Disconnect Rate - Client disconnects over time

8. Deployment & Rollback

8.1 Feature Flags

No dedicated feature flags are currently implemented. The assistant is always enabled when the agentic backend is available. Feature flags may be added in the future for granular control.

8.2 Database Migrations

  • No database migrations required
  • Configuration stored in existing variables table (API keys)
  • Session messages are persisted in the message store (keyed by session_id) for Agent conversation memory within a session
  • Session ID is frontend-scoped, prefixed with agentic_, and generated per hook instance (reset on "New session")
  • Session history (chat messages list) is stored in browser localStorage only — not in the database. Clearing browser data deletes all assistant session history. This is by design (see ADR-015)
  • The Playground filters out agentic_-prefixed sessions to avoid cross-contamination (see ADR-011)

8.3 Rollback Plan

  1. Immediate: Set assistant_enabled feature flag to off
  2. If backend issues: Revert backend deployment to previous version
  3. If frontend issues: Revert frontend deployment to previous version
  4. Data considerations: No persistent data to rollback
  5. Dependencies: No downstream dependencies affected

8.4 Smoke Tests

  • Open assistant panel from canvas controls
  • Verify model providers are detected (if configured)
  • Submit a simple component generation request
  • Verify progress indicators appear during generation
  • Verify component code is displayed on completion
  • Click "Add to Canvas" and verify component appears
  • Submit a Q&A question and verify streaming response
  • Submit a follow-up modification (e.g., "use dataframe output instead") and verify the progress card appears (not raw code)
  • Verify conversation memory: ask a follow-up that references the previous component
  • Click "New session" and verify the assistant does not remember the previous conversation
  • Cancel a generation in progress
  • Clear conversation history
  • Press the configured shortcut (default A) on the canvas to open the assistant and verify the input is auto-focused
  • Press Escape to close the assistant (from both canvas and input focus)
  • Verify the shortcut is listed and editable in Settings > Shortcuts
  • Verify the input placeholder shows "Working on it..." during generation (not a random suggestion)
  • Resize the floating panel by dragging edges and verify size is persisted
  • Verify zoom percentage display doesn't shift the controls bar when zooming between 60% and 200%
  • Open the assistant panel with previous messages and verify it opens instantly (no lag)
  • Verify assistant sessions do NOT appear in Playground session list
  • Ask an off-topic question (e.g., "how does n8n work?") and verify it is blocked
  • Verify model selector shows correct provider icon (not generic "AI")
  • Switch to a different provider and verify the icon updates
  • Generate a component with invalid imports and verify retry loop catches it with runtime validation
  • Verify validation failure shows friendly message with collapsible error details
  • Verify attempt counter shows correct "Attempt X of 3" (never exceeds max)
  • Test with IBM WatsonX or Ollama provider and verify it works
  • Submit a compound prompt ("create a component, build a flow with it, and run it") and verify the single loop chains the tools and returns a run result
  • Configure ONLY a non-OpenAI provider and verify a flow's Agent node is wired with that provider (no forced OpenAI/Anthropic)
  • Run a flow and verify run metrics (duration_seconds + token counts) are shown after completion
  • Approve proposed canvas edits with a deferred run and verify it continues on the same request (no duplicate message); verify no continuation fires when nothing was deferred
  • Use "Replace canvas" and verify the new flow is framed via fitView (visible, not off-screen)
  • Verify a multi-step prompt shows the "Orchestrating..." progress label
  • Verify every assistant complete reply shows the subtle MessageMetadata badge with token counts and duration; reopen the panel and confirm the badge persists
  • Verify the backend log emits assistant.tokens.phase phase=intent and assistant.tokens.phase phase=main lines per turn
  • Trigger a model_not_found (e.g. force a model the project lacks access to) and verify the streamer swaps to the next provider candidate without consuming a validation slot (look for assistant.model_fallback from=... to=... in logs)
  • Exhaust every candidate on a provider and verify the user sees "No accessible model on <provider>. Tried: [...]. Configure access ... or switch to a different provider in Settings → Model Providers."
  • Open the assistant with no provider configured and verify the "Configure providers" button opens the ModelProviderModal dialog inline (no navigation away)
  • Add a built-in like URLComponent via the assistant and run the flow; verify it is NOT blocked as unsafe (built-in code is byte-identity-exempted from the AST scan)
  • Generate a Component whose Output's method is generic (output/process) and verify the resulting tool name is the snake_cased class name (e.g. RandomMenuItemrandom_menu_item)
  • Generate a Component declaring Output(name="component_as_tool", ...) and verify the validator refuses it with a "reserved name" error suggesting a value-descriptive name
  • Load a saved flow whose model value was doubly-encoded and verify the Agent's Language Model trigger renders a plain model name (not literal JSON like [{"provider":...])
  • Trigger a wrapped error ("Error building Component Agent: <real cause>" or a provider 'message': '...' repr) and verify the friendly error surfaces the deepest meaningful cause
  • Misname a Global Variable for the Agent's api_key and verify the error names BOTH the user's variable AND the canonical key
  • Click Continue on a proposed plan and verify the next request bypasses the TranslationFlow (deterministic PLAN_APPROVAL_INPUT short-circuit; look for intent.build_flow.deterministic: plan-approval continuation signal in logs)
  • Open the assistant on a very large canvas (50+ components) and verify the injected [Canvas reference ...] block is truncated at ~2000 chars with [truncated] marker

9. Architecture Diagrams

9.1 Context Diagram (Level 1)

mermaid
C4Context
  title System Context diagram for Langflow Assistant

  Person(user, "Langflow User", "Builds AI workflows using visual canvas")

  System(assistant, "Langflow Assistant", "AI-powered component generation through natural language")

  System_Ext(llm_providers, "LLM Providers", "OpenAI, Anthropic, Azure, Google - text generation")
  System_Ext(langflow_core, "Langflow Core", "Flow execution, component validation, canvas")

  Rel(user, assistant, "Sends prompts, receives components")
  Rel(assistant, llm_providers, "Generates text via API")
  Rel(assistant, langflow_core, "Validates code, adds to canvas")

9.2 Container Diagram (Level 2)

mermaid
C4Container
  title Container diagram for Langflow Assistant

  Person(user, "User", "Langflow user")

  Container_Boundary(frontend, "Frontend") {
    Container(assistant_panel, "AssistantPanel", "React", "Chat UI with progress indicators")
    Container(assistant_hooks, "Assistant Hooks", "React Hooks", "State management and API calls")
    Container(sse_client, "SSE Client", "TypeScript", "Parses streaming events")
  }

  Container_Boundary(backend, "Backend") {
    Container(agentic_api, "Agentic API", "FastAPI", "HTTP endpoints for assistant")
    Container(assistant_service, "AssistantService", "Python", "Orchestrates generation with retry")
    Container(flow_executor, "FlowExecutor", "Python", "Runs assistant flows")
    Container(validation_service, "ValidationService", "Python", "Validates component code")
  }

  Container_Ext(flows, "Assistant Flows", "JSON/Python", "LangflowAssistant.json, translation_flow.py")
  System_Ext(llm, "LLM Provider", "External API")

  Rel(user, assistant_panel, "Enters prompts")
  Rel(assistant_panel, assistant_hooks, "Uses")
  Rel(assistant_hooks, sse_client, "Processes stream")
  Rel(sse_client, agentic_api, "POST /assist/stream", "SSE")
  Rel(agentic_api, assistant_service, "Delegates")
  Rel(assistant_service, flow_executor, "Executes flows")
  Rel(assistant_service, validation_service, "Validates code")
  Rel(flow_executor, flows, "Loads")
  Rel(flow_executor, llm, "Calls API")

9.3 Component Flow Diagram

Note (2026-05-19): the pipeline is now a single agent loop (flow_builder_assistant.py) plus an MCP toolkit (GenerateComponent, DescribeFlowIO, RunFlow). The diagram below still describes the feature-level intent → generate → validate → run flow (which is byte-identical for single-thing requests); for multi-thing/compound prompts the SAME loop chains the tools. The full single-agent-loop + MCP wiring diagrams live in src/backend/base/langflow/agentic/ARCHITECTURE.md — the existing diagrams here are intentionally left as-is.

mermaid
flowchart TD
    A[User Input] --> B{Intent Classification
TranslationFlow - stateless}
    B -->|off_topic| Z[Return Refusal Message
no LLM call]
    B -->|generate_component| C[Execute LangflowAssistant Flow]
    B -->|question| D[Execute LangflowAssistant Flow
with token streaming]

    D --> F[Complete Response
plain text / Q&A]

    C --> G[Extract Component Code]

    G --> H{Code Found?}
    H -->|No| F
    H -->|Yes| I[Static Validation
AST parsing]

    I --> I2{AST Valid?}
    I2 -->|No| L
    I2 -->|Yes| I3[Runtime Validation
instantiate component]

    I3 --> J{Runtime Valid?}
    J -->|Yes| K[Return Validated Component
component card with Add to Canvas]
    J -->|No| L{Retries Left?}

    L -->|Yes| M[Retry with Error Context]
    M --> C
    L -->|No| N[Return Friendly Error
collapsible details + Try Again]

    K --> O[User Clicks Add to Canvas]
    O --> P[Component API Validation]
    P --> Q[Add to Canvas]

9.4 State Machine: Generation Pipeline

┌──────────┐    ┌─────────────────────────┐
│  Start   │───▶│  Intent Classification  │
└──────────┘    └─────────┬───────────────┘
                          │
              ┌───────────┼────────────┐
              │           │            │
              ▼           ▼            ▼
        ┌──────────┐ ┌──────────┐ ┌─────��──────┐
        │off_topic │ │ question │ │gen_component│
        └────┬─────┘ └────┬─────┘ └─────┬──────┘
             │            │              │
             ▼            ▼              ▼
        ┌──────────┐ ┌──────────┐ ┌────────────────────────┐
        │ refusal  │ │generating│ │  generating_component  │
        │ message  │ └────┬─────┘ └─────────┬──────────────┘
        └──────────┘      │                 │
                          ▼                 ▼
                   ┌────────────┐    ┌─────────────────┐
                   │  complete  │    │generation_complete│
                   │(plain text)│    └────────┬────────┘
                   └────────────┘             │
                                              ▼
                                     ┌─────────────────┐
                                     │ extracting_code │
                                     └────────┬────────┘
                                              │
                                     ┌────────▼────────┐
                                     │   validating    │
                                     │ (AST + Runtime) │
                                     └────────┬────────┘
                                              │
                          ┌───────────────────┼────────────────┐
                          │ Valid             │ Invalid        │
                          ▼                   ▼                │
                 ┌─────────────────┐ ┌──────────────────────┐  │
                 │    validated    │ │  validation_failed   │  │
                 └────────┬────────┘ └──────────┬───────────┘  │
                          │                     │              │
                          ▼                     ▼              │
                 ┌─────────────────┐   ┌─────────────────┐    │
                 │    complete     │   │    retrying     │────┘
                 │   (validated)   │   └─────────────────┘
                 └─────────────────┘          │
                                              │ max attempts
                                              ▼
                                     ┌─────────────────┐
                                     │    complete     │
                                     │ (not validated) │
                                     │ friendly error  │
                                     └─────────────────┘