.cursor/skills/error-handling/SKILL.md
Unified error handling system for DF. Use when adding API endpoints, modifying error handling, or adding frontend API calls.
Prerequisites: Read docs/dev-guides/7-unified-error-handling.md before changing API error behavior. Read docs/dev-guides/2-log-sanitization.md when the work involves logging, credentials, external services, or DataLoaders. If your work introduces new error handling patterns or conventions, update this file and related dev-guides accordingly.
Frontend                               Backend
────────                               ───────
apiClient.ts                           errors.py
├── apiRequest()     ←── JSON ────     ├── ErrorCode (enum)
├── streamRequest()  ←── NDJSON ──     └── AppError (exception)
└── parseStreamLine()
                                       error_handler.py
errorCodes.ts                          ├── register_error_handlers(app)
└── getErrorMessage()                  ├── classify_and_wrap_llm_error()
                                       └── stream_error_event()
errorHandler.ts
└── handleApiError()                   security/sanitize.py
                                       └── classify_llm_error() (internal)
MessageSnackbar ← dfSlice.messages
Use this contract for all new or reworked DF APIs:
| Scenario | HTTP | Shape |
|---|---|---|
| Non-streaming success | 200 | {"status": "success", "data": ...} |
| Non-streaming business/validation error | 200 | {"status": "error", "error": {"code", "message", "retry", "request_id"}} |
| Non-streaming auth/authorization error | 401 / 403 | same structured error body |
| Streaming preflight error | 200 | application/json + {"status": "error", "error": ...} |
| Streaming in-flight fatal error | 200 | NDJSON line: {"type": "error", "error": ...} |
| No Flask route / too large / unhandled crash | 404 / 413 / 500 | transport-level error |
Do not use HTTP 400/422 for application validation errors in new code.
Do not convert in-flight NDJSON errors to status: "error"; once the stream has
started, event type is the protocol discriminator.
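A minimal sketch of the contract above, using only the standard library. The helper names (success_body, error_body, http_status_for), the boolean type of retry, and the exact 401 versus 403 split for the auth codes are assumptions for illustration; the project's real helpers live in error_handler.py and may differ.

```python
import uuid
from typing import Any, Optional

# Assumed split: the contract only states that auth errors use 401 / 403.
AUTH_STATUS = {"AUTH_REQUIRED": 401, "AUTH_EXPIRED": 401, "ACCESS_DENIED": 403}


def http_status_for(code: str) -> int:
    """Business and validation errors stay on HTTP 200; only auth codes differ."""
    return AUTH_STATUS.get(code, 200)


def success_body(data: Any) -> dict:
    """Non-streaming success envelope."""
    return {"status": "success", "data": data}


def error_body(code: str, message: str, retry: bool = False,
               request_id: Optional[str] = None) -> dict:
    """Non-streaming error envelope with the four documented error fields."""
    return {
        "status": "error",
        "error": {
            "code": code,
            "message": message,
            "retry": retry,
            "request_id": request_id or uuid.uuid4().hex,
        },
    }
```

Keeping the status lookup separate from the body builder mirrors the rule that the body shape is the same for 200, 401, and 403 responses; only the HTTP status changes.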
Application-controlled business and validation errors return HTTP 200 with
status: "error" in the body. Only these use non-200:
- 401/403: auth errors (AUTH_REQUIRED, AUTH_EXPIRED, ACCESS_DENIED)
- 404: no matching Flask route
- 413: WSGI body limit exceeded
- 500: unhandled exception (program bug)

from flask import request

from data_formulator.errors import AppError, ErrorCode
from data_formulator.error_handler import classify_and_wrap_llm_error, json_ok

# bp is the Flask Blueprint that owns this route.
@bp.route('/my-endpoint', methods=['POST'])
def my_endpoint():
    content = request.get_json()
    # Validation failures are raised as AppError, not returned as HTTP 400.
    if not content.get('required_field'):
        raise AppError(ErrorCode.INVALID_REQUEST, "Missing required_field")
    try:
        result = do_work(content)
    except SomeBusinessError as e:
        raise AppError(ErrorCode.DATA_LOAD_ERROR, "Failed to load data") from e
    except Exception as e:
        # Anything else is classified (for example LLM failures) and wrapped.
        raise classify_and_wrap_llm_error(e) from e
    return json_ok(result)

# Global handler returns: HTTP 200 + {"status": "error", "error": {code, message, retry}}
# Auth errors (AUTH_REQUIRED/AUTH_EXPIRED/ACCESS_DENIED) return 401/403
Legacy {"status": "error", "message": "..."}, error_message, bare {error},
and status: "ok" responses are historical formats. Do not add new compatibility
branches for them; migrate the route to json_ok() / AppError before using
apiRequest().
Validation MUST be outside the generator. Failures return 200 JSON (not NDJSON).
import json

from flask import Response, request, stream_with_context

from data_formulator.errors import AppError, ErrorCode
from data_formulator.error_handler import (
    classify_and_wrap_llm_error,
    stream_error_event,
    stream_preflight_error,
)

@bp.route('/my-stream', methods=['POST'])
def my_stream():
    # Preflight validation runs before the generator, so failures return JSON.
    if not request.is_json:
        return stream_preflight_error(
            AppError(ErrorCode.INVALID_REQUEST, "Invalid request")
        )
    content = request.get_json()
    client = get_client(content['model'])

    def generate():
        try:
            for event in agent.run(...):
                yield json.dumps(event, ensure_ascii=False) + "\n"
        except Exception as e:
            # The stream has started: report the failure as an NDJSON error event.
            yield stream_error_event(classify_and_wrap_llm_error(e))

    return Response(stream_with_context(generate()), mimetype='application/x-ndjson')
Streaming runtime errors intentionally use {"type": "error", "error": ...}.
They cannot use a top-level status envelope because the HTTP response and NDJSON
event stream have already started.
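The in-band error rule can be sketched without Flask. In the sketch below, error_event_line is a hypothetical stand-in for stream_error_event(), the content field on text_delta events is an assumed payload shape, and the fallback code and message are placeholders; the real endpoint classifies the exception first.

```python
import json
from typing import Callable, Iterable, Iterator


def error_event_line(code: str, message: str) -> str:
    """Hypothetical stand-in for stream_error_event(): one NDJSON error line."""
    event = {"type": "error", "error": {"code": code, "message": message}}
    return json.dumps(event, ensure_ascii=False) + "\n"


def ndjson_stream(produce: Callable[[], Iterable[dict]],
                  fallback_code: str = "DATA_LOAD_ERROR") -> Iterator[str]:
    """Yield one JSON object per line; report a mid-stream failure in-band."""
    try:
        for event in produce():
            yield json.dumps(event, ensure_ascii=False) + "\n"
    except Exception:
        # Headers and earlier events are already sent, so the failure becomes
        # a final error event. The raw exception text is deliberately dropped.
        yield error_event_line(fallback_code, "Stream failed")


def flaky_events() -> Iterator[dict]:
    """Example producer that fails after emitting one event."""
    yield {"type": "text_delta", "content": "partial"}
    raise RuntimeError("secret connection string")


lines = list(ndjson_stream(flaky_events))
```

Here lines holds the text_delta event followed by a single error event, and the exception text never reaches the client.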
import { apiRequest } from '../app/apiClient';
import { handleApiError } from '../app/errorHandler';

try {
    const { data } = await apiRequest<ResponseType>(getUrls().MY_ENDPOINT, {
        method: 'POST',
        body: JSON.stringify(payload),
        headers: { 'Content-Type': 'application/json' },
    });
} catch (e) {
    handleApiError(e, 'MyComponent');
}
For UI loading state, model the request lifecycle explicitly with
LoadableState from src/app/loadableState.ts. Do not infer loading from
missing data (!data), because failed requests may legitimately leave data
empty while loading has ended.
import { streamRequest } from '../app/apiClient';
import { handleApiError } from '../app/errorHandler';

try {
    for await (const event of streamRequest(url, options, abortController.signal)) {
        switch (event.type) {
            case 'text_delta':
                break;
            case 'error':
                // Error arrived mid-stream: show inline in component.
                break;
            case 'done':
                break;
        }
    }
} catch (e) {
    handleApiError(e, 'MyComponent');
}

handleApiError(e, 'MyComponent', {
    onAuth: () => redirectToLogin(),      // AUTH_REQUIRED / AUTH_EXPIRED
    onRetryable: () => retryOperation(),  // LLM_RATE_LIMIT / LLM_TIMEOUT
    silent: true,                         // don't show Snackbar (component handles display)
});

DF API consumers should use apiRequest() / streamRequest() and
handleApiError(). Direct fetchWithIdentity() is for lower-level client helpers
and explicit protocol exceptions such as file downloads, blob/CSV responses,
OIDC redirects, SPA fallback, or third-party URLs.
Do not apply the normal JSON API protocol mechanically to file downloads / CSV
streaming, SPA fallback, OIDC redirect flows, frontend fetches to third-party
URLs, or errors after a streaming response has already started. Check the route's
protocol first, then preserve safe error bodies and avoid str(exc) exposure.
Backend — Add to py-src/data_formulator/errors.py ErrorCode:
MY_NEW_ERROR = "MY_NEW_ERROR"
No HTTP mapping needed — defaults to HTTP 200. Only add to ERROR_CODE_HTTP_STATUS if it's an auth code.
Frontend mapping — Add to src/app/errorCodes.ts ERROR_CODE_I18N_MAP:
MY_NEW_ERROR: 'errors.myNewError',
Translations — Add to both locale files:
- src/i18n/locales/en/errors.json: "myNewError": "English message"
- src/i18n/locales/zh/errors.json: "myNewError": "中文消息"

All streaming endpoints are now on the unified protocol:
| Endpoint | Format | Notes |
|---|---|---|
| /data-agent-streaming | NDJSON + stream_error_event() | Emits top-level type events; errors use {type:"error", error:{...}} |
| /get-recommendation-questions | NDJSON + stream_error_event() | Was error: {json} prefix |
| /generate-report-chat | Pure NDJSON + stream_error_event() | Was SSE data: {json} prefix |
| /data-loading-chat | NDJSON + stream_error_event() | str(e) removed |
| /clean-data-stream | NDJSON + stream_error_event() | Was \n{json}\n format |
Non-streaming endpoints:
| Endpoint | Error Format | Notes |
|---|---|---|
| /chart-insight | AppError → HTTP 200 + {status:"error", error:{code,message,retry}} | Fully migrated. Frontend uses fetchChartInsight rejected reducer. |
| All migrated endpoints | AppError → HTTP 200 + unified error body | credentials, knowledge, sessions, tables, agents |
| /derive-data, /refine-data, /sort-data, /process-data-on-load, /test-model | json_ok() / AppError | Migrated to new format |
Not all .catch(() => {}) are bugs. Use this decision tree:
- addMessages or handleApiError()
- .rejected handler with addMessages
- if (action.error?.name !== 'AbortError')

When consuming a migrated streaming endpoint, handle the current NDJSON event format directly:
const data = JSON.parse(line);
if (data.type === 'error') {
const errMsg = data.error?.message || 'Unknown error';
// show to user...
}
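The same check can be sketched for a Python consumer, for example a backend test or script that reads a migrated stream. StreamError and iter_events are hypothetical names, and raising on the first error event is one possible design choice, not the project's client behavior.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Hypothetical exception carrying the structured error from an error event."""

    def __init__(self, error: dict):
        super().__init__(error.get("message") or "Unknown error")
        self.code = error.get("code")


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Parse NDJSON lines; the event type is the only protocol discriminator."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        if event.get("type") == "error":
            raise StreamError(event.get("error") or {})
        yield event
```

Events before the error are still delivered to the caller, which matches the protocol: a stream can carry useful partial output and then end with an error event.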
For table CRUD endpoints, use the specialized classifier:
from data_formulator.routes.tables import classify_and_raise_db_error

@tables_bp.route('/my-table-op', methods=['POST'])
def my_table_op():
    try:
        result = workspace.do_something()
        return jsonify({"status": "success", "data": result})
    except Exception as e:
        classify_and_raise_db_error(e)

classify_and_raise_db_error maps common DB errors to appropriate AppError codes
(returned as HTTP 200 by the global handler, except ACCESS_DENIED → 403):
- TABLE_NOT_FOUND (HTTP 200)
- INVALID_REQUEST (HTTP 200)
- ACCESS_DENIED (HTTP 403)
- CONNECTOR_ERROR (HTTP 200)

For connector endpoints, use:

from data_formulator.data_connector import classify_and_raise_connector_error

except Exception as e:
    classify_and_raise_connector_error(e, operation="preview")

Connector/DataLoader classification is intentionally simple and lives in
data_formulator.data_loader.connector_errors. It maps common failures to a
small stable set: INVALID_REQUEST, CONNECTOR_AUTH_FAILED, AUTH_EXPIRED,
ACCESS_DENIED, DB_CONNECTION_FAILED, DB_QUERY_ERROR, DATA_LOAD_ERROR,
or CONNECTOR_ERROR. Do not add endpoint-local string matching unless the
classifier cannot reasonably cover the category.
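To illustrate what "intentionally simple" classification can look like, here is a sketch that maps exception types to a few codes from the set above. The function name and every matching rule are assumptions for illustration; the real rules live in data_formulator.data_loader.connector_errors.

```python
def classify_connector_failure(exc: Exception) -> str:
    """Map an exception to one code from the small stable set (illustrative rules)."""
    if isinstance(exc, PermissionError):
        return "ACCESS_DENIED"
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return "DB_CONNECTION_FAILED"
    if isinstance(exc, (KeyError, ValueError, TypeError)):
        return "INVALID_REQUEST"
    # Anything unrecognized falls back to the generic connector code.
    return "CONNECTOR_ERROR"
```

Matching on exception types rather than message substrings keeps the mapping stable when a driver rewords its error text, which is the motivation for avoiding endpoint-local string matching.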
All JSON errors include error.request_id and an X-Request-Id response
header. Show/copy this ID for users when reporting backend failures; do not
show raw exception text in production. Unhandled 500 responses must never
include raw tracebacks, even in debug mode; return a safe category plus
request_id and keep full stack traces in server logs only.
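A sketch of the safe 500 path described above: the traceback goes to the server log keyed by a request ID, and the client receives only a generic category plus that ID. The function name, the INTERNAL_ERROR code, the message text, and the use of a UUID for the ID are all assumptions, not the project's actual handler.

```python
import logging
import uuid

logger = logging.getLogger("request_id_sketch")


def safe_unhandled_error(exc: Exception) -> tuple:
    """Return (body, status, headers) for an unhandled exception."""
    request_id = uuid.uuid4().hex
    # The full stack trace stays in the server log, keyed by request_id.
    logger.error("Unhandled error [request_id=%s]", request_id, exc_info=exc)
    body = {
        "status": "error",
        "error": {
            "code": "INTERNAL_ERROR",  # hypothetical safe category
            "message": "Internal server error",
            "retry": False,
            "request_id": request_id,
        },
    }
    headers = {"X-Request-Id": request_id}
    return body, 500, headers
```

A user can then report the request_id, and an operator can search the logs for it without any exception text having left the server.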
When an error isn't reaching the frontend:
- Non-streaming error body has the shape {"status": "error", "error": {"code": ..., "message": ...}}
- Streaming error event has the shape {"type": "error", "error": {"code": ..., "message": ...}}
- Streaming responses use application/x-ndjson, not application/json or text/event-stream
- Does the frontend consumer check data.type === 'error'?
- register_error_handlers(app) is called in app.py
- errorhandler(Exception) takes priority over global handlers

Legacy message / error_message bodies are protocol violations on migrated API paths.
Server-side logs must never leak passwords, tokens, API keys, or connection strings. The project uses a defense-in-depth approach with two layers.
import logging

from data_formulator.security.log_sanitizer import (
    sanitize_url, sanitize_params, redact_token,
)

logger = logging.getLogger(__name__)

# Dict with credentials → sanitize_params()
logger.info("Connecting with: %s", sanitize_params(params))
# URL that may embed credentials → sanitize_url()
logger.info("Issuer: %s", sanitize_url(issuer_url))
# Token/API key → redact_token()
logger.debug("Token: %s", redact_token(token))

The second layer, SensitiveDataFilter, is registered in app.py:configure_logging(). It automatically redacts:

- Credentials embedded in URLs (://user:pass@host)
- Bearer tokens
- password=xxx, api_key=xxx, secret=xxx patterns

Disable with LOG_SANITIZE=false for local debugging only.
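A pattern-based logging filter of this kind can be sketched with the standard library. RedactingFilter and the three regular expressions below are illustrative assumptions, not the project's SensitiveDataFilter, and they cover only the three pattern families listed above.

```python
import logging
import re

# Illustrative patterns only; a production filter would need broader coverage.
_PATTERNS = [
    (re.compile(r"://[^/\s:@]+:[^/\s@]+@"), "://***:***@"),
    (re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer ***"),
    (re.compile(r"(?i)\b(password|api_key|secret)=[^\s&]+"), r"\1=***"),
]


class RedactingFilter(logging.Filter):
    """Rewrite each record's message with known secret patterns masked."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # apply %-style args before scanning
        for pattern, replacement in _PATTERNS:
            message = pattern.sub(replacement, message)
        # Store the redacted text and clear args so it is not formatted twice.
        record.msg, record.args = message, None
        return True  # never drop the record, only rewrite it
```

Attaching the filter to a handler (handler.addFilter(RedactingFilter())) applies it to every record that handler emits, which is why it works as a backup for call sites that forgot the explicit utilities.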
| Data | Utility | Why not just filter? |
|---|---|---|
| dict with password keys | sanitize_params() | Filter can't identify arbitrary password values in dict repr |
| URL from config/env | sanitize_url() | Explicit is clearer; filter is backup |
| Token/key value | redact_token() | Explicit is clearer; filter is backup |
| Normal text | Nothing | Filter handles edge cases |
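The first table row is easier to see with a concrete case: a dict's repr renders a secret as 'password': 'hunter2', which a password=xxx pattern does not match. The helpers below are hypothetical stand-ins (mask_params, mask_token, and the SENSITIVE set are not the project's sanitize_params, redact_token, or SENSITIVE_KEYS) that show the explicit approach.

```python
# Hypothetical key set for illustration; the project keeps its own SENSITIVE_KEYS.
SENSITIVE = {"password", "api_key", "secret", "token"}


def mask_params(params: dict) -> dict:
    """Return a copy with values of sensitive keys masked (case-insensitive keys)."""
    return {
        key: ("***" if str(key).lower() in SENSITIVE else value)
        for key, value in params.items()
    }


def mask_token(token: str, keep: int = 4) -> str:
    """Keep a short prefix for correlation and mask the rest."""
    if len(token) <= keep:
        return "***"
    return token[:keep] + "***"


params = {"host": "db", "Password": "hunter2"}
unsafe_repr = repr(params)       # contains 'hunter2' in dict-repr form
safe_repr = repr(mask_params(params))
```

Masking by key before formatting removes the secret regardless of how the dict is later rendered, so the pattern filter only has to act as a safety net.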
When adding a module that handles credentials or external services:
- logger.*() calls for credential/URL/token logging
- sanitize_params() for dicts, sanitize_url() for URLs, redact_token() for tokens
- type(exc).__name__ over str(exc) in warning-level logs
- SENSITIVE_KEYS in log_sanitizer.py

| File | Purpose |
|---|---|
| py-src/data_formulator/errors.py | ErrorCode enum + AppError exception |
| py-src/data_formulator/error_handler.py | Global handlers, classify_and_wrap_llm_error, stream_error_event |
| py-src/data_formulator/security/log_sanitizer.py | sanitize_url, sanitize_params, redact_token, SensitiveDataFilter |
| py-src/data_formulator/routes/tables.py | classify_and_raise_db_error (database/workspace errors) |
| py-src/data_formulator/data_connector.py | classify_and_raise_connector_error (connector errors) |
| py-src/data_formulator/security/sanitize.py | classify_llm_error (internal), sanitize_error_message |
| src/app/apiClient.ts | apiRequest, streamRequest, parseStreamLine, ApiRequestError |
| src/app/errorHandler.ts | handleApiError |
| src/app/errorCodes.ts | ERROR_CODE_I18N_MAP, getErrorMessage |
| src/i18n/locales/{en,zh}/errors.json | Error message translations |