docs/plans/eng-1770.md

Plan: promptfoo eval -c <uuid> Cloud Config Support

Context

Currently, promptfoo redteam run -c <uuid> supports loading configs from Promptfoo Cloud by UUID, but promptfoo eval -c only accepts local file paths. This feature extends the same cloud config loading pattern to the eval command, allowing users to run promptfoo eval -c <cloud-uuid> to fetch and execute a config stored in Promptfoo Cloud.

The ticket (ENG-1770) has two parts:

  1. Open Source (this PR): Make the CLI accept a cloud UUID for eval -c
  2. Cloud: Show the promptfoo eval -c <uuid> command in the Cloud run modal (separate repo)

Changes

1. Add getEvalConfigFromCloud() to src/util/cloud.ts

Create a new function modeled after the existing getConfigFromCloud() (lines 88-114), but targeting a different endpoint for eval configs:

```typescript
export async function getEvalConfigFromCloud(id: string): Promise<UnifiedConfig> {
  // Same pattern as getConfigFromCloud but using `configs/${id}` endpoint
}
```

  • Endpoint: GET /api/v1/configs/${id} (eval configs, not redteam-specific)
  • Reuse existing makeRequest() helper (line 25) and cloudConfig.isEnabled() check pattern
  • Same error handling pattern as getConfigFromCloud
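
The bullets above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual implementation: `CloudDeps`, `ResponseLike`, `UnifiedConfigLike`, and `getEvalConfigFromCloudSketch` are hypothetical stand-ins for the real `makeRequest()`, `cloudConfig`, and `UnifiedConfig`, the assumed `makeRequest(path, method)` shape is not confirmed by this plan, and the error message wording is illustrative. The `{ config: ... }` envelope follows item 1 of the Decisions section.

```typescript
// Hypothetical stand-ins so the sketch runs on its own; the real code would
// use makeRequest() and cloudConfig from the promptfoo source instead.
type UnifiedConfigLike = Record<string, unknown>;

interface ResponseLike {
  ok: boolean;
  status: number;
  statusText: string;
  json(): Promise<unknown>;
}

interface CloudDeps {
  isEnabled(): boolean;
  // Assumed shape: a path relative to /api/v1 plus an HTTP method.
  makeRequest(path: string, method: string): Promise<ResponseLike>;
}

async function getEvalConfigFromCloudSketch(
  id: string,
  deps: CloudDeps,
): Promise<UnifiedConfigLike> {
  // Fail early when cloud auth is not configured.
  if (!deps.isEnabled()) {
    throw new Error(`Could not fetch eval config ${id} from cloud. Cloud config is not enabled.`);
  }

  // GET /api/v1/configs/${id} (eval configs, not redteam-specific).
  const response = await deps.makeRequest(`configs/${id}`, 'GET');
  if (!response.ok) {
    throw new Error(
      `Failed to fetch eval config ${id} from cloud: ${response.status} ${response.statusText}`,
    );
  }

  // The endpoint is expected to return an envelope, not a raw payload.
  const body = (await response.json()) as { config?: unknown } | null;
  if (!body || typeof body.config !== 'object' || body.config === null) {
    throw new Error(`Cloud response for eval config ${id} did not contain a config object.`);
  }
  return body.config as UnifiedConfigLike;
}
```

Passing the dependencies in as a parameter is only a device to keep the sketch runnable in isolation; the real function would presumably call the module-level helpers directly, as getConfigFromCloud does.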

2. Add UUID detection to src/commands/eval.ts

In doEval() (line 107), add UUID detection before the existing config path processing at line 142. This mirrors the pattern in src/redteam/commands/run.ts:62-79.

```typescript
// Before the existing config path processing (line 142)
const UUID_REGEX = /^[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$/;

if (cmdObj.config?.length === 1 && UUID_REGEX.test(cmdObj.config[0])) {
  const cloudConfigObj = await getEvalConfigFromCloud(cmdObj.config[0]);
  defaultConfig = cloudConfigObj;
  cmdObj.config = undefined;
}
```

Key behaviors:

  • Only trigger when exactly one config path is provided and it's a UUID
  • Fetch the config from cloud and use it as defaultConfig
  • Clear cmdObj.config so resolveConfigs uses defaultConfig instead of trying to read a file
  • Import getEvalConfigFromCloud from ../util/cloud (the path relative to src/commands/eval.ts)
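
The detection rule in the first bullet can be isolated into a small pure function, which makes it easy to unit test apart from doEval(). The sketch below assumes a helper named `pickCloudConfigId`, which is hypothetical and not part of the plan; the regex is the one shown above.

```typescript
const UUID_REGEX = /^[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$/;

// Hypothetical helper: returns the cloud config UUID when exactly one -c value
// was given and it is UUID-shaped; otherwise returns undefined so the caller
// falls through to the existing file-based config resolution.
function pickCloudConfigId(configPaths: string[] | undefined): string | undefined {
  const [only, ...rest] = configPaths ?? [];
  if (only !== undefined && rest.length === 0 && UUID_REGEX.test(only)) {
    return only;
  }
  return undefined;
}
```

Under this sketch, a single UUID-shaped value returns that value, while a local path, a missing -c, or several values all return undefined. The multi-value case is treated differently in item 9 of the Decisions section, which calls for an explicit error instead of a silent fall-through.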

3. Update eval command description

Update the -c option description at lines 903-906 to mention cloud UUID support:

```typescript
.option(
  '-c, --config <paths...>',
  'Path to configuration file or cloud config UUID. Automatically loads promptfooconfig.yaml',
)
```

4. Add tests

Create test cases in a new test file or add to existing eval command tests, following the pattern from test/redteam/commands/run.test.ts:

  • Test UUID detection triggers cloud fetch
  • Test local file paths bypass UUID detection
  • Test error when cloud is not enabled
  • Test multiple config paths with UUID (should not trigger UUID detection)

Files to Modify

| File | Change |
| --- | --- |
| src/util/cloud.ts | Add getEvalConfigFromCloud() function |
| src/commands/eval.ts | Add UUID detection + cloud fetch in doEval(), update -c description |
| test/commands/eval.test.ts or new test file | Add tests for UUID cloud config |

Existing Code to Reuse

  • makeRequest() from src/util/cloud.ts:25 - authenticated HTTP helper
  • cloudConfig.isEnabled() from src/globalConfig/cloud.ts - auth check
  • UUID regex pattern from src/redteam/commands/run.ts:17
  • resolveConfigs() from src/util/config/load.ts:481 - existing config resolution (no changes needed)

Verification

  1. Build: npm run build should succeed
  2. Lint: npm run l && npm run f
  3. Unit tests: Run existing + new tests with npx vitest test/commands/eval and npx vitest test/util/cloud (the filters match the test file paths listed in this plan)
  4. Manual test (if cloud access available):
    • npm run local -- eval -c <valid-uuid> --env-file .env should fetch config from cloud
    • npm run local -- eval -c path/to/config.yaml should still work as before
    • npm run local -- eval -c <invalid-uuid-format> should fall through to file resolution

Decisions (Reconciled)

  1. Use the OSS endpoint contract: GET /api/v1/configs/:id returning an envelope ({ config: ... }), not a raw payload.
  2. Support persisted config shape with providers and tests; loading must not require additional requests.
  3. Provider references in the config should use promptfoo://provider/<uuid>.
  4. Prompts should be emitted as plain strings.
  5. tests defaults to [] when missing.
  6. description falls back to config.name when missing.
  7. --watch is not supported when -c <uuid> is used. CLI should fail fast with a clear error.
  8. If -c <value> matches UUID format but cloud fetch fails (404/auth disabled), hard-fail.
  9. If multiple -c values are supplied and any value is a UUID, fail with an explicit error stating only one -c value is allowed for cloud UUID mode.
  10. Add tests in both places:
    • test/commands/eval.test.ts for UUID detection/CLI behavior
    • test/util/cloud.test.ts for getEvalConfigFromCloud() contract/error handling
  11. Scope is eval only (not redteam eval).
  12. No temporary UI note or minimum-version gating is required.
  13. In UUID mode, clear/ignore defaultConfigPath for the run to prevent accidental local reload/fallback behavior.
  14. Read-time schema normalization should normalize legacy fields (providerIds/testCases) into canonical fields (providers/tests).
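
Items 1, 3, 5, 6, and 14 together describe a read-time normalization step, sketched below as a pure function. Everything named here is an assumption made for illustration: the `PersistedEvalConfig` and `NormalizedEvalConfig` shapes, the helper names, the choice to represent providers as plain strings, and in particular the mapping of legacy providerIds entries to promptfoo://provider/<uuid> references, which the plan does not spell out.

```typescript
// Hypothetical persisted shape, including the legacy field names from item 14.
interface PersistedEvalConfig {
  name?: string;
  description?: string;
  prompts?: string[];
  providers?: string[];
  tests?: unknown[];
  providerIds?: string[]; // legacy, assumed to hold bare provider UUIDs
  testCases?: unknown[]; // legacy
}

interface NormalizedEvalConfig {
  description?: string;
  prompts: string[];
  providers: string[];
  tests: unknown[];
}

const PROVIDER_REF_PREFIX = 'promptfoo://provider/';

// Assumption: a bare legacy id becomes a promptfoo://provider/<uuid> reference.
function toProviderRef(idOrRef: string): string {
  return idOrRef.startsWith(PROVIDER_REF_PREFIX) ? idOrRef : `${PROVIDER_REF_PREFIX}${idOrRef}`;
}

function normalizeEvalConfig(envelope: { config: PersistedEvalConfig }): NormalizedEvalConfig {
  const raw = envelope.config; // item 1: unwrap the { config: ... } envelope
  return {
    description: raw.description ?? raw.name, // item 6: fall back to name
    prompts: raw.prompts ?? [], // item 4: prompts are plain strings
    providers: raw.providers ?? (raw.providerIds ?? []).map(toProviderRef), // items 3, 14
    tests: raw.tests ?? raw.testCases ?? [], // items 5, 14
  };
}
```

Canonical fields take precedence over legacy ones when both are present, which seems the least surprising choice but is not stated in the plan.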

CLI Error Messages (exact draft text)

  1. Multiple -c values with UUID mode:
    • Cloud config UUID mode supports exactly one -c value. Use: promptfoo eval -c <cloud-config-uuid>
  2. UUID mode with --watch:
    • --watch is not supported when using a cloud config UUID with -c. Use a local config file path for watch mode.
  3. UUID-shaped value with failed cloud fetch:
    • Failed to load cloud eval config "<uuid>". <reason>. Cloud UUID inputs do not fall back to local file paths. Check authentication and that the UUID exists.
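
The three messages above, together with Decisions items 7 to 9, suggest a small pre-flight check that runs before any cloud request. The following is a sketch under assumptions: the function names `resolveCloudUuidMode` and `cloudFetchFailureMessage` are hypothetical, and the order in which the multiple-value and --watch checks run is a guess, since the plan does not specify it.

```typescript
const UUID_REGEX = /^[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}$/;

// Hypothetical pre-flight check. Returns the UUID when cloud UUID mode applies,
// undefined when no -c value is UUID-shaped, and throws on invalid combinations.
function resolveCloudUuidMode(configValues: string[], watch: boolean): string | undefined {
  const hasUuid = configValues.some((value) => UUID_REGEX.test(value));
  if (!hasUuid) {
    return undefined; // plain file-path mode, nothing to validate here
  }
  if (configValues.length > 1) {
    throw new Error(
      'Cloud config UUID mode supports exactly one -c value. Use: promptfoo eval -c <cloud-config-uuid>',
    );
  }
  if (watch) {
    throw new Error(
      '--watch is not supported when using a cloud config UUID with -c. Use a local config file path for watch mode.',
    );
  }
  return configValues[0];
}

// Builds the hard-fail message for a UUID-shaped value whose cloud fetch failed.
function cloudFetchFailureMessage(uuid: string, reason: string): string {
  return `Failed to load cloud eval config "${uuid}". ${reason}. Cloud UUID inputs do not fall back to local file paths. Check authentication and that the UUID exists.`;
}
```

Because the template appends a period after the reason, callers would need to pass a reason without trailing punctuation to avoid a doubled period.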

Remaining Follow-ups

None.