# Using the node package

## Installation

promptfoo is available as a node package on npm:

```sh
npm install promptfoo
```

## Usage

Use promptfoo as a library in your project by importing the evaluate function and other utilities:

```ts
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(testSuite, options);
```

The `evaluate` function takes two arguments: a test suite (prompts, providers, and test cases) and an options object.

The results of the evaluation are returned as an `EvaluateSummary` object.

## Provider functions

A `ProviderFunction` is a JavaScript function that implements an LLM API call. It takes a prompt string and a context, and returns the LLM response or an error. See the `ProviderFunction` type.
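As a minimal sketch of that shape, here is a provider function that simply echoes its prompt back (the types are simplified for illustration and the echo behavior is not part of the promptfoo API; see the `ProviderFunction` type for the full signature):

```ts
// Sketch of a ProviderFunction: takes a prompt and an optional context,
// returns an LLM response (or an error). Types simplified for illustration.
type SimpleProviderResponse = { output?: string; error?: string };

const echoProvider = async (
  prompt: string,
  context?: { vars?: Record<string, unknown> },
): Promise<SimpleProviderResponse> => {
  try {
    // A real implementation would call an LLM API here; we just echo the prompt.
    return { output: `Echo: ${prompt}` };
  } catch (err) {
    return { error: String(err) };
  }
};
```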

You can load providers using the `loadApiProvider` function:

```ts
import { loadApiProvider } from 'promptfoo';

// Load a provider with default options
const provider = await loadApiProvider('openai:o3-mini');

// Load a provider with custom options
const providerWithOptions = await loadApiProvider('azure:chat:test', {
  options: {
    apiHost: 'test-host',
    apiKey: 'test-key',
  },
});
```
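Besides loading built-in providers by ID, you can also supply a custom provider object. The sketch below shows a minimal `id()`/`callApi()` shape; this is a simplified assumption for illustration, not the full provider contract:

```ts
// Sketch of a custom provider object (simplified for illustration;
// the real provider interface supports additional fields and options).
const mockProvider = {
  id: () => 'mock-provider',
  callApi: async (prompt: string) => {
    // A real provider would call an LLM API here.
    return { output: `mock response to: ${prompt}` };
  },
};
```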

## Assertion functions

An `Assertion` can take an `AssertionValueFunction` as its value. The function receives:

- `output`: the LLM output string
- `context`: execution context, including `prompt`, `vars`, `test`, `logProbs`, `config`, `provider`, `providerResponse`, and optional `trace` data for debugging
<details>
<summary>Type definition</summary>

```typescript
type AssertionValueFunction = (
  output: string,
  context: AssertionValueFunctionContext,
) => AssertionValueFunctionResult | Promise<AssertionValueFunctionResult>;

interface AssertionValueFunctionContext {
  prompt: string | undefined;
  vars: Record<string, unknown>;
  test: AtomicTestCase;
  logProbs: number[] | undefined;
  config?: Record<string, any>;
  provider: ApiProvider | undefined;
  providerResponse: ProviderResponse | undefined;
  trace?: TraceData;
}

type AssertionValueFunctionResult = boolean | number | GradingResult;

interface GradingResult {
  // Whether the test passed or failed
  pass: boolean;

  // Test score, typically between 0 and 1
  score: number;

  // Plain text reason for the result
  reason: string;

  // Map of labeled metrics to values
  namedScores?: Record<string, number>;

  // Weighted denominator for namedScores when assertion weights are used
  namedScoreWeights?: Record<string, number>;

  // Record of token usage for this assertion
  tokensUsed?: Partial<{
    total: number;
    prompt: number;
    completion: number;
    cached?: number;
  }>;

  // Additional matcher/provider metadata
  metadata?: Record<string, unknown>;

  // List of results for each component of the assertion
  componentResults?: GradingResult[];

  // The assertion that was evaluated
  assertion?: Assertion;
}
```

</details>
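For illustration, here is a standalone assertion function returning a `GradingResult`-shaped object. It is a sketch: the context parameter is simplified, and the greeting check is purely illustrative:

```ts
// Sketch of an AssertionValueFunction. The context type is simplified
// for illustration; a real one receives the full assertion context.
const containsFrenchGreeting = (
  output: string,
  context: { vars: Record<string, unknown> },
) => {
  const pass = output.toLowerCase().includes('bonjour');
  return {
    pass,
    score: pass ? 1 : 0,
    reason: pass ? 'Output contains a French greeting' : 'No French greeting found',
  };
};
```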

For more info on different assertion types, see [assertions & metrics](/docs/configuration/expected-outputs/).

## Transform functions

When using the node package, you can pass JavaScript functions directly as `transform`, `transformVars`, or `contextTransform` values — instead of string expressions or `file://` references.

This enables better IDE support, type checking, and debugging:

```ts
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate({
  prompts: ['What tools did you use to answer: {{question}}'],
  providers: ['openai:gpt-5-mini'],
  tests: [
    {
      vars: { question: 'What is 2+2?' },
      options: {
        // Transform the output before assertions
        transform: (output, context) => {
          return output.toUpperCase();
        },
      },
      assert: [
        {
          type: 'contains',
          value: 'calculator',
          // Transform just for this assertion
          transform: (output, context) => {
            const tools = context.metadata?.toolCalls ?? [];
            return tools.map((t) => t.name).join(', ');
          },
        },
      ],
    },
  ],
});
```

Transform functions receive:

- `output`: the LLM output (string or object)
- `context`: an object containing `vars`, `prompt`, and optionally `metadata` from the provider response
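As a sketch of the `transformVars` hook mentioned above, such a function can normalize variables before the prompt template is rendered. The signature here is simplified and the trimming logic is purely illustrative:

```ts
// Sketch: a transformVars-style function that normalizes vars before
// prompt rendering. Signature simplified; trimming is illustrative.
const normalizeVars = (vars: Record<string, unknown>) => {
  const question = String(vars.question ?? '').trim();
  return { ...vars, question };
};
```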

:::note

Function transforms are not serializable. If you use `writeLatestResults: true`, function transforms will not be persisted in the stored config. Use string expressions or `file://` references if you need results to be fully reproducible from the stored eval.

:::

For more on transforms, see [Transforming Outputs](/docs/configuration/guide#transforming-outputs).

## Example

`promptfoo` exports an `evaluate` function that you can use to run prompt evaluations.

```js
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(
  {
    prompts: ['Rephrase this in French: {{body}}', 'Rephrase this like a pirate: {{body}}'],
    providers: ['openai:gpt-5-mini'],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
      },
    ],
    writeLatestResults: true, // write results to disk so they can be viewed in web viewer
  },
  {
    maxConcurrency: 2,
  },
);

console.log(results);
```

This code imports the promptfoo library, defines the evaluation options, and then calls the evaluate function with these options.

You can also supply functions as prompts, providers, or asserts:

```js
import promptfoo from 'promptfoo';

(async () => {
  const results = await promptfoo.evaluate({
    prompts: [
      'Rephrase this in French: {{body}}',
      (vars) => {
        return `Rephrase this like a pirate: ${vars.body}`;
      },
    ],
    providers: [
      'openai:gpt-5-mini',
      (prompt, context) => {
        // Call LLM here...
        console.log(`Prompt: ${prompt}, vars: ${JSON.stringify(context.vars)}`);
        return {
          output: '<LLM output>',
        };
      },
    ],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
        assert: [
          {
            type: 'javascript',
            value: (output) => {
              const pass = output.includes("J'ai faim");
              return {
                pass,
                score: pass ? 1.0 : 0.0,
                reason: pass ? 'Output contained substring' : 'Output did not contain substring',
              };
            },
          },
        ],
      },
    ],
  });
  console.log('RESULTS:');
  console.log(results);
})();
```

There's a full example in the promptfoo GitHub repository.

Here's the example output in JSON format:

```json
{
  "results": [
    {
      "prompt": {
        "raw": "Rephrase this in French: Hello world",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "Hello world"
      },
      "response": {
        "output": "Bonjour le monde",
        "tokenUsage": {
          "total": 19,
          "prompt": 16,
          "completion": 3
        }
      }
    },
    {
      "prompt": {
        "raw": "Rephrase this in French: I'm hungry",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "I'm hungry"
      },
      "response": {
        "output": "J'ai faim.",
        "tokenUsage": {
          "total": 24,
          "prompt": 19,
          "completion": 5
        }
      }
    }
    // ...
  ],
  "stats": {
    "successes": 4,
    "failures": 0,
    "tokenUsage": {
      "total": 120,
      "prompt": 72,
      "completion": 48
    }
  },
  "table": [
    ["Rephrase this in French: {{body}}", "Rephrase this like a pirate: {{body}}", "body"],
    ["Bonjour le monde", "Ahoy thar, me hearties! Avast ye, world!", "Hello world"],
    [
      "J'ai faim.",
      "Arrr, me belly be empty and me throat be parched! I be needin' some grub, matey!",
      "I'm hungry"
    ]
  ]
}
```

## Sharing Results

To get a shareable URL, set `sharing: true` along with `writeLatestResults: true`:

```js
const results = await promptfoo.evaluate({
  prompts: ['Your prompt here'],
  providers: ['openai:gpt-5-mini'],
  tests: [{ vars: { input: 'test' } }],
  writeLatestResults: true,
  sharing: true,
});

console.log(results.shareableUrl); // https://app.promptfoo.dev/eval/abc123
```

Requires a Promptfoo Cloud account or self-hosted server. For self-hosted, pass `sharing: { apiBaseUrl, appBaseUrl }` instead of `true`.
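For a self-hosted server, the sharing object might look like the sketch below. Both URLs are placeholders for your own deployment, not real endpoints:

```ts
// Sketch: sharing options for a self-hosted server.
// Both URLs are placeholders, assumed for illustration.
const sharing = {
  apiBaseUrl: 'https://promptfoo-api.example.com',
  appBaseUrl: 'https://promptfoo.example.com',
};
```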