# Amazon SageMaker Provider Example

This example (`examples/provider-amazon-sagemaker`) demonstrates how to evaluate models deployed on Amazon SageMaker AI endpoints using promptfoo.
You can run this example with:

```sh
npx promptfoo@latest init --example provider-amazon-sagemaker
cd provider-amazon-sagemaker
```
This example requires the AWS SDK SageMaker runtime client, installed globally:

```sh
npm install -g @aws-sdk/client-sagemaker-runtime
```
This example requires the following environment variables:

- `AWS_ACCESS_KEY_ID` - Your AWS access key
- `AWS_SECRET_ACCESS_KEY` - Your AWS secret key
- `AWS_REGION` - Optional, can also be specified in the configuration

You can set these in a `.env` file or directly in your environment.
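For example, to set them directly in your shell (placeholder values, not real credentials):

```sh
# Placeholder values — replace with your own AWS credentials
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
export AWS_REGION=us-east-1
```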
This example includes multiple configuration files demonstrating different SageMaker integration patterns:

```sh
# Run a specific configuration
promptfoo eval -c promptfooconfig.jumpstart.yaml
```
This directory includes a test script to validate your SageMaker AI endpoint configuration before running a full evaluation:

```sh
# Basic test for an OpenAI-compatible endpoint
node test-sagemaker-provider.js --endpoint=my-endpoint --model-type=openai

# Test with an embedding endpoint
node test-sagemaker-provider.js --endpoint=my-embedding-endpoint --embedding=true

# Test with transforms
node test-sagemaker-provider.js --endpoint=my-endpoint --model-type=llama --transform=true

# Test with a custom transform file
node test-sagemaker-provider.js --endpoint=my-endpoint --transform=true --transform-file=transform.js
```
The SageMaker provider supports transforming prompts before they're sent to the endpoint, which is particularly useful for formatting prompts according to specific model requirements.
```yaml
providers:
  - id: sagemaker:llama:your-endpoint
    config:
      region: us-west-2
      modelType: llama
      # Apply an inline transform
      transform: |
        return `<s>[INST] ${prompt} [/INST]`;
```
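The inline transform above wraps each prompt in Llama's instruction format. As a rough sketch of what evaluating that transform string produces (the use of `new Function` here is illustrative, not promptfoo's actual mechanism):

```js
// Sketch: applying an inline transform string to a prompt.
// The transform body is evaluated with `prompt` in scope.
const transformBody = 'return `<s>[INST] ${prompt} [/INST]`;';
const transform = new Function('prompt', transformBody);

const result = transform('What is the capital of France?');
console.log(result);
// <s>[INST] What is the capital of France? [/INST]
```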
This example includes a sample transform file (transform.js) that shows how to create reusable transformations:
```yaml
providers:
  - id: sagemaker:jumpstart:your-endpoint
    config:
      region: us-west-2
      modelType: jumpstart
      # Reference an external transform file
      transform: file://transform.js
```
The transform function receives the prompt and a context object containing the provider configuration:
```js
module.exports = function (prompt, context) {
  // Access config values
  const maxTokens = context.config?.maxTokens || 256;

  // Return transformed input
  return {
    inputs: prompt,
    parameters: {
      max_new_tokens: maxTokens,
      temperature: 0.7,
    },
  };
};
```
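You can exercise a transform like this directly in Node to check its output before wiring it into a config. The context shape below is an assumption based on the snippet above:

```js
// Stand-in for the function exported by transform.js above.
const transform = function (prompt, context) {
  const maxTokens = context.config?.maxTokens || 256;
  return {
    inputs: prompt,
    parameters: { max_new_tokens: maxTokens, temperature: 0.7 },
  };
};

// Simulate the context promptfoo would pass, with a custom maxTokens.
const payload = transform('Hello, world', { config: { maxTokens: 512 } });
console.log(JSON.stringify(payload, null, 2));
// parameters.max_new_tokens is 512; without config.maxTokens it falls back to 256
```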
JumpStart models require a specific input/output format. The provider handles this automatically when modelType: jumpstart is specified:
```yaml
providers:
  - id: sagemaker:jumpstart:your-jumpstart-endpoint
    config:
      region: us-west-2
      modelType: jumpstart
      maxTokens: 256
      responseFormat:
        path: 'json.generated_text'
```
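The `responseFormat.path` value is a dot/bracket path into the parsed JSON response body. A rough sketch of that extraction, assuming a typical JumpStart response shape (promptfoo's actual resolver may differ):

```js
// Minimal path extraction sketch. "json" refers to the parsed
// response body; the remaining segments are a property path.
function extractByPath(json, path) {
  const segments = path
    .replace(/^json\.?/, '') // drop the leading "json" segment
    .split(/[.[\]]/)         // split on dots and brackets
    .filter(Boolean);
  return segments.reduce((value, key) => value?.[key], json);
}

// Example JumpStart-style response body (shape is an assumption).
const body = { generated_text: 'Paris is the capital of France.' };
console.log(extractByPath(body, 'json.generated_text'));
// Paris is the capital of France.

// Array-wrapped variant, matching the "json[0].generated_text" form:
const arrayBody = [{ generated_text: 'Paris.' }];
console.log(extractByPath(arrayBody, 'json[0].generated_text'));
// Paris.
```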
For better rate limiting with SageMaker endpoints, you can add delays between API calls:
```yaml
providers:
  - id: sagemaker:your-endpoint
    config:
      region: us-west-2
      delay: 500 # Add a 500ms delay between API calls
```
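Conceptually, the delay inserts a fixed pause between consecutive endpoint invocations. A minimal sketch of that throttling pattern, independent of promptfoo's internals:

```js
// Sketch: run async calls sequentially with a fixed pause between them.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runWithDelay(calls, delayMs) {
  const results = [];
  for (const call of calls) {
    results.push(await call());
    await sleep(delayMs); // e.g. 500ms, mirroring the config above
  }
  return results;
}
```

With `delay: 500` and ten test cases, an evaluation adds roughly five seconds of total pause time, which can help avoid throttling on rate-limited endpoints.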
After running the evaluation, you can review the results in the web viewer with `promptfoo view`.
If you encounter "Batch inference failed" errors:

- Add a `delay` parameter (at least 500ms recommended)
- Verify that you've set the correct `modelType` for your endpoint:
  - For JumpStart models, use `modelType: jumpstart`
  - For Hugging Face models, use `modelType: huggingface`
- Set `contentType` and `acceptType` as `"application/json"`

If you're getting unusual responses or missing output:

- Try `responseFormat.path: "json.generated_text"`
- For array-wrapped responses, try `responseFormat.path: "json[0].generated_text"`

If transforms aren't working correctly:

- Run the test script with `--transform=true` to debug transform behavior

If you're still experiencing errors even with the correct configuration, check the endpoint's CloudWatch logs in the AWS console for server-side error details.