docs/sf/providers/aws/guide/agents/runtime.md
A Runtime is your deployed agent application: the container or Python code that runs your AI agent logic. Use any framework -- LangGraph, Strands Agents, CrewAI, or your own custom code. The Serverless Framework handles building, packaging, and deploying your runtime to Amazon Bedrock AgentCore.
For a quick start guide, see the main AI Agents page.
AgentCore supports two deployment methods: Docker/Image deployment (any language) and Code deployment (Python only).
Use Docker for multi-language projects, complex dependencies, or full control over the runtime environment.
Minimal configuration (auto-detection):
```yaml
ai:
  agents:
    myAgent: {}
```
The framework automatically detects a Dockerfile in the current directory and handles the image build and push for you.
Explicit Dockerfile configuration:
```yaml
ai:
  agents:
    myAgent:
      artifact:
        image:
          file: Dockerfile.agent # Custom Dockerfile name
          path: ./agent # Build context directory
          repository: my-agent-repo # Custom ECR repository name
          buildArgs: # Docker build arguments
            PYTHON_VERSION: '3.12'
            ENV: production
```
Pre-built image:
```yaml
ai:
  agents:
    myAgent:
      artifact:
        image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-agent:latest
```
Use pre-built images when your CI/CD pipeline already builds and publishes the image, or when you reuse one image across stages or services.
Deploy Python code directly without Docker. Best for simple agents with standard dependencies.
Basic code deployment:
```yaml
ai:
  agents:
    myAgent:
      handler: agent.py
      runtime: python3.12
```
The `handler` property triggers code deployment mode. The framework packages your Python code and uploads it to S3.
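A code-deployed handler file might look like the sketch below. The `invoke` entrypoint name and payload shape are illustrative, not the official AgentCore SDK interface -- consult the AgentCore runtime documentation for the actual contract:

```python
# agent.py -- hypothetical shape of a code-deployed handler.
# The entrypoint name and payload contract are illustrative assumptions.
import json


def invoke(payload: dict) -> dict:
    """Take a request payload, run agent logic, return a response."""
    prompt = payload.get("prompt", "")
    # A real agent would call a model or a framework (LangGraph, CrewAI, ...) here.
    return {"response": f"You said: {prompt}"}


if __name__ == "__main__":
    print(json.dumps(invoke({"prompt": "hello"})))
```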
Supported runtimes:

- `python3.10`
- `python3.11`
- `python3.12`
- `python3.13` (default)

With custom S3 location:
```yaml
ai:
  agents:
    myAgent:
      handler: main.py
      runtime: python3.12
      artifact:
        s3:
          bucket: my-artifacts-bucket
          key: agents/my-agent.zip
          versionId: abc123 # Optional: specific version
```
Use custom S3 locations when your build pipeline uploads the package itself, or when you need to pin a deployment to a specific object version.
Packaging options:
Control which files are included in the code package, same as Lambda function packaging:
```yaml
ai:
  agents:
    myAgent:
      handler: agent.py
      package:
        patterns:
          - '!tests/**'
          - '!docs/**'
        include:
          - 'lib/**'
        exclude:
          - '*.pyc'
```
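The `patterns` list is applied in order, with a leading `!` excluding matches and the last matching pattern winning -- the same semantics as Lambda packaging. A simplified sketch of that matching logic (the real matcher handles `**` across directory separators more precisely than `fnmatch` does):

```python
from fnmatch import fnmatch


def is_included(path: str, patterns: list[str]) -> bool:
    """Apply packaging patterns in order; '!' prefix excludes, last match wins."""
    included = True  # everything is included by default
    for pattern in patterns:
        negated = pattern.startswith("!")
        raw = pattern[1:] if negated else pattern
        # Simplification: collapse '**' to '*' (fnmatch '*' already spans '/').
        if fnmatch(path, raw.replace("**", "*")):
            included = not negated
    return included


print(is_included("lib/utils.py", ["!tests/**", "!docs/**"]))   # kept
print(is_included("tests/test_agent.py", ["!tests/**"]))        # dropped
```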
Control how your runtime connects to the network.
Your agent is accessible over the internet with AWS authentication.
```yaml
ai:
  agents:
    myAgent:
      network:
        mode: PUBLIC
```
Use PUBLIC mode when your agent has no private network dependencies and AWS authentication at the endpoint is sufficient.
Deploy your agent within a Virtual Private Cloud for enhanced security.
```yaml
ai:
  agents:
    myAgent:
      network:
        mode: VPC
        subnets:
          - subnet-0123456789abcdef0
          - subnet-0123456789abcdef1
        securityGroups:
          - sg-0123456789abcdef0
```
Use VPC mode when your agent must reach private resources such as internal databases or APIs.

VPC requirements: at least one subnet and one security group, all belonging to the same VPC in the deployment region.
Control who can invoke your runtime.
Without an authorizer, your runtime uses AWS SigV4 authentication. Callers must have valid AWS credentials and IAM permissions.
```yaml
ai:
  agents:
    myAgent: {} # Uses AWS IAM by default
```
Protect your runtime with JWT tokens from an OIDC-compliant identity provider (Cognito, Auth0, Okta, etc.).
```yaml
ai:
  agents:
    myAgent:
      authorizer:
        type: CUSTOM_JWT
        jwt:
          discoveryUrl: https://cognito-idp.us-east-1.amazonaws.com/us-east-1_xxxxx/.well-known/openid-configuration
          allowedAudience:
            - my-app-client-id
          allowedClients:
            - my-app-client-id
          allowedScopes:
            - openid
            - profile
```
JWT configuration options:
| Property | Required | Description |
|---|---|---|
| `discoveryUrl` | Yes | OIDC discovery endpoint URL |
| `allowedAudience` | No | List of valid `aud` claim values |
| `allowedClients` | No | List of valid `client_id` values |
| `allowedScopes` | No | List of required scopes |
| `customClaims` | No | Custom claim validation rules |
Custom claims validation:
```yaml
ai:
  agents:
    myAgent:
      authorizer:
        type: CUSTOM_JWT
        jwt:
          discoveryUrl: https://.../.well-known/openid-configuration
          customClaims:
            - inboundTokenClaimName: department
              inboundTokenClaimValueType: STRING
              authorizingClaimMatchValue:
                claimMatchOperator: EQUALS
                claimMatchValue:
                  matchValueString: engineering
```
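To picture what the authorizer checks, here is a plain-Python sketch of the claim checks. It is an approximation only: the real authorizer also verifies the token signature against the provider's JWKS, which is omitted here, and an `EQUALS` custom-claim rule amounts to comparing the named claim to `matchValueString`:

```python
import base64
import json


def decode_claims(jwt_token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = jwt_token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def is_authorized(claims: dict, allowed_audience=None, allowed_clients=None) -> bool:
    """Approximate the allowedAudience / allowedClients checks."""
    if allowed_audience and claims.get("aud") not in allowed_audience:
        return False
    if allowed_clients and claims.get("client_id") not in allowed_clients:
        return False
    return True
```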
You can combine the JWT configuration above with any of the deployment examples in the Examples section below.
Control session timeouts and runtime lifetime.
```yaml
ai:
  agents:
    myAgent:
      lifecycle:
        idleRuntimeSessionTimeout: 900 # seconds (60-28800)
        maxLifetime: 3600 # seconds (60-28800)
```
| Property | Range | Default | Description |
|---|---|---|---|
| `idleRuntimeSessionTimeout` | 60-28800 | 900 | Seconds before an idle session is terminated |
| `maxLifetime` | 60-28800 | 28800 | Maximum session lifetime regardless of activity |
When to adjust these values:

- Lower `idleRuntimeSessionTimeout` to reduce memory costs during idle periods (memory is billed per second while the session is alive, even when idle)
- Raise `idleRuntimeSessionTimeout` for agents with long-running conversations
- Lower `maxLifetime` for security-sensitive applications requiring session rotation
- Raise `maxLifetime` for long-running batch processing or analysis tasks

Specify the communication protocol for your runtime.
```yaml
ai:
  agents:
    myAgent:
      protocol: HTTP # HTTP, MCP, or A2A
```
| Protocol | Description | Use case |
|---|---|---|
| `HTTP` | Standard HTTP requests (default) | General-purpose agents, REST-like interactions |
| `MCP` | Model Context Protocol | Agents that expose tools via MCP |
| `A2A` | Agent-to-Agent | Multi-agent orchestration |
Create named endpoints for your runtime. Endpoints let you manage versioned access points -- for example, a production endpoint tracking the latest version and a staging endpoint pinned to a specific version.
```yaml
ai:
  agents:
    myAgent:
      endpoints:
        - name: production
          description: Production endpoint, always tracks latest
        - name: staging
          version: '1'
          description: Staging endpoint pinned to version 1
```
| Property | Required | Description |
|---|---|---|
| `name` | No | Endpoint name (auto-generated if omitted; defaults to `default`) |
| `version` | No | Pin to a specific runtime version (omit to track latest) |
| `description` | No | Human-readable description (max 256 chars) |
Each endpoint creates an `AWS::BedrockAgentCore::RuntimeEndpoint` CloudFormation resource with its own ARN, available as a stack output.
Pass configuration to your runtime via environment variables.
```yaml
ai:
  agents:
    myAgent:
      environment:
        MODEL_ID: us.anthropic.claude-sonnet-4-5-20250929-v1:0
        LOG_LEVEL: INFO
        MAX_TOKENS: '4096'
        API_ENDPOINT: https://api.example.com
```
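Inside the runtime these values arrive as plain strings in the process environment. A small sketch of reading them (variable names match the example above; the defaults are illustrative):

```python
import os


def load_config() -> dict:
    """Read runtime configuration from environment variables."""
    return {
        "model_id": os.environ.get("MODEL_ID", ""),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        # Environment values are always strings; convert numbers explicitly.
        "max_tokens": int(os.environ.get("MAX_TOKENS", "4096")),
        "api_endpoint": os.environ.get("API_ENDPOINT", ""),
    }
```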
Best practices:

- Use inference profile IDs (e.g. `us.anthropic.claude-...`) rather than bare model IDs for on-demand Bedrock models
- Quote numeric values (such as `MAX_TOKENS: '4096'`): environment variable values are always strings

Control which HTTP headers are passed through to your runtime.
```yaml
ai:
  agents:
    myAgent:
      requestHeaders:
        allowlist:
          - X-Trace-Id
          - X-Request-Id
          - X-Correlation-Id
```
Use cases include distributed tracing (`X-Trace-Id`), request correlation across services, and passing request metadata through to your agent.

Limits: Maximum of 20 headers in the allowlist.
The framework automatically creates an IAM role with required permissions. You can customize or replace it.
```yaml
ai:
  agents:
    myAgent: {} # Role created automatically
```
The auto-generated role includes permissions for:

Always included:

- Baseline runtime permissions, such as pulling the container image and writing logs

Conditional (added when configured):

- Memory read/write access (when `memory` is configured)
- Gateway invocation (when `gateway` is configured)

Use an existing IAM role:
```yaml
ai:
  agents:
    myAgent:
      role: arn:aws:iam::123456789012:role/MyCustomAgentRole
```
Add custom permissions to the auto-generated role:
```yaml
ai:
  agents:
    myAgent:
      role:
        name: my-agent-role # Optional: custom role name
        statements:
          - Effect: Allow
            Action:
              - s3:GetObject
              - s3:PutObject
            Resource: arn:aws:s3:::my-bucket/*
          - Effect: Allow
            Action: secretsmanager:GetSecretValue
            Resource: arn:aws:secretsmanager:us-east-1:123456789012:secret:my-secret-*
        managedPolicies:
          - arn:aws:iam::aws:policy/AmazonDynamoDBReadOnlyAccess
        permissionsBoundary: arn:aws:iam::123456789012:policy/MyPermissionsBoundary
        tags:
          CostCenter: AI-Team
```
| Property | Description |
|---|---|
| `name` | Custom role name (max 64 chars) |
| `statements` | Additional IAM policy statements |
| `managedPolicies` | ARNs of managed policies to attach |
| `permissionsBoundary` | ARN of the permissions boundary policy |
| `tags` | Tags to apply to the role |
Add metadata to your runtime for organization and cost tracking.
```yaml
ai:
  agents:
    myAgent:
      description: Production customer service agent with memory and tools
      tags:
        Team: AI
        Project: CustomerService
        Environment: production
        CostCenter: CC-1234
```
| Property | Limit | Description |
|---|---|---|
| `description` | 1200 chars | Human-readable description |
| `tags` | Standard AWS limits | Key-value pairs for resource tagging |
Here's a complete example showing all configuration options:
```yaml
service: my-ai-service

provider:
  name: aws
  region: us-east-1

ai:
  agents:
    myAgent:
      # Description
      description: Production AI agent with full configuration

      # Deployment (choose one approach)
      artifact:
        image:
          file: Dockerfile
          path: ./agent
          repository: my-agent-repo
          buildArgs:
            ENV: production
      # OR for code deployment:
      # handler: agent.py
      # runtime: python3.12

      # Protocol
      protocol: HTTP

      # Endpoints (named access points for the runtime)
      endpoints:
        - name: production
          description: Tracks latest version
        - name: staging
          version: '1'
          description: Pinned to version 1

      # Networking
      network:
        mode: PUBLIC
        # For VPC:
        # mode: VPC
        # subnets: [subnet-xxx]
        # securityGroups: [sg-xxx]

      # Authentication
      authorizer:
        type: CUSTOM_JWT
        jwt:
          discoveryUrl: https://cognito-idp.us-east-1.amazonaws.com/us-east-1_xxx/.well-known/openid-configuration
          allowedAudience:
            - my-client-id
          allowedClients:
            - my-client-id

      # Lifecycle
      lifecycle:
        idleRuntimeSessionTimeout: 900
        maxLifetime: 3600

      # Environment
      environment:
        MODEL_ID: us.anthropic.claude-sonnet-4-5-20250929-v1:0
        LOG_LEVEL: INFO

      # Headers
      requestHeaders:
        allowlist:
          - X-Trace-Id

      # Memory - enables conversation persistence (see memory.md)
      # Automatically adds memory read/write permissions to the runtime role
      memory: myMemory

      # Gateway - connects tools to your agent (see gateway.md)
      # Automatically adds gateway invocation permissions to the runtime role
      gateway: myGateway

      # IAM Role
      role:
        statements:
          - Effect: Allow
            Action: s3:GetObject
            Resource: arn:aws:s3:::my-bucket/*

      # Metadata
      tags:
        Team: AI
        Environment: production
```