Rate limits and data retention

API rate limits

<Note> **Rate limits are subject to change.**

Contact Prefect support with questions about rate limits. </Note>

Prefect Cloud applies rate limits at the account level to ensure system stability. When a limit is exceeded, endpoints return a 429 Too Many Requests response with a Retry-After header that indicates how long to wait before retrying.

Rate limits are organized into two buckets. All API paths within a bucket share a single counter. There are no sub-limits per resource type.

API Requests

Every HTTP call to the Prefect Cloud orchestration API counts as one request toward this bucket, including:

- Creating, reading, updating, and deleting flows, flow runs, task runs, deployments, work pools, work queues, blocks, artifacts, variables, and concurrency limits
- Flow run and task run state transitions (each set_state call counts as one request)
- Reading events and logs (filter and count queries)

<Note> Concurrency limit operations count toward this bucket. When a task uses a concurrency limit tag, the SDK makes API calls to acquire and release the concurrency slot. This means a task with a concurrency limit tag generates additional API requests beyond the task run state transitions themselves. </Note>

Logs + Events

Every log message and every event submitted to Prefect Cloud counts as one request toward this bucket, including:

- Log writes: Each individual log message counts as one request (not each HTTP batch). This includes all logs sent by flows, tasks, and Prefect infrastructure (such as worker and agent logs).
- Event submissions: Each event submitted through the /events HTTP endpoint counts as one request.
- Task run events via WebSocket: The Prefect SDK streams task run lifecycle events (such as Running, Completed, and Failed state changes) to Prefect Cloud over a WebSocket connection. Each event counts as one request toward this bucket. These events are emitted automatically by the SDK for every @task execution and are not manually created.
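To make the counting rules above concrete, here is a minimal sketch that adds up the three contributions for one minute of activity. The function name is hypothetical, and `events_per_task_run` is an assumption you supply yourself: the number of lifecycle events the SDK emits per task run is not stated here and depends on how each run progresses.

```python
def logs_events_requests(
    log_messages: int,
    http_events: int,
    task_runs: int,
    events_per_task_run: int,
) -> int:
    """Estimate requests charged to the Logs + Events bucket.

    Each log message and each event counts as one request.
    `events_per_task_run` is an assumption, not a documented constant.
    """
    task_run_events = task_runs * events_per_task_run
    return log_messages + http_events + task_run_events


# Example: 1,200 log messages, 50 events sent to /events, and 200 task runs
# that each emit an assumed 3 lifecycle events, all within one minute.
estimate = logs_events_requests(1200, 50, 200, 3)
```

With these assumed inputs the estimate is 1,850 requests for that minute.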

Limits by plan

| Plan | API Requests (per minute) | Logs + Events (per minute) |
| --- | --- | --- |
| Hobby | 625 | 2,000 |
| Starter | 1,250 | 2,800 |
| Team | 2,500 | 8,000 |
| Pro | 5,000 | 40,000 |
| Enterprise | Custom | Custom |

For Enterprise limits, contact Prefect's sales team.
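As a rough planning aid, the published limits can be encoded in a lookup table and compared against a peak per-minute estimate. This is only a sketch: `PLAN_LIMITS` and `smallest_fitting_plan` are hypothetical names, the numbers are copied from the table above and may change, and Enterprise is left out because its limits are custom.

```python
from typing import Optional

# (API requests per minute, Logs + Events per minute), cheapest plan first.
# Values copied from the table above; they are subject to change.
PLAN_LIMITS = {
    "Hobby": (625, 2_000),
    "Starter": (1_250, 2_800),
    "Team": (2_500, 8_000),
    "Pro": (5_000, 40_000),
}


def smallest_fitting_plan(api_per_min: int, logs_events_per_min: int) -> Optional[str]:
    """Return the first plan whose two buckets both cover the given peaks.

    Returns None when even Pro is exceeded, in which case custom
    (Enterprise) limits would be needed.
    """
    for plan, (api_limit, logs_limit) in PLAN_LIMITS.items():
        if api_per_min <= api_limit and logs_events_per_min <= logs_limit:
            return plan
    return None
```

Both buckets have to fit: a workload with few API requests but heavy logging can still require a larger plan.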

Monitor your usage

View your account's rate limit usage in the Prefect Cloud UI under Account Settings > Rate Limits. The dashboard shows current usage against your limits and highlights the top contributors to request volume, such as specific API keys, workspaces, or API routes.

Reduce request volume

If you're approaching your rate limits, consider the following:

- Reduce flow run volume by batching work into fewer runs or using task runs within a single flow instead of launching many small flows.
- Optimize logging by adjusting log levels or reducing the volume of log output from your flows.
- Use webhooks and automations instead of polling the API for state changes.
- Upgrade your plan if your workload consistently exceeds your current limits.
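As one illustration of the logging suggestion, a loop that logs once per item can often be replaced with a single summary message. The sketch below uses only Python's standard `logging` module. `process_batch` and `handle` are hypothetical names, and in a flow you would pass whichever logger your code already uses.

```python
import logging
from typing import Callable, Iterable, List


def process_batch(
    items: Iterable[int],
    handle: Callable[[int], None],
    logger: logging.Logger,
) -> List[int]:
    """Run `handle` on each item and log one summary line for the batch."""
    failed: List[int] = []
    total = 0
    for item in items:
        total += 1
        try:
            handle(item)
        except Exception:
            # Collect failures instead of emitting one log message per item.
            failed.append(item)
    # A single log message, regardless of how many items were processed.
    logger.info("processed %d items, %d failed", total, len(failed))
    return failed
```

Because each log message counts as one request, a batch of 1,000 items handled this way costs one request in the Logs + Events bucket instead of 1,000.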

SDK retry behavior

The Prefect SDK automatically retries rate-limited requests up to 5 times, using the delay specified in the Retry-After header. You can customize this behavior through client settings.
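If you call the API without the SDK, you need to handle 429 responses yourself. The following is a minimal sketch of a comparable retry loop, not the SDK's actual implementation. It assumes a hypothetical `send` callable that performs one request and returns a `(status_code, headers)` pair, and it only understands a Retry-After value given in seconds, falling back to `default_delay` otherwise.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def parse_retry_after(value: Optional[str], default: float) -> float:
    """Interpret Retry-After as a number of seconds, else use the default."""
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        # HTTP-date values are not handled in this sketch.
        return default


def call_with_retry_after(
    send: Callable[[], Response],
    max_retries: int = 5,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying on 429 up to `max_retries` times."""
    retries = 0
    while True:
        status, headers = send()
        if status != 429 or retries >= max_retries:
            return status, headers
        # Assumes the header key is spelled exactly "Retry-After".
        sleep(parse_retry_after(headers.get("Retry-After"), default_delay))
        retries += 1
```

The `sleep` parameter exists so the loop can be exercised in tests without waiting. If every attempt is rate limited, the last 429 response is returned to the caller.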

Request size limits

Prefect Cloud enforces a 5 MiB (5,242,880 byte) maximum on request body size for all API endpoints. Requests that exceed this limit receive a 413 Content Too Large response. The Content-Length header is required on every POST, PUT, and PATCH request; omitting it returns a 411 Length Required response.

This limit applies to the entire HTTP request body, which includes JSON encoding overhead. Common scenarios where you may encounter this limit:

- Large artifacts: Table or Markdown artifacts with many rows or large cell values.
- Bulk log writes: Batches of log entries that together exceed 5 MiB.
- Flow run inputs: Large payloads passed through flow.send_input() or send_input().
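Because the limit applies to the encoded body rather than to the in-memory object, it can help to measure a payload's JSON size before sending it. The sketch below uses the standard `json` module with its default settings. The function names are hypothetical, and the result is an approximation: the client that actually sends the request may serialize differently and may wrap the payload in additional fields.

```python
import json
from typing import Any

MAX_BODY_BYTES = 5 * 1024 * 1024  # 5 MiB = 5,242,880 bytes


def encoded_size(payload: Any) -> int:
    """Size in bytes of the payload once JSON-encoded as UTF-8."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_request_limit(payload: Any) -> bool:
    """True if the JSON-encoded payload is within the 5 MiB body limit."""
    return encoded_size(payload) <= MAX_BODY_BYTES
```

Note that even a plain string grows when encoded, since JSON adds the surrounding quotes and escapes some characters.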

Reduce request size

If your requests are approaching or exceeding the 5 MiB limit:

- Paginate or chunk large artifacts by splitting data across multiple smaller artifact calls instead of one large request.
- Store large results externally and save a reference (such as a URL or storage key) as the artifact value instead of the full dataset.
- Reduce log volume per batch by lowering the PREFECT_LOGGING_TO_API_BATCH_SIZE setting so each HTTP request carries fewer log entries.
- Trim flow run input payloads by passing references to external storage instead of large inline values.
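The chunking suggestion can be sketched as a generator that groups rows so that each group's JSON encoding stays under a byte budget. `chunk_rows` is a hypothetical helper, not part of the Prefect SDK. It assumes rows are JSON-serializable and are encoded with the default `json.dumps` separators, and it leaves the budget to you: pick a value comfortably below 5 MiB, since the request that carries each chunk adds its own fields.

```python
import json
from typing import Any, Iterable, Iterator, List


def chunk_rows(rows: Iterable[Any], max_bytes: int) -> Iterator[List[Any]]:
    """Yield lists of rows whose JSON encoding is at most `max_bytes` each."""
    chunk: List[Any] = []
    size = 2  # the enclosing "[" and "]"
    for row in rows:
        row_size = len(json.dumps(row).encode("utf-8"))
        if row_size + 2 > max_bytes:
            raise ValueError("a single row exceeds the byte budget")
        # Every row after the first also costs a ", " separator (2 bytes).
        extra = row_size + (2 if chunk else 0)
        if size + extra > max_bytes:
            yield chunk
            chunk, size = [], 2
            extra = row_size
        chunk.append(row)
        size += extra
    if chunk:
        yield chunk
```

Each yielded chunk could then be sent as its own, smaller artifact instead of one large request.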

Query limits

Each event query can span at most 14 days. You can query any 14-day period within your retention period, and to cover a longer range you issue several queries. If your plan's retention period is shorter than 14 days, the query window is limited to your retention period.
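To cover a range longer than 14 days, you can split it into consecutive windows and run one query per window. The helper below is a minimal sketch with a hypothetical name. It only computes the window boundaries and does not call any Prefect API.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def query_windows(
    start: datetime,
    end: datetime,
    max_span: timedelta = timedelta(days=14),
) -> List[Tuple[datetime, datetime]]:
    """Split [start, end] into consecutive windows no longer than `max_span`."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + max_span, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows
```

Adjacent windows share a boundary timestamp. Whether the query bounds are inclusive is not specified here, so deduplicate results by event ID if an event could fall exactly on a boundary.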

Metadata retention

Prefect Cloud retains flow run, task run, and artifact metadata for a limited time based on your plan. This applies to all workspaces in your account.

Retention periods by plan

- Hobby and Starter: 7 days
- Team: 14 days
- Pro: 30 days
- Enterprise: Custom retention periods

Metadata is retained from creation until the retention period expires. For flow runs and task runs, however, the retention period starts when the run reaches a terminal state rather than when it is created. Subflow runs are retained independently of their parent flow runs.

For custom retention periods, contact Prefect's sales team.
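As a worked example of the retention rule for runs, the sketch below computes the earliest time at which a run's metadata becomes eligible for removal. `RETENTION_DAYS` and `retention_expiry` are hypothetical names, the day counts are copied from the list above, and Enterprise is omitted because its retention is custom. When removal actually happens after that point is not specified here.

```python
from datetime import datetime, timedelta, timezone

# Retention periods in days, copied from the plan list above.
RETENTION_DAYS = {"Hobby": 7, "Starter": 7, "Team": 14, "Pro": 30}


def retention_expiry(plan: str, terminal_state_time: datetime) -> datetime:
    """Return when the retention period ends for a flow run or task run.

    The period is counted from the moment the run reached a terminal state.
    """
    return terminal_state_time + timedelta(days=RETENTION_DAYS[plan])


# Example: a run on the Team plan that completed on 1 March 2024 (UTC).
completed_at = datetime(2024, 3, 1, tzinfo=timezone.utc)
expires_at = retention_expiry("Team", completed_at)
```

In this example `expires_at` is 15 March 2024 (UTC), 14 days after the run completed.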