Gemini CLI offers a generous free tier that covers many individual developers' use cases. For enterprise or professional usage, or if you need increased quota, several options are available depending on your authentication method and account type.
For a high-level comparison of available subscriptions and to select the right quota for your needs, see the Plans page.
This article outlines the specific quotas and pricing applicable to Gemini CLI when using different authentication methods.
The following table summarizes the available quotas and their respective limits:
| Authentication method | Tier / Subscription | Maximum requests per user per day |
|---|---|---|
| Google account | Gemini Code Assist (Individual) | 1,000 requests |
| Google account | Google AI Pro | 1,500 requests |
| Google account | Google AI Ultra | 2,000 requests |
| Gemini API key | Free tier (Unpaid) | 250 requests |
| Gemini API key | Pay-as-you-go (Paid) | Varies |
| Vertex AI | Express mode (Free) | Varies |
| Vertex AI | Pay-as-you-go (Paid) | Varies |
| Google Workspace | Code Assist Standard | 1,500 requests |
| Google Workspace | Code Assist Enterprise | 2,000 requests |
| Google Workspace | Workspace AI Ultra | 2,000 requests |
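
As a rough illustration, the daily limits in the table above could be encoded as a lookup to check how many requests remain on a given tier. This is only a sketch: the names `DAILY_LIMITS` and `remaining_requests` are hypothetical and not part of Gemini CLI, and the caller has to supply the number of requests already used.

```python
# Daily request limits copied from the table above.
# None means the table lists the limit as "Varies".
DAILY_LIMITS = {
    ("Google account", "Gemini Code Assist (Individual)"): 1000,
    ("Google account", "Google AI Pro"): 1500,
    ("Google account", "Google AI Ultra"): 2000,
    ("Gemini API key", "Free tier (Unpaid)"): 250,
    ("Gemini API key", "Pay-as-you-go (Paid)"): None,
    ("Vertex AI", "Express mode (Free)"): None,
    ("Vertex AI", "Pay-as-you-go (Paid)"): None,
    ("Google Workspace", "Code Assist Standard"): 1500,
    ("Google Workspace", "Code Assist Enterprise"): 2000,
    ("Google Workspace", "Workspace AI Ultra"): 2000,
}


def remaining_requests(auth_method, tier, used_today):
    """Return the requests left today, or None when the limit varies."""
    limit = DAILY_LIMITS[(auth_method, tier)]
    if limit is None:
        return None
    # Never report a negative remainder once the limit is exceeded.
    return max(limit - used_today, 0)


print(remaining_requests("Google account", "Google AI Pro", 1200))  # 300
```

Tiers whose limit is listed as "Varies" return `None`, so a caller can tell "no fixed daily limit in the table" apart from "no requests left".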
Generally, there are three categories to choose from: free usage, fixed-price subscriptions, and pay-as-you-go billing.
Requests are limited per user per minute and are subject to the availability of the service in times of high demand.
Access to Gemini CLI begins with a generous free tier, perfect for experimentation and light use.
Your free usage is governed by the following limits, which depend on your authentication method.
The first free tier is for users who authenticate with their Google account to access Gemini Code Assist for individuals. Per the table above, it includes up to 1,000 requests per user per day.
Learn more at Gemini Code Assist for Individuals Limits.
If you are using a Gemini API key, you can also benefit from a free tier. Per the table above, it includes up to 250 requests per user per day.
Learn more at Gemini API Rate Limits.
Vertex AI offers an Express Mode that does not require you to enable billing. Its request limits vary.
Learn more at Vertex AI Express Mode Limits.
If you use up your initial number of requests, you can continue to benefit from Gemini CLI by upgrading to one of the following subscriptions:
These tiers apply when you sign in with a personal account. To verify whether you're on a personal account, visit Google One.
Supported tiers: Google AI Pro and Google AI Ultra. Tiers not listed here, including Google AI Plus, are not supported.
These subscriptions are recommended for individual developers. Quotas and pricing are based on a fixed-price subscription.
For predictable costs, you can sign in with Google.
Learn more at Gemini Code Assist Quotas and Limits.
These tiers apply when you sign in with a Google Workspace account.
Supported tiers are those listed for Google Workspace in the table above. Other tiers, including Workspace AI Standard/Plus and AI Expanded, are not supported.
You can purchase a Gemini Code Assist subscription through Google Cloud.
Quotas and pricing are based on a fixed-price subscription with assigned license seats. For predictable costs, you can sign in with Google.
Per the table above, this includes 1,500 requests per user per day for Code Assist Standard and 2,000 for Code Assist Enterprise.
If you hit your daily request limits or exhaust your Gemini Pro quota even after upgrading, the most flexible solution is to switch to a pay-as-you-go model, where you pay for the specific amount of processing you use. This is the recommended path for uninterrupted access.
To do this, log in using a Gemini API key or Vertex AI.
Vertex AI is an enterprise-grade platform for building, deploying, and managing AI models, including Gemini. It offers enhanced security, data governance, and integration with other Google Cloud services.
Learn more at Vertex AI Dynamic Shared Quota and Vertex AI Pricing.
A Gemini API key is ideal for developers who want to quickly build applications with the Gemini models. This is the most direct way to use the models.
Learn more at Gemini API Rate Limits and Gemini API Pricing.
Note that when you use an API key, you pay per token or call. This can be more expensive for many small calls with few tokens, but it's the only way to ensure your workflow isn't interrupted by reaching a quota limit.
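
To make the per-token billing concrete, here is a minimal arithmetic sketch. The prices and token counts below are made-up placeholders, not actual Gemini API pricing, and the split into separate input and output prices is an assumption for illustration; consult the pricing pages referenced above for real figures. The function name `estimate_cost` is hypothetical.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of token usage at the given per-million-token prices."""
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )


# Hypothetical prices in USD per million tokens; NOT actual Gemini API pricing.
INPUT_PRICE = 1.00
OUTPUT_PRICE = 4.00

# Made-up token counts, such as you might note down from a session summary.
cost = estimate_cost(500_000, 50_000, INPUT_PRICE, OUTPUT_PRICE)
print(f"Estimated session cost: ${cost:.2f}")  # Estimated session cost: $0.70
```

Plugging in your own token counts and the published prices gives a quick estimate of what a session would cost under pay-as-you-go billing.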
These plans currently apply only to Google's web-based Gemini products (for example, the Gemini web app or the Flow video editor). They do not apply to the API usage that powers Gemini CLI. Support for these plans is under active consideration.
You can check your current token usage and applicable limits using the
`/stats model` command. This command provides a snapshot of your current
session's token usage, as well as information about the limits associated with
your current quota.
For more information on the `/stats` command and its subcommands, see the
Command Reference.
A summary of model usage is also presented on exit at the end of a session.
When using a pay-as-you-go plan, be mindful of your usage to avoid unexpected costs.
Use the `/stats model` command to track your token
usage during a session. This can help you stay aware of your spending in real
time.