docs/_advanced/models.md
{: .no_toc }
{{ page.description }} {: .fs-6 .fw-300 }
{: .no_toc .text-delta }
After reading this guide, you will know:
Model::Info.*_api_base settings.assume_model_exists.RubyLLM maintains an internal registry of known AI models, typically stored in lib/ruby_llm/models.json within the gem. This registry is populated by running the rake models:update task, which queries the APIs of configured providers to discover their available models and capabilities.
The registry stores crucial information about each model, including:
id: The unique identifier used by the provider (e.g., gpt-4o-2024-08-06).provider: The source provider (openai, anthropic, etc.).type: The model's primary function (chat, embedding, etc.).name: A human-friendly name.context_window: Max input tokens (e.g., 128_000).max_tokens: Max output tokens (e.g., 16_384).supports_vision: If it can process images and videos.supports_functions: If it can use [Tools]({% link _core_features/tools.md %}).input_price_per_million: Cost in USD per 1 million input tokens.output_price_per_million: Cost in USD per 1 million output tokens.family: A broader classification (e.g., gpt4o).This registry allows RubyLLM to validate models, route requests correctly, provide capability information, and offer convenient filtering.
You can see the full list of currently registered models in the [Available Models Guide]({% link _reference/available-models.md %}).
For Application Developers:
The recommended way to refresh models in your application is to call RubyLLM.models.refresh! directly:
# In your application code (console, background job, etc.)
RubyLLM.models.refresh!
puts "Refreshed in-memory model list."
This refreshes the in-memory model registry and is what you want 99% of the time. This method is safe to call from Rails applications, background jobs, or any running Ruby process.
Important: refresh! only updates the in-memory registry. To persist changes to disk, call:
RubyLLM.models.refresh!
RubyLLM.models.save_to_json # Saves to configured model_registry_file (v1.9.0+)
If your gem directory is read-only, configure a writable location with config.model_registry_file (v1.9.0+). See the [Configuration Guide]({% link _getting_started/configuration.md %}#model-registry-file) for details.
How refresh! Works:
The refresh! method performs the following steps:
The method returns a chainable Models instance, allowing you to immediately query the updated registry:
# Refresh and immediately query
chat_models = RubyLLM.models.refresh!.chat_models
Note: models.dev is the upstream registry for RubyLLM metadata. If you encounter issues with model data, please report them via the models.dev site or repo.
Local Provider Models:
By default, refresh! includes models from local providers like Ollama and GPUStack if they're configured. To exclude local providers and only fetch from remote APIs:
# Only fetch from remote providers (Anthropic, OpenAI, etc.)
RubyLLM.models.refresh!(remote_only: true)
This is useful when you want to refresh only cloud-based models without querying local model servers.
For Gem Development:
The rake models:update task is designed for gem maintainers and updates the models.json file shipped with the gem:
# Only for gem development - requires API keys and gem directory structure
bundle exec rake models:update
This task is not intended for Rails applications as it writes to gem directories and requires the full gem development environment.
Persisting Models to Your Database:
For Rails applications, the install generator sets up everything automatically:
bin/rails generate ruby_llm:install
bin/rails db:migrate
This creates the Model table and loads model data from the gem's registry.
To refresh model data from provider APIs:
# Fetches latest model info from configured providers (requires API keys)
Model.refresh!
Use RubyLLM.models to explore the registry.
# Get a collection of all registered models
all_models = RubyLLM.models.all
# Filter by type
chat_models = RubyLLM.models.chat_models
embedding_models = RubyLLM.models.embedding_models
# Filter by provider
openai_models = RubyLLM.models.by_provider(:openai) # or 'openai'
# Filter by model family (e.g., all Claude 3 Sonnet variants)
claude3_sonnet_family = RubyLLM.models.by_family('claude3_sonnet')
# Chain filters and use Enumerable methods
openai_vision_models = RubyLLM.models.by_provider(:openai)
.select(&:supports_vision?)
puts "Found #{openai_vision_models.count} OpenAI vision models."
Use find to get a Model::Info object containing details about a specific model.
# Find by exact ID or alias
model_info = RubyLLM.models.find('{{ site.models.openai_tools }}')
if model_info
puts "Model: #{model_info.name}"
puts "Provider: #{model_info.provider}"
puts "Context Window: #{model_info.context_window} tokens"
else
puts "Model not found."
end
# Find raises ModelNotFoundError if the ID is unknown
# RubyLLM.models.find('no-such-model-exists') # => raises ModelNotFoundError
RubyLLM uses aliases (defined in lib/ruby_llm/aliases.json) for convenience, mapping common names to specific versions.
# '{{ site.models.anthropic_current }}' might resolve to 'claude-3-5-sonnet-20241022'
chat = RubyLLM.chat(model: '{{ site.models.anthropic_current }}')
puts chat.model.id # => "claude-3-5-sonnet-20241022" (or latest version)
When you call find without a provider, RubyLLM prioritizes exact ID matches before falling back to aliases.
Specify the provider if the same alias exists across multiple providers.
# Get Claude 3.5 Sonnet from Anthropic
model_anthropic = RubyLLM.models.find('{{ site.models.anthropic_current }}', :anthropic)
# Get Claude 3.5 Sonnet via AWS Bedrock
model_bedrock = RubyLLM.models.find('{{ site.models.anthropic_current }}', :bedrock)
When you pass a provider, RubyLLM resolves aliases first. For Bedrock, it then applies region/inference-profile resolution (for example us. prefixes) before falling back to an exact ID match.
{: .d-inline-block }
Sometimes you need to interact with models or endpoints not covered by the standard registry, such as:
RubyLLM offers two mechanisms for these cases:
openai_api_base)If you need to target an endpoint that uses the OpenAI API format but has a different URL, configure openai_api_base in RubyLLM.configure.
# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
config.openai_api_key = ENV['AZURE_OPENAI_KEY'] # Key for your endpoint
config.openai_api_base = "https://YOUR_AZURE_RESOURCE.openai.azure.com" # Your endpoint
end
provider: :openai.https://api.openai.com/v1.assume_model_exists)To use a model identifier not listed in RubyLLM's registry, use the assume_model_exists: true flag. This tells RubyLLM to bypass its validation check.
# Example: Using a custom Azure deployment name
# Assumes openai_api_base is configured for your Azure endpoint
chat = RubyLLM.chat(
model: 'my-company-secure-gpt4o', # Your custom deployment name
provider: :openai, # MUST specify provider
assume_model_exists: true # Bypass registry check
)
response = chat.ask("Internal knowledge query...")
puts response.content
# You can also use it in .with_model
chat.with_model(
'gpt-5-alpha',
provider: :openai, # MUST specify provider
assume_exists: true
)
The assume_model_exists flag also works with RubyLLM.embed and RubyLLM.paint for embedding and image generation models:
# Custom embedding model
embedding = RubyLLM.embed(
"Test text",
model: 'my-custom-embedder',
provider: :openai,
assume_model_exists: true
)
# Custom image model
image = RubyLLM.paint(
"A beautiful landscape",
model: 'my-custom-dalle',
provider: :openai,
assume_model_exists: true
)
Key Points when Assuming Existence:
provider: is Mandatory: You must tell RubyLLM which API format to use (ArgumentError otherwise).supports_functions?) are bypassed by assuming true. You are responsible for ensuring the model supports the features you use.Use these features when the standard registry doesn't cover your specific model or endpoint needs. For standard models, rely on the registry for validation and capability awareness. See the [Chat Guide]({% link _core_features/chat.md %}) for more on using the chat object.