# Embeddings
{: .no_toc }

{{ page.description }}
{: .fs-6 .fw-300 }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}
After reading this guide, you will know:

*   How to generate embedding vectors for single texts and batches.
*   How to choose an embedding model and control output dimensions.
*   How to compare embeddings using cosine similarity.
*   How to handle errors and store embeddings in a Rails application.
## Creating an Embedding

The simplest way to create an embedding is with the global `RubyLLM.embed` method:

```ruby
# Create an embedding for a single text
embedding = RubyLLM.embed("Ruby is a programmer's best friend")

# The vector representation (an array of floats)
vector = embedding.vectors
puts "Vector dimension: #{vector.length}" # e.g., 1536 for {{ site.models.embedding_small }}

# Access metadata
puts "Model used: #{embedding.model}"
puts "Input tokens: #{embedding.input_tokens}"
```
## Embedding Multiple Texts

You can efficiently embed multiple texts in a single API call:

```ruby
texts = ["Ruby", "Python", "JavaScript"]
embeddings = RubyLLM.embed(texts)

# Each text gets its own vector within the `vectors` array
puts "Number of vectors: #{embeddings.vectors.length}" # => 3
puts "First vector dimensions: #{embeddings.vectors.first.length}"
puts "Model used: #{embeddings.model}"
puts "Total input tokens: #{embeddings.input_tokens}"
```
Batching multiple texts is generally more performant and cost-effective than making individual requests for each text.
{: .note }
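Many provider APIs also cap how many inputs a single request may contain, so large corpora are usually embedded in fixed-size chunks. A minimal sketch in plain Ruby: the chunk size of 100 is an arbitrary illustration, and `embed_batch` is a hypothetical stand-in for `RubyLLM.embed` (stubbed here so the snippet runs without an API key):

```ruby
# Embed a large corpus in fixed-size chunks to stay under per-request limits.
def embed_in_chunks(texts, chunk_size: 100)
  texts.each_slice(chunk_size).flat_map do |chunk|
    embed_batch(chunk) # one API call per chunk, one vector per text
  end
end

# Stub standing in for RubyLLM.embed(chunk).vectors: each text "embeds"
# to a tiny fake 3-dimensional vector.
def embed_batch(chunk)
  chunk.map { |text| [text.length.to_f, 0.0, 1.0] }
end

corpus = Array.new(250) { |i| "document #{i}" }
vectors = embed_in_chunks(corpus)
puts vectors.length # => 250
```

With a real provider, each `embed_batch` call maps to one `RubyLLM.embed(chunk)` request, so 250 documents cost three API calls at a chunk size of 100 instead of 250.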
## Choosing Models

By default, RubyLLM uses a capable default embedding model (like OpenAI's {{ site.models.embedding_small }}), but you can specify a different one using the `model:` argument.
```ruby
# Use a specific OpenAI model
embedding_large = RubyLLM.embed(
  "This is a test sentence",
  model: "{{ site.models.embedding_large }}"
)

# Or use a Google model
embedding_google = RubyLLM.embed(
  "This is another test sentence",
  model: "{{ site.models.embedding_google }}" # Google's model
)

# Use a model not in the registry (useful for custom endpoints)
embedding_custom = RubyLLM.embed(
  "Custom model test",
  model: "my-custom-embedding-model",
  provider: :openai,
  assume_model_exists: true
)
```
You can configure the default embedding model globally:

```ruby
RubyLLM.configure do |config|
  config.default_embedding_model = "{{ site.models.embedding_large }}"
end
```
Refer to the [Working with Models Guide]({% link _advanced/models.md %}) for details on finding available embedding models and their capabilities.
## Choosing Dimensions

Each embedding model has its own default output dimensions. For example, OpenAI's {{ site.models.embedding_small }} outputs 1536 dimensions by default, while {{ site.models.embedding_large }} outputs 3072. RubyLLM allows you to specify these dimensions per request:

```ruby
embedding = RubyLLM.embed(
  "This is a test sentence",
  model: "{{ site.models.embedding_small }}",
  dimensions: 512
)
```
This is particularly useful when you want to reduce storage and search costs, or when your vector database column is declared with a fixed size that differs from the model's default.

Note that not all models support custom dimensions. If you specify dimensions that aren't supported by the chosen model, RubyLLM will use the model's default dimensions.
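For models trained with Matryoshka-style representations (OpenAI's text-embedding-3 family, for example), a full-length vector can also be shortened client-side by truncating it and rescaling to unit length. A sketch in plain Ruby, using a tiny hand-written vector in place of a real embedding:

```ruby
# Shorten an embedding client-side: keep the first `dims` components,
# then rescale to unit length so cosine similarity stays meaningful.
def truncate_embedding(vector, dims)
  head = vector.take(dims)
  norm = Math.sqrt(head.sum { |x| x * x })
  head.map { |x| x / norm }
end

full = [0.6, 0.8, 0.0, 0.0] # stand-in for a model's full-length vector
short = truncate_embedding(full, 2)
puts short.length # => 2
```

Renormalizing matters: without it, truncated vectors would have varying magnitudes and dot products would no longer approximate cosine similarity.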
## Working with Embedding Results

The embedding result contains useful information:

```ruby
embedding = RubyLLM.embed("Example text")

# The vector representation
puts embedding.vectors.class # => Array
puts embedding.vectors.first.class # => Float

# The vector dimensions
puts embedding.vectors.length # => 1536

# The model used
puts embedding.model # => "{{ site.models.embedding_small }}"
```
## Calculating Similarity

A primary use case for embeddings is measuring the semantic similarity between texts. Cosine similarity is a common metric.

```ruby
require 'matrix' # Ruby's Vector class lives in the 'matrix' library

embedding1 = RubyLLM.embed("I love Ruby programming")
embedding2 = RubyLLM.embed("Ruby is my favorite language")

# Convert embedding vectors to Ruby Vector objects
vector1 = Vector.elements(embedding1.vectors)
vector2 = Vector.elements(embedding2.vectors)

# Calculate cosine similarity (value between -1 and 1, closer to 1 means more similar)
similarity = vector1.inner_product(vector2) / (vector1.norm * vector2.norm)
puts "Similarity: #{similarity.round(4)}" # => e.g., 0.9123
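If you'd rather not depend on the `matrix` library (a bundled gem since Ruby 3.1), cosine similarity is a few lines over plain arrays. A sketch using hand-written vectors in place of real embeddings:

```ruby
# Cosine similarity over plain Float arrays, no 'matrix' dependency.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0] # same direction => similarity of 1.0
puts cosine_similarity(v1, v2).round(4) # => 1.0
```

In a real application `v1` and `v2` would come from `RubyLLM.embed(...).vectors`.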
## Error Handling

Embedding API calls can fail for various reasons. Handle errors gracefully:

```ruby
begin
  embedding = RubyLLM.embed("Your text here")
  # Process embedding...
rescue RubyLLM::Error => e
  # Handle API errors
  puts "Embedding failed: #{e.message}"
end
```
For comprehensive error handling patterns and retry strategies, see the [Error Handling Guide]({% link _advanced/error-handling.md %}).
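Transient failures such as rate limits or timeouts can often be retried. A generic sketch of a retry wrapper with exponential backoff; the attempt count and delays are illustrative, and a plain `StandardError` stands in for `RubyLLM::Error` so the snippet runs standalone:

```ruby
# Retry a block up to `attempts` times with exponential backoff between
# failures; re-raises the last error once attempts are exhausted.
def with_retries(attempts: 3, base_delay: 0.0)
  tries = 0
  begin
    yield
  rescue StandardError => e
    tries += 1
    raise e if tries >= attempts
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Simulated flaky call: fails twice, then succeeds.
calls = 0
result = with_retries(attempts: 3) do
  calls += 1
  raise "transient failure" if calls < 3
  "embedded!"
end
puts result # => "embedded!"
puts calls  # => 3
```

In practice the block body would be `RubyLLM.embed(text)` and you would rescue `RubyLLM::Error` specifically, with a nonzero `base_delay`.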
## Performance Considerations

*   **Batch when possible**: `RubyLLM.embed(["text1", "text2"])` is much faster than calling `RubyLLM.embed` twice.
*   **Mind the dimensions**: vector size affects storage and search cost (e.g., {{ site.models.embedding_small }} uses 1536 dimensions, {{ site.models.embedding_large }} uses 3072).

## Using Embeddings in Rails

In a Rails application using PostgreSQL with the pgvector extension, you might store and search embeddings like this:
```ruby
# Migration:
# add_column :documents, :embedding, :vector, limit: 1536 # Match your model's dimensions

# app/models/document.rb
class Document < ApplicationRecord
  has_neighbors :embedding # From the neighbor gem for pgvector

  # Automatically generate embedding before saving if content changed
  before_save :generate_embedding, if: :content_changed?

  # Scope for nearest neighbor search
  scope :search_by_similarity, ->(query_text, limit: 5) {
    query_embedding = RubyLLM.embed(query_text).vectors
    nearest_neighbors(:embedding, query_embedding, distance: :cosine).limit(limit)
  }

  private

  def generate_embedding
    return if content.blank?

    begin
      embedding_result = RubyLLM.embed(content) # Uses default embedding model
      self.embedding = embedding_result.vectors
    rescue RubyLLM::Error => e
      errors.add(:base, "Failed to generate embedding: #{e.message}")
      # Prevent saving if embedding fails (optional, depending on requirements)
      throw :abort
    end
  end
end

# Usage in controller or console:
# Document.create(title: "Intro to Ruby", content: "Ruby is a dynamic language...")
# results = Document.search_by_similarity("What is Ruby?")
# results.each { |doc| puts "- #{doc.title}" }
```
This Rails example assumes you have the `pgvector` extension enabled in PostgreSQL and are using a gem like `neighbor` for ActiveRecord integration.
{: .note }
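Outside Rails, the same nearest-neighbor idea works in memory: embed every document once, then rank them by cosine similarity against the query vector. A self-contained sketch with toy 2-dimensional vectors standing in for real embeddings:

```ruby
# Cosine similarity over plain Float arrays.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Brute-force top-k search: rank (title, vector) pairs by similarity
# to the query vector, most similar first.
def top_k(query_vector, docs, k: 2)
  docs.max_by(k) { |_title, vec| cosine(query_vector, vec) }
      .map { |title, _vec| title }
end

# Toy vectors in place of RubyLLM.embed(...).vectors
docs = {
  "Ruby guide"   => [0.9, 0.1],
  "Python guide" => [0.1, 0.9],
  "Rails book"   => [0.8, 0.2]
}
puts top_k([1.0, 0.0], docs, k: 2).inspect # => ["Ruby guide", "Rails book"]
```

Brute force is fine for a few thousand documents; beyond that, an index like pgvector's HNSW (as in the Rails example above) avoids scanning every vector per query.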
## Next Steps

Now that you understand embeddings, you might want to explore:

*   [Working with Models]({% link _advanced/models.md %}) to find other embedding models and their capabilities.
*   [Error Handling]({% link _advanced/error-handling.md %}) for robust retry and fallback patterns.