docs/score_fusion.md
Bleve supports hybrid search that combines full-text search (FTS) with vector (kNN) search to leverage the strengths of both approaches.
From v2.5.4 onwards, hybrid search lets you choose between different score fusion strategies for combining the results of the two search methods. This document describes the available fusion strategies and how to use them.
By default, Bleve combines FTS and kNN scores using a simple weighted addition. See the Vector Search documentation for details on the default hybrid search behavior and examples.
While this approach works well with proper boost tuning, it can be sensitive to different score scales and distributions. The fusion strategies below (RRF and RSF) provide more robust alternatives that reconcile the differing scales automatically: RRF by using ranks instead of raw scores, and RSF by normalizing scores before combining them.
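To make the scale sensitivity concrete, here is a minimal sketch of plain weighted addition. The helper `additive` and the score values are hypothetical and chosen only for illustration; this is not Bleve's implementation.

```go
package main

import "fmt"

// additive combines one FTS score and one kNN score by weighted addition.
// Hypothetical helper for illustration only, not Bleve's code.
func additive(ftsBoost, ftsScore, knnBoost, knnScore float64) float64 {
	return ftsBoost*ftsScore + knnBoost*knnScore
}

func main() {
	// Illustrative values: the FTS scores here sit on a much larger scale
	// than the kNN scores.
	docA := additive(1.0, 12.0, 1.0, 0.10) // strong text match, weak vector match
	docB := additive(1.0, 9.0, 1.0, 0.95)  // weaker text match, strong vector match
	fmt.Printf("docA=%.2f docB=%.2f\n", docA, docB)
}
```

With equal boosts, the larger-scale FTS score decides the order in this example, and the kNN score only starts to matter once the boosts are retuned to compensate.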
Reciprocal Rank Fusion is a rank-based algorithm that combines results based on their position in each result list, rather than their raw scores. This makes it robust to different score scales and distributions.
Algorithm:
For each document appearing in FTS or kNN results, the RRF score is calculated as:
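$$RRF\_score = w_{\text{fts}} \cdot \frac{1}{k + \text{rank}_{\text{fts}}} + \sum_{i=1}^{n} w_{\text{knn}_i} \cdot \frac{1}{k + \text{rank}_{\text{knn}_i}}$$

Where:

- $w_{\text{fts}}$ is the weight (boost) of the FTS query
- $\text{rank}_{\text{fts}}$ is the document's rank in the FTS results
- $n$ is the number of kNN queries in the request
- $w_{\text{knn}_i}$ is the weight (boost) of the $i$-th kNN query
- $\text{rank}_{\text{knn}_i}$ is the document's rank in the results of the $i$-th kNN query
- $k$ is the rank constant (ScoreRankConstant)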
Advantages:
Disadvantages:
Usage:
// Create a hybrid search with RRF fusion
searchRequest := bleve.NewSearchRequest(bleve.NewMatchQuery("dark chocolate"))
searchRequest.Score = bleve.ScoreRRF // Alternatively, set to "rrf"
// Add first kNN component
searchRequest.AddKNN(
"embedding", // Vector field
[]float32{0.1, 0.2, 0.3, 0.4}, // Query vector
30, // k neighbors
1.0, // kNN weight (boost)
)
// Add second kNN component (optional - you can add multiple)
searchRequest.AddKNN(
"image_embedding", // Different vector field
[]float32{0.5, 0.3, 0.1, 0.8}, // Query vector
20, // k neighbors
0.5, // kNN weight (boost)
)
// Optional: Configure RRF parameters
params := bleve.RequestParams{
ScoreRankConstant: 60, // Rank constant (default: 60)
ScoreWindowSize: 150, // Window size (default: size)
}
searchRequest.AddParams(params)
searchResult, err := index.Search(searchRequest)
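
The RRF formula above can be sketched in a few lines of standalone Go. This is an illustration under stated assumptions, not Bleve's implementation: the type `rankedList` and the functions `rrf` and `sortedIDs` are hypothetical, ranks are assumed to start at 1, and a document missing from a list is assumed to receive no contribution from that list.

```go
package main

import (
	"fmt"
	"sort"
)

// rankedList is one result list: document IDs in rank order plus the
// weight (boost) applied to that list. Hypothetical type for illustration.
type rankedList struct {
	weight float64
	docIDs []string
}

// rrf sums weight * 1/(k + rank) over every list a document appears in.
// Ranks are assumed to start at 1.
func rrf(k float64, lists ...rankedList) map[string]float64 {
	scores := make(map[string]float64)
	for _, l := range lists {
		for i, id := range l.docIDs {
			rank := float64(i + 1)
			scores[id] += l.weight / (k + rank)
		}
	}
	return scores
}

// sortedIDs returns document IDs ordered by descending fused score,
// breaking ties by ID so the output is deterministic.
func sortedIDs(scores map[string]float64) []string {
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	fts := rankedList{weight: 1.0, docIDs: []string{"a", "b", "c"}}
	knn := rankedList{weight: 1.0, docIDs: []string{"b", "d", "a"}}
	scores := rrf(60, fts, knn)
	for _, id := range sortedIDs(scores) {
		fmt.Printf("%s %.5f\n", id, scores[id])
	}
}
```

In this example, document "b" comes first because it ranks near the top of both lists, even though it is first in only one of them. Raw scores never enter the calculation.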
Relative Score Fusion is a score-based strategy that normalizes scores from both modalities into a common [0, 1] range using min-max normalization before combining them.
Algorithm:
Min-max normalize each result set independently:
$$\text{normalized\_score} = \frac{\text{score} - \text{min\_score}}{\text{max\_score} - \text{min\_score}}$$
Combine normalized scores using weighted addition:
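$$RSF\_score = w_{\text{fts}} \cdot \text{normalized\_score\_fts} + \sum_{i=1}^{n} w_{\text{knn}_i} \cdot \text{normalized\_score\_knn}_i$$

Where:

- $w_{\text{fts}}$ and $w_{\text{knn}_i}$ are the weights (boosts) of the FTS query and of the $i$-th kNN query
- $\text{normalized\_score\_fts}$ and $\text{normalized\_score\_knn}_i$ are the document's min-max normalized scores in the corresponding result sets
- $\text{min\_score}$ and $\text{max\_score}$ are the lowest and highest scores within the result set being normalized
- $n$ is the number of kNN queries in the request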
Advantages:
Disadvantages:
Usage:
// Create a hybrid search with RSF fusion
searchRequest := bleve.NewSearchRequest(bleve.NewMatchQuery("machine learning"))
searchRequest.Score = bleve.ScoreRSF // Or set to "rsf"
// Add first kNN component
searchRequest.AddKNN(
"content_vector", // Vector field
[]float32{0.5, 0.3, 0.1, 0.8}, // Query vector
20, // k neighbors
1.0, // kNN weight (boost)
)
// Add second kNN component (optional - you can add multiple)
searchRequest.AddKNN(
"title_vector", // Different vector field
[]float32{0.2, 0.7, 0.4, 0.1}, // Query vector
15, // k neighbors
0.8, // kNN weight (boost)
)
// Optional: Configure RSF parameters
params := bleve.RequestParams{
ScoreWindowSize: 150, // Window size (default: size)
}
searchRequest.AddParams(params)
searchResult, err := index.Search(searchRequest)
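
The two RSF steps above (min-max normalization, then weighted addition) can be sketched as standalone Go. The type `scoredList` and the functions `minMax` and `rsf` are hypothetical names for illustration. How Bleve treats a result set whose scores are all equal is not stated here, so mapping that case to 1 is purely an assumption of this sketch, as is giving a document no contribution from a list it does not appear in.

```go
package main

import "fmt"

// scoredList holds the raw score per document for one result list and the
// weight (boost) applied to it. Hypothetical type for illustration.
type scoredList struct {
	weight float64
	scores map[string]float64
}

// minMax rescales scores into [0, 1]. If all scores are equal the
// denominator would be zero; this sketch maps every score to 1 in that
// case (an assumption, not documented Bleve behavior).
func minMax(scores map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(scores))
	first := true
	var lo, hi float64
	for _, s := range scores {
		if first {
			lo, hi, first = s, s, false
			continue
		}
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	for id, s := range scores {
		if hi == lo {
			out[id] = 1
			continue
		}
		out[id] = (s - lo) / (hi - lo)
	}
	return out
}

// rsf normalizes each list independently and sums the weighted results.
func rsf(lists ...scoredList) map[string]float64 {
	fused := make(map[string]float64)
	for _, l := range lists {
		for id, n := range minMax(l.scores) {
			fused[id] += l.weight * n
		}
	}
	return fused
}

func main() {
	// Illustrative raw scores on very different scales.
	fts := scoredList{weight: 1.0, scores: map[string]float64{"a": 12, "b": 9, "c": 3}}
	knn := scoredList{weight: 1.0, scores: map[string]float64{"b": 0.95, "d": 0.80, "a": 0.35}}
	fused := rsf(fts, knn)
	for _, id := range []string{"a", "b", "c", "d"} {
		fmt.Printf("%s %.4f\n", id, fused[id])
	}
}
```

Unlike RRF, the gap between scores within a list still matters: a document that scores just below the top hit keeps most of its normalized score, whatever its rank.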
The Score field in your search request specifies which fusion strategy to use:
- ScoreRRF ("rrf"): Reciprocal Rank Fusion
- ScoreRSF ("rsf"): Relative Score Fusion

The Params object contains additional parameters for score fusion:
ScoreWindowSize is the maximum number of results to consider from each result list for fusion.
- Default: the value of the Size parameter
- Must be ≥ Size and ≥ 1

A larger window size increases the chance of finding relevant results but requires more computation. For pagination to work consistently, ensure:
From + Size <= ScoreWindowSize
Example:
{
"score": "rrf",
"params": {
"score_window_size": 150
},
"size": 10,
"from": 0
}
With window size set to 150, you can paginate through up to 150 results. If you try to access results beyond this (e.g., from=160), you'll get an empty result set.
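
The pagination behavior described above can be modeled with a small sketch. The helper `page` is hypothetical and only mimics the documented outcome (an empty page once `from` lies beyond the window). Truncating a page that straddles the window boundary is an assumption of this sketch, which is why the constraint From + Size <= ScoreWindowSize is the safe rule to follow.

```go
package main

import "fmt"

// page returns the entries of fused selected by from/size. fused stands in
// for the fused result list, which holds at most ScoreWindowSize entries.
// Hypothetical helper for illustration only.
func page(fused []string, from, size int) []string {
	if from < 0 {
		from = 0
	}
	if size <= 0 || from >= len(fused) {
		return nil // nothing requested, or beyond the fusion window
	}
	end := from + size
	if end > len(fused) {
		end = len(fused) // assumption: a straddling page is truncated
	}
	return fused[from:end]
}

func main() {
	const windowSize = 150
	fused := make([]string, windowSize)
	for i := range fused {
		fused[i] = fmt.Sprintf("doc-%03d", i)
	}
	fmt.Println(len(page(fused, 0, 10)))   // first page, fully inside the window
	fmt.Println(len(page(fused, 140, 10))) // last page that satisfies From + Size <= 150
	fmt.Println(len(page(fused, 160, 10))) // starts beyond the window
}
```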
Only applicable for RRF
ScoreRankConstant controls how much the rank position affects the reciprocal rank score.
Example:
{
"score": "rrf",
"params": {
"score_rank_constant": 60
}
}
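
To see how the rank constant shapes the ranking, the sketch below compares the unweighted term 1/(k + rank) at rank 1 and rank 10 for a few values of k. The function names are hypothetical, ranks are assumed to start at 1, and the values of k other than 60 are arbitrary illustrations.

```go
package main

import "fmt"

// contribution returns the unweighted reciprocal rank term 1/(k + rank).
func contribution(k, rank float64) float64 {
	return 1 / (k + rank)
}

// topToTenth reports how many times larger the rank-1 term is than the
// rank-10 term for a given rank constant k.
func topToTenth(k float64) float64 {
	return contribution(k, 1) / contribution(k, 10)
}

func main() {
	for _, k := range []float64{1, 60, 1000} {
		fmt.Printf("k=%4.0f rank1/rank10=%.3f\n", k, topToTenth(k))
	}
}
```

A small constant makes the top positions dominate the fused score, while a large constant flattens the differences between ranks, so appearing in several lists matters more than being first in one.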
The boost value in your query components controls their relative importance in hybrid search:
// FTS query with boost 2.0
query := bleve.NewMatchQuery("search term")
query.SetBoost(2.0)
searchRequest := bleve.NewSearchRequest(query)
// kNN query with boost 1.0
searchRequest.AddKNN("vec", queryVector, 10, 1.0)
For RRF and RSF, weights determine the relative importance of each component's contribution, rather than scaling raw scores.
Example: If fts_boost = 2.0 and knn_boost = 1.0, then under RRF or RSF the FTS component contributes twice as much to the final score as the kNN component for the same rank (RRF) or the same normalized score (RSF).
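
As a worked instance of the example above, assume RRF with k = 60, ranks starting at 1, and a single kNN query (all assumptions made for illustration). Document A is ranked first by FTS and is absent from the kNN results. Document B is ranked first by kNN and is absent from the FTS results.

$$
RRF\_score_A = 2.0 \cdot \frac{1}{60 + 1} \approx 0.0328, \qquad RRF\_score_B = 1.0 \cdot \frac{1}{60 + 1} \approx 0.0164
$$

Both documents hold the same rank in their own list, so the 2:1 boost ratio carries over directly to the fused scores.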
When using score fusion (Score set to "rrf" or "rsf"), certain features are not supported:
- Pagination: use From and Size only.
- Sorting: only score-based (-_score) or default sorting is allowed.

| Use Case | Recommended Strategy |
|---|---|
| Different score scales (e.g., TF-IDF + L2 distance) | RRF/RSF |
| Minimal tuning, out-of-the-box performance | RRF |
| Want to preserve score magnitude importance | RSF |
| Have well-tuned boost values already | Additive (default) |
| Score distributions have extreme outliers | RRF |