Back to Llama Index

LlamaIndex Retrievers Integration: Amazon Kendra

llama-index-integrations/retrievers/llama-index-retrievers-kendra/README.md

0.14.213.0 KB
Original Source

LlamaIndex Retrievers Integration: Amazon Kendra

Overview

Amazon Kendra is an intelligent search service powered by machine learning. Kendra reimagines enterprise search for your websites and applications by allowing users to search your unstructured and structured data using natural language.

Kendra supports a wide variety of data sources including:

  • Documents (PDF, Word, PowerPoint, HTML)
  • FAQs
  • Knowledge bases
  • Databases
  • Websites
  • Custom data sources through connectors

Installation

bash
pip install llama-index-retrievers-kendra

Usage

Here's a basic example of how to use the Kendra retriever:

python
from llama_index.retrievers.kendra import AmazonKendraRetriever

retriever = AmazonKendraRetriever(
    index_id="<kendra-index-id>",
    query_config={
        "PageSize": 4,
        "AttributeFilter": {
            "EqualsTo": {
                "Key": "department",
                "Value": {"StringValue": "engineering"},
            }
        },
    },
)

query = "What is our company's remote work policy?"
retrieved_results = retriever.retrieve(query)

# Print the first retrieved result
print(retrieved_results[0].get_content())

Advanced Configuration

The retriever supports Kendra's rich querying capabilities through the query_config parameter:

python
retriever = AmazonKendraRetriever(
    index_id="<kendra-index-id>",
    query_config={
        "PageSize": 10,  # Number of results to return
        "AttributeFilter": {
            # Filter results based on document attributes
            "AndAllFilters": [
                {
                    "EqualsTo": {
                        "Key": "department",
                        "Value": {"StringValue": "engineering"},
                    }
                },
                {
                    "GreaterThan": {
                        "Key": "last_updated",
                        "Value": {"StringValue": "2023-01-01"},
                    }
                },
            ]
        },
        "QueryResultTypeFilter": "DOCUMENT",  # Only return document results
    },
)

Confidence Scores

The retriever maps Kendra's confidence levels to float scores as follows:

  • VERY_HIGH: 1.0
  • HIGH: 0.8
  • MEDIUM: 0.6
  • LOW: 0.4
  • NOT_AVAILABLE: 0.0

These scores can be accessed through the score attribute of the retrieved nodes:

python
results = retriever.retrieve("query")
for result in results:
    print(f"Text: {result.get_content()}")
    print(f"Confidence Score: {result.score}")

Authentication

The retriever supports various AWS authentication methods:

python
retriever = AmazonKendraRetriever(
    index_id="<kendra-index-id>",
    profile_name="my-aws-profile",  # Use AWS profile
    region_name="us-west-2",  # Specify AWS region
    # Or use explicit credentials
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
    aws_session_token="YOUR_SESSION_TOKEN",  # Optional
)