Azure Cognitive Search

Azure Cognitive Search is a complete retrieval cloud service that supports vector search, text search, and hybrid search (vectors and text combined to yield the best of the two approaches). It also offers an optional L2 re-ranking step to further improve result quality.

You can find the Azure Cognitive Search documentation here. If you don't have an Azure account, you can start setting one up here.

The vector-based modes (pure vector and hybrid) require signing up for the vector search private preview. To sign up, fill in this form: https://aka.ms/VectorSearchSignUp

Environment variables

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `DATASTORE` | Yes | Datastore name, set to `azuresearch` | |
| `BEARER_TOKEN` | Yes | Secret token | |
| `OPENAI_API_KEY` | Yes | OpenAI API key | |
| `AZURESEARCH_SERVICE` | Yes | Name of your search service | |
| `AZURESEARCH_INDEX` | Yes | Name of your search index | |
| `AZURESEARCH_API_KEY` | No | Your API key, if using key-based auth instead of Azure managed identity | Uses managed identity |
| `AZURESEARCH_DISABLE_HYBRID` | No | Disable hybrid search and only use vector similarity | Use hybrid search |
| `AZURESEARCH_SEMANTIC_CONFIG` | No | Enable L2 re-ranking with this configuration name (see Re-ranking below) | L2 not enabled |
| `AZURESEARCH_LANGUAGE` | No | If using L2 re-ranking, language for queries/documents (valid values listed here) | `en-us` |
| `AZURESEARCH_DIMENSIONS` | No | Vector size for embeddings | 256, or other |
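
As an illustration of how these variables fit together, here is a minimal Python sketch that validates the required ones and applies the documented defaults. The function name `load_settings` and the shape of the returned dictionary are hypothetical and are not taken from the plugin's code; only the variable names and defaults come from the table above.

```python
import os
from typing import Mapping, Optional

# Variables the table above marks as required.
REQUIRED = (
    "DATASTORE",
    "BEARER_TOKEN",
    "OPENAI_API_KEY",
    "AZURESEARCH_SERVICE",
    "AZURESEARCH_INDEX",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: check required variables and fill in defaults."""
    env = os.environ if env is None else env

    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    if env["DATASTORE"] != "azuresearch":
        raise ValueError("DATASTORE must be set to 'azuresearch'")

    return {
        "service": env["AZURESEARCH_SERVICE"],
        "index": env["AZURESEARCH_INDEX"],
        # None means no key was supplied, i.e. managed identity is expected.
        "api_key": env.get("AZURESEARCH_API_KEY"),
        # None means L2 re-ranking stays off.
        "semantic_config": env.get("AZURESEARCH_SEMANTIC_CONFIG"),
        "language": env.get("AZURESEARCH_LANGUAGE", "en-us"),
        # 256 is the default listed in the table; override it for other models.
        "dimensions": int(env.get("AZURESEARCH_DIMENSIONS", "256")),
    }
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to exercise without touching the real environment.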

Authentication Options

  • API key: key-based authentication is enabled by default on the search service; you can obtain the key in the Azure Portal or with the Azure CLI, then set it in AZURESEARCH_API_KEY.
  • Managed identity: if the plugin is running in Azure, you can enable managed identity for the host and give that identity access to the service, so there are no keys to manage (no secret storage, rotation, etc.). This is what the plugin uses when AZURESEARCH_API_KEY is not set. More details here.
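
The choice between the two options comes down to whether `AZURESEARCH_API_KEY` is present. The sketch below shows that decision in isolation; `choose_auth` is a hypothetical helper, not the plugin's own function, and it returns a label rather than a real credential so that it runs without the Azure SDK installed.

```python
from typing import Mapping


def choose_auth(env: Mapping[str, str]) -> str:
    """Hypothetical helper: pick the auth mode from the environment.

    Returns "api_key" when AZURESEARCH_API_KEY is set to a non-empty value,
    otherwise "managed_identity" (the default listed in the table above).
    """
    if env.get("AZURESEARCH_API_KEY"):
        # With the Azure SDK for Python, this branch would typically build an
        # azure.core.credentials.AzureKeyCredential from the key.
        return "api_key"
    # With the Azure SDK for Python, this branch would typically use
    # azure.identity.DefaultAzureCredential, which can pick up a managed identity.
    return "managed_identity"
```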

Re-ranking

Azure Cognitive Search offers the option to enable a second (L2) ranking step after retrieval to further improve result quality. This applies only when using text or hybrid search. Because it has latency and cost implications, it is opt-in: you need to explicitly enable "semantic search" in your Cognitive Search service, create a semantic search configuration for your index, and set AZURESEARCH_SEMANTIC_CONFIG to that configuration's name.
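
The interaction between hybrid search and re-ranking can be sketched as a small decision function. This is an illustration only: `query_plan` and the dictionary it returns are hypothetical, and treating any non-empty `AZURESEARCH_DISABLE_HYBRID` value as "disabled" is an assumption rather than documented behavior.

```python
from typing import Mapping


def query_plan(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: decide search mode and whether L2 re-ranking applies."""
    # Assumption: any non-empty value disables hybrid search.
    hybrid = not env.get("AZURESEARCH_DISABLE_HYBRID")
    semantic_config = env.get("AZURESEARCH_SEMANTIC_CONFIG")

    # Re-ranking needs a text component, so pure vector search never uses it.
    rerank = hybrid and bool(semantic_config)

    plan = {"mode": "hybrid" if hybrid else "vector", "rerank": rerank}
    if rerank:
        plan["semantic_config"] = semantic_config
        plan["language"] = env.get("AZURESEARCH_LANGUAGE", "en-us")
    return plan
```

For example, setting both `AZURESEARCH_DISABLE_HYBRID` and `AZURESEARCH_SEMANTIC_CONFIG` yields a vector-only plan with re-ranking off, since there is no text query for the L2 step to work on.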

Using existing search indexes

If an existing index already has the fields the retrieval plugin needs, just under different names, you can map your fields to the plugin fields using the following environment variables:

| Plugin field name | Environment variable to override it |
| --- | --- |
| `id` | `AZURESEARCH_FIELDS_ID` |
| `text` | `AZURESEARCH_FIELDS_TEXT` |
| `embedding` | `AZURESEARCH_FIELDS_EMBEDDING` |
| `document_id` | `AZURESEARCH_FIELDS_DOCUMENT_ID` |
| `source` | `AZURESEARCH_FIELDS_SOURCE` |
| `source_id` | `AZURESEARCH_FIELDS_SOURCE_ID` |
| `url` | `AZURESEARCH_FIELDS_URL` |
| `created_at` | `AZURESEARCH_FIELDS_CREATED_AT` |
| `author` | `AZURESEARCH_FIELDS_AUTHOR` |
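
To make the mapping concrete, the sketch below resolves each plugin field to the index field name it should use. `field_map` is a hypothetical helper, and it assumes that a field without an override keeps the plugin's own name in the index.

```python
from typing import Dict, Mapping

# Plugin field names from the table above.
PLUGIN_FIELDS = (
    "id",
    "text",
    "embedding",
    "document_id",
    "source",
    "source_id",
    "url",
    "created_at",
    "author",
)


def field_map(env: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: map plugin field names to index field names.

    Each override variable is AZURESEARCH_FIELDS_ followed by the upper-cased
    plugin field name. Assumption: without an override, the index field is
    named the same as the plugin field.
    """
    return {
        name: env.get("AZURESEARCH_FIELDS_" + name.upper(), name)
        for name in PLUGIN_FIELDS
    }
```

With `AZURESEARCH_FIELDS_TEXT=content` set, for instance, the plugin's `text` field would resolve to the index field `content` while the other fields keep their default names.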