docs/usage.md
To add a model:
1. Navigate to the Resources tab.
2. Select the LLMs sub-tab.
3. Select the Add sub-tab.
4. Configure the model to add, choosing a vendor/provider (e.g. ChatOpenAI).
5. Click Add to add the model.
6. Select the Embedding Models sub-tab and repeat steps 3 to 5 to add an embedding model.

Alternatively, you can configure the models via the .env file with the information needed to connect to the LLMs. This file is located in the application folder. If you don't see it, you can create one.
Currently, the following providers are supported:

- OpenAI
- Azure OpenAI
- Local models
In the .env file, set the OPENAI_API_KEY variable with your OpenAI API key to enable access to OpenAI's models. Other variables can be modified as well; feel free to edit them to fit your case. Otherwise, the default parameters should work for most people.
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=<your OpenAI API key here>
OPENAI_CHAT_MODEL=gpt-3.5-turbo
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
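If you want to confirm that the key and model names above actually work before adding them in the app, a quick standalone check with the official openai Python package can help. This is only an illustrative sketch, not part of the application, and it assumes the variables from .env are exported in your shell.

```python
# Minimal sanity check for the OpenAI settings in .env (illustrative only).
import os

from openai import OpenAI  # pip install openai

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
)

# Chat model: ask for a one-word reply to confirm the credentials work.
chat = client.chat.completions.create(
    model=os.environ.get("OPENAI_CHAT_MODEL", "gpt-3.5-turbo"),
    messages=[{"role": "user", "content": "Say OK"}],
)
print(chat.choices[0].message.content)

# Embedding model: embed a short string and print the vector length.
emb = client.embeddings.create(
    model=os.environ.get("OPENAI_EMBEDDINGS_MODEL", "text-embedding-ada-002"),
    input="hello",
)
print(len(emb.data[0].embedding))
```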
For OpenAI models served via the Azure platform, you need to provide your Azure endpoint and API key. You might also need to provide the deployment names for the chat model and the embedding model, depending on how you set up your Azure deployments.
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
OPENAI_API_VERSION=2024-02-15-preview # could be different for you
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-35-turbo # change to your deployment name
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-ada-002 # change to your deployment name
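The same kind of standalone check works for Azure, using the AzureOpenAI client from the openai Python package. Again, this is only an illustrative sketch and assumes the variables above are exported; note that on Azure the model argument is the deployment name, not the base model name.

```python
# Illustrative check of the Azure OpenAI settings; not part of the application.
import os

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version=os.environ.get("OPENAI_API_VERSION", "2024-02-15-preview"),
)

# On Azure, `model` is the deployment name rather than the base model name.
chat = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],
    messages=[{"role": "user", "content": "Say OK"}],
)
print(chat.choices[0].message.content)

emb = client.embeddings.create(
    model=os.environ["AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT"],
    input="hello",
)
print(len(emb.data[0].embedding))
```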
You can also run a model locally on your own machine.

Pros:

- Privacy: your documents are stored and processed locally.
- Cost: it's free to run.

Cons:

- Quality: local models are typically smaller and may give lower-quality answers than the commercial providers.
- Extra configuration: you need to download a suitable model yourself and have enough RAM (ideally with a GPU) to run it.
You can search the Hugging Face Hub for an LLM and download it to run locally. Currently, these model formats are supported:

You should choose a model whose size is less than your device's available memory, leaving about 2 GB to spare. For example, if you have 16 GB of RAM in total, of which 12 GB is available, you should choose a model that takes up at most 10 GB of RAM. Bigger models tend to give better generations but also take more processing time.
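As a rough way to apply this rule of thumb before committing to a download, the sketch below compares a model file's size against the currently available RAM. The psutil dependency and the file path are illustrative assumptions, not requirements of the application.

```python
# Rough check of the "available RAM minus ~2 GB" rule of thumb (illustrative).
import os

import psutil  # pip install psutil

model_path = "/path/to/your/model/file"  # hypothetical path, replace with yours

model_size_gb = os.path.getsize(model_path) / 1024**3
available_gb = psutil.virtual_memory().available / 1024**3
headroom_gb = 2.0  # keep roughly 2 GB free for the rest of the system

if model_size_gb <= available_gb - headroom_gb:
    print(f"OK: {model_size_gb:.1f} GB model fits in {available_gb:.1f} GB available RAM")
else:
    print(f"Too big: {model_size_gb:.1f} GB model vs {available_gb:.1f} GB available RAM")
```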
Here are some recommendations and their size in memory:
To add a local model to the model pool, set the LOCAL_MODEL variable in the .env
file to the path of the model file.
LOCAL_MODEL=<full path to your model file>
Here is how to get the full path of your model file:
- On Windows: right-click the model file in File Explorer and select Copy as Path.
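Once LOCAL_MODEL is set, you can sanity-check that it points at an existing file. Here is a minimal sketch, assuming the python-dotenv package (the application itself may load .env differently):

```python
# Illustrative check that LOCAL_MODEL in .env points at an existing file.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads the .env file in the current working directory

model_path = os.environ.get("LOCAL_MODEL", "")
if model_path and os.path.isfile(model_path):
    print(f"Found local model: {model_path}")
else:
    print("LOCAL_MODEL is unset or does not point to an existing file")
```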
In order to do QA on your documents, you need to upload them to the application first. Navigate to the File Index tab and you will see 2 sections:
1. File upload: drag and drop your file into the UI or select it from your file system, then click Upload and Index. The application will take some time to process the file and will show a message once it is done.
2. File list: the files that have already been uploaded.

Now navigate back to the Chat tab. The chat tab is divided into 3 regions: the conversation settings panel, the chat panel, and the information panel. In the information panel:
- Supporting information such as the retrieved evidence and references is displayed here.
- Direct citations for the answer produced by the LLM are highlighted.
- The confidence score of the answer and the relevance scores of the evidence are displayed, so you can quickly assess the quality of the answer and of the retrieved content.
Meaning of the scores displayed:

- Answer confidence: the confidence level of the answer generated by the LLM.
- Relevance score: the overall relevance between the evidence and the user question.
- Vectorstore score: the relevance score from vector-embedding similarity (shown as full-text search if the evidence was retrieved from the full-text search DB).
- LLM relevant score: the relevance of the evidence to the question as judged by the LLM.
- Reranking score: the relevance of the evidence to the question as judged by the reranking model.

Generally, the score quality is: LLM relevant score > Reranking score > Vectorstore score.
By default, the overall relevance score is taken directly from the LLM relevant score. Evidence is sorted by its overall relevance score and by whether it has a citation or not.
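To make the ordering concrete, here is an illustrative sketch (not the application's actual code) of one plausible way to rank evidence when the overall relevance score defaults to the LLM relevant score and cited evidence is preferred:

```python
# Illustrative ranking of retrieved evidence; names and ordering rules are
# assumptions for the sake of the example, not the application's real code.
from dataclasses import dataclass


@dataclass
class Evidence:
    text: str
    llm_relevant_score: float  # relevance judged by the LLM
    has_citation: bool         # whether the answer directly cites this evidence


def rank_evidence(evidences: list[Evidence]) -> list[Evidence]:
    # Cited evidence first, then higher LLM relevant score first.
    return sorted(
        evidences,
        key=lambda e: (e.has_citation, e.llm_relevant_score),
        reverse=True,
    )


if __name__ == "__main__":
    ranked = rank_evidence([
        Evidence("chunk A", 0.4, True),
        Evidence("chunk B", 0.9, False),
        Evidence("chunk C", 0.8, True),
    ])
    for e in ranked:
        print(e.has_citation, e.llm_relevant_score, e.text)
```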