server/utils/vectorDbProviders/pgvector/SETUP.md
PGVector for AnythingLLM
Setting up PGVector for AnythingLLM to use as your vector database is quite easy. At a minimum, you will need the following:
pgvector extension installed on DB
brew install postgresql
brew services start postgresql
brew install pgvector
# assuming you have a database already set up + a user
psql <database-name>
CREATE EXTENSION vector;
Configure the connection via the UI or by directly editing the .env file.
First, obtain a valid connection string for the user, credentials, and db you want to target.
eg: postgresql://dbuser:dbuserpass@localhost:5432/yourdb
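As a minimal sketch, the connection string can be assembled from its parts and tried with psql before it is handed to AnythingLLM. The helper name `build_pg_uri` is hypothetical and not part of AnythingLLM; the values are the placeholders from the example above. Characters such as `@` or `/` in a password would need percent-encoding, which this sketch does not handle.

```shell
# Hypothetical helper: assemble a PostgreSQL connection URI.
# Usage: build_pg_uri <user> <password> <host> <port> <database>
build_pg_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Print the URI for the placeholder values used in this guide.
build_pg_uri dbuser dbuserpass localhost 5432 yourdb

# To confirm the credentials work (requires psql and a reachable database):
#   psql "$(build_pg_uri dbuser dbuserpass localhost 5432 yourdb)" -c 'SELECT 1;'
```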
[!IMPORTANT] If you have an existing table that you want to use as a vector database, AnythingLLM requires that the table conform at least minimally to the expected schema. The expected schema can be seen in the index.js file.
Optional: set the name of the table you want AnythingLLM to store vectors in. By default this is anythingllm_vectors.
If you are running AnythingLLM in Docker, you will need to ensure that the DB is accessible from the container. If you are running your DB in another Docker container or on the host machine, you will need to ensure that the container can access the DB.
localhost will not work in this case as it will attempt to connect to the DB inside the AnythingLLM container instead of the host machine or another container.
You will need to use the host.docker.internal (or 172.17.0.1 on Linux/Ubuntu) address.
on Mac or Windows:
postgresql://dbuser:dbuserpass@localhost:5432/yourdb => postgresql://dbuser:dbuserpass@host.docker.internal:5432/yourdb
on Linux:
postgresql://dbuser:dbuserpass@localhost:5432/yourdb => postgresql://dbuser:dbuserpass@172.17.0.1:5432/yourdb
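The host substitution for Docker can be sketched as a small shell function. The name `to_docker_host` is hypothetical, and the sketch assumes the connection string uses `localhost` followed by an explicit port, as in the examples in this guide.

```shell
# Hypothetical helper: swap localhost for an address reachable from the container.
# Usage: to_docker_host <connection-string> <host-address>
to_docker_host() {
  printf '%s\n' "$1" | sed -e "s/@localhost:/@$2:/"
}

# Mac or Windows
to_docker_host "postgresql://dbuser:dbuserpass@localhost:5432/yourdb" "host.docker.internal"
# Linux
to_docker_host "postgresql://dbuser:dbuserpass@localhost:5432/yourdb" "172.17.0.1"
```

A connection string that does not contain `@localhost:` passes through unchanged.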
Yes, you can use an existing table as a vector database. However, AnythingLLM requires that the table conform at least minimally to the expected schema. The expected schema can be seen in the index.js file.
It is absolutely critical that the embedding column's VECTOR(XXXX) dimensions match the dimension of the embedder in AnythingLLM. The default embedding model produces 384-dimensional vectors. If you are using a custom embedder, you will need to ensure that the dimension value matches that embedder's output.
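One way to check the dimension of an existing table is to ask PostgreSQL for the declared type of the embedding column. The sketch below is an assumption-laden example, not something AnythingLLM ships: the function name `check_embedding_dim` is hypothetical, the column is assumed to be named `embedding`, and the table name defaults to anythingllm_vectors. It relies on the standard `format_type` catalog function, which should report the column as something like `vector(384)`.

```shell
# Hypothetical helper: compare the embedding column's declared type with the
# dimension your embedder produces.
# Usage: check_embedding_dim <database-or-connection-string> <expected-dim> [table]
check_embedding_dim() {
  db="$1"; expected="$2"; table="${3:-anythingllm_vectors}"
  # -A: unaligned output, -t: rows only, so the result is a bare type name.
  actual="$(psql -At -c "SELECT format_type(atttypid, atttypmod) FROM pg_attribute WHERE attrelid = '$table'::regclass AND attname = 'embedding';" "$db")" || return 2
  if [ "$actual" = "vector($expected)" ]; then
    echo "ok: $table.embedding is $actual"
  else
    echo "mismatch: $table.embedding is '$actual', embedder expects vector($expected)"
    return 1
  fi
}

# Example (requires psql and a reachable database):
#   check_embedding_dim yourdb 384
```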
When setting the connection string or table name via the AnythingLLM UI, the following validations will be attempted:
The embedding storage table is created by AnythingLLM on the first upsert of a vector. If you have not yet embedded any documents, the table will not be present in the DB.
reset the vector database at the workspace level in Settings > Vector Database
You can use the "Reset Vector Database" button in the AnythingLLM UI to reset your vector database. This will drop all vectors within that workspace, but the table will remain in the DB.
reset the vector database at the db level
For this, you will need to DROP TABLE from the command line or however you manage your DB. Once the table is dropped, it will be recreated by AnythingLLM on the next upsert.
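As a rough sketch of the db-level reset, the statement can be generated for whichever table name you configured and then passed to psql. The helper name `drop_vectors_sql` is hypothetical; the default table name is the one given in this guide. Dropping the table deletes every stored vector in it, so check the table name before running the statement.

```shell
# Hypothetical helper: print the DROP statement for the vector table.
# Usage: drop_vectors_sql [table]
drop_vectors_sql() {
  printf 'DROP TABLE IF EXISTS "%s";\n' "${1:-anythingllm_vectors}"
}

# Show the statement that would be run for the default table.
drop_vectors_sql

# To actually run it (requires psql and a reachable database):
#   psql -c "$(drop_vectors_sql)" <database-name>
```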
CREATE TABLE permissions
INSERT permissions in the database
SELECT permissions in the database
If you are using PGVector as your vector database, you may encounter an error similar to the following when embedding documents:
type 'vector' does not exist
This happens because the vector type is not installed on the PG database.
First, follow the instructions in the PGVector README to install the vector type on your database.
Then, you will need to create the extension on the database. This can be done by running the following command:
psql <database-name>
CREATE EXTENSION vector;
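To confirm that the extension is now present in the database AnythingLLM connects to, you can query the standard `pg_extension` catalog. This is a minimal sketch; the function name `check_vector_extension` is hypothetical and not part of AnythingLLM or pgvector.

```shell
# Hypothetical helper: report whether the vector extension exists in a database.
# Usage: check_vector_extension <database-or-connection-string>
check_vector_extension() {
  # -A: unaligned output, -t: rows only, so the result is just the version string.
  version="$(psql -At -c "SELECT extversion FROM pg_extension WHERE extname = 'vector';" "$1")" || return 2
  if [ -n "$version" ]; then
    echo "vector extension installed (version $version)"
  else
    echo "vector extension missing: run CREATE EXTENSION vector; in this database"
    return 1
  fi
}

# Example (requires psql and a reachable database):
#   check_vector_extension yourdb
```

Extensions are created per database, so run the check against the same database named in your connection string.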