| Model Name | RAM | VRAM | Disk Space | Start Command |
|---|---|---|---|---|
| bge-reranker-base | >=4GB | >=4GB | >=8GB | python app.py |
| bge-reranker-large | >=8GB | >=8GB | >=8GB | python app.py |
| bge-reranker-v2-m3 | >=8GB | >=8GB | >=8GB | python app.py |
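As a rough aid for choosing a model, the table above can be encoded as data and compared against the resources a host offers. This is only a sketch: the function name `models_that_fit` is hypothetical, and the numbers are copied from the table rather than measured.

```python
# Minimum resources in GB, copied from the table above.
REQUIREMENTS = {
    "bge-reranker-base": {"ram": 4, "vram": 4, "disk": 8},
    "bge-reranker-large": {"ram": 8, "vram": 8, "disk": 8},
    "bge-reranker-v2-m3": {"ram": 8, "vram": 8, "disk": 8},
}


def models_that_fit(ram_gb, vram_gb, disk_gb):
    """Return the models whose minimum requirements the host satisfies."""
    available = {"ram": ram_gb, "vram": vram_gb, "disk": disk_gb}
    return [
        name
        for name, need in REQUIREMENTS.items()
        if all(available[key] >= need[key] for key in need)
    ]
```

For example, `models_that_fit(4, 4, 8)` returns only `bge-reranker-base`, since the other two models need at least 8GB of RAM and VRAM.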
Code repositories for the 3 models:
pip install -r requirements.txt
HuggingFace repositories for the 3 models:
Clone the model into the corresponding code directory. Directory structure:
bge-reranker-base/
  app.py
  Dockerfile
  requirements.txt
python app.py
On successful startup, you should see an address like this:
Here, http://0.0.0.0:6006 is the connection address.
Image names:
Port
6006
Environment Variables
ACCESS_TOKEN=your_access_token (used in request header: Authorization: Bearer ${ACCESS_TOKEN})
Run Command Example
# auth token set to mytoken
docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
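After starting the container, it can help to wait until the published port accepts connections before pointing FastGPT at it. The following is a minimal sketch using only the Python standard library. The helper name `wait_for_port` and its defaults are assumptions, and a successful TCP connect only shows that something is listening on the port, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=6006, timeout_s=60.0, interval_s=1.0):
    """Poll host:port until a TCP connection succeeds or timeout_s elapses.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port()` with the defaults checks the 6006 port mapping from the run command above for up to one minute.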
docker-compose.yml Example
version: "3"
services:
  reranker:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
    container_name: reranker
    # GPU runtime. If the host doesn't have GPU drivers installed, comment out the deploy section.
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 6006:6006
    environment:
      - ACCESS_TOKEN=mytoken
Set the model to bge-reranker-base and the address to {{host}}/v1/rerank, where host is your deployed domain or IP:Port.

The custom request token in FastGPT does not match the ACCESS_TOKEN environment variable.
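To rule out a token mismatch independently of FastGPT, a request can be sent to the /v1/rerank address directly with the same Authorization header. The sketch below uses only the Python standard library. The URL path and the Bearer header come from this page; the JSON body fields (`model`, `query`, `documents`) and both function names are assumptions, so check them against the app.py you deployed.

```python
import json
import urllib.request


def build_rerank_request(host, access_token, query, documents,
                         model="bge-reranker-base"):
    """Build a POST request for {host}/v1/rerank with a Bearer token."""
    url = host.rstrip("/") + "/v1/rerank"
    # Assumed body schema; adjust to whatever your app.py expects.
    body = {"model": model, "query": query, "documents": documents}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_rerank_request(request, timeout_s=30.0):
    """Send the request and return (HTTP status, decoded response body)."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status, response.read().decode("utf-8")


# Usage, assuming the container from the run command example is listening locally:
# req = build_rerank_request("http://127.0.0.1:6006", "mytoken",
#                            "what is a reranker?", ["doc one", "doc two"])
# print(send_rerank_request(req))
```

If the token does not match, `urlopen` raises `urllib.error.HTTPError`, and its `code` attribute carries the status the service returned.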
Bus error (core dumped)

Try adding the shm_size option to your docker-compose.yml to increase the shared memory size in the container.
...
services:
  reranker:
    ...
    container_name: reranker
    shm_size: '2gb'
    ...
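To confirm that the shm_size setting took effect, the size of the shared memory mount can be read from inside the container. The following is a small sketch, assuming a Linux container where shared memory is mounted at /dev/shm. The function names and the 2 GB threshold are illustrative and simply mirror the example value above.

```python
import shutil


def shared_memory_gb(path="/dev/shm"):
    """Return the total size of the filesystem mounted at path, in GiB."""
    return shutil.disk_usage(path).total / 1024**3


def shm_is_sufficient(min_gb=2.0, path="/dev/shm"):
    """Check whether the shared memory mount is at least min_gb GiB."""
    return shared_memory_gb(path) >= min_gb
```

Running `shm_is_sufficient()` in a Python shell inside the reranker container should return True once the container has been recreated with the larger shm_size.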