Installation and Setup

Configure Docker for Transcribe

The Docker image for the transcribe server bundles the llama.cpp binary. The AI models are not included in the image: download them separately and mount them as a volume.

1. Create data directory and download models

shell

mkdir -p ./data/models
chmod 755 ./data
wget -O ./data/models/Model-7.6B-Q4_K_M.gguf https://huggingface.co/openbmb/MiniCPM-o-2_6-gguf/resolve/main/Model-7.6B-Q4_K_M.gguf
wget -O ./data/models/mmproj-model-f16.gguf https://huggingface.co/openbmb/MiniCPM-o-2_6-gguf/resolve/main/mmproj-model-f16.gguf
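
Before starting the container, it can help to confirm that both downloads finished. The following is a minimal sketch, assuming Node.js is available on the host; `findMissingModels` is a hypothetical helper, and the file names are simply the ones used in the wget commands above.

```typescript
import { existsSync, statSync } from 'node:fs';
import { join } from 'node:path';

// File names taken from the download commands above.
const expectedModelFiles = ['Model-7.6B-Q4_K_M.gguf', 'mmproj-model-f16.gguf'];

// Returns the names of files that are absent or empty in modelsDir.
function findMissingModels(modelsDir: string, fileNames: string[] = expectedModelFiles): string[] {
	return fileNames.filter(name => {
		const filePath = join(modelsDir, name);
		return !existsSync(filePath) || statSync(filePath).size === 0;
	});
}

const missing = findMissingModels('./data/models');
if (missing.length > 0) {
	console.error(`Missing or empty model files: ${missing.join(', ')}`);
} else {
	console.log('All model files are present.');
}
```

An empty file usually indicates an interrupted download, which is why the sketch treats it the same as a missing file.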

2. Configure environment

Copy .env-transcribe-sample to your Docker configuration directory.
Rename it to .env-transcribe.
Set API_KEY to a secure value.
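
One way to produce a secure value for API_KEY is to generate it from cryptographically secure random bytes. This is a sketch, assuming Node.js is available; the `generateApiKey` name and the 32-byte length are illustrative choices, not requirements of the server.

```typescript
import { randomBytes } from 'node:crypto';

// Returns a hex string built from cryptographically secure random bytes.
// 32 bytes (64 hex characters) is an assumed length, not a documented minimum.
function generateApiKey(byteLength: number = 32): string {
	return randomBytes(byteLength).toString('hex');
}

// Prints a line that can be pasted into .env-transcribe.
console.log(`API_KEY=${generateApiKey()}`);
```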

3. Run the server

shell

docker run --rm --env-file .env-transcribe -p 4567:4567 \
	-v ./data:/data \
	joplin/transcribe:amd64-latest

The container automatically creates the following inside /data:

images/ - uploaded images
models/ - AI models (you provide these)
queue.sqlite3 - job queue database

Using Docker Compose

A minimal configuration is provided in .env-sample and docker-compose.server.yml.

Run cp .env-sample .env
Update any options you need in .env

Start the server:

shell

docker compose -f docker-compose.server.yml --profile full up --detached

For advanced configuration, refer to .env-sample-transcribe.

Security

The transcribe container runs with these security measures:

Non-root user: The application runs as the transcribe user, not root
Read-only filesystem: The container filesystem is read-only (only /app/packages/transcribe/images and /tmp are writable)
Resource limits: Memory and CPU limits prevent runaway processes
No Docker socket: Unlike previous versions, no Docker socket mount is required

Development Setup

Testing

Integration tests that require the full model are disabled by default, including on CI, so regressions in the model or prompts are not caught automatically. Be cautious when modifying either. The disabled test is located at workers/JobProcessor.test.ts.

Run all tests with:

shell

yarn test-all

Starting the Server

From packages/transcribe, run:

shell

yarn start

Environment variables

Required:

API_KEY: Authentication key for API requests
DATA_DIR: Base directory for all data (images, models, database)
HTR_CLI_BINARY_PATH: Path to the llama-mtmd-cli binary

Optional:

QUEUE_DRIVER: sqlite (default in Docker) or pg for PostgreSQL

The following paths are automatically derived from DATA_DIR:

$DATA_DIR/images - uploaded images
$DATA_DIR/models - AI models
$DATA_DIR/queue.sqlite3 - SQLite database (when using sqlite driver)
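
The rules above can be sketched as a small configuration loader. This is an illustration only, not the server's actual code: `loadConfig` and `TranscribeConfig` are hypothetical names, and the sketch leaves QUEUE_DRIVER undefined when it is not set, since only the Docker default is documented here.

```typescript
type QueueDriver = 'sqlite' | 'pg';

interface TranscribeConfig {
	apiKey: string;
	dataDir: string;
	htrCliBinaryPath: string;
	queueDriver: QueueDriver | undefined;
	imagesDir: string;
	modelsDir: string;
	sqlitePath: string;
}

function loadConfig(env: Record<string, string | undefined>): TranscribeConfig {
	// Reads a required variable and fails early if it is missing or empty.
	const required = (name: string): string => {
		const value = env[name];
		if (!value) throw new Error(`Missing required environment variable: ${name}`);
		return value;
	};

	const driver = env['QUEUE_DRIVER'];
	if (driver !== undefined && driver !== 'sqlite' && driver !== 'pg') {
		throw new Error(`Unsupported QUEUE_DRIVER: ${driver}`);
	}

	const dataDir = required('DATA_DIR');
	// Drop trailing slashes so the derived paths do not contain "//".
	const base = dataDir.replace(/\/+$/, '');

	return {
		apiKey: required('API_KEY'),
		dataDir,
		htrCliBinaryPath: required('HTR_CLI_BINARY_PATH'),
		queueDriver: driver,
		imagesDir: `${base}/images`,
		modelsDir: `${base}/models`,
		sqlitePath: `${base}/queue.sqlite3`,
	};
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`.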

API Endpoints

All requests must include the Authorization header with the value set to your API_KEY.

POST `/transcribe`

Creates a transcription job. The uploaded image is resized, stored on disk, and assigned to a job record in the database.

Request Body:

Content-Type: multipart/form-data
Field: file (required) – the image file to process

Response:

json

{
	"jobId": "bcd2e633-eb10-44cb-a280-bf723238c12e"
}

Example (cURL):

shell

# Replace api-key with your API_KEY. With --form, curl sets the multipart
# Content-Type header (including the boundary) itself.
curl --request POST \
	--url http://localhost:4567/transcribe \
	--header 'Authorization: api-key' \
	--form file=@/path/to/image.png
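
The same request can be made from TypeScript. The following is a sketch, assuming a runtime with global `fetch`, `FormData`, and `Blob` (for example Node.js 18 or later); `submitImage` is a hypothetical helper, and treating any non-2xx status as a failure is an assumption, since error responses are not described here.

```typescript
async function submitImage(
	baseUrl: string,
	apiKey: string,
	image: Blob,
	fileName: string,
	fetchImpl: typeof fetch = fetch,
): Promise<string> {
	const form = new FormData();
	// "file" is the required multipart field name.
	form.append('file', image, fileName);

	// Content-Type is left unset so that fetch adds the multipart boundary itself.
	const response = await fetchImpl(`${baseUrl}/transcribe`, {
		method: 'POST',
		headers: { Authorization: apiKey },
		body: form,
	});
	if (!response.ok) {
		throw new Error(`Upload failed with HTTP ${response.status}`);
	}

	const body = (await response.json()) as { jobId?: unknown };
	if (typeof body.jobId !== 'string') {
		throw new Error('Response did not contain a jobId');
	}
	return body.jobId;
}
```

In Node.js, the image can be read with `readFile` from `node:fs/promises` and wrapped as `new Blob([buffer])` before being passed in. The `fetchImpl` parameter exists only so the function can be exercised without a running server.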

GET `/transcribe/{jobId}`

Fetches the result of a transcription job created with POST /transcribe.

Request:

Pass the jobId returned by POST /transcribe as the last path segment.

Example responses, for jobs in the created, active, and completed states respectively:

json

{
	"id": "57ebd2e2-b496-40ab-9008-5f861bcb7858",
	"state": "created"
}

json

{
	"id": "07f09553-f5e9-467e-b98d-406778e61969",
	"state": "active"
}

json

{
	"id": "57ebd2e2-b496-40ab-9008-5f861bcb7858",
	"completedOn": "2025-06-11T18:20:22.000Z",
	"output": {
		"result": "markdown\r\n# Main title\r\n\r\nSome text here. This should take more than one line.\r\n\r\n## Sub title\r\n\r\n- One kind\r\n  - of list\r\n    - sub-item\r\n\r\n## Conclusion\r\n\r\nLet's finish here."
	},
	"state": "completed"
}

Example (cURL):

shell

curl --request GET \
	--url http://localhost:4567/transcribe/57ebd2e2-b496-40ab-9008-5f861bcb7858 \
	--header 'Authorization: api-key'
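
Because a job moves through created and active before it reaches completed, a client usually polls this endpoint. The following is a sketch, assuming a runtime with global `fetch`; `fetchJobStatus`, `waitForTranscription`, the 2-second interval, and the 60-attempt limit are all illustrative. Only the three states shown in the example responses are handled, and any other state is treated as an error.

```typescript
interface JobStatus {
	id: string;
	state: string;
	completedOn?: string;
	output?: { result: string };
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Fetches the current status of one job.
async function fetchJobStatus(baseUrl: string, apiKey: string, jobId: string): Promise<JobStatus> {
	const response = await fetch(`${baseUrl}/transcribe/${encodeURIComponent(jobId)}`, {
		headers: { Authorization: apiKey },
	});
	if (!response.ok) {
		throw new Error(`Status request failed with HTTP ${response.status}`);
	}
	return (await response.json()) as JobStatus;
}

// Calls getStatus repeatedly until the job is completed, then returns the transcription.
async function waitForTranscription(
	getStatus: () => Promise<JobStatus>,
	intervalMs: number = 2000,
	maxAttempts: number = 60,
): Promise<string> {
	for (let attempt = 0; attempt < maxAttempts; attempt++) {
		const status = await getStatus();
		if (status.state === 'completed') {
			if (!status.output) throw new Error('Completed job has no output');
			return status.output.result;
		}
		if (status.state !== 'created' && status.state !== 'active') {
			throw new Error(`Unexpected job state: ${status.state}`);
		}
		await sleep(intervalMs);
	}
	throw new Error(`Job not completed after ${maxAttempts} attempts`);
}
```

A typical call would be `waitForTranscription(() => fetchJobStatus('http://localhost:4567', apiKey, jobId))`. Passing the status getter as a function keeps the polling logic independent of the HTTP layer.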
