Docling's document conversion can be executed as distributed jobs using Docling Jobkit.
This library provides:
You can run Jobkit locally via the CLI:
uv run docling-jobkit-local [configuration-file-path]
The configuration file defines:
Example configuration file:
options:                      # Example Docling's conversion options
  do_ocr: false
sources:                      # Source location (here Google Drive)
  - kind: google_drive
    path_id: 1X6B3j7GWlHfIPSF9VUkasN-z49yo1sGFA9xv55L2hSE
    token_path: "./dev/google_drive/google_drive_token.json"
    credentials_path: "./dev/google_drive/google_drive_credentials.json"
target:                       # Target location (here S3)
  kind: s3
  endpoint: localhost:9000
  verify_ssl: false
  bucket: docling-target
  access_key: minioadmin
  secret_key: minioadmin
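Before launching a job, it can help to confirm that a configuration has the same overall layout as the example above. The following is a minimal sketch, not part of Docling Jobkit: `check_job_config` is a hypothetical helper, and the rules it enforces (a non-empty `sources` list and a `target` mapping, each entry carrying a `kind`) are assumptions inferred only from the example file, not from Jobkit's actual schema. It expects the configuration already parsed into a dictionary, for instance by a YAML loader.

```python
from typing import Any


def check_job_config(config: dict[str, Any]) -> list[str]:
    """Return a list of layout problems; an empty list means none were found.

    Hypothetical helper: the rules mirror the example configuration only
    and are an assumption, not Jobkit's real validation logic.
    """
    problems: list[str] = []

    # 'sources' is a list of connector entries, each with a 'kind'.
    sources = config.get("sources")
    if not isinstance(sources, list) or not sources:
        problems.append("'sources' must be a non-empty list")
    else:
        for index, source in enumerate(sources):
            if not isinstance(source, dict) or "kind" not in source:
                problems.append(f"sources[{index}] is missing 'kind'")

    # 'target' is a single connector entry with a 'kind'.
    target = config.get("target")
    if not isinstance(target, dict) or "kind" not in target:
        problems.append("'target' must be a mapping with a 'kind' key")

    return problems


# The example configuration, written as the dictionary a YAML loader would produce.
example_config = {
    "options": {"do_ocr": False},
    "sources": [
        {
            "kind": "google_drive",
            "path_id": "1X6B3j7GWlHfIPSF9VUkasN-z49yo1sGFA9xv55L2hSE",
            "token_path": "./dev/google_drive/google_drive_token.json",
            "credentials_path": "./dev/google_drive/google_drive_credentials.json",
        }
    ],
    "target": {
        "kind": "s3",
        "endpoint": "localhost:9000",
        "verify_ssl": False,
        "bucket": "docling-target",
        "access_key": "minioadmin",
        "secret_key": "minioadmin",
    },
}

problems = check_job_config(example_config)
```

A check like this only catches structural slips such as a missing `kind` or a `sources` entry written as a mapping instead of a list. Connector-specific fields are still validated by Jobkit itself.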
Connectors are used to import documents for processing with Docling and to export results after conversion.
The currently supported connectors are:
To use Google Drive as a source or target, you need to enable the API and set up credentials.
Step 1: Enable the Google Drive API.
Step 2: Create OAuth credentials.
google_drive_credentials.json.
Step 3: Add test users.
Step 4: Edit the configuration file.
Set credentials_path to the path of your google_drive_credentials.json.
Set path_id to your source or target location. It can be obtained from the URL as follows:
https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 > folder id is 1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5.
https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit > document id is 1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw.
Step 5: Authenticate via CLI.
The token is saved at token_path and reused for subsequent runs.
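The URL patterns shown in Step 4 can also be handled programmatically. The sketch below is a hypothetical helper, not part of Docling Jobkit: `extract_path_id` assumes the ID is the path segment that follows `/folders/` (for folders) or `/d/` (for documents), as in the two example URLs, and it may not cover other Google Drive URL shapes.

```python
import re
from urllib.parse import urlparse

# Matches the ID segment that follows "/folders/" or "/d/" in the URL path.
# Assumption: IDs consist of letters, digits, underscores, and hyphens.
_ID_PATTERN = re.compile(r"/(?:folders|d)/([A-Za-z0-9_-]+)")


def extract_path_id(url: str) -> str:
    """Return the folder or document ID embedded in a Google Drive/Docs URL."""
    path = urlparse(url).path
    match = _ID_PATTERN.search(path)
    if match is None:
        raise ValueError(f"No folder or document ID found in URL: {url}")
    return match.group(1)


folder_id = extract_path_id(
    "https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"
)
document_id = extract_path_id(
    "https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit"
)
```

The returned value is what goes into `path_id` in the configuration file.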