examples/README.md
This directory contains examples of running marker in different contexts.
The self-contained example here shows how to deploy marker on Modal: it provisions a container with a GPU and exposes it through an API, so you can submit PDFs for conversion into Markdown, HTML, or JSON.
It's a limited example that you can extend into different use cases.
Make sure you have the Modal client installed by following Modal's installation instructions.
Modal's Starter Plan includes $30 of free compute each month. Modal is serverless, so you only pay for resources when you are using them.
Once Modal is configured, deploy the example to your workspace by running:
modal deploy marker_modal_deployment.py
Notes:
marker uses a few models. By default, the endpoint checks whether these models are loaded and downloads them if not, so the first request will be slow. You can avoid this by running:
modal run marker_modal_deployment.py::download_models
This creates a Modal Volume to store the models for reuse.
Once the deploy is finished, you can:
- Test from your CLI with the invoke_conversion command we expose through Modal's local_entrypoint
- Make a request using your own client

Test from your CLI with invoke_conversion
If your endpoint is live, simply run this command:
$ modal run marker_modal_deployment.py::invoke_conversion --pdf-file <PDF_FILE_PATH> --output-format markdown
It automatically detects the URL of your new endpoint using .get_web_url(), checks that it's healthy, submits your file, and stores the output on your machine (in the same directory).
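The health check matters because the first request can be slow while the container starts and the models load. The sketch below shows one way to poll an endpoint before submitting a file. It is not the code in marker_modal_deployment.py: the helper names probe and wait_until_healthy and the retry defaults are assumptions for illustration, and the health route itself is not documented here, so check the deployment script for the actual path.

```python
import time
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError/HTTPError, refused connections, and socket timeouts.
        return False


def wait_until_healthy(
    check: Callable[[], bool], attempts: int = 10, delay: float = 5.0
) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds after each failure."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, if your deployment happened to serve a health route at /health (an assumed path), you could call wait_until_healthy(lambda: probe(base_url + "/health")) before sending a PDF.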
Making a request using your own client
To make requests from elsewhere, e.g. with cURL or a client like Insomnia, you'll need the endpoint URL.
When your modal deploy command from earlier finishes, it prints your endpoint URL at the end. For example:
$ modal deploy marker_modal_deployment.py
...
✓ Created objects.
├── 🔨 Created mount /marker/examples/marker_modal_deployment.py
├── 🔨 Created function download_models.
├── 🔨 Created function MarkerModalDemoService.*.
└── 🔨 Created web endpoint for MarkerModalDemoService.fastapi_app => <YOUR_ENDPOINT_URL>
✓ App deployed in 149.877s! 🎉
If you accidentally close your terminal session, you can always find the URL in Modal's dashboard: open your app (datalab-marker-modal-demo), then select MarkerModalDemoService.
Once you have your URL, make a request to {YOUR_ENDPOINT_URL}/convert like this (you can also use Insomnia, etc.):
curl --request POST \
--url {YOUR_ENDPOINT_URL}/convert \
--header 'Content-Type: multipart/form-data' \
--form file=@/Users/cooldev/sample.pdf \
--form output_format=html
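If you'd rather call the endpoint from Python than from cURL, the following is a minimal sketch using only the standard library. It mirrors the cURL command above (a file field and an output_format field posted to /convert). The helper names build_multipart and convert_pdf and the 600-second timeout are assumptions for illustration, not part of marker or the deployment script.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields: dict, file_field: str, filename: str, file_bytes: bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns a (body, content_type) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def convert_pdf(base_url: str, pdf_path: str, output_format: str = "html",
                timeout: float = 600.0) -> dict:
    """POST a PDF to {base_url}/convert and return the parsed JSON response."""
    pdf = Path(pdf_path)
    body, content_type = build_multipart(
        {"output_format": output_format}, "file", pdf.name, pdf.read_bytes()
    )
    request = urllib.request.Request(
        base_url.rstrip("/") + "/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # A generous timeout, since the first request may wait on a cold start.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling convert_pdf("<YOUR_ENDPOINT_URL>", "sample.pdf", "html") would then send the same request as the cURL command.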
You should get a response like this:
{
"success": true,
"filename": "sample.pdf",
"output_format": "html",
"json": null,
"html": "<YOUR_RESPONSE_CONTENT>",
"markdown": null,
"images": {},
"metadata": {... page level metadata ...},
"page_count": 2
}
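Judging by the sample response above, the converted document appears under the key that matches output_format, while the other format keys are null. A small helper such as the hypothetical extract_content below can pull it out. The error handling is an assumption, since the shape of a failed response isn't shown here.

```python
# Format keys seen in the sample response.
FORMAT_KEYS = ("markdown", "html", "json")


def extract_content(response: dict):
    """Return the converted content from a parsed /convert response.

    Assumes the content lives under the key named by "output_format",
    as in the sample response.
    """
    if not response.get("success"):
        raise RuntimeError("conversion did not report success")
    output_format = response.get("output_format")
    if output_format not in FORMAT_KEYS:
        raise ValueError(f"unexpected output_format: {output_format!r}")
    content = response.get(output_format)
    if content is None:
        raise ValueError(f"response has no content under {output_format!r}")
    return content
```

You would pass it the dictionary obtained by parsing the JSON body, for instance the result of json.loads on the cURL output.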
Modal makes deploying and scaling models and inference workloads much easier.
If you're interested in Datalab's managed API or on-prem document intelligence solution, check out our platform.