langchain-paddleocr/README.md
This package provides access to PaddleOCR's capabilities within the LangChain ecosystem.
pip install langchain-paddleocr
PaddleOCRVLLoaderThe PaddleOCRVLLoader enables you to:
Basic usage of PaddleOCRVLLoader looks as follows:
from langchain_paddleocr import PaddleOCRVLLoader
from pydantic import SecretStr
loader = PaddleOCRVLLoader(
file_path="path/to/document.pdf",
api_url="your-api-endpoint",
access_token=SecretStr("your-access-token") # Optional if using environment variable `PADDLEOCR_ACCESS_TOKEN`
)
docs = loader.load()
for doc in docs[:2]:
print(f"Content: {doc.page_content[:200]}...")
print(f"Source: {doc.metadata['source']}")
print("---")
For full documentation, see the LangChain Docs.