examples/paper_metadata/README.md
This example extracts metadata (title, authors, abstract) from PDF papers, stores it in Postgres, and builds embeddings for semantic search.
If this is helpful, we would appreciate a star ⭐ on the CocoIndex GitHub repository.
- OPENAI_API_KEY for metadata extraction
- POSTGRES_URL for Postgres access

Install dependencies:
pip install -e .
Set environment variables:
export OPENAI_API_KEY="your_key"
export POSTGRES_URL="postgres://cocoindex:cocoindex@localhost/cocoindex"
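
Before running the example, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch, not part of the example's `main.py`; the helper name `missing_env_vars` and the constant `REQUIRED_VARS` are hypothetical and introduced here only for illustration.

```python
import os
from typing import List, Mapping, Optional, Sequence

# The two variables this README asks you to set.
REQUIRED_VARS = ("OPENAI_API_KEY", "POSTGRES_URL")


def missing_env_vars(
    names: Sequence[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or empty in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current shell environment; an empty list means both variables are set.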
This example uses the coco_examples_v1 schema by default to avoid clashing with the legacy example tables.
Build/update the index:
cocoindex update main.py
Query:
python main.py query "graph neural networks"
Note: this example does not create a vector index, so every query performs a sequential scan of the table.
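
To make the cost of a sequential scan concrete, here is a small pure-Python sketch of what such a query conceptually does: score every stored embedding against the query vector, then keep the best matches. This is an illustration only, not the example's actual query code or how Postgres executes it. The function names, the `(title, embedding)` row shape, and the choice of cosine similarity are all assumptions made for this sketch.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical row shape for illustration: (title, embedding).
Row = Tuple[str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sequential_scan(
    query: Sequence[float], rows: Sequence[Row], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Score every row against the query and return the top_k by similarity.

    Each query touches all rows, so the work grows linearly with the
    number of stored embeddings.
    """
    scored = [(title, cosine_similarity(query, emb)) for title, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Because every row is scored on every query, this approach is fine for a small collection of papers but becomes slow as the table grows, which is the trade-off the note above refers to.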