Paper Metadata (v1)

examples/paper_metadata/README.md

This example extracts metadata (title, authors, abstract) from PDF papers, stores it in Postgres, and builds embeddings for semantic search.

If this example is helpful, please give CocoIndex a star ⭐ on GitHub.

Prerequisites

  • Install Postgres
  • Set OPENAI_API_KEY for metadata extraction
  • Set POSTGRES_URL for Postgres access
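Before running anything, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of the example itself; the helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are hypothetical and only illustrate the check.

```python
import os

# The two variables named in the prerequisites above.
REQUIRED_VARS = ("OPENAI_API_KEY", "POSTGRES_URL")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to the process environment; passing a dict makes the
    function easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; an empty list means both prerequisites are in place.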

Run

Install dependencies:

```sh
pip install -e .
```

Set environment variables:

```sh
export OPENAI_API_KEY="your_key"
export POSTGRES_URL="postgres://cocoindex:cocoindex@localhost/cocoindex"
```

This example uses the coco_examples_v1 schema by default to avoid clashing with the legacy example tables.
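The `POSTGRES_URL` value packs the user, password, host, and database name into one string. As a rough illustration of how those parts break down, here is a sketch using only Python's standard library; the helper `describe_postgres_url` is a hypothetical name, and this is not how the example itself reads the URL.

```python
from urllib.parse import urlsplit


def describe_postgres_url(url):
    """Split a Postgres connection URL into its main components."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password_set": parts.password is not None,
        "host": parts.hostname,
        # None when the URL omits a port; Postgres clients typically
        # fall back to 5432 in that case.
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
```

For the URL shown above, this reports user `cocoindex`, host `localhost`, no explicit port, and database `cocoindex`. Adjust the URL if your local Postgres uses different credentials.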

Build/update the index:

```sh
cocoindex update main.py
```

Query:

```sh
python main.py query "graph neural networks"
```

Note: this example does not create a vector index; queries will do a sequential scan.
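To make the cost of a sequential scan concrete, the sketch below compares a query vector against every stored vector and keeps the best matches. It is a pure-Python illustration, not the example's actual query code: the function names, the in-memory `rows` list, and the choice of cosine similarity as the metric are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sequential_scan(query, rows, top_k=3):
    """Score every (title, embedding) row against the query and return the top_k.

    Every row is visited once, so the work grows linearly with the number
    of stored embeddings; a vector index exists to avoid exactly that.
    """
    scored = [(cosine_similarity(query, emb), title) for title, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For a small collection of papers this linear cost is usually negligible; for larger collections, adding a vector index in Postgres would be the natural next step.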