Back to Pinecone Python Client

How Pinecone Works

docs/guides/concepts.md

9.0.05.4 KB
Original Source

How Pinecone Works

A vector database stores numerical representations of data — called vectors or embeddings — and retrieves the entries most similar to a query vector. Unlike a relational database that matches rows by exact field values, a vector database uses approximate nearest-neighbor algorithms to rank results by geometric closeness in high-dimensional space.

Indexes

An index holds the vectors you store and query. Every index has two required properties:

  • Dimension — the length of vectors stored in the index. Every vector you upsert must have exactly this many values.
  • Metric — the similarity function used when ranking query results: cosine, euclidean, or dotproduct.

Pinecone offers two index types:

ServerlessPod-based
CapacityScales automaticallyFixed by pod type and count
BillingPay per operationPay per pod-hour
Use caseVariable or unpredictable workloadsPredictable, high-QPS workloads

Create a serverless index:

python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone()
pc.indexes.create(
    name="movie-recommendations",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

Namespaces

A namespace is a logical partition within an index. Vectors in different namespaces are completely isolated: upserts, queries, fetches, and deletes in one namespace never touch another.

Common uses for namespaces include separating data by tenant, language, or environment without creating separate indexes:

python
index = pc.index(host="my-index-abc123.svc.pinecone.io")

# Each customer's vectors are isolated in their own namespace
index.upsert(vectors=[("doc-1", [0.1, 0.2, ...])], namespace="customer-acme")
index.upsert(vectors=[("doc-1", [0.4, 0.5, ...])], namespace="customer-globex")

results = index.query(vector=[0.1, 0.2, ...], top_k=5, namespace="customer-acme")

The empty string "" is the default namespace. All operations that omit a namespace target it implicitly.

Vectors

A vector has four components:

ComponentTypeRequiredDescription
idstrYesUnique identifier within a namespace
valueslist[float]YesDense embedding values
sparse_valuesSparseValuesNoSparse representation for hybrid search
metadatadict[str, Any]NoKey-value data for filtering

Upsert vectors by passing tuples or Vector objects:

python
from pinecone import Vector

# Minimal tuple form
index.upsert(vectors=[("article-42", [0.012, -0.087, 0.153, ...])])

# Full object form with metadata
index.upsert(vectors=[
    Vector(
        id="article-42",
        values=[0.012, -0.087, 0.153, ...],
        metadata={"topic": "science", "published": 2024},
    ),
])

Query for similar vectors:

python
results = index.query(
    vector=[0.012, -0.087, 0.153, ...],
    top_k=10,
    filter={"topic": "science"},
)
for match in results.matches:
    print(match.id, match.score)

Records (Integrated Indexes)

Integrated indexes store text or structured data alongside each vector. Pinecone generates the embeddings server-side using a hosted model. You upsert text records; the index handles embedding automatically.

python
from pinecone import Pinecone, ServerlessSpec
from pinecone.models import EmbedConfig

pc = Pinecone()
pc.indexes.create(
    name="article-search",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    spec_embed=EmbedConfig(model="multilingual-e5-large"),
)

index = pc.index(name="article-search")
index.upsert_records(
    namespace="articles-en",
    records=[
        {"id": "article-1", "text": "Quantum computing advances in 2024"},
        {"id": "article-2", "text": "New discoveries in marine biology"},
    ],
)

results = index.search_records(
    namespace="articles-en",
    query={"inputs": {"text": "latest physics research"}, "top_k": 5},
)

Control Plane vs Data Plane

Operations fall into two categories:

Control plane — index lifecycle management: create, list, describe, configure, and delete indexes; manage collections and backups. Control-plane calls are routed through api.pinecone.io. You access them via the Pinecone client.

Data plane — vector operations: upsert, query, fetch, update, delete, and list vectors. Data-plane calls go directly to an index's host URL. You access them via the Index (or AsyncIndex) client.

python
from pinecone import Pinecone

pc = Pinecone()

# Control plane: describe an index to get its host
desc = pc.indexes.describe("movie-recommendations")

# Data plane: connect directly to the index
index = pc.index(host=desc.host)
index.upsert(vectors=[("movie-42", [0.1, 0.2, ...])])

Namespace Pattern in the SDK

The Pinecone client exposes related operations as namespace objects rather than a flat list of methods. This keeps the top-level surface small and groups related functionality together:

NamespaceOperations
pc.indexesCreate, list, describe, configure, delete indexes
pc.collectionsCreate, list, describe, delete collections
pc.inferenceEmbed text, rerank results
pc.assistantsManage Pinecone Assistants
python
pc = Pinecone()

# List all indexes
for index_model in pc.indexes.list():
    print(index_model.name, index_model.status.ready)

# Describe one index
desc = pc.indexes.describe("movie-recommendations")

# Delete an index
pc.indexes.delete("movie-recommendations")