Back to Mem0

Pinecone

embedchain/docs/components/vector-databases/pinecone.mdx

2.0.12.6 KB
Original Source

Overview

Install pinecone related dependencies using the following command:

bash
pip install --upgrade 'pinecone-client pinecone-text'

In order to use Pinecone as vector database, set the environment variable PINECONE_API_KEY which you can find on Pinecone dashboard.

<CodeGroup>
python
from embedchain import App

# Load pinecone configuration from yaml file
app = App.from_config(config_path="pod_config.yaml")
# Or
app = App.from_config(config_path="serverless_config.yaml")
yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    pod_config:
      environment: gcp-starter
      metadata_config:
        indexed:
          - "url"
          - "hash"
yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    serverless_config:
      cloud: aws
      region: us-west-2
</CodeGroup> <Note> You can find more information about Pinecone configuration [here](https://docs.pinecone.io/docs/manage-indexes#create-a-pod-based-index). You can also optionally provide `index_name` as a config param in yaml file to specify the index name. If not provided, the index name will be `{collection_name}-{vector_dimension}`. </Note>

Usage

Here is an example of how you can do hybrid search using Pinecone as a vector database through Embedchain.

python
import os

from embedchain import App

config = {
    'app': {
        "config": {
            "id": "ec-docs-hybrid-search"
        }
    },
    'vectordb': {
        'provider': 'pinecone',
        'config': {
            'metric': 'dotproduct',
            'vector_dimension': 1536,
            'index_name': 'my-index',
            'serverless_config': {
                'cloud': 'aws',
                'region': 'us-west-2'
            },
            'hybrid_search': True, # Remember to set this for hybrid search
        }
    }
}

# Initialize app
app = App.from_config(config=config)

# Add documents
app.add("/path/to/file.pdf", data_type="pdf_file", namespace="my-namespace")

# Query
app.query("<YOUR QUESTION HERE>", namespace="my-namespace")

# Chat
app.chat("<YOUR QUESTION HERE>", namespace="my-namespace")

Under the hood, Embedchain fetches the relevant chunks from the documents you added by doing hybrid search on the pinecone index. If you have questions on how pinecone hybrid search works, please refer to their offical documentation here.

<Snippet file="missing-vector-db-tip.mdx" />