docs/examples/node_postprocessor/ibm_watsonx.ipynb
WatsonxRerank is a wrapper for IBM watsonx.ai Rerank.
The aim of these examples is to show how to take advantage of watsonx.ai Rerank, Embeddings and LLMs using the LlamaIndex postprocessor API.
Install required packages:
%pip install -qU llama-index
%pip install -qU llama-index-llms-ibm
%pip install -qU llama-index-postprocessor-ibm
%pip install -qU llama-index-embeddings-ibm
The cell below defines the credentials required to work with watsonx Foundation Models, Embeddings and Rerank.
Action: Provide the IBM Cloud user API key. For details, see Managing user API keys.
import os
from getpass import getpass
watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key
You can also pass additional secrets as environment variables:
import os
os.environ["WATSONX_URL"] = "your service instance url"
os.environ["WATSONX_TOKEN"] = "your token for accessing the CPD cluster"
os.environ["WATSONX_PASSWORD"] = "your password for accessing the CPD cluster"
os.environ["WATSONX_USERNAME"] = "your username for accessing the CPD cluster"
os.environ[
    "WATSONX_INSTANCE_ID"
] = "your instance_id for accessing the CPD cluster"
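Before constructing any of the integrations, it can help to confirm that the credentials you expect are actually set. The following is a minimal sketch and not part of the IBM packages: `missing_env_vars` is a hypothetical helper, and which variables you need depends on whether you target IBM Cloud or a Cloud Pak for Data cluster.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    # Hypothetical helper for illustration; defaults to the process environment.
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Example: the IBM Cloud setup in this notebook relies on the API key.
missing = missing_env_vars(["WATSONX_APIKEY"])
if missing:
    print("Missing credentials:", ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the check easy to try out without touching the real environment.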
Note: You must provide either a project_id or a space_id. To get your project or space ID, open your project or space, go to the Manage tab, and click General. For more information, see Project documentation or Deployment space documentation.
In this example, we’ll use the project_id and the Dallas URL.
Provide the PROJECT_ID that will be used to initialize each watsonx integration instance.
PROJECT_ID = "PASTE YOUR PROJECT_ID HERE"
URL = "https://us-south.ml.cloud.ibm.com"
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
You might need to adjust rerank parameters for different tasks:
truncate_input_tokens = 512
Initialize the WatsonxRerank instance.
You need to specify the model_id that will be used for reranking. You can find the list of all available models in Supported reranker models.
from llama_index.postprocessor.ibm import WatsonxRerank
watsonx_rerank = WatsonxRerank(
    model_id="cross-encoder/ms-marco-minilm-l-12-v2",
    top_n=2,
    url=URL,
    project_id=PROJECT_ID,
    truncate_input_tokens=truncate_input_tokens,
)
Alternatively, you can use Cloud Pak for Data credentials. For details, see watsonx.ai software setup.
from llama_index.postprocessor.ibm import WatsonxRerank
watsonx_rerank = WatsonxRerank(
    model_id="cross-encoder/ms-marco-minilm-l-12-v2",
    url=URL,
    username="PASTE YOUR USERNAME HERE",
    password="PASTE YOUR PASSWORD HERE",
    instance_id="openshift",
    version="5.1",
    project_id=PROJECT_ID,
    truncate_input_tokens=truncate_input_tokens,
)
Initialize the WatsonxEmbeddings instance.
For more information about WatsonxEmbeddings, please refer to the sample notebook: https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/ibm_watsonx.ipynb
You might need to adjust embedding parameters for different tasks:
truncate_input_tokens = 512
You need to specify the model_id that will be used for embedding. You can find the list of all the available models in Supported embedding models.
from llama_index.embeddings.ibm import WatsonxEmbeddings
watsonx_embedding = WatsonxEmbeddings(
    model_id="ibm/slate-30m-english-rtrvr",
    url=URL,
    project_id=PROJECT_ID,
    truncate_input_tokens=truncate_input_tokens,
)
Change default settings
from llama_index.core import Settings
Settings.chunk_size = 512
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_documents(
    documents=documents, embed_model=watsonx_embedding
)
Initialize the WatsonxLLM instance.
For more information about WatsonxLLM, please refer to the sample notebook: https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llm/ibm_watsonx.ipynb
You need to specify the model_id that will be used for inferencing. You can find the list of all the available models in Supported foundation models.
You might need to adjust model parameters for different models or tasks. For details, refer to Available MetaNames.
max_new_tokens = 128
from llama_index.llms.ibm import WatsonxLLM
watsonx_llm = WatsonxLLM(
    model_id="meta-llama/llama-3-3-70b-instruct",
    url=URL,
    project_id=PROJECT_ID,
    max_new_tokens=max_new_tokens,
)
Retrieve the top 10 most similar nodes, then rerank them with WatsonxRerank:
query_engine = index.as_query_engine(
    llm=watsonx_llm,
    similarity_top_k=10,
    node_postprocessors=[watsonx_rerank],
)
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)
from llama_index.core.response.pprint_utils import pprint_response
pprint_response(response, show_source=True)
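Conceptually, the query engine above works in two stages: the index first retrieves similarity_top_k=10 candidate nodes, and the reranker then rescores those candidates against the query and keeps the best top_n=2. The sketch below illustrates that flow in plain Python. Everything in it is an assumption for illustration: `retrieve_then_rerank`, `first_stage_score`, and `rerank_score` are hypothetical names, and the toy word-overlap scorers merely stand in for embedding similarity and the cross-encoder model. It is not how WatsonxRerank is implemented.

```python
def first_stage_score(query, text):
    """Cheap stand-in for embedding similarity: number of shared words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def rerank_score(query, text):
    """Stand-in for a cross-encoder: share of the text's words that match."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / max(len(text_words), 1)


def retrieve_then_rerank(query, texts, similarity_top_k, top_n):
    # Stage 1: keep the similarity_top_k best candidates by the cheap score.
    candidates = sorted(
        texts, key=lambda t: first_stage_score(query, t), reverse=True
    )[:similarity_top_k]
    # Stage 2: rescore only those candidates and keep the top_n best.
    reranked = sorted(
        candidates, key=lambda t: rerank_score(query, t), reverse=True
    )
    return reranked[:top_n]


query = "red apple pie recipe"
focused = "apple pie recipe with red apples"
rambling = (
    "a long story that mentions a red car an apple tree "
    "a pie shop and a recipe book"
)
unrelated = "notes on bicycle repair"
texts = [rambling, focused, unrelated]

print(retrieve_then_rerank(query, texts, similarity_top_k=2, top_n=1))
```

In this toy data both the focused and the rambling passage tie in the first stage, and only the second stage moves the focused passage to the front. That is the kind of reordering a reranker is meant to provide.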
For comparison, query without the reranker, directly retrieving the top 2 most similar nodes:
query_engine = index.as_query_engine(
    llm=watsonx_llm,
    similarity_top_k=2,
)
response = query_engine.query(
    "What did Sam Altman do in this essay?",
)
Without the reranker, the retrieved context is irrelevant and the response is hallucinated.
pprint_response(response, show_source=True)
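To compare the two runs more directly than by reading the printed output, you can inspect the source nodes attached to each response. The sketch below assumes the response exposes `source_nodes`, each with a `score` and a `node` that offers `get_content()`, as LlamaIndex responses generally do. `summarize_sources` is a hypothetical helper, and the stand-in objects exist only so the example runs without a watsonx.ai connection.

```python
from types import SimpleNamespace


def summarize_sources(response, max_chars=80):
    """Return (score, text preview) pairs for each source node of a response."""
    summary = []
    for source in response.source_nodes:
        # Collapse whitespace so each preview fits on one line.
        preview = " ".join(source.node.get_content().split())[:max_chars]
        summary.append((source.score, preview))
    return summary


# Stand-in response for illustration; with a real query you would pass the
# `response` returned by `query_engine.query(...)` instead.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            score=0.9,
            node=SimpleNamespace(get_content=lambda: "first  chunk\nof text"),
        ),
    ]
)
for score, preview in summarize_sources(fake_response):
    print(score, preview)
```

Calling the helper on the reranked response and on the plain top-2 response gives two short lists that are easier to compare side by side.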