docs/versioned_docs/version-1.9.0/Components/bundles-nvidia.mdx
import Icon from "@site/src/components/icon"; import PartialParams from '@site/docs/_partial-hidden-params.mdx';
<Icon name="Blocks" aria-hidden="true" /> Bundles contain custom components that support specific third-party integrations with Langflow.
This page describes the components that are available in the NVIDIA bundle.
This component generates text using NVIDIA LLMs. For more information about NVIDIA LLMs, see the NVIDIA AI documentation.
| Name | Type | Description |
|---|---|---|
| max_tokens | Integer | Input parameter. The maximum number of tokens to generate. Set to 0 for unlimited tokens. |
| model_name | String | Input parameter. The name of the NVIDIA model to use. Default: mistralai/mixtral-8x7b-instruct-v0.1. |
| base_url | String | Input parameter. The base URL of the NVIDIA API. Default: https://integrate.api.nvidia.com/v1. |
| nvidia_api_key | SecretString | Input parameter. The NVIDIA API Key for authentication. |
| temperature | Float | Input parameter. Controls randomness in the output. Default: 0.1. |
| seed | Integer | Input parameter. The seed controls the reproducibility of the job. Default: 1. |
| model | LanguageModel | Output parameter. An instance of ChatNVIDIA configured with the specified parameters. |
NVIDIA NIM (NVIDIA Inference Microservices) provides containers to self-host GPU-accelerated inferencing microservices.
You can use the NVIDIA component to connect Langflow with NVIDIA NIM on an RTX Windows system with Windows Subsystem for Linux 2 (WSL2) installed.
The following example connects an NVIDIA language model component in Langflow to a deployed mistral-nemo-12b-instruct NIM on an RTX Windows system with WSL2.
Prepare your system:
A NIM container deployed according to the model's instructions
Prerequisites vary between models.
For example, to deploy the mistral-nemo-12b-instruct NIM, follow the instructions for Windows on RTX AI PCs (Beta) on your model's deployment overview.
Windows 11 build 23H2 or later
At least 12 GB of RAM
Create a flow based on the Basic Prompting template.
Replace the OpenAI model component with the NVIDIA component.
In the NVIDIA component's Base URL field, add the URL where your NIM is accessible. If you followed your model's deployment instructions, the value is http://localhost:8000/v1.
In the NVIDIA component's NVIDIA API Key field, add your NVIDIA API Key.
Select your model from the Model Name field.
Open the Playground and chat with your NIM model.
The NVIDIA Embeddings component generates embeddings using NVIDIA models.
For more information about using embedding model components in flows, see Embedding model components.
| Name | Type | Description |
|---|---|---|
| model | String | Input parameter. The NVIDIA model to use for embeddings, such as nvidia/nv-embed-v1. |
| base_url | String | Input parameter. The base URL for the NVIDIA API. Default: https://integrate.api.nvidia.com/v1. |
| nvidia_api_key | SecretString | Input parameter. The API key for authenticating with NVIDIA's service. |
| temperature | Float | Input parameter. The model temperature for embedding generation. Default: 0.1. |
| embeddings | Embeddings | Output parameter. An NVIDIAEmbeddings instance for generating embeddings. |
:::tip Tokenization considerations Be aware of your embedding model's chunk size limit. Tokenization errors can occur if your text chunks are too large. For more information, see Tokenization errors due to chunk size. :::
This component finds and reranks documents using the NVIDIA API.
The NVIDIA Retriever Extraction component integrates with the NVIDIA nv-ingest microservice for data ingestion, processing, and extraction of text files.
The nv-ingest service supports multiple extraction methods for PDF, DOCX, and PPTX file types, and includes pre- and post-processing services like splitting, chunking, and embedding generation. The extractor service's High Resolution mode uses the nemoretriever-parse extraction method for better quality extraction from scanned PDF documents. This feature is only available for PDF files.
The NVIDIA Retriever Extraction component imports the NVIDIA Ingestor client, ingests files with requests to the NVIDIA ingest endpoint, and outputs the processed content as a list of JSON objects. Ingestor accepts additional configuration options for data extraction from other text formats. To configure these options, see the parameters.
:::tip NVIDIA Retriever Extraction is also known as NV-Ingest and NeMo Retriever Extraction. :::
The NVIDIA Retriever Extraction component accepts Message inputs, and then outputs JSON. The component calls an NVIDIA Ingest microservice's endpoint to ingest a local file and extract the text.
To use the NVIDIA Retriever Extraction component in your flow, follow these steps:
Prepare your system:
An NVIDIA Ingest endpoint. For more information on setting up an NVIDIA Ingest endpoint, see the NVIDIA Ingest quickstart.
The NVIDIA Retriever Extraction component requires the installation of additional dependencies to your Langflow environment. To install the dependencies in a virtual environment, run the following commands.
source **YOUR_LANGFLOW_VENV**/bin/activate
uv sync --extra nv-ingest
uv run langflow run
source **YOUR_LANGFLOW_VENV**/bin/activate
uv pip install --prerelease=allow 'langflow[nv-ingest]'
uv run langflow run
Add the NVIDIA Retriever Extraction component to your flow.
In the Base URL field, enter the URL of the NVIDIA Ingest endpoint. You can also store the URL as a global variable to reuse it in multiple components and flows.
Click Select Files to select a file to ingest.
Select which text type to extract from the file: text, charts, tables, images, or infographics.
Optional: For PDF files, enable High Resolution Mode for better quality extraction from scanned documents.
Select whether to split the text into chunks.
<PartialParams />Click <Icon name="Play" aria-hidden="true" /> Run component to ingest the file, and then click Logs or <Icon name="TextSearch" aria-hidden="true" /> Inspect output to confirm the component ingested the file.
To store the processed data in a vector database, add a vector store component to your flow, and then connect the NVIDIA Retriever Extraction component's JSON output to the vector store component's input.
When you run the flow with a vector store component, the processed data is stored in the vector database. You can query your database to retrieve the uploaded data.
The NVIDIA Retriever Extraction component has the following parameters.
For more information, see the NV-Ingest documentation.
| Name | Display Name | Info |
|---|---|---|
| base_url | NVIDIA Ingestion URL | The URL of the NVIDIA Ingestion API. |
| path | Path | File path to process. |
| extract_text | Extract Text | Extract text from documents. Default: true. |
| extract_charts | Extract Charts | Extract text from charts. Default: false. |
| extract_tables | Extract Tables | Extract text from tables. Default: true. |
| extract_images | Extract Images | Extract images from document. Default: true. |
| extract_infographics | Extract Infographics | Extract infographics from document. Default: false. |
| text_depth | Text Depth | The level at which text is extracted. Options: 'document', 'page', 'block', 'line', 'span'. Default: page. |
| split_text | Split Text | Split text into smaller chunks. Default: true. |
| chunk_size | Chunk Size | The number of tokens per chunk. Default: 500. Make sure the chunk size is compatible with your embedding model. For more information, see Tokenization errors due to chunk size. |
| chunk_overlap | Chunk Overlap | Number of tokens to overlap from previous chunk. Default: 150. |
| filter_images | Filter Images | Filter images (see advanced options for filtering criteria). Default: false. |
| min_image_size | Minimum Image Size Filter | Minimum image width/length in pixels. Default: 128. |
| min_aspect_ratio | Minimum Aspect Ratio Filter | Minimum allowed aspect ratio (width / height). Default: 0.2. |
| max_aspect_ratio | Maximum Aspect Ratio Filter | Maximum allowed aspect ratio (width / height). Default: 5.0. |
| dedup_images | Deduplicate Images | Filter duplicated images. Default: true. |
| caption_images | Caption Images | Generate captions for images using the NVIDIA captioning model. Default: true. |
| high_resolution | High Resolution (PDF only) | Process PDF in high-resolution mode for better quality extraction from scanned PDF. Default: false. |
The NVIDIA Retriever Extraction component outputs a list of JSON objects where each object contains:
text: The extracted content.
file_path: The source file name and path.document_type: The type of the document, which can be text, structured, or image.description: Additional description of the content.The output varies based on the document_type:
Documents with document_type: "text" contain:
text field.extract_text parameter.Documents with document_type: "structured" contain:
extract_tables, extract_charts, and extract_infographics parameters.text field after being processed from the table_content metadata.Documents with document_type: "image" contain:
text field when caption_images is enabled.extract_images parameter.The NVIDIA System-Assist component integrates your flows with NVIDIA G-Assist, enabling interaction with NVIDIA GPU drivers through natural language prompts.
For example, prompt G-Assist with "What is my current GPU temperature?" or "Show me the available GPU memory" to get information, and then tell G-Assist to modify your GPU settings.
For more information, see the NVIDIA G-Assist repository.
Prepare your system:
gassist.rise package, which is installed with all Langflow versions that include this component.Create a flow with a Chat Input component, NVIDIA System-Assist component, and Chat Output components.
This is a simplified example that uses only three components. Depending on your use case, your flow might use more components or different inputs and outputs.
Connect the Chat Input component to the NVIDIA System-Assist component's Prompt input.
The Prompt parameter accepts a natural language prompt that is processed by the NVIDIA G-Assist AI Assistant. In this example, you'll provide the prompt as chat input. You could also enter a prompt directly in the Prompt input or connect another input component.
Connect the NVIDIA System-Assist component's output to the Chat Output component.
To test the flow, open the Playground, and then ask a question about your GPU.
For example, "What is my current GPU temperature?".
Through the NVIDIA System-Assist component, NVIDIA G-Assist queries your GPU based on the prompt, and then prints the response to the Playground.
The component's output is a Message containing the NVIDIA G-Assist response.
The string response with the completed operation result is available in the text key in the Message object.