# LangChain Integration with Nexa SDK

This guide demonstrates how to integrate LangChain with Nexa SDK's OpenAI-compatible API for VLM (Vision-Language Model) chat capabilities.
Nexa SDK provides an OpenAI-compatible REST API that allows you to use LangChain's ChatOpenAI class with your local VLM models. This integration enables you to leverage LangChain's powerful features (chains, agents, memory, etc.) while running models locally with Nexa SDK.
## Prerequisites

This guide uses the `NexaAI/Qwen3-VL-4B-Instruct-GGUF` model. Follow the installation instructions in the main README to install the Nexa CLI for your platform, then pull the model:

```bash
nexa pull NexaAI/Qwen3-VL-4B-Instruct-GGUF
```
## Start the Server

Start the Nexa server with the OpenAI-compatible API:

```bash
nexa serve
```

The server will be available at `http://localhost:18181/v1` (note the `/v1` suffix).
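Under the hood, LangChain's `ChatOpenAI` issues standard OpenAI-style requests against this endpoint. As a rough sketch, the JSON body it POSTs to `/v1/chat/completions` looks like the following (field names follow the OpenAI chat completions format; the values mirror the configuration used later in this guide):

```python
import json

# Sketch of the request body sent to POST /v1/chat/completions.
payload = {
    "model": "NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "What is artificial intelligence?"},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}

print(json.dumps(payload, indent=2))
```

Because the wire format is the same as OpenAI's, any OpenAI-compatible client (not just LangChain) can talk to this endpoint.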
## Install Dependencies

```bash
pip install -r requirements.txt
```
## Configuration

LangChain's `ChatOpenAI` class can be configured to use Nexa SDK's API by setting the `base_url` parameter:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    base_url="http://localhost:18181/v1",
    api_key="not-needed",  # Nexa SDK doesn't require authentication
    temperature=0.7,
    max_tokens=512,
)
```
Key parameters:

- `base_url`: `"http://localhost:18181/v1"` - Nexa SDK's OpenAI-compatible API endpoint
- `api_key`: `"not-needed"` - Nexa SDK doesn't require authentication
- `model`: `"NexaAI/Qwen3-VL-4B-Instruct-GGUF"` - the model identifier (must match the model name used with `nexa pull`)

## Basic Chat

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    model="NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    base_url="http://localhost:18181/v1",
    api_key="not-needed",
)

messages = [HumanMessage(content="What is artificial intelligence?")]
response = llm.invoke(messages)
print(response.content)
```
## Using a System Message

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(
    model="NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    base_url="http://localhost:18181/v1",
    api_key="not-needed",
)

messages = [
    SystemMessage(content="You are a helpful AI assistant."),
    HumanMessage(content="Explain quantum computing in simple terms."),
]
response = llm.invoke(messages)
print(response.content)
```
## Multi-Turn Conversation

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    model="NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    base_url="http://localhost:18181/v1",
    api_key="not-needed",
)

conversation = [HumanMessage(content="My name is Alice.")]
response = llm.invoke(conversation)
print(response.content)

# Continue the conversation by appending the model's reply and a follow-up
conversation.append(response)
conversation.append(HumanMessage(content="What's my name?"))
response = llm.invoke(conversation)
print(response.content)
```
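Because the full message list is re-sent on every turn, long conversations can eventually exceed the model's context window. One minimal way to cap the history is sketched below; the trimming policy and the `trim_history` name are illustrative, not part of Nexa SDK or LangChain:

```python
def trim_history(messages, max_messages=8):
    """Keep the first message (often a system prompt) plus the most
    recent turns. Works on any list; the policy here is illustrative."""
    if len(messages) <= max_messages:
        return list(messages)
    # Preserve the first message and the tail of the conversation.
    return [messages[0]] + messages[-(max_messages - 1):]

history = [f"msg{i}" for i in range(20)]
trimmed = trim_history(history, max_messages=8)
print(trimmed)
# ['msg0', 'msg13', 'msg14', 'msg15', 'msg16', 'msg17', 'msg18', 'msg19']
```

In a real application you would apply the same idea to the `conversation` list before each `llm.invoke(...)` call, or use a token-based limit instead of a message count.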
## Running the Example

Run the included example script:

```bash
python example.py
```

The demo showcases basic chat, system messages, and multi-turn conversation.
## Using LangChain Features

Once configured, you can use the `ChatOpenAI` instance with all LangChain features. For example, with a prompt template:

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(
    model="NexaAI/Qwen3-VL-4B-Instruct-GGUF",
    base_url="http://localhost:18181/v1",
    api_key="not-needed",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

# LLMChain and chain.run() are deprecated; compose with the | operator instead
chain = prompt | llm
result = chain.invoke({"question": "What is Python?"})
print(result.content)
```
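The template step above can be sketched in plain Python: each `(role, template)` pair is formatted with the input variables to produce the final message list. This is a simplified illustration of what the prompt template does, not LangChain's actual implementation, and `format_prompt` is a hypothetical helper:

```python
def format_prompt(message_templates, **variables):
    """Substitute variables into (role, template) pairs - a simplified
    stand-in for a chat prompt template."""
    return [(role, template.format(**variables))
            for role, template in message_templates]

templates = [
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
]
print(format_prompt(templates, question="What is Python?"))
# [('system', 'You are a helpful assistant.'), ('human', 'What is Python?')]
```

The formatted message list is what actually gets sent to the model when the chain is invoked.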
## Troubleshooting

If you see connection errors, ensure:

- The server is running: `nexa serve --host 127.0.0.1:18181`
- The server is reachable at `http://localhost:18181` and the `base_url` includes the `/v1` suffix

If you get a "model not found" error:

- Pull the model first: `nexa pull NexaAI/Qwen3-VL-4B-Instruct-GGUF`
- Verify it is available with `nexa list`

Nexa SDK's API is compatible with OpenAI's API, but some advanced features may differ. If you encounter issues, consult the Nexa SDK documentation.
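As a quick connectivity check, you can test whether anything is listening on the server port before debugging LangChain itself. This is a small standard-library sketch (the `is_port_open` helper is illustrative; host and port match the defaults above):

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if is_port_open("localhost", 18181):
        print("Nexa server port is reachable")
    else:
        print("Nothing is listening on port 18181 - is `nexa serve` running?")
```

If the port is open but requests still fail, double-check that the `base_url` includes the `/v1` suffix.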
## License

This integration example follows the same license as Nexa SDK. See the LICENSE file for details.