Back to Open Interpreter

Best Practices

docs/language-models/local-models/best-practices.mdx

0.4.21.0 KB
Original Source

Most settings — like model architecture and GPU offloading — can be adjusted via your LLM providers like LM Studio.

However, max_tokens and context_window should be set via Open Interpreter.

For local mode, smaller context windows will use less RAM, so we recommend trying a much shorter window (~1000) if it's is failing or if it's slow.

<CodeGroup>
bash
interpreter --local --max_tokens 1000 --context_window 3000
python
from interpreter import interpreter

interpreter.offline = True # Disables online features like Open Procedures
interpreter.llm.model = "openai/x" # Tells OI to send messages in OpenAI's format
interpreter.llm.api_key = "fake_key" # LiteLLM, which we use to talk to LM Studio, requires this
interpreter.llm.api_base = "http://localhost:1234/v1" # Point this at any OpenAI compatible server

interpreter.llm.max_tokens = 1000
interpreter.llm.context_window = 3000

interpreter.chat()
</CodeGroup>

<Info>Make sure max_tokens is less than context_window.</Info>