docs/integrations/ai-engines/openai.mdx
This documentation describes the integration of MindsDB with OpenAI, an AI research organization known for developing AI models like GPT-3 and GPT-4. The integration allows for the deployment of OpenAI models within MindsDB, providing the models with access to data from various data sources.
Before proceeding, ensure the following prerequisites are met:
Create an AI engine from the OpenAI handler.
CREATE ML_ENGINE openai_engine
FROM openai
USING
openai_api_key = 'api-key-value';
Create a model using openai_engine as an engine.
CREATE MODEL openai_model
PREDICT target_column
USING
engine = 'openai_engine', -- engine name as created via CREATE ML_ENGINE
api_base = 'base-url', -- optional, replaces the default base URL
mode = 'mode_name', -- optional, mode to run the model in
model_name = 'openai_model_name', -- optional with default value of gpt-3.5-turbo
question_column = 'question', -- optional, column name that stores user input
context_column = 'context', -- optional, column that stores context of the user input
prompt_template = 'input message to the model here', -- optional, user provides instructions to the model here
user_column = 'user_input', -- optional, stores user input
assistant_column = 'conversation_context', -- optional, stores conversation context
prompt = 'instruction to the model', -- optional stores instruction to the model
max_tokens = 100, -- optional, token limit for answer
temperature = 0.3, -- temp
json_struct = {
'key': 'value',
...
}'
SELECT question, answer
FROM openai_model
WHERE question = 'input question here'
USING prompt_template = 'input new message to the model here';
The following parameters are available to use when creating an OpenAI model:
<AccordionGroup> <Accordion title="engine"> This is the engine name as created with the [`CREATE ML_ENGINE`](https://docs.mindsdb.com/mindsdb_sql/sql/create/ml-engine) statement. </Accordion> <Accordion title="api_base"> This parameter is optional.It replaces the default OpenAI's base URL with the defined value.
The available modes include `default`, `conversational`, `conversational-full`, `image`, and `embedding`.
- The `default` mode is used by default. The model replies to the `prompt_template` message.
- The `conversational` mode enables the model to read and reply to multiple messages.
- The `conversational-full` mode enables the model to read and reply to multiple messages, one reply per message.
- The `image` mode is used to create an image instead of a text reply.
- The `embedding` mode enables the model to return output in the form of embeddings.
You can find [all models supported by each mode here](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/openai_handler/constants.py).
You can find [all available models here](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/openai_handler/constants.py).
The OpenAI handler can be used with any OpenAI-compatible APIs by providing the api_base parameter that stores the base URL of the OpenAI-compatible APIs.
Here is an example of how to use the OpenAI handler with OpenRouter, the OpenAI-compatible interface for accessing LLMs.
CREATE MODEL openrouter_model
PREDICT answer
USING
engine = 'openai_engine',
api_base = 'https://openrouter.ai/api/v1',
openai_api_key = 'openrouter-api-key',
model_name = 'mistralai/devstral-small-2505',
prompt_template = 'answer a question: {{question}}';
DESCRIBE openrouter_model;
SELECT * FROM openrouter_model
WHERE question = 'how many planets are in the solar system?';
When using OpenAI-compatible APIs, it is required to provide the base URL in the api_base parameter and the API key in the openai_api_key parameter.
Here are the combination of parameters for creating a model:
prompt_template alone.question_column and optionally a context_column.prompt, user_column, and assistant_column to create a model in the conversational mode.The following usage examples utilize openai_engine to create a model with the CREATE MODEL statement.
```sql
CREATE MODEL openai_model
PREDICT answer
USING
engine = 'openai_engine',
question_column = 'question';
```
Query the model to get predictions.
```sql
SELECT question, answer
FROM openai_model
WHERE question = 'Where is Stockholm located?';
```
Here is the output:
```sql
+---------------------------+-------------------------------+
|question |answer |
+---------------------------+-------------------------------+
|Where is Stockholm located?|Stockholm is located in Sweden.|
+---------------------------+-------------------------------+
```
```sql
CREATE MODEL openai_model
PREDICT answer
USING
engine = 'openai_engine',
question_column = 'question',
context_column = 'context';
```
Query the model to get predictions.
```sql
SELECT context, question, answer
FROM openai_model
WHERE context = 'Answer accurately'
AND question = 'How many planets exist in the solar system?';
```
On execution, we get:
```sql
+-------------------+-------------------------------------------+----------------------------------------------+
|context |question |answer |
+-------------------+-------------------------------------------+----------------------------------------------+
|Answer accurately |How many planets exist in the solar system?| There are eight planets in the solar system. |
+-------------------+-------------------------------------------+----------------------------------------------+
```
<Tip>
Good prompts are the key to getting great completions out of large language models like the ones that OpenAI offers. For best performance, we recommend you read their [prompting guide](https://beta.openai.com/docs/guides/completion/prompt-design) before trying your hand at prompt templating.
</Tip>
Let's look at an example that reuses the `openai_model` model created earlier and overrides parameters at prediction time.
```sql
SELECT instruction, answer
FROM openai_model
WHERE instruction = 'Speculate extensively'
USING
prompt_template = '{{instruction}}. What does Tom Hanks like?',
max_tokens = 100,
temperature = 0.5;
```
On execution, we get:
```sql
+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|instruction |answer |
+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|Speculate extensively |Some people speculate that Tom Hanks likes to play golf, while others believe that he enjoys acting and directing. It is also speculated that he likes to spend time with his family and friends, and that he enjoys traveling.|
+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
```sql
CREATE MODEL openai_chat_model
PREDICT response
USING
engine = 'openai_engine',
mode = 'conversational',
model_name = 'gpt-3.5-turbo',
user_column = 'user_input',
assistant_column = 'conversation_history',
prompt = 'Answer the question in a helpful way.';
```
And here is how to query this model:
```sql
SELECT response
FROM openai_chat_model
WHERE user_input = '<question>'
AND conversation_history = '<optionally, provide the context for the question>';
```
Follow this tutorial on sentiment analysis and this tutorial on finetuning OpenAI models to see more use case examples. </Tip>
SELECT input.text, output.sentiment
FROM integration.travel data AS input
JOIN openai_engine AS output
SELECT input.text, output.sentiment
FROM integration.'travel data' AS input
JOIN openai_engine AS output
SELECT input.text, output.sentiment
FROM integration.`travel data` AS input
JOIN openai_engine AS output