Question Answering with MindsDB and OpenAI using MQL

Introduction

In this blog post, we show how to create OpenAI models within MindsDB using the MongoDB Query Language (MQL). In the example, we pass a question to a model and get an answer back. The input data comes from our sample MongoDB database.

Prerequisites

To follow along, install MindsDB locally via Docker or Docker Desktop.

How to Connect MindsDB to a Database

We use a collection from our MongoDB public demo database, so let’s start by connecting MindsDB to it.

You can use MongoDB Compass or the MongoDB Shell to connect MindsDB to our sample database like this:

bash
test> use mindsdb
mindsdb> db.databases.insertOne({
            'name': 'mongo_demo_db',
            'engine': 'mongodb',
            'connection_args': {
                "host": "mongodb+srv://user:[email protected]/",
                "database": "public"
            }
        })

Tutorial

In this tutorial, we create a predictive model to answer questions in a specified domain.

Now that we've connected our database to MindsDB, let’s query the data to be used in the example:

bash
mindsdb> use mongo_demo_db
mongo_demo_db> db.questions.find({}).limit(3)

Here is the output:

bash
{
  _id: '63d01350bbca62e9c77732c0',
  article_title: 'Alessandro_Volta',
  question: 'Was Volta an Italian physicist?',
  true_answer: 'yes'
}
{
  _id: '63d01350bbca62e9c77732c1',
  article_title: 'Alessandro_Volta',
  question: 'Is Volta buried in the city of Pittsburgh?',
  true_answer: 'no'
}
{
  _id: '63d01350bbca62e9c77732c2',
  article_title: 'Alessandro_Volta',
  question: 'Did Volta have a passion for the study of electricity?',
  true_answer: 'yes'
}

Let's create a model collection to answer all questions from the input dataset:

<Note> You need to create an OpenAI engine before deploying the OpenAI model within MindsDB.

Here is how to create this engine:

bash
mongo_demo_db> use mindsdb
mindsdb> db.ml_engines.insertOne(
          {
              "name": "openai_engine",
              "handler": "openai",
              "params": {
                  "openai_api_key": "your-openai-api-key"
                  }
          })
</Note>
bash
mongo_demo_db> use mindsdb
mindsdb> db.models.insertOne({
            name: 'question_answering',
            predict: 'answer',
            training_options: {
                        engine: 'openai_engine',
                        prompt_template: 'answer the question of text:{{question}} about text:{{article_title}}'
                }
        })

In practice, the insertOne method triggers MindsDB to generate an AI collection called question_answering that uses the OpenAI integration to predict a field named answer. The model is created inside the default mindsdb project. In MindsDB, projects are a natural way to keep artifacts, such as models or views, separate according to the predictive task they solve. To learn more, see the MindsDB documentation on projects.

The training_options key specifies the parameters that this handler requires.

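  • The engine parameter specifies the ML engine to use, here the openai_engine created from the openai handler.
  • The prompt_template parameter defines the structure of the message sent to the model, which the model completes with generated text. Placeholders in double curly braces, such as {{question}}, are filled in with the values of the corresponding input fields.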
<Note> Follow [this instruction](/integrations/ai-engines/openai#setup) to set up the OpenAI integration in MindsDB. </Note>
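
To make the placeholder mechanism concrete, here is a minimal sketch in plain JavaScript of how a template like the one above could be filled in from an input document. The renderPrompt helper is hypothetical and exists only for illustration. MindsDB performs its own substitution internally, and its exact behavior (for example, how it treats missing fields) may differ from this sketch.

```javascript
// Illustrative only: renderPrompt is a hypothetical helper, not a MindsDB API.
function renderPrompt(template, row) {
  // Replace each {{field}} placeholder with the matching value from the row.
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, field) => {
    if (!(field in row)) {
      // Assumption: fail loudly on a missing field instead of sending a broken prompt.
      throw new Error(`Missing field for placeholder: ${field}`);
    }
    return String(row[field]);
  });
}

// The same template that is passed in training_options.
const template =
  'answer the question of text:{{question}} about text:{{article_title}}';

const prompt = renderPrompt(template, {
  question: 'Was Volta an Italian physicist?',
  article_title: 'Alessandro_Volta',
});

console.log(prompt);
```

Rendering the template by hand like this can be a quick way to check how a prompt reads before creating the model.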

Once the insertOne method has started execution, we can check the status of the creation process with the following query:

bash
mindsdb> db.models.find({
            'name': 'question_answering'
        })
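
If you would rather not rerun the query by hand, the check can be wrapped in a small polling loop. The sketch below is an assumption-laden illustration, not part of MindsDB: waitForModel is a hypothetical helper, and it assumes the returned document exposes a status field that ends up as 'complete' on success or 'error' on failure. The demonstration uses a stub in place of a real query so that it runs without a database connection.

```javascript
// Hypothetical helper: polls until the model document reports a final status.
// Assumption: the document has a `status` field that becomes 'complete' on
// success or 'error' on failure.
function waitForModel(fetchModel, { maxAttempts = 30, pause = () => {} } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const model = fetchModel();
    if (model && model.status === 'complete') return model;
    if (model && model.status === 'error') {
      throw new Error(`Model creation failed: ${JSON.stringify(model)}`);
    }
    pause(); // wait before the next check
  }
  throw new Error(`Model not ready after ${maxAttempts} attempts`);
}

// Stubbed demonstration: the fake query reports progress on each call.
const statuses = ['generating', 'generating', 'complete'];
const ready = waitForModel(() => ({
  name: 'question_answering',
  status: statuses.shift(),
}));
console.log(ready.status);

// In the Mongo shell, the call might look like this (untested assumption):
// waitForModel(
//   () => db.models.find({ name: 'question_answering' }).toArray()[0],
//   { pause: () => sleep(2000) }
// );
```

Passing the query and the pause as functions keeps the loop independent of the shell, so the same helper can be tried out with stubs first.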

Depending on the internet connection, it may take a while for the status to register as complete. Once creation is complete, the model behaves like any other AI collection. You can query it either by specifying synthetic data directly in the query:

bash
mindsdb> db.question_answering.find({
            question: 'Was Abraham Lincoln the sixteenth President of the United States?',
            article_title: 'Abraham_Lincoln'
        })

Here is the output data:

bash
{
  answer: 'Yes, Abraham Lincoln was the sixteenth President of the United States.',
  question: 'Was Abraham Lincoln the sixteenth President of the United States?',
  article_title: 'Abraham_Lincoln'
}

Or by joining with a collection for batch predictions:

bash
mindsdb> db.question_answering.find(
            {
                'collection': 'mongo_demo_db.questions'
            },
            {
                'question_answering.answer': 'answer',
                'questions.question': 'question',
                'questions.article_title': 'article_title'
            }
        ).limit(3)

Here is the output data:

bash
{
  answer: 'Yes, Volta was an Italian physicist.',
  question: 'Was Volta an Italian physicist?',
  article_title: 'Alessandro_Volta'
}
{
  answer: 'No, Volta is not buried in the city of Pittsburgh.',
  question: 'Is Volta buried in the city of Pittsburgh?',
  article_title: 'Alessandro_Volta'
}
{
  answer: 'Yes, Volta had a passion for the study of electricity. He was fascinated by the',
  question: 'Did Volta have a passion for the study of electricity?',
  article_title: 'Alessandro_Volta'
}

The questions collection is used to make batch predictions. Upon joining the question_answering model with the questions collection, the model uses all values from the article_title and question fields.
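
Conceptually, the join produces one prediction per input document and returns only the projected fields. The following plain JavaScript sketch mimics that shape under stated assumptions: batchPredict and fakePredict are hypothetical names, and fakePredict is a stand-in that echoes its inputs instead of calling OpenAI. It is meant to show the data flow, not how MindsDB implements the join.

```javascript
// Hypothetical sketch of what the join does conceptually; not MindsDB code.
function batchPredict(rows, predict) {
  // One prediction per input document, keeping only the projected fields.
  return rows.map((row) => ({
    answer: predict(row),
    question: row.question,
    article_title: row.article_title,
  }));
}

// Two documents from the questions collection shown earlier.
const questions = [
  {
    article_title: 'Alessandro_Volta',
    question: 'Was Volta an Italian physicist?',
    true_answer: 'yes',
  },
  {
    article_title: 'Alessandro_Volta',
    question: 'Is Volta buried in the city of Pittsburgh?',
    true_answer: 'no',
  },
];

// Stand-in for the model call; it only echoes the fields the template uses.
const fakePredict = ({ question, article_title }) =>
  `stub answer to "${question}" (${article_title})`;

const results = batchPredict(questions, fakePredict);
console.log(results);
```

Note that true_answer does not appear in the results, because the projection in the query lists only answer, question, and article_title.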

<Tip> Check out [this blog post on time series forecasting with Nixtla and MindsDB using MongoDB-QL](https://mindsdb.com/blog/time-series-forecasting-with-nixtla-and-mindsdb-using-mongodb-query-language). </Tip>