pgml-cms/docs/open-source/pgml/api/pgml.transform.md
The `pgml.transform()` function is the most powerful feature of PostgresML. It integrates open-source large language models, like Llama, Mixtral, and many more, letting you perform complex tasks on your data.
The models are downloaded from 🤗 Hugging Face, which hosts tens of thousands of pre-trained and fine-tuned models for tasks like text generation, question answering, summarization, text classification, and more.
The pgml.transform() function comes in two flavors, task-based and model-based.
The task-based API automatically chooses a model based on the task:
```postgresql
pgml.transform(
    task TEXT,
    args JSONB,
    inputs TEXT[]
)
```
| Argument | Description | Example | Required |
|---|---|---|---|
| task | The name of a natural language processing task. | 'text-generation' | Required |
| args | Additional kwargs to pass to the pipeline. | '{"max_new_tokens": 50}'::JSONB | Optional |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently and a separate result is returned. | ARRAY['Once upon a time...'] | Required |
{% tabs %} {% tab title="Text generation" %}
```postgresql
SELECT *
FROM pgml.transform(
    task => 'text-generation',
    inputs => ARRAY['In a galaxy far far away']
);
```
{% endtab %} {% tab title="Translation" %}
```postgresql
SELECT *
FROM pgml.transform(
    task => 'translation_en_to_fr',
    inputs => ARRAY['How do I say hello in French?']
);
```
{% endtab %} {% endtabs %}
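When calling the task-based API from application code, it can help to pass the task, arguments, and prompts as bound parameters rather than interpolating them into the SQL string. The following is a minimal sketch, assuming a psycopg-style driver that uses `%s` placeholders and adapts Python lists to Postgres arrays. `build_task_transform` is a hypothetical helper, not part of PostgresML.

```python
import json


def build_task_transform(task, inputs, args=None):
    """Return (sql, params) for a task-based pgml.transform() call.

    Hypothetical helper: the %s placeholder style assumes a psycopg-like driver.
    """
    if not inputs:
        raise ValueError("inputs must contain at least one prompt")

    clauses = ["task => %s::TEXT"]
    params = [task]

    # args is optional, so only emit it when the caller supplies kwargs.
    if args is not None:
        clauses.append("args => %s::JSONB")
        params.append(json.dumps(args))

    clauses.append("inputs => %s::TEXT[]")
    params.append(list(inputs))

    sql = "SELECT pgml.transform(" + ", ".join(clauses) + ")"
    return sql, tuple(params)


sql, params = build_task_transform(
    "text-generation",
    ["In a galaxy far far away"],
    args={"max_new_tokens": 50},
)
print(sql)
print(params)
```

The returned pair could then be handed to a database cursor's `execute(sql, params)`. Omitting `args` drops that clause from the generated statement entirely.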
The model-based API requires the name of the model and the task, passed as a JSON object. This allows it to be more generic and support more models:
```postgresql
pgml.transform(
    model JSONB,
    args JSONB,
    inputs TEXT[]
)
```
| Argument | Description | Example |
|---|---|---|
| model | JSON object containing the model name and the task. | `'{"task": "text-generation", "model": "mistralai/Mixtral-8x7B-v0.1"}'::JSONB` |
| args | Additional kwargs to pass to the pipeline. | `'{"max_new_tokens": 50}'::JSONB` |
| inputs | Array of prompts to pass to the model for inference. Each prompt is evaluated independently. | `ARRAY['Once upon a time...']` |

{% tabs %} {% tab title="PostgresML SQL" %}
```postgresql
SELECT pgml.transform(
    task => '{
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "mistral",
        "revision": "main",
        "device_map": "auto"
    }'::JSONB,
    inputs => ARRAY['AI is going to'],
    args => '{
        "max_new_tokens": 100
    }'::JSONB
);
```
{% endtab %}
{% tab title="Equivalent Python" %}
```python
import transformers


def transform(task, call, inputs):
    # `task` configures the pipeline; `call` holds per-inference kwargs.
    return transformers.pipeline(**task)(inputs, **call)


transform(
    {
        "task": "text-generation",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "model_type": "mistral",
        "revision": "main",
    },
    {"max_new_tokens": 100},
    ['AI is going to change the world in the following ways:']
)
```
{% endtab %} {% endtabs %}
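To make the split between the model object and `args` concrete without downloading a model, the sketch below swaps `transformers.pipeline` for a stub. The names `normalize_task`, `fake_pipeline`, and the `pipeline_factory` parameter are illustrative assumptions. Treating a bare task name as `{"task": ...}` is likewise an assumption about how the two flavors relate, not a description of PostgresML internals.

```python
def normalize_task(task):
    """Accept a task name or a model object and return a dict of pipeline kwargs."""
    if isinstance(task, str):
        return {"task": task}
    return dict(task)


def fake_pipeline(**task):
    """Stand-in for transformers.pipeline: records its kwargs, loads nothing."""
    def run(prompt, **call):
        return {"prompt": prompt, "model": task.get("model"), "call": call}
    return run


def transform(task, call, inputs, pipeline_factory=fake_pipeline):
    # Model-level settings configure the pipeline once...
    pipe = pipeline_factory(**normalize_task(task))
    # ...while args are passed on every call; each prompt yields its own result.
    return [pipe(prompt, **call) for prompt in inputs]


results = transform(
    {"task": "text-generation", "model": "mistralai/Mixtral-8x7B-v0.1"},
    {"max_new_tokens": 50},
    ["Once upon a time...", "AI is going to"],
)
print(len(results))
```

Unlike the equivalent Python shown in the tab above, which hands the whole list to the pipeline in one call, this sketch loops over the prompts to make the one-result-per-prompt behavior explicit.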
See also: LLM guides for more examples