Back to Postgresml

LLMs

pgml-cms/docs/open-source/pgml/guides/llms/README.md

2.10.03.1 KB
Original Source

LLMs

PostgresML integrates 🤗 Hugging Face Transformers to bring state-of-the-art models into the data layer. There are tens of thousands of pre-trained models with pipelines to turn raw inputs into useful results. Many state of the art deep learning architectures have been published and made available for download. You will want to browse all the models available to find the perfect solution for your dataset and task. For instance, with PostgresML you can:

  • Perform natural language processing (NLP) tasks like sentiment analysis, question and answering, translation, summarization and text generation
  • Access 1000s of state-of-the-art language models like GPT-2, GPT-J, GPT-Neo from :hugs: HuggingFace model hub
  • Fine tune large language models (LLMs) on your own text data for different tasks
  • Use your existing PostgreSQL database as a vector database by generating embeddings from text stored in the database.

See pgml.transform for examples of using transformers or pgml.tune for fine tuning.

Supported tasks

PostgresML currently supports most LLM tasks for Natural Language Processing available on Hugging Face:

TaskNameDescription
Fill maskkey-maskFill in the blank in a sentence.
Question answeringquestion-answeringAnswer a question based on a context.
SummarizationsummarizationSummarize a long text.
Text classificationtext-classificationClassify a text as positive or negative.
Text generationtext-generationGenerate text based on a prompt.
Text-to-text generationtext-to-text-generationGenerate text based on an instruction in the prompt.
Token classificationtoken-classificationClassify tokens in a text.
TranslationtranslationTranslate text from one language to another.
Zero-shot classificationzero-shot-classificationClassify a text without training data.
ConversationalconversationalEngage in a conversation with the model, e.g. chatbot.

Structured inputs

Both versions of the pgml.transform() function also support structured inputs, formatted with JSON. Structured inputs are used with the conversational task, e.g. to differentiate between the system and user prompts. Simply replace the text array argument with an array of JSONB objects.

Additional resources