Modalities Overview

Daft is designed to work with any modality.

AI models now natively understand text, images, audio, video, and documents, but legacy data engines were never designed to feed these formats to large models. Daft closes that gap with a single distributed engine that processes any modality, respects memory limits, and keeps GPUs fed, so you can build a pipeline once and scale it anywhere.

<div class="grid cards" markdown>
  • 🔠 Text

    Normalize, chunk, dedupe, prompt, and embed text data.

  • 🌄 Images

    Load and process image data.

  • 🔉 Audio

    Read audio files, extract their metadata, and resample them.

  • 🎥 Video

    Work with video files and their metadata.

  • 📄 Documents

    Extract text and image data from PDF documents.

  • {} JSON and Nested Data

    Parse, query, and manipulate semi-structured and hierarchical data.

  • Embeddings

    Generate vector representations for RAG and AI search.

  • 📁 Generic Files and URLs

    Take advantage of Daft's built-in URL functions and `daft.File` types.

</div>

Custom Modalities

The most important modality might be one we haven't explored yet. Daft makes it easy to define your own: write custom connectors to read and write any kind of data, and use user-defined functions to run custom Python code efficiently and reliably at scale.
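
As a rough illustration of the user-defined-function half of that idea, the sketch below treats a made-up `sensor_id,celsius` text record as a "custom modality". The record format and the names `reading_celsius` and `add_celsius_column` are hypothetical, and the example assumes a recent Daft release in which the `daft.func` decorator is available and infers the return type from the function's type hints. It is a minimal sketch, not the recommended way to structure a connector.

```python
def reading_celsius(line: str) -> float:
    """Parse a hypothetical 'sensor_id,celsius' record and return the temperature."""
    _sensor_id, value = line.split(",", 1)
    return float(value)


def add_celsius_column(lines: list[str]) -> dict:
    """Apply reading_celsius to every row with Daft and return plain Python data."""
    # Imported lazily so the parser above can be tested without Daft installed.
    import daft

    # Assumption: `daft.func` wraps a row-wise Python function as a UDF and
    # derives the output dtype (here a float column) from the type hints.
    celsius = daft.func(reading_celsius)

    df = daft.from_pydict({"line": lines})
    df = df.with_column("celsius", celsius(df["line"]))
    return df.to_pydict()
```

Calling `add_celsius_column(["sensor-1,21.5", "sensor-2,19.0"])` should return the original `line` column alongside a new `celsius` column. Keeping the parsing logic in a plain function means it can be unit-tested on its own before it is scaled out with Daft.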