docs/modalities/overview.md
Daft is designed to work with any modality.
AI models now natively understand text, images, audio, video, and documents, but legacy data engines were never designed to feed these formats to large models. Daft closes that gap, giving you one distributed engine that processes any modality, respects memory limits, and keeps GPUs fed so you can build the pipeline once and scale it anywhere.
<div class="grid cards" markdown>

🔠 Text
Normalize, chunk, dedupe, prompt, and embed text data.
🌄 Images
Work with visual data and apply image processing.
🔉 Audio
Read audio files, extract their metadata, and resample them.
🎥 Video
Work with video files and metadata.
Extract text and image data from PDF documents.
Parse, query, and manipulate semi-structured and hierarchical data.
Generate vector representations for RAG and AI search.
Take advantage of Daft's built-in URL functions and `daft.File` types.

</div>
The most important modality might be one we haven't explored yet. Daft makes it easy to define your own modality: write custom connectors to read and write any kind of data, and use user-defined functions to run custom Python code efficiently and reliably at scale.