Jina Chunking

<!-- MANUAL: file_description -->

Blocks for splitting text into semantic chunks using Jina AI.

<!-- END MANUAL -->

Jina Chunking

What it is

Splits texts into chunks using Jina AI's segmentation service.

How it works

<!-- MANUAL: how_it_works -->

This block sends each input text to Jina AI's segmentation service, which splits it into semantically meaningful chunks. Unlike splitting by a fixed character count, this approach aims to keep related content together in the same chunk, which makes the output well suited to retrieval-augmented generation (RAG) applications.
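To make the contrast concrete, here is a minimal sketch of the naive alternative: fixed-width splitting by character count. The function name `split_by_chars` is hypothetical and is not part of the block. It only illustrates why character-based splitting tends to cut through words and sentences.

```python
def split_by_chars(text: str, size: int) -> list[str]:
    """Naive splitting: cut every `size` characters, ignoring meaning."""
    return [text[i:i + size] for i in range(0, len(text), size)]


text = "Jina segments text by meaning. Fixed-width splitting does not."
pieces = split_by_chars(text, 20)

# The first piece ends in the middle of the word "by":
# pieces[0] == "Jina segments text b"
for piece in pieces:
    print(repr(piece))
```

A semantic segmenter would instead be expected to break at natural boundaries such as the end of the first sentence.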

Set `max_chunk_length` to cap the length of each chunk, and enable `return_tokens` to also receive token information for each chunk.
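The flow described above could be sketched roughly as follows. This is an illustration under assumptions, not the block's actual implementation: the endpoint URL, the request field names (`content`, `return_chunks`), the response keys (`chunks`, `tokens`), and the default of 1000 are assumptions that should be checked against Jina's current API documentation. The helper names `build_payload`, `post_json`, and `chunk_texts` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable

# Assumed endpoint; verify against Jina's documentation before use.
SEGMENT_URL = "https://segment.jina.ai/"


def build_payload(text: str, max_chunk_length: int = 1000,
                  return_tokens: bool = False) -> dict[str, Any]:
    """Build the request body for one text (field names are assumptions)."""
    return {
        "content": text,
        "return_chunks": True,
        "return_tokens": return_tokens,
        "max_chunk_length": max_chunk_length,
    }


def post_json(payload: dict[str, Any], api_key: str) -> dict[str, Any]:
    """Send one payload over HTTP and decode the JSON response."""
    request = urllib.request.Request(
        SEGMENT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def chunk_texts(texts: list[str],
                send: Callable[[dict[str, Any]], dict[str, Any]],
                max_chunk_length: int = 1000,
                return_tokens: bool = False) -> tuple[list[Any], list[Any]]:
    """Chunk every text and collect the results into flat lists."""
    chunks: list[Any] = []
    tokens: list[Any] = []
    for text in texts:
        result = send(build_payload(text, max_chunk_length, return_tokens))
        chunks.extend(result.get("chunks", []))
        if return_tokens:
            tokens.extend(result.get("tokens", []))
    return chunks, tokens
```

Passing the sender in as a function keeps the aggregation logic testable without network access. A real call might look like `chunk_texts(texts, lambda p: post_json(p, api_key))`, where `api_key` is your own Jina credential.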

<!-- END MANUAL -->

Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| texts | List of texts to chunk | List[Any] | Yes |
| max_chunk_length | Maximum length of each chunk | int | No |
| return_tokens | Whether to return token information | bool | No |

Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the operation failed | str |
| chunks | List of chunked texts | List[Any] |
| tokens | List of token information for each chunk | List[Any] |

Possible use case

<!-- MANUAL: use_case -->

RAG Preprocessing: Chunk documents for retrieval-augmented generation systems.

Embedding Preparation: Split long texts into optimal chunks for embedding generation.

Document Processing: Break down large documents for analysis or storage in vector databases.
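As a rough illustration of the document-processing use case, the sketch below turns the block's `chunks` output into records that could be embedded or stored in a vector database. The `ChunkRecord` type, the `to_records` helper, and the `doc_id` field are hypothetical names chosen for this example. The block itself only returns the list of chunks.

```python
from dataclasses import dataclass


@dataclass
class ChunkRecord:
    doc_id: str   # identifier of the source document (hypothetical field)
    index: int    # position of the chunk within the document
    text: str     # chunk text, with surrounding whitespace removed


def to_records(doc_id: str, chunks: list[str]) -> list[ChunkRecord]:
    """Drop empty chunks, then number the remaining ones in order."""
    cleaned = [chunk.strip() for chunk in chunks if chunk.strip()]
    return [ChunkRecord(doc_id, i, text) for i, text in enumerate(cleaned)]


records = to_records("report-1", ["Intro paragraph.\n", "   ", "Second section."])
for record in records:
    print(record.index, record.text)
```

Keeping the source document ID and chunk position alongside the text makes it possible to trace a retrieved chunk back to where it came from.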

<!-- END MANUAL -->