Search & Indexing Guide

skills/videodb/reference/search.md

Search allows you to find specific moments inside videos using natural language queries, exact keywords, or visual scene descriptions.

Prerequisites

Videos must be indexed before they can be searched. Indexing is a one-time operation per video per index type.

Indexing

Spoken Word Index

Index the transcribed speech content of a video for semantic and keyword search:

python
# coll is a collection object, e.g. obtained from conn.get_collection()
video = coll.get_video(video_id)

# force=True makes indexing idempotent: the call is skipped if the video is already indexed
video.index_spoken_words(force=True)

This transcribes the audio track and builds a searchable index over the spoken content. It is required for semantic and keyword search over speech.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| language_code | str or None | None | Language code of the video |
| segmentation_type | SegmentationType | SegmentationType.sentence | Segmentation type (sentence or llm) |
| force | bool | False | Set to True to skip indexing if the video is already indexed (avoids the "already exists" error) |
| callback_url | str or None | None | Webhook URL for async notification |

Scene Index

Index visual content by generating AI descriptions of scenes. Like spoken word indexing without force=True, this raises an error if a scene index already exists. Because index_scenes() has no force parameter, catch the error and extract the existing scene_index_id from its message.

python
import re
from videodb import SceneExtractionType

try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content, objects, actions, and setting in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise
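
The try/except pattern above can be wrapped in a small reusable helper. The following is a minimal sketch, not part of the SDK: the name get_or_create_scene_index is hypothetical, and the regex assumes the error message contains the existing id in the form shown in the example above.

```python
import re

# Pattern copied from the example above. The exact wording of the error
# message is an assumption, so adjust the regex if your SDK version differs.
_EXISTING_ID = re.compile(r"id\s+([a-f0-9]+)")


def get_or_create_scene_index(video, **index_kwargs):
    """Return a scene index id, reusing the existing one when indexing fails.

    `video` is any object exposing index_scenes(**kwargs). This helper is a
    hypothetical convenience wrapper, not an SDK function.
    """
    try:
        return video.index_scenes(**index_kwargs)
    except Exception as exc:
        match = _EXISTING_ID.search(str(exc))
        if match is None:
            raise  # unrelated failure: surface it unchanged
        return match.group(1)
```

Note that the pattern is loose: it would also match text such as "invalid argument", so tightening it to the actual message wording is worth considering once you have seen a real error.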

Extraction types:

| Type | Description | Best For |
| --- | --- | --- |
| SceneExtractionType.shot_based | Splits on visual shot boundaries | General purpose, action content |
| SceneExtractionType.time_based | Splits at fixed intervals | Uniform sampling, long static content |
| SceneExtractionType.transcript | Splits based on transcript segments | Speech-driven scene boundaries |

Parameters for time_based:

python
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5, "select_frames": ["first", "last"]},
    prompt="Describe what is happening in this scene.",
)
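
Before indexing a long video, it can help to estimate how many scenes and frames a time_based config will produce. The sketch below is pure arithmetic and makes two assumptions that are not stated in this guide: that "time" is the interval length in seconds, and that each entry in "select_frames" yields one frame per interval. The function name is hypothetical.

```python
import math


def estimate_time_based_scenes(duration_s, extraction_config):
    """Estimate (scene_count, frame_count) for a time_based extraction config.

    Assumes `time` is the interval length in seconds and that each entry in
    `select_frames` contributes one frame per interval.
    """
    interval = extraction_config["time"]
    if interval <= 0:
        raise ValueError("time must be positive")
    scenes = math.ceil(duration_s / interval)
    frames = scenes * len(extraction_config["select_frames"])
    return scenes, frames


config = {"time": 5, "select_frames": ["first", "last"]}
print(estimate_time_based_scenes(62, config))  # prints (13, 26)
```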

Search Types

Semantic search matches natural language queries against spoken content:

python
from videodb import SearchType

results = video.search(
    query="explaining the benefits of machine learning",
    search_type=SearchType.semantic,
)

Returns ranked segments where the spoken content semantically matches the query.

Keyword search performs exact term matching in transcribed speech:

python
results = video.search(
    query="artificial intelligence",
    search_type=SearchType.keyword,
)

Returns segments containing the exact keyword or phrase.

Scene search matches visual content queries against indexed scene descriptions. It requires a prior index_scenes() call.

index_scenes() returns a scene_index_id. Pass it to video.search() to target a specific scene index (especially important when a video has multiple scene indexes):

python
from videodb import SearchType, IndexType
from videodb.exceptions import InvalidRequestError

# Search using semantic search against the scene index.
# Use score_threshold to filter low-relevance noise (recommended: 0.3+).
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
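
Since every search call needs the same "No results found" handling, the pattern above can be factored into one function. This is a sketch under stated assumptions: search_shots is a hypothetical helper name, and the exception class is passed in as a parameter only so the snippet runs without the SDK installed. In real use you would pass videodb.exceptions.InvalidRequestError.

```python
def search_shots(video, query, no_results_error=Exception, **search_kwargs):
    """Run video.search() and return a list of shots, empty when nothing matches.

    `no_results_error` is the exception type raised for empty results
    (InvalidRequestError in the example above). Any other error, or the same
    error type with a different message, is re-raised unchanged.
    """
    try:
        results = video.search(query=query, **search_kwargs)
    except no_results_error as exc:
        if "No results found" in str(exc):
            return []
        raise
    return list(results.get_shots())
```

A call might then look like search_shots(video, "person writing on a whiteboard", no_results_error=InvalidRequestError, search_type=SearchType.semantic, index_type=IndexType.scene, score_threshold=0.3).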

Important notes:

  • Use SearchType.semantic with index_type=IndexType.scene. This is the most reliable combination and works on all plans.
  • SearchType.scene exists but may not be available on all plans (e.g. the Free tier).
  • The scene_index_id parameter is optional. If omitted, the search runs against all scene indexes on the video. Pass it to target a specific index.
  • You can create multiple scene indexes per video (with different prompts or extraction types) and search them independently using scene_index_id.

Scene Search with Metadata Filtering

If scenes were indexed with custom metadata, you can combine semantic search with metadata filters:

python
from videodb import SearchType, IndexType

results = video.search(
    query="a skillful chasing scene",
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    scene_index_id=scene_index_id,
    filter=[{"camera_view": "road_ahead"}, {"action_type": "chasing"}],
)
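
The filter argument in the example above is a list of single-key dicts. If your criteria come from keyword arguments or user input, a small builder keeps that shape consistent. This is a sketch only: build_metadata_filter is a hypothetical name, and whether the API accepts other filter shapes is not covered by this guide.

```python
def build_metadata_filter(**criteria):
    """Turn keyword arguments into a list of single-key dicts.

    The list-of-dicts shape mirrors the `filter` argument used in the
    example above.
    """
    return [{key: value} for key, value in criteria.items()]


scene_filter = build_metadata_filter(camera_view="road_ahead", action_type="chasing")
print(scene_filter)
# prints [{'camera_view': 'road_ahead'}, {'action_type': 'chasing'}]
```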

See the scene_level_metadata_indexing cookbook for a full example of custom metadata indexing and filtered search.

Working with Results

Get Shots

Access individual result segments:

python
results = video.search("your query")

for shot in results.get_shots():
    print(f"Video: {shot.video_id}")
    print(f"Start: {shot.start:.2f}s")
    print(f"End: {shot.end:.2f}s")
    print(f"Text: {shot.text}")
    print("---")
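
For logging or serialization, it is often easier to work with plain dicts than with shot objects. The sketch below relies only on the video_id, start, end, and text attributes used in the loop above. Both function names are hypothetical helpers, not SDK calls.

```python
def format_timestamp(seconds):
    """Format a number of seconds as MM:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def shots_to_records(shots):
    """Convert shot objects into plain dicts ordered by start time."""
    records = [
        {
            "video_id": shot.video_id,
            "start": shot.start,
            "end": shot.end,
            "span": f"{format_timestamp(shot.start)}-{format_timestamp(shot.end)}",
            "text": shot.text,
        }
        for shot in shots
    ]
    return sorted(records, key=lambda record: record["start"])
```

Calling shots_to_records(results.get_shots()) gives a list that can be passed straight to json.dumps.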

Play Compiled Results

Stream all matching segments as a single compiled video:

python
results = video.search("your query")
stream_url = results.compile()
results.play()  # opens compiled stream in browser

Extract Clips

Generate a stream URL for each result segment:

python
for shot in results.get_shots():
    stream_url = shot.generate_stream()
    print(f"Clip: {stream_url}")

Search across all videos in a collection:

python
coll = conn.get_collection()

# Search across all videos in the collection
results = coll.search(
    query="product demo",
    search_type=SearchType.semantic,
)

for shot in results.get_shots():
    print(f"Video: {shot.video_id} [{shot.start:.1f}s - {shot.end:.1f}s]")

Note: Collection-level search only supports SearchType.semantic. Using SearchType.keyword or SearchType.scene with coll.search() will raise NotImplementedError. For keyword or scene search, use video.search() on individual videos instead.
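
The per-video fallback described in the note can be sketched as a loop over the collection's videos. Assumptions: search_each_video is a hypothetical helper, each video object exposes an id attribute, and the exception class is a parameter so the snippet runs without the SDK (pass InvalidRequestError in real use). Videos that lack the required index may raise a different error, which this sketch deliberately does not swallow.

```python
def search_each_video(videos, query, no_results_error=Exception, **search_kwargs):
    """Run video.search() on each video and collect shots per video id.

    `videos` is any iterable of video objects. Videos with no matches are
    omitted from the returned dict.
    """
    hits = {}
    for video in videos:
        try:
            results = video.search(query=query, **search_kwargs)
        except no_results_error as exc:
            if "No results found" in str(exc):
                continue  # this video has no matches; move on
            raise
        shots = list(results.get_shots())
        if shots:
            hits[video.id] = shots
    return hits
```

For keyword search this might be called as search_each_video(coll.get_videos(), "artificial intelligence", no_results_error=InvalidRequestError, search_type=SearchType.keyword).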

Search + Compile

Index, search, and compile matching segments into a single playable stream:

python
video.index_spoken_words(force=True)
results = video.search(query="your query", search_type=SearchType.semantic)
stream_url = results.compile()
print(stream_url)

Tips

  • Index once, search many times: Indexing is the expensive operation. Once indexed, searches are fast.
  • Combine index types: Index both spoken words and scenes to enable all search types on the same video.
  • Refine queries: Semantic search works best with descriptive, natural language phrases rather than single keywords.
  • Use keyword search for precision: When you need exact term matches, keyword search avoids semantic drift.
  • Handle "No results found": video.search() raises InvalidRequestError when no results match. Always wrap search calls in try/except and treat "No results found" as an empty result set.
  • Filter scene search noise: Semantic scene search can return low-relevance results for vague queries. Use score_threshold=0.3 (or higher) to filter noise.
  • Idempotent indexing: Use index_spoken_words(force=True) so that repeated calls skip videos that are already indexed. index_scenes() has no force parameter, so wrap it in try/except and extract the existing scene_index_id from the error message with re.search(r"id\s+([a-f0-9]+)", str(e)).