sbin/numeric_tree/README.md
This directory contains Python scripts for generating, testing, parsing, and visualizing numeric indexes in RediSearch, specifically designed to test iterator performance improvements and analyze tree structures.
- generate_numeric_trees.py - Generates 3 numeric indexes with 2 fields each, using different value insertion orders to test how insertion patterns affect tree structure and iterator performance.
- benchmark_numeric_tree.py - Benchmarks numeric queries against the generated indexes to evaluate iterator performance across different tree structures.
- parse_numeric_tree.py - Parses RediSearch numeric tree dump files and converts them to JSON format for analysis.
- visualize_numeric_tree.py - Creates interactive visualizations of parsed numeric trees using Plotly.
# Install Python dependencies for all tools
pip install -r requirements.txt
# For visualization tools, also install:
pip install plotly networkx
# Ensure Redis with RediSearch is running (for generation/testing tools)
redis-server --loadmodule /path/to/redisearch.so
# Generate 3 indexes with different insertion orders (10K base docs, sparse size 100)
./generate_numeric_trees.py --docs-per-index 10000 --sparse-size 100
# Generate with smaller dataset for quick testing
./generate_numeric_trees.py --docs-per-index 1000 --sparse-size 50
# Generate with larger sparse size for more extreme sparsing effect
./generate_numeric_trees.py --docs-per-index 5000 --sparse-size 200
# Clean up existing indexes before creating new ones
./generate_numeric_trees.py --docs-per-index 10000 --sparse-size 100 --cleanup
# Run benchmark tests on all 3 indexes
./benchmark_numeric_tree.py --iterations 100
# Run benchmark with specific parameters
./benchmark_numeric_tree.py --iterations 50 --range-size 100
# Run benchmark with custom settings
./benchmark_numeric_tree.py --iterations 200
# Parse a RediSearch numeric tree dump file
./parse_numeric_tree.py dump_numidxtree.txt tree.json
# Fast parsing mode (disable assertions for large files)
./parse_numeric_tree.py dump_numidxtree.txt tree.json --fast
# Create interactive visualization
./visualize_numeric_tree.py tree.json
# Create visualization with custom spacing
./visualize_numeric_tree.py tree.json 3.0
# Show tree information only (no visualization)
./visualize_numeric_tree.py tree.json info
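Once a dump has been converted to JSON, it can be handy to sanity-check the result before opening the visualizer. The sketch below is not part of the shipped tools and makes no assumption about the schema written by parse_numeric_tree.py: it simply walks any nested JSON value and reports how many objects it contains and how deeply they nest. The key names in the inline sample (`left`, `right`, `value`) are hypothetical and exist only to make the example runnable.

```python
import json


def tree_stats(node, depth=0):
    """Return (object_count, max_object_depth) for an arbitrary JSON value."""
    if isinstance(node, dict):
        depth += 1                      # entering one more level of objects
        count, deepest = 1, depth
        children = list(node.values())
    elif isinstance(node, list):
        count, deepest = 0, depth       # lists do not add a nesting level
        children = node
    else:
        return 0, depth                 # scalars and null contribute nothing
    for child in children:
        child_count, child_depth = tree_stats(child, depth)
        count += child_count
        deepest = max(deepest, child_depth)
    return count, deepest


# Hypothetical sample document; the real tree.json layout may differ.
sample = json.loads(
    '{"left": {"left": null, "right": {"value": 1}}, "right": {}}'
)
objects, max_depth = tree_stats(sample)
print(f"objects={objects} max_depth={max_depth}")
```

To try it on real output, replace the inline sample with `json.load(open("tree.json"))`.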
The script creates 3 indexes with identical data but different insertion orders:
- numeric_idx_sequential
- numeric_idx_random
- numeric_idx_sparsed

Each index contains 2 numeric fields:

- price: Primary field with controlled insertion order
- score: Secondary field (price + random variance)

This allows testing:

- @price:[100 200]
- @price:[100 200] @score:[150 250]
- @field:[min max]
- @field1:[min max] | @field2:[min max]
- @field1:[min max] @field2:[min max]

# Generate indexes with different insertion orders
./generate_numeric_trees.py --docs-per-index 10000 --sparse-size 100 --cleanup
# Run benchmark tests to compare performance across insertion orders
./benchmark_numeric_tree.py --iterations 200
# Run benchmark tests on intersection queries
./benchmark_numeric_tree.py --iterations 100
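The query shapes listed earlier (single range, intersection, union) are plain strings, so they are easy to build programmatically. The following is a minimal sketch of how such strings could be assembled; the helper names `range_clause`, `intersection`, and `union` are illustrative and are not taken from benchmark_numeric_tree.py.

```python
def range_clause(field, lo, hi):
    """Build a numeric range clause such as @price:[100 200]."""
    return f"@{field}:[{lo} {hi}]"


def intersection(*clauses):
    """Space-separated clauses must all match (intersection)."""
    return " ".join(clauses)


def union(*clauses):
    """Clauses joined with | match if any one of them matches (union)."""
    return " | ".join(clauses)


price = range_clause("price", 100, 200)
score = range_clause("score", 150, 250)

print(price)                       # single-field range
print(intersection(price, score))  # both ranges must match
print(union(price, score))         # either range may match
```

The resulting strings can be passed as the query argument of `FT.SEARCH` against any of the three indexes.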
# Compare how tree structure affects performance
# Test different sparse sizes for the sparsed index
for sparse_size in 10 50 100 200 500; do
    ./generate_numeric_trees.py --docs-per-index 5000 --sparse-size "$sparse_size" --cleanup
    ./benchmark_numeric_tree.py --iterations 50
done
# Test how insertion order effects scale with dataset size
for docs in 1000 5000 10000 50000; do
    ./generate_numeric_trees.py --docs-per-index "$docs" --sparse-size 100 --cleanup
    ./benchmark_numeric_tree.py --iterations 100
done
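To make the idea of "identical data, different insertion order" concrete, here is a small standalone sketch. It is not the logic of generate_numeric_trees.py. In particular, the meaning of the sparsed order is an assumption: this README does not define it, so the sketch treats `sparse_size` as a stride (insert every `sparse_size`-th value, then come back for the ones skipped). The `variance` parameter of the score field is likewise hypothetical.

```python
import random


def sequential_order(n):
    """Values inserted in ascending order: 0, 1, 2, ..."""
    return list(range(n))


def random_order(n, seed=0):
    """The same values, shuffled with a fixed seed for repeatability."""
    values = list(range(n))
    random.Random(seed).shuffle(values)
    return values


def sparsed_order(n, sparse_size):
    """ASSUMPTION: stride-based order; the real script may differ."""
    return [
        value
        for offset in range(sparse_size)
        for value in range(offset, n, sparse_size)
    ]


def make_doc(price, rng, variance=10.0):
    """score = price + random variance (variance width is hypothetical)."""
    return {"price": price, "score": price + rng.uniform(-variance, variance)}


n = 10
orders = {
    "numeric_idx_sequential": sequential_order(n),
    "numeric_idx_random": random_order(n),
    "numeric_idx_sparsed": sparsed_order(n, sparse_size=3),
}
for name, order in orders.items():
    print(name, order)

rng = random.Random(42)
docs = [make_doc(price, rng) for price in orders["numeric_idx_sparsed"]]
```

Every order contains exactly the same set of values, so any difference measured by the benchmark comes from how the tree was built rather than from the data itself.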
✓ Connected to Redis at localhost:6379
✓ Created index: numeric_idx_1 with field: value_1
Populating index numeric_idx_1 with 10000 documents...
✓ Populated numeric_idx_1 with 10000 documents
Index Summary:
numeric_idx_1: 10000 docs, IDs: 1-999901, Values: 0-1000
numeric_idx_2: 10000 docs, IDs: 103-999823, Values: 1000-2000
UNION Query Statistics:
Total queries: 100
Execution time (ms):
Mean: 2.45
Median: 2.31
Min: 1.89
Max: 4.12
Std Dev: 0.67
Result counts:
Mean: 1247.3
Median: 1198.0
Min: 0
Max: 3456
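Summary figures like those shown above can be reproduced from raw timings with Python's standard `statistics` module. This is a sketch rather than the benchmark's actual code; whether the script reports a sample or a population standard deviation is not stated here, so the sample form (`statistics.stdev`) is an assumption.

```python
import statistics


def summarize(samples):
    """Return mean, median, min, max, and sample std dev for a list of numbers."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        # stdev needs at least two points; fall back to 0.0 for a single sample
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Made-up timings in milliseconds, for illustration only.
timings_ms = [1.0, 2.0, 3.0, 4.0]
summary = summarize(timings_ms)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same helper works for result counts, since it only expects a list of numbers.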
# Connect to remote Redis instance
./generate_numeric_trees.py --redis-host redis.example.com --redis-port 6380 --redis-db 1
./benchmark_numeric_tree.py --redis-host redis.example.com --redis-port 6380 --redis-db 1
# Generate large dataset for stress testing (always creates exactly 3 indexes)
./generate_numeric_trees.py --docs-per-index 100000 --sparse-size 1000
# Smaller datasets for quick iteration
./generate_numeric_trees.py --docs-per-index 1000 --sparse-size 10
These scripts complement the C++ micro-benchmarks in tests/cpptests/micro-benchmarks/.
# Check if Redis is running
redis-cli ping
# Check if RediSearch module is loaded
redis-cli MODULE LIST
# Clean up existing indexes before creating new ones
./generate_numeric_trees.py --cleanup --docs-per-index 1000 --sparse-size 10
# Check Redis memory usage
redis-cli INFO memory
Use the --iterations parameter to increase the sample size.
# Create 3 indexes with different insertion orders
./generate_numeric_trees.py --docs-per-index 10000 --sparse-size 100
# Measure query performance across different tree structures
./benchmark_numeric_tree.py --iterations 100
# From Redis CLI, dump each index's tree structure
redis-cli FT.DEBUG DUMP_NUMIDXTREE numeric_idx_sequential price > sequential_tree.txt
redis-cli FT.DEBUG DUMP_NUMIDXTREE numeric_idx_random price > random_tree.txt
redis-cli FT.DEBUG DUMP_NUMIDXTREE numeric_idx_sparsed price > sparsed_tree.txt
# Parse each tree structure
./parse_numeric_tree.py sequential_tree.txt sequential.json
./parse_numeric_tree.py random_tree.txt random.json
./parse_numeric_tree.py sparsed_tree.txt sparsed.json
# Create interactive visualizations
./visualize_numeric_tree.py sequential.json
./visualize_numeric_tree.py random.json
./visualize_numeric_tree.py sparsed.json
This workflow allows you to:

- generate_numeric_trees.py - Redis API data generation script
- benchmark_numeric_tree.py - Query performance benchmarking script
- parse_numeric_tree.py - Tree dump parser
- visualize_numeric_tree.py - Interactive tree visualizer
- requirements.txt - Python dependencies
- README.md - This documentation