transformers/llm/collect/README.md
This repository contains Python scripts for analyzing MNN models using different callback mechanisms to collect statistics during model inference.
get_max_values.py collects maximum activation values from MNN model layers during inference.
get_thresholds.py calculates sparsity thresholds for MNN model layers based on a target sparsity level.
pip install datasets torch tqdm
You'll also need to build and install pymnn:
cd /path/to/MNN/pymnn/pip_package
python build_deps.py llm
python setup.py install
cd /path/to/MNN/transformers/llm/collect
python get_max_values.py -m <mnn_model_path> [options]
Arguments:
-m, --mnn-path: Path to MNN model config (required)
-d, --eval_dataset: Dataset for evaluation (default: 'wikitext/wikitext-2-raw-v1')
-o, --output-path: Output file path (default: 'max_values.json')
-l, --length: Sample length for processing (default: 512)
Example:
python get_max_values.py -m /path/to/MNN/transformers/llm/export/model/config.json -o ./max_val_test.json
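To make "maximum activation values" concrete, here is a minimal sketch of a per-layer running maximum. It is not the script's or MNN's implementation: the `MaxValueTracker` class and the layer names are hypothetical, and it assumes the statistic of interest is the largest absolute value seen for each layer, which this README does not confirm.

```python
from collections import defaultdict
from typing import Dict, Iterable


class MaxValueTracker:
    """Hypothetical helper: keeps the largest absolute activation seen per layer."""

    def __init__(self) -> None:
        # Layers that have not been seen yet start at 0.0.
        self.max_values: Dict[str, float] = defaultdict(float)

    def update(self, layer_name: str, activations: Iterable[float]) -> None:
        # Largest magnitude in this batch (0.0 if the batch is empty).
        batch_max = max((abs(v) for v in activations), default=0.0)
        # Keep it only if it exceeds the running maximum for this layer.
        if batch_max > self.max_values[layer_name]:
            self.max_values[layer_name] = batch_max


# Illustrative values only; real activations come from model inference.
tracker = MaxValueTracker()
tracker.update("layer0", [0.5, -2.0, 1.25])
tracker.update("layer0", [0.1, 3.5])
tracker.update("layer1", [-0.75])
print(dict(tracker.max_values))  # {'layer0': 3.5, 'layer1': 0.75}
```

A running maximum needs only one number per layer, so it can be updated after every sample without storing the activations themselves.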
cd /path/to/MNN/transformers/llm/collect
python get_thresholds.py -m <mnn_model_path> [options]
Arguments:
-m, --mnn-path: Path to MNN model config (required)
-d, --eval_dataset: Dataset for evaluation (default: 'wikitext/wikitext-2-raw-v1')
-o, --output-path: Output file path (default: 'thresholds.json')
-t, --target-sparsity: Target sparsity level (default: 0.5)
-l, --length: Sample length for processing (default: 512)
Example:
python get_thresholds.py -m /path/to/MNN/transformers/llm/export/model/config.json -l 1024 -t 0.5 -o ./thresholds_0.5.json
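As a rough illustration of how a target sparsity relates to a threshold, the sketch below picks the magnitude below which roughly the target fraction of values fall. This is a generic quantile-style approach under stated assumptions, not necessarily what the script or the MNN callback computes; `sparsity_threshold` is a hypothetical name and the sample values are made up.

```python
import math
from typing import Sequence


def sparsity_threshold(values: Sequence[float], target_sparsity: float) -> float:
    """Return a magnitude t so that about target_sparsity of |values| lie below t."""
    if not 0.0 <= target_sparsity < 1.0:
        raise ValueError("target_sparsity must be in [0, 1)")
    if not values:
        raise ValueError("values must not be empty")
    magnitudes = sorted(abs(v) for v in values)
    # Number of entries that should fall below the threshold.
    num_pruned = math.floor(target_sparsity * len(magnitudes))
    # Entries with magnitude strictly below this value would be zeroed.
    return magnitudes[num_pruned]


# Illustrative values only.
sample = [0.1, -0.4, 0.2, 0.9, -0.05, 0.6, -0.3, 0.8]
t = sparsity_threshold(sample, target_sparsity=0.5)
achieved = sum(abs(v) < t for v in sample) / len(sample)
print(t, achieved)  # 0.4 0.5
```

With ties among the magnitudes the achieved sparsity can fall slightly below the target, which is why the result is best read as an approximation.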
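Both scripts run the model over the evaluation dataset and collect statistics through a callback during inference. The key difference is in the callback configuration: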
enable_max_value_callback to collect maximum activation values
enable_threshold_callback with target sparsity to calculate pruning thresholds
Both scripts generate JSON files containing the collected statistics that can be used for model optimization, pruning, or quantization analysis.
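The generated JSON files can be inspected with a few lines of Python. The sketch below assumes each file is a flat JSON object mapping a layer name to a single number; this README does not document the actual schema, so check a generated file and adjust the parsing if the layout differs. `load_stats` and `top_layers` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple


def load_stats(path: str) -> Dict[str, float]:
    """Load a statistics file, assuming a flat {layer_name: number} JSON object."""
    with Path(path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object in {path}")
    # float() raises if a value is not numeric, i.e. the assumed schema is wrong.
    return {str(name): float(value) for name, value in data.items()}


def top_layers(stats: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k layers with the largest values, largest first."""
    return sorted(stats.items(), key=lambda item: item[1], reverse=True)[:k]
```

For example, `top_layers(load_stats("max_values.json"), k=10)` would list the ten layers with the largest collected values, provided the file matches the assumed layout.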