site/docs/model-audit/scanners.md
ModelAudit includes specialized scanners for different model formats and file types. Each scanner is designed to identify specific security issues relevant to that format.
Use `--list-scanners` to see the scanner IDs supported by your installed ModelAudit version. Then pass IDs or class names to `--scanners` to run only those scanners, or use `--exclude-scanner` to remove scanners from the default set.
```bash
promptfoo scan-model --list-scanners
promptfoo scan-model models/ --scanners pickle,tf_savedmodel
promptfoo scan-model models/ --exclude-scanner weight_distribution
```
The web UI exposes the same catalog in Advanced Scan Options.
File types: .pkl, .pickle, .dill, .bin (when containing pickle data), .pt, .pth, .ckpt
The Pickle Scanner analyzes Python pickle files for security risks, which are common in many ML frameworks. It supports standard pickle files as well as dill-serialized files (an extended pickle format).
Key checks:

- Imports of dangerous modules (e.g., `os`, `subprocess`, `sys`)
- Calls to dangerous functions (e.g., `eval`, `exec`, `system`)

Why it matters: Pickle files are a common serialization format for ML models, but they can execute arbitrary code during unpickling. Attackers can craft malicious pickle files that execute harmful commands when loaded.
File types: .pb files and SavedModel directories
This scanner examines TensorFlow models saved in the SavedModel format.
Key checks:

- Dangerous operations that can execute arbitrary code (e.g., `PyFunc`)

Why it matters: TensorFlow models can contain operations that interact with the filesystem or execute arbitrary code, which could be exploited if a malicious model is loaded.
File types: .tflite
This scanner examines TensorFlow Lite model files, which are optimized for mobile and embedded devices.
Key checks:
Why it matters: While TensorFlow Lite models are generally safer than full TensorFlow models due to their limited operator set, they can still include custom operations or use the Flex delegate to access the full TensorFlow runtime, potentially introducing security risks. Malicious actors could embed harmful code in custom ops or metadata.
File types: .engine, .plan
This scanner examines NVIDIA TensorRT engine files, which are optimized inference engines for NVIDIA GPUs.
Key checks:

- Suspicious file paths (e.g., `/tmp/`, `../`) that might indicate unauthorized access
- Embedded shared libraries (`.so` files) that could contain malicious code
- Code execution patterns (e.g., `exec`, `eval`) that could run arbitrary code

Why it matters: TensorRT engines can contain custom plugins and operations. While generally safer than pickle files, they could be crafted to include malicious plugins or reference unauthorized system resources.
File types: .h5, .hdf5, .keras
This scanner analyzes Keras models stored in HDF5 format.
Key checks:
Why it matters: Keras models with Lambda layers can contain arbitrary Python code that executes when the model is loaded or run. This could be exploited to execute malicious code on the host system.
File types: .keras
This scanner analyzes ZIP-based Keras model files (new .keras format introduced in Keras 3).
Key checks:
Why it matters:
The new .keras ZIP format stores Lambda layers as base64-encoded functions that execute during inference. Malicious actors could embed arbitrary code in these layers or hide executables within the archive structure.
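A minimal sketch of this kind of check, assuming the Keras 3 layout with `config.json` at the archive root (the stand-in archive below is built in memory purely for demonstration):

```python
import io
import json
import zipfile

def find_lambda_layers(data: bytes):
    """Flag Lambda layers in a .keras archive's config.json."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        config = json.loads(zf.read("config.json"))
    layers = config.get("config", {}).get("layers", [])
    return [layer["class_name"] for layer in layers
            if layer.get("class_name") == "Lambda"]

# Build a minimal stand-in archive for demonstration.
buf = io.BytesIO()
cfg = {"config": {"layers": [{"class_name": "Dense"}, {"class_name": "Lambda"}]}}
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("config.json", json.dumps(cfg))

print(find_lambda_layers(buf.getvalue()))  # ['Lambda']
```

A real scanner would go further and decode the Lambda layer's serialized function bytes, but listing the layer classes is the first step.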
File types: .onnx
This scanner examines ONNX (Open Neural Network Exchange) model files for security issues and integrity problems.
Key checks:
Why it matters: ONNX models can reference external data files and custom operators. Malicious actors could exploit these features to include harmful custom operations or manipulate external data references to access unauthorized files on the system.
File types: .xml, .bin (OpenVINO IR format)
This scanner examines Intel OpenVINO Intermediate Representation (IR) model files.
Key checks:
Why it matters: OpenVINO models consist of XML topology files and binary weight files. The XML can contain custom layer definitions or external references that could be exploited to execute malicious code or access unauthorized files.
File types: .pt, .pth
This scanner examines PyTorch model files, which are ZIP archives containing pickled data.
Key checks:
Why it matters: PyTorch models are essentially ZIP archives containing pickled objects, which can include malicious code. The scanner unpacks these archives and applies pickle security checks to the contents.
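A rough sketch of that approach (the member name below is illustrative; real `.pt` archives contain a `data.pkl` entry plus tensor storage files):

```python
import io
import pickle
import pickletools
import zipfile

# Build a stand-in .pt-style archive in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("archive/data.pkl", pickle.dumps({"weight": [1.0, 2.0]}))

# Treat the file as a ZIP, pull out each pickled member, and inspect its
# opcode stream statically -- no unpickling takes place.
with zipfile.ZipFile(buf) as zf:
    for name in zf.namelist():
        if name.endswith(".pkl"):
            opcodes = {op.name for op, _, _ in pickletools.genops(zf.read(name))}
            # REDUCE invokes a callable during unpickling -- a red flag here.
            print(name, "REDUCE" in opcodes)  # archive/data.pkl False
```

A plain dict of tensors produces no `REDUCE` opcode; a payload built with `__reduce__` would.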
File types: .pte, .pt (ExecuTorch archives)
This scanner examines PyTorch ExecuTorch model files designed for mobile and edge deployment.
Key checks:
Why it matters: ExecuTorch models package PyTorch models for edge devices but can still contain pickled data and embedded code. Mobile deployment environments are often resource-constrained and may have limited security monitoring, making them attractive targets.
File types: .gguf, .ggml, .ggmf, .ggjt, .ggla, .ggsa
This scanner validates GGUF (GPT-Generated Unified Format) and GGML model files commonly used for large language models like LLaMA, Alpaca, and other quantized models.
Key checks:
Why it matters: GGUF/GGML files are increasingly popular for distributing large language models. While generally safer than pickle formats, they can still contain malicious metadata or be crafted to cause resource exhaustion attacks. The scanner ensures these files are structurally sound and don't contain hidden threats.
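For example, structural validation can start with the GGUF header, which per the GGUF spec begins with the 4-byte magic `GGUF` followed by a little-endian `uint32` version (the accepted version range below is an assumption reflecting versions known at time of writing):

```python
import struct

def check_gguf_header(data: bytes) -> bool:
    """Validate the GGUF magic bytes and version field."""
    if len(data) < 8 or data[:4] != b"GGUF":
        return False
    (version,) = struct.unpack_from("<I", data, 4)
    return 1 <= version <= 3  # known GGUF versions; adjust as the spec evolves

print(check_gguf_header(b"GGUF" + struct.pack("<I", 3)))  # True
print(check_gguf_header(b"\x00" * 8))                     # False
```

Tensor counts and metadata key/value pairs follow the version field and get the same bounds-checking treatment in a full scanner.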
File types: .joblib
This scanner analyzes joblib serialized files, which are commonly used by ML libraries for model persistence.
Key checks:
Why it matters: Joblib files often contain compressed pickle data, inheriting the same security risks as pickle files. Additionally, malicious actors could craft compression bombs that consume excessive memory or CPU resources when loaded. The scanner provides safe decompression with security limits.
File types: .skops, .pkl (skops format)
This scanner detects known vulnerabilities in scikit-learn models saved with the skops library.
Key checks:
Why it matters: Skops versions before 0.12.0 contain multiple critical vulnerabilities allowing remote code execution through specially crafted sklearn estimators. Attackers can embed malicious callables in pipelines, transformers, or estimator parameters that execute when the model is loaded or used.
File types: .msgpack, .flax, .orbax, .jax
This scanner analyzes Flax/JAX model files serialized in MessagePack format and other JAX-specific formats.
Key checks:
Why it matters: Flax models serialized as msgpack files can potentially contain embedded code or malicious data structures. While MessagePack is generally safer than pickle, it can still be exploited through carefully crafted payloads that target specific deserializer vulnerabilities or cause denial-of-service attacks through resource exhaustion.
File types: .ckpt, .checkpoint, .orbax-checkpoint, .pickle (when in JAX context)
This scanner analyzes JAX checkpoint files in various serialization formats, including Orbax checkpoints and JAX-specific pickle files.
Key checks:

- Experimental host callbacks (e.g., `jax.experimental.host_callback.call`)

Why it matters: JAX checkpoints can contain custom restore functions or experimental callbacks that could be exploited. Orbax checkpoints may include metadata with arbitrary restore functions that execute during model loading.
File types: .npy, .npz
This scanner validates NumPy binary array files for integrity issues and potential security risks.
Key checks:
Why it matters: While NumPy files are generally safer than pickle files, they can still be crafted maliciously. Object arrays can contain arbitrary Python objects (including code), and extremely large arrays can cause denial-of-service attacks. The scanner ensures arrays are safe to load and don't contain hidden threats.
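A quick demonstration of the object-array risk and the standard mitigation: `np.load` with `allow_pickle=False` (NumPy's default) refuses object arrays, because they are stored as embedded pickles:

```python
import io
import numpy as np

# Save an object array -- its payload is pickled inside the .npy file.
buf = io.BytesIO()
np.save(buf, np.array([{"not": "a tensor"}], dtype=object), allow_pickle=True)
buf.seek(0)

try:
    np.load(buf)          # allow_pickle defaults to False
    loaded = True
except ValueError:
    loaded = False

print("blocked:", not loaded)  # blocked: True
```

A scanner can make the same determination statically by parsing the `.npy` header and checking whether the declared dtype contains objects.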
File types: .manifest (with .tar.gz layer references)
This scanner examines OCI (Open Container Initiative) and Docker manifest files that contain embedded model files in compressed layers.
Key checks:

- Scanning of model files within `.tar.gz` layers

Why it matters: Container images are increasingly used to distribute ML models and datasets. These containers can contain multiple layers with various file types, potentially hiding malicious models within what appears to be a legitimate container image. The scanner ensures that all model files within container layers are safe.
File types: .json, .yaml, .yml, .xml, .toml, .config, etc.
This scanner analyzes model configuration files and manifests.
Key checks:
Why it matters: Model configuration files can contain settings that lead to insecure behavior, such as downloading content from untrusted sources, accessing sensitive files, or executing commands.
File types: .txt, .md, .markdown, .rst
This scanner analyzes ML-specific text files like vocabulary lists, README files, and model documentation.
Key checks:
Why it matters: Text files in ML repositories like vocab.txt, labels.txt, and README files should follow expected patterns. Deviations may indicate tampering or hidden data.
File types: .gguf, .json, .yaml, .yml, .jinja, .j2, .template
This scanner detects template injection vulnerabilities in Jinja2 templates embedded in model files and configurations.
Key checks:

- Dangerous template constructs (e.g., `eval`, `exec`, `import`)

Why it matters: Jinja2 templates in GGUF models, tokenizer configs, and deployment files can execute arbitrary code when processed. CVE-2024-34359 affects llama-cpp-python when loading malicious chat templates. Template injection can allow full system compromise.
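A minimal sketch of pattern-based template screening (the pattern list is illustrative, not ModelAudit's actual rule set):

```python
import re

# Patterns typical of server-side template injection (SSTI): reaching for
# Python internals, or for eval/exec. Illustrative, not exhaustive.
SUSPICIOUS = [r"__class__", r"__globals__", r"__import__", r"\beval\b", r"\bexec\b"]

def audit_template(src: str):
    """Return the suspicious patterns found in a template string."""
    return [pattern for pattern in SUSPICIOUS if re.search(pattern, src)]

bad = "{{ ''.__class__.__mro__[1].__subclasses__() }}"
print(audit_template(bad))           # ['__class__']
print(audit_template("{{ name }}"))  # []
```

Templates that trip these checks should be rejected before they ever reach a Jinja2 renderer; sandboxed rendering alone is not a reliable defense.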
File types: README.md, MODEL_CARD.md, METADATA.md, model card files
This scanner analyzes model documentation and metadata files for security concerns.
Key checks:
Why it matters: Model documentation often contains setup instructions, example code, and configuration that may reference malicious resources or expose sensitive information. Attackers can embed malicious download links or credentials in README files that users may execute without careful review.
File types: .bin (raw PyTorch tensor files)
This scanner examines raw PyTorch binary tensor files that contain serialized weight data. It performs binary content scanning to detect various threats.
Key checks:
Why it matters:
While .bin files typically contain raw tensor data, attackers could embed malicious code or executables within them. The scanner performs deep content analysis, including PE executable detection (with DOS stub validation), to surface such threats.
File types: .zip, .npz
This scanner examines ZIP archives and their contents recursively.
Key checks:
Why it matters: ZIP archives are commonly used to distribute models and datasets. Malicious actors can craft ZIP files that exploit extraction vulnerabilities, contain malware, or cause resource exhaustion. This scanner ensures that archives are safe to extract and that their contents don't pose security risks.
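Two of those checks can be sketched with the standard library (the `max_ratio` threshold is an illustrative assumption, not ModelAudit's default):

```python
import io
import zipfile

def audit_zip(data: bytes, max_ratio: float = 100.0):
    """Flag path traversal and implausible compression ratios, pre-extraction."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.filename.startswith("/") or ".." in info.filename:
                findings.append(f"path traversal: {info.filename}")
            if info.compress_size and info.file_size / info.compress_size > max_ratio:
                findings.append(f"suspicious compression ratio: {info.filename}")
    return findings

# Build a malicious-looking archive in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("../../etc/passwd", "overwrite me")

print(audit_zip(buf.getvalue()))  # ['path traversal: ../../etc/passwd']
```

Both checks read only the central directory, so nothing is extracted before the archive is judged safe.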
File types: .tar, .tar.gz, .tgz, .tar.bz2
This scanner examines TAR archives and their contents recursively.
Key checks:
Why it matters: TAR archives are commonly used in Linux environments and container images to distribute models. They can contain symlinks that point outside the extraction directory, potentially allowing access to sensitive files. TAR bombs can exhaust system resources during extraction.
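The symlink and path checks can be sketched like this (illustrative, not ModelAudit's implementation):

```python
import io
import tarfile

def audit_tar(data: bytes):
    """Flag link members and names that escape the extraction root."""
    findings = []
    with tarfile.open(fileobj=io.BytesIO(data)) as tf:
        for member in tf.getmembers():
            if member.issym() or member.islnk():
                findings.append(f"link member: {member.name} -> {member.linkname}")
            if member.name.startswith("/") or ".." in member.name:
                findings.append(f"escapes extraction root: {member.name}")
    return findings

# Build an archive containing a symlink to a sensitive file, for demonstration.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    link = tarfile.TarInfo("model.bin")
    link.type = tarfile.SYMTYPE
    link.linkname = "/etc/passwd"  # resolves outside the extraction directory
    tf.addfile(link)

print(audit_tar(buf.getvalue()))  # ['link member: model.bin -> /etc/passwd']
```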
File types: .7z
This scanner examines 7-Zip archives and their contents.
Key checks:
Why it matters: 7-Zip archives offer high compression ratios, making them attractive for compression bomb attacks. They support encryption, which can hide malicious content from initial inspection. The scanner safely extracts and analyzes 7z files with appropriate security limits.
File types: .pt, .pth, .h5, .keras, .hdf5, .pb, .onnx, .safetensors
This scanner analyzes neural network weight distributions to detect potential backdoors or trojaned models by identifying statistical anomalies.
Key checks:
Configuration options:

- `z_score_threshold`: Controls sensitivity for outlier detection (default: 3.0; higher for LLMs)
- `cosine_similarity_threshold`: Minimum similarity required between neurons (default: 0.7)
- `weight_magnitude_threshold`: Threshold for extreme weight detection (default: 3.0 standard deviations)
- `llm_vocab_threshold`: Vocabulary size threshold used to identify LLM models (default: 10,000)
- `enable_llm_checks`: Whether to perform checks on large language models (default: false)

Why it matters: Backdoored or trojaned models often contain specific neurons that activate on trigger inputs. These malicious neurons typically have weight patterns that are statistically anomalous compared to benign neurons. By analyzing weight distributions, this scanner can detect models that have been tampered with to include hidden behaviors.
Special handling for LLMs: Large language models with vocabulary layers (>10,000 outputs) use more conservative thresholds due to their naturally varied weight distributions. LLM checking is disabled by default but can be enabled via configuration.
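A minimal sketch of the z-score idea on a single weight matrix (one row per neuron; `z_score_threshold` as described above; the norm-based statistic is an illustrative simplification):

```python
import numpy as np

def outlier_neurons(weights: np.ndarray, z_score_threshold: float = 3.0):
    """Indices of neurons whose weight norm is a statistical outlier."""
    norms = np.linalg.norm(weights, axis=1)        # one norm per neuron (row)
    z = (norms - norms.mean()) / norms.std()
    return np.flatnonzero(np.abs(z) > z_score_threshold)

rng = np.random.default_rng(0)
w = rng.normal(size=(100, 64))
w[7] *= 20.0  # plant one anomalous neuron

print(outlier_neurons(w))  # [7]
```

Real trojan detection combines several such statistics (magnitude, cosine similarity between neurons, activation patterns), but each reduces to flagging rows that deviate from the layer's distribution.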
File types: .safetensors, .bin (when containing SafeTensors data)
This scanner examines SafeTensors format files, which are designed to be a safer alternative to pickle files.
Key checks:
Why it matters: While SafeTensors is designed to be safer than pickle files, the metadata section can still contain malicious content. Attackers might try to exploit parsers or include encoded payloads in the metadata. The scanner ensures the format integrity and metadata safety.
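A sketch of the header-bounds check, following the SafeTensors layout (an 8-byte little-endian header length, then a JSON header; the size limit below is an illustrative assumption):

```python
import json
import struct

def read_safetensors_header(data: bytes, max_header_bytes: int = 100_000_000):
    """Parse the SafeTensors JSON header after bounds-checking its length."""
    (n,) = struct.unpack_from("<Q", data, 0)
    if n > max_header_bytes or 8 + n > len(data):
        raise ValueError("implausible header length")
    return json.loads(data[8:8 + n])

# Build a minimal valid file prefix for demonstration.
header_json = json.dumps({"__metadata__": {"format": "pt"}}).encode()
blob = struct.pack("<Q", len(header_json)) + header_json

print(read_safetensors_header(blob))  # {'__metadata__': {'format': 'pt'}}
```

With the header parsed safely, a scanner can then inspect the `__metadata__` section for encoded payloads or unexpected keys.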
File types: .pdmodel, .pdiparams
This scanner examines PaddlePaddle model files, including model definitions and parameter files.
Key checks:
Why it matters: PaddlePaddle models can contain custom operators and may use pickle serialization internally. Malicious actors could embed harmful code in model definitions or exploit custom operators to execute unauthorized operations.
File types: .bst, .model, .json, .ubj
This scanner examines XGBoost model files in binary, JSON, and UBJSON formats.
Key checks:
Why it matters: XGBoost models can include custom Python functions for objectives, metrics, and callbacks. In binary format, these are often pickled, inheriting pickle security risks. JSON formats can contain embedded code strings or references to malicious external resources.
File types: .pmml
This scanner performs security checks on PMML (Predictive Model Markup Language) files to detect potential XML External Entity (XXE) attacks, malicious scripts, and suspicious external references.
Key checks:

- `<!DOCTYPE`, `<!ENTITY`, `<!ELEMENT`, and `<!ATTLIST` declarations that could enable XML External Entity attacks
- Embedded `<script>` elements, `eval()`, `exec()`, system commands, and imports
- `<Extension>` elements, which can contain arbitrary content

Security features:
Why it matters: PMML files are XML-based and can be exploited through XML vulnerabilities like XXE attacks. Extension elements can contain arbitrary content that might execute scripts or access external resources. The scanner ensures PMML files don't contain hidden security threats while maintaining model functionality.
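A string-level pre-check along those lines might look like the following (the pattern list is illustrative; a real scanner also parses the XML with external entities disabled):

```python
# Substrings a PMML pre-check might flag before any XML parsing happens.
RISKY = ["<!DOCTYPE", "<!ENTITY", "<script", "eval(", "exec("]

def audit_pmml(text: str):
    """Return the risky markers present in a PMML document."""
    return [pattern for pattern in RISKY if pattern in text]

doc = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE PMML [<!ENTITY x SYSTEM "file:///etc/passwd">]>'
    "<PMML/>"
)
print(audit_pmml(doc))  # ['<!DOCTYPE', '<!ENTITY']
```

Checking before parsing matters because a vulnerable XML parser can resolve an external entity as a side effect of parsing itself.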
ModelAudit includes comprehensive file format detection for ambiguous file extensions, particularly .bin files, which can contain different types of model data:

- `.bin` files without other recognizable signatures

Detection Features:
This allows ModelAudit to automatically apply the correct scanner based on the actual file content, not just the extension. When a .bin file contains SafeTensors data, the SafeTensors scanner is automatically applied instead of assuming it's a raw binary file.
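A simplified sketch of signature-based dispatch (the magic bytes are real format signatures; the dispatch logic itself is illustrative):

```python
import struct

def sniff_bin(data: bytes) -> str:
    """Guess the real format of a .bin file from leading bytes."""
    if data[:4] == b"PK\x03\x04":
        return "zip"          # e.g. a PyTorch archive
    if len(data) >= 2 and data[:1] == b"\x80" and 2 <= data[1] <= 5:
        return "pickle"       # pickle protocol marker
    if len(data) >= 9 and data[8:9] == b"{":
        (n,) = struct.unpack_from("<Q", data, 0)
        if 8 + n <= len(data):
            return "safetensors"  # length-prefixed JSON header
    return "unknown"

print(sniff_bin(b"PK\x03\x04" + b"\x00" * 8))   # zip
print(sniff_bin(b"\x80\x04\x95\x00"))           # pickle
print(sniff_bin(struct.pack("<Q", 2) + b"{}"))  # safetensors
```

The point is that the file's extension is never consulted; only its content decides which scanner runs.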
ModelAudit includes license detection across all file formats to help organizations identify legal obligations before deployment.
Key features:
Example warnings:
```
⚠️ AGPL license detected: Component is under AGPL-3.0
   This may require source code disclosure if used in network services

🚨 Non-commercial license detected: Creative Commons NonCommercial
   This component cannot be used for commercial purposes
```
Generate SBOM:
```bash
promptfoo scan-model ./models/ --sbom model-sbom.json
```
The SBOM includes component information, license metadata, risk scores, and copyright details in CycloneDX format.
Why it matters: AI/ML projects often combine components with different licenses. AGPL requires source disclosure for network services, non-commercial licenses block commercial use, and unlicensed datasets create legal risks.
ModelAudit includes comprehensive detection of network communication capabilities that could be used for data exfiltration or command & control:
Detection capabilities:
Why it matters: Malicious models could contain embedded network communication code to:
ModelAudit scans for embedded credentials and sensitive information:
Detection patterns:
Why it matters: Credentials embedded in models could:
ModelAudit detects Just-In-Time compilation and script execution patterns:
Detection capabilities:
Why it matters: JIT-compiled code can:
ModelAudit can scan models directly from HuggingFace URLs without manual downloading. When a HuggingFace URL is provided, ModelAudit:

- Uses the `huggingface-hub` library to download all model files to a temporary directory

Supported URL formats:
- `https://huggingface.co/user/model`
- `https://hf.co/user/model`
- `hf://user/model`

This feature requires the `huggingface-hub` package to be installed.