CI/CD Integration


ModelAudit integrates into CI/CD pipelines to automatically scan ML model files for security vulnerabilities before deployment.

Exit Codes

ModelAudit uses specific exit codes for CI/CD automation:

  • 0: No security issues found ✅
  • 1: Security issues detected (warnings or critical findings) 🟡
  • 2: Scan errors (file access, installation, timeouts) 🔴

In CI/CD pipelines, exit code 1 indicates findings that should be reviewed. Only exit code 2 represents actual scan failures.
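
As a minimal sketch of how a pipeline step might act on these exit codes, the helper below labels the code and fails only when the scan itself errored. The function names `classify_exit_code` and `run_scan_lenient` are hypothetical and not part of promptfoo or ModelAudit; whether findings should block a job is a policy choice, and this sketch assumes they should not.

```shell
#!/usr/bin/env bash
# Sketch only: classify_exit_code and run_scan_lenient are hypothetical
# helper names, not part of promptfoo or ModelAudit.

# Map a scan exit code to a short label, following the list above.
classify_exit_code() {
  case "$1" in
    0) echo "clean" ;;
    1) echo "findings" ;;
    2) echo "scan-error" ;;
    *) echo "unknown" ;;
  esac
}

# Run the given scan command, print the label, and succeed unless the scan
# itself failed. Findings (exit code 1) are reported but do not fail the step.
run_scan_lenient() {
  local status=0
  "$@" || status=$?
  echo "Scan result: $(classify_exit_code "$status") (exit code $status)"
  [ "$status" -le 1 ]
}

# Example (not executed here):
#   run_scan_lenient promptfoo scan-model models/ --format json --output results.json
```

A production gate would do the opposite and treat any non-zero exit code as blocking.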

GitHub Actions

Scan Changed Model Files

Scan model files modified in pull requests:

yaml
name: Model Security Scan

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  scan-models:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install Promptfoo and ModelAudit
        run: |
          npm install -g promptfoo
          pip install modelaudit

      - name: Get changed model files
        id: changed-files
        run: |
          MODEL_EXTENSIONS='pkl|pickle|dill|pth|pt|ckpt|checkpoint|orbax-checkpoint|h5|hdf5|keras|onnx|pb|tflite|safetensors|gguf|ggml|ggmf|ggjt|ggla|ggsa|bin|engine|plan|msgpack|flax|orbax|jax|joblib|skops|npy|npz|pmml|zip|tar|7z|json|yaml|yml|xml|toml|config|jinja|j2|template|bst|model|ubj'
          CHANGED=$(git diff --name-only --diff-filter=ACM \
            ${{ github.event.pull_request.base.sha }} ${{ github.sha }} | \
            grep -Ei "\.(${MODEL_EXTENSIONS})$" || true)

          if [ -z "$CHANGED" ]; then
            echo "has_changes=false" >> $GITHUB_OUTPUT
          else
            echo "$CHANGED"
            echo "has_changes=true" >> $GITHUB_OUTPUT
            echo "$CHANGED" > changed_files.txt
          fi

      - name: Scan changed models
        if: steps.changed-files.outputs.has_changes == 'true'
        run: |
          mkdir -p scan_results
          while IFS= read -r file; do
            if [ -f "$file" ]; then
              echo "Scanning: $file"
              report_id=$(printf '%s' "$file" | sha256sum | cut -d ' ' -f1)
              promptfoo scan-model "$file" \
                --format json \
                --output "scan_results/${report_id}.json" || true
            fi
          done < changed_files.txt

      - name: Upload scan results
        if: steps.changed-files.outputs.has_changes == 'true'
        uses: actions/upload-artifact@v4
        with:
          name: model-scan-results
          path: scan_results/

      - name: Check for critical issues
        if: steps.changed-files.outputs.has_changes == 'true'
        run: |
          jq -es '
            all(.[];
              type == "object" and
              (.issues | type == "array") and
              (.checks | type == "array") and
              (.has_errors | type == "boolean") and
              (.failed_checks | type == "number") and
              (.has_errors == false) and
              (.failed_checks == 0) and
              ([.checks[]? | select(.status == "failed")] | length == 0) and
              ([.issues[]? | select(.severity == "critical" or .severity == "error")] | length == 0)
            )
          ' scan_results/*.json >/dev/null

          # Count individual warning-level issues across all reports
          WARNING_COUNT=$(jq -es '[.[] | .issues[]? | select(.severity == "warning")] | length' scan_results/*.json)

          if [ "$WARNING_COUNT" -gt 0 ]; then
            echo "⚠️  Found $WARNING_COUNT warning(s)"
          else
            echo "✅ No security issues detected"
          fi
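
The two shell idioms used in the workflow above, filtering a list of changed paths by extension and deriving a report filename from a hash of the path, can be tried locally. This is a sketch under assumptions: the extension list is a shortened, illustrative subset, and `filter_model_files` and `report_name` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Shortened, illustrative subset of the extension list used in the workflow.
MODEL_EXTENSIONS='pkl|pth|onnx|safetensors'

# Keep only the paths (one per line on stdin) that end in a model extension.
# The match is case-insensitive; "|| true" keeps an empty result from
# counting as a failure.
filter_model_files() {
  grep -Ei "\.(${MODEL_EXTENSIONS})$" || true
}

# Derive a report name from the SHA-256 of the full path, so two files with
# the same basename in different directories get different reports.
report_name() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f1
}

# Example (not executed here):
#   git diff --name-only main...HEAD | filter_model_files
```

Hashing the path rather than using the basename also avoids having to escape slashes or spaces in report filenames.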

Upload to GitHub Advanced Security

Use SARIF format to integrate with GitHub Code Scanning:

yaml
name: Model Security Scan with SARIF

on:
  push:
    branches: [main]
    paths:
      - '**/*.pkl'
      - '**/*.pth'
      - '**/*.h5'
      - '**/*.onnx'
  pull_request:
    branches: [main]

jobs:
  scan-models:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install tools
        run: |
          npm install -g promptfoo
          pip install modelaudit

      - name: Scan models directory
        run: |
          promptfoo scan-model models/ \
            --no-write \
            --format sarif \
            --output modelaudit.sarif

      - name: Upload SARIF to GitHub
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: modelaudit.sarif
          category: model-security

Scheduled Security Scans

Run periodic scans on all models:

yaml
name: Weekly Model Security Audit

on:
  schedule:
    - cron: '0 2 * * 0' # Sundays at 2 AM UTC
  workflow_dispatch:

jobs:
  comprehensive-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write # required by the issue-creation step below

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install tools
        run: |
          npm install -g promptfoo
          pip install modelaudit[all]

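      - name: Comprehensive model scan
        run: |
          # Do not fail this step on findings (exit code 1); the check step
          # below inspects the JSON report and fails closed if it is missing
          # or not clean, which lets the issue-creation step run.
          promptfoo scan-model models/ \
            --format json \
            --output scan_results.json \
            --strict \
            --sbom sbom.json || true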

      - name: Generate SBOM
        run: |
          echo "Software Bill of Materials generated: sbom.json"

      - name: Check for critical issues
        id: check-issues
        run: |
          if ! jq -e '
            type == "object" and
            (.issues | type == "array") and
            (.checks | type == "array") and
            (.has_errors | type == "boolean") and
            (.failed_checks | type == "number") and
            (.has_errors == false) and
            (.failed_checks == 0) and
            ([.checks[]? | select(.status == "failed")] | length == 0) and
            ([.issues[]? | select(.severity == "critical" or .severity == "error")] | length == 0)
          ' scan_results.json >/dev/null; then
            echo "critical=true" >> $GITHUB_OUTPUT
            exit 1
          fi

      - name: Create issue if scan fails closed
        if: failure() && steps.check-issues.outputs.critical == 'true'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const results = JSON.parse(fs.readFileSync('scan_results.json', 'utf8'));

            const issues = Array.isArray(results.issues) ? results.issues : [];
            const criticalIssues = issues
                .filter(i => i.severity === 'critical' || i.severity === 'error')
                .map(i => `- ${i.message}`)
                .join('\n');
            const failedCheckCount = results.failed_checks || 0;
            const findingsSummary = criticalIssues || `- Failed checks: ${failedCheckCount}`;

            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: '🚨 Critical Model Security Issues Detected',
              body: `Weekly security scan found non-clean results:\n\n${findingsSummary}`,
              labels: ['security', 'critical', 'model-audit']
            });

      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: weekly-audit-results
          path: |
            scan_results.json
            sbom.json

Strict Mode for Production

Block deployments on any security findings:

yaml
name: Production Model Validation

on:
  push:
    branches: [main]
    paths:
      - 'models/production/**'

jobs:
  strict-validation:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install tools
        run: |
          npm install -g promptfoo
          pip install modelaudit[all]

      - name: Strict security scan
        run: |
          # Exit code 1 means findings (warnings or critical issues) and
          # exit code 2 means scan errors; any non-zero code blocks deployment.
          # The check must be in the same step as the scan, because $? does
          # not carry over between steps.
          if ! promptfoo scan-model models/production/ \
            --strict \
            --format json \
            --output results.json; then
            echo "❌ Security scan found issues - deployment blocked"
            exit 1
          fi

GitLab CI

Example .gitlab-ci.yml:

yaml
model-security-scan:
  stage: test
  image: node:20

  before_script:
    - apt-get update && apt-get install -y python3 python3-pip
    - npm install -g promptfoo
    - pip3 install modelaudit

  script:
    - |
      MODEL_EXTENSIONS='pkl|pickle|dill|pth|pt|ckpt|checkpoint|orbax-checkpoint|h5|hdf5|keras|onnx|pb|tflite|safetensors|gguf|ggml|ggmf|ggjt|ggla|ggsa|bin|engine|plan|msgpack|flax|orbax|jax|joblib|skops|npy|npz|pmml|zip|tar|7z|json|yaml|yml|xml|toml|config|jinja|j2|template|bst|model|ubj'
      CHANGED=$(git diff --name-only --diff-filter=ACM $CI_COMMIT_BEFORE_SHA $CI_COMMIT_SHA | \
        grep -Ei "\.(${MODEL_EXTENSIONS})$" || true)

      if [ -n "$CHANGED" ]; then
        echo "$CHANGED" | while read -r file; do
          if [ -f "$file" ]; then
            promptfoo scan-model "$file" --format json >> scan_results.json
          fi
        done
      else
        echo "No model files changed"
      fi

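  artifacts:
    # Keep the JSON report even when the job fails. It is not a JUnit XML
    # report, so it is stored as a plain artifact rather than under reports:junit.
    when: always
    paths:
      - scan_results.json
    expire_in: 30 days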

  only:
    - merge_requests
    - main

Jenkins

Example Jenkinsfile:

groovy
pipeline {
    agent any

    stages {
        stage('Setup') {
            steps {
                sh '''
                    npm install -g promptfoo
                    pip install modelaudit
                '''
            }
        }

        stage('Scan Models') {
            steps {
                script {
                    def changed = sh(
                        script: '''
                            MODEL_EXTENSIONS='pkl|pickle|dill|pth|pt|ckpt|checkpoint|orbax-checkpoint|h5|hdf5|keras|onnx|pb|tflite|safetensors|gguf|ggml|ggmf|ggjt|ggla|ggsa|bin|engine|plan|msgpack|flax|orbax|jax|joblib|skops|npy|npz|pmml|zip|tar|7z|json|yaml|yml|xml|toml|config|jinja|j2|template|bst|model|ubj'
                            git diff --name-only HEAD~1 HEAD | \
                            grep -Ei "\\.(${MODEL_EXTENSIONS})$" || true
                        ''',
                        returnStdout: true
                    ).trim()

                    if (changed) {
                        changed.split('\n').each { file ->
                            def reportFile = "scan_${file.bytes.encodeHex().toString()}.json"
                            withEnv(["MODEL_FILE=${file}", "REPORT_FILE=${reportFile}"]) {
                                sh '''
                                    promptfoo scan-model "$MODEL_FILE" \
                                      --format json \
                                      --output "$REPORT_FILE"
                                '''
                            }
                        }

                        // Check for critical issues
                        def critical = sh(
                            script: '''
                                jq -es '
                                  all(.[];
                                    type == "object" and
                                    (.issues | type == "array") and
                                    (.checks | type == "array") and
                                    (.has_errors | type == "boolean") and
                                    (.failed_checks | type == "number") and
                                    (.has_errors == false) and
                                    (.failed_checks == 0) and
                                    ([.checks[]? | select(.status == "failed")] | length == 0) and
                                    ([.issues[]? | select(.severity == "critical" or .severity == "error")] | length == 0)
                                  )
                                ' scan_*.json >/dev/null
                            ''',
                            returnStatus: true
                        )

                        if (critical != 0) {
                            error("Found model scan failures or critical security issues")
                        }
                    }
                }
            }
        }
    }

    post {
        always {
            archiveArtifacts artifacts: 'scan_*.json', allowEmptyArchive: true
        }
    }
}

CircleCI

Example .circleci/config.yml:

yaml
version: 2.1

jobs:
  model-scan:
    docker:
      - image: cimg/node:20.0

    steps:
      - checkout

      - run:
          name: Install tools
          command: |
            npm install -g promptfoo
            pip install modelaudit

      - run:
          name: Scan changed models
          command: |
            MODEL_EXTENSIONS='pkl|pickle|dill|pth|pt|ckpt|checkpoint|orbax-checkpoint|h5|hdf5|keras|onnx|pb|tflite|safetensors|gguf|ggml|ggmf|ggjt|ggla|ggsa|bin|engine|plan|msgpack|flax|orbax|jax|joblib|skops|npy|npz|pmml|zip|tar|7z|json|yaml|yml|xml|toml|config|jinja|j2|template|bst|model|ubj'
            CHANGED=$(git diff --name-only origin/main...HEAD | \
              grep -Ei "\.(${MODEL_EXTENSIONS})$" || true)

            if [ -n "$CHANGED" ]; then
              echo "$CHANGED" | while read -r file; do
                if [ -f "$file" ]; then
                  promptfoo scan-model "$file" --format json >> scan_results.json
                fi
              done
            fi

      - store_artifacts:
          path: scan_results.json
          destination: model-scan-results

workflows:
  security:
    jobs:
      - model-scan:
          filters:
            branches:
              only:
                - main
                - develop

Best Practices

Scan Strategy

  • Pull Requests: Scan only changed files for fast feedback
  • Main Branch: Run comprehensive scans on merge
  • Scheduled Scans: Weekly or daily full audits
  • Production Gate: Use --strict mode to block deployments
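
One way to encode this strategy in a shared script is to pick the scan scope from the CI trigger. The sketch below assumes GitHub Actions event names (as exposed in `GITHUB_EVENT_NAME`); the function name `scan_plan` and the scope labels it prints are hypothetical, and using the `deployment` event as the production gate is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# scan_plan EVENT: print the scan scope suggested for a CI trigger.
# The scope labels are illustrative; a real pipeline would map each one to
# its own promptfoo scan-model invocation.
scan_plan() {
  case "$1" in
    pull_request) echo "changed-files" ;; # fast feedback on modified models
    push)         echo "full-scan" ;;     # comprehensive scan on merge
    schedule)     echo "full-audit" ;;    # weekly or daily audit
    deployment)   echo "strict-gate" ;;   # block on any finding (--strict)
    *)            echo "full-scan" ;;     # default to the broader scope
  esac
}

# Example (not executed here):
#   scan_plan "${GITHUB_EVENT_NAME:-push}"
```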

Performance

yaml
# Cache dependencies
- name: Cache npm and pip
  uses: actions/cache@v3
  with:
    path: |
      ~/.npm
      ~/.cache/pip
    key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json', '**/requirements.txt') }}

# Parallel scanning
- name: Parallel scan
  run: |
    find models/ -name "*.pkl" -print0 | \
      xargs -0 -P 4 -I {} promptfoo scan-model {} --format json --output {}.json
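
The parallel pattern above can be wrapped in a small function to try it with a stand-in command before wiring in the scanner. This is a sketch: `scan_in_parallel` is a hypothetical helper name, and it appends each file path as the last argument of the command instead of using `-I {}`.

```shell
#!/usr/bin/env bash
# scan_in_parallel DIR JOBS COMMAND...
# Runs COMMAND once per *.pkl file under DIR, appending the file path as the
# last argument, with up to JOBS invocations running at a time.
scan_in_parallel() {
  local dir="$1" jobs="$2"
  shift 2
  find "$dir" -name '*.pkl' -print0 | xargs -0 -n 1 -P "$jobs" "$@"
}

# Example (not executed here):
#   scan_in_parallel models/ 4 promptfoo scan-model
```

With GNU xargs, the overall exit status is 123 when any invocation exits with a status from 1 to 125, so the step's exit code alone no longer distinguishes findings (1) from scan errors (2). The per-file JSON reports remain the place to tell them apart.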

Security

  • Store credentials as encrypted secrets
  • Use read-only tokens when possible
  • Rotate credentials regularly
  • Audit access to scan results

Timeout Configuration

For large models (8GB+):

yaml
- name: Scan large model
  run: |
    promptfoo scan-model large_model.bin \
      --timeout 1800 \
      --verbose \
      --format json \
      --output results.json

Retry Logic

yaml
- name: Scan with retry
  uses: nick-fields/retry@v2
  with:
    timeout_minutes: 10
    max_attempts: 3
    command: promptfoo scan-model models/ --format json --output results.json
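
Retrying helps with transient scan errors, but findings are deterministic, so re-running a scan that exited with code 1 only costs time. A minimal shell sketch that retries only on exit code 2 might look like the following; `retry_scan` is a hypothetical helper, not a promptfoo feature, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# retry_scan MAX_ATTEMPTS DELAY_SECONDS COMMAND...
# Re-runs COMMAND only when it exits with 2 (scan error). Exit codes 0 (clean)
# and 1 (findings) are treated as final results and returned immediately.
retry_scan() {
  local max_attempts="$1" delay="$2" attempt=1 status
  shift 2
  while :; do
    status=0
    "$@" || status=$?
    if [ "$status" -ne 2 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$status"
    fi
    echo "Scan error (exit 2) on attempt $attempt/$max_attempts; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example (not executed here):
#   retry_scan 3 30 promptfoo scan-model models/ --format json --output results.json
```

The function returns the last exit code unchanged, so the surrounding step can still apply its own policy for findings.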

Notifications

yaml
- name: Notify on critical issues
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": "🚨 Critical model security issues detected!",
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*Model Security Scan Failed*\nRepository: ${{ github.repository }}\n<${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|View Details>"
            }
          }
        ]
      }
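  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
    # Needed when the secret is an incoming-webhook URL and the payload uses blocks
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK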

Troubleshooting

Verbose Output

bash
promptfoo scan-model models/ --verbose

File Size Limits

bash
# Set maximum file size
promptfoo scan-model models/ --max-size 1GB

Dry Run

bash
# Preview scan without processing
promptfoo scan-model models/ --dry-run

Next Steps