internal/ai/face/README.md
Last Updated: March 3, 2026
This document captures the current state of PhotoPrism's face detection and embedding pipeline following the October 2025 optimizations. It should be used as the canonical reference when assessing detection quality, tuning configuration, or integrating downstream tooling that depends on FaceNet embeddings.
Key changes:
Embedding.Dist and EmbeddingsMidpoint).
TODO: Persist detector provenance in FaceSrc (e.g., use entity.SrcONNX for SCRFD detections) so hybrid libraries can toggle background filtering per embedding source when upgrading from Pigo.
PhotoPrism now supports two interchangeable detection engines:
FACE_ENGINE=auto and the bundled SCRFD model is present (the prebuilt runtime targets glibc ≥ 2.27 on x86_64/arm64). Operators can switch at runtime via photoprism --face-engine=<auto|pigo|onnx> or photoprism faces reset --engine=<auto|pigo|onnx> for a full re-index.
Runtime selection lives in Config.FaceEngine(); auto resolves to ONNX when the SCRFD assets are available, otherwise Pigo. Scheduling is controlled by the face model entry in vision.yml: Config.FaceEngineRunType() simply forwards to vision.Config.RunType(ModelTypeFace) and returns never if no detector is configured. This keeps face detection aligned with embedding generation so both always run together.
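The auto fallback described above can be sketched as follows. This is a minimal illustration, not the code behind Config.FaceEngine(): the helper names resolveEngine and fileExists, the treatment of unrecognized values as auto, and the model path are all assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// Engine names mirror the values accepted by FACE_ENGINE.
const (
	EngineAuto = "auto"
	EnginePigo = "pigo"
	EngineONNX = "onnx"
)

// fileExists reports whether path refers to an existing regular file.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

// resolveEngine is a hypothetical helper. Explicit choices are returned
// unchanged; "auto" (and, as an assumption, any unrecognized value)
// resolves to ONNX only when the model is available, otherwise Pigo.
func resolveEngine(configured string, modelAvailable bool) string {
	switch configured {
	case EnginePigo, EngineONNX:
		return configured
	}
	if modelAvailable {
		return EngineONNX
	}
	return EnginePigo
}

func main() {
	// Placeholder path for illustration, not the real asset location.
	modelPath := "/nonexistent/scrfd.onnx"
	fmt.Println(resolveEngine(EngineAuto, fileExists(modelPath)))
}
```

Passing the availability check in as a boolean keeps the decision itself free of filesystem access, which makes it easy to test.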
FACE_ANGLE option.
--face-angle=<rad> (repeatable).
FACE_ANGLE (comma-separated list).
Config.FaceAngles().
face.DetectionAngles, so runtime overrides do not mutate the global defaults.
face.QualityThreshold was flattened for better small-face recall:
OverlapThresholdFloor = 41). Tests rely on that value (e.g., Markers.Contains/SameFace).
FACE_ANGLE) do not affect landmark estimation, which continues to run at 0° to match the cascade assumptions.
FaceNet embeddings are generated through TensorFlow bindings that allocate tensors in C memory. Those allocations are released by Go GC finalizers, so long-running indexing jobs can show steadily rising RSS even when the Go heap stays small. To keep memory bounded during extended face indexing runs, PhotoPrism now triggers periodic garbage collection and returns freed C-allocated tensor buffers to the OS. You can tune this behavior with PHOTOPRISM_TF_GC_EVERY (default 200; set to 0 to disable). Lower values reduce peak RSS but increase GC overhead and can slow indexing, so keep the default unless memory pressure is severe.
All embeddings, regardless of origin, are normalized to unit length (‖x‖₂ = 1):
- NewEmbedding normalizes the raw float32 inference output.
- EmbeddingsMidpoint normalizes each contributor, averages component-wise, and renormalizes the centroid.
- UnmarshalEmbedding and UnmarshalEmbeddings normalize data when loading from persisted JSON.
- photoprism faces audit --fix re-normalizes persisted embeddings, rekeys face IDs, and re-links markers (ID + FaceDist) so historical data adopts the canonical unit-length vectors.
Faces.Match pre-filters matchable clusters, keeps an in-memory veto list for freshly cleared markers, and caches embeddings to avoid redundant distance checks; BenchmarkSelectBestFace (1024 faces) now reports a bucket size of ~16 candidates out of 1024 (≈98 % fewer distance evaluations) at ≈0.55 ms/op with zero allocations.
Samples, ClusterRadius) from the latest matches via Face.UpdateMatchStats, avoiding stale radii during optimize loops. The radius is capped at 0.42 so automatic matches accept new embeddings up to ClusterRadius + MatchDist (≈0.88) away from the centroid.
PHOTOPRISM_FACE_MATCH_CHILDREN=true (or the FaceMatchChildren option) to include children, and PHOTOPRISM_FACE_MATCH_BACKGROUND=true to include background samples; both default to false so operators explicitly choose when these categories participate.
BenchmarkClusterMaterialize reports ~14.8 µs/op with 64 allocations (≈56 KB) versus the legacy ~29.8 µs/op with 384 allocations (≈105 KB).
Unit-length normalization guarantees that Euclidean distance comparisons are equivalent to cosine comparisons, aligning our thresholds with FaceNet literature.
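The equivalence follows from ‖a − b‖² = ‖a‖² + ‖b‖² − 2·a·b, which reduces to d² = 2 − 2·cos θ when both vectors have unit length. The sketch below checks this numerically and mirrors the midpoint procedure (normalize, average, renormalize). The functions are illustrative stand-ins written with float64 slices, not the package's Embedding type or its actual implementation.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the inner product of a and b (equal lengths assumed).
func dot(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// normalize returns a unit-length copy of v; a zero vector stays zero.
func normalize(v []float64) []float64 {
	out := make([]float64, len(v))
	norm := math.Sqrt(dot(v, v))
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = x / norm
	}
	return out
}

// dist returns the Euclidean distance between a and b.
func dist(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return math.Sqrt(s)
}

// midpoint normalizes each contributor, averages component-wise, and
// renormalizes the centroid. All vectors must share one dimension.
func midpoint(vs ...[]float64) []float64 {
	if len(vs) == 0 {
		return nil
	}
	sum := make([]float64, len(vs[0]))
	for _, v := range vs {
		for i, x := range normalize(v) {
			sum[i] += x
		}
	}
	for i := range sum {
		sum[i] /= float64(len(vs))
	}
	return normalize(sum)
}

func main() {
	a := normalize([]float64{3, 4})
	b := normalize([]float64{4, 3})
	d := dist(a, b)
	// Both expressions yield the cosine of the angle between a and b.
	fmt.Printf("cos=%.4f 1-d^2/2=%.4f\n", dot(a, b), 1-d*d/2)
	m := midpoint(a, b)
	fmt.Printf("midpoint norm=%.4f\n", math.Sqrt(dot(m, m)))
}
```

Because the mapping between d and cos θ is monotonic for unit vectors, a distance threshold and the corresponding cosine threshold select the same matches.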
| Kind | Value | Source | Matching Behavior | Notes |
|---|---|---|---|---|
| RegularFace | 1 | Default embedding classification | Eligible for matching and clustering | Produced when embeddings are distinct and not flagged as child/background. |
| ChildrenFace | 2 | Embedding.IsChild() vs. curated samples | Excluded from matching (SkipMatching = true) | Helps avoid unreliable matches on juvenile faces; clusters are retained but not auto-assigned. |
| BackgroundFace | 3 | Embedding.IsBackground() heuristics | Excluded from matching and clustering | Used for non-face artifacts and background detections; prevents noise from entering optimization runs. |
| AmbiguousFace | 4 | entity.Face.ResolveCollision() heuristic | Excluded from matching and manual merge retries | Assigned when two subjects collide at very low distance (< 0.02); face remains until collision cleared. |
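The table can be read as a small eligibility rule. The sketch below encodes the four kinds with the values listed above and a hypothetical matchable function; the two boolean parameters stand in for the PHOTOPRISM_FACE_MATCH_CHILDREN and PHOTOPRISM_FACE_MATCH_BACKGROUND options, and treating unknown kinds as excluded is an assumption.

```go
package main

import "fmt"

// Kind classifies an embedding; the values match the table above.
type Kind int

const (
	RegularFace    Kind = 1
	ChildrenFace   Kind = 2
	BackgroundFace Kind = 3
	AmbiguousFace  Kind = 4
)

// matchable is a hypothetical helper that reports whether a kind takes part
// in automatic matching, given the two opt-in flags (both false by default).
func matchable(k Kind, matchChildren, matchBackground bool) bool {
	switch k {
	case RegularFace:
		return true
	case ChildrenFace:
		return matchChildren
	case BackgroundFace:
		return matchBackground
	default:
		// AmbiguousFace and unknown kinds are always excluded.
		return false
	}
}

func main() {
	kinds := []Kind{RegularFace, ChildrenFace, BackgroundFace, AmbiguousFace}
	for _, k := range kinds {
		fmt.Println(int(k), matchable(k, false, false))
	}
}
```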
The Faces.Optimize loop still prefers the operator-curated clusters (face_src = 'manual'). When multiple manual clusters for the same subject can be merged, query.MergeFaces materialises a midpoint cluster and reassigns markers to it. If some markers remain attached to the original clusters (for example because their embeddings sit far from the midpoint), the old clusters cannot be purged and the optimiser now emits a warning:
faces: retained manual clusters after merge: kept 4 candidate cluster(s) [...] for subject <uid> because markers still reference them
This is informational: the optimiser skips that merge and moves on. To reduce noise, consider:
- photoprism faces reset --engine=<pigo|onnx> to regenerate markers with consistent embeddings.
No automatic data cleanup runs in this scenario, so operators remain in control of manual edits.
Additional safeguards were introduced in October 2025 so stubborn clusters are only retried a limited number of times:
faces.merge_retry) and optional note (merge_notes). The optimiser skips clusters once the retry count reaches MergeMaxRetry (default 1). The limit may be raised or disabled with the environment variable PHOTOPRISM_FACE_MERGE_MAX_RETRY (0 = unlimited retries).
photoprism faces optimize --retry clears retry counters before running the optimiser, allowing administrators to reprocess clusters after manual cleanup.
photoprism faces audit --subject=<uid> focuses the audit report on a specific person and prints retry counts, sample statistics, and outstanding clusters so operators know which photos still need attention.
Embedding.Dist was hand-optimized with loop unrolling (4-way accumulation) and now runs at ~155 ns/op, down from ~242 ns/op (≈36 % faster).
PHOTOPRISM_FACENET_DISABLED).
cos θ = 1 - (d² / 2) (since embeddings are normalized).
BenchmarkEmbeddingDist
BenchmarkEmbeddingsMidpoint
TestMergeFaces/SameSubjects
TestNet
If FaceNet unit tests fail with Read less bytes than requested, the local model file is typically incomplete or corrupted (assets/models/facenet/saved_model.pb).
Recovery steps:
1. rm -f /tmp/photoprism/facenet.zip
2. rm -rf assets/models/facenet
3. make dep-tensorflow (or scripts/download-facenet.sh)
4. go test ./internal/ai/face -run TestNet -count=1

| Setting | Default | Description |
|---|---|---|
| FACE_ENGINE | auto | Detection engine (auto, pigo, onnx). auto resolves to ONNX when the SCRFD model exists. |
| FACE_ENGINE_THREADS | runtime.NumCPU()/2 (≥1) | ONNX inference threads; ignored by Pigo. |
| FACE_ANGLE | -0.3,0,0.3 | Detection angles (radians) swept by Pigo. |
| FACE_SCORE | 9.0 (with dynamic offsets) | Base quality threshold before scale adjustments. |
| FACE_OVERLAP | 42 | Maximum allowed IoU when deduplicating markers. |
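Two of these defaults are easy to illustrate: the comma-separated FACE_ANGLE list and the thread count derived from runtime.NumCPU(). The sketch below shows one way to compute them. parseAngles and defaultThreads are hypothetical helpers, and skipping empty list entries is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// parseAngles converts a comma-separated list of radians, such as
// "-0.3,0,0.3", into a slice. Empty entries are skipped (assumption).
func parseAngles(s string) ([]float64, error) {
	parts := strings.Split(s, ",")
	angles := make([]float64, 0, len(parts))
	for _, p := range parts {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		a, err := strconv.ParseFloat(p, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid angle %q: %w", p, err)
		}
		angles = append(angles, a)
	}
	return angles, nil
}

// defaultThreads returns half the logical CPUs, but never less than one.
func defaultThreads() int {
	n := runtime.NumCPU() / 2
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	angles, err := parseAngles("-0.3,0,0.3")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("angles:", angles, "threads:", defaultThreads())
}
```

Returning a fresh slice on every call also means callers can modify the result without touching shared defaults.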
Run scheduling is configured through the face model entry in vision.yml. Adjust the model’s Run value (for example on-schedule, manual, or never) to control when detection and embedding jobs execute—no separate FACE_ENGINE_RUN flag is required.
When the model is left on the default auto run mode, face detection participates in manual, auto, and on-demand workflows but skips scheduled cron runs so background jobs do not trigger unexpectedly; the same applies to an explicit on-demand run mode, which now skips cron executions by default. Set Run to on-schedule explicitly if you want faces processed during scheduled vision passes.
Additional merge tuning: set PHOTOPRISM_FACE_MERGE_MAX_RETRY to control how often manual clusters are retried (default 1, 0 = unlimited). See the optimiser notes above.
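The retry limit amounts to a simple gate in front of each merge attempt. The following sketch assumes hypothetical helpers parseMergeMaxRetry and shouldRetryMerge, and it assumes that invalid or negative values fall back to the default; only the default of 1 and the meaning of 0 come from this document.

```go
package main

import (
	"fmt"
	"strconv"
)

const defaultMergeMaxRetry = 1

// parseMergeMaxRetry interprets the raw PHOTOPRISM_FACE_MERGE_MAX_RETRY
// value. Empty, invalid, or negative input uses the default (assumption).
func parseMergeMaxRetry(raw string) int {
	if raw == "" {
		return defaultMergeMaxRetry
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return defaultMergeMaxRetry
	}
	return n
}

// shouldRetryMerge reports whether a cluster with the given retry count may
// be attempted again. A limit of 0 means unlimited retries; otherwise the
// cluster is skipped once the count reaches the limit.
func shouldRetryMerge(retryCount, maxRetry int) bool {
	if maxRetry == 0 {
		return true
	}
	return retryCount < maxRetry
}

func main() {
	limit := parseMergeMaxRetry("")
	for count := 0; count <= 2; count++ {
		fmt.Printf("retries=%d attempt=%v\n", count, shouldRetryMerge(count, limit))
	}
}
```

With the default limit, a cluster is attempted once and skipped afterwards until its counter is cleared, for example via photoprism faces optimize --retry.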
| Benchmark | Before | After |
|---|---|---|
| BenchmarkEmbeddingDist | ~242 ns/op | ~155 ns/op |
| BenchmarkEmbeddingsMidpoint | ~194 µs/op, 528 KB | ~99 µs/op, 4 KB |
Re-run these benchmarks after any detector or embedding adjustments to catch regressions early.
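For readers unfamiliar with the technique behind the Embedding.Dist numbers, 4-way accumulation splits the sum of squared differences across four independent accumulators so the additions do not all depend on each other. The sketch below shows the general idea next to a plain loop. It is not the actual Embedding.Dist code, and the 512-element vectors and float64 accumulators are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// distNaive computes the Euclidean distance with a single accumulator.
// Both functions assume len(b) >= len(a).
func distNaive(a, b []float32) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i] - b[i])
		sum += d * d
	}
	return math.Sqrt(sum)
}

// distUnrolled processes four elements per iteration with four independent
// accumulators, then handles any remaining elements in a tail loop.
func distUnrolled(a, b []float32) float64 {
	n := len(a)
	var s0, s1, s2, s3 float64
	i := 0
	for ; i+4 <= n; i += 4 {
		d0 := float64(a[i] - b[i])
		d1 := float64(a[i+1] - b[i+1])
		d2 := float64(a[i+2] - b[i+2])
		d3 := float64(a[i+3] - b[i+3])
		s0 += d0 * d0
		s1 += d1 * d1
		s2 += d2 * d2
		s3 += d3 * d3
	}
	for ; i < n; i++ {
		d := float64(a[i] - b[i])
		s0 += d * d
	}
	return math.Sqrt(s0 + s1 + s2 + s3)
}

func main() {
	// 512 dimensions is an assumption made for this example.
	a := make([]float32, 512)
	b := make([]float32, 512)
	for i := range a {
		a[i] = float32(i%7) * 0.01
		b[i] = float32(i%5) * 0.02
	}
	fmt.Printf("naive=%.6f unrolled=%.6f\n", distNaive(a, b), distUnrolled(a, b))
}
```

The two results can differ in the last few bits because the additions happen in a different order. To measure the real functions, standard Go tooling such as go test ./internal/ai/face -run '^$' -bench 'BenchmarkEmbeddingDist|BenchmarkEmbeddingsMidpoint' -benchmem should work, assuming the benchmarks live in that package.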