TESTING.md
purpose: prevent regressions. test core features rigorously every time
commits: e9c76934, 9acdf850
Native Live Text selection — On macOS, verify that native Live Text selection works within the app's text overlay.
Native Data Detectors — On macOS, verify that native data detectors (e.g., phone numbers, addresses, dates) are active and clickable within the app's text overlay.
Cross-architecture Live Text compilation — On both x86_64 (Intel) and arm64 (Apple Silicon) macOS machines, verify that Live Text functionality is available and works without compilation errors or runtime issues.
window mode CSS restore — In window mode (not fullscreen), verify that CSS styling is correct and as expected (e.g., no unexpected transparent panels).
keyboard input in main window from tray — Open the main window from the tray icon and immediately try typing. Verify that keyboard input works without requiring a click.
WKWebView keyboard focus recovery — Interact with embedded web views (e.g., billing, help sections), then navigate back to other UI elements. Verify keyboard focus is correctly recovered by the WKWebView.
these break CONSTANTLY. any change to window_api.rs, main.rs shortcuts, activation policy, or NSPanel code must test ALL of these.
commits that broke this area: 0752ea59, d89c5f14, 4a64fd1a, fa591d6e, 8706ae73, 6d44af13, b6ff1bf7, 09a18070
MoveToActiveSpace staying set.
3097872b, 8706ae73, 09a18070).
b4eb2ab4).
d74d0665, 5a50aaad).
6d44af13, d74d0665).
206107ba).
2315a39c, 1f2681e3).
order_out. clicking where overlay was should NOT trigger overlay buttons (32e1a962).
d99444a7, 523a629e).
activateIgnoringOtherApps call must not trigger space monitor's hide callback.
b6c363e5)
eb9e65b4)
22830119)
e9c76934)
a3e29d42a)
397f46133)
32fed7c8c)
commits that broke this area: 0752ea59, 7562ec62, 2a2bd9b5, f2f7f770, 5cb100ea
d794176a).
9e151265).
`ps aux | grep screenpipe`.
2a2bd9b5. objc2→objc pointer cast was causing panic_cannot_unwind.
f2f7f770).
078fcfb2)
b6c363e5)
commits: 28e5c247
f882caef). verify with Activity Monitor GPU tab.
e64ee25f4)
792145ac6)
d3a4b6bcc)
b70116b)
54a550f4)
cb2cc205)
6b3a71eb)
46350671)
2e68400c)
3701cce2)
73adc9d4)
e9e2dc252)
ceb77559d)
0f287761d)
commits: device_monitor.rs atomic swap, tiered backoff, empty device list guard
`grep "DEVICE_RECOVERY.*output.*restored" ~/.screenpipe/screenpipe-app.*.log`. Verify: `curl "localhost:3030/search?content_type=audio&limit=5"` shows output device transcriptions resume.
DEVICE_RECOVERY log entries.
`curl localhost:3030/health` shows device_status_details with output device present within 15 seconds of recovery.
`grep "device list returned empty" ~/.screenpipe/screenpipe-app.*.log` shows warning but no disconnections.
`sqlite3 ~/.screenpipe/db.sqlite "SELECT t1.timestamp as gap_start, t2.timestamp as gap_end, (julianday(t2.timestamp) - julianday(t1.timestamp)) * 86400 as gap_seconds FROM audio_transcriptions t1 JOIN audio_transcriptions t2 ON t2.id = (SELECT MIN(id) FROM audio_transcriptions WHERE id > t1.id AND is_input_device = 0) WHERE t1.is_input_device = 0 AND (julianday(t2.timestamp) - julianday(t1.timestamp)) * 86400 > 60 ORDER BY t1.timestamp;"` should return no rows if output was continuously captured.
commits: calendar_speaker_id.rs, meetings.rs, meeting_persister.rs
`grep "meeting detected via calendar" ~/.screenpipe/screenpipe-app.*.log` shows detection after restart. verify: `sqlite3 ~/.screenpipe/db.sqlite "SELECT id, name FROM speakers WHERE name != ''"` shows both user and attendee names.
`grep "meeting_started" ~/.screenpipe/screenpipe-app.*.log`.
`grep "meeting ended via calendar" ~/.screenpipe/screenpipe-app.*.log`.
`curl 'localhost:3030/search?content_type=audio&speaker_name=<attendee>&limit=5'` returns results.
`grep "auto speaker identification: named" ~/.screenpipe/screenpipe-app.*.log`.
`sqlite3 ~/.screenpipe/db.sqlite "SELECT id, name FROM speakers WHERE name != ''"` shows same speakers before and after restart.
7684f1d47)
34a62c053)
ef39e728d)
fe905d6af, 01eb9cf33)
849372fa9)
e6740eb38)
ef470d9e1)
commits: 6dd5d98e, 831ad258
831ad258.
b18ae2253)
1b7d0db5b)
39c016cb3, d119d060d, 231521192)
d2c9d1fb8)
8f7294e6)
aba74513)
commits: d5a9d052, 0b32cc9a, ca29a67b
commits: d9d43d31, 620c89a5, 14acf6f0
fresh install — all prompts appear — screen recording, microphone, accessibility prompts all show on first launch.
denied permission → opens System Settings — if user previously denied mic permission, clicking "grant" opens System Settings > Privacy directly (620c89a5).
permission revoked while running — go to System Settings, revoke screen recording. app shows red permission banner within 10 seconds.
permission banner is visible — solid red bg-destructive banner at top of main window when any permission missing. not subtle (9c0ba5d1).
permission recovery page — navigating to /permission-recovery shows clear instructions.
startup permission gate — on first launch, permissions are requested before recording starts (d9d43d31).
faster permission polling — permission status checked every 5-10 seconds, not 30 (d9d43d31).
No recurring permission modal after close — Grant macOS permissions, quit the app, and relaunch it multiple times. Verify that the macOS permission modal does NOT reappear every time the app is closed.
improved permission recovery UX — Verify that the user experience for recovering from denied permissions is clear and intuitive. (57cca740)
commits: d4abc619, 4f4a8282, 31f37407, 2223af9a, b34a4abd, 303958f9
POST /ai/chat/completions returns valid response using on-device Foundation Model.
31f37407). feature gracefully disabled.
response_format: { type: "json_object" } returns valid JSON, no prose preamble (2223af9a).
{...} is extracted correctly (b34a4abd).
stream: true returns Server-Sent Events with incremental tokens (4f4a8282).
tools array gets tool definitions injected into prompt, model responds with tool calls (4f4a8282).
303958f9, 2223af9a).
303958f9).
commits: 94531265, d794176a, 9070639c, 0378cab1, 4a3313d3, 7ffdd4f1, 1b36f62d
9070639c).
851b3037c)
ac46aa437, 418826dfa, 274826dfa)
4cb9850f7, c49350df0, 139500d52)
d794176a, 94531265).
0378cab1, 4a3313d3, 8c435a10).
`ps aux | grep screenpipe` shows nothing. `lsof -i :3030` shows nothing.
c7fbc3ea).
lsof calls have a 5-second timeout, preventing zombie CPU drain, especially on quit. Check logs for lsof timeouts if applicable.
tokio shutdown process is stable and doesn't panic in the tree walker, especially during application exit or process restarts.
ggml Metal destructor crash.
af2b4f3d)
d3ead88eb)
1b4bf7918)
a8413fe2)
commits: eea0c865, cc09de61, e61501da, d25191d7, 60096fb9
slow DB insert warning — check logs. "Slow DB batch insert" warnings should be <1s in normal operation. >3s indicates contention.
concurrent DB access — UI queries + recording inserts happening simultaneously. no "database is locked" errors.
store race condition — rapidly toggle settings while recording is active. no crash (eea0c865).
event listener race condition — Tauri event listener setup during rapid window creation. no crash (cc09de61).
UTF-8 boundary panic — search with special characters, non-ASCII text in OCR results. no panic on string slicing (eea0c865).
low disk space — with <1GB free, app should warn user. no crash from failed writes.
large database (>10GB) — search still returns results within 2 seconds. app doesn't freeze on startup.
Snapshot compaction integrity — Verify compaction doesn't result in NULL offset_index or pool exhaustion. (09245af5f)
Audio chunk timestamps — start_time and end_time are correctly set for reconciled and retranscribed audio chunks in the database.
SCREENPIPE_DATA_DIR usage — Set the SCREENPIPE_DATA_DIR environment variable. Verify the app uses this directory for all its data storage. (d5f30db71)
DB pool starvation prevention — Simulate high database load (e.g., rapid screen activity, many pipes running) and monitor logs. Verify no "database is locked" errors or signs of DB pool starvation.
DB write coalescing queue — verify high-frequency captures (e.g. 10 FPS) don't lock the UI or cause write errors. (c23768f41)
Multi-byte window titles in suggestions — Interact with suggestions for windows that have multi-byte (e.g., Unicode, emoji) characters in their titles. Verify no char boundary panics.
no concurrent reconciliation issues — Verify that concurrent reconciliation processes do not cause issues during heavy load or sync operations. (1d436bc3)
pipe_config blobs skipped in sync — Verify that pipe_config blobs are correctly skipped during synchronization, preventing unnecessary data transfer and potential issues. (08d5c53a)
Pi's native auto-compaction for pipe session history — Verify that Pi's native auto-compaction feature for pipe session history works as expected, preventing indefinite growth of history and maintaining performance. (8f49e2cf)
UTF-8 panic with long multi-byte strings — Introduce long strings with multi-byte UTF-8 characters (e.g., in window titles, chat input, search queries). Verify no panics occur when these strings are truncated, stored, or processed.
fsync snapshots before DB commit — verify data integrity by force-quitting during heavy capture; snapshots should match DB entries. (2e63282b8)
commits: 8a5f51dd, 0b0d8090, 7e58564e, 2522a7e2, f3e55dbc, 79f2913f
8a5f51dd).
21bddd0f).
whisper-large-v3-turbo-quantized and functions correctly.
credits_exhausted and other LLM-related errors.
commits: 8a5f51dd, 0b0d8090
commits: 87abb00d, 9464fdc9, 0f9e43aa, 7ea15f32, bf1f1004
87abb00d, 9464fdc9).
0f9e43aa).
7ea15f32).
commits: 87abb00d, 9464fdc9, 0f9e43aa, 7ea15f32
commits: f1255eac, 25cbdc6b, 2529367d, d9821624, e61501da, 039d5fea, 50ff4f4c, 91cc4371, bcce42796, a98fa2991, 0ff93b167, adbbb8f84
f1255eac).
25cbdc6b).
2529367d).
50ef52d1, aa992146).
be3ecffb).
bcce42796)
0ff93b167)
0ff93b167)
d9821624).
0b057046).
screenpipe://frame/N or screenpipe://frames/N opens main window and jumps to frame N. works from cold start; invalid IDs show clear error.
2e63282b8)
frame_id. (a98fa2991)
71dee4ca3)
2015137a1)
frames.full_text and frames_fts. (adbbb8f84)
frames_fts for comprehensive accessibility text searching.
OR instead of UNION within IN().
content_type=all search and pagination: Perform search queries with content_type=all. Verify that the result count is accurate and pagination works correctly without missing or duplicating results.
search_ocr() returns results for event-driven capture: Verify that search_ocr() correctly returns OCR results for event-driven captures and does not return empty when visible text is present on screen.
2cf0c14e)
57cca740)
3e8f37fc)
cba69e56)
a80e9ce6)
d6c4b821)
0cee47b62)
4d2b05990, f09f1e9aa)
f108f1f0d, 2a2bd9b5, 5762c60bf)
19789657d)
0c883819e, b7123231, f09f1e9aa)
9277431e4)
c029f7779)
2bcdf8d8b)
2bcdf8d8b)
67f4c4304)
commits: f1255eac, 25cbdc6b, 2529367d, d9821624
commits: 2f6b2af5, ea7f1f61, 5cb100ea
5cb100ea).
2f6b2af5).
ea7f1f61).
08d5c53a)
pipe_config blobs are correctly skipped during sync (requires inspection of sync data or logs).
0e7baaedb)
commits: b3628788, 738178da
Shift+Drag region OCR selection on the screen. Verify that the RegionOcrOverlay appears correctly and local OCR processes the selected region.
Shift+Drag region OCR uses local OCR and functions correctly without requiring the user to be logged in or have a cloud subscription.
commits: eea0c865, fe9060db, c99c3967, aeaa446b, 5a219688, caae1ebc, 67caf1d1, ff4af7b5
eea0c865).
d62360bc4)
ef39e728d)
Alt+S. Verify that the overlay window appears and immediately receives keyboard focus, allowing immediate typing.
OcrTextBlock deserialization correctly handles the specific Windows OCR format. (c49ccb55)
4d20803a)
2e50c772)
a0aba1643)
c13e21b55)
commits: eea0c865, fe9060db, c99c3967, aeaa446b, 5a219688, caae1ebc, 67caf1d1
eea0c865).

The event-driven pipeline (paired_capture.rs) decides per-frame whether to use accessibility tree text or OCR. Terminal apps force OCR because their accessibility tree only returns window chrome.
commits: 5a219688 (wire up Windows OCR), caae1ebc (prefer OCR for terminals), 67caf1d1 (no chrome fallback)
App categories and expected behavior:
| App category | Examples | app_prefers_ocr | Text source | Expected text |
|---|---|---|---|---|
| Browser | Chrome, Edge, Firefox | false | Accessibility | Full page content + chrome |
| Code editor | VS Code, Fleet | false | Accessibility | Editor content, tabs, sidebar |
| Terminal (listed) | WezTerm, Windows Terminal, Alacritty | true | Windows OCR | Terminal buffer content via screenshot |
| Terminal (unlisted) | cmd.exe, powershell.exe | false | Accessibility | Whatever UIA exposes (may be limited) |
| System UI | Explorer, taskbar, Settings | false | Accessibility | UI labels, text fields |
| Games / low-a11y apps | Games, Electron w/o a11y | false | Windows OCR (fallback) | OCR from screenshot |
| Lock screen | LockApp.exe | false | Accessibility | Time, date, battery |
Terminal detection list (app_prefers_ocr matches, case-insensitive):
wezterm, iterm, terminal, alacritty, kitty, hyper, warp, ghostty
Note: "terminal" matches WindowsTerminal.exe but NOT cmd.exe or powershell.exe.
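
The matching rule above can be sketched in shell to predict which executables get routed to OCR before testing them. This is a minimal sketch, not the code in paired_capture.rs: it assumes app_prefers_ocr is a case-insensitive substring match against the list above, and the pick_text_source helper, its output labels, and its fallback order (forced OCR, then accessibility text if non-empty, then OCR fallback) are assumptions read off the category table.

```shell
# Hypothetical shell mirror of the terminal detection list (not the Rust source).
# Exit 0 when the lowercased app name contains any listed term.
app_prefers_ocr() {
  apo_name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for apo_term in wezterm iterm terminal alacritty kitty hyper warp ghostty; do
    case "$apo_name" in
      *"$apo_term"*) return 0 ;;
    esac
  done
  return 1
}

# Assumed decision order: forced OCR, then accessibility text, then OCR fallback.
# $1 = app/executable name, $2 = text from the accessibility tree (may be empty).
pick_text_source() {
  if app_prefers_ocr "$1"; then
    echo "ocr"
  elif [ -n "$2" ]; then
    echo "accessibility"
  else
    echo "ocr-fallback"
  fi
}
```

For example, `app_prefers_ocr "WindowsTerminal.exe"` succeeds, while `app_prefers_ocr "cmd.exe"` and `app_prefers_ocr "powershell.exe"` fail, which matches the note above.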
Test checklist:
WindowsNative (not AppleNative).
`SELECT COUNT(*) FROM ocr_text` should be non-zero after a few minutes of use on Windows.

These apps are common on Windows but have never been tested with the event-driven pipeline. We don't know if their accessibility tree returns useful text or just chrome. Each needs manual verification: open the app, use it for a few minutes, then `curl "http://localhost:3030/search?app_name=<name>&limit=3"` and check if the text is meaningful.
Status legend: ? = untested, OK = verified good, CHROME = only returns chrome, EMPTY = no text, OCR-NEEDED = should be added to app_prefers_ocr
| App | Status | a11y text quality | Notes |
|---|---|---|---|
| Browsers | |||
| Chrome | OK | good (full page content) | 2778ch avg, rich a11y tree |
| Edge | ? | probably good | same Chromium UIA as Chrome |
| Firefox | ? | unknown | different a11y engine than Chromium |
| Brave / Vivaldi / Arc | ? | probably good | Chromium-based, needs verification |
| Code editors | |||
| VS Code | ? | unknown | Electron, should have good UIA |
| JetBrains (IntelliJ, etc) | ? | unknown | Java Swing/AWT, UIA quality varies |
| Sublime Text | ? | unknown | custom UI, may need OCR fallback |
| Cursor | ? | unknown | Electron fork of VS Code |
| Zed | ? | unknown | custom GPU renderer, a11y unknown |
| Terminals | |||
| WezTerm | CHROME | chrome only ("System Minimize...") | app_prefers_ocr = true, OCR works |
| Windows Terminal | ? | unknown | matches "terminal" in app_prefers_ocr |
| cmd.exe | ? | unknown | NOT matched by app_prefers_ocr |
| powershell.exe | ? | unknown | NOT matched by app_prefers_ocr |
| Git Bash (mintty) | ? | unknown | NOT matched by app_prefers_ocr |
| Communication | |||
| Discord | ? | unknown | Electron, old OCR data exists |
| Slack | ? | unknown | Electron |
| Teams | ? | unknown | Electron/WebView2 |
| Zoom | ? | unknown | custom UI |
| Telegram | ? | unknown | Qt-based |
| | ? | unknown | Electron |
| Productivity | |||
| Notion | ? | unknown | Electron |
| Obsidian | ? | unknown | Electron |
| Word / Excel / PowerPoint | ? | unknown | native Win32, historically good UIA |
| Outlook | ? | unknown | mixed native/web |
| OneNote | ? | unknown | UWP, should have good UIA |
| Media / Creative | |||
| Figma | ? | unknown | Electron + canvas, likely poor a11y on canvas |
| Spotify | ? | unknown | Electron/CEF |
| VLC | ? | unknown | Qt-based |
| Adobe apps (Photoshop, etc) | ? | unknown | custom UI, historically poor a11y |
| System / Utilities | |||
| Explorer | OK | good | file names, paths, status bar |
| Settings | ? | unknown | UWP, should be good |
| Task Manager | ? | unknown | UWP on Win11 |
| Notepad | ? | unknown | should have excellent UIA |
| Games / GPU-rendered | |||
| Any game | ? | likely empty | GPU-rendered, no UIA tree. should fall to OCR |
| Electron w/ disabled a11y | ? | likely empty | some Electron apps disable a11y |
Priority to test (most common user apps):
How to verify an app:
```bash
# 1. Open the app, use it for 2 minutes
# 2. Check what was captured:
curl "http://localhost:3030/search?app_name=<exe_name>&limit=3&content_type=all"
# 3. If text is only chrome (System/Minimize/Close), it may need adding to app_prefers_ocr
# 4. If text is empty and screenshots exist, OCR fallback should kick in
# 5. Update this table with findings
```
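
Step 3 can be made less subjective with a small helper. This is a minimal sketch, assuming "chrome" means window-control labels only: is_chrome_only is a hypothetical name, and the word list (System, Minimize, Maximize, Restore, Close) is an assumption extended from the examples in this document. Pass it the text copied from the search response.

```shell
# Hypothetical helper: exit 0 when the text contains nothing but window-chrome words.
is_chrome_only() {
  [ -n "$1" ] || return 1   # empty text counts as EMPTY, not CHROME
  # Lowercase, split into one word per line, drop known chrome words,
  # then count whatever is left over.
  ico_leftover=$(
    printf '%s\n' "$1" |
      tr '[:upper:]' '[:lower:]' |
      tr -cs '[:alpha:]' '\n' |
      grep -vxE 'system|minimize|maximize|restore|close' |
      grep -c . || true
  )
  [ "$ico_leftover" -eq 0 ]
}
```

Usage: `is_chrome_only "System Minimize Close" && echo "CHROME: candidate for app_prefers_ocr"`. Any other word in the text makes the check fail, so a pass is only a hint to look at the app more closely.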
Apps that may need adding to app_prefers_ocr list:
"cmd" and "powershell" to the list
"mintty"
commits: deac5ea9
commits: 8f334c0a, fda40d2c
fda40d2c).
.tar.gz + .sig for macOS, .nsis.zip + .sig for Windows.
tauri.prod.conf.json to tauri.conf.json before building. identifier is screenpi.pe not screenpi.pe.dev.
workflow_dispatch creates draft. manual publish or release-app-publish commit publishes.
commits: 8c8c445c
.mcpb file and opens it in Claude Desktop. was broken because GitHub releases API pagination didn't reach `mcp-v*` releases buried behind 30+ app releases (8c8c445c).
getLatestMcpRelease() paginates up to 5 pages (250 releases) to find `mcp-v*` tagged releases. verify it works even when >30 app releases exist since last MCP release.
.mcpb, opens Claude Desktop, waits 1.5s, then opens the .mcpb file to trigger Claude's install modal.
cmd /c start instead of open -a.
{}).
commits: fa887407, 815f52e6, 60840155, e66c3ff8, c905ffbf, 01147096, 5908d7f4, 46422869, 4f43da70, 71a1a537, 6abaaa36, f3e55dbc, 8e426dec, 1289f51e, 4bc9ff1a, c336f73d, 2f7416ae
`ps aux | grep pi` should show a single, stable pi process that doesn't restart or get killed.
pi process is manually killed, it should restart automatically within a few seconds and be ready for chat.
activity-summary tool, and the activity-summary endpoint works correctly.
search-elements tool.
frame-context tool.
screenpipe-analytics skill can be used by the Pi agent to perform raw SQL usage analytics.
screenpipe-retranscribe skill can be used by the Pi agent for retranscription.
user_token is correctly passed to Pi pre-configuration so pipes use the screenpipe provider.
bee49f1e7)
timeout to pipe.md frontmatter. Verify pipe respects this timeout. (cc0ecef53)
f501c19fb)
da206471a)
--provider and --model flags should be correctly moved before -p prompt in pi spawn commands.
2e68400c)
b709af2f)
602419151)
de56176e5)
2f75e90bf)
5dff9d21a)
89d2e0129)
6c23e1399, d81ea65c1)
/notify API (e.g., via a pipe). Verify an in-app notification panel appears instead of a system notification. (34937b2dc)
41c8b8085)
603c84f7b)
f01213cf5)
commits: fa887407, 815f52e6, 60840155, e66c3ff8, c905ffbf, 01147096, 5908d7f4, 46422869, 4f43da70, 71a1a537, 6abaaa36
commits: 58460e02, 853e0975
44a19b73f, b53b08b6e)
commits: 58460e02
commits: fc830b43, f54d3e0d
08feb4df5)
commits: 274a968af, dc575e48e, 81aabbf18, d5e071854, db08f8c06, f4225b580
dc575e48e)
81aabbf18)
screenpipe vault commands work without the server running. (f4225b580)
d5e071854)
commits: ad431b513, d9722bccc, 4df21e83d
ad431b513)
d9722bccc)
commits: fc830b43
0d42ea221)
AlertDialog appears instead of a standard window.confirm. (b5db080d6)
591710246)
run section 1 and 2 completely. these are the most fragile.
run section 3, 5, and 14 (Windows text extraction matrix) completely.
run section 4 completely.
run section 7 and 10.
CanJoinAllSpaces (visible on all Spaces simultaneously). chat and main overlay should use MoveToActiveSpace (moved to current Space on show, then flag removed to pin).

macOS: ~/.screenpipe/screenpipe-app.YYYY-MM-DD.log
Windows: %USERPROFILE%\.screenpipe\screenpipe-app.YYYY-MM-DD.log
Linux: ~/.screenpipe/screenpipe-app.YYYY-MM-DD.log
```bash
# crashes/errors
grep -E "panic|SIGABRT|ERROR|error" ~/.screenpipe/screenpipe-app.*.log
# monitor events
grep -E "Monitor.*disconnect|Monitor.*reconnect|Starting vision" ~/.screenpipe/screenpipe-app.*.log
# frame skip rate (debug level only)
grep "Hash match" ~/.screenpipe/screenpipe-app.*.log
# queue health
grep "Queue stats" ~/.screenpipe/screenpipe-app.*.log
# DB contention
grep "Slow DB" ~/.screenpipe/screenpipe-app.*.log
# audio issues
grep -E "audio.*timeout|audio.*error|device.*disconnect" ~/.screenpipe/screenpipe-app.*.log
# window/overlay issues
grep -E "show_existing|panel.*level|Accessory|activation_policy" ~/.screenpipe/screenpipe-app.*.log
# Apple Intelligence
grep -E "FoundationModels|apple.intelligence|fm_generate" ~/.screenpipe/screenpipe-app.*.log
```
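
For a quick first pass, several of the greps above can be wrapped in one function that prints a match count per category. This is a minimal sketch: log_triage is a hypothetical helper name, the patterns are copied from the commands above, and only four categories are included to keep it short.

```shell
# Hypothetical helper: count lines matching each triage pattern across the given log files.
log_triage() {
  [ "$#" -gt 0 ] || { echo "usage: log_triage LOGFILE..." >&2; return 2; }
  for lt_entry in \
    'crashes/errors=panic|SIGABRT|ERROR|error' \
    'monitor events=Monitor.*disconnect|Monitor.*reconnect|Starting vision' \
    'DB contention=Slow DB' \
    'audio issues=audio.*timeout|audio.*error|device.*disconnect'
  do
    lt_label=${lt_entry%%=*}     # text before the first "="
    lt_pattern=${lt_entry#*=}    # text after the first "="
    # grep -c exits non-zero on zero matches, so "|| true" keeps the count of 0.
    lt_count=$(cat "$@" 2>/dev/null | grep -cE "$lt_pattern" || true)
    printf '%s: %s\n' "$lt_label" "$lt_count"
  done
}
```

Usage: `log_triage ~/.screenpipe/screenpipe-app.*.log`. A non-zero count only says where to look; run the matching grep above to see the actual lines.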
7ea1eb94e)
commits: cf2dcd5f8, ad1d00d8f, 6f623b30a, aaf031169
cf2dcd5f8)
ad1d00d8f)
aaf031169)
6f623b30a)
f82b4f350)
commits: f6c21a022, 31e67ae1c, 8d0a5348d, b1c30e99b
f6c21a022)
31e67ae1c, 8d0a5348d)
b1c30e99b)
commits: c8769545b, 4f522325b, 54000c295
c8769545b)
c8769545b)
4f522325b, 54000c295)
commits: c6a73b17e, 945b687ec
c6a73b17e, 945b687ec)