.known-couplings/stress-test-concurrency.md
A GitHub concurrency group admits only one running run at a time, so the rule "no more than two stress workflows at once" is enforced by two repo-wide groups, not by a count or by cron spacing. The three tcp-open-close workflows (stress-test-tcp-open-close-{linux,macos,windows}.yml) share stress-scheduled; the six generative workflows (stress-test-generative-{systematic,normal}-{linux,macos,windows}.yml) share stress-scheduled-generative. One run per group means at most two concurrent scheduled runs.

A manual workflow_dispatch run instead gets a unique …-${{ github.run_id }} group, so a branch check never queues behind the scheduled loop. This relies on every stress workflow having exactly two triggers (schedule and workflow_dispatch); the per-event failure() && github.event_name == 'schedule' Zulip alert and the --skip-on-miss libs path assume the same split.

A new stress workflow MUST join one of the two groups, or the 2-at-a-time cap silently breaks. The cron slots are merely spread across the day and do not enforce the cap (the website's scheduled-jobs.md tracks them; keep it in sync).

Because six workflows share stress-scheduled-generative with cancel-in-progress: false, GitHub keeps only one run pending per group and discards older pending ones. If cron slots ever collided, a scheduled run would be silently dropped, and the failure()-gated Zulip alert never fires for a run that never ran. Keep the six generative slots well spaced.

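
The dropped-run hazard can be sketched with a toy model of one concurrency group under cancel-in-progress: false: one slot for the running run, one slot for a pending run, and a newer pending run replacing an older one. This is an illustrative simulation, not GitHub's implementation; the class name and the run labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConcurrencyGroup:
    """Toy model of one concurrency group with cancel-in-progress: false."""

    running: Optional[str] = None   # at most one run executes
    pending: Optional[str] = None   # at most one run waits
    dropped: List[str] = field(default_factory=list)

    def trigger(self, run: str) -> None:
        """A new run arrives in this group."""
        if self.running is None:
            self.running = run
            return
        # The running run is left alone; a newer arrival displaces
        # whatever was already pending.
        if self.pending is not None:
            self.dropped.append(self.pending)
        self.pending = run

    def finish(self) -> None:
        """The running run completes; the pending run (if any) starts."""
        self.running, self.pending = self.pending, None


# Hypothetical labels: three scheduled runs whose cron slots collided.
group = ConcurrencyGroup()
for run in ["systematic-linux", "normal-linux", "systematic-macos"]:
    group.trigger(run)

print("running:", group.running)
print("pending:", group.pending)
print("dropped:", group.dropped)
```

In this model the second run never starts, so no failure-gated step belonging to it could ever fire. That is why spacing the slots matters even though spacing does not enforce the cap.
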
The generative engine (test/rt-stress/generative/main.pony) runs in two modes, each with its own orchestrator under test/rt-stress/generative/ (not a make test-stress-* target). Both modes upload their failure bundles as an artifact; the tcp workflows upload nothing.

systematic builds a systematic ponyc and runs orchestrate_systematic.py (serialized, reproducible) on linux-glibc, macos-arm, and windows. The build uses use=scheduler_scaling_pthreads,systematic_testing on linux-glibc and macos-arm, and use=systematic_testing alone on windows: Windows scales the scheduler with native primitives, not pthreads, so it does not pair with scheduler_scaling_pthreads, and its output dir is build\debug-systematic_testing.

normal builds a plain config=debug ponyc (no use flags) and runs orchestrate_normal.py (the real, multi-threaded runtime) on linux-glibc, macos-arm, and windows, since a normal build works everywhere. The normal jobs run each seed under lldb so that a crash still yields a backtrace: a normal-mode crash does not reproduce, so the in-the-moment stack is the only artifact. That is why the normal linux container adds --cap-add=SYS_PTRACE --security-opt seccomp=unconfined and the normal Windows job keeps the msys2/lldb Install Dependencies step. The systematic mode reproduces from its seed, so it runs directly, without lldb, and failures are re-debugged locally.

All of these jobs' libs-cache platforms (ubuntu26.04-x86, arm64-macos, x86-windows) are warmer Stage-1 PR platforms, so the stress jobs need no warmer change of their own.
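
The mode and platform matrix above can be summarized as a small lookup. This is only a sketch restating the note as data: the function name job_plan and the dictionary layout are hypothetical and do not come from the orchestrators or the workflows.

```python
from typing import Dict, Union

PLATFORMS = ("linux-glibc", "macos-arm", "windows")

# use= value for the systematic build, per platform (from the note above).
SYSTEMATIC_USE: Dict[str, str] = {
    "linux-glibc": "scheduler_scaling_pthreads,systematic_testing",
    "macos-arm": "scheduler_scaling_pthreads,systematic_testing",
    "windows": "systematic_testing",  # no pthreads scheduler scaling here
}

ORCHESTRATOR: Dict[str, str] = {
    "systematic": "orchestrate_systematic.py",
    "normal": "orchestrate_normal.py",
}


def job_plan(mode: str, platform: str) -> Dict[str, Union[str, bool]]:
    """Return the build flags and run style for one generative stress job.

    Hypothetical helper for illustration only.
    """
    if mode not in ORCHESTRATOR:
        raise ValueError(f"unknown mode: {mode}")
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform}")
    return {
        # normal is a plain config=debug build with no use flags
        "use": SYSTEMATIC_USE[platform] if mode == "systematic" else "",
        "orchestrator": ORCHESTRATOR[mode],
        # only normal mode wraps each seed in lldb; systematic replays by seed
        "under_lldb": mode == "normal",
    }


for mode in ORCHESTRATOR:
    for platform in PLATFORMS:
        print(mode, platform, job_plan(mode, platform))
```
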