Eval runs · New
Queue a new eval run.
The runner picks up the queued row, snapshots the bound golden set, and replays every case. Leave the version-pin fields blank to use the agent's current snapshot.
Tenant: primeassist-dev
P26 dashboard skeleton — module pages land in M2+.
Eval runs · New
The runner picks up the queued row, snapshots the bound golden set, and replays every case. Leave the version-pin fields blank to use the agent's current snapshot.