Tenant: primeassist-dev


Eval runs · New

Queue a new eval run.

The runner picks up the queued row, snapshots the bound golden set, and replays every case. Leave the version-pin fields blank to use the agent's current snapshot.
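The flow described above can be sketched roughly as follows. This is a minimal illustration and not the actual runner: `EvalRun`, `process_next`, the field names, and the in-memory `queue`, `golden_sets`, and `current_versions` structures are all hypothetical stand-ins for whatever storage and agent API the real system uses.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvalRun:
    # Hypothetical shape of a queued row; every field name here is an assumption.
    golden_set_id: str
    agent_id: str
    pinned_version: Optional[str] = None  # blank pin: fall back to the agent's current snapshot
    status: str = "queued"
    cases: list = field(default_factory=list)    # frozen copy of the golden set
    results: list = field(default_factory=list)


def process_next(queue, golden_sets, current_versions, replay):
    """Pick up the first queued run, snapshot its golden set, replay every case."""
    run = next((r for r in queue if r.status == "queued"), None)
    if run is None:
        return None  # nothing waiting in the queue
    run.status = "running"
    # Snapshot: deep-copy so later edits to the golden set do not affect this run.
    run.cases = copy.deepcopy(golden_sets[run.golden_set_id])
    # An explicit pin wins; otherwise use the agent's current snapshot.
    version = run.pinned_version or current_versions[run.agent_id]
    run.results = [replay(version, case) for case in run.cases]
    run.status = "done"
    return run
```

Under these assumptions, a run queued with `pinned_version=None` is replayed against whatever `current_versions` holds for its agent at pickup time, while a pinned run ignores that value. Copying the golden set at pickup means cases added afterwards only show up in later runs.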

Filter golden sets by agent

Advanced version pins