Assessments and runs
Everything about a single assessment — writing the goal, choosing execution, reviewing the plan, watching the workbench, and reading the output.
A run is one assessment, beginning to end. You write a goal; the agent plans the work, executes it inside the scope's rules of engagement (RoE), and produces assets, findings, evidence, activity, and a report. Everything else on the platform (automations, retests, compliance readiness) is built on top of runs.
This page is the long-form manual for the run lifecycle: how to launch one, what to put in the goal, what the plan looks like, how the workbench works, and how to use the output afterward.
What you will learn
- Launch a run. The exact flow from the Runs page to a running assessment.
- Write a goal that produces good work. The difference between vague and useful objectives.
- The plan and Safe Mode. What gets approved before execution and what to look at.
- The workbench and run output. What you see while a run is live and after it finishes.
The run lifecycle
Every run moves through the same lifecycle. Knowing the stages makes the UI obvious: each tab on the run detail page reflects one stage.
- Queued — the run is accepted and waiting for capacity.
- Planning — the agent reads the goal, scope, RoE, and prior context, then drafts a structured plan.
- Awaiting approval — if Safe Mode is on for the scope, the run pauses for human review.
- Running — the agent executes the plan, adapting in real time as it learns.
- Completed — the agent finished. Findings, assets, evidence, and the report are durable.
- Other states — paused, failed, cancelled. Each has a reason on the run detail page.
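The stages above can be read as a small state machine. The Python sketch below is illustrative only: the state names mirror the list, but `RunState`, the `ALLOWED` transition table, `can_transition`, and `happy_path` are hypothetical names, not part of the platform's API. The exact set of permitted transitions (for example, which states a run can fail or be cancelled from) is an assumption.

```python
from enum import Enum


class RunState(Enum):
    QUEUED = "queued"
    PLANNING = "planning"
    AWAITING_APPROVAL = "awaiting_approval"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumed transition table; the real platform may permit other moves.
ALLOWED = {
    RunState.QUEUED: {RunState.PLANNING, RunState.FAILED, RunState.CANCELLED},
    RunState.PLANNING: {
        RunState.AWAITING_APPROVAL,  # Safe Mode on
        RunState.RUNNING,            # Safe Mode off
        RunState.FAILED,
        RunState.CANCELLED,
    },
    RunState.AWAITING_APPROVAL: {
        RunState.RUNNING,    # plan approved
        RunState.PLANNING,   # feedback requested, agent re-plans
        RunState.CANCELLED,  # plan discarded (assumed to end as cancelled)
    },
    RunState.RUNNING: {
        RunState.PAUSED,
        RunState.COMPLETED,
        RunState.FAILED,
        RunState.CANCELLED,
    },
    RunState.PAUSED: {RunState.RUNNING, RunState.CANCELLED},
    RunState.COMPLETED: set(),
    RunState.FAILED: set(),
    RunState.CANCELLED: set(),
}


def can_transition(current: RunState, target: RunState) -> bool:
    """Return True if the assumed table allows current -> target."""
    return target in ALLOWED[current]


def happy_path(safe_mode: bool) -> list:
    """Return the stages of a run that completes without interruption."""
    path = [RunState.QUEUED, RunState.PLANNING]
    if safe_mode:
        path.append(RunState.AWAITING_APPROVAL)
    path += [RunState.RUNNING, RunState.COMPLETED]
    return path
```

With Safe Mode on, `happy_path(True)` passes through `AWAITING_APPROVAL`; with it off, planning leads straight to `RUNNING`.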
Launching a run
The two entry points are the global New run button (top of the Runs page) and the New run button on a scope detail page. The fields are the same either way.
1. Open Runs (or the scope you want to assess).
2. Click New run.
3. Confirm the scope.
4. Choose execution: cloud for public targets, the attached private runner for internal targets.
5. Choose an intelligence category if prompted.
6. Write a goal that describes the outcome you want.
7. Start the run.
8. Approve the plan when it pauses for review (Safe Mode only).
Writing a goal the agent can act on
The goal is the most important sentence you write per assessment. Treat it like a brief for a senior pentester: describe the surface, the security question, the constraints, and the output you want. Do not prescribe techniques — the agent will pick those itself.
- Name the surface — which app, API, workflow, or subnet should be assessed.
- State the question — what security property should be verified or attacked.
- Surface constraints — anything beyond the rules of engagement (e.g., specific accounts, specific hours).
- Ask for an output — validated findings, code locations, a readiness report, a specific finding to retest.
Goals to copy
- Discovery: "Discover the reachable attack surface for this scope, fingerprint technologies, map auth surfaces, and recommend follow-up priorities."
- Targeted: "Review the customer portal admin workflows for IDOR and privilege escalation using the test account in scope credentials."
- Retest: "Re-validate finding F-1234 and confirm whether the IDOR on /orders is resolved."
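One way to keep goals consistent across a team is to assemble them from the four parts described earlier: surface, question, constraints, and output. The following Python sketch is a hypothetical helper, not a platform feature; the platform takes the goal as free text, and the `GoalBrief` class and its sentence template are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GoalBrief:
    """The four parts of a goal; only constraints is optional."""
    surface: str           # which app, API, workflow, or subnet
    question: str          # the security property to verify or attack
    output: str            # what the run should deliver
    constraints: str = ""  # anything beyond the rules of engagement

    def missing_parts(self) -> list:
        """Names of required parts that were left blank."""
        required = ("surface", "question", "output")
        return [name for name in required if not getattr(self, name).strip()]

    def render(self) -> str:
        """Join the parts into a single goal string."""
        sentences = [f"Assess {self.surface} for {self.question}."]
        if self.constraints:
            sentences.append(f"Constraints: {self.constraints}.")
        sentences.append(f"Deliver {self.output}.")
        return " ".join(sentences)


brief = GoalBrief(
    surface="the customer portal admin workflows",
    question="IDOR and privilege escalation",
    constraints="use only the test account in scope credentials",
    output="validated findings with evidence",
)
print(brief.render())
```

The template deliberately has no slot for techniques, which matches the advice to leave technique selection to the agent.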
Choosing where the run executes
Execution location is mechanical: cloud for anything on the public internet, attached private runner for anything that lives behind a network boundary. If the scope contains internal CIDR targets, only private runner execution will reach them.
If the scope contains both public and internal targets, you can launch separate runs against each kind, or write a goal that explicitly names the surface you want the current run to cover.
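The public-versus-internal decision can be sketched with Python's standard `ipaddress` module. This is a rough illustration under stated assumptions: `is_internal` and `choose_execution` are hypothetical names, the return labels are invented, and hostnames are treated as public because the sketch does not resolve DNS. An internal-only hostname would therefore be misclassified.

```python
import ipaddress


def is_internal(target: str) -> bool:
    """True if target is an IP address or CIDR block in private address space."""
    try:
        network = ipaddress.ip_network(target, strict=False)
    except ValueError:
        # Not an IP or CIDR (for example a hostname). Assumed public here,
        # since this sketch performs no DNS resolution.
        return False
    return network.is_private


def choose_execution(targets: list) -> str:
    """Suggest where a run should execute for the given targets."""
    if not targets:
        raise ValueError("at least one target is required")
    flags = [is_internal(t) for t in targets]
    if all(flags):
        return "private_runner"
    if not any(flags):
        return "cloud"
    # Mixed scope: launch separate runs, or name the surface in the goal.
    return "split"
```

For example, `choose_execution(["app.example.com", "10.1.2.0/24"])` returns `"split"`, which corresponds to the mixed-scope case described above.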
Reviewing the plan
When Safe Mode is on, the run pauses after planning. You see the structured plan the agent drafted: which surfaces it intends to assess, which techniques it expects to need, what credentials or repositories it expects to use, and the testing depth it plans for.
You can approve the plan as-is, request feedback (edit the goal or rules of engagement and let the agent re-plan), or discard the plan if it is not appropriate.
- Approve — execution begins immediately.
- Request feedback — the run pauses; update the goal or scope and the agent re-plans.
- Discard — the run ends without execution. Credits are not consumed for a discarded plan.
The workbench — watching a live run
Once execution starts, the workbench is the live view. You can see what the agent is doing right now, which assets and findings are accumulating, and the run activity stream. You do not need to babysit — the workbench exists for visibility, not supervision.
You can pause a manual run at any time, ask the agent to take a specific direction, or cancel the run. Automated runs are read-only while active; if they need input, the app surfaces an explicit action.
Reading the run output
When a run finishes, the run detail page becomes the authoritative record. Each tab serves a clear purpose.
- Overview — status, duration, summary, and the goal that drove the run.
- Plan — the structured plan that was approved and executed.
- Findings — security issues created or updated by this run, with evidence and remediation guidance.
- Assets — attack surface discovered or touched, with state changes (candidate → validated → managed).
- Activity — request and discovery summaries, suitable for replay or audit.
- Reports — the stakeholder-ready writeup, when the run was scoped to produce one.
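The asset state progression shown on the Assets tab (candidate, validated, managed) can be pictured as an ordered sequence. The sketch below is a minimal illustration: the `promote` function is hypothetical, and the assumption that an asset advances one step at a time is not confirmed by this page.

```python
# Order taken from the Assets tab description; one-step promotion is assumed.
ASSET_STATES = ("candidate", "validated", "managed")


def promote(state: str) -> str:
    """Return the next asset state, or the same state if already managed."""
    index = ASSET_STATES.index(state)  # raises ValueError for unknown states
    if index == len(ASSET_STATES) - 1:
        return state
    return ASSET_STATES[index + 1]
```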
Pause, resume, or cancel
Manual runs can be paused at any time. The agent finishes any in-flight action, then waits. Resume restarts execution from where it stopped. Cancel ends the run; partial output (assets, findings, evidence) remains.
Credits are consumed as work happens, not at the end. Pausing or cancelling does not refund credits for work already performed but stops further consumption.
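The metering behavior can be modeled in a few lines of Python. This is a toy model, not the platform's billing logic: the `CreditMeter` class and the per-action costs in the usage example are invented for illustration. It shows only the two properties stated above: credits accrue per unit of work, and pausing or cancelling stops accrual without refunding anything.

```python
class CreditMeter:
    """Toy model of pay-as-you-go credit consumption for one run."""

    def __init__(self) -> None:
        self.consumed = 0
        self.state = "running"

    def record_action(self, cost: int) -> bool:
        """Charge for one completed action; ignored unless the run is running."""
        if self.state != "running":
            return False
        self.consumed += cost
        return True

    def pause(self) -> None:
        if self.state == "running":
            self.state = "paused"

    def resume(self) -> None:
        # Only a paused run can resume; a cancelled run stays cancelled.
        if self.state == "paused":
            self.state = "running"

    def cancel(self) -> None:
        self.state = "cancelled"


meter = CreditMeter()
meter.record_action(3)  # hypothetical cost of one action
meter.record_action(2)
meter.pause()
meter.record_action(5)  # ignored: the run is paused
print(meter.consumed)   # 5: earlier work stays charged, nothing new accrues
```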
What to do after a run
A finished run is the start of the operating loop, not the end. Triage findings, promote new assets, retest fixes when they ship, and turn recurring assessments into automations once the goal is stable enough to repeat.
