Reasoning Canvas | Runbook Technology

Runbooks manifest the playbook inside the system

A visual, reusable, triggerable pipeline. The canvas is not a drawing of the configuration. The canvas is the configuration interface.

Drop a node, and infrastructure materialises beneath it. Run the runbook and the DAG walks itself: source, staging, view, enrichment, action and evidence-grade case output.

Premise: Move runbooks out of Confluence and into executable case infrastructure
Response: Detection tools find the threat; Simply Discover completes the case
Output: Reusable runbook models, governed decisions and implementation-ready examples

1. The Premise

The manifested playbook.

A runbook is a visual, reusable, triggerable pipeline. Drop a node, and infrastructure materialises beneath it. Run the runbook and the DAG walks itself.

“Manual runbooks drift when they sit apart from the work. A manifested runbook keeps the playbook in the system, so authorised teams can tune what happens next and rely on the process to run consistently.”

Reasoning Canvas principle

2. What Runbooks Replace

Three failure modes made obsolete.

These show up in many organisations. A runbook replaces each one with a process that is operational rather than aspirational.

01. Wiki rot

Confluence playbooks

Written once, drift forever. Disconnected from the thing doing the work. The only honest version lives in someone’s head.

02. Hardcoded pipelines

Fixed policy flows

Inflexible by design. A policy tweak means a code change, operators cannot shape the flow directly, and bespoke work piles up.

03. Ad-hoc work

Manual investigation

Unrepeatable. The artefacts survive but the process does not. Analysts answer differently and nobody can prove why.

3. Anatomy

Five swimlanes, one executable DAG.

Each node maps to an operational layer in the runbook. The canvas remains the controlled configuration surface.

Sources
  • Approved custodians (Source connector)
Staging
  • Collected items (Evidence intake)
Views
  • External vendor comms (Scoped review view)
Enrich
  • Classify data type (Classification rule)
  • Classify sensitivity (Sensitivity rule)
Actions
  • Create DSAR case (Case action)
  • Export CISO report (Report action)

When the runbook runs, it walks this graph in topological order. Source nodes collect approved data. View nodes scope it. Enrich nodes classify it. Action nodes produce case outputs. The colours map to capability layers.

Lane | Node type | Materialises as | Status surface
Sources | source | Approved data locations, custodians and connectors | Collection readiness and progress
Staging | intake | Normalised evidence set ready for review and automation | Evidence intake progress
Views | view | Reusable filtered lens over the collected material | Scoped view readiness
Enrich | enrich | Classification, extraction, confidence and review rules | Enrichment progress and review state
Actions | action | Case creation, export packs, notifications and hand-off tasks | Output-specific action status
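
The topological walk over the runbook graph can be sketched as follows. This is a minimal illustration using Kahn's algorithm, not the platform's executor. `RunbookNode`, `RunbookWalker` and the edge tuples are hypothetical names, because the layout of the Graph column is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node shape; the real Graph jsonb layout is an assumption.
public sealed record RunbookNode(string Id, string Lane);

public static class RunbookWalker
{
    // Returns node ids so that every node appears after all of its upstream nodes
    // (Kahn's algorithm). Throws if the graph contains a cycle.
    public static IReadOnlyList<string> TopologicalOrder(
        IReadOnlyCollection<RunbookNode> nodes,
        IReadOnlyCollection<(string Upstream, string Downstream)> edges)
    {
        var inDegree = nodes.ToDictionary(n => n.Id, _ => 0);
        var downstream = nodes.ToDictionary(n => n.Id, _ => new List<string>());

        foreach (var (source, target) in edges)
        {
            downstream[source].Add(target);
            inDegree[target]++;
        }

        // Start with nodes that have no upstream dependency (typically sources).
        var ready = new Queue<string>(
            nodes.Where(n => inDegree[n.Id] == 0).Select(n => n.Id));
        var order = new List<string>();

        while (ready.Count > 0)
        {
            var id = ready.Dequeue();
            order.Add(id);
            foreach (var next in downstream[id])
            {
                // A node becomes ready once all of its upstream nodes have run.
                if (--inDegree[next] == 0) ready.Enqueue(next);
            }
        }

        if (order.Count != nodes.Count)
            throw new InvalidOperationException("Runbook graph contains a cycle.");

        return order;
    }
}
```

Given nodes declared in any order and edges source → staging → view → enrich → action, the method returns the ids in that lane order. A cycle is rejected rather than walked, which is why the graph has to stay a DAG.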

4. Canonical Example

Malicious URL response, end to end.

This runbook shows the platform pattern end to end. Each step is a node on the canvas. Each step turns into infrastructure.

00 Trigger
Watchlist match, security platform alert, or manual bad URL
One DAG supports all three entry points.

01 Source
Search the entire email estate for URL hash
Inbox, sent items and forwards across every mailbox.

02 View
Scope to matching items
Everyone who received it, sent it, forwarded it, when and to whom.

03 Enrich
Build the knowledge graph
Affected people, seniority, first and last seen, external third parties.

04 Synthesise
Generate the incident narrative
Blast radius, affected people by seniority and remediation steps.

05 Action: report
CISO-ready PDF
“We had an incident. This was the first email. These are the people affected. These are their levels. This is the remediation step we have taken.”

06 Action: LFS message
High-priority SimplySend to affected people
“We have removed access to these messages. Here is what happened and what you need to know.”

Value

The security lead has data for leadership immediately.

Three of these a week, each delivered as a completed case with drill-down through the knowledge graph. The security team tunes the runbook over time, and it improves with each iteration.

Operational metric

Security-team time on repeat incident response.

The metric covers data pull, knowledge-graph build, affected-people lookup and report drafting. The aim is to reduce repeated manual investigation effort and to measure the time returned to the security team.

5. Execution Models

Two ways a runbook can live.

Same DAG, different relationship to cases. Pick once at authoring time.

Mode 1. Ongoing

1 Runbook → 1 Case

A single monitoring case. The pipeline re-runs on each trigger; new data flows through the same DAG into the same case. Continuous compliance monitoring lives here.

Runbook → Monitoring case → Accruing detections

Mode 2. Triggered

1 Runbook → N Cases

Each trigger spawns a new case. The monitoring case becomes a dashboard; the triggered cases are investigations. Malicious URL response and DSAR intake live here.

Runbook → Case #1, Case #2, Case #3
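
The difference between the two modes comes down to how a trigger resolves to a case. The sketch below illustrates that decision only. `RunbookExecutionMode`, `CaseRecord` and `CaseResolver` are hypothetical names, the in-memory dictionary stands in for whatever persistence the platform uses, and the monitoring dashboard case that Triggered mode also keeps is left out for brevity.

```csharp
using System.Collections.Generic;

// Assumed enum members; the document only states that ExecutionMode is an enum.
public enum RunbookExecutionMode { Ongoing, Triggered }

// Hypothetical, trimmed-down case shape for illustration.
public sealed record CaseRecord(int Id, int RunbookId, int? TriggerDetectionId);

public sealed class CaseResolver
{
    // One long-lived monitoring case per runbook in Ongoing mode.
    private readonly Dictionary<int, CaseRecord> _monitoringCases = new();
    private int _nextCaseId = 1;

    public CaseRecord Resolve(int runbookId, RunbookExecutionMode mode, int? triggerDetectionId)
    {
        // Triggered: every trigger spawns a fresh case and records its provenance.
        if (mode == RunbookExecutionMode.Triggered)
            return new CaseRecord(_nextCaseId++, runbookId, triggerDetectionId);

        // Ongoing: reuse the existing monitoring case, creating it on first run.
        if (!_monitoringCases.TryGetValue(runbookId, out var monitoring))
        {
            monitoring = new CaseRecord(_nextCaseId++, runbookId, null);
            _monitoringCases[runbookId] = monitoring;
        }
        return monitoring;
    }
}
```

Calling `Resolve` twice in Ongoing mode for the same runbook returns the same case, while two calls in Triggered mode return two distinct cases.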

6. Triggers

Five ways in.

Manual, scheduled and polling triggers are available now; webhook and platform-event triggers are planned extensions.

Type | Mechanism | Canonical use | Status
Manual | User clicks Run and fills slot dialog | Ad-hoc investigation | Live
Schedule | Temporal cron workflow | Nightly compliance sweep | Live
Poll | Scheduled interval pattern | Watch for new detections matching criteria | Live
Webhook | External POST to controlled endpoint | Security alert integration | Planned
System event | Platform event match | Platform-generated events | Planned

7. The Core

The enrich node is where AI meets the contract.

Sources move bytes. Views filter them. Actions produce outputs. Enrich is where meaning gets attached and where the human authors the contract the model is held to.

Each LabelDefinition already has a ValueMode. Three options, mutually exclusive.

ValueSet

AI picks from a list

Predefined LabelAllowedValue entries. Example: Data Category - Financial, Health, Employment, Communications.

FreeText

AI writes arbitrary text

Open-ended annotation such as reviewer notes on a triage, useful when the answer space cannot be enumerated.

Standalone

Presence equals truth

No value at all. The label being applied is the signal, for example Contains PII as a binary flag.

LabelAllowedValue.ParentLabelAllowedValueId already lets values nest. The AI classifies at the leaf; the parent is a grouping affordance for the review UI.

Data Category (LabelDefinition)
Financial Data
  Bank Statements
  Tax Records
  Investment Docs
Health Data
  Medical Records
  Insurance Claims
Employment Data
Communications Data

Author the taxonomy once; the model navigates it. “Show me all Financial items” expands the tree in the review UI.
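
Expanding a parent such as Financial Data into its leaves is a plain tree walk over the self-referencing key. The sketch below assumes a flat list of rows. `AllowedValue` and `TaxonomyTree` are hypothetical names, and only `ParentLabelAllowedValueId` is taken from the model described above.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory projection of a LabelAllowedValue row.
public sealed record AllowedValue(int Id, string Value, int? ParentLabelAllowedValueId);

public static class TaxonomyTree
{
    // Returns the id of the given value plus the ids of all of its descendants.
    public static HashSet<int> SelfAndDescendants(IEnumerable<AllowedValue> values, int rootId)
    {
        // Index children by parent id so the walk does not rescan the list.
        var children = new Dictionary<int, List<int>>();
        foreach (var v in values)
        {
            if (v.ParentLabelAllowedValueId is int parentId)
            {
                if (!children.TryGetValue(parentId, out var list))
                    children[parentId] = list = new List<int>();
                list.Add(v.Id);
            }
        }

        var result = new HashSet<int>();
        var pending = new Stack<int>();
        pending.Push(rootId);
        while (pending.Count > 0)
        {
            var id = pending.Pop();
            if (!result.Add(id)) continue; // already visited; guards against bad data
            if (children.TryGetValue(id, out var kids))
                foreach (var kid in kids) pending.Push(kid);
        }
        return result;
    }
}
```

A review filter for "all Financial items" would then match any LabelApplied row whose value id is in the returned set, so items classified at the Bank Statements or Tax Records leaf are included.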

The current model creates one LabelApplied row per value, so multi-value is technically possible. The gap is that nothing tells the AI or backend whether to pick one or many.

public enum LabelCardinalityMode
{
    PickOne  = 0, // Exactly one value. AI picks the single best match.
    PickBest = 1, // Same as PickOne for AI, but UI shows confidence + override.
    PickMany = 2  // AI tags all that apply. Multiple LabelApplied rows.
}

Mode | Prompt instruction | Backend assertion | UI rendering
PickOne | Select exactly one value from the list. | Reject if >1 row per Detection | Radio buttons
PickBest | Select the single most appropriate value. | Reject if >1 row per Detection | Radio + confidence + override
PickMany | Select all values that apply. | No constraint, 0..N rows | Checkboxes

The label configuration drives the prompt. There is no separate AI instruction system; the user authors the contract and the pipeline materialises it.
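
As a rough illustration of how one setting can drive both the prompt and the backend check, the sketch below maps each mode to its instruction text and row-count rule. The enum is repeated so the snippet compiles on its own. `CardinalityContract` and its method names are hypothetical, not the platform's API.

```csharp
using System;

// Repeated from the enum shown earlier in this section so the sketch is self-contained.
public enum LabelCardinalityMode
{
    PickOne = 0,
    PickBest = 1,
    PickMany = 2
}

public static class CardinalityContract
{
    // The sentence added to the classification prompt for a label.
    public static string PromptInstruction(LabelCardinalityMode mode) => mode switch
    {
        LabelCardinalityMode.PickOne => "Select exactly one value from the list.",
        LabelCardinalityMode.PickBest => "Select the single most appropriate value.",
        LabelCardinalityMode.PickMany => "Select all values that apply.",
        _ => throw new ArgumentOutOfRangeException(nameof(mode))
    };

    // Backend assertion: PickOne and PickBest reject more than one row per Detection;
    // PickMany accepts any non-negative number of rows.
    public static bool IsValidRowCount(LabelCardinalityMode mode, int rowsForDetection)
    {
        if (rowsForDetection < 0)
            throw new ArgumentOutOfRangeException(nameof(rowsForDetection));

        return mode == LabelCardinalityMode.PickMany || rowsForDetection <= 1;
    }
}
```

Zero rows pass the check in every mode here, on the assumption that a low-confidence response can legitimately produce no LabelApplied row.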

Each label gets two thresholds, per LabelDefinition, not global. Sensitivity might run hot at 80/60; Topic might run loose at 60/30.

≥ 80%: Auto-apply (AutoApplyThreshold). LabelApplied created. TimeProcessed = now.
50-80%: Review band (ReviewBandFloor). LabelApplied created. TimeProcessed = null. Human review required.
< 50%: Skip (below floor). No LabelApplied row. The model was not confident enough.
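
The three bands reduce to two comparisons per label. The sketch below takes the thresholds as plain integers. `ConfidenceOutcome` and `ConfidenceRouter` are hypothetical names, and how a null threshold should behave is not specified here, so the sketch leaves it out.

```csharp
using System;

public enum ConfidenceOutcome { AutoApply, Review, Skip }

public static class ConfidenceRouter
{
    // Routes a confidence percentage into one of the three bands for a single label.
    public static ConfidenceOutcome Route(int confidence, int autoApplyThreshold, int reviewBandFloor)
    {
        if (reviewBandFloor > autoApplyThreshold)
            throw new ArgumentException("ReviewBandFloor must not exceed AutoApplyThreshold.");

        if (confidence >= autoApplyThreshold) return ConfidenceOutcome.AutoApply;
        if (confidence >= reviewBandFloor) return ConfidenceOutcome.Review;
        return ConfidenceOutcome.Skip; // no LabelApplied row is created
    }

    // Auto-applied rows are stamped immediately; review-band rows stay null until a human acts.
    public static DateTime? TimeProcessedFor(ConfidenceOutcome outcome, DateTime now) =>
        outcome == ConfidenceOutcome.AutoApply ? now : (DateTime?)null;
}
```

With thresholds of 80 and 50, a confidence of 80 auto-applies, 79 and 50 land in the review band, and 49 is skipped.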

When the runbook reaches an enrich node, six things happen in order.

  1. Create a Query on the case, scoped to the upstream view node’s filter.
  2. Create LabelDefinitions from the enrich config: cardinality, thresholds, allowed values and hierarchy.
  3. Build the classification prompt from the LabelDefinitions.
  4. Run the existing classification pipeline: AI, Detections, LabelsApplied.
  5. Validate responses against cardinality constraints. Reject or truncate non-conforming responses and log warnings.
  6. Apply confidence thresholds: auto-apply, review or skip.

The enrich node reuses the existing pipeline. The new work is the cardinality enum, the thresholds and the prompt-generation logic that reads label configuration.

8. The Schema

How it fits together.

Six entities, two new columns and one new FK in the source model; the implementation list below also captures the review floor field explicitly.

Runbook

  • Id int
  • Name string
  • Graph jsonb
  • TriggerConfig jsonb
  • ExecutionMode enum

Case

  • Id int
  • CaseType enum, Runbook=10
  • RunbookId → Runbook
  • TriggerDetectionId → Detection
  • Provenance and status

CaseMailboxConfiguration

  • CaseId → Case
  • Mailboxes / Custodians
  • Status enum

ViewSnapshot / CaseView

  • CaseId → Case
  • Filter jsonb
  • State enum

Query

  • CaseId → Case
  • Status enum

LabelDefinition

  • Id, Name, ValueMode
  • QueryId → Query
  • CardinalityMode enum
  • AutoApplyThreshold int?
  • ReviewBandFloor int?
  • Existing colour/export fields

LabelAllowedValue

  • LabelDefinitionId → LD
  • Value string
  • Color string
  • ParentLabelAllowedValueId self

Detection

  • Id int
  • QueryId → Query
  • ItemRef string

LabelApplied

  • DetectionId → Detection
  • LabelDefinitionId → LD
  • LabelAllowedValueId → LAV?
  • Confidence int
  • TimeProcessed datetime?

Table | Field | Type | Purpose
LabelDefinitions | CardinalityMode (new) | int enum, not null, default 0 | How many values the AI should select
LabelDefinitions | AutoApplyThreshold (new) | int?, nullable | Confidence percent at or above which to auto-apply
LabelDefinitions | ReviewBandFloor (new) | int?, nullable | Confidence percent below which to skip entirely
Cases | TriggerDetectionId (new) | int? FK → Detections | Provenance: which Detection triggered this case

9. The Three Layers

Templates first, compounds second, primitives last.

Palette ordering does real work. New users reach for what is at the top; power users scroll down.

L3 Template
Malicious URL Response
Zero-config, pre-baked by a senior analyst. A junior officer fills slots.
Case officer
L2 Compound
Compound.Distill, Compound.Classify, Compound.Profile
Single node, configurable and promotable. Advanced users can reveal the underlying L1 subgraph.
Regular officer
L1 Primitive
Filter, Label, Extract, Synthesise, Relate
Full control for senior analysts building templates and reusable compound patterns.
Senior analyst

Senior analysts compose primitives into compounds, compounds into templates, and the customer’s library accumulates over time.

10. Status

Where the work stands.

Available foundation (Live)

  • Runbook definitions with graph and trigger configuration
  • Runbook-backed case records
  • Authoring, editing and governance services
  • Case linkage to active runbook definitions
  • Trigger infrastructure: arm/disarm, scheduled and polling runs
  • Manual run with governed slot inputs
  • Frontend canvas authoring surface

Planned capability layer

  • Single-value and multi-value label behaviour (Core)
  • Auto-apply confidence thresholds (Core)
  • Human review floor controls (Core)
  • Prompt instructions generated from runbook configuration (Core)
  • Validation of classification responses (Core)
  • Confidence threshold logic (Core)
  • Enrich node configuration panel, tree editor and sliders (Next)
  • Case-creation action with trigger provenance (Core)
  • Scheduled and event-driven trigger handling (Core)

Runbook template library / sharing

Versioning of runbook definitions

Branching / conditional DAG edges

Cross-runbook dependencies

Policy-template migration

Webhook + system event triggers

11. The Vision

A runbook is a visual, reusable, triggerable pipeline.

The canvas is the configuration interface. Labels define how AI classifies and what it can choose. The whole thing materialises into infrastructure already built, and every capability becomes customer-authorable.

Policy templates

Become system templates.

Every hardcoded policy can be lifted into an L3 template that preserves the working pattern without keeping it trapped in code.

DSAR pipelines

Become customer-tunable runbooks.

Customers author at L2 and tune the workflow without waiting for engineering to move a filter or threshold.

Power users

Compose primitives into new operating models.

The system gets smarter as runbooks accumulate cross-case learning and repeatable patterns.
