Reasoning Canvas | Runbook Technology

Runbooks manifest the playbook inside the system

A visual, reusable, triggerable pipeline. The canvas is not a drawing of the configuration. The canvas is the configuration interface.

Drop a node, and infrastructure materialises beneath it. Run the runbook and the DAG walks itself: source, staging, view, enrichment, action and evidence-grade case output.

Premise: Move runbooks out of Confluence and into executable case infrastructure.

Response: Detection tools find the threat; Simply Discover completes the case.

Output: Reusable runbook models, governed decisions and implementation-ready examples.


1. The Premise

The manifested playbook.

A runbook is a visual, reusable, triggerable pipeline. Drop a node, and infrastructure materialises beneath it. Run the runbook and the DAG walks itself.

“Manual runbooks drift when they sit apart from the work. A manifested runbook keeps the playbook in the system, so authorised teams can tune what happens next and rely on the process to run consistently.”
Reasoning Canvas principle

2. What Runbooks Replace

Three failure modes made obsolete.

These show up in many organisations. A runbook makes each one operational rather than aspirational.

01. Wiki rot

Confluence playbooks

Written once, drift forever. Disconnected from the thing doing the work. The only honest version lives in someone’s head.

02. Hardcoded pipelines

Fixed policy flows

Inflexible by design. A policy tweak means a code change, operators cannot shape the flow directly, and bespoke work piles up.

03. Ad-hoc work

Manual investigation

Unrepeatable. The artefacts survive but the process does not. Analysts answer differently and nobody can prove why.

3. Anatomy

Five swimlanes, one executable DAG.

Each node maps to an operational layer in the runbook. The canvas remains the controlled configuration surface.

Sources

Approved custodians (Source connector)

Staging

Collected items (Evidence intake)

Views

External vendor comms (Scoped review view)

Enrich

Classify data type (Classification rule)

Classify sensitivity (Sensitivity rule)

Actions

Create DSAR case (Case action)

Export CISO report (Report action)

When the runbook runs, it walks this graph in topological order. Source nodes collect approved data. View nodes scope it. Enrich nodes classify it. Action nodes produce case outputs. The colours map to capability layers.

Lane	Node type	Materialises as	Status surface
Sources	`source`	Approved data locations, custodians and connectors	Collection readiness and progress
Staging	`intake`	Normalised evidence set ready for review and automation	Evidence intake progress
Views	`view`	Reusable filtered lens over the collected material	Scoped view readiness
Enrich	`enrich`	Classification, extraction, confidence and review rules	Enrichment progress and review state
Actions	`action`	Case creation, export packs, notifications and hand-off tasks	Output-specific action status
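
As a minimal sketch of the topological walk described above, the following Python uses the standard-library `graphlib` module. The node ids and the edges between them are hypothetical, loosely based on the Anatomy example; the platform's real graph format (the `Graph jsonb` field) is not shown in this write-up.

```python
from graphlib import TopologicalSorter

# Hypothetical runbook graph: each node id maps to the set of nodes it depends on.
GRAPH = {
    "source.custodians": set(),
    "intake.collected": {"source.custodians"},
    "view.vendor_comms": {"intake.collected"},
    "enrich.data_type": {"view.vendor_comms"},
    "enrich.sensitivity": {"view.vendor_comms"},
    "action.create_case": {"enrich.data_type", "enrich.sensitivity"},
    "action.export_report": {"action.create_case"},
}


def walk(graph):
    """Return node ids in an order where every node follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = walk(GRAPH)
for step, node_id in enumerate(order, start=1):
    print(step, node_id)
```

The two enrich nodes have no edge between them, so either may run first; only the dependency order is guaranteed.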

4. Canonical Example

Malicious URL response, end to end.

This runbook shows the platform pattern end to end. Each step is a node on the canvas. Each step turns into infrastructure.

Trigger

Watchlist match, security platform alert, or manual bad URL
One DAG supports all three entry points.

Source

Search the entire email estate for URL hash
Inbox, sent items and forwards across every mailbox.

View

Scope to matching items
Everyone who received it, sent it, forwarded it, when and to whom.

Enrich

Build the knowledge graph
Affected people, seniority, first and last seen, external third parties.

Synthesise

Generate the incident narrative
Blast radius, affected people by seniority and remediation steps.

Action: report

CISO-ready PDF
“We had an incident. This was the first email. These are the people affected. These are their levels. This is the remediation step we have taken.”

Action: LFS message

High-priority SimplySend to affected people
“We have removed access to these messages. Here is what happened and what you need to know.”
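
The Source and View steps above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `Message` shape, the sample addresses, and the choice of SHA-256 over a lower-cased URL (the write-up does not say how the URL hash is computed).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    recipients: tuple
    urls: tuple
    sent_at: str  # ISO 8601 in UTC, so string order matches time order


def url_hash(url: str) -> str:
    # Assumption: SHA-256 of the trimmed, lower-cased URL. Lower-casing the
    # whole URL is a simplification; real URL paths can be case-sensitive.
    return hashlib.sha256(url.strip().lower().encode("utf-8")).hexdigest()


def scope(messages, bad_url):
    """Keep messages carrying the bad URL and summarise who and when."""
    target = url_hash(bad_url)
    hits = [m for m in messages if any(url_hash(u) == target for u in m.urls)]
    people = sorted({p for m in hits for p in (m.sender, *m.recipients)})
    times = sorted(m.sent_at for m in hits)
    return {
        "items": len(hits),
        "people": people,
        "first_seen": times[0] if times else None,
        "last_seen": times[-1] if times else None,
    }


MAILBOX = [
    Message("attacker@evil.example", ("ana@corp.example",),
            ("http://bad.example/login",), "2024-05-01T09:00:00Z"),
    Message("ana@corp.example", ("ben@corp.example",),
            ("HTTP://bad.example/login",), "2024-05-01T09:30:00Z"),
    Message("ben@corp.example", ("cy@corp.example",),
            ("https://ok.example",), "2024-05-01T10:00:00Z"),
]

result = scope(MAILBOX, "http://bad.example/login")
print(result)
```

The summary covers senders and recipients together, which matches the "received it, sent it, forwarded it" scope of the View step.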

Value

The security lead has data for leadership immediately.

Three of these a week, each delivered as a completed case with drill-down through the knowledge graph. The team tunes the runbook over time and it gets better.

Operational metric

Security-team time on repeat incident response.

This covers the data pull, knowledge-graph build, affected-people lookup and report drafting. The aim is to reduce repeated manual investigation effort and to measure the time returned to the security team.

5. Execution Models

Two ways a runbook can live.

Same DAG, different relationship to cases. Pick once at authoring time.

Mode 1. Ongoing

1 Runbook → 1 Case

A single monitoring case. The pipeline re-runs on each trigger; new data flows through the same DAG into the same case. Continuous compliance monitoring lives here.

Runbook → Monitoring case → Accruing detections

Mode 2. Triggered

1 Runbook → N Cases

Each trigger spawns a new case. The monitoring case becomes a dashboard; the triggered cases are investigations. Malicious URL response and DSAR intake live here.

Runbook → Case #1, Case #2, Case #3
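
The difference between the two modes comes down to which case a trigger lands in. A minimal Python sketch, assuming a hypothetical in-memory `Runbook` object and integer case ids (the monitoring-case dashboard in Mode 2 is left out for brevity):

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class ExecutionMode(Enum):
    ONGOING = "ongoing"      # 1 runbook -> 1 case
    TRIGGERED = "triggered"  # 1 runbook -> N cases


@dataclass
class Runbook:
    name: str
    mode: ExecutionMode
    cases: list = field(default_factory=list)


_case_ids = count(1)  # stand-in for database-assigned ids


def case_for_trigger(runbook: Runbook) -> int:
    """Return the case id that this trigger's data should flow into."""
    if runbook.mode is ExecutionMode.ONGOING and runbook.cases:
        return runbook.cases[0]      # reuse the single monitoring case
    case_id = next(_case_ids)        # first run, or triggered mode: new case
    runbook.cases.append(case_id)
    return case_id


sweep = Runbook("Nightly compliance sweep", ExecutionMode.ONGOING)
sweep_ids = [case_for_trigger(sweep) for _ in range(3)]

url_response = Runbook("Malicious URL response", ExecutionMode.TRIGGERED)
url_ids = [case_for_trigger(url_response) for _ in range(3)]
```

Three triggers on the ongoing runbook reuse one case id; three triggers on the triggered runbook produce three distinct ids.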

6. Triggers

Five ways in.

Manual, scheduled and polling triggers are available now; webhook and platform-event triggers are planned extensions.

Type	Mechanism	Canonical use	Status
Manual	User clicks Run and fills in the slot dialog	Ad-hoc investigation	Live
Schedule	Temporal cron workflow	Nightly compliance sweep	Live
Poll	Scheduled interval pattern	Watch for new detections matching criteria	Live
Webhook	External POST to controlled endpoint	Security alert integration	Planned
System event	Platform event match	Platform-generated events	Planned
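
One way to read the table is as a validation rule on trigger configuration. The sketch below is an assumption-laden illustration: the config keys (`type`, `cron`, `interval_seconds`) are hypothetical and do not come from the platform's `TriggerConfig` schema, which this write-up does not show.

```python
from enum import Enum


class TriggerType(Enum):
    MANUAL = "manual"
    SCHEDULE = "schedule"
    POLL = "poll"
    WEBHOOK = "webhook"
    SYSTEM_EVENT = "system_event"


# Status column from the table above: three live, two planned.
LIVE = {TriggerType.MANUAL, TriggerType.SCHEDULE, TriggerType.POLL}


def validate_trigger_config(config: dict) -> TriggerType:
    """Check a hypothetical trigger config and return its type."""
    trigger = TriggerType(config["type"])  # ValueError for unknown types
    if trigger not in LIVE:
        raise NotImplementedError(f"{trigger.value} triggers are planned, not live")
    if trigger is TriggerType.SCHEDULE and "cron" not in config:
        raise ValueError("schedule triggers need a 'cron' expression")
    if trigger is TriggerType.POLL and "interval_seconds" not in config:
        raise ValueError("poll triggers need 'interval_seconds'")
    return trigger


nightly = validate_trigger_config({"type": "schedule", "cron": "0 2 * * *"})
print(nightly)
```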

7. The Core

The enrich node is where AI meets the contract.

Sources move bytes. Views filter them. Actions produce outputs. Enrich is where meaning gets attached and where the human authors the contract the model is held to.

Each LabelDefinition already has a ValueMode. Three options, mutually exclusive.

ValueSet

AI picks from a list

Predefined LabelAllowedValue entries. Example: Data Category - Financial, Health, Employment, Communications.

FreeText

AI writes arbitrary text

Open-ended annotation such as reviewer notes on a triage, useful when the answer space cannot be enumerated.

Standalone

Presence equals truth

No value at all. The label being applied is the signal, for example Contains PII as a binary flag.
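
The three value modes imply different checks on what the model returns. A minimal Python sketch, assuming a hypothetical `check_value` helper (the platform's own validation is not described here):

```python
from enum import Enum
from typing import Optional


class ValueMode(Enum):
    VALUE_SET = "ValueSet"      # AI picks from a list
    FREE_TEXT = "FreeText"      # AI writes arbitrary text
    STANDALONE = "Standalone"   # presence equals truth, no value


def check_value(mode: ValueMode, value: Optional[str], allowed=frozenset()) -> bool:
    """Return True if the value is acceptable for the given mode."""
    if mode is ValueMode.VALUE_SET:
        return value in allowed                      # must be a predefined entry
    if mode is ValueMode.FREE_TEXT:
        return isinstance(value, str) and value.strip() != ""
    return value is None                             # Standalone carries no value


DATA_CATEGORY = frozenset({"Financial", "Health", "Employment", "Communications"})

print(check_value(ValueMode.VALUE_SET, "Health", DATA_CATEGORY))
print(check_value(ValueMode.STANDALONE, None))
```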

LabelAllowedValue.ParentLabelAllowedValueId already lets values nest. The AI classifies at the leaf; the parent is a grouping affordance for the review UI.

Data Category (LabelDefinition)
Financial Data
  Bank Statements
  Tax Records
  Investment Docs
Health Data
  Medical Records
  Insurance Claims
Employment Data
Communications Data

Author the taxonomy once; the model navigates it. “Show me all Financial items” expands the tree in the review UI.
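
The parent/child relationship can be sketched in Python as follows. The rows mirror the example tree above; the numeric ids are invented, and `expand` and `leaves` are hypothetical helpers, not platform functions.

```python
from collections import defaultdict

# (id, value, parent_id) rows mirroring LabelAllowedValue; ids are invented.
ROWS = [
    (1, "Financial Data", None),
    (2, "Bank Statements", 1),
    (3, "Tax Records", 1),
    (4, "Investment Docs", 1),
    (5, "Health Data", None),
    (6, "Medical Records", 5),
    (7, "Insurance Claims", 5),
    (8, "Employment Data", None),
    (9, "Communications Data", None),
]


def build_children(rows):
    """Map each parent id to the ids of its direct children."""
    children = defaultdict(list)
    for row_id, _value, parent_id in rows:
        children[parent_id].append(row_id)
    return children


def expand(root_id, children):
    """Return root_id plus all descendant ids, depth-first."""
    found, stack = [], [root_id]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(reversed(children.get(node, [])))
    return found


def leaves(rows):
    """Values with no children: the level the AI classifies at."""
    children = build_children(rows)
    return [value for row_id, value, _parent in rows if row_id not in children]


CHILDREN = build_children(ROWS)
financial_ids = expand(1, CHILDREN)   # what "all Financial items" filters on
leaf_values = leaves(ROWS)
```

A value with no children, such as Employment Data, is itself a leaf, so the model can still select it directly.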

The current model creates one LabelApplied row per value, so multi-value is technically possible. The gap is that nothing tells the AI or backend whether to pick one or many.

public enum LabelCardinalityMode
{
    PickOne  = 0, // Exactly one value. AI picks the single best match.
    PickBest = 1, // Same as PickOne for AI, but UI shows confidence + override.
    PickMany = 2  // AI tags all that apply. Multiple LabelApplied rows.
}

Mode	Prompt instruction	Backend assertion	UI rendering
PickOne	Select exactly one value from the list.	Reject if >1 row per Detection	Radio buttons
PickBest	Select the single most appropriate value.	Reject if >1 row per Detection	Radio + confidence + override
PickMany	Select all values that apply.	No constraint, 0..N rows	Checkboxes

The label configuration drives the prompt. There is no separate AI instruction system; the user authors the contract and the pipeline materialises it.
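
A minimal Python sketch of how label configuration could drive the prompt. The instruction strings are taken from the table above; the surrounding prompt layout and the `build_prompt` helper are assumptions for illustration.

```python
from enum import IntEnum


class LabelCardinalityMode(IntEnum):
    PICK_ONE = 0
    PICK_BEST = 1
    PICK_MANY = 2


# Prompt instruction column from the table above.
INSTRUCTION = {
    LabelCardinalityMode.PICK_ONE: "Select exactly one value from the list.",
    LabelCardinalityMode.PICK_BEST: "Select the single most appropriate value.",
    LabelCardinalityMode.PICK_MANY: "Select all values that apply.",
}


def build_prompt(label_name, allowed_values, mode):
    """Assemble a classification prompt fragment for one label (hypothetical layout)."""
    lines = [f"Label: {label_name}", INSTRUCTION[mode], "Allowed values:"]
    lines += [f"- {value}" for value in allowed_values]
    return "\n".join(lines)


prompt = build_prompt(
    "Data Category",
    ["Financial", "Health", "Employment", "Communications"],
    LabelCardinalityMode.PICK_MANY,
)
print(prompt)
```

Changing the cardinality mode on the label changes the instruction line and nothing else, which is the sense in which the user authors the contract.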

Each label gets two thresholds, set per LabelDefinition rather than globally. Sensitivity might run hot at 80/60 (auto-apply at 80, review floor at 60); Topic might run loose at 60/30.

Auto-apply (≥ 80%, AutoApplyThreshold)
LabelApplied created. TimeProcessed = now.

Review band (50% to < 80%, ReviewBandFloor)
LabelApplied created. TimeProcessed = null. Human review required.

Skip (< 50%, below floor)
No LabelApplied row. The model was not confident enough.
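
The three bands can be expressed as a small decision function. This is a sketch under stated assumptions: the `triage` name and the string outcomes are hypothetical, the defaults of 80 and 50 are the example values shown above, and the handling of null thresholds is not covered because the write-up does not define it.

```python
def triage(confidence: int, auto_apply: int = 80, review_floor: int = 50) -> str:
    """Map a confidence percent to 'auto_apply', 'review' or 'skip'."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be a percent between 0 and 100")
    if review_floor > auto_apply:
        raise ValueError("review floor cannot exceed the auto-apply threshold")
    if confidence >= auto_apply:
        return "auto_apply"   # LabelApplied created, TimeProcessed = now
    if confidence >= review_floor:
        return "review"       # LabelApplied created, TimeProcessed = null
    return "skip"             # no LabelApplied row


for score in (92, 80, 79, 50, 49):
    print(score, triage(score))
```

Per-label thresholds are passed as arguments, so a looser label such as Topic could call `triage(score, 60, 30)`.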

When the runbook reaches an enrich node, six things happen in order.

Create a Query on the case, scoped to the upstream view node’s filter.
Create LabelDefinitions from the enrich config: cardinality, thresholds, allowed values and hierarchy.
Build the classification prompt from the LabelDefinitions.
Run the existing classification pipeline: AI, Detections, LabelsApplied.
Validate responses against cardinality constraints. Reject or truncate non-conforming responses and log warnings.
Apply confidence thresholds: auto-apply, review or skip.

The enrich node reuses the existing pipeline. The new work is the cardinality enum, the thresholds and the logic that injects label configuration into the prompt.
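
Step five, validating responses against cardinality, might look like the Python sketch below. It takes the "truncate" option and keeps the highest-confidence candidate; that tie-breaking rule is an assumption, as is the `(value, confidence)` response shape. Warnings are returned rather than logged so the behaviour is easy to inspect.

```python
from enum import IntEnum


class LabelCardinalityMode(IntEnum):
    PICK_ONE = 0
    PICK_BEST = 1
    PICK_MANY = 2


def enforce_cardinality(mode, candidates):
    """Truncate a model response to fit the label's cardinality.

    candidates is a list of (value, confidence) pairs.
    Returns (kept_candidates, warnings).
    """
    if mode is LabelCardinalityMode.PICK_MANY or len(candidates) <= 1:
        return list(candidates), []          # 0..N rows allowed, or already conforming
    best = max(candidates, key=lambda pair: pair[1])
    warning = (
        f"{mode.name} allows one value but the model returned "
        f"{len(candidates)}; kept {best[0]!r}"
    )
    return [best], [warning]


response = [("Financial", 71), ("Employment", 88)]
kept, warnings = enforce_cardinality(LabelCardinalityMode.PICK_ONE, response)
print(kept, warnings)
```

A stricter variant could reject the whole response instead of truncating it, which matches the "Reject if >1 row per Detection" assertion in the cardinality table.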

8. The Schema

How it fits together.

The source model counts six entities, two new columns and one new FK. The listing below shows nine entities, and the implementation table also captures the review floor field explicitly, which brings the new columns to three.

Runbook

Id int
Name string
Graph jsonb
TriggerConfig jsonb
ExecutionMode enum

Case

Id int
CaseType enum, Runbook=10
RunbookId → Runbook
TriggerDetectionId → Detection
Provenance and status

CaseMailboxConfiguration

CaseId → Case
Mailboxes / Custodians
Status enum

ViewSnapshot / CaseView

CaseId → Case
Filter jsonb
State enum

Query

CaseId → Case
Status enum

LabelDefinition

Id, Name, ValueMode
QueryId → Query
CardinalityMode enum
AutoApplyThreshold int?
ReviewBandFloor int?
Existing colour/export fields

LabelAllowedValue

LabelDefinitionId → LD
Value string
Color string
ParentLabelAllowedValueId self

Detection

Id int
QueryId → Query
ItemRef string

LabelApplied

DetectionId → Detection
LabelDefinitionId → LD
LabelAllowedValueId → LAV?
Confidence int
TimeProcessed datetime?

Table	Field	Type	Purpose
LabelDefinitions	`CardinalityMode` (new)	int enum, not null, default 0	How many values the AI should select
LabelDefinitions	`AutoApplyThreshold` (new)	int?, nullable	Confidence percent at or above which to auto-apply
LabelDefinitions	`ReviewBandFloor` (new)	int?, nullable	Confidence percent below which to skip entirely
Cases	`TriggerDetectionId` (new)	int? FK → Detections	Provenance: which Detection triggered this case
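
The new LabelDefinition fields and their defaults can be modelled as a Python dataclass. This is an illustrative sketch, not the platform's entity class: the snake_case names and the range checks in `__post_init__` are assumptions layered on the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelDefinition:
    id: int
    name: str
    value_mode: str                               # existing: ValueSet, FreeText or Standalone
    query_id: int                                 # existing FK to Query
    cardinality_mode: int = 0                     # new: not null, default 0 (PickOne)
    auto_apply_threshold: Optional[int] = None    # new: nullable percent
    review_band_floor: Optional[int] = None       # new: nullable percent

    def __post_init__(self):
        # Assumed sanity checks; the source only states that these are percents.
        for field_name in ("auto_apply_threshold", "review_band_floor"):
            value = getattr(self, field_name)
            if value is not None and not 0 <= value <= 100:
                raise ValueError(f"{field_name} must be a percent between 0 and 100")
        if (
            self.auto_apply_threshold is not None
            and self.review_band_floor is not None
            and self.review_band_floor > self.auto_apply_threshold
        ):
            raise ValueError("review_band_floor cannot exceed auto_apply_threshold")


legacy = LabelDefinition(1, "Contains PII", "Standalone", query_id=7)
sensitivity = LabelDefinition(2, "Sensitivity", "ValueSet", 7, 1, 80, 60)
```

Because the new columns default to 0 and null, a definition created before the change, like `legacy` here, stays valid without any threshold values.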

9. The Three Layers

Templates first, compounds second, primitives last.

Palette ordering does real work. New users reach for what is at the top; power users scroll down.

L3 Template

Malicious URL Response
Zero-config and pre-built by a senior analyst. A junior officer fills in the slots.

Audience: Case officer

L2 Compound

Compound.Distill, Compound.Classify, Compound.Profile
Single node, configurable and promotable. Advanced users can reveal the underlying L1 subgraph.

Audience: Regular officer

L1 Primitive

Filter, Label, Extract, Synthesise, Relate
Full control for senior analysts building templates and reusable compound patterns.

Audience: Senior analyst

Senior analysts compose primitives into compounds, compounds into templates, and the customer’s library accumulates over time.
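
The layering can be pictured as recursive expansion from a template down to primitives. In the Python sketch below, the primitive, compound and template names come from the text above, but which primitives sit inside each compound, and which compounds make up the template, is invented purely for illustration.

```python
PRIMITIVES = {"Filter", "Label", "Extract", "Synthesise", "Relate"}

# Hypothetical compositions; the real subgraphs are not described in this write-up.
COMPOUNDS = {
    "Compound.Distill": ["Filter", "Extract"],
    "Compound.Classify": ["Filter", "Label"],
    "Compound.Profile": ["Extract", "Relate", "Synthesise"],
}
TEMPLATES = {
    "Malicious URL Response": [
        "Compound.Distill",
        "Compound.Classify",
        "Compound.Profile",
    ],
}


def expand_to_primitives(name):
    """Reveal the L1 primitives underneath a template or compound."""
    if name in PRIMITIVES:
        return [name]
    parts = COMPOUNDS.get(name) or TEMPLATES.get(name)
    if parts is None:
        raise KeyError(f"unknown palette entry: {name}")
    expanded = []
    for part in parts:
        expanded.extend(expand_to_primitives(part))
    return expanded


l1_view = expand_to_primitives("Malicious URL Response")
print(l1_view)
```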

10. Status

Where the work stands.

Available foundation (Live)

Runbook definitions with graph and trigger configuration
Runbook-backed case records
Authoring, editing and governance services
Case linkage to active runbook definitions
Trigger infrastructure: arm/disarm, scheduled and polling runs
Manual run with governed slot inputs
Frontend canvas authoring surface

Planned capability layer

Single-value and multi-value label behaviour (Core)
Auto-apply confidence thresholds (Core)
Human review floor controls (Core)
Prompt instructions generated from runbook configuration (Core)
Validation of classification responses (Core)
Confidence threshold logic (Core)
Enrich node configuration panel, tree editor and sliders (Next)
Case-creation action with trigger provenance (Core)
Scheduled and event-driven trigger handling (Core)

Runbook template library / sharing

Versioning of runbook definitions

Branching / conditional DAG edges

Cross-runbook dependencies

Policy-template migration

Webhook + system event triggers

11. The Vision

A runbook is a visual, reusable, triggerable pipeline.

The canvas is the configuration interface. Labels define how AI classifies and what it can choose. The whole thing materialises into infrastructure already built, and every capability becomes customer-authorable.

Policy templates

Become system templates.

Every hardcoded policy can be lifted into an L3 template that preserves the working pattern without keeping it trapped in code.

DSAR pipelines

Become customer-tunable runbooks.

Customers author at L2 and tune the workflow without waiting for engineering to move a filter or threshold.

Power users

Compose primitives into new operating models.

The system gets smarter as runbooks accumulate cross-case learning and repeatable patterns.
