Anchor Engine

Anchor is the stateful core of Warden. While Reflex makes binary safety decisions on individual tool calls, Anchor tracks the session as a whole — where it started, where it is now, whether it is drifting, and how much the assistant can be trusted.

Anchor contains five modules: Compass, Focus, Ledger, Debt, and Trust.

Compass: Phase Detection

Every coding session follows a natural arc. Compass models this arc as five phases:

Phase	Description	Typical behavior
Orientation	Understanding the codebase and task	Reading files, searching symbols, asking questions
Exploring	Investigating approaches and gathering context	Running tests, reading documentation, trying small experiments
Building	Active implementation	Writing code, creating files, running builds
Verifying	Testing and validating the implementation	Running tests, reviewing diffs, checking output
Wrapping	Finalizing and cleaning up	Committing, formatting, writing docs, closing issues

Compass detects the current phase by analyzing 8 parameters over a rolling window:

Read/write ratio — high reads suggest Orientation/Exploring; high writes suggest Building
Test invocation rate — spikes during Verifying
File diversity — many distinct files suggest Exploring; few files suggest focused Building
Error rate — increasing errors during Building suggest a transition to Verifying is needed
Command repetition — high repetition in Building is normal (edit-compile-test); high repetition in Exploring is a loop
Turn count since last phase change — phases that persist too long may indicate drift
Commit/save signals — indicate Wrapping
User message frequency — high frequency suggests Orientation (back-and-forth); low frequency suggests autonomous Building

Hysteresis

Phase transitions use hysteresis to prevent oscillation. A phase must score above the entry threshold for 3 consecutive evaluations before Compass commits to the transition. Once in a phase, it must score below the exit threshold (lower than entry) for 3 evaluations before leaving.

This prevents the common case where a single test run during Building briefly scores as Verifying, causing a phase flip-flop that would confuse injection targeting.
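The hysteresis rule can be sketched as a small state machine. This is a minimal illustration, not Warden's implementation: the class name `PhaseHysteresis`, the threshold values 0.6 and 0.4, and the choice to require both the entry and exit streaks before committing are all assumptions. Only the count of 3 evaluations and the "exit lower than entry" relationship come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

ENTRY_THRESHOLD = 0.6  # assumed value; the text only says entry is higher than exit
EXIT_THRESHOLD = 0.4   # assumed value
REQUIRED_STREAK = 3    # from the text: 3 evaluations


@dataclass
class PhaseHysteresis:
    """Hypothetical helper that commits a phase change only after sustained evidence."""
    current: str = "Orientation"
    candidate: Optional[str] = None
    enter_streak: int = 0
    exit_streak: int = 0

    def update(self, scores: dict) -> str:
        """Feed one evaluation's per-phase scores; return the committed phase."""
        # Count consecutive evaluations where the current phase scores below exit.
        if scores.get(self.current, 0.0) < EXIT_THRESHOLD:
            self.exit_streak += 1
        else:
            self.exit_streak = 0

        # Count consecutive evaluations where the same challenger scores above entry.
        others = {p: s for p, s in scores.items() if p != self.current}
        best = max(others, key=others.get, default=None)
        if best is not None and others[best] > ENTRY_THRESHOLD:
            self.enter_streak = self.enter_streak + 1 if best == self.candidate else 1
            self.candidate = best
        else:
            self.candidate, self.enter_streak = None, 0

        # Commit only when both streaks have reached the required length.
        if self.enter_streak >= REQUIRED_STREAK and self.exit_streak >= REQUIRED_STREAK:
            self.current = self.candidate
            self.candidate, self.enter_streak, self.exit_streak = None, 0, 0
        return self.current
```

With these assumed thresholds, a single evaluation in which Verifying outscores Building leaves the committed phase unchanged; three such evaluations in a row flip it.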

Phase budgets

Each phase has a different injection budget and signal sensitivity:

Phase	Max injections per turn	Trust sensitivity
Orientation	2	Low — exploring is expected
Exploring	3	Medium — drift detection active
Building	1	High — interruptions are costly
Verifying	2	Medium — errors are expected
Wrapping	1	Low — finishing up

Focus: Coherence Tracking

Focus maintains a score from 0 to 100 representing how coherent the session’s current activity is. A focused session works on a small set of related files toward a clear goal. An unfocused session jumps between unrelated files and directories.

Focus is computed as a weighted combination of:

File-set stability (40%) — how much the set of recently-touched files overlaps with the set from 5 turns ago
Directory concentration (30%) — what fraction of file operations target a single directory tree
Goal alignment (20%) — whether recent tool calls are consistent with the detected phase
Topic coherence (10%) — whether file names and command arguments share lexical similarity

A Focus score below 40 triggers a FocusDrift signal. A score below 20 triggers a FocusCritical signal, which raises the injection budget so that a re-centering reminder can be delivered.

Focus naturally drops during phase transitions (for example, Orientation to Exploring, or Building to Verifying), and this is expected. The signal is therefore suppressed during the first 3 turns after a phase change.
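The weighting and thresholds above can be sketched as follows. This is an illustration under stated assumptions: each component is taken to be normalized to the range 0 to 1, file-set overlap is measured with Jaccard similarity (the text does not name a metric), suppression is assumed to apply to both Focus signals, and the turn counter is assumed to start at 0 on the turn of the phase change. All function names are hypothetical.

```python
# Weights from the text; component keys are hypothetical names.
WEIGHTS = {
    "file_set_stability": 0.40,
    "directory_concentration": 0.30,
    "goal_alignment": 0.20,
    "topic_coherence": 0.10,
}
SUPPRESS_TURNS = 3  # from the text: first 3 turns after a phase change


def jaccard(recent: set, earlier: set) -> float:
    """Assumed overlap metric for file-set stability (1.0 if both sets are empty)."""
    if not recent and not earlier:
        return 1.0
    return len(recent & earlier) / len(recent | earlier)


def focus_score(components: dict) -> float:
    """Weighted sum of components in [0, 1], scaled and clamped to 0-100."""
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, 100.0 * total))


def focus_signal(score: float, turns_since_phase_change: int):
    """Return the Focus signal name, or None if nothing should be emitted."""
    if turns_since_phase_change < SUPPRESS_TURNS:
        return None  # a drop right after a phase change is expected
    if score < 20:
        return "FocusCritical"
    if score < 40:
        return "FocusDrift"
    return None
```

For example, `focus_signal(15, 1)` returns `None` because the session changed phase one turn ago, while `focus_signal(15, 10)` returns `"FocusCritical"`.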

Ledger: Turn Tracking

Ledger is the simplest Anchor module. It counts turns (tool invocations) and tracks gaps between verification events (test runs, build checks, lint passes).

Key metrics:

Total turns — lifetime count for the session
Turns since last verify — resets when a test/build/lint command is detected
Turns since last user message — measures autonomous run length
Phase duration — turns spent in the current phase

Ledger feeds into Debt and Trust calculations but does not emit signals directly. It is a bookkeeper, not a decision-maker.
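Because Ledger is pure bookkeeping, it can be pictured as a handful of counters. The sketch below is hypothetical: the class layout, the method names, and the `kind` labels used to classify a tool call are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed labels for tool calls that count as verification events.
VERIFY_KINDS = {"test", "build", "lint"}


@dataclass
class Ledger:
    """Hypothetical counter set mirroring the four metrics listed above."""
    total_turns: int = 0
    turns_since_verify: int = 0
    turns_since_user_message: int = 0
    phase_duration: int = 0

    def record_turn(self, kind: str) -> None:
        """Record one tool invocation of the given kind."""
        self.total_turns += 1
        self.phase_duration += 1
        self.turns_since_user_message += 1
        if kind in VERIFY_KINDS:
            self.turns_since_verify = 0  # verification resets the gap
        else:
            self.turns_since_verify += 1

    def record_user_message(self) -> None:
        self.turns_since_user_message = 0

    def record_phase_change(self) -> None:
        self.phase_duration = 0
```

Note that nothing here emits a signal; the counters are only read by other modules.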

Debt: Verification Tracking

Debt tracks how much unverified work has accumulated. Every file write increments debt; every successful test run or build decrements it. The formula is:

debt = unverified_writes - (successful_tests * 2) - (successful_builds * 1)

Debt is clamped to [0, 100]. When debt exceeds 30, Anchor emits a VerificationNeeded signal. When it exceeds 60, the signal escalates to VerificationUrgent.

The multipliers reflect that a single test run typically validates several file changes, while a build check counts for less because compilation success does not imply behavioral correctness.

Debt resets to 0 when the phase transitions to Wrapping, on the assumption that the developer has accepted the current state.
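As a worked sketch of the formula and thresholds above (function names are hypothetical; the arithmetic follows the text):

```python
def verification_debt(unverified_writes: int, successful_tests: int, successful_builds: int) -> int:
    """Apply the debt formula and clamp the result to [0, 100]."""
    raw = unverified_writes - (successful_tests * 2) - (successful_builds * 1)
    return max(0, min(100, raw))


def debt_signal(debt: int):
    """Map a debt value to the signal described in the text, or None."""
    if debt > 60:
        return "VerificationUrgent"
    if debt > 30:
        return "VerificationNeeded"
    return None


# 40 writes, 3 passing test runs, 2 clean builds: 40 - 6 - 2 = 32
debt = verification_debt(40, 3, 2)
signal = debt_signal(debt)
```

Here `debt` is 32, which is just over the first threshold, so `signal` is `"VerificationNeeded"`. The clamp also means a session with more verification than writes sits at 0 rather than going negative.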

Trust: Session Confidence Score

Trust is the most consequential Anchor metric because it directly controls the injection budget — how many context injections Warden delivers per tool call.

The Formula

trust = 100
      - (errors * 5)
      - (debt * 3)
      - (phase_switches * 2)
      - (dead_ends * 4)
      - (denials * 3)
      + bonuses

Where:

errors — count of tool calls that produced stderr output in the last 20 turns
debt — current verification debt (0-100, scaled to 0-10 for this formula)
phase_switches — number of phase transitions in the last 30 turns (frequent switching suggests confusion)
dead_ends — count of sequences where the assistant tried an approach, hit an error, and reverted (detected by Loopbreaker)
denials — count of Reflex denials in the last 20 turns
bonuses — positive signals: successful test runs (+3 each), clean builds (+2), phase progression in natural order (+5)

Trust is clamped to [0, 100].
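The formula can be traced with a short sketch. Function and parameter names are hypothetical, and two details are assumptions: debt is scaled by dividing by 10, and the clean-build and phase-progression bonuses are applied per occurrence.

```python
def bonus_points(successful_tests: int, clean_builds: int, natural_progressions: int) -> int:
    """Sum the positive signals: +3 per test run, +2 per clean build, +5 per progression."""
    return successful_tests * 3 + clean_builds * 2 + natural_progressions * 5


def trust_score(errors: int, debt: float, phase_switches: int,
                dead_ends: int, denials: int, bonuses: int = 0) -> float:
    """Apply the trust formula; `debt` is the raw 0-100 value."""
    debt_scaled = debt / 10  # assumed scaling from 0-100 down to 0-10
    raw = (100
           - errors * 5
           - debt_scaled * 3
           - phase_switches * 2
           - dead_ends * 4
           - denials * 3
           + bonuses)
    return max(0.0, min(100.0, raw))


# 2 errors, debt 40, 1 phase switch, 1 dead end, no denials,
# one passing test run and one clean build:
# 100 - 10 - 12 - 2 - 4 - 0 + 5 = 77
score = trust_score(errors=2, debt=40, phase_switches=1, dead_ends=1,
                    denials=0, bonuses=bonus_points(1, 1, 0))
```

In this example `score` is 77.0, which falls in the 50-84 range.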

Trust Gates

Trust directly controls the injection budget through a tiered gate system:

Trust Range	Max Injections	Interpretation
85-100	1	High confidence — minimal guidance needed
50-84	3	Moderate confidence — occasional nudges
25-49	5	Low confidence — active guidance
0-24	15	Very low confidence — heavy guardrails

The counterintuitive inversion — more injections at lower trust — reflects the design philosophy that struggling sessions need more help, not less. A high-trust session is humming along and extra injections would only waste context window space.
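The gate table amounts to a tier lookup. The sketch below is an illustration with hypothetical names; it compares against the lower bound of each tier so that fractional trust values such as 84.5 still land in a tier.

```python
# (lower bound of tier, max injections), highest tier first; values from the table.
TRUST_GATES = [
    (85, 1),   # high confidence
    (50, 3),   # moderate confidence
    (25, 5),   # low confidence
    (0, 15),   # very low confidence
]


def max_injections(trust: float) -> int:
    """Return the injection budget for a trust score in [0, 100]."""
    for lower_bound, budget in TRUST_GATES:
        if trust >= lower_bound:
            return budget
    raise ValueError("trust must be >= 0")
```

For example, `max_injections(90)` returns 1 and `max_injections(10)` returns 15, which shows the inversion directly: the budget grows as trust falls.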

Gate transitions

Gate transitions use the same hysteresis as Compass phase transitions. Trust must remain in a new tier for 3 consecutive evaluations before the injection budget changes. This prevents a single error from flooding the context with injections.

Signal Categories

Anchor emits 7 signals across 4 categories. Each signal has a utility function that determines whether it is worth injecting:

Category	Signal	Utility threshold	Effect
Phase	`PhaseShift`	Always emitted	Updates injection targeting
Phase	`PhaseStall`	> 30 turns in phase	Suggests phase transition
Focus	`FocusDrift`	Focus < 40	Re-centering reminder
Focus	`FocusCritical`	Focus < 20	Strong re-centering + file list
Debt	`VerificationNeeded`	Debt > 30	Test/build reminder
Debt	`VerificationUrgent`	Debt > 60	Escalated test reminder
Trust	`TrustDrop`	Trust crosses gate boundary	Budget adjustment + explanation

Signals that fall below their utility threshold are logged but not injected. This prevents low-value noise from consuming the injection budget.
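One way to picture this triage is a rule table evaluated against a snapshot of session state. Everything below is a hypothetical sketch: the `SignalRule` type, the state keys (`phase_changed`, `phase_duration`, `focus`, `debt`, `trust_gate_crossed`), and the reading of "always emitted" as "passes whenever a phase change occurred" are assumptions. Only the signal names and thresholds come from the table.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class SignalRule(NamedTuple):
    category: str
    name: str
    passes: Callable[[Dict], bool]  # True when the utility threshold is met


RULES: List[SignalRule] = [
    SignalRule("Phase", "PhaseShift", lambda s: s["phase_changed"]),
    SignalRule("Phase", "PhaseStall", lambda s: s["phase_duration"] > 30),
    SignalRule("Focus", "FocusDrift", lambda s: s["focus"] < 40),
    SignalRule("Focus", "FocusCritical", lambda s: s["focus"] < 20),
    SignalRule("Debt", "VerificationNeeded", lambda s: s["debt"] > 30),
    SignalRule("Debt", "VerificationUrgent", lambda s: s["debt"] > 60),
    SignalRule("Trust", "TrustDrop", lambda s: s["trust_gate_crossed"]),
]


def triage(state: Dict) -> Tuple[List[str], List[str]]:
    """Split signal names into (worth injecting, logged only)."""
    inject: List[str] = []
    log_only: List[str] = []
    for rule in RULES:
        (inject if rule.passes(state) else log_only).append(rule.name)
    return inject, log_only
```

Under this sketch, a session 31 turns into a phase with a Focus score of 35 and low debt would inject `PhaseStall` and `FocusDrift` and log the other five. Whether `FocusDrift` should be dropped when `FocusCritical` also passes is not specified in the text, so this sketch reports both.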

Interaction with Other Engines

Anchor does not make safety decisions (that is Reflex’s job) and does not learn across sessions (that is Dream’s job). Its role is strictly intra-session state management.

However, Anchor’s outputs feed the other engines:

Reflex reads the current trust score to adjust Loopbreaker thresholds (low-trust sessions have tighter loop detection)
Dream reads the full session state at session end to extract patterns worth remembering
Harbor reads signals and trust gates to determine the injection budget and format context blocks

This one-way data flow keeps the engine boundaries clean while allowing cross-engine coordination.