Anchor Engine
Anchor is the stateful core of Warden. While Reflex makes binary safety decisions on individual tool calls, Anchor tracks the session as a whole — where it started, where it is now, whether it is drifting, and how much the assistant can be trusted.
Anchor contains five modules: Compass, Focus, Ledger, Debt, and Trust.
Compass: Phase Detection
Every coding session follows a natural arc. Compass models this arc as five phases:
| Phase | Description | Typical behavior |
|---|---|---|
| Orientation | Understanding the codebase and task | Reading files, searching symbols, asking questions |
| Exploring | Investigating approaches and gathering context | Running tests, reading documentation, trying small experiments |
| Building | Active implementation | Writing code, creating files, running builds |
| Verifying | Testing and validating the implementation | Running tests, reviewing diffs, checking output |
| Wrapping | Finalizing and cleaning up | Committing, formatting, writing docs, closing issues |
Compass detects the current phase by analyzing 8 parameters over a rolling window:
- Read/write ratio — high reads suggest Orientation/Exploring; high writes suggest Building
- Test invocation rate — spikes during Verifying
- File diversity — many distinct files suggest Exploring; few files suggest focused Building
- Error rate — increasing errors during Building suggest a transition to Verifying is needed
- Command repetition — high repetition in Building is normal (edit-compile-test); high repetition in Exploring is a loop
- Turn count since last phase change — phases that persist too long may indicate drift
- Commit/save signals — indicate Wrapping
- User message frequency — high frequency suggests Orientation (back-and-forth); low frequency suggests autonomous Building
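As a rough illustration of how a few of these parameters could be derived from a rolling window, here is a minimal sketch. The `ToolEvent` type, its `kind` labels, and the `window_features` helper are hypothetical names introduced for this example, not part of Warden's API, and only four of the eight parameters are shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class ToolEvent:
    """One tool call in the rolling window (hypothetical shape)."""
    kind: str                   # assumed labels: "read", "write", "test", "other"
    path: Optional[str] = None  # file touched, if any
    errored: bool = False       # whether the call produced an error


def window_features(events: Sequence[ToolEvent]) -> Dict[str, float]:
    """Compute four of the eight Compass parameters over a window of events."""
    n = len(events)
    if n == 0:
        return {"read_write_ratio": 0.0, "test_rate": 0.0,
                "file_diversity": 0.0, "error_rate": 0.0}
    reads = sum(1 for e in events if e.kind == "read")
    writes = sum(1 for e in events if e.kind == "write")
    tests = sum(1 for e in events if e.kind == "test")
    errors = sum(1 for e in events if e.errored)
    files = {e.path for e in events if e.path is not None}
    return {
        # Guard against division by zero when the window has no writes.
        "read_write_ratio": reads / max(writes, 1),
        "test_rate": tests / n,
        "file_diversity": float(len(files)),
        "error_rate": errors / n,
    }
```

How these features are weighted into per-phase scores is not specified here, so the sketch stops at feature extraction.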
Hysteresis
Phase transitions use hysteresis to prevent oscillation. A candidate phase must score above the entry threshold for 3 consecutive evaluations before Compass commits to the transition. Once in a phase, the session stays there until that phase scores below the exit threshold (which is lower than the entry threshold) for 3 evaluations.
This prevents the common case where a single test run during Building briefly scores as Verifying, causing a phase flip-flop that would confuse injection targeting.
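The hysteresis rule can be sketched as a small state machine. This is an illustration under assumptions, not Warden's implementation: the threshold values (0.7 and 0.5), the 0-to-1 score scale, and the `PhaseHysteresis` class are hypothetical, and the sketch assumes a transition requires both conditions at once (the candidate above entry and the current phase below exit, each for 3 consecutive evaluations).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PhaseHysteresis:
    current: str
    entry_threshold: float = 0.7  # assumed value
    exit_threshold: float = 0.5   # assumed value, lower than entry
    required: int = 3             # consecutive evaluations needed
    _candidate: Optional[str] = field(default=None, init=False)
    _enter_streak: int = field(default=0, init=False)
    _exit_streak: int = field(default=0, init=False)

    def update(self, scores: Dict[str, float]) -> str:
        """Feed one evaluation of per-phase scores; return the committed phase."""
        others = {p: s for p, s in scores.items() if p != self.current}
        best = max(others, key=others.get) if others else None

        # Track how long the strongest other phase has stayed above entry.
        if best is not None and others[best] > self.entry_threshold:
            if best == self._candidate:
                self._enter_streak += 1
            else:
                self._candidate, self._enter_streak = best, 1
        else:
            self._candidate, self._enter_streak = None, 0

        # Track how long the current phase has stayed below exit.
        if scores.get(self.current, 0.0) < self.exit_threshold:
            self._exit_streak += 1
        else:
            self._exit_streak = 0

        if (self._candidate is not None
                and self._enter_streak >= self.required
                and self._exit_streak >= self.required):
            self.current = self._candidate
            self._candidate, self._enter_streak, self._exit_streak = None, 0, 0
        return self.current
```

With these assumed thresholds, a single evaluation that favors Verifying leaves the session in Building; only three such evaluations in a row commit the switch.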
Phase budgets
Each phase has a different injection budget and signal sensitivity:
| Phase | Max injections per turn | Trust sensitivity |
|---|---|---|
| Orientation | 2 | Low — exploring is expected |
| Exploring | 3 | Medium — drift detection active |
| Building | 1 | High — interruptions are costly |
| Verifying | 2 | Medium — errors are expected |
| Wrapping | 1 | Low — finishing up |
Focus: Coherence Tracking
Focus maintains a score from 0 to 100 representing how coherent the session’s current activity is. A focused session works on a small set of related files toward a clear goal. An unfocused session jumps between unrelated files and directories.
Focus is computed as a weighted combination of:
- File-set stability (40%) — how much the set of recently-touched files overlaps with the set from 5 turns ago
- Directory concentration (30%) — what fraction of file operations target a single directory tree
- Goal alignment (20%) — whether recent tool calls are consistent with the detected phase
- Topic coherence (10%) — whether file names and command arguments share lexical similarity
A Focus score below 40 triggers a FocusDrift signal. A score below 20 triggers a FocusCritical signal, which increases the injection budget so that a re-centering reminder can be delivered.
Focus naturally drops during phase transitions (for example, Orientation to Exploring, or Building to Verifying). Because this drop is expected, the signal is suppressed during the first 3 turns after a phase change.
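The weighting and thresholds above can be sketched as follows. Several details are assumptions made for illustration: each component is taken to be normalized to the range 0 to 1, file-set overlap is measured with Jaccard similarity, suppression applies to both Focus signals, and `turns_since_phase_change` counts from 0 on the first turn after a change. The function names are hypothetical.

```python
from typing import Dict, Optional, Set

# Weights as listed above.
WEIGHTS = {
    "file_set_stability": 0.40,
    "directory_concentration": 0.30,
    "goal_alignment": 0.20,
    "topic_coherence": 0.10,
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two file sets; one possible measure of file-set stability."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def focus_score(components: Dict[str, float]) -> float:
    """Combine components (each assumed in [0, 1]) into a 0-100 score."""
    return 100.0 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def focus_signal(score: float, turns_since_phase_change: int) -> Optional[str]:
    """Return the Focus signal to emit, if any."""
    if turns_since_phase_change < 3:  # suppression window after a phase change
        return None
    if score < 20:
        return "FocusCritical"
    if score < 40:
        return "FocusDrift"
    return None
```

For example, components of 0.5, 0.2, 0.5, and 0.0 (in the order listed) combine to a score of about 36, which falls in the FocusDrift range once the suppression window has passed.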
Ledger: Turn Tracking
Ledger is the simplest Anchor module. It counts turns (tool invocations) and tracks gaps between verification events (test runs, build checks, lint passes).
Key metrics:
- Total turns — lifetime count for the session
- Turns since last verify — resets when a test/build/lint command is detected
- Turns since last user message — measures autonomous run length
- Phase duration — turns spent in the current phase
Ledger feeds into Debt and Trust calculations but does not emit signals directly. It is a bookkeeper, not a decision-maker.
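A bookkeeper like this can be pictured as a handful of counters. The sketch below is a hypothetical shape for illustration (the `Ledger` class and its method names are assumptions), following the reset rules listed above.

```python
from dataclasses import dataclass


@dataclass
class Ledger:
    total_turns: int = 0
    turns_since_verify: int = 0
    turns_since_user_message: int = 0
    phase_duration: int = 0

    def record_turn(self, is_verification: bool = False) -> None:
        """Count one tool invocation; a test/build/lint call resets the verify gap."""
        self.total_turns += 1
        self.phase_duration += 1
        self.turns_since_user_message += 1
        if is_verification:
            self.turns_since_verify = 0
        else:
            self.turns_since_verify += 1

    def record_user_message(self) -> None:
        """A user message ends the current autonomous run."""
        self.turns_since_user_message = 0

    def record_phase_change(self) -> None:
        """A phase transition restarts the phase duration counter."""
        self.phase_duration = 0
```

Note that the class only records; it exposes no signal logic, which matches Ledger's role as an input to Debt and Trust.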
Debt: Verification Tracking
Debt tracks how much unverified work has accumulated. Every file write increments debt; every successful test run or build reduces it. The formula:
debt = unverified_writes - (successful_tests * 2) - (successful_builds * 1)
Debt is clamped to [0, 100]. When debt exceeds 30, Anchor emits a VerificationNeeded signal. When it exceeds 60, the signal escalates to VerificationUrgent.
The multipliers reflect that a single test run typically validates multiple file changes, while a build check validates fewer (compilation success does not mean behavioral correctness).
Debt resets to 0 when the phase transitions to Wrapping, on the assumption that the developer has accepted the current state.
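A minimal sketch of the debt formula, clamping, and signal thresholds described above. The function names are hypothetical, and the reset on entering Wrapping is left out for brevity.

```python
from typing import Optional


def compute_debt(unverified_writes: int, successful_tests: int,
                 successful_builds: int) -> int:
    """Apply the debt formula and clamp the result to [0, 100]."""
    raw = unverified_writes - (successful_tests * 2) - (successful_builds * 1)
    return max(0, min(100, raw))


def debt_signal(debt: int) -> Optional[str]:
    """Map a debt value to the signal it triggers, if any."""
    if debt > 60:
        return "VerificationUrgent"
    if debt > 30:
        return "VerificationNeeded"
    return None
```

As a worked case, 40 unverified writes with 3 successful test runs and 2 successful builds gives 40 - 6 - 2 = 32, which is just over the VerificationNeeded threshold.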
Trust: Session Confidence Score
Trust is the most consequential Anchor metric because it directly controls the injection budget — how many context injections Warden delivers per tool call.
The Formula
trust = 100 - (errors * 5) - (debt * 3) - (phase_switches * 2) - (dead_ends * 4) - (denials * 3) + bonuses
Where:
- errors — count of tool calls that produced stderr output in the last 20 turns
- debt — current verification debt (0-100, scaled to 0-10 for this formula)
- phase_switches — number of phase transitions in the last 30 turns (frequent switching suggests confusion)
- dead_ends — count of sequences where the assistant tried an approach, hit an error, and reverted (detected by Loopbreaker)
- denials — count of Reflex denials in the last 20 turns
- bonuses — positive signals: successful test runs (+3 each), clean builds (+2), phase progression in natural order (+5)
Trust is clamped to [0, 100].
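The formula can be written out as a short function. This is a sketch under assumptions: the function names are hypothetical, the debt scaling is taken to be a simple division by 10, and each bonus is assumed to apply once per occurrence.

```python
def compute_bonuses(successful_tests: int, clean_builds: int,
                    natural_progressions: int) -> int:
    """Sum positive signals; per-occurrence application is an assumption."""
    return successful_tests * 3 + clean_builds * 2 + natural_progressions * 5


def compute_trust(errors: int, debt: float, phase_switches: int,
                  dead_ends: int, denials: int, bonuses: float = 0) -> float:
    """Apply the trust formula and clamp the result to [0, 100]."""
    scaled_debt = debt / 10  # debt is 0-100; the formula uses a 0-10 scale
    raw = (100
           - errors * 5
           - scaled_debt * 3
           - phase_switches * 2
           - dead_ends * 4
           - denials * 3
           + bonuses)
    return max(0.0, min(100.0, raw))
```

For instance, 2 errors, a debt of 40, 1 phase switch, 1 dead end, no denials, and one successful test plus one clean build (bonus 5) give 100 - 10 - 12 - 2 - 4 - 0 + 5 = 77.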
Trust Gates
Trust directly controls the injection budget through a tiered gate system:
| Trust Range | Max Injections | Interpretation |
|---|---|---|
| 85-100 | 1 | High confidence — minimal guidance needed |
| 50-84 | 3 | Moderate confidence — occasional nudges |
| 25-49 | 5 | Low confidence — active guidance |
| 0-24 | 15 | Very low confidence — heavy guardrails |
The counterintuitive inversion — more injections at lower trust — reflects the design philosophy that struggling sessions need more help, not less. A high-trust session is humming along and extra injections would only waste context window space.
Gate transitions
Gate transitions use the same hysteresis as Compass phase transitions. Trust must remain in a new tier for 3 consecutive evaluations before the injection budget changes. This prevents a single error from flooding the context with injections.
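A sketch of the tiered gate with the 3-evaluation rule. The `TrustGate` class and `gate_for` function are hypothetical names, and treating fractional trust values (such as 84.5) as belonging to the lower tier is an assumption, since the table lists integer ranges only.

```python
from typing import Optional


def gate_for(trust: float) -> int:
    """Map a trust score to the max injections of its tier."""
    if trust >= 85:
        return 1
    if trust >= 50:
        return 3
    if trust >= 25:
        return 5
    return 15


class TrustGate:
    def __init__(self, initial_trust: float = 100, required: int = 3) -> None:
        self.budget = gate_for(initial_trust)
        self.required = required
        self._pending: Optional[int] = None  # tier waiting to be confirmed
        self._streak = 0                     # consecutive evaluations in it

    def update(self, trust: float) -> int:
        """Feed one trust evaluation; return the injection budget in force."""
        target = gate_for(trust)
        if target == self.budget:
            self._pending, self._streak = None, 0
        elif target == self._pending:
            self._streak += 1
        else:
            self._pending, self._streak = target, 1
        if self._streak >= self.required:
            self.budget = target
            self._pending, self._streak = None, 0
        return self.budget
```

In this sketch a single dip from 90 to 40 leaves the budget at 1; the budget rises to 5 only after trust has stayed in the 25-49 tier for three evaluations in a row.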
Signal Categories
Anchor emits 7 signals across 4 categories, each with a utility threshold that determines whether it is worth injecting:
| Category | Signal | Utility threshold | Effect |
|---|---|---|---|
| Phase | PhaseShift | Always emitted | Updates injection targeting |
| Phase | PhaseStall | > 30 turns in phase | Suggests phase transition |
| Focus | FocusDrift | Focus < 40 | Re-centering reminder |
| Focus | FocusCritical | Focus < 20 | Strong re-centering + file list |
| Debt | VerificationNeeded | Debt > 30 | Test/build reminder |
| Debt | VerificationUrgent | Debt > 60 | Escalated test reminder |
| Trust | TrustDrop | Trust crosses gate boundary | Budget adjustment + explanation |
Signals that fall below their utility threshold are logged but not injected. This prevents low-value noise from consuming the injection budget.
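The table above can be read as a set of threshold checks over the current session state. The sketch below is one hypothetical way to express it; the `evaluate_signals` function and its parameters are assumptions, as is the choice to emit only the stronger signal when both thresholds in a category are met (for example, FocusCritical instead of FocusDrift).

```python
from typing import List


def evaluate_signals(phase_changed: bool, turns_in_phase: int, focus: float,
                     debt: float, crossed_gate: bool) -> List[str]:
    """Return the signals whose utility threshold is met, in table order."""
    signals: List[str] = []
    if phase_changed:
        signals.append("PhaseShift")           # always emitted on a transition
    if turns_in_phase > 30:
        signals.append("PhaseStall")
    if focus < 20:
        signals.append("FocusCritical")        # assumed to supersede FocusDrift
    elif focus < 40:
        signals.append("FocusDrift")
    if debt > 60:
        signals.append("VerificationUrgent")   # escalation of VerificationNeeded
    elif debt > 30:
        signals.append("VerificationNeeded")
    if crossed_gate:
        signals.append("TrustDrop")
    return signals
```

Signals not returned here would be the ones that are logged but not injected; how the returned list is trimmed to the injection budget is outside this sketch.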
Interaction with Other Engines
Anchor does not make safety decisions (that is Reflex’s job) and does not learn across sessions (that is Dream’s job). Its role is strictly intra-session state management.
However, Anchor’s outputs feed the other engines:
- Reflex reads the current trust score to adjust Loopbreaker thresholds (low-trust sessions have tighter loop detection)
- Dream reads the full session state at session end to extract patterns worth remembering
- Harbor reads signals and trust gates to determine the injection budget and format context blocks
This one-way data flow keeps the engine boundaries clean while allowing cross-engine coordination.