This page is deliberately scoped: it gives enough detail for a technical reader to evaluate whether Kairos is real engineering rather than a marketing surface, without becoming a recipe for re-implementation. The specific tuning numbers (decay rates, scoring weights, pressure-trigger signals, dream-stage internals) are deliberately omitted: they are tuned continuously and are not part of the public contract.
How it works at a glance
Kairos is a wrapper around your Claude Code installation. It keeps Claude Code working against your repository in a loop: reading recent activity, deciding what to pick up next, shipping the change, sleeping briefly, then doing it again. The wrapper is what makes this work without human attention: it manages state between turns, paces the request budget, and supplies persistent memory that Claude Code can read and write through MCP. When `--pace` is set, sleep length adapts to your Anthropic rate-limit headroom.
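
The loop and the adaptive sleep described above can be sketched as follows. This is a minimal illustration, not the Kairos implementation: `sleep_seconds`, `run_loop`, the linear mapping from headroom to sleep length, and the 30 s / 900 s bounds are all assumptions made for the example.

```python
import time
from typing import Callable


def sleep_seconds(headroom: float, base: float = 30.0, ceiling: float = 900.0) -> float:
    """Map rate-limit headroom (0.0 = exhausted, 1.0 = untouched) to a sleep length.

    Hypothetical policy: interpolate linearly between a short base sleep
    (plenty of headroom) and a long ceiling (no headroom left).
    """
    headroom = min(max(headroom, 0.0), 1.0)  # clamp to [0, 1]
    return base + (1.0 - headroom) * (ceiling - base)


def run_loop(
    turns: int,
    pick_task: Callable[[], str],
    ship: Callable[[str], None],
    headroom: Callable[[], float],
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run a fixed number of turns: pick work, ship it, then pace the next turn."""
    for _ in range(turns):
        task = pick_task()          # stands in for "read activity, decide what to pick up"
        ship(task)                  # stands in for "ship the change"
        sleep(sleep_seconds(headroom()))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a real daemon would also persist state between turns, which this sketch leaves out.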
The memory system
Memory has two tiers with asymmetric decay:
- STM (short-term): fast decay. Captures in-session observations, retries, anomalies. Entries that aren’t reinforced collapse below the archive threshold quickly.
- LTM (long-term): slow decay. Captures consolidated, durable facts: architecture decisions, conventions, distilled failure signatures. Survives long disuse.
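
To make the asymmetry concrete, here is a minimal sketch of two tiers that decay at different rates. The half-lives, the archive threshold, and the exponential form are illustrative assumptions; the real values and decay function are not published.

```python
from dataclasses import dataclass

# Hypothetical tuning values, chosen only to show the asymmetry.
HALF_LIFE_HOURS = {"stm": 6.0, "ltm": 24.0 * 90}
ARCHIVE_THRESHOLD = 0.1


@dataclass
class Entry:
    tier: str                       # "stm" or "ltm"
    weight: float                   # weight at the time of last reinforcement
    hours_since_reinforced: float


def current_weight(entry: Entry) -> float:
    """Exponential decay: the weight halves once per tier-specific half-life."""
    half_life = HALF_LIFE_HOURS[entry.tier]
    return entry.weight * 0.5 ** (entry.hours_since_reinforced / half_life)


def is_archived(entry: Entry) -> bool:
    """An entry is archived once its decayed weight falls below the threshold."""
    return current_weight(entry) < ARCHIVE_THRESHOLD
```

With these assumed numbers, an unreinforced STM entry drops below the threshold within a day, while an LTM entry of the same starting weight is barely affected over the same period.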
Hybrid recall
Recall combines four signals via late fusion:

| Channel | What it measures |
|---|---|
| Lexical (BM25-style) | Surface-form overlap between the cue and the entry’s content |
| Dense (cosine over local embeddings) | Semantic similarity |
| Entry weight | The entry’s current lifecycle weight (decays over time, rises on citation) |
| Recency | Exponential decay since the entry was last reinforced |
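
Late fusion means each channel scores candidates independently and the scores are combined only at the end. A minimal sketch, assuming min-max normalisation per channel and a weighted sum; the channel weights below are hypothetical, and the actual fusion rule may differ.

```python
# Hypothetical channel weights; the real scoring weights are not published.
WEIGHTS = {"lexical": 0.3, "dense": 0.4, "weight": 0.2, "recency": 0.1}


def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one channel's scores to [0, 1] so channels are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {entry_id: 0.0 for entry_id in scores}
    return {entry_id: (s - lo) / (hi - lo) for entry_id, s in scores.items()}


def fuse(channels: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Combine per-channel scores into one ranking (best first).

    `channels` maps channel name -> {entry_id: raw score}. An entry missing
    from a channel contributes 0 for that channel.
    """
    normalised = {name: min_max(scores) for name, scores in channels.items()}
    entry_ids = set()
    for scores in channels.values():
        entry_ids.update(scores)
    fused = {
        entry_id: sum(
            WEIGHTS[name] * normalised[name].get(entry_id, 0.0) for name in channels
        )
        for entry_id in entry_ids
    }
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Normalising before summing matters because BM25-style scores and cosine similarities live on different scales; without it one channel would dominate regardless of the weights.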
Local embeddings
Embeddings are produced by Qwen3-Embedding served from a local Ollama container. The model tier (0.6B / 4B / 8B) is selected from detected hardware. Vector storage and the lexical index share a single SQLite file via `sqlite-vec` and FTS5.
No memory operation requires network access beyond the host.
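
The "one SQLite file, two indexes" idea can be sketched with Python's standard `sqlite3` module. This is an illustration under stated assumptions: the table names and schema are invented for the example, it requires an SQLite build with FTS5 enabled, and the dense side stores vectors as JSON and scores them in Python as a stand-in for `sqlite-vec`, which is not loaded here.

```python
import json
import math
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open one SQLite database holding both the entries and the lexical index."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS entries "
        "(id INTEGER PRIMARY KEY, content TEXT, embedding TEXT)"
    )
    db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS entries_fts USING fts5(content)")
    return db


def add(db: sqlite3.Connection, content: str, embedding: list[float]) -> int:
    """Insert an entry and index its text under the same rowid."""
    cur = db.execute(
        "INSERT INTO entries (content, embedding) VALUES (?, ?)",
        (content, json.dumps(embedding)),
    )
    db.execute(
        "INSERT INTO entries_fts (rowid, content) VALUES (?, ?)",
        (cur.lastrowid, content),
    )
    return cur.lastrowid


def lexical(db: sqlite3.Connection, query: str) -> list[int]:
    """Rank by FTS5's bm25(); lower (more negative) values are better matches."""
    rows = db.execute(
        "SELECT rowid FROM entries_fts WHERE entries_fts MATCH ? "
        "ORDER BY bm25(entries_fts)",
        (query,),
    )
    return [row[0] for row in rows]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense(db: sqlite3.Connection, query_vec: list[float]) -> list[int]:
    """Brute-force cosine ranking; a stand-in for a sqlite-vec query."""
    rows = db.execute("SELECT id, embedding FROM entries").fetchall()
    scored = [(cosine(query_vec, json.loads(emb)), entry_id) for entry_id, emb in rows]
    return [entry_id for _, entry_id in sorted(scored, reverse=True)]
```

Keeping both indexes in one file means a single backup or copy captures the whole memory state, and nothing in the recall path needs to leave the host.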
The dream pass
The dream pass is a periodic offline consolidation. It is triggered by a pressure score that combines several operational signals (work elapsed since last dream, unresolved retries, observation novelty, log volume, plus others). When pressure crosses a threshold, the next turn is a dream. A safety floor guarantees a dream at least every N turns; an anti-thrash floor prevents back-to-back dreams.
Pipeline shape
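
The trigger rules for the dream pass (pressure threshold, safety floor, anti-thrash floor) can be sketched as a small decision function. The weighted-sum form of the pressure score, the signal names, and every number in `DreamPolicy` are assumptions for illustration only; the real signals and thresholds are not published.

```python
from dataclasses import dataclass


@dataclass
class DreamPolicy:
    threshold: float = 1.0         # hypothetical pressure threshold
    max_turns_between: int = 50    # safety floor: the "N" in "every N turns"
    min_turns_between: int = 3     # anti-thrash floor: minimum gap between dreams


def pressure(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Combine operational signals into one score (assumed: a weighted sum)."""
    return sum(weights.get(name, 0.0) * value for name, value in signals.items())


def should_dream(pressure_score: float, turns_since_dream: int, policy: DreamPolicy) -> bool:
    """Decide whether the next turn is a dream."""
    if turns_since_dream < policy.min_turns_between:
        return False               # anti-thrash: too soon, whatever the pressure
    if turns_since_dream >= policy.max_turns_between:
        return True                # safety floor: overdue, whatever the pressure
    return pressure_score >= policy.threshold
```

The anti-thrash check is evaluated first so that a burst of high pressure cannot force consecutive dreams.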
The agent surface
The reasoning agent does not have privileged direct database access. Every memory operation is exposed as a tool over the Model Context Protocol (MCP):
- `memory_recall`, `memory_recall_by_id`, `memory_recall_with_edges`
- `memory_write_stm`, `memory_write_ltm`, `memory_reinforce`
- `memory_promote`, `memory_demote`, `memory_decay_sweep`
- `memory_edge_propose`, `memory_edge_approve`, `memory_edge_reject`
- `wake_edge_triage`, `wake_ground_epic_topic`
- `dream_bucket_pending`, `dream_bucket_verdict`, `dream_bucket_write`
- Plus a handful of audit / introspection tools
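
MCP tool invocations travel as JSON-RPC 2.0 messages using the `tools/call` method, so a call to one of the tools above would look roughly like the message built below. The tool name comes from the list; the argument names (`cue`, `limit`) are hypothetical, since the tool schemas are not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids per connection


def tool_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool by name."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Hypothetical arguments: the real memory_recall schema may differ.
request = tool_call("memory_recall", {"cue": "deploy timeout", "limit": 5})
```

Routing every operation through tools like this means each read and write is an explicit, loggable message rather than an opaque database query.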
Eval
`backant eval` executes a fixed simulated-scenario replay against the current memory state, alongside the production metrics. The replay is intentionally adversarial (small, fixed, deterministic) and designed to surface regressions in the memory layer that live operation wouldn’t notice.
The replay corpus is curated and frozen per release. New scenarios are added when a real production issue would have been caught by one.
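
As a rough picture of what a frozen, deterministic replay might look like, the sketch below runs a fixed list of scenarios against a recall function and reports the ones that fail. The `Scenario` fields, the pass criterion (expected entry appears in the top-k results), and the `replay` helper are all assumptions; they do not describe the actual behaviour of `backant eval`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)  # frozen: scenarios are immutable once defined
class Scenario:
    name: str
    cue: str
    expected_id: str
    top_k: int = 3


def replay(
    scenarios: list[Scenario],
    recall: Callable[[str, int], list[str]],
) -> list[str]:
    """Return the names of scenarios whose expected entry is missing from recall."""
    return [
        s.name
        for s in scenarios
        if s.expected_id not in recall(s.cue, s.top_k)
    ]
```

Because the corpus and the pass criterion are fixed, two runs against the same memory state give the same result, so any new failure points at a change in the memory layer.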
Context hygiene
Three complementary mechanisms keep the in-process state of the daemon fresh on long-lived deployments:
- `--fresh` flag: manual escape hatch. Hard-resets `.session/` and `.state/` so the next turn re-reads memory from disk.
- Freshness manager: a small meta-agent that periodically inspects recent signals (repeated failures, lesson churn, decay patterns) and decides whether the next turn should start fresh.
- Reactive overflow detector: catches the Claude context-window overflow signature in the stream and writes the fresh-flag automatically.
.backant.toml. Long-lived daemons accumulate stale judgment in their in-process context, and the cheapest correction is “start the next turn as if you’d just booted”.
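
The reactive overflow detector can be pictured as a scan over the output stream that drops a flag file when it sees an overflow signature. Both the marker string and the flag-file mechanism below are hypothetical; the signature Kairos matches and the location where it records the fresh-flag are not documented here.

```python
from pathlib import Path
from typing import Iterable

# Hypothetical signature; the real detector may match something else.
OVERFLOW_MARKERS = ("prompt is too long",)


def watch_stream(lines: Iterable[str], flag_path: Path) -> bool:
    """Scan stream lines; on an overflow signature, write the flag and stop.

    Returns True if the flag was written, False if the stream ended cleanly.
    """
    for line in lines:
        lowered = line.lower()
        if any(marker in lowered for marker in OVERFLOW_MARKERS):
            flag_path.parent.mkdir(parents=True, exist_ok=True)
            flag_path.write_text("overflow\n")
            return True
    return False
```

On its next turn, a wrapper following this pattern would check for the flag file and, if present, start fresh and remove it.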