Agent Memory Architectures: JITIR Against the Field
A decomposition exercise: Cloudflare Sessions, MemGPT/Letta, Zep/Graphiti, Mem0, A-MEM, LangMem, and what the CLI coding agents already do

1. What "memory" names

"Agent memory" is one word for at least four distinct things.

1. Conversation log:    durable record of messages and tool calls.
2. Working scratchpad:  mutable state the agent rewrites mid-task.
3. Learned facts:       extracted assertions, indexed for retrieval.
4. Reference material:  large documents loaded on demand.

The four have different write disciplines, different read patterns, and different cost profiles in the context window. Most frameworks that present a single "memory" API are quietly conflating two or more of them. The interesting question is not "which framework has the best memory" but "where does each one draw the seams, and what gets complected as a result?"

This document uses Cloudflare's Sessions API as the reference decomposition, characterizes the other systems against it, and then locates JITIR on a separate axis the reactive-memory field has left mostly unbuilt. The companion CLI-coding-agent survey (CLI Coding Agents Q2 2026) covers the same problem one level down, at the harness boundary, and is referenced where the parallels are direct.

2. Reference decomposition: Cloudflare Sessions

Cloudflare's Sessions API decomposes memory into four block types, each characterized by a provider contract. The contract is a set of methods on a JavaScript object; the Session detects which methods exist and synthesizes tools to match.

Block type   Provider methods           Prompt presence        Generated tool
readonly     get()                      full content           (none)
writable     get() + set()              full content + budget  set_context
searchable   get() + search() + set()   summary count          search_context
loadable     get() + load() + set()     metadata listing       load_context / unload

Three properties define this decomposition:

  • The provider is the contract. Capability is structural, not declared. A provider with a search() method becomes a searchable block; the search-tool generation is mechanical from that fact.
  • The system prompt is the schema. Each block declares its type to the model inline, every turn, via the tags [readonly], [writable], [searchable], [loadable], so the model reads the schema directly from the prompt.
  • Every read is agent-initiated. search_context and load_context are tools the model calls. The substrate (Durable Object, R2, SQLite FTS5) does nothing on its own.
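
The first two properties can be sketched as follows. This is a Python analogue under stated assumptions, not the Sessions API itself (which is JavaScript): classify_block, render_header, StaticNotes, and Scratchpad are hypothetical names, and the precedence order among search(), load(), and set() is a guess made for illustration.

```python
from typing import Any


def classify_block(provider: Any) -> str:
    """Infer the block type from which methods the provider exposes."""

    def has(name: str) -> bool:
        return callable(getattr(provider, name, None))

    if not has("get"):
        raise TypeError("every provider must expose get()")
    if has("search"):
        return "searchable"
    if has("load"):
        return "loadable"
    if has("set"):
        return "writable"
    return "readonly"


# Tool names come from the table above; the mapping itself is illustrative.
TOOLS = {
    "readonly": [],
    "writable": ["set_context"],
    "searchable": ["search_context"],
    "loadable": ["load_context", "unload"],  # "load_context / unload" in the table
}


class StaticNotes:  # hypothetical provider: get() only
    def get(self) -> str:
        return "house style: British spelling"


class Scratchpad:  # hypothetical provider: get() + set()
    def __init__(self) -> None:
        self.text = ""

    def get(self) -> str:
        return self.text

    def set(self, value: str) -> None:
        self.text = value


def render_header(name: str, provider: Any) -> str:
    """Inline type tag the model would read each turn, e.g. '[writable] plan'."""
    return f"[{classify_block(provider)}] {name}"
```

Nothing in either provider declares a type. Adding a search() method to Scratchpad would change both its tag and its generated tool with no other edit, which is the sense in which capability is structural.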

The third property is the one the rest of this document organizes around. It is the design choice that nearly every production memory system has made.

Two further moves in the Sessions API are worth naming because they separate things other systems complect:

  • Compaction overlays. Older messages are summarized into overlays stored in a separate table, applied at read time. The original messages are not deleted. Identity (the conversation) is preserved; a derived value (the summary) is layered over it.
  • Frozen system prompt. freezeSystemPrompt + withCachedPrompt decouple the rendered prompt from the underlying state. A set_context call writes through to the provider but does not change the cached rendered prompt until an explicit refreshSystemPrompt at a turn boundary. The write event and the read event are unbraided.
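
Both moves can be illustrated in a few lines. The sketch below is an assumption-laden Python analogue: Overlay, Log, and FrozenPrompt are hypothetical names, overlays are assumed not to overlap, and refresh() only stands in for refreshSystemPrompt. It does not reproduce Cloudflare's storage layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Overlay:
    start: int    # first covered message index (inclusive)
    end: int      # last covered message index (exclusive)
    summary: str


@dataclass
class Log:
    messages: list[str] = field(default_factory=list)      # never deleted
    overlays: list[Overlay] = field(default_factory=list)  # kept separately

    def read(self) -> list[str]:
        """Apply overlays at read time; the raw messages stay untouched."""
        out: list[str] = []
        i = 0
        for ov in sorted(self.overlays, key=lambda o: o.start):
            out.extend(self.messages[i:ov.start])
            out.append(f"[summary] {ov.summary}")
            i = max(i, ov.end)
        out.extend(self.messages[i:])
        return out


class FrozenPrompt:
    def __init__(self, state: dict[str, str]) -> None:
        self.state = state
        self._cached = self._render()

    def _render(self) -> str:
        return "\n".join(f"{k}: {v}" for k, v in sorted(self.state.items()))

    def set_context(self, key: str, value: str) -> None:
        self.state[key] = value      # the write goes through immediately

    def prompt(self) -> str:
        return self._cached          # the read still sees the frozen rendering

    def refresh(self) -> None:       # stand-in for refreshSystemPrompt
        self._cached = self._render()
```

In both classes the derived value (summary view, rendered prompt) is computed from the underlying state and never replaces it.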

3. The field, decomposed

3.1. MemGPT / Letta – virtual memory for context windows

The conversation log and the working scratchpad are decomposed (recall vs main context). Learned facts and reference material are both placed in archival memory, where the agent must distinguish them by its own conventions. The OS-paging metaphor describes the contract, not the storage shape.

Cost of the metaphor: page-out is voluntary. There is no kernel that swaps out an unused page when memory pressure is high. The model has to have the discipline to write to archival memory before the relevant fact leaves the FIFO window. The failure mode is silent forgetting that reads as fluency.
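
A toy model makes the failure mode concrete. MainContext, archive_insert, and recall are hypothetical names chosen for this sketch, and the FIFO window is reduced to a bounded deque; this is not Letta's implementation.

```python
from collections import deque


class MainContext:
    """FIFO window plus an archive the agent must write to explicitly."""

    def __init__(self, capacity: int) -> None:
        self.window: deque[str] = deque(maxlen=capacity)
        self.archive: list[str] = []

    def observe(self, message: str) -> None:
        # A bounded deque silently drops its oldest item when full.
        self.window.append(message)

    def archive_insert(self, fact: str) -> None:
        # Runs only if the model chooses to call it; nothing triggers it.
        self.archive.append(fact)

    def recall(self, needle: str) -> bool:
        return any(needle in m for m in list(self.window) + self.archive)
```

No error is raised when a message leaves the window unarchived. The only observable difference is that a later recall() returns False.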

3.2. Zep / Graphiti – bi-temporal knowledge graph

Each edge in the semantic graph carries four timestamps: t_created, t_expired (transaction time) and t_valid, t_invalid (event time). New episodes can invalidate existing edges by setting t_invalid; the old edge is not deleted.

Zep/Graphiti adopts a value-oriented treatment: facts accrue and are invalidated rather than mutated in place.
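
The edge shape described above can be sketched as an immutable record. Edge, invalidate, and valid_at are hypothetical names for illustration, the field names follow this section rather than any library schema, and the half-open validity interval is an assumption.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Edge:
    subject: str
    predicate: str
    obj: str
    t_valid: datetime                      # event time: fact became true
    t_created: datetime                    # transaction time: edge recorded
    t_invalid: Optional[datetime] = None   # event time: fact stopped being true
    t_expired: Optional[datetime] = None   # transaction time: edge retired


def invalidate(edges: list[Edge], old: Edge, at: datetime) -> list[Edge]:
    """Return a new edge list in which `old` is closed off, not removed."""
    closed = replace(old, t_invalid=at)
    return [closed if e is old else e for e in edges]


def valid_at(edges: list[Edge], when: datetime) -> list[Edge]:
    """Edges whose event-time interval [t_valid, t_invalid) contains `when`."""
    return [
        e for e in edges
        if e.t_valid <= when and (e.t_invalid is None or when < e.t_invalid)
    ]
```

Because Edge is frozen, invalidation produces a new value and the earlier one remains intact for anyone still holding it, so a query such as valid_at can be asked about any past moment.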

Cost: entity extraction and edge invalidation are LLM judgments. The graph is the product of an extractor that is itself fallible.

3.3. Mem0 – managed hybrid store with scopes

Scope is reified as a first-class axis (user / session / agent). The four block types from Cloudflare are collapsed into a single store, with the LLM extractor deciding what gets written. What is complected: the extractor's judgment about what is a fact is fused with the store's behavior.
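
Scope as a first-class axis amounts to filtering on identifiers attached to every record. The sketch below is illustrative only: Memory, in_scope, and the three field names are hypothetical and are not taken from the Mem0 client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Memory:
    text: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None
    agent_id: Optional[str] = None


def in_scope(memories: list[Memory], **scope: str) -> list[Memory]:
    """Keep memories whose scope fields equal every requested value."""
    return [
        m for m in memories
        if all(getattr(m, key) == value for key, value in scope.items())
    ]
```

Note what the sketch leaves out: nothing here decides which text becomes a Memory. In the system described above that decision belongs to the LLM extractor, and it is the part that is fused with the store.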

3.4. A-MEM – Zettelkasten with mutating notes

Each interaction becomes an atomic note with bidirectional links generated at insertion time based on semantic similarity. The memory-evolution step is where the design becomes place-oriented: historical notes are mutated. From a value-oriented lens this is exactly the thing to avoid for any setting that needs an audit trail.
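
The difference between the two orientations fits in two functions. Note, evolve_in_place, and evolve_as_value are hypothetical names; the first mirrors the mutation described above, and the second is one possible value-oriented alternative, not something A-MEM provides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    note_id: str
    version: int
    text: str


def evolve_in_place(store: dict[str, Note], note_id: str, text: str) -> None:
    """Place-oriented: the previous text is overwritten and unrecoverable."""
    store[note_id] = replace(store[note_id], text=text)


def evolve_as_value(history: list[Note], note_id: str, text: str) -> list[Note]:
    """Value-oriented: append a new version; earlier versions stay readable."""
    latest = max(
        (n for n in history if n.note_id == note_id),
        key=lambda n: n.version,
    )
    return history + [replace(latest, version=latest.version + 1, text=text)]
```

An audit question such as "what did the note say before the evolution step?" is answerable only in the second form.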

3.5. LangMem – semantic, episodic, procedural in LangGraph

The procedural-memory category is what nobody else factors out explicitly: "when summarizing email, the first sentence should name the action item" is a procedural rule, not a semantic fact. LangMem stores it as memory rather than as prompt text.

Cost: tight coupling to LangGraph state machines. Reported p95 search latency makes it impractical for interactive use.

4. CLI coding agents – convention as schema

The CLI-agent comparison (CLI Coding Agents Q2 — Memory and persistence) shows the same decomposition played out one level down, at the harness boundary, where the schema is filenames rather than provider methods:

Sense                            CLI-agent convention
Project-level instructions       CLAUDE.md / AGENTS.md / GEMINI.md
Per-category long-term memory    ~/.claude/.../memory/ (typed dirs)
Loadable reference material      .claude/skills/SKILL.md + AGENTS.md
Auto-extracted project context   Kiro: product.md tech.md structure.md

The schema is convention, not type. Convention scales to cross-harness portability — SKILL.md is now read by Claude Code, Copilot CLI, and OpenCode — but it does not give the model an in-prompt declaration of what the contract is for each file.

4.1. winze – semantic search over the convention

winze — an MCP server with semantic search and a GUI search/edit tool over a Markdown knowledge base — is the convention made queryable. The substrate is the same typed-directory markdown the CLI agents already keep; winze adds the search() method the bare filesystem lacks, promoting a readonly-or-writable block to a searchable one without moving where the bytes live. Two properties earn naming:

  • Writes are human-curated, not extracted. The GUI edit path means a person authors and revises the notes. There is no LLM extractor adjudicating what is a fact (contrast Mem0, A-MEM); the store's contents and an extractor's judgment cannot be complected, because there is no extractor.
  • Retrieval stays reactive. Semantic search is a tool the agent calls. winze sits on the reactive contract with everyone else, not on the JITIR axis below.

It is the Cloudflare search() lesson restated at the file layer: capability is structural. Point a semantic index at the markdown the harness already writes, and the convention becomes a searchable block. winze is one of a small family doing this (basic-memory, sqlite-memory, memsearch); it is the Clojure/MCP instance. (h/t Clojure Deref, 2026-05-19.)
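
The promotion from file tree to searchable block can be sketched without moving any bytes. This is not winze's code or interface: search is a hypothetical function, and plain term overlap stands in for semantic search, which would need an embedding model.

```python
from pathlib import Path


def search(root: Path, query: str, limit: int = 3) -> list[tuple[str, int]]:
    """Rank Markdown files under `root` by how many query terms they contain."""
    terms = set(query.lower().split())
    hits: list[tuple[str, int]] = []
    for path in sorted(root.rglob("*.md")):
        words = set(path.read_text(encoding="utf-8").lower().split())
        score = len(terms & words)
        if score:
            hits.append((str(path.relative_to(root)), score))
    hits.sort(key=lambda hit: -hit[1])  # stable sort: ties keep path order
    return hits[:limit]
```

The files stay where the harness wrote them and remain human-editable. The only addition is a read path, and it runs only when a caller invokes it, which keeps the sketch on the reactive contract.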

5. JITIR – a different cut

JITIR is on an axis the systems above mostly do not occupy. The storage shape is conventional; the contract is what differs.

5.1. Two contracts named

  • Reactive contract: search(intent) -> candidates. An agent or human authors the intent. The substrate responds to queries.
  • Proactive contract: observe(context) -> candidates pushed. The substrate authors the candidate set. The agent or human is the reader.

The two contracts are distinguished by who decides to look. Every system above adopts the reactive contract. The cost of the reactive contract is the failure mode it makes invisible: the agent did not search because it did not know there was anything to find.
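
The two signatures can be written down as interfaces over the same store. Reactive, Proactive, and NoteStore are hypothetical names, and word overlap is a deliberately crude stand-in for whatever relevance measure a real system would use.

```python
from typing import Callable, Protocol


class Reactive(Protocol):
    def search(self, intent: str) -> list[str]: ...


class Proactive(Protocol):
    def observe(self, context: str) -> None: ...


class NoteStore:
    """Hypothetical store satisfying both contracts over the same notes."""

    def __init__(self, notes: list[str], push: Callable[[list[str]], None]) -> None:
        self.notes = notes
        self.push = push

    def _match(self, text: str) -> list[str]:
        words = set(text.lower().split())
        return [n for n in self.notes if words & set(n.lower().split())]

    def search(self, intent: str) -> list[str]:
        # Reactive: the caller authored `intent` and asked.
        return self._match(intent)

    def observe(self, context: str) -> None:
        # Proactive: nobody asked; the store decides whether to surface notes.
        candidates = self._match(context)
        if candidates:
            self.push(candidates)
```

The matching logic is identical in both methods. What differs is who initiates the call and who receives the result, which is why the storage shape can stay conventional while the contract changes.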

Prior art: Rhodes, B. "The Wearable Remembrance Agent." Personal Technologies 1.4 (1997). MIT Media Lab.

5.2. Proactive consolidation: Dreaming as managed exemplar

Dreaming (Claude Managed Agents, research preview, 2026-05-06) occupies the proactive extreme of the axis. Where the reactive systems write on turn boundaries and retrieve at query time, Dreaming runs a scheduled, out-of-band process that reads a batch of past session transcripts against the existing store and emits a reorganized store for future reads. It is the nearest production sibling to LangMem's background memory-manager: the same shape, packaged as a managed service with a request-gated preview and an evaluator loop (Outcomes) attached.

The classification matters because the trade is not local. A reactive write that goes wrong costs one trajectory; a consolidation pass that goes wrong rewrites the substrate every subsequent session reads from, with no per-session rollback. The proactive pole buys amortized retrieval quality at the price of a store-scoped blast radius.

5.2.1. Open question: non-destructive claim and governance gap

Anthropic describes the rewrite as non-destructive. The claim is only meaningful if the prior store survives as an addressable, diffable artifact. Absent retained versions, non-destructive reduces to lossy compaction under a reassuring label. The governance slot is also unfilled: nothing exposed in the preview names who reviews dream(t) against dream(t-1), or against what invariant. As a tuple the gap is explicit, [reviewer:dreaming-agent:_@managed(store)], reviewer absent.

5.2.2. Falsification conditions

F1 (non-destructive claim). "Non-destructive" holds only if dream(t-1) is retained as an addressable, diffable artifact. If consolidation is lossy compaction with no retained prior, F1 refutes the label. Open pending vendor confirmation of store versioning.
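
What F1 asks for can be stated as a small interface. The sketch below is hypothetical throughout: VersionedStore and its methods are illustrative names, and nothing here describes how Dreaming stores its output.

```python
import difflib


class VersionedStore:
    """Hypothetical store that retains every consolidation output."""

    def __init__(self, initial: list[str]) -> None:
        self._versions: list[tuple[str, ...]] = [tuple(initial)]

    def consolidate(self, new_lines: list[str]) -> int:
        self._versions.append(tuple(new_lines))  # append, never overwrite
        return len(self._versions) - 1           # version index t

    def at(self, t: int) -> tuple[str, ...]:
        return self._versions[t]                 # addressable

    def diff(self, t: int) -> list[str]:
        """Line diff of version t against version t-1 (diffable)."""
        if t < 1:
            raise ValueError("version 0 has no predecessor")
        return list(
            difflib.unified_diff(self.at(t - 1), self.at(t), lineterm="", n=0)
        )
```

A store that cannot answer at(t - 1) after consolidate() has run fails F1 regardless of how the rewrite is labeled.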

F2 (primitive vs packaging). Dreaming is a distinct memory architecture iff it exposes a mechanism beyond managed scheduling plus an Outcomes-style evaluator over an external store. Absent that mechanism, treat it as managed packaging of the LangMem memory-manager pattern, and downgrade the entry from "new architecture" to "operational pattern."

6. The empty cell

Cloudflare's blocks are characterized by two binary axes: where the content lives in the prompt (always vs on-demand), and who writes it (agent vs code). A third row, substrate-initiated, is empty in every production memory framework:

Where                 Reads pushed to agent   Writes initiated by substrate
Substrate-initiated   JITIR-shape             (no production system)

The bottom-right cell — the substrate decides to write — is approached but occupied by none. Kiro's auto-generated product.md / tech.md / structure.md is the closest: the harness observes the repository on first run and writes a derived understanding. It is one-shot, not continuous.

7. What composes, what complects

System         Decomposed                           Complected
Cloudflare     all four (typed by block)            writes and reads at render time
MemGPT/Letta   log vs scratchpad vs archive         facts and reference in archival
Zep/Graphiti   identity vs state (bi-temporal)      semantic and episodic share graph
Mem0           scope (user/session/agent)           extraction fused with storage
A-MEM          notes linked by similarity           identity mutates in place
LangMem        semantic vs episodic vs procedural   memory ops complected with graph
JITIR          trigger separated from store         consumer is Emacs; needs port
Dreaming       proactive consolidation (scheduled)  blast radius is store-scoped

The systems differ less in their storage shape than in which distinctions they hold.
