Memory Is a Snapshot, Not a Source of Truth

A few weeks ago a recalled memory said a particular module exported a function called run_in_kata. The code had moved on. The function was now called KataBackend.invoke. The memory was useful for intent (“there is a Kata backend with a single-entrypoint invocation pattern”) and useless for the actual call signature. An action taken on the memory’s literal claim would have produced a call into a name that did not exist.

This is not a memory-system failure. It is the way memory works. The system captured what was true at the moment of capture. The code moved after. The memory did not.

The Stale Reference Problem

Any persistent memory across sessions accumulates references to things that can change: function names, file paths, flag names, configuration keys, version numbers, library names, even directory layouts. Each of those is a small claim about the world that was true when written and might not be true now.

The drift rate depends on what is being referenced. Architectural intent (the kind of decision that took a meeting and a write-up) is stable on the timescale of months. Specific implementation details (a function signature, a flag key) drift on the timescale of weeks. Volatile state (a feature flag being on or off, a service being deployed, an experiment running) drifts on the timescale of days or hours.

A memory entry written six weeks ago that names a specific function is a roughly even bet on whether the function still has that name. A memory entry that captures the architectural reason the function exists is much more durable.

The Verification Discipline

The fix is simple to describe and easy to skip. Before acting on any recalled fact that names a specific code artifact, verify the artifact still exists with the name and shape the memory says it has.

The verification is cheap: a grep, a file existence check, a quick read of the relevant module. The cost of skipping it is whatever it costs when the agent runs against the wrong name and produces a diff that does not compile, or worse, compiles and silently calls into something that is not what the memory implied.

The cost asymmetry is what makes the discipline obvious in principle and rare in practice. The verification costs seconds. The non-verification costs minutes-to-hours when it fires. Most of the time the memory is correct and the verification was unnecessary. The handful of times the memory is wrong, the verification was the only thing standing between the agent and a broken dispatch.

What Memory Is Good For

The discipline is not “do not use memory.” Memory does work that nothing else can do. The set of things memory is good for, in roughly the order I trust them:

Decisions and their reasons. Why a particular technical choice was made, what the alternatives were, what made the chosen path win. This is the durable kind of memory. Decisions persist across refactors of the code they apply to.

Architectural intent. The shape of a system at the level of “here is the boundary between concerns” or “here is what role this module plays.” Implementation moves; intent moves much more slowly.

Personal preferences and discipline. How a particular person wants to work, what review processes they require, what they have been burned by before and want to avoid. These do not change with the code.

Empirical facts about external systems that move slowly. The behavior of a regulatory environment, the political dynamics of a procurement process, the structure of a published proposal template. These shift across years, not weeks.

The set of things memory is NOT good for, in the order it most often hurts:

Specific function or class signatures. Specific file paths. Specific flag names or configuration keys. Specific version numbers of dependencies. Specific test counts or coverage percentages. Anything stamped with a snapshot date that is more than a couple of weeks old.

If a memory entry contains one of these, the entry’s lifetime is short and the verification step is the price of using it.

A Generalization

The pattern is not unique to AI memory. It applies to anything that captures a snapshot of code or infrastructure and then sits unmaintained.

Inline code comments are memory. So are READMEs, runbooks, on-call playbooks, internal wikis, and the document a senior engineer wrote when they left explaining how the deploy pipeline works. All of these contain claims about specific artifacts that were true at the moment of writing. All of them drift.

The discipline of verifying a memory entry against current code is the same discipline as verifying that an on-call runbook still reflects how the system works before you act on it during an incident. The cost of skipping the verification is higher under incident pressure. The need for the verification is the same.

What This Means for AI Workflows

If your agentic workflow stores cross-session memory, build the verification step into the workflow rather than relying on the agent to remember to verify. Specifically: any time the agent retrieves a memory entry that contains a code artifact name, the next action should be a verification call that confirms the artifact still exists. If it does not, the memory entry gets flagged stale and the agent re-derives from the current code rather than acting on the recall.

The cost of this verification step is small enough that it pays for itself the first time it catches a stale reference. The discipline is to require it, not optional.

The Framing

Memory is a tool for capturing what is hard to re-derive. Architectural intent and decision reasoning are hard to re-derive; capturing them in memory is a good investment. Specific code artifacts are easy to re-derive by reading the current code; capturing them in memory creates a small liability that grows over time.

The most useful memory entry is the one that names the intent, gestures at the implementation, and lets the verification step against current code fill in the specifics. The most expensive memory entry is the one that names the implementation in detail and gets quoted weeks later against a codebase that has moved.

The cheapest line of defense against the second kind of entry is to verify before you trust. The memory said X. The code is the only thing that knows whether X is still true.