A Visual Guide to Belief Revision in AI Agents
Your agent stores facts but cannot change its mind. This guide explains, visually, why vector databases fail at contradictions, what forty years of belief revision theory can teach us, and how to build the machinery that lets an agent update what it knows.
In 1994, Hollyn Johnson and Colleen Seifert ran a study that should haunt anyone building AI agents. They gave participants a series of reports about a warehouse fire, one of which said that cans of oil paint and gas cylinders had been stored carelessly in a closet. A later report issued a correction: the closet had actually been empty. When asked afterward to explain the fire, for instance why it was so intense, the participants kept citing the paint and the gas. They had received the correction. They could recall it if asked directly. But the original misinformation continued to shape their reasoning as if it had never been retracted (Johnson & Seifert, 1994).
Psychologists call this the continued influence effect, and decades of follow-up research have confirmed it is not a fringe finding but a robust feature of human cognition: retractions rarely erase the initial belief from the mental model, even when people remember being corrected (Lewandowsky et al., 2012; Ecker et al., 2022).
If you have built anything on top of an LLM that stores facts about users, projects, or conversations, you have built a system with exactly the same defect, except worse. Your agent doesn’t even have the human advantage of vaguely recalling the correction. It has a vector database full of contradictions it cannot tell apart.
This guide is about the machinery that fixes that. Not by making the vector store smarter, but by building a layer above it that can detect when facts contradict each other, decide which one wins, and retire the loser in a way that is queryable, auditable, and reversible. The field that formalized this problem is called belief revision, and it has had a rigorous foundation in logic since 1985. We just haven’t been using it.
What’s in this guide: the retrieval trap · why vectors can’t tell truth from contradiction · how your brain handles this · what belief revision is · the claim as atom · bi-temporal tracking · conflict detection · the three AGM operations · the property graph · the full architecture · what breaks in production · references
The retrieval trap
Here is a conversation that will eventually happen in every agent with long-term memory:
Turn 5 — user: We’re an Express.js shop. Everything runs on Node.
Turn 23 — user: Actually, we switched to Go last month. Much happier.
Turn 41 — user: What framework are we using?
Turn 41 — agent: You’re using Express.js for your backend stack.
The agent isn’t hallucinating. It isn’t confused. It is faithfully reporting what it retrieved. The problem is that its memory contains two facts about the same question, and its retriever has no way to know that one of them replaced the other. Both are about the project’s framework. Both embed to nearly the same point in vector space. The retriever returned the older one because it happened to score slightly higher on cosine similarity, and the agent quoted it with full confidence.
This is not a retrieval quality problem. No amount of re-ranking, hybrid search, or better embedding models will fix it. The failure is structural: the system has no concept of supersession. It stores facts but has no mechanism for one fact to replace another.
The industry’s instinct has been to fix this inside the retrieval layer: better embeddings, smarter chunking, re-ranking with a cross-encoder. These are fine improvements to retrieval. But retrieval is the wrong abstraction for this problem, because retrieval answers “what is similar to this query?” and the question you actually need answered is “what is currently true?”
Why vectors can’t tell truth from contradiction
To see why the fix can’t live in the embedding layer, you need to understand what embeddings actually encode. An embedding model maps a sentence to a point in high-dimensional space, arranged so that sentences about the same topic end up near each other. This is useful for many things. It is useless for detecting contradictions, because contradictory statements are, by definition, about the same topic.
The numbers in the diagram are not hypothetical. Research on “negation blindness” across major embedding model families has documented the effect systematically: sentences that differ only by a negation routinely achieve cosine similarity above 0.9 (Cao, 2025). “We use Express.js” and “We stopped using Express.js” are near-neighbors in every embedding space tested. They differ in exactly the dimension that matters for memory, and the embedding encodes exactly the dimensions that don’t.
This is not a flaw in any particular model. Embedding models are trained on contrastive objectives that pull similar texts together and push dissimilar texts apart. Contradictory statements are semantically similar by any reasonable training signal: same words, same entities, same domain. The objective that makes retrieval work is the same objective that makes contradiction detection impossible.
Some teams try to patch this with metadata filters: attach a timestamp, filter by recency, hope the newer fact wins. This helps in the simple case (two facts, clear temporal ordering) and fails in every interesting one. What if the user says “actually, I was wrong last week, we are still on Express.js”? What if two facts don’t contradict but one refines the other? What if a fact depends on another fact that was just retracted? Timestamps tell you when something was said. They don’t tell you whether it is still true.
The fix requires a layer above retrieval that operates on structured claims, not raw text. Before we build that layer, it’s worth looking at how the system we evolved to solve this problem actually works.
How your brain handles this
Your brain does not store beliefs in a flat list. It has dedicated neural hardware for detecting when two beliefs conflict and resolving the conflict before it corrupts downstream reasoning. The best-studied model of this system is the conflict monitoring theory developed by Botvinick, Cohen, and Carter.
The anterior cingulate cortex, a region in the medial frontal lobe, acts as a conflict detector. It monitors for simultaneous activation of mutually incompatible responses. When it detects a conflict, it does not resolve the conflict itself. Instead, it fires an alert signal to the dorsolateral prefrontal cortex, which implements top-down control: biasing processing toward the stronger or more relevant belief and suppressing the competing one (Botvinick et al., 2001; Botvinick, Cohen, & Carter, 2004). Van Veen and colleagues later showed this is not just a theoretical model: in an fMRI study of cognitive dissonance, the degree of anterior cingulate activation predicted how much participants subsequently changed their attitudes to resolve the inconsistency (Van Veen et al., 2009).
The architecture is two stages: detect, then resolve. The detector does not need to know which belief is correct. It only needs to recognize that two active beliefs are incompatible. The resolver has a different job: weighing evidence, considering source reliability, and committing to one belief while suppressing the other. These are separable functions, handled by different brain regions, and they map cleanly to separable engineering components.
The Bayesian brain hypothesis takes this further. Karl Friston’s predictive coding framework proposes that the brain maintains hierarchical internal models of the world, constantly generating predictions and comparing them against incoming evidence. When a prediction error occurs, the brain updates its model, weighted by the precision (confidence) of both the prior belief and the new evidence (Friston, 2010; Clark, 2013). High-confidence evidence overrides low-confidence priors. A user’s explicit statement (“We switched to Go”) has high precision. An agent’s inference from a code snippet has lower precision. The brain’s update rule respects this asymmetry. Our agents’ memory systems, by default, do not.
What’s striking is that despite this sophisticated machinery, humans still fall prey to the continued influence effect. The correction is received, the conflict is detected, but the original belief has already been woven into the causal model, and unwinding it requires more than just marking it as false. Leon Festinger described the motivational cost of this kind of unwinding in 1957 as cognitive dissonance: the discomfort of holding contradictory beliefs is real, and the brain will work to avoid it, sometimes by rejecting valid corrections rather than revising the model (Festinger, 1957).
The engineering lesson: even a system with dedicated conflict detection and resolution hardware can fail at belief revision if it doesn’t also handle the downstream dependencies of the retracted belief. A fact doesn’t exist in isolation. It supports other facts. It was used in reasoning. Revoking it means tracing its influence, and that requires structure that a flat list of embeddings simply doesn’t have.
What belief revision actually is
The formalization arrived in 1985, when Carlos Alchourrón, Peter Gärdenfors, and David Makinson published the paper that became the foundation of the field. The AGM framework, named after the authors’ initials, defines three operations that any rational belief system must support, along with the postulates those operations must satisfy: eight for contraction and eight for revision (Alchourrón, Gärdenfors, & Makinson, 1985).
The three operations are:
Expansion. A new belief arrives that doesn’t conflict with anything you already know. It simply joins the set. This is the easy case, and it’s the only one that most agent memory systems handle correctly.
Contraction. A belief is removed. But removal is not deletion. Other beliefs may have depended on the removed belief, and those need to be re-evaluated. The AGM postulates require minimal change: remove as little as possible to restore consistency.
Revision. The hard one. A new belief arrives that contradicts an existing belief. Revision decomposes into contraction (remove whatever contradicts the new belief, along with anything that no longer stands without it) followed by expansion (add the new belief). The Levi identity, named after philosopher Isaac Levi, formalizes this decomposition: to revise by a belief, contract by its negation, then expand. It means you never have to solve revision as a special case; if your contraction operator is sound, revision comes for free.
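In symbols, the Levi identity is a one-liner. The notation below is one common convention and an assumption here, since symbols vary across the literature: K is the belief set, p is the incoming belief, and ÷, +, and * stand for contraction, expansion, and revision.

```latex
K * p \;=\; (K \div \neg p) + p
```

Read right to left inside the parentheses: first contract by the negation of p so that nothing left in K contradicts p, then expand with p.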
This framework is forty years old and comes from mathematical logic, not computer science. But the engineering translation is direct, and part of it came first. Jon Doyle built the first Truth Maintenance System in 1979, tracking the justifications behind each belief so that when one was retracted, everything it supported could be re-examined automatically (Doyle, 1979). De Kleer’s Assumption-based TMS extended this to track multiple consistent worldviews simultaneously (de Kleer, 1986). These systems were practical tools for expert systems in the 1980s. When the expert-system era ended, the techniques largely dropped out of the industry’s working vocabulary. They’re overdue for a return.
The AGM postulates don’t tell you how to decide which belief wins. They tell you what properties the decision must have to be rational. In practice, the resolution policy needs to come from somewhere, and the two most useful inputs are source authority and temporal recency. A user’s direct statement outranks an inference the agent made. A more recent statement from the same source outranks an older one. These aren’t novel ideas. They’re the engineering translation of the precision-weighted updating that the Bayesian brain does automatically.
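A resolution policy built on those two inputs can be very small. The sketch below is one possible reading, not a prescribed rule: the `Evidence` type, the `SOURCE_RANK` tiers, and the `pick_winner` name are all hypothetical, and the dates assume the year 2025 for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical trust tiers: a higher number wins. Tune these for your product.
SOURCE_RANK = {"agent_inference": 0, "tool_observation": 1, "user_stated": 2}


@dataclass(frozen=True)
class Evidence:
    obj: str               # the value being claimed, e.g. "Go"
    source: str            # one of the SOURCE_RANK keys
    asserted_at: datetime  # when the statement was made


def pick_winner(old: Evidence, new: Evidence) -> Evidence:
    """Source authority decides first; recency breaks ties within a tier."""
    old_rank, new_rank = SOURCE_RANK[old.source], SOURCE_RANK[new.source]
    if new_rank != old_rank:
        return new if new_rank > old_rank else old
    return new if new.asserted_at >= old.asserted_at else old


# A later agent inference does not override an earlier explicit user statement.
stated = Evidence("Go", "user_stated", datetime(2025, 5, 16))
inferred = Evidence("Express.js", "agent_inference", datetime(2025, 5, 20))
winner = pick_winner(stated, inferred)
```

Ranking authority above recency is a design choice: it stops a fresh but weak inference from overwriting something the user said outright.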
Sven Ove Hansson argued in 1991 that real agents don’t operate on logically closed theories (infinite sets of all logical consequences) but on finite belief bases: the specific facts they’ve been told, without computing every implication. This is a better model for production systems where you store exactly the claims you’ve extracted, not their transitive closure (Hansson, 1991). Willard Van Orman Quine made a related point from philosophy: beliefs form a web, not a list, and any revision at one point sends ripples through the web. No belief is immune to revision, and no revision is purely local (Quine, 1951). The property graph we’ll build later is a literal implementation of Quine’s web.
The claim as the atom of belief
If you want to detect contradictions, compare beliefs, and track which one supersedes which, you need a unit of belief that is smaller than a sentence and more structured than a string. The unit is the claim: a structured triple of subject, predicate, and object, annotated with temporal metadata and source provenance.
The subject-predicate-object structure is not accidental. It is the minimum viable representation that makes conflict detection mechanical. Two claims conflict when they share the same subject and predicate but differ in their object, provided the predicate is single-valued (a project has one primary framework, but it can have many contributors). “Project uses Express.js” and “Project uses Go” have the same subject (project) and predicate (uses_framework) but different objects. That structural match is the signal. No embeddings needed, no LLM call required, no fuzzy similarity threshold to tune.
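As a minimal sketch, the claim and its conflict test might look like this. The field names, the source labels, and the `conflicts` helper are assumptions for illustration, and the check assumes the predicate holds one value at a time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str
    source: str                           # e.g. "user_stated", "agent_inference"
    confidence: float                     # 0.0 to 1.0
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None means still valid
    status: str = "active"                # "active" or "superseded"


def conflicts(a: Claim, b: Claim) -> bool:
    """Both active, same subject and predicate, different object."""
    return (
        a.status == "active"
        and b.status == "active"
        and a.subject == b.subject
        and a.predicate == b.predicate
        and a.obj != b.obj
    )


# Dates assume the year 2025 for illustration.
express = Claim("project", "uses_framework", "Express.js", "user_stated", 0.95, datetime(2025, 3, 1))
go = Claim("project", "uses_framework", "Go", "user_stated", 0.95, datetime(2025, 5, 15))
vercel = Claim("project", "deploys_on", "Vercel", "user_stated", 0.9, datetime(2025, 3, 1))
```

Here `conflicts(express, go)` is true and `conflicts(go, vercel)` is false, with no similarity threshold involved.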
Free text can’t do this. “The budget changed” is a single string. You need an LLM to figure out that it relates to a prior budget statement, what the old value was, and what the new value is. With a structured claim, the relationship is explicit in the data.
The extraction step, turning free-text conversation turns into structured claims, is itself an LLM call. This is the one place in the belief revision pipeline where you deliberately use a language model. The prompt is narrow: given this conversation turn, extract any factual claims as subject-predicate-object triples, and for each, note the source (was it stated by the user, inferred from context, or observed from tool output) and your confidence. This is not a general reasoning task. It is a structured extraction task, and modern models handle it reliably.
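A sketch of what that narrow prompt and its output handling could look like follows. The prompt wording, the JSON shape, and the `parse_claims` helper are illustrative assumptions, and the model call itself is left out; `raw` stands in for whatever text the model returns.

```python
import json

# Hypothetical prompt template; fill it with EXTRACTION_PROMPT.format(turn=...).
EXTRACTION_PROMPT = """Extract factual claims from the conversation turn below.
Return a JSON list of objects with keys: subject, predicate, object,
source (user_stated | inferred | tool_observed), confidence (0 to 1).
Return an empty list if the turn contains no factual claims.

Turn: {turn}"""

REQUIRED_KEYS = {"subject", "predicate", "object", "source", "confidence"}
ALLOWED_SOURCES = {"user_stated", "inferred", "tool_observed"}


def parse_claims(raw: str) -> list:
    """Keep only well-formed claims; drop anything malformed."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    valid = []
    for item in items:
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            continue
        if item["source"] not in ALLOWED_SOURCES:
            continue
        conf = item["confidence"]
        if isinstance(conf, bool) or not isinstance(conf, (int, float)):
            continue
        if not 0 <= conf <= 1:
            continue
        valid.append(item)
    return valid


# Example model output: one usable claim, one with an unknown source label.
raw = json.dumps([
    {"subject": "project", "predicate": "uses_framework", "object": "Go",
     "source": "user_stated", "confidence": 0.95},
    {"subject": "project", "predicate": "uses_framework", "object": "Rust",
     "source": "guess", "confidence": 0.2},
])
claims = parse_claims(raw)
```

Rejecting malformed output at this boundary is cheaper than discovering a half-formed claim in the store later.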
Confidence matters. “I’m vegetarian” stated outright is not the same as “probably vegetarian” inferred from three salad orders. Production systems that skip confidence tracking end up asserting inferences as facts, which is how an agent insists you’re vegetarian because you once said you liked a salad. Psychologists call the human version of this a source-monitoring error: confusing where a belief came from, and thereby misattributing its reliability (Johnson, Hashtroudi, & Lindsay, 1993).
Bi-temporal tracking
Facts have two timelines, and conflating them is a common source of bugs. The first is validity time: when was this fact true in the world? The Express.js claim was valid from March 1 to May 15. The Go claim is valid from May 15 onward. The second is assertion time: when did the system learn this fact? The Express.js claim was recorded on March 3. The Go claim was recorded on May 16. These are different questions, and they need different answers.
Bi-temporal tracking is borrowed from database design, where Richard Snodgrass and others formalized it in the 1980s and 1990s; the database literature calls the two timelines valid time and transaction time. The Zep team’s Graphiti architecture applies it to agent memory specifically, modeling facts as edges between entity nodes in a temporal knowledge graph where both timelines are first-class properties (Rasmussen et al., 2025).
The payoff is not just correctness on current queries. It’s the ability to answer historical questions (“what did we think the budget was in April?”), audit trails (“when did we learn this was wrong?”), and rollback (“undo the last revision”). These are not academic concerns. In regulated industries, auditable memory is a compliance requirement. In any production system, the ability to trace a wrong answer back to the fact that caused it is the difference between debugging and guessing.
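Here is a small sketch of the two timelines using the Express.js and Go dates from above. The `Record` shape and the `as_of` helper are assumptions, the year 2025 is invented for illustration, and the sketch stores a corrected copy of a record rather than mutating it, which is one of several ways to keep both timelines.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Record:
    obj: str
    valid_from: datetime             # validity time: true in the world from
    valid_to: Optional[datetime]     # None means still true
    recorded_at: datetime            # assertion time: when the system learned it
    retired_at: Optional[datetime]   # when the system stopped holding this version


def as_of(records, valid_at: datetime, known_at: datetime):
    """What did the system believe at `known_at` about the world at `valid_at`?"""
    return [
        r.obj for r in records
        if r.recorded_at <= known_at
        and (r.retired_at is None or known_at < r.retired_at)
        and r.valid_from <= valid_at
        and (r.valid_to is None or valid_at < r.valid_to)
    ]


MAR_1, MAR_3 = datetime(2025, 3, 1), datetime(2025, 3, 3)
MAY_15, MAY_16 = datetime(2025, 5, 15), datetime(2025, 5, 16)

records = [
    # Learned on March 3 with an open-ended validity; retired when corrected.
    Record("Express.js", MAR_1, None, MAR_3, MAY_16),
    # Recorded on May 16: Express.js was only true until May 15.
    Record("Express.js", MAR_1, MAY_15, MAY_16, None),
    Record("Go", MAY_15, None, MAY_16, None),
]

now = datetime(2025, 5, 20)
current = as_of(records, valid_at=now, known_at=now)
april_view = as_of(records, valid_at=datetime(2025, 4, 1), known_at=now)
stale_view = as_of(records, valid_at=MAY_15, known_at=MAY_15)
```

The third query is the audit case: on May 15 the project was already on Go, but the system had not been told yet, so it still believed Express.js.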
Conflict detection
Detection is the anterior cingulate cortex of the system: the component whose only job is to recognize that two active beliefs are incompatible. It does not resolve the conflict. It flags it for the revision engine.
The core primitive is structural matching on subject and predicate. When a new claim arrives, the detector checks: does an active claim already exist with the same subject and the same predicate? If yes, and the objects differ, that is a conflict. This is a database lookup, not a language model call, and it runs in milliseconds.
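An in-memory sketch of that lookup follows. The `ConflictDetector` class is a hypothetical stand-in for an indexed database query on subject and predicate, and the `Claim` type is trimmed to the three fields the detector needs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str


class ConflictDetector:
    """Index active claims by (subject, predicate) for constant-time lookup."""

    def __init__(self):
        self._active = defaultdict(list)

    def add(self, claim: Claim) -> None:
        self._active[(claim.subject, claim.predicate)].append(claim)

    def find_conflicts(self, new: Claim) -> list:
        # .get avoids creating empty index entries for keys we only probe.
        candidates = self._active.get((new.subject, new.predicate), [])
        return [c for c in candidates if c.obj != new.obj]


detector = ConflictDetector()
detector.add(Claim("project", "uses_framework", "Express.js"))
detector.add(Claim("project", "deploys_on", "Vercel"))

hits = detector.find_conflicts(Claim("project", "uses_framework", "Go"))
no_hits = detector.find_conflicts(Claim("project", "uses_framework", "Express.js"))
```

The detector only reports; deciding which claim wins stays with the revision engine.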
The subtle part is predicate equivalence. “Uses” and “runs on” might mean the same thing in context. “Budget” and “project budget” are the same predicate. You have two options here. The conservative one is to normalize predicates during extraction: instruct the extraction prompt to use a controlled vocabulary of predicates, and map synonyms at write time. The aggressive one is to use a lightweight similarity check on predicates at conflict-detection time. In practice, the conservative approach works better. A small, curated predicate vocabulary (fifty to two hundred entries for most domains) catches the vast majority of conflicts without the fragility of runtime semantic matching.
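The conservative option can be sketched as a plain lookup table applied at write time. The vocabulary entries and the `normalize_predicate` name below are illustrative assumptions; a real vocabulary would be curated per domain.

```python
from typing import Optional

# Hypothetical controlled vocabulary: canonical predicate -> accepted surface forms.
VOCABULARY = {
    "uses_framework": {"uses", "runs on", "is built with", "stack includes"},
    "budget": {"budget", "project budget"},
}

# Invert once so each lookup is a single dictionary access.
SYNONYM_TO_CANONICAL = {
    synonym: canonical
    for canonical, synonyms in VOCABULARY.items()
    for synonym in synonyms
}


def normalize_predicate(raw: str) -> Optional[str]:
    """Map a raw predicate to its canonical form, or None if it is unknown."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    return SYNONYM_TO_CANONICAL.get(key)
```

Returning `None` for an unknown predicate, rather than guessing, gives you a queue of candidates for the next vocabulary update.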
What the detector should not do: try to catch every possible logical implication. “We use Go” and “We deploy on Vercel” don’t share a predicate, but they may be practically incompatible (if, say, the Vercel deployment was set up around a Node.js runtime). Catching this kind of implicit conflict requires world knowledge and reasoning that belongs in a different layer, not in a fast, structural conflict detector. Build the simple detector first. If implicit conflicts become a real problem, add a periodic “consistency audit” that runs an LLM over the active belief set to flag potential issues. Don’t put that on the hot path.
The three AGM operations
With structured claims and a conflict detector, the three AGM operations translate directly into code.
Expand. The new claim has no conflict with any active claim. Insert it into the graph with status active, set valid_from to now, leave valid_to open. No existing claims are touched. This is the common case, and it’s fast.
Contract. A claim needs to be removed (the user says “forget that I told you our budget”). Set its status to superseded and close its validity window (valid_to = now). Then check for dependents: any active claim whose existence was justified by the removed claim needs to be flagged for re-evaluation. For example, if a “Deploy on Vercel” claim depends on “We use Express.js,” contracting the Express.js claim should flag the Vercel claim. Minimal change applies here too: flag only direct dependents, and cascade further only if a dependent is itself contradicted.
Revise. The new claim conflicts with an existing active claim. This is contraction followed by expansion (the Levi identity). Contract the old claim (mark it superseded, close its validity window, check dependents), then expand with the new claim. The revision should return a structured result indicating what was superseded, what was added, and what dependents were flagged, so the system can log the decision and, if needed, explain it to the user.
The operation that most teams get wrong is contraction, not revision. Deletion feels simpler than supersession, but deletion destroys the audit trail and makes historical queries impossible. The correct operation is never to delete a claim but to close its validity window and mark it superseded. The claim remains in the graph, queryable for historical questions, but excluded from current-state queries. This is the tombstoning pattern from database design, applied to beliefs.
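Put together, the three operations fit in a few dozen lines. This is a sketch under stated assumptions, not a reference implementation: `BeliefStore`, the claim ids, and the `depends_on` field are hypothetical, and a real store would sit on a database rather than a dictionary.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Claim:
    id: str
    subject: str
    predicate: str
    obj: str
    status: str = "active"
    valid_from: Optional[datetime] = None
    valid_to: Optional[datetime] = None
    depends_on: list = field(default_factory=list)  # ids of claims this one requires


class BeliefStore:
    def __init__(self):
        self.claims = {}

    def expand(self, claim: Claim, now: Optional[datetime] = None) -> None:
        """Add a non-conflicting claim; nothing else is touched."""
        claim.valid_from = now or datetime.now(timezone.utc)
        self.claims[claim.id] = claim

    def contract(self, claim_id: str, now: Optional[datetime] = None) -> list:
        """Tombstone the claim (never delete) and return direct dependents."""
        old = self.claims[claim_id]
        old.status = "superseded"
        old.valid_to = now or datetime.now(timezone.utc)
        return [c.id for c in self.claims.values()
                if c.status == "active" and claim_id in c.depends_on]

    def revise(self, new: Claim, now: Optional[datetime] = None) -> dict:
        """Levi identity: contract whatever conflicts, then expand."""
        now = now or datetime.now(timezone.utc)
        conflicting = [c for c in self.claims.values()
                       if c.status == "active" and c.subject == new.subject
                       and c.predicate == new.predicate and c.obj != new.obj]
        flagged = []
        for old in conflicting:
            flagged += self.contract(old.id, now)
        self.expand(new, now)
        return {"superseded": [c.id for c in conflicting],
                "added": new.id, "flagged": flagged}


store = BeliefStore()
store.expand(Claim("c1", "project", "uses_framework", "Express.js"))
store.expand(Claim("c2", "project", "deploys_on", "Vercel", depends_on=["c1"]))
result = store.revise(Claim("c3", "project", "uses_framework", "Go"))
```

After the revision, the Express.js claim is still in `store.claims` with a closed validity window, and `result` carries the audit record: what was superseded, what was added, and which dependents need another look.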
The property graph
The data structure that holds all of this together is a property graph: claims as nodes, relationships between claims as typed, directed edges.
Three edge types do most of the work:
SUPERSEDES. Directed from the new claim to the old one. “Go supersedes Express.js.” This edge is what makes the revision operation queryable: you can always trace backward to see what a claim replaced.
SUPPORTS. Directed from a claim to the claim it reinforces. “The team hired a Go developer” supports “We use Go.” Support edges increase confidence. They also create a useful signal for consolidation: claims with many support edges are more durable.
DEPENDS_ON. Directed from a claim that logically depends on another. “Deploy on Vercel” depends on “We use Express.js.” When the dependency target is contracted, the dependent claim is flagged for re-evaluation. This is the cascading revision that the AGM postulates require: minimal change, but not zero change.
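The three edge types can be sketched with nothing more than two adjacency maps. The `BeliefGraph` class, its method names, and the string claim ids are assumptions for illustration; a graph database or an edges table would play this role in production.

```python
from collections import defaultdict

EDGE_TYPES = {"SUPERSEDES", "SUPPORTS", "DEPENDS_ON"}


class BeliefGraph:
    def __init__(self):
        self.out_edges = defaultdict(list)  # src -> [(edge_type, dst)]
        self.in_edges = defaultdict(list)   # dst -> [(edge_type, src)]

    def add_edge(self, src: str, edge_type: str, dst: str) -> None:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.out_edges[src].append((edge_type, dst))
        self.in_edges[dst].append((edge_type, src))

    def history(self, claim_id: str) -> list:
        """Follow SUPERSEDES edges backward: what did this claim replace?"""
        chain, seen, current = [], {claim_id}, claim_id
        while True:
            replaced = [dst for t, dst in self.out_edges.get(current, [])
                        if t == "SUPERSEDES" and dst not in seen]
            if not replaced:
                return chain
            current = replaced[0]
            seen.add(current)
            chain.append(current)

    def dependents(self, claim_id: str) -> list:
        """Claims to flag for re-evaluation when `claim_id` is contracted."""
        return [src for t, src in self.in_edges.get(claim_id, []) if t == "DEPENDS_ON"]

    def support_count(self, claim_id: str) -> int:
        return sum(1 for t, _ in self.in_edges.get(claim_id, []) if t == "SUPPORTS")


graph = BeliefGraph()
graph.add_edge("uses_go", "SUPERSEDES", "uses_express")
graph.add_edge("hired_go_dev", "SUPPORTS", "uses_go")
graph.add_edge("deploy_vercel", "DEPENDS_ON", "uses_express")
```

With these edges, `graph.history("uses_go")` walks back to the Express.js claim, and `graph.dependents("uses_express")` names the Vercel claim that needs re-evaluation.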
Production implementations of this graph use either a dedicated graph database (Neo4j, FalkorDB) or a property-graph layer built on top of Postgres with a JSONB edges table. Zep’s Graphiti architecture takes the former approach, and their paper reports lower response latency than full-context baselines (Rasmussen et al., 2025). Young Bin Park’s Kumiho system takes this further, grounding the property graph in formal AGM semantics with immutable revision histories and URI-based claim addressing (Park, 2026).
The graph is not a replacement for the vector store. It is a layer above it. The vector store still handles episodic retrieval (finding past conversations that are relevant to the current turn). The belief graph handles semantic truth (deciding what is currently true about the user, the project, and the world). These are different questions, and they deserve different data structures.
The full architecture
Three observations about the architecture:
The read path filters before retrieval. The graph query returns only active claims, so superseded beliefs never reach the model through the belief layer. It cannot quote what it never sees, which structurally prevents the agent’s version of the continued influence effect. This is the core win, and it is worth more than any improvement to the embedding model or the re-ranker.
The write path is an extraction-detection-resolution pipeline. Extraction (LLM call: turn to structured claims) is the expensive step. Conflict detection (structural match on subject + predicate) is cheap. Revision (AGM operations on the graph) is cheap. The whole pipeline runs asynchronously after the reply is sent, so it doesn’t add latency to the user experience. This is the same async-write principle from the memory guide: reads are synchronous and lean; writes are asynchronous and thorough.
The consolidation daemon is sleep. Periodically, a background process walks the graph and performs maintenance: merge near-duplicate claims (same meaning, slightly different phrasing), move orphaned nodes (superseded claims with no inbound SUPERSEDES edges from active claims) out of the hot graph without destroying their history, and check referential integrity (is every DEPENDS_ON target still active?). This is the engineering equivalent of what the hippocampal-cortical consolidation loop does during sleep: compress, deduplicate, and strengthen the important memories while pruning the noise. Budget for it from day one.
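The first two observations, a lean synchronous read and a thorough asynchronous write, can be sketched with asyncio. Everything here is a stand-in: `fake_extract` replaces the extraction LLM call with a hard-coded result, and the claim dictionaries and function names are hypothetical.

```python
import asyncio

claims = [
    {"subject": "project", "predicate": "uses_framework",
     "object": "Express.js", "status": "active"},
]


def read_active(subject: str) -> list:
    """Read path: only active claims ever reach the prompt."""
    return [c for c in claims if c["subject"] == subject and c["status"] == "active"]


async def fake_extract(turn: str) -> list:
    """Placeholder for the extraction LLM call; returns a canned claim."""
    await asyncio.sleep(0)
    return [{"subject": "project", "predicate": "uses_framework",
             "object": "Go", "status": "active"}]


async def write_path(turn: str, extract) -> None:
    """Extraction, then structural conflict detection, then revision."""
    for new in await extract(turn):
        for old in read_active(new["subject"]):
            if old["predicate"] == new["predicate"] and old["object"] != new["object"]:
                old["status"] = "superseded"  # tombstone, do not delete
        claims.append(new)


async def handle_turn(turn: str):
    context = read_active("project")  # synchronous, lean
    reply = f"(reply built from {len(context)} active claims)"
    # Scheduled, not awaited: the reply does not wait for the write path.
    task = asyncio.create_task(write_path(turn, fake_extract))
    return reply, task


async def main():
    reply, task = await handle_turn("Actually, we switched to Go last month.")
    await task  # awaited here only so the sketch finishes deterministically
    return reply


reply = asyncio.run(main())
active_now = [c["object"] for c in read_active("project")]
```

In a long-running server you would keep a reference to the task instead of awaiting it inline, so the user gets the reply while extraction and revision finish in the background.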
What breaks in production
Everything above is the blueprint. Here is what happens when it meets real traffic, in the order I’ve been bitten:
1. Extraction quality is the bottleneck. The entire pipeline depends on the extraction prompt correctly parsing a conversation turn into structured claims. In practice, extraction misses subtle corrections (“yeah, we actually changed that”), hallucinates claims that weren’t in the turn, or mis-assigns the subject. Every extraction error is a potential poisoned belief that will persist until something contradicts it. The fix is an extraction prompt that is narrow, heavily tested, and run against a regression suite of real conversation turns. Over-extract and you poison the store. Under-extract and you lose the update.
2. Predicate normalization is harder than it sounds. “Uses” vs. “runs on” vs. “is built with” vs. “stack includes” all mean the same thing, and if the extraction prompt isn’t consistent, you get parallel claims that never conflict-detect because their predicates don’t match. The controlled-vocabulary approach works, but it requires maintenance: every new domain brings new predicates, and the vocabulary needs to grow with the product.
3. Cascading revision can over-trigger. If your DEPENDS_ON edges are too aggressively drawn, contracting one claim can flag a cascade of dependents for re-evaluation, each of which may trigger further cascades. In the worst case, a single correction destabilizes half the belief graph. The fix is conservative dependency tracking: only mark an explicit dependency when the claim logically requires the target to be true, not merely when they’re related.
4. The poisoning loop is real. An adversary (or a confused user, or a hallucinating tool) injects a wrong fact. The agent uses it in reasoning. The user’s confused reply mentions it again. The extractor re-extracts it with fresh evidence. The false belief now has multiple supporting sources and high confidence. Zou and colleagues demonstrated a version of this at scale in PoisonedRAG, showing that injecting five poisoned texts per target question into a knowledge base of millions of texts achieved roughly a 90% attack success rate (Zou et al., 2024). Chen and colleagues showed in AgentPoison that optimized triggers in a long-term memory can backdoor agent behavior while maintaining benign performance at a poison rate below 0.1% (Chen et al., 2024). The defense is provenance: every claim must record its source, and the system must support a “trust hierarchy” where user-stated facts outrank inferences, and human-verified facts outrank everything. A claim with provenance “agent inference, confidence 0.4” should not survive a conflict with “user stated, confidence 0.95.”
5. Users need a way to see and correct what the agent believes. This is a product feature, not a debugging tool. If an agent says something wrong because of a bad belief, the user should be able to inspect the belief, trace its provenance, and delete or correct it. In much of the world, this is also a legal requirement (GDPR Article 16: right to rectification). The belief graph makes this possible because every claim is a discrete, addressable, deletable object with a recorded history. A flat vector store makes it nearly impossible.
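For the cascading failure in item 3, one guard is a hard depth limit on how far a contraction may flag dependents. The sketch below assumes a hypothetical `depends_on` mapping from claim id to the ids it requires, and the `edge_function` claim is invented to show a second-level dependent.

```python
from collections import deque


def flag_dependents(depends_on: dict, contracted: str, max_depth: int = 1) -> set:
    """Flag claims that depend on `contracted`, walking at most `max_depth` hops."""
    # Invert the mapping once: target -> claims that require it.
    dependents = {}
    for claim, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(claim)

    flagged = set()
    queue = deque([(contracted, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # stop expanding once the depth budget is spent
        for dep in dependents.get(node, ()):
            if dep != contracted and dep not in flagged:
                flagged.add(dep)
                queue.append((dep, depth + 1))
    return flagged


depends_on = {
    "deploy_vercel": {"uses_express"},
    "edge_function": {"deploy_vercel"},  # hypothetical second-level dependent
}

direct_only = flag_dependents(depends_on, "uses_express")
two_hops = flag_dependents(depends_on, "uses_express", max_depth=2)
```

With the default depth of one, only the Vercel claim is flagged; the second-level claim is left alone unless re-evaluating the Vercel claim ends up contracting it too.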
Will belief revision replace RAG?
No. They solve different problems. RAG answers “what past information is relevant to this question?” Belief revision answers “what is currently true?” You need both.
A retrieval system without belief revision serves stale facts alongside current ones and lets the model pick. A belief revision system without retrieval has no way to find relevant episodic context from past conversations. The architecture this guide describes layers belief revision above retrieval: the vector store handles the episodic layer (finding relevant past moments), and the belief graph handles the semantic layer (maintaining a consistent, temporal, revisable model of what is true).
The real question isn’t whether to choose one over the other. It’s whether to build the revision layer at all, or to keep hoping the retriever will figure it out. If your agent ever needs to handle corrections, changing facts, or user preferences that evolve over time, the retriever will not figure it out. This guide has tried to show why, and what to build instead.
The belief graph I describe here is the architecture behind Chapter 4 of Atlas, and the revision engine is shipping in the memory layer of Nexus v2 next month. If you build one of these systems and find a failure mode I haven’t cataloged, I want to hear about it.
References and further reading
Belief revision and formal logic:
- Alchourrón, Gärdenfors & Makinson (1985), On the Logic of Theory Change: Partial Meet Contraction and Revision Functions, Journal of Symbolic Logic, 50(2), 510–530. The AGM postulates.
- Doyle (1979), A Truth Maintenance System, Artificial Intelligence, 12(3), 231–272. Justification-based dependency tracking.
- de Kleer (1986), An Assumption-based TMS, Artificial Intelligence, 28(2), 127–162. Multiple simultaneous worldviews.
- Hansson (1991), Belief Base Dynamics, PhD thesis, Uppsala University. Belief bases versus closed theories.
- Quine (1951), Two Dogmas of Empiricism, The Philosophical Review, 60(1), 20–43. The web of belief.
Cognitive science and neuroscience:
- Botvinick, Braver, Barch, Carter & Cohen (2001), Conflict Monitoring and Cognitive Control, Psychological Review, 108(3), 624–652. The conflict monitoring model.
- Botvinick, Cohen & Carter (2004), Conflict Monitoring and Anterior Cingulate Cortex: An Update, Trends in Cognitive Sciences, 8(12), 539–546.
- Van Veen, Krug, Schooler & Carter (2009), Neural Activity Predicts Attitude Change in Cognitive Dissonance, Nature Neuroscience, 12(11), 1469–1474. ACC activation predicts belief change.
- Friston (2010), The Free-Energy Principle: A Unified Brain Theory?, Nature Reviews Neuroscience, 11(2), 127–138. Precision-weighted belief updating.
- Clark (2013), Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science, Behavioral and Brain Sciences, 36(3), 181–204. The brain as prediction machine.
- Festinger (1957), A Theory of Cognitive Dissonance, Row, Peterson. The cost of holding contradictory beliefs.
- Johnson & Seifert (1994), Sources of the Continued Influence Effect, Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(6), 1420–1436. Retracted facts persist in reasoning.
- Lewandowsky, Ecker, Seifert, Schwarz & Cook (2012), Misinformation and Its Correction, Psychological Science in the Public Interest, 13(3), 106–131. Comprehensive review of the continued influence effect.
- Ecker et al. (2022), The Psychological Drivers of Misinformation Belief and Its Resistance to Correction, Nature Reviews Psychology, 1, 13–29.
- Loftus & Palmer (1974), Reconstruction of Automobile Destruction, Journal of Verbal Learning and Verbal Behavior, 13, 585–589. Memory is reconstructive.
- Johnson, Hashtroudi & Lindsay (1993), Source Monitoring, Psychological Bulletin, 114(1), 3–28. Misattributing where a belief came from.
- Kunda (1990), The Case for Motivated Reasoning, Psychological Bulletin, 108(3), 480–498.
Agent memory systems and attacks:
- Rasmussen, Paliychuk, Beauvais, Ryan & Chalef (2025), Zep: A Temporal Knowledge Graph Architecture for Agent Memory. Bi-temporal knowledge graphs for agents.
- Park (2026), Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures. Kumiho: AGM semantics on a property graph.
- Cao (2025), Semantic Adapter for Universal Text Embeddings: Diagnosing and Mitigating Negation Blindness. Embedding models can’t distinguish “X” from “not X.”
- Zou, Geng, Wang & Jia (2024), PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation. Five poisoned texts per target question in a knowledge base of millions: roughly 90% attack success.
- Chen, Xiang, Xiao, Song & Li (2024), AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases. Backdooring agent memory with optimized triggers.
Diagrams are hand-drawn SVG and free to reuse with attribution.