Theory & Belief Revision

AGM belief revision engine, provenance DAG, failure analysis, and rollback.

Theory Manager: Wraps py-brs operations with curate-ipsum-specific logic.

The TheoryManager provides a high-level interface for managing synthesis theories, handling belief revision operations, and integrating evidence from mutation testing.

M3 additions: provenance tracking, rollback, failure analysis, typed assertions.

class curate_ipsum.theory.manager.TheoryManager(project_path, domain='code_mutation', world_label='green')[source]

Bases: object

Manages belief revision operations for code synthesis.

This class wraps py-brs CASStore with curate-ipsum-specific logic, providing a simplified interface for: - Adding assertions (typed beliefs about code) - Contracting assertions (removing beliefs via AGM contraction) - Revising assertions (incorporating new evidence) - Computing entrenchment (belief resilience scores)

Example

manager = TheoryManager(project_path) node = manager.add_assertion(

assertion_type=”behavior”, content=”function handles null input”, evidence_id=”test_123”, confidence=0.8

) score = manager.get_entrenchment(node[“id”])

Parameters:
  • project_path (Path)

  • domain (str)

  • world_label (str)

__init__(project_path, domain='code_mutation', world_label='green')[source]

Initialize the TheoryManager.

Parameters:
  • project_path (Path) – Path to project directory (will create .curate_ipsum subdir)

  • domain (str) – BRS domain identifier

  • world_label (str) – Initial world version label

property store: CASStore

Lazy-load the CASStore.

property domain: str

Current domain identifier.

property world_label: str

Current world version label.

property provenance_dag: ProvenanceDAG

Lazy-load the provenance DAG from CASStore.

get_rollback_manager()[source]

Get or create a RollbackManager.

Return type:

RollbackManager

add_assertion(assertion_type, content, evidence_id, confidence=0.5, region_id=None, metadata=None)[source]

Add a new assertion (belief) to the theory with evidence grounding.

Parameters:
  • assertion_type (str) – Type of assertion (“type”, “behavior”, “invariant”, “contract”)

  • content (str) – Human-readable description of the assertion

  • evidence_id (str) – ID of the evidence that grounds this assertion

  • confidence (float) – Confidence level (0.0 to 1.0)

  • region_id (str | None) – Optional code region this assertion relates to

  • metadata (dict[str, Any] | None) – Optional additional metadata

Returns:

The created node as a dict

Raises:

ValueError – If assertion_type is invalid or confidence out of range

Return type:

dict[str, Any]

contract_assertion(node_id, strategy='entrenchment', cascade=True)[source]

Remove an assertion using AGM contraction.

Parameters:
  • node_id (str) – ID of the node to contract (remove)

  • strategy (str) – Contraction strategy - “entrenchment”: Remove target and dependents with lower entrenchment - “minimal”: Remove only target and direct edges - “full_cascade”: Remove target and all descendants

  • cascade (bool) – Whether to cascade removal to unsupported dependents

Returns:

ContractionResult with details of what was removed

Raises:
Return type:

ContractionResult

revise_with_assertion(assertion_type, content, evidence_id, confidence=0.5, contraction_strategy='entrenchment')[source]

Revise theory by incorporating a new assertion.

Implements AGM revision via the Levi identity: K*φ = (K÷¬φ)+φ

If the new assertion contradicts existing beliefs, those are first contracted before adding the new assertion.

Parameters:
  • assertion_type (str) – Type of assertion

  • content (str) – Assertion content

  • evidence_id (str) – Grounding evidence ID

  • confidence (float) – Confidence level

  • contraction_strategy (str) – Strategy for contracting contradictions

Returns:

Tuple of (new_world_hash, contraction_result_if_any)

Return type:

tuple[str, ‘ContractionResult’ | None]

get_entrenchment(node_id)[source]

Get the entrenchment score for a node.

Entrenchment measures how resilient a belief is to removal. Higher scores (closer to 1.0) mean the belief is more entrenched.

Parameters:

node_id (str) – ID of the node to score

Returns:

Entrenchment score (0.0 to 1.0)

Return type:

float

list_assertions(assertion_type=None, region_id=None)[source]

List all assertions in the current world.

Parameters:
  • assertion_type (str | None) – Filter by assertion type (optional)

  • region_id (str | None) – Filter by region ID (optional)

Returns:

List of assertion nodes

Return type:

list[dict[str, Any]]

get_theory_snapshot()[source]

Get the current theory state as a snapshot.

Returns:

WorldBundle data for the current world

Return type:

dict[str, Any]

store_evidence(evidence)[source]

Store a BRS Evidence object.

Parameters:

evidence (Any) – Evidence object from adapters

Returns:

Evidence ID

Return type:

str

why_believe(assertion_id)[source]

Trace which evidence grounds an assertion.

Parameters:

assertion_id (str) – The assertion to trace

Returns:

List of evidence IDs that support this assertion

Return type:

list[str]

when_added(assertion_id)[source]

Find when an assertion was first added.

Parameters:

assertion_id (str) – The assertion to look up

Returns:

The expansion event, or None

Return type:

RevisionEvent | None

belief_stability(assertion_id)[source]

Measure how stable an assertion is.

Returns 1.0 for never-revised beliefs, lower for frequently revised.

Parameters:

assertion_id (str) – The assertion to measure

Returns:

Stability score (0.0 to 1.0)

Return type:

float

get_provenance_summary()[source]

Get a summary of the provenance DAG.

Returns:

Dict with event counts, world hashes, and recent events

Return type:

dict[str, Any]

analyze_failure(error_message='', test_pass_rate=None, mutation_score=None, failing_tests=None, region_id=None)[source]

Analyze why a synthesis attempt failed.

Combines error message classification, overfitting/underfitting detection, and assertion-based contraction suggestions.

Parameters:
  • error_message (str) – Error output from test execution

  • test_pass_rate (float | None) – Fraction of tests passing (0.0 to 1.0)

  • mutation_score (float | None) – Fraction of mutants killed (0.0 to 1.0)

  • failing_tests (list[str] | None) – Names of failing tests

  • region_id (str | None) – Region where the patch was applied

Returns:

FailureAnalysis with mode, confidence, and suggestions

Return type:

FailureAnalysis

find_contradictions(assertion_type, content, confidence=0.5, region_id=None)[source]

Find existing assertions that would contradict a new assertion.

Parameters:
  • assertion_type (str) – Type of the new assertion

  • content (str) – Content of the new assertion

  • confidence (float) – Confidence of the new assertion

  • region_id (str | None) – Region of the new assertion

Returns:

List of contradicting assertion node dicts

Return type:

list[dict[str, Any]]

Typed assertion model for the belief revision engine.

Assertions are typed beliefs about code properties. Each assertion is grounded by evidence (test results, mutation results, etc.) and tracked with confidence scores.

Assertion types:
  • TYPE: “variable x has type int”

  • BEHAVIOR: “function f returns positive values for positive inputs”

  • INVARIANT: “loop counter i is always < len(items)”

  • CONTRACT: “function f satisfies precondition P and postcondition Q”

  • PRECONDITION: “x > 0 holds before calling f”

  • POSTCONDITION: “result != None holds after calling f”

class curate_ipsum.theory.assertions.AssertionKind(*values)[source]

Bases: StrEnum

Classification of code assertions.

TYPE = 'type'
BEHAVIOR = 'behavior'
INVARIANT = 'invariant'
CONTRACT = 'contract'
PRECONDITION = 'precondition'
POSTCONDITION = 'postcondition'
class curate_ipsum.theory.assertions.Assertion(id, kind, content, confidence, region_id=None, grounding_evidence_ids=<factory>, created_utc=<factory>, metadata=<factory>)[source]

Bases: object

A typed belief about code properties.

Assertions are the atomic units of belief in the synthesis theory. Each must be grounded by at least one piece of evidence.

Parameters:
id: str
kind: AssertionKind
content: str
confidence: float
region_id: str | None = None
grounding_evidence_ids: list[str]
created_utc: str
metadata: dict[str, Any]
curate_ipsum.theory.assertions.assertion_to_node_dict(assertion)[source]

Serialize an Assertion to a CASStore-compatible node dict.

Parameters:

assertion (Assertion) – The assertion to serialize

Returns:

Dict suitable for storing as a Node in CASStore

Return type:

dict[str, Any]

curate_ipsum.theory.assertions.node_dict_to_assertion(node)[source]

Deserialize a CASStore node dict to an Assertion.

Parameters:

node (dict[str, Any]) – Dict from CASStore with Node kind

Returns:

Assertion instance

Raises:

ValueError – If node is not a valid Assertion

Return type:

Assertion

class curate_ipsum.theory.assertions.ContradictionDetector[source]

Bases: object

Detects logical conflicts between assertions.

Contradiction detection is essential for AGM revision — when a new assertion contradicts existing beliefs, the contradicted beliefs must be contracted before the new assertion can be added.

Contradiction rules: 1. Same-region TYPE conflict: two TYPE assertions for the same region

with different content (e.g., “x: int” vs “x: str”)

  1. Negated BEHAVIOR: a BEHAVIOR assertion that logically negates another for the same region (detected by keyword heuristics)

  2. CONTRACT violations: a PRECONDITION that contradicts a POSTCONDITION or vice versa within the same region

classmethod find_contradictions(new_assertion, existing_assertions)[source]

Find assertions that contradict the new one.

Parameters:
  • new_assertion (Assertion) – The assertion being added

  • existing_assertions (list[Assertion]) – All existing assertions in the theory

Returns:

List of existing assertions that conflict with the new one

Return type:

list[Assertion]

Provenance DAG: tracks causal chains of belief evolution.

The provenance DAG is an append-only log of revision events that records WHY each world state exists. While CASStore preserves WHAT each world looks like (via content-addressed snapshots), the provenance DAG captures the causal reasoning: which evidence triggered which revision, which assertions were added or removed, and why.

Design decisions: - Append-only: events are immutable once recorded (audit trail) - DAG structure derived from event ordering (no explicit parent pointers) - Serialized to CASStore as a single object for persistence

class curate_ipsum.theory.provenance.RevisionType(*values)[source]

Bases: StrEnum

Type of belief revision operation.

EXPAND = 'expand'
CONTRACT = 'contract'
REVISE = 'revise'
EVIDENCE = 'evidence'
ROLLBACK = 'rollback'
class curate_ipsum.theory.provenance.RevisionEvent(event_type, timestamp, assertion_id=None, evidence_id=None, from_world_hash=None, to_world_hash=None, strategy=None, reason=None, nodes_removed=<factory>, nodes_added=<factory>, metadata=<factory>)[source]

Bases: object

A single event in the theory’s revision history.

Records what happened, why, and the before/after world state hashes.

Parameters:
event_type: RevisionType
timestamp: str
assertion_id: str | None = None
evidence_id: str | None = None
from_world_hash: str | None = None
to_world_hash: str | None = None
strategy: str | None = None
reason: str | None = None
nodes_removed: list[str]
nodes_added: list[str]
metadata: dict[str, Any]
to_dict()[source]

Serialize to dict.

Return type:

dict[str, Any]

classmethod from_dict(data)[source]

Deserialize from dict.

Parameters:

data (dict[str, Any])

Return type:

RevisionEvent

class curate_ipsum.theory.provenance.ProvenanceDAG[source]

Bases: object

Directed acyclic graph of theory revision events.

Stores an append-only log of all revision operations and provides query methods for understanding belief evolution.

property events: list[RevisionEvent]

All events in chronological order.

add_event(event)[source]

Record a new revision event.

Parameters:

event (RevisionEvent) – The revision event to record

Return type:

None

get_history()[source]

Get all events in chronological order.

Return type:

list[RevisionEvent]

get_path(from_hash, to_hash)[source]

Get the chain of events between two world states.

Parameters:
  • from_hash (str) – Starting world hash

  • to_hash (str) – Ending world hash

Returns:

List of events connecting the two states (may be empty)

Return type:

list[RevisionEvent]

why_believe(assertion_id)[source]

Trace which evidence grounds an assertion.

Parameters:

assertion_id (str) – The assertion to trace

Returns:

List of evidence IDs that support this assertion

Return type:

list[str]

when_added(assertion_id)[source]

Find the first event that added this assertion.

Parameters:

assertion_id (str) – The assertion to look up

Returns:

The expansion event, or None if not found

Return type:

RevisionEvent | None

when_removed(assertion_id)[source]

Find the event that removed this assertion.

Parameters:

assertion_id (str) – The assertion to look up

Returns:

The contraction/revise event, or None if still present

Return type:

RevisionEvent | None

belief_stability(assertion_id)[source]

Measure how stable an assertion is.

Returns a score from 0.0 (constantly revised) to 1.0 (never touched). Formula: 1.0 - (revisions / (revisions + 1)) An assertion that’s been added once and never revised = 1.0. An assertion added, removed, re-added = 0.5.

Parameters:

assertion_id (str) – The assertion to measure

Returns:

Stability score (0.0 to 1.0)

Return type:

float

get_world_hashes()[source]

Get all world hashes in chronological order.

Return type:

list[str]

to_dict()[source]

Serialize the entire DAG to a dict.

Return type:

dict[str, Any]

classmethod from_dict(data)[source]

Deserialize from a dict.

Parameters:

data (dict[str, Any])

Return type:

ProvenanceDAG

class curate_ipsum.theory.provenance.ProvenanceStore[source]

Bases: object

Persistence layer for ProvenanceDAG using CASStore.

The provenance DAG is stored as a single CASStore object with kind ‘ProvenanceDAG’, keyed by domain_id.

PROVENANCE_KEY_PREFIX = 'provenance_'
static save(store, domain_id, dag)[source]

Save the provenance DAG to CASStore.

Parameters:
  • store (CASStore) – The CASStore instance

  • domain_id (str) – Domain identifier

  • dag (ProvenanceDAG) – The provenance DAG to save

Returns:

Content hash of the stored DAG

Return type:

str

static load(store, domain_id)[source]

Load the provenance DAG from CASStore.

Parameters:
  • store (CASStore) – The CASStore instance

  • domain_id (str) – Domain identifier

Returns:

The provenance DAG (empty if none found)

Return type:

ProvenanceDAG

Failure mode analyzer for code synthesis.

Classifies why a synthesis attempt failed and maps the failure to belief revision operations. Uses heuristic pattern matching — formal verification is M5’s concern.

Failure modes (from belief_revision_framework.md):
  • TYPE_MISMATCH: Generated code has type errors

  • PRECONDITION_VIOLATION: Input constraints not met

  • POSTCONDITION_VIOLATION: Output doesn’t satisfy specification

  • INVARIANT_VIOLATION: Loop or structural invariant broken

  • SEMANTIC_DRIFT: Code compiles but doesn’t do what was intended

  • OVERFITTING: Passes tests but kills few mutants (too specific)

  • UNDERFITTING: Fails basic tests (too general/wrong approach)

class curate_ipsum.theory.failure_analyzer.FailureMode(*values)[source]

Bases: StrEnum

Classification of why a synthesis attempt failed.

TYPE_MISMATCH = 'type_mismatch'
PRECONDITION_VIOLATION = 'precondition_violation'
POSTCONDITION_VIOLATION = 'postcondition_violation'
INVARIANT_VIOLATION = 'invariant_violation'
SEMANTIC_DRIFT = 'semantic_drift'
OVERFITTING = 'overfitting'
UNDERFITTING = 'underfitting'
UNKNOWN = 'unknown'
class curate_ipsum.theory.failure_analyzer.FailureAnalysis(mode, confidence, root_cause_assertion_id=None, evidence_summary='', suggested_contraction_ids=<factory>, metadata=<factory>)[source]

Bases: object

Result of analyzing a synthesis failure.

Parameters:
mode: FailureMode
confidence: float
root_cause_assertion_id: str | None = None
evidence_summary: str = ''
suggested_contraction_ids: list[str]
metadata: dict[str, Any]
to_dict()[source]

Serialize to dict for MCP tool output.

Return type:

dict[str, Any]

class curate_ipsum.theory.failure_analyzer.FailureModeAnalyzer[source]

Bases: object

Heuristic analyzer for synthesis failure classification.

Examines error messages, test results, and mutation results to determine why a generated patch failed and suggests which beliefs should be contracted to avoid future failures.

static classify_error(error_message)[source]

Classify a failure based on error message patterns.

Parameters:

error_message (str) – The error output from test execution

Returns:

The most likely FailureMode

Return type:

FailureMode

static detect_overfitting(test_pass_rate, mutation_score, overfitting_threshold=0.3)[source]

Detect if a patch is overfitting to tests.

Overfitting = high test pass rate but low mutation kill rate. The code passes existing tests but doesn’t actually implement the correct behavior (just gets lucky on the test cases).

Parameters:
  • test_pass_rate (float) – Fraction of tests passing (0.0 to 1.0)

  • mutation_score (float) – Fraction of mutants killed (0.0 to 1.0)

  • overfitting_threshold (float) – Gap threshold for overfitting detection

Returns:

True if overfitting detected

Return type:

bool

static detect_underfitting(test_pass_rate, underfitting_threshold=0.5)[source]

Detect if a patch is underfitting.

Underfitting = failing basic tests. The generated code doesn’t even satisfy fundamental requirements.

Parameters:
  • test_pass_rate (float) – Fraction of tests passing (0.0 to 1.0)

  • underfitting_threshold (float) – Pass rate below which underfitting is declared

Returns:

True if underfitting detected

Return type:

bool

classmethod analyze(error_message='', test_pass_rate=None, mutation_score=None, failing_tests=None, assertions=None, region_id=None)[source]

Full failure analysis combining all heuristics.

Parameters:
  • error_message (str) – Error output from test execution

  • test_pass_rate (float | None) – Fraction of tests passing (0.0 to 1.0)

  • mutation_score (float | None) – Fraction of mutants killed (0.0 to 1.0)

  • failing_tests (list[str] | None) – Names of failing tests

  • assertions (list[Assertion] | None) – Current assertions in the theory

  • region_id (str | None) – Region where the patch was applied

Returns:

FailureAnalysis with mode, confidence, and suggestions

Return type:

FailureAnalysis

Rollback mechanism for theory state recovery.

Provides an explicit API for reverting to prior world states, building on CASStore’s content-addressed world versioning and the provenance DAG’s event history.

Design: rollback changes the world_label pointer to reference a prior world_hash — it does NOT copy or mutate any worlds (content-addressable storage means all historical worlds are already preserved).

exception curate_ipsum.theory.rollback.RollbackError[source]

Bases: Exception

Raised when a rollback operation fails.

class curate_ipsum.theory.rollback.Checkpoint(name, world_hash, timestamp, reason='')[source]

Bases: object

Named bookmark for a world state.

Parameters:
name: str
world_hash: str
timestamp: str
reason: str = ''
class curate_ipsum.theory.rollback.RollbackManager(manager, dag)[source]

Bases: object

Manages rollback operations on the synthesis theory.

Provides methods to revert to prior world states, undo operations, and create/restore named checkpoints.

Parameters:
__init__(manager, dag)[source]

Initialize the RollbackManager.

Parameters:
Return type:

None

rollback_to(target_world_hash)[source]

Revert the theory to a prior world state.

This changes the world_label to point to the target world hash. Does NOT modify any existing world data (content-addressable).

Parameters:

target_world_hash (str) – Hash of the world to revert to

Raises:

RollbackError – If the target world doesn’t exist

Return type:

None

undo_last(n=1)[source]

Undo the last N revision operations.

Walks backward through the provenance DAG to find the world state before the last N events, then rolls back to it.

Parameters:

n (int) – Number of operations to undo (default: 1)

Returns:

List of undone events

Raises:

RollbackError – If there aren’t enough events to undo

Return type:

list[RevisionEvent]

create_checkpoint(name, reason='')[source]

Create a named checkpoint of the current world state.

Parameters:
  • name (str) – Human-readable checkpoint name

  • reason (str) – Why this checkpoint was created

Returns:

The created Checkpoint

Return type:

Checkpoint

list_checkpoints()[source]

Get all named checkpoints.

Return type:

list[Checkpoint]

restore_checkpoint(name)[source]

Restore a named checkpoint.

Parameters:

name (str) – The checkpoint name to restore

Raises:

RollbackError – If checkpoint not found

Return type:

None

list_world_history()[source]

List all historical world states with timestamps.

Returns:

List of (world_hash, timestamp, reason) tuples

Return type:

list[tuple[str, str, str]]