
CortexGraph: Teaching AI Assistants to Forget Like Humans Do

Hook

Most AI memory systems remember everything forever or forget arbitrarily—but human memory doesn’t work that way. CortexGraph implements Ebbinghaus forgetting curves so your AI assistant’s memories naturally decay unless reinforced, just like yours do.

Context

The AI assistant memory problem is surprisingly unsolved. Current approaches fall into two camps: stateless models that forget everything between sessions, or naive append-only systems that treat every interaction as equally important forever. Neither reflects how human memory actually works. We don’t remember every conversation with perfect clarity—we forget the trivial, reinforce the important, and consolidate repeated information into lasting knowledge.

CortexGraph addresses this by implementing a biologically-inspired memory architecture based on cognitive science research. It applies the Ebbinghaus forgetting curve—the exponential decay function that describes how human memory fades over time—to AI assistant interactions. Memories start in short-term storage with decay scores calculated from recency, frequency, and importance. Access a memory repeatedly and it strengthens; ignore it and it fades. Sufficiently reinforced memories get promoted to long-term storage as Obsidian-compatible Markdown files. Everything stays local in human-readable formats, and the system exposes an MCP (Model Context Protocol) server interface for seamless integration with Claude and other assistants.

Technical Insight

The core innovation is the temporal decay algorithm that determines which memories persist. CortexGraph combines three factors into a composite score: recency decay (how long since last access), frequency weighting (how often accessed), and importance multipliers (explicit strength ratings). The recency component supports multiple forgetting curve models—power-law decay models gradual forgetting (t^-beta), exponential decay models rapid forgetting (e^(-t/half_life)), and two-component models that combine both for nuanced behavior.

Here’s how the decay calculation works in practice:

import time

def calculate_decay_score(memory, current_time, config):
    # Time since last access, converted from seconds to days
    elapsed = (current_time - memory['last_accessed']) / 86400

    # Recency: exponential decay with configurable half-life
    if config['decay_model'] == 'exponential':
        recency_score = 0.5 ** (elapsed / config['half_life_days'])
    else:  # power-law decay
        recency_score = (1 + elapsed) ** -config['beta']

    # Frequency: sublinear scaling to avoid over-weighting
    frequency_score = min(1.0, memory['access_count'] / config['saturation_count'])

    # Combine with importance multiplier
    composite_score = (
        config['recency_weight'] * recency_score +
        config['frequency_weight'] * frequency_score
    ) * memory['importance']

    return composite_score

def maybe_consolidate(memory, config):
    # Consolidation trigger: 5+ accesses within 14 days OR score >= 0.65
    now = time.time()
    age_days = (now - memory['created_at']) / 86400
    if memory['access_count'] >= 5 and age_days <= 14:
        promote_to_long_term(memory)
    elif calculate_decay_score(memory, now, config) >= 0.65:
        promote_to_long_term(memory)

The two-tier storage architecture keeps everything transparent. Short-term memories live in JSONL files—one JSON object per line, easy to grep, diff, and version control. Each entry contains the memory content, timestamps, access count, importance score, and calculated decay value. No black-box embeddings or proprietary formats. When memories hit consolidation thresholds, they’re transformed into Markdown files with YAML frontmatter and Obsidian-style wikilinks for knowledge graph relationships.
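
To make the JSONL tier concrete, here is a minimal round-trip sketch. The field names are illustrative stand-ins for the kind of entry described above, not CortexGraph's exact schema:

```python
import json
import time

# A hypothetical short-term memory entry; field names are illustrative,
# not CortexGraph's actual schema.
entry = {
    "content": "User prefers TypeScript for new projects",
    "created_at": time.time(),
    "last_accessed": time.time(),
    "access_count": 3,
    "importance": 1.2,
    "decay_score": 0.71,
}

# JSONL: one JSON object per line, appended to a plain-text file
line = json.dumps(entry)

# Reading back is a line-by-line parse -- easy to grep, diff, and version control
parsed = json.loads(line)
```

Because each line is an independent JSON object, tools like `grep` and `git diff` work on the store with no special parser.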

The MCP server integration is where this becomes practical for daily use. Instead of manually managing memory operations, you interact naturally with Claude through the Model Context Protocol. The system exposes tools like store_memory, search_memories, and get_active_memories that Claude can invoke conversationally. Ask “remember that I prefer TypeScript” and it stores with appropriate importance. Later reference “my language preference” and the system retrieves it, reinforcing the memory through access. The decay mechanics happen invisibly—memories you actually use stick around, while one-off mentions fade naturally.
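
Wiring this into Claude Desktop means registering the server in the standard `mcpServers` config. The shape below follows the usual MCP client format, but the command and entry-point name are assumptions for illustration; check the repository's README for the actual invocation:

```json
{
  "mcpServers": {
    "cortexgraph": {
      "command": "uvx",
      "args": ["cortexgraph"]
    }
  }
}
```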

The modular architecture separates concerns cleanly: core algorithms handle decay calculations, storage backends abstract JSONL vs SQLite implementations, agents orchestrate consolidation pipelines, and MCP tools expose the interface. This enables experimentation—swap in different forgetting curve models, adjust consolidation thresholds, or plug in alternative storage without rewriting the system. The codebase inherits 791 tests with 98%+ coverage from its mnemex predecessor, providing confidence despite the fresh 0.1.0 version number.
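
One way to picture that pluggability: if each forgetting-curve model is just a function of elapsed days, swapping models becomes a config change. The names and parameters below are illustrative, not CortexGraph's actual API:

```python
from typing import Callable

# Sketch of pluggable decay models: each curve is a function of elapsed days.
# Names and default values are illustrative, not CortexGraph's actual API.
def exponential(half_life_days: float) -> Callable[[float], float]:
    # Rapid forgetting: halves every half_life_days
    return lambda t: 0.5 ** (t / half_life_days)

def power_law(beta: float) -> Callable[[float], float]:
    # Gradual forgetting with a long tail
    return lambda t: (1 + t) ** -beta

def two_component(half_life_days: float, beta: float, mix: float = 0.5) -> Callable[[float], float]:
    # Blend rapid exponential forgetting with a slow power-law tail
    fast, slow = exponential(half_life_days), power_law(beta)
    return lambda t: mix * fast(t) + (1 - mix) * slow(t)

# Selecting a model is then a dictionary lookup driven by config
models = {"exponential": exponential(3.0), "power_law": power_law(0.5)}
recency = models["power_law"](7.0)  # recency score after 7 days
```

The storage backends can be swapped the same way: anything that reads and writes memory entries behind a common interface will do.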

Knowledge graph features emerge from the consolidation process. As memories get promoted to long-term Markdown, the system can extract entities and relationships, creating bidirectional wikilinks. A memory about “learning Rust for systems programming” might link to entities for [[Rust]], [[Systems Programming]], and [[Learning Resources]]. Over time, your AI assistant builds a personal knowledge graph that mirrors how you actually think about topics, with connection strength reflecting genuine usage patterns rather than arbitrary tagging.
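
Extracting those graph edges from a consolidated note can be as simple as scanning for `[[wikilink]]` syntax. The note format below mirrors the article's example; the regex and helper are a sketch, not CortexGraph's implementation:

```python
import re

# A consolidated long-term note: YAML frontmatter plus Obsidian-style wikilinks.
# The layout is illustrative, modeled on the article's example.
note = """---
title: Learning Rust for systems programming
promoted: 2025-01-15
---
Studying [[Rust]] ownership semantics as part of [[Systems Programming]];
collected pointers under [[Learning Resources]].
"""

def extract_entities(markdown: str) -> list[str]:
    # Capture the text inside each [[...]] wikilink
    return re.findall(r"\[\[([^\]]+)\]\]", markdown)

entities = extract_entities(note)
# Each extracted entity becomes a bidirectional edge in the knowledge graph
```

Counting how often an entity pair co-occurs across notes is one plausible way to derive the usage-based connection strength described above.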

Gotcha

CortexGraph is explicitly labeled as a Proof of Concept for research purposes—the maintainers are refreshingly honest that this is not production-ready software. Breaking changes can happen without warning, there’s no commercial support, and stability guarantees don’t exist. The repository’s purpose is to validate theoretical frameworks (the STOPPER Protocol and CortexGraph cognitive architecture) rather than serve as a turnkey product. If you need something you can deploy and forget about, this will frustrate you.

The small community footprint (25 stars) and recent rebranding from mnemex add confusion. The project reset to version 0.1.0 despite inheriting a mature codebase, and GitHub misclassifies the language as HTML when it's actually Python 3.10+. Migration documentation from the frozen mnemex PyPI package is sparse. You'll need comfort with self-hosting, manual MCP server configuration, and reading source code when documentation gaps appear. The research-artifact positioning means you're expected to study, adapt, and experiment, not install and run. For privacy-conscious developers willing to tinker with cognitive memory parameters in their personal Claude workflows, that's a feature. For teams needing reliable infrastructure, it's a dealbreaker.

Verdict

Use if: You're researching cognitive architectures for AI and want a working implementation of forgetting curves to study and extend; you value data sovereignty and need fully local, transparent, human-readable memory storage; you're comfortable self-hosting and configuring MCP servers for personal Claude workflows; or you want to experiment with biologically-inspired memory mechanics and have tolerance for research-grade code.

Skip if: You need production stability, commercial support, or turnkey deployment; you expect active community engagement or rapid bug fixes; you want a maintained product rather than a research artifact to adapt; or you lack the technical comfort to debug MCP integrations and tune decay parameters yourself.

This is an intellectually fascinating exploration of how AI memory could work, not a polished product ready for critical applications.