Your CLAUDE.md should grow, not shrink

Stop writing contexts that collapse. Start building contexts that evolve.

Oct 12, 2025

I watched my Claude Code context collapse at step 47.

18,282 tokens of carefully accumulated knowledge—agent strategies, API patterns, debugging insights—compressed into 122 tokens. Accuracy dropped from 66.7% to 57.1%. Worse than if I’d never adapted the context at all.

This wasn’t a bug. This was what happens when you ask an LLM to rewrite its own memory.

Stanford researchers published a paper last week that explains exactly why this happens—and more importantly, how to fix it. They call it “context collapse.” And if you’re using Claude Code with CLAUDE.md files, you’re probably experiencing it right now without realizing it.

Here’s what’s fascinating: The solution isn’t shorter prompts. It’s longer, more comprehensive contexts that grow and evolve without degrading.

The paper is 23 pages of academic prose. I spent 6 hours breaking it down into techniques you can use Monday.

The Problem Everyone’s Getting Wrong

You’ve probably heard the standard prompt engineering advice: “Keep it concise.” “Be specific but brief.” “LLMs work better with short, clear instructions.”

All wrong. Or at least, wrong for the way we’re actually using Claude Code.

Here’s why: When you’re building agents—whether it’s Claude Code working on a complex codebase or an automated workflow—you need two things:

Accumulated knowledge from past successes and failures
Detailed strategies for handling edge cases

The standard approach? Compress everything into a tidy system prompt. Or worse, let Claude periodically “summarize” its CLAUDE.md file to keep it manageable.

That’s where context collapse happens.

Stanford’s ACE (Agentic Context Engineering) framework takes a different approach. Instead of compressing knowledge, the researchers built systems that accumulate it. Structured. Itemized. With metadata tracking what works and what doesn’t.

The results?

+10.6% average accuracy on agent benchmarks
86.9% lower adaptation latency compared to existing methods
A smaller open-source model (DeepSeek-V3.1) matching GPT-4.1 systems on the AppWorld leaderboard

Want to know the specific technique that made this possible? It’s called “incremental delta updates” and you can implement it in your Claude Code workflow today.

The One Technique That Changes Everything

Before I show you the full system, let me give you the single most valuable insight from the ACE paper:

Stop rewriting your CLAUDE.md file. Start appending structured entries.

Here’s what that looks like in practice:

Bad (Context Collapse Prone):

# CLAUDE.md

Use TypeScript with strict mode. Follow REST API conventions. 
Write tests for all features.

Over time, Claude “optimizes” this into:

# CLAUDE.md

Write quality code.

Congratulations. You just lost all your domain knowledge.

Good (ACE-Inspired):

# CLAUDE.md

## [Strategy: Auth-001] Phone Contacts Are Source of Truth
When identifying relationships (roommates, contacts, etc.), ALWAYS use Phone app contacts. 
Never parse transaction descriptions or use keyword matching—these are unreliable.
✓ Helpful: 12 | ✗ Harmful: 0

## [Strategy: API-002] Pagination Requires While True Loop
Many APIs return “pages” of results. Use a `while True` loop with a proper break condition, NOT `for i in range(10)`.
Continue until the API returns empty results or null.
✓ Helpful: 8 | ✗ Harmful: 1

## [Pattern: Error-003] Authentication Timeouts
If auth fails, check: (1) phone number not email, (2) clean credentials from supervisor, 
(3) verify API docs for correct parameters. Do not proceed with workarounds.
✓ Helpful: 5 | ✗ Harmful: 2

See the difference?

Each entry is:

Itemized with a unique ID
Specific about when and how to apply it
Tracked with feedback counters
Persistent across sessions
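
If you want to manage entries like these from a script instead of by hand, here is a minimal Python sketch of those four properties. It is my own illustration, not code from the ACE paper: the `PlaybookEntry` and `Playbook` names, their fields, and the rendering layout are all assumptions modeled on the example entries above.

```python
from dataclasses import dataclass


@dataclass
class PlaybookEntry:
    """One itemized CLAUDE.md entry (hypothetical structure, not from the paper)."""
    kind: str        # e.g. "Strategy" or "Pattern"
    entry_id: str    # unique ID, e.g. "API-002"
    title: str
    body: str
    helpful: int = 0
    harmful: int = 0

    def render(self) -> str:
        """Format the entry in the same layout as the example entries above."""
        return (
            f"## [{self.kind}: {self.entry_id}] {self.title}\n"
            f"{self.body}\n"
            f"✓ Helpful: {self.helpful} | ✗ Harmful: {self.harmful}\n"
        )


class Playbook:
    """Append-only collection of entries, keyed by unique ID."""

    def __init__(self) -> None:
        self.entries: dict[str, PlaybookEntry] = {}

    def append(self, entry: PlaybookEntry) -> None:
        # An update adds a new item; it never rewrites existing ones.
        if entry.entry_id in self.entries:
            raise ValueError(f"duplicate entry ID: {entry.entry_id}")
        self.entries[entry.entry_id] = entry

    def record_feedback(self, entry_id: str, helpful: bool) -> None:
        # Only the counters change; the entry text stays untouched.
        entry = self.entries[entry_id]
        if helpful:
            entry.helpful += 1
        else:
            entry.harmful += 1

    def render(self) -> str:
        """Render the whole playbook as CLAUDE.md text."""
        return "# CLAUDE.md\n\n" + "\n".join(
            entry.render() for entry in self.entries.values()
        )


if __name__ == "__main__":
    playbook = Playbook()
    playbook.append(PlaybookEntry(
        kind="Strategy",
        entry_id="API-002",
        title="Pagination Requires While True Loop",
        body="Use a `while True` loop with a proper break condition.",
    ))
    playbook.record_feedback("API-002", helpful=True)
    print(playbook.render())
```

The design choice that matters is that `append` refuses to overwrite an existing ID and `record_feedback` only touches the counters. Nothing in this sketch can shrink the text of an entry, which is the property the "Bad" example lacks. Persistence across sessions is left out; writing the output of `render()` back to your CLAUDE.md file would be one way to handle it.
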

This is the core insight: Your CLAUDE.md isn’t documentation. It’s a living playbook that gets better with every task.

The techniques below are for paid subscribers only. You’ll learn the complete ACE workflow for Claude Code, including the Generator-Reflector-Curator pattern, how to prevent context collapse during long sessions, and the exact prompts for building self-improving agents.

Continue reading this post for free, courtesy of Tyler Folkman.
