The no-dev explanation
Imagine Claude Code has to reread a giant binder every time you ask a question: system instructions, tool list, project rules, files it already read, chat history, and your new request. Prompt caching is like saying, "The first part of this binder is identical to last time, so reuse it."
One sentence rule
Cache loves boring consistency. Set up the session once, then keep working in the same lane until a natural break.
Start stableWork in one task laneCompact at breaksUse subagents for side questsClear between unrelated tasksThe request stack Claude Code sends
Claude Code sends context every turn. The cache works by exact prefix matching, so earlier layers are more sensitive than later layers.
Plain English: changing "rules of the game" is expensive. Adding another normal message is expected.
Before you start Claude Code
| Do this | Why it helps cache/context |
|---|---|
| Open the correct repo folder first. | Starting in the right place avoids rebuilding around a different project context. |
| Pick the model once. | Each model has its own cache. Switching models forces a fresh read. |
| Connect MCP servers before serious work. | Tool definitions affect the system prompt. Changing tool availability mid-session can invalidate cache. |
Check CLAUDE.md and auto memory. | Project rules and memories load at session start. Stable, concise memory reduces repeated explanations. |
| Choose output style before the session matters. | Output style is part of the system prompt and loads at session start. |
While working
| Habit | Use it when |
|---|---|
/usage or a status line | You want to know whether cache reads are high and creation tokens are low. |
/recap | You want a session summary without replacing history. |
/compact Focus on... | You reached a natural break and want a smaller conversation going forward. |
/clear | You are switching to an unrelated task and stale context would confuse Claude. |
| Subagents | You need research, logs, or file digging that would pollute your main conversation. |
The simple decision tree
/compact at a clean checkpoint and tell Claude what to preserve./rename, then /clear, then start clean.Copy-paste prompt library
Use these directly in Claude Code. They are written for someone who does not want to think about cache mechanics every time.
Stable session start prompt
Add durable project rules to CLAUDE.md
Safe compaction prompt
Cache performance troubleshooting
Subagent prompt for a side quest
MCP stability checklist prompt
Cache warmth simulator
Select actions you plan to take during a Claude Code session. The app estimates whether your next turn stays warm or likely rebuilds cache.
Result
Your session is stable. Keep going in the same lane.
Constraints and edge cases
| Situation | What actually happens | Beginner takeaway |
|---|---|---|
Editing CLAUDE.md mid-session | Does not usually invalidate cache, but the new content also does not apply until /clear, /compact, or restart. | Edit it before the session matters, or restart after major changes. |
| Changing output style | It is part of the system prompt and is read at session start; a change applies after /clear or a new session. | Pick your style up front. |
| Switching models | Each model has its own cache. | Do not bounce between models unless the cold turn is worth it. |
| MCP server changes | Tool definitions affect the system prompt layer. | Connect needed servers first; avoid unstable tool lists mid-task. |
| File edits | Editing a file does not rewrite old conversation history. Claude rereads changed files when needed. | Normal coding is fine. The issue is changing session structure, not editing code. |
/compact | Replaces history with a summary, so the old conversation prefix changes. | Compact at natural breaks and tell Claude what to preserve. |
/recap | Generates a displayed summary without replacing the conversation history. | Use it when you just need a status summary. |
| Subagents | They run in their own context/cache and return summaries to the parent session. | Great for research that would clutter the main thread. |
| Subscription vs API key | Claude subscription sessions can use one-hour TTL automatically; API/third-party defaults are usually five minutes unless configured. | On API usage, long breaks can cool the cache faster. |
| Privacy/local transcripts | Claude Code can store local session transcripts in plaintext under ~/.claude/projects/ by default for resumption. | Be careful with sensitive projects and retention settings. |
Your project cache notes
This scratchpad saves in this browser. If hosted in an environment with window.vibes.save/load, it uses that; otherwise it falls back to localStorage.
Personal checklist
References and where this app's guidance came from
This app paraphrases these official Anthropic / Claude sources and turns them into beginner workflows. It does not call the Claude API or any external service.
Used for cache prefixes, cache hit vs creation behavior, TTL behavior, and repetitive long-prompt guidance.
platform.claude.com/docs/en/build-with-claude/prompt-caching
Used for Claude Code-specific automatic caching, exact prefix matching, invalidators, cache scope, TTL by authentication method, and subagent cache behavior.
code.claude.com/docs/en/prompt-caching
Used for CLAUDE.md, auto memory, /memory, and memory loading behavior.
code.claude.com/docs/en/memory
Used for context-size cost guidance, /usage, /clear, /rename, /resume, and custom compaction instructions.
code.claude.com/docs/en/costs
Used for output-style behavior and system-prompt impact.
code.claude.com/docs/en/output-styles
Used for beginner prompt patterns: overview, finding relevant code, planning before editing, resuming, worktrees, and delegating research.
code.claude.com/docs/en/common-workflows
Used for the recommendation to isolate broad research/log digging into subagents to preserve main context.
code.claude.com/docs/en/sub-agents
Used for the note that local Claude Code transcripts can be stored in plaintext under
~/.claude/projects/ by default for session resumption.code.claude.com/docs/en/data-usage