If you’ve spent any real time with Claude Code or Cursor, you know the feeling. The thing you told the agent five minutes ago is now optional as far as it’s concerned. The fix isn’t a smarter model. It’s architecture.
This week David Zendzian and I dig into memory for AI agents - what it actually means, why one giant context window isn’t it, and what a real structure for long-running agent work looks like. The conversation lands on the old mnemonic trick of the “mind palace”: different rooms, drawers in each room, a hallway connecting them, a central brain holding the map. It turns out to be a decent architecture pattern for giving an agent persistent context without blowing through the token budget.
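The rooms/drawers/central-map structure can be sketched in a few lines. This is an illustrative toy, not code from the episode or any shipping framework; the names (`Room`, `MindPalace`, `recall`) are made up for the sketch. The point it demonstrates is the token-budget win: the agent loads one drawer from one room per query instead of stuffing every memory into the context window.

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    """A topic-scoped memory namespace; drawers hold keyed notes."""
    name: str
    drawers: dict[str, list[str]] = field(default_factory=dict)

    def file(self, drawer: str, note: str) -> None:
        self.drawers.setdefault(drawer, []).append(note)

class MindPalace:
    """The central brain holding the map: routes a query to one room
    rather than dumping everything into the context window."""
    def __init__(self) -> None:
        self.rooms: dict[str, Room] = {}

    def room(self, name: str) -> Room:
        return self.rooms.setdefault(name, Room(name))

    def recall(self, room_name: str, drawer: str) -> list[str]:
        # Only the requested drawer is loaded, keeping the token cost
        # proportional to the question, not to total memory size.
        r = self.rooms.get(room_name)
        return r.drawers.get(drawer, []) if r else []

palace = MindPalace()
palace.room("project").file("constraints", "use Postgres, not MySQL")
palace.room("security").file("policies", "never log credentials")
print(palace.recall("project", "constraints"))
```

In practice the "map" step would be an index or a router prompt rather than a dict lookup, but the shape is the same: persistent storage outside the context window, with selective retrieval into it.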
Key topics:
- Memory architectures for long-running agents - rooms, drawers, and central orchestration
- The “mind palace” as a practical pattern, not just a party trick
- Why one context window can’t stay consistent - and why the fix is multi-agent orchestration, not bigger models
- How multi-agent orchestration mirrors real org charts - PM agent, engineer agents, security agent, each with their own focus
- The self-driving car fallacy and how to actually reason about AI reliability baselines
- Jevons Paradox, agentic AI edition - efficiency creates more work, not less
- What all this means for platform teams getting ready to run agents in production
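The org-chart framing above can be made concrete with a toy orchestrator. Everything here is a hypothetical sketch for illustration (`Agent`, `Orchestrator`, the three role functions are invented names): a coordinator hands the same task to specialized agents in sequence, and each agent works from its own narrow focus rather than one shared mega-context.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    """One specialist on the org chart; handle() stands in for a model call."""
    role: str
    handle: Callable[[str], str]

def pm(task: str) -> str:        return f"plan({task})"
def engineer(task: str) -> str:  return f"build({task})"
def security(task: str) -> str:  return f"review({task})"

class Orchestrator:
    """Routes work down the org chart instead of asking one agent
    with one giant context window to do everything."""
    def __init__(self, agents: dict[str, Agent]) -> None:
        self.agents = agents

    def run(self, task: str) -> list[str]:
        # Each agent sees only the task, not the others' full context.
        order = ["pm", "engineer", "security"]
        return [self.agents[r].handle(task) for r in order]

team = Orchestrator({
    "pm": Agent("pm", pm),
    "engineer": Agent("engineer", engineer),
    "security": Agent("security", security),
})
print(team.run("add login"))
```

A real system would pass each agent's output forward as input to the next and give each one its own memory (its own "room"), but the division of labor is the part that keeps any single context window consistent.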
Interested in a platform that can help you with enterprise-y excellence? Just TryTanzu.ai.
Tanzu Catsup is a weekly conversation about platform engineering, cloud-native operations, and building software in large organizations…and, of course, AI.
Check us out Fridays at 10am US Eastern/4pm Amsterdam time! Full playlist on YouTube.
Hosts: @thecote and David Zendzian