NirDiamant/Agent_Memory_Techniques

🧠 Agent Memory Techniques

Agent Memory Techniques for LLMs: 30 runnable Jupyter notebooks covering every major memory pattern

Learn every agent memory technique for LLM agents.

⭐ If you find this useful, please star the repo so more learners can discover it.

🧭 New here? Start with 01 Conversation Buffer Memory or pick a Learning Path. Prefer a visual? See the Decision Tree below. 30 runnable Jupyter notebooks covering conversation buffers, vector stores, knowledge graphs, episodic and semantic memory, working memory, MemGPT, Mem0, Letta, Zep, Graphiti, LoCoMo benchmarks, and production memory patterns.

License: Apache 2.0 Python 3.10+ Jupyter GitHub Stars Issues Contributions Welcome



From the same author

#1 Best Seller in Generative AI on Amazon

Want to go deeper on RAG (Retrieval-Augmented Generation, the technique of giving a model extra documents so it can answer better)? The book is the long version. You'll get the intuition behind every technique. You'll get side-by-side comparisons that show when each one wins and when it quietly fails. You'll get illustrations that make the tricky parts click.

⏳ Launch window only: $0.99

The price goes up once the launch window closes. Readers who grab it now lock in the lowest price it will ever have.


DiamantAI Collective - AI engineering jobs

💼 Apply for open AI engineering jobs

AI-first companies are hiring through the DiamantAI Collective.

See open jobs and apply


📫 Stay Updated

🚀 Weekly Updates · 💡 Expert Insights · 🎯 Top 0.1% Content

Subscribe to DiamantAI Newsletter

Join over 50,000 readers getting clear AI tutorials every week. Subscribers also get early access and a 33% discount on my book.




💡 Why Agent Memory Matters

💡 Quick Answer (for search engines and skimmers)

Agent memory is the set of techniques that let an LLM-based agent (a system built around a Large Language Model) remember information across turns, sessions, and tasks. Without memory, an agent re-derives context every time and cannot personalize, learn, or maintain coherence over long interactions. This repository documents 30 distinct memory techniques, grouped into six families: short-term context management, long-term storage, cognitive architectures, retrieval and multi-agent patterns, batteries-included frameworks, and production deployment patterns.

Think about a friend who forgets every conversation you've ever had. Every morning you're strangers again. That's what most AI agents are like today.

Every AI agent eventually hits the same wall: it forgets.

In 2026, AI agents are everywhere. But most of them still forget what you told them yesterday. Without strong memory, an agent can't keep context across conversations. It can't learn from past chats. It can't build a lasting relationship with you.

The landscape is shifting fast:

  • Anthropic's 7 Layers of Memory (March 2026): from conversation context to cross-project knowledge, defining the memory hierarchy for Claude Code
  • Mem0: managed memory layer gaining rapid adoption for personalized AI
  • Letta (MemGPT): self-editing memory with inner/outer monologue architecture
  • Zep: temporal knowledge graphs for long-term agent memory
  • Graphiti: episodic-to-semantic knowledge graph extraction
  • MemOS & Memori: memory-as-infrastructure platforms for production agents

But there's no single hands-on guide that teaches you how each technique works, when to use it, and how to build it yourself.

That's why this repository exists. 30 techniques. Runnable notebooks. Real code you can use today.


🗺️ Taxonomy of Agent Memory Techniques

Agent memory taxonomy: 30 techniques across 6 families (short-term, long-term, cognitive architectures, retrieval, frameworks, production)

The 30 techniques fall into six families. Each family solves a different memory problem. Each technique lives in its own notebook.

| Family | What it solves | Techniques |
| --- | --- | --- |
| Short-term | Keep recent turns in memory without filling up the context window. | 01 - 05 |
| Long-term | Save knowledge across sessions, users, and time. | 06 - 11 |
| Cognitive architectures | Working, hierarchical, and reflective memory systems. | 12 - 19 |
| Retrieval & multi-agent | Choose what to recall and when, and share memory across agents. | 20 - 23 |
| Frameworks | Production-ready memory libraries (Mem0, Letta, Zep, Graphiti). | 24 - 27 |
| Evaluation & production | Measure, benchmark, and deploy memory. | 28 - 30 |

🧭 Which Technique Do I Need?

30 techniques grouped by what you are building. Pick the group that matches your goal, then open the technique inside it.

Decision tree: which agent memory technique do I need?

Quick text version:

  • Need to manage the current chat? Start with 01-05 (short-term memory).
  • Need to persist across sessions? Start with 06 Vector Store or 21 Cross-Session Memory.
  • Building a cognitive architecture with multiple stores? See 12-19.
  • Using a framework? Go straight to 24 Graphiti, 25 Mem0, 26 Letta, or 27 Zep.
  • Evaluating or shipping to production? See 28-30.

Still not sure? Start with 01 Conversation Buffer. Almost every other technique builds on it.
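
To make the starting point concrete, here is a minimal sketch of a conversation buffer. It is not the code from notebook 01. The `ConversationBuffer` class and its method names are hypothetical, chosen only to show the idea of saving every turn word for word and replaying it as the prompt.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationBuffer:
    """Hypothetical illustration: stores every turn verbatim, in order."""

    messages: list[dict[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        # Append the turn unchanged; nothing is trimmed or summarized.
        self.messages.append({"role": role, "content": content})

    def as_prompt(self) -> str:
        # Replay the whole history as plain text for the next model call.
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)


buffer = ConversationBuffer()
buffer.add("user", "My name is Dana.")
buffer.add("assistant", "Nice to meet you, Dana.")
buffer.add("user", "What is my name?")
prompt = buffer.as_prompt()
print(prompt)
```

The buffer grows with every turn, which is the limit the other short-term techniques are designed to manage.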


Compare Techniques at a Glance

Looking to filter by constraint (persistence, retrieval style, token cost, best-for use case)? See the side-by-side comparison matrix covering all 30 techniques in one table.


📚 All 30 Techniques

Short-term memory techniques for LLM agents: conversation buffers, sliding window, summary, token budget

🔄 Short-Term Memory (Techniques 1-5)

Manage the conversation inside a single chat.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 01 | Conversation Buffer Memory | Save the full conversation, word for word. The simplest pattern, and the base for everything else. | ✅ Notebook · Colab |
| 02 | Sliding Window Memory | Keep only the last few messages. You limit the size, but you keep the recent parts. | ✅ Notebook · Colab |
| 03 | Summary Memory | Replace old turns with a short summary written by the model. You lose length but keep the meaning. | ✅ Notebook · Colab |
| 04 | Summary Buffer Memory | Summarize older turns, but keep recent messages word for word. You get both. | ✅ Notebook · Colab |
| 05 | Token Buffer Memory | Trim the history to fit a strict token budget. Drop the oldest messages first. | ✅ Notebook · Colab |
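
As a rough illustration of techniques 02 and 05, the sketch below trims a message list two ways. It is not taken from the notebooks. The function names are hypothetical, and `count_tokens` is an assumption that counts whitespace-separated words instead of calling a real tokenizer.

```python
def sliding_window(messages: list[str], k: int) -> list[str]:
    """Keep only the last k messages."""
    return messages[-k:] if k > 0 else []


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer counts.
    return len(text.split())


def token_buffer(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages that fit in max_tokens, dropping the oldest first."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(msg)
        total += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "user: hi there",
    "assistant: hello how can I help",
    "user: tell me about memory",
    "assistant: memory stores past turns",
]
window = sliding_window(history, 2)
trimmed = token_buffer(history, 16)
print(window)
print(trimmed)
```

With this toy history, the window keeps the last two messages, while the 16-token budget keeps the last three. A message count is easier to reason about. A token budget tracks the real constraint, the context window.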

Long-term memory techniques for LLM agents: vector store, entity, knowledge graph, episodic, semantic, procedural

💾 Long-Term Memory (Techniques 6-11)

Storage that survives across sessions and users.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 06 | Vector Store Memory | Turn past messages into vectors (number lists that capture meaning). Search them later by similarity. | ✅ Notebook · Colab |
| 07 | Entity Memory | Pull out and track facts about people, projects, and preferences. Update them as the conversation grows. | ✅ Notebook · Colab |
| 08 | Knowledge Graph Memory | Build a graph of how entities connect. Walk the graph to reason over what the agent has learned. | ✅ Notebook · Colab |
| 09 | Episodic Memory | Store complete interactions with when-and-where context. Good for "what happened when" questions. | ✅ Notebook · Colab |
| 10 | Semantic Memory | Pull general facts out of interactions. Store them on their own, away from the raw episodes. | ✅ Notebook · Colab |
| 11 | Procedural Memory | Capture "how-to" knowledge: the procedures and workflows the agent picks up over time. | ✅ Notebook · Colab |
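
The vector store idea (technique 06) can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's implementation. The `embed` function below uses bag-of-words counts as a stand-in for a real embedding model, and `VectorStoreMemory` is a hypothetical class name.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Assumption: a bag-of-words count vector stands in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStoreMemory:
    """Hypothetical illustration: store texts with vectors, search by similarity."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


memory = VectorStoreMemory()
memory.add("user prefers dark mode")
memory.add("the user lives in Lisbon")
memory.add("the project deadline is next Friday")
best = memory.search("where does the user live", top_k=1)
print(best)
```

Word counts only match exact tokens, so "live" and "lives" do not overlap here. A learned embedding model would place them close together, which is the main reason to use one in practice.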

Cognitive architecture memory patterns: working memory, hierarchical layers, consolidation, compaction, self-reflection, routing, temporal, forgetting

🧩 Cognitive Architectures (Techniques 12-19)

Patterns borrowed from how humans remember.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 12 | Working Memory & Context Window | Manage the agent's limited attention. Prioritize, pin, and evict context on the fly. | ✅ Notebook · Colab |
| 13 | Hierarchical Memory Layers | Tiered storage with hot, warm, and cold layers. Promote and demote items as they age. | ✅ Notebook · Colab |
| 14 | Memory Consolidation | Merge, deduplicate, and strengthen memories. Inspired by how the brain consolidates during sleep. | ✅ Notebook · Colab |
| 15 | Memory Compaction | Compress stored memories with summaries, entity extraction, or distillation. Save storage and tokens. | ✅ Notebook · Colab |
| 16 | Self-Reflection Memory | The agent looks back at its own actions. It writes notes on what worked, and uses them next time. | ✅ Notebook · Colab |
| 17 | Memory Routing | Pick the right memory store to read from or write to. Route by content type and intent. | ✅ Notebook · Colab |
| 18 | Temporal Memory | Attach timestamps to memories. Retrieve with time awareness and weight recent items higher. | ✅ Notebook · Colab |
| 19 | Forgetting & Decay | Forget on purpose. Use decay, access counts, or relevance to prune. | ✅ Notebook · Colab |
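
One possible way to combine timestamps (technique 18) with pruning (technique 19) is exponential decay. The sketch below is an assumption about how such a rule could look, not the notebooks' method. The function names, the half-life, the threshold, and the memory fields are all hypothetical.

```python
import math


def decay_score(relevance: float, age_seconds: float, half_life_seconds: float) -> float:
    """Exponential decay: the score halves every half_life_seconds."""
    return relevance * math.pow(0.5, age_seconds / half_life_seconds)


def prune(memories: list[dict], now: float, half_life_seconds: float, threshold: float) -> list[dict]:
    """Keep memories whose decayed score is at or above the threshold."""
    kept = []
    for m in memories:
        score = decay_score(m["relevance"], now - m["created_at"], half_life_seconds)
        if score >= threshold:
            kept.append(m)
    return kept


# Hypothetical memories; created_at is in seconds since an arbitrary start.
memories = [
    {"text": "user likes tea", "relevance": 1.0, "created_at": 0},
    {"text": "user asked about pricing", "relevance": 0.4, "created_at": 3000},
    {"text": "user is named Dana", "relevance": 1.0, "created_at": 3600},
]
survivors = prune(memories, now=3600, half_life_seconds=1800, threshold=0.3)
print([m["text"] for m in survivors])
```

The oldest memory has aged two half-lives, so its score drops to 0.25 and it is pruned. The other two stay. Access counts could be folded in by raising `relevance` each time a memory is read.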

Memory retrieval and multi-agent patterns: retrieval patterns, cross-session memory, multi-agent shared memory, memory as tools

Retrieval & Multi-Agent (Techniques 20-23)

How agents find and share memories.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 20 | Memory Retrieval Patterns | Compare retrieval strategies: semantic search, recency, hybrid scoring, diversity, and re-ranking. | ✅ Notebook · Colab |
| 21 | Cross-Session Memory | Save and reload agent state across sessions. The user picks up where they left off. | ✅ Notebook · Colab |
| 22 | Multi-Agent Shared Memory | Shared stores, message passing, and agreement protocols for multi-agent teams. | ✅ Notebook · Colab |
| 23 | Memory with Tools | Give the agent memory tools it can call: save, search, forget. Treated like any other tool. | ✅ Notebook · Colab |
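
The "memory as tools" idea (technique 23) can be sketched as a small set of callable functions plus a dispatcher. This is a hypothetical illustration, not the notebook's code. `MemoryTools`, its `call` method, and the substring search are assumptions made to keep the example self-contained.

```python
class MemoryTools:
    """Hypothetical tool set: save, search, and forget over an in-memory dict."""

    def __init__(self) -> None:
        self._store: dict[int, str] = {}
        self._next_id = 1

    def save(self, text: str) -> int:
        memory_id = self._next_id
        self._store[memory_id] = text
        self._next_id += 1
        return memory_id

    def search(self, query: str) -> list[tuple[int, str]]:
        # Assumption: case-insensitive substring match stands in for real retrieval.
        q = query.lower()
        return [(i, t) for i, t in self._store.items() if q in t.lower()]

    def forget(self, memory_id: int) -> bool:
        # Returns True if something was actually removed.
        return self._store.pop(memory_id, None) is not None

    def call(self, name: str, **kwargs):
        """Dispatch a tool call by name, the way an agent loop might."""
        tools = {"save": self.save, "search": self.search, "forget": self.forget}
        if name not in tools:
            raise ValueError(f"unknown tool: {name}")
        return tools[name](**kwargs)


tools = MemoryTools()
first_id = tools.call("save", text="User prefers metric units")
tools.call("save", text="User is allergic to peanuts")
hits = tools.call("search", query="peanuts")
removed = tools.call("forget", memory_id=first_id)
remaining = tools.call("search", query="user")
print(hits, removed, remaining)
```

In a real agent, the model would choose the tool name and arguments. The dispatcher here only shows that memory operations can sit behind the same interface as any other tool.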

Agent memory frameworks and libraries: Graphiti, Mem0, Letta (MemGPT), Zep

🔧 Frameworks & Platforms (Techniques 24-27)

Work with the leading memory frameworks, hands-on.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 24 | Graph Memory with Graphiti | Use Zep's Graphiti to build time-aware knowledge graphs from chat. Extract episodes and general facts. | ✅ Notebook · Colab |
| 25 | Mem0 Patterns | Use Mem0's managed memory layer. It handles extracting, storing, and fetching user-specific memories. | ✅ Notebook · Colab |
| 26 | Letta (MemGPT) Patterns | Build MemGPT's self-editing memory. Covers inner monologue, heartbeat events, and memory pressure. | ✅ Notebook · Colab |
| 27 | Zep Memory | Use Zep for dialog classification, entity extraction, and time-aware graphs. Built for production. | ✅ Notebook · Colab |

Agent memory evaluation and production: memory evaluation, LoCoMo and LongMemEval benchmarks, production deployment patterns

📊 Evaluation & Production (Techniques 28-30)

Measure your memory. Then ship it.

| # | Technique | Description | Notebook |
| --- | --- | --- | --- |
| 28 | Memory Evaluation | Measure memory quality. Check retrieval precision and recall, staleness, contradictions, and user satisfaction. | ✅ Notebook · Colab |
| 29 | Memory Benchmarks (LoCoMo) | Run your memory against LoCoMo and LongMemEval benchmarks. See how it does over long conversations. | ✅ Notebook · Colab |
| 30 | Production Memory Patterns | Run memory at scale. Caching, TTLs (time-to-live), sharding, backups, GDPR, and observability. | ✅ Notebook · Colab |
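
Retrieval precision and recall (technique 28) come down to simple set arithmetic. The sketch below is a generic illustration, not the notebook's evaluation harness. The function name and the memory IDs are made up for the example.

```python
def precision_recall(retrieved: list[str], relevant: list[str]) -> tuple[float, float]:
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical memory IDs: what the system returned vs. what a labeler marked relevant.
retrieved = ["m1", "m3", "m7", "m9"]
relevant = ["m1", "m2", "m3"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved memories are relevant (precision 0.5), and two of the three relevant memories were found (recall about 0.67). Tracking both matters, since returning everything maximizes recall while precision collapses.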

🎯 Learning Paths

Beginner: Foundations

New to agent memory? Start here. These are the building blocks.

01 Conversation Buffer → 02 Sliding Window → 03 Summary Memory →
05 Token Buffer → 06 Vector Store Memory → 21 Cross-Session Memory

Intermediate: Structured Memory

Ready for more? Add entities, graphs, and smarter retrieval.

07 Entity Memory → 08 Knowledge Graph → 09 Episodic Memory →
10 Semantic Memory → 20 Retrieval Patterns → 22 Multi-Agent Shared Memory

Advanced: Cognitive Architectures

Build human-inspired memory patterns for advanced agents.

12 Working Memory → 13 Hierarchical Layers → 14 Consolidation →
16 Self-Reflection → 17 Memory Routing → 19 Forgetting & Decay

Practitioner: Frameworks & Production

Connect to production tools and measure what you've built.

25 Mem0 → 26 Letta/MemGPT → 24 Graphiti → 27 Zep →
28 Evaluation → 29 Benchmarks → 30 Production Patterns

🚀 Quick Start

💡 Prefer not to install anything? Every notebook renders on GitHub directly. Click a technique in the table above to read it in your browser. Or use the Colab badges to run it in the cloud.

# Clone the repository
git clone https://github.com/NirDiamant/Agent_Memory_Techniques.git
cd Agent_Memory_Techniques

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up your API keys
cp .env.example .env
# Edit .env with your OPENAI_API_KEY and/or ANTHROPIC_API_KEY

# Launch Jupyter and start with the first technique
jupyter notebook all_techniques/01_conversation_buffer_memory/

Project Structure

Agent_Memory_Techniques/
├── README.md                           # You are here
├── ROADMAP.md                          # Current state and what's next
├── LICENSE                             # Apache 2.0
├── CITATION.cff                        # How to cite this work
├── requirements.txt                    # Python dependencies
├── .env.example                        # API key template
├── llms.txt                            # LLM-discoverability index
│
├── all_techniques/                     # 30 technique folders, each with notebook + README
│   ├── 01_conversation_buffer_memory/
│   ├── 02_sliding_window_memory/
│   ├── ...
│   └── 30_production_memory_patterns/
│
├── docs/                               # Project documentation
│   ├── architecture.md                 # Memory system design patterns
│   ├── comparison.md                   # Side-by-side comparison of all 30 techniques
│   ├── glossary.md                     # Key terms and definitions
│   ├── learning_path.md                # Detailed learning path guide
│   ├── topics.md                       # Keyword index
│   ├── roadmap.md                      # Original planning archive
│   ├── FAQ.md                          # Frequently asked questions
│   └── CONTENT_STANDARDS.md            # Writing-style rules
│
├── .github/                            # GitHub community files
│   ├── CONTRIBUTING.md                 # How to contribute
│   ├── CODE_OF_CONDUCT.md              # Community guidelines
│   ├── SECURITY.md                     # Security policy
│   ├── FUNDING.yml                     # Sponsorship config
│   ├── ISSUE_TEMPLATE/                 # Issue templates
│   ├── pull_request_template.md        # PR template
│   └── workflows/                      # CI workflows
│
├── utils/                              # Shared helpers and validators
│   ├── helpers.py                      # Env loading, LLM clients, cosine, tokens
│   ├── validate_cells.py               # Notebook cell-structure validator
│   └── validate_style.py               # Prose-style validator
│
├── tests/                              # pytest smoke tests
├── data/                               # Small sample datasets
└── images/                             # Diagrams and visuals

🀝 Contributing

Contributors

We welcome contributions. You can fill in a notebook, fix a bug, improve the docs, or propose a new technique. Every contribution helps the next reader.

See CONTRIBUTING.md for the details.

Where we need help the most:

  • More techniques we haven't covered yet (propose one via an issue)
  • Architecture diagrams (Mermaid or ASCII)
  • More memory benchmarks and evaluation metrics
  • Integration examples for new frameworks


💖 Sponsors

Supporting this project helps keep educational AI content free and open. If your company uses agent memory in production, consider sponsoring to get your logo below.

Become a Sponsor


🔗 Related Work

This repo is part of a bigger collection of AI technique tutorials.

| Repository | Stars | Focus |
| --- | --- | --- |
| RAG Techniques | 26k+ | Retrieval-Augmented Generation techniques |
| GenAI Agents | 21k+ | Generative AI agent architectures |
| Agents Towards Production | 18k+ | Production-grade agent deployment |
| Prompt Engineering | 7k+ | Prompt engineering techniques |

🏷️ Topics Covered

This repository is a practical reference for agent memory in Large Language Model (LLM) applications. For the full keyword index covering short-term, long-term, cognitive architectures, retrieval, frameworks, evaluation, and production patterns, see docs/topics.md.


⚠️ Disclaimer

This repository is for educational purposes. The code here shows how agent memory techniques work. It is not production-ready software. Do not use it as-is for handling regulated data, medical decisions, legal advice, or any high-stakes application without a careful review. The authors accept no responsibility for how you use this material.


📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.


📖 Citation

If you use this repository in your research or teaching, please cite:

@misc{diamant2026agentmemory,
    title={Agent Memory Techniques: A Comprehensive Collection},
    author={Nir Diamant},
    year={2026},
    url={https://github.com/NirDiamant/Agent_Memory_Techniques}
}

Built with care by Nir Diamant, making advanced AI accessible to everyone.

