Designing a Memory System for AI Agents

Agent memory is hard to build, but it is the key to creating personalized, accurate agents that are difficult to replace, because the context they accumulate creates real switching costs.

Table of Contents

LLMs vs Agents
Why Memory is Critical for Agents
Memory System Design for AI Agents
- What is a Memory System?
- System Design
Example: Designing an Agentic Memory System for a “Personalised Quizzing App”
References

LLMs vs Agents

Most product builders confuse LLMs with AI agents. On its own, an LLM is a:

Stateless text/image/video predictor
Single-turn responder - no memory between requests
No actions - only produces output

Agents, by contrast, are software systems that accomplish higher-order, human-level tasks, and they need several components working together:

Workflow - orchestrates multi-step tasks with user control and transparency
LLM (Brain) - reasoning and decision-making engine
Actions via Tools - capabilities to interact with the external world (APIs, databases, file systems)
Memory - stores context across interactions (short-term session state + long-term knowledge)
Orchestration - manages loops, error handling, state transitions
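
To make the contrast concrete, here is a minimal sketch rather than a real framework. The names `fake_llm`, `Agent`, and `remember` are hypothetical, and the "LLM" is a stub function so the example runs without any model API. It only illustrates that the model call itself is stateless, while the agent wraps it with a tool and a memory.

```python
import re
from dataclasses import dataclass, field


def fake_llm(prompt: str) -> str:
    """Stub model (assumption): stateless, answers only from the text in `prompt`."""
    match = re.search(r"name is (\w+)", prompt)
    return f"Your name is {match.group(1)}." if match else "I don't know your name."


def remember(memory: list, note: str) -> None:
    """Tool: append a note to the agent's long-term memory."""
    memory.append(note)


@dataclass
class Agent:
    """Wraps the stateless model with memory and a small tool registry."""
    memory: list = field(default_factory=list)   # long-term notes
    tools: dict = field(default_factory=dict)    # tool name -> callable

    def run(self, user_message: str) -> str:
        # Action via a tool: persist what the user states about themselves.
        if "name is" in user_message:
            self.tools["remember"](self.memory, user_message)
            return "Noted."
        # Memory: inject stored context into the otherwise stateless call.
        prompt = "\n".join(self.memory + [user_message])
        return fake_llm(prompt)


agent = Agent(tools={"remember": remember})
agent.run("My name is Neeraj")

print(fake_llm("What is my name?"))   # the bare model has no memory of earlier turns
print(agent.run("What is my name?"))  # the agent answers from stored memory
```

The bare model call cannot answer the second question because nothing from the first turn reaches it; the agent can, because it stored the note and added it to the prompt.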

Why Memory is Critical for Agents

Humans rely on memory to accomplish almost any task: they note down learnings, remember instructions, learn from mistakes, etc.

In the same way, memory matters for AI agents:

Personalisation (to each customer…)
Accuracy: the agent avoids repeating the same mistakes when accomplishing tasks…
Long-horizon tasks: working over the long term requires iteration, learning, and unlearning

Memory System Design for AI Agents

[Figure: memory system design for AI agents]

What is a Memory System?

A memory system for an agent creates, stores, and uses “memories” that help the agent complete tasks more accurately and in a more personalised manner. Here, memories are pieces of information that are both valuable (important) and relevant (recent and not outdated) to the task and to the human the task is for.
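
One way to turn “valuable and relevant” into something an agent can rank by is a simple score. The sketch below is an illustration only: the `Memory` class, the 0 to 1 importance scale, the 30-day half-life, and the multiply-importance-by-recency formula are all assumptions, and the sample memories are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Memory:
    text: str
    importance: float        # assumed scale: 0.0 (trivial) to 1.0 (critical)
    created_at: datetime
    outdated: bool = False   # set when a newer memory supersedes this one


def memory_score(m: Memory, now: datetime, half_life_days: float = 30.0) -> float:
    """Importance weighted by exponential recency decay; outdated memories score 0."""
    if m.outdated:
        return 0.0
    age_days = (now - m.created_at).total_seconds() / 86400
    recency = 0.5 ** (age_days / half_life_days)  # halves every `half_life_days`
    return m.importance * recency


now = datetime(2025, 1, 31)
fresh = Memory("Prefers concise answers", 0.8, now - timedelta(days=1))
old = Memory("Asked about pricing once", 0.8, now - timedelta(days=30))
stale = Memory("Works at previous employer", 0.9, now - timedelta(days=2), outdated=True)

# Highest-scoring memories come first.
ranked = sorted([old, stale, fresh], key=lambda m: memory_score(m, now), reverse=True)
print([m.text for m in ranked])
```

With these numbers the one-day-old memory ranks first, the 30-day-old one second (its score has halved), and the outdated one last even though it has the highest importance.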

System Design

To design a memory system for your AI agent, you need to carefully define the following:

Entities you require information about: humans, companies, places, etc. For example, conversational agents require information about their users, and possibly also about the companies those users work for.
Types of information that you require:
- Working memory (or short-term memory): short-term session memory in the context window
- Semantic or Factual Memory: facts about “entities” (“Neeraj is an AI product builder”)
- Episodic Memory: “experiences” or “actions & results” (“Went to Six Flags on 10th birthday”, “created this code and faced a bug last time..”)
- Procedural Memory: explicit instructions for the agent to follow (“Always provide concise answers”)
Memory Lifecycle
- When memories are generated:
  - In the hot path: Agent decides what to remember during conversation
  - In the background: Async processing after turns/sessions
- When & how memories are utilised: deciding whether to use a memory requires nontrivial reasoning over the user's request and the system rules
Inspectability: you can trace back to any “memory snippet” the agent used, to understand why it responded the way it did
User Control: when memory gets used and how users control it
Memory Store: Database for efficient upsert & retrieval
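
The checklist above can be sketched as a small store. This is one possible shape, not a reference design: it assumes an in-memory SQLite database, a hypothetical `MemoryStore` class, and a `source` column added so every memory can be traced back to where it came from. Working memory is left out because it lives in the context window rather than in the store. The entity ID and the stored values are invented for illustration.

```python
import sqlite3
from enum import Enum


class MemoryType(str, Enum):
    SEMANTIC = "semantic"      # facts about entities
    EPISODIC = "episodic"      # experiences, actions and results
    PROCEDURAL = "procedural"  # explicit instructions for the agent


class MemoryStore:
    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE memories ("
            " entity TEXT, type TEXT, key TEXT, value TEXT, source TEXT,"
            " PRIMARY KEY (entity, type, key))"
        )

    def upsert(self, entity, mtype, key, value, source):
        # INSERT OR REPLACE overwrites the row with the same primary key.
        self.db.execute(
            "INSERT OR REPLACE INTO memories VALUES (?, ?, ?, ?, ?)",
            (entity, mtype.value, key, value, source),
        )

    def retrieve(self, entity, mtype=None):
        # Returns (type, key, value, source) so each memory stays inspectable.
        sql = "SELECT type, key, value, source FROM memories WHERE entity = ?"
        params = [entity]
        if mtype is not None:
            sql += " AND type = ?"
            params.append(mtype.value)
        return self.db.execute(sql + " ORDER BY type, key", params).fetchall()

    def forget(self, entity, key):
        # User control: delete a specific memory on request.
        self.db.execute(
            "DELETE FROM memories WHERE entity = ? AND key = ?", (entity, key)
        )


store = MemoryStore()
store.upsert("user-1", MemoryType.SEMANTIC, "role", "AI product builder", "chat turn 3")
store.upsert("user-1", MemoryType.PROCEDURAL, "style", "Always provide concise answers", "user settings")
# A later turn updates the same fact instead of creating a duplicate.
store.upsert("user-1", MemoryType.SEMANTIC, "role", "AI product manager", "chat turn 9")
print(store.retrieve("user-1", MemoryType.SEMANTIC))
```

Keying rows on (entity, type, key) keeps upserts cheap and avoids contradictory duplicates, while the `source` field answers the “why” question when someone inspects a memory.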

Example: Designing an Agentic Memory System for a “Personalised Quizzing App”

[Figure: memory system for a personalised quizzing app]

Problem:

Build an adaptive learning agent that quizzes users on topics they’ve studied, adapting difficulty based on performance and remembering learning patterns across sessions.

Challenges:

Track which topics the user has learned and their proficiency level in each
Remember past quiz performance
Adapt difficulty to the user's learning progress on each topic
Maintain context within a study session while building long-term knowledge

Memory System Design:

Entities:

User: individual learner (proficiency, preferences, learning patterns)
Topics: subjects being studied (coverage, difficulty levels)

Types of memories required:

Working Memory (short-term/session-level): current quiz topic and difficulty level, questions asked in this session, real-time performance score
Semantic Memory (facts): user’s proficiency per topic, topics mastered vs. topics the user is struggling with, study schedule preferences
Episodic Memory (events that happened…): past quiz attempts with results (Topic: Calculus, Date: Nov 20, Score: 7/10), learning breakthroughs, common mistake patterns, questions answered incorrectly multiple times
Procedural Memory (instructions to follow): when to increase difficulty (3 consecutive correct answers), how to space repetition, escalation rules (suggest a human tutor after 3 failed attempts)
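
The two numeric rules in procedural memory are simple enough to sketch directly against working memory. The example below assumes a hypothetical `SessionState` class, a difficulty scale of 1 to 5, and that “3 failed attempts” means three wrong answers within the session; the original rule does not specify any of these.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Working memory for one quiz session."""
    difficulty: int = 1       # assumed scale: 1 (easiest) to 5 (hardest)
    correct_streak: int = 0   # consecutive correct answers
    failed_attempts: int = 0  # wrong answers so far in this session (assumption)


def record_answer(state: SessionState, correct: bool) -> str:
    """Apply the procedural rules and return the agent's next action."""
    if correct:
        state.correct_streak += 1
        if state.correct_streak == 3:
            # Rule: increase difficulty after 3 consecutive correct answers.
            state.difficulty = min(state.difficulty + 1, 5)
            state.correct_streak = 0
            return "increase_difficulty"
        return "continue"
    state.correct_streak = 0
    state.failed_attempts += 1
    if state.failed_attempts >= 3:
        # Rule: suggest a human tutor after 3 failed attempts.
        return "suggest_human_tutor"
    return "continue"


state = SessionState()
answers = (True, True, True, False, False, False)
actions = [record_answer(state, correct) for correct in answers]
print(actions, state.difficulty)
```

Three correct answers raise the difficulty from 1 to 2, and the third wrong answer triggers the tutor suggestion.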

User Control: opt-in memory (“remember my learning patterns”), granular controls such as resetting proficiency per topic and marking quizzes as practice only (no memory updates), the ability to see and edit what the agent remembers, etc.
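
These controls can be sketched as follows. `LearnerMemory` and `QuizAttempt` are hypothetical names, and deriving proficiency as the share of correct answers across stored attempts is an assumption about how episodic records might feed a semantic fact.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QuizAttempt:
    """Episodic memory: one past quiz attempt."""
    topic: str
    date: str
    score: int
    total: int


@dataclass
class LearnerMemory:
    opted_in: bool = False                        # memory is off until the user opts in
    attempts: list = field(default_factory=list)  # stored QuizAttempt records

    def record(self, attempt: QuizAttempt, practice_only: bool = False) -> bool:
        """Store the attempt unless memory is off or the quiz is practice only."""
        if not self.opted_in or practice_only:
            return False
        self.attempts.append(attempt)
        return True

    def proficiency(self, topic: str) -> Optional[float]:
        """Semantic fact derived from episodes: share of correct answers."""
        relevant = [a for a in self.attempts if a.topic == topic]
        if not relevant:
            return None
        return sum(a.score for a in relevant) / sum(a.total for a in relevant)

    def reset_topic(self, topic: str) -> None:
        """Granular control: forget everything about one topic."""
        self.attempts = [a for a in self.attempts if a.topic != topic]

    def view(self) -> list:
        """Let the user see what the agent remembers."""
        return list(self.attempts)


memory = LearnerMemory(opted_in=True)
memory.record(QuizAttempt("Calculus", "Nov 20", 7, 10))
# Practice-only quizzes are not written to memory.
memory.record(QuizAttempt("Calculus", "Nov 21", 2, 10), practice_only=True)
print(memory.proficiency("Calculus"))
```

Only the Nov 20 attempt is stored, so the practice quiz does not drag the proficiency estimate down. Calling `reset_topic("Calculus")` afterwards would return the topic to an unknown state.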

References

Paper on Cognitive Architectures for Language Agents : https://arxiv.org/pdf/2309.02427
https://www.leoniemonigatti.com/blog/memory-in-ai-agents.html