This post presents some ideas I worked on over a few months while building a biologically inspired memory system for my AI agent system. It's designed to emulate characteristics of human cognitive processes, with the hope that it contributes to the character and contextual awareness of the agents powered by it.
We model dynamic memory networks that facilitate inter-memory interactions, together with an entity management system for organizing knowledge about things, all with connected memories, cues, and decay over time. I also recently added a "sleep phase" for memory consolidation, a time-aware retrieval step for foundational memories, a meta-memory system for higher-level memory management, and memory evolution processes that enable modifications over time.
Drawing from neuroscience, the system integrates multiple types of memory—including episodic, semantic, procedural, working, and several others—each modeled with properties and behaviors akin to their biological counterparts.
None of this is strictly required; it was an exploration of the qualitative effects of these ideas. The system makes room for a wide range of mechanisms, but you only need a few pieces for practical use with different types of agents. An important consideration is that agents themselves are more like cogs in the system than the system itself. An organisation of agents, each specialized and each with the right shared context, is bound to be easier to configure, reason about, and validate.
It started as a thought exercise, tracing the path new memories take from our senses, to how they're encoded in the hippocampus, and everything that we know happens after.
How our brain makes memories
When memories are first formed, the hippocampus is heavily involved in both encoding and retrieving them. It "indexes" or binds together different elements of a memory stored across the cortex (e.g., sights, sounds, emotions) and helps retrieve these components by reactivating the relevant cortical areas.
Over time, through a process called consolidation, memories become increasingly integrated into the cortex itself. This means that as memories age, the cortex takes on more responsibility for their storage and retrieval. The hippocampus becomes less critical for retrieving well-established, older memories, as these memories have stronger direct connections within cortical regions.
Once memories are fully consolidated, they can often be retrieved directly from the cortex with little or no hippocampal involvement, especially for semantic (factual) memories. However, episodic memories—those with rich context and detail—often continue to rely on the hippocampus even after being partially consolidated in the cortex.
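To make the consolidation analogy concrete, here's a minimal sketch of what a nightly "sleep phase" pass could look like. The Memory class, thresholds, and promotion rule are illustrative assumptions, not the system's actual implementation.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    content: str
    created_at: float = field(default_factory=time.time)
    retrieval_count: int = 0    # how often this memory has been recalled
    consolidated: bool = False  # cortex-like direct store vs. hippocampus-like index

def consolidate(memories: list[Memory],
                min_age_days: float = 7.0,  # hypothetical thresholds
                min_retrievals: int = 3) -> None:
    """Promote old, frequently recalled memories so they can be
    retrieved directly, without going through the central index."""
    now = time.time()
    for m in memories:
        age_days = (now - m.created_at) / 86400
        if age_days >= min_age_days and m.retrieval_count >= min_retrievals:
            m.consolidated = True
```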
While my initial curiosity led me to explore a fine-tuned embedding model focused on a dense retrieval pipeline, the most effective solution proved to be more comprehensive. In practice, a hybrid approach works best: one that combines dense retrieval with a retrieval system that accounts for factors like time relevance, popularity, and other key variables related to the use case (more akin to sparse retrieval).
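As a sketch of what such a hybrid scorer might look like, the function below blends dense similarity with recency and popularity signals; the weights, half-life, and saturation constant are assumptions to be tuned per use case.

```python
import math
import time

def hybrid_score(dense_sim: float, created_at: float, retrieval_count: int,
                 half_life_days: float = 30.0,
                 w_sim: float = 0.7, w_time: float = 0.2, w_pop: float = 0.1) -> float:
    """Combine dense-retrieval similarity with time relevance and popularity."""
    age_days = (time.time() - created_at) / 86400
    recency = math.exp(-math.log(2) * age_days / half_life_days)  # halves every half_life_days
    popularity = 1 - math.exp(-retrieval_count / 5)               # saturates with repeated recall
    return w_sim * dense_sim + w_time * recency + w_pop * popularity
```

Exponential decay gives recent memories an edge without ever zeroing out older ones, which is what the time-aware retrieval step for foundational memories is after.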
Designing a neural-inspired memory system
Replicating the intricacies of human memory is a formidable challenge. AI architectures often rely on static databases or vector-space representations, which capture semantic relationships but lack the depth and dynamism inherent in human memory systems.
The motivation is to enhance an agent's ability to learn in a manner resembling human cognition, and to create AI systems capable of forming, retrieving, and modifying memories in a way that enriches their learning and decision-making capabilities.
Memory system
├── Meta-memory system
├── Memory processes
│   ├── Memory consolidation (daily sleep phase)
│   ├── Time and popularity awareness
│   └── Memory evolution
├── Sensory memory (real-time world state)
├── Working memory
├── Long-term memory
│   ├── Declarative memory (explicit memory)
│   │   ├── Episodic memory
│   │   └── Semantic memory
│   └── Non-declarative memory (implicit memory)
│       └── Procedural memory
└── Entity system
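To ground the taxonomy above, here's one way those memory types might be represented as data; the enum values and record fields are illustrative assumptions rather than the system's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class MemoryType(Enum):
    SENSORY = auto()     # real-time world state, short-lived
    WORKING = auto()     # active task context
    EPISODIC = auto()    # events with rich context (declarative)
    SEMANTIC = auto()    # facts and general knowledge (declarative)
    PROCEDURAL = auto()  # skills and how-to knowledge (non-declarative)

@dataclass
class MemoryRecord:
    kind: MemoryType
    content: str
    cues: list[str] = field(default_factory=list)        # retrieval cues
    linked_ids: list[str] = field(default_factory=list)  # connections to other memories
    entity_ids: list[str] = field(default_factory=list)  # references into the entity system
    strength: float = 1.0                                # decays over time, boosted on recall
```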
Fine-tuning embedding models
Embedding models, which convert high-dimensional data into lower-dimensional vector spaces, are instrumental in enabling agents to understand and relate different pieces of information.
By fine-tuning an embedding model, we can move the model towards simulating how humans recall memories based on learned associations and contextual cues. This approach allows the system to retrieve related memories more efficiently, leading to more coherent and contextually appropriate responses. Practically speaking, this can be done using a sparse embedding model alongside an LLM that reshapes new inputs within the context of any related memories, before embedding the result and using it for retrieval. The combination gives us more control over shaping the final result.
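A minimal sketch of that reshape-then-embed step, assuming placeholder llm and embed callables standing in for whatever completion and embedding backends are in use:

```python
from typing import Callable

def encode_with_context(new_input: str, related: list[str],
                        llm: Callable[[str], str],
                        embed: Callable[[str], list[float]]) -> list[float]:
    """Reshape a new input in the context of related memories, then embed it."""
    context = "\n".join(related)
    prompt = (
        "Rewrite the following input as a self-contained memory, "
        "resolving references using the related memories.\n"
        f"Related memories:\n{context}\n\nInput: {new_input}"
    )
    reshaped = llm(prompt)  # placeholder completion call
    return embed(reshaped)  # dense vector stored for later retrieval
```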
Fine-tuning an embedding model isn't the only way to build this sort of thing, but it's a particularly powerful approach if you have strong opinions about what the embedding space should capture. Between the various RAG techniques and the setup of the overall retrieval system, you can configure essentially every aspect of the memory system.
The effects of an expressive memory system on AI behavior have been fascinating to observe. The conversational agents demonstrate enhanced learning capabilities, forming connections between new information and existing memories in ways that feel more organic and less mechanical. Their responses become more adaptive, drawing from a rich tapestry of interconnected memories rather than isolated data points.
Perhaps most interestingly, it leads to more consistent behaviour over time. As the AI draws from its well-structured memory system, its decisions and actions maintain a coherent thread that reflects its configuration and accumulated experiences.