Overview
Titans: Learning to Memorize at Test Time (Google Research, Dec 2024) proposes a new model architecture designed to mitigate the quadratic cost issue of traditional Transformers while achieving long-term memory.
The Core Concept
For over a decade, sequence modeling has been split between:
- Recurrent Models (RNNs/LSTMs): Compress data into a fixed-size hidden state. Great for compute, terrible for long-range recall.
- Attention (Transformers): Attend to the entire context window. Perfect for recall, but computationally quadratic ($O(N^2)$).
Titans bridge this gap by using a neural architecture inspired by human cognitive memory. It features “learning by surprise,” where the model utilizes a test-time training mechanism to dynamically update its internal memory representations as it encounters new tokens, effectively learning to memorize during inference.
TODO: Add structural diagram of Titans Memory Module.