
External Memory Modules: Because My Model Has Commitment Issues

You know what takes forever? Training a transformer. You know what takes less forever? Training a tiny thing that just remembers stuff. Enter External Memory Modules, or EMM for people who enjoy acronyms more than free time.

The idea is simple. Instead of cramming everything into the model's weights, let it write stuff down. Think of it as the model having a little notebook. It can jot down important information, reference it later, and best of all, it does not even need to remember where it wrote any of it down. That is what memory retrieval is for.
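For the curious, here is one way "not remembering where you wrote it" can work. This is a minimal sketch, assuming retrieval is done by content similarity (dot product plus softmax); the post does not pin down the actual mechanism, and the names `read` and `sharpness` are made up for illustration.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read(memory, query, sharpness=5.0):
    """Blend all slots together, weighting each by how similar it is to the query.

    Assumption: similarity is a plain dot product. `sharpness` is a
    hypothetical knob that makes the lookup more or less picky.
    """
    scores = [sharpness * sum(m * q for m, q in zip(slot, query)) for slot in memory]
    weights = softmax(scores)
    width = len(memory[0])
    value = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(width)]
    return value, weights

# Three toy notes. The query never mentions a slot number.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
value, weights = read(memory, query=[0.0, 1.0])
best_slot = weights.index(max(weights))
```

The model asks "what looks like this?" rather than "what is in slot 1?", and the slot that best matches the query gets the most weight.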

How It Works

We give the model 384 memory slots. It can read from any of them and write to any of them, and the whole thing is differentiable, so training teaches it what to remember and what to forget. It is like giving a goldfish a smartphone to take notes.
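If you want to see what a differentiable write might look like, here is a rough sketch. It assumes an erase-then-add update in the style of a Neural Turing Machine, which is my guess rather than anything stated above. `SLOT_WIDTH` and the hand-picked weights are also assumptions; in a trained model the weights would come out of a softmax.

```python
NUM_SLOTS = 384   # from the post
SLOT_WIDTH = 4    # assumed; the post does not give a slot size

def write(memory, weights, erase, add):
    """Soft write: every slot changes a little, in proportion to its weight.

    Assumed update rule (NTM-style): new = old * (1 - w * erase) + w * add.
    It is all multiplies and adds, so gradients can flow through it.
    """
    return [
        [m * (1.0 - w * e) + w * a for m, e, a in zip(slot, erase, add)]
        for slot, w in zip(memory, weights)
    ]

def read(memory, weights):
    # Soft read: a weighted average over all slots.
    return [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(SLOT_WIDTH)]

memory = [[0.0] * SLOT_WIDTH for _ in range(NUM_SLOTS)]

# Hand-picked attention: mostly slot 7, with a little spilling into slot 8.
weights = [0.0] * NUM_SLOTS
weights[7], weights[8] = 0.9, 0.1

note = [1.0, 2.0, 3.0, 4.0]
memory = write(memory, weights, erase=[1.0] * SLOT_WIDTH, add=note)
recalled = read(memory, weights)
```

Note that the recalled value is a slightly blurry copy of the note, not an exact one. That is the price of keeping everything differentiable: the goldfish's handwriting is a bit smudged.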

The model does not have to be perfect. It just has to know where its notes are.

Is it cheating? Maybe. Is it effective? Absolutely. Does it understand what it remembers? Probably not. But neither do I and I still take notes.