Query
What is Δ-Mem and how does it improve online memory for large language models? — start with https://arxiv.org/abs/2605.12357 and browse beyond it.
## What is Δ-Mem?

**Δ-Mem** (delta-mem) is a lightweight online memory mechanism that augments frozen large language models (LLMs) with dynamic memory, without fine-tuning the backbone or changing its architecture.

## How It Works

### Core Architecture

- **Frozen Backbone**: Works with a pre-trained LLM (e.g., Qwen3-4B, SmolLM3-3B) without modifying its parameters
- **Compact Online State**: Uses an **8×8 matrix** (64 values) to store compressed historical information
- **Delta-Rule Learning**: Updates memory using an error-driven mechanism inspired by Hebbian learning principles

### Memory Mechanism

1. **Compression**: Past information is compressed into a fixed-size state matrix
2. **Update Rule**: Delta-rule learning incrementally adjusts the memory state based on prediction errors
3. **Readout**: The memory generates low-rank corrections to the backbone's attention computation during generation
4. **Integration**: Memory signals are injected at various points in the attention block (query, key, value, output)

### Update Strategies

- **Token-State Write (TSW)**: Updates per token for the finest granularity
- **Sequence-State Write (SSW)**: Updates per segment or message for robustness
- **Multi-State Write (MSW)**: Maintains multiple parallel memory matrices to reduce interference

## Performance Improvements

### Benchmark Results

- **Average Score**: 1.10× improvement over the frozen backbone, 1.15× over the strongest non-Δ-Mem baseline
- **MemoryAgentBench**: 1.31× improvement (memory-heavy tasks)
- **LoCoMo**: 1.20× improvement (long-term conversational memory)
- **TTL Subtask**: Near doubling of performance (1.9× improvement)

### Efficiency Metrics

- **Parameter Overhead**: Only 4.87M additional parameters (0.12% of a 4B model)
- **Inference Cost**: Independent of full context length
- **Memory Usage**: Comparable to standard Prefix/LoRA adaptation
- **Throughput**: Minimal reduction in decoding speed

## Key Advantages

1. **No Fine-Tuning Required**: Works with frozen models
2. **Compact State**: 8×8 matrix vs. traditional context windows
3. **Direct Attention Coupling**: Memory directly modulates attention computation
4. **Preserves General Capabilities**: No trade-off between memory augmentation and base performance
5. **Scalable**: Works across different model sizes (3B to 8B parameters)

## Comparison with Alternatives

| Approach | Context Extension | Retrieval | Parametric Adaptation | Δ-Mem |
|----------|------------------|-----------|----------------------|--------|
| **Cost** | High | Medium | High | **Low** |
| **Context Utilization** | Limited | Noisy | Fixed | **Adaptive** |
| **Inference Overhead** | O(n²) | O(n) | O(1) | **O(1)** |
| **Fine-tuning** | No | No | Yes | **No** |

## Technical Implementation

The memory system operates through four steps:

1. **Projection**: Hidden states are projected into memory-specific key, value, and query spaces
2. **Association**: The query reads associative signals from the prior memory state
3. **Correction**: Low-rank corrections are generated for the attention computation
4. **Update**: The delta-rule residual between the predicted and target memory value updates the state

## Research Context

Δ-Mem represents a shift toward modular AI systems in which memory, planning, and tool-use capabilities are added as lightweight modules to frozen LLMs, enabling more capable agents without the computational burden of full model retraining.

**Source**: arXiv:2605.12357 (May 2026) - "δ-mem: Efficient Online Memory for Large Language Models" by Lei et al.
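The association and update steps listed under Technical Implementation can be illustrated with the textbook delta rule on an 8×8 state. This is a minimal sketch, not the paper's implementation: the function names (`read`, `delta_write`), the step size `beta`, and the use of one-hot keys are assumptions made for illustration, and the learned projections and the low-rank attention corrections are left out entirely.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]

D = 8  # state is D x D, matching the 8x8 matrix described above


def zeros(d: int = D) -> Matrix:
    """Empty memory state."""
    return [[0.0] * d for _ in range(d)]


def unit(i: int, d: int = D) -> Vector:
    """One-hot key, used here only to keep the arithmetic exact."""
    v = [0.0] * d
    v[i] = 1.0
    return v


def read(state: Matrix, query: Vector) -> Vector:
    """Associative readout: state @ query."""
    return [sum(s * q for s, q in zip(row, query)) for row in state]


def delta_write(state: Matrix, key: Vector, value: Vector,
                beta: float = 1.0) -> Matrix:
    """Textbook delta rule: S <- S + beta * (value - S @ key) key^T.

    `beta` is a hypothetical step size; the source does not specify one.
    """
    predicted = read(state, key)
    residual = [v - p for v, p in zip(value, predicted)]
    return [
        [s + beta * r * k for s, k in zip(row, key)]
        for row, r in zip(state, residual)
    ]


# Write one association, then read it back with the same key.
state = zeros()
key = unit(0)
value = [float(i) for i in range(D)]
state = delta_write(state, key, value)
recalled = read(state, key)

# Writing a new value under the same key corrects the stored one
# instead of adding to it, because only the residual is written.
new_value = [1.0] * D
state = delta_write(state, key, new_value)
overwritten = read(state, key)

# A key orthogonal to everything written so far reads back zeros.
untouched = read(state, unit(1))

print(recalled)
print(overwritten)
print(untouched)
```

The state stays 8×8 no matter how many writes occur, which is the property behind the claim that inference cost is independent of full context length. With unit-norm keys and `beta = 1.0`, a write fully replaces the old association for that key; smaller step sizes would blend old and new values.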

Sources visited

- Visited: https://arxiv.org/abs/2605.12357
- Searched: Δ-Mem delta-mem large language models online memory efficient memory mechanism
- Visited: https://github.com/declare-lab/delta-Mem
- Visited: https://huggingface.co/papers/2605.12357
- Visited: https://agentwiki.org/delta_mem_external_memory
- Searched: Latent Space δ-mem 2026 article
- Visited: https://www.emergentmind.com/papers/2605.12357
- Searched: delta-rule learning associative memory large language models technical implementation
- Visited: https://conzit.com/post/-mem-efficient-online-memory-for-large-language-models
Shared by Proto Státis · May 16, 2026
