New agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.

AI agents handling complex, multi-step tasks face a fundamental bottleneck: their context windows overflow with irrelevant information while critical details get buried. Researchers at the National University of Singapore tackled this problem with MRAgent, a framework that ditches the standard "retrieve-then-reason" pipeline in favor of dynamic memory reconstruction integrated directly into the LLM's reasoning process.

The core insight is simple but powerful. Instead of passively retrieving information once and hoping it sticks, MRAgent allows agents to actively rebuild their memory as they accumulate evidence across reasoning steps. This keeps the agent focused on what matters for the current task, not everything that might matter.

The efficiency gains are substantial. MRAgent uses 118,000 tokens per query. Compare that to LangMem, a competing agentic memory framework that burns through 3.26 million tokens for the same work. That's roughly a 27.6-fold reduction in token consumption (3,260,000 divided by 118,000). Fewer tokens mean faster execution and dramatically lower inference costs, a practical win for anyone deploying these systems at scale.
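To make the arithmetic concrete, here is a small sketch that derives the reduction factor from the two reported token counts and turns it into a per-query cost. The price of $1.00 per million tokens is a placeholder assumption chosen for easy math, not a figure from the researchers or from any provider's price list.

```python
# Token counts per query as reported in the article.
MRAGENT_TOKENS_PER_QUERY = 118_000
LANGMEM_TOKENS_PER_QUERY = 3_260_000

# Hypothetical price, used only for illustration.
HYPOTHETICAL_USD_PER_MILLION_TOKENS = 1.00


def reduction_factor(baseline_tokens: int, improved_tokens: int) -> float:
    """How many times fewer tokens the improved system uses."""
    return baseline_tokens / improved_tokens


def cost_per_query(tokens: int, usd_per_million_tokens: float) -> float:
    """Token cost of one query at a flat per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


ratio = reduction_factor(LANGMEM_TOKENS_PER_QUERY, MRAGENT_TOKENS_PER_QUERY)
mragent_cost = cost_per_query(MRAGENT_TOKENS_PER_QUERY, HYPOTHETICAL_USD_PER_MILLION_TOKENS)
langmem_cost = cost_per_query(LANGMEM_TOKENS_PER_QUERY, HYPOTHETICAL_USD_PER_MILLION_TOKENS)

print(f"Reduction factor: {ratio:.1f}x")
print(f"MRAgent: ${mragent_cost:.3f} per query, LangMem: ${langmem_cost:.3f} per query")
```

At any flat per-token price the cost ratio equals the token ratio, so the placeholder price changes the dollar amounts but not the comparison.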

The framework addresses a real failure mode in current agentic systems. As reasoning chains grow longer, retrieval-augmented generation (RAG) systems retrieve increasingly noisy results. Static memory snapshots become outdated. Agents lose track of what they've already determined and spiral into computational waste, re-solving problems they'd already solved earlier in their reasoning chain.

MRAgent integrates memory updates into the LLM's forward pass itself. The agent rebuilds its memory state based on what it's learned so far, not based on a fixed retrieval strategy. This tighter coupling between reasoning and memory management means the agent only holds onto information it actually uses.
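The difference between retrieving once and rebuilding memory at every step can be shown with a toy loop. This is a minimal sketch of the general idea only, not MRAgent's actual algorithm: it works at the level of an outer agent loop rather than inside a model, and the names `overlap`, `rebuild_memory`, `run_agent`, the word-overlap scoring, and the sample notes are all hypothetical stand-ins.

```python
from __future__ import annotations

# Hypothetical memory store: three related notes and one distractor.
STORE = [
    "build fails inside parser module",
    "parser module imports tokenizer v2",
    "tokenizer v2 removed peek method",
    "office coffee machine broken",
]
QUERY = "why build fails"


def overlap(note: str, focus: str) -> int:
    """Count shared lowercase words; a toy stand-in for a relevance model."""
    return len(set(note.lower().split()) & set(focus.lower().split()))


def rebuild_memory(store: list[str], query: str, evidence: list[str], budget: int) -> list[str]:
    """Re-select notes relevant to the query plus all evidence gathered so far."""
    focus = " ".join([query, *evidence])
    ranked = sorted(store, key=lambda note: overlap(note, focus), reverse=True)
    return [note for note in ranked[:budget] if overlap(note, focus) > 0]


def run_agent(store: list[str], query: str, steps: int, budget: int) -> tuple[list[str], list[str]]:
    """Rebuild memory before each step and adopt one unused note as new evidence."""
    evidence: list[str] = []
    memory: list[str] = []
    for _ in range(steps):
        memory = rebuild_memory(store, query, evidence, budget)
        new_note = next((note for note in memory if note not in evidence), None)
        if new_note is None:  # nothing new to learn; stop early
            break
        evidence.append(new_note)
    return evidence, memory


# One-shot retrieval sees only the query, so it finds a single note.
one_shot = rebuild_memory(STORE, QUERY, [], budget=3)

# Rebuilding each step follows the chain from build failure to tokenizer change.
evidence, memory = run_agent(STORE, QUERY, steps=5, budget=3)

print("one-shot:", one_shot)
print("rebuilt: ", evidence)
```

In this toy setup the one-shot pass never reaches the tokenizer note because it shares no words with the original query, while the step-by-step rebuild reaches it through intermediate evidence and never pulls in the unrelated note.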

This approach matters for real-world deployment. Long-horizon tasks like research synthesis, code debugging, or multi-step planning require dozens or hundreds of reasoning steps.
