Posts

Featured

Recent Papers on LLM Agents: Memory, Negotiation, and Structural Failure

A recent set of arXiv papers looks at LLM agents from a less celebratory angle: not just what they can do, but why they keep failing in repeated, practical settings. ANNEAL examines how agents repeat the same mistakes when the symbolic structures behind task execution are never repaired. A negotiation paper asks whether modeling the other side is enough to bargain well. Another studies how persistent memory can hide safety problems through summarization. A fourth analyzes several agent interaction paradigms together inside one practical framework. Taken together, these papers point to a common shift: from adding more capability to understanding failure modes in process, memory, and coordination. [S1][S2][S9][S11]

Introduction: what these papers are about

All four papers appeared on arXiv in May 2026 and focus on different parts of the LLM agent stack. ANNEAL, from its title and abstract, is...

Latest Posts

Three Recent Papers on Making LLM Agent Execution More Reliable: SDOF, SkillSmith, and STAR

Two Axes for Reading LLM Agent Design: What the Agent Does and How It Runs

Designing Safer LLM Agents: Key Issues from Recent Papers

Why LLMs Lose Context in Multi-Turn Interaction: What Three New Papers Suggest About Causes and Responses

Three AI News Updates on Safer Agents, Multi-Turn Tool Use, and Infrastructure Scale

How Conversational LLM Agents Choose the Next Question: BALAR and PRISM

Can LLMs Reuse Tools Creatively? What CreativityBench Tries to Measure

Why Safety in LLM Agents May Depend More on Interaction Topology Than on the Model

When Do Tools Help LLM Agents, and When Do They Backfire?

Why Does LLM Diversity Shrink? Reconsidering Generative Diversity After Supervised Fine-Tuning