Showing posts with the label multi-agent systems

Safety, Efficiency, and Real-World Use of LLM Agents: Reading Four Recent arXiv Papers

Three New Papers on LLM Memory and Reasoning: ChatHealthAI, Traj-Evolve, and DELTAMEM

Why LLM Agent Evaluation Is Hard: Recent Papers on the Gap Between Benchmarks and Real Deployment

Recent Papers on LLM Agents: Memory, Negotiation, and Structural Failure

Three Recent Papers on Making LLM Agent Execution More Reliable: SDOF, SkillSmith, and STAR

Designing Safer LLM Agents: Key Issues from Recent Papers

LLM Agents and Scientific Discovery: What Four New arXiv Papers Suggest About the Next Wave of Automation