Posts

Showing posts with the label AI safety

Why Don’t LLM Agents Act as They Explain? The Faithfulness Gap in 3 Recent Papers

Three Recent arXiv Papers on LLM Agent Safety and Reliability: Guardrails, Hallucination Mitigation, and Self-Improvement Evaluation

Designing Safer LLM Agents: Key Issues from Recent Papers

Why LLM Agents Still Struggle With Scientific Reasoning: Limits and Responses From Recent Papers