Posts
Showing posts with the label LLM agents
LLM Agents and Scientific Discovery: What Four New arXiv Papers Suggest About the Next Wave of Automation
DreamProver and AGEL-Comp: What LLM Agents Need to Reason Better and Generalize Further
Three Recent Papers on Making LLM Agents More Stable in Planning and Reasoning
Two Ways to Stabilize LLM Agents on Complex Tasks: Hierarchical Planning and CAP-CoT
How LLM Agents Combine Decision-Making and Skill Use in Long-Horizon Tasks
Tool Choice and Interpretability in LLM Agents: Key Ideas from Three Recent Papers
Why LLM Agents Still Struggle With Scientific Reasoning: Limits and Responses From Recent Papers
Why LLM Agents Stay Unstable: Three Recent arXiv Papers on Reliability, Web Skill Learning, and Reasoning Limits
Why Do Long-Horizon Agents Break? Diagnosing Failure with HORIZON and Related Papers
Why Do Long-Horizon Agents Break? HORIZON and the Case for Diagnostic Evaluation
How LLM Agents Handle Real Work and Exploration Problems: Four Recent Papers in Brief