Posts
Showing posts with the label tool use
Designing Safer LLM Agents: Key Issues from Recent Papers
Three AI News Updates on Safer Agents, Multi-Turn Tool Use, and Infrastructure Scale
Can LLMs Reuse Tools Creatively? What CreativityBench Tries to Measure
Why Do Long-Horizon Agents Break? Diagnosing Failure with HORIZON and Related Papers
Why Do Long-Horizon Agents Break? HORIZON and the Case for Diagnostic Evaluation