
Harnesses for Inference-Time Alignment over Execution Trajectories

arXiv cs.LG · May 23, 2026

This research investigates harness engineering as an inference-time technique for large language model (LLM) agents, focusing on improving long-term performance through task decomposition and guided execution. It quantifies how design choices such as workflow granularity and guidance affect performance, and it identifies common failure modes such as over-decomposition and hallucinated execution.

inference, LLMs, machine learning, Task Decomposition, AI agents
