RESEARCH · arXiv cs.LG · 18d ago
Harnesses for Inference-Time Alignment over Execution Trajectories
This research investigates harness engineering as an inference-time technique for large language model (LLM) agents, focusing on improving long-term performance through task decomposition and guided execution. It quantifies how design choices such as workflow granularity and guidance affect performance, and it identifies common failure modes, including over-decomposition and hallucinated execution.