RESEARCH · arXiv cs.LG · 22d ago
TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination
This paper introduces TeamTR, a trust-region framework for fine-tuning multi-agent LLM systems that addresses structural failures in sequential fine-tuning. The authors prove that stale-occupancy evaluation incurs a penalty that grows quadratically with the number of agents, and they report an average performance improvement of 7.1%.