RESEARCH
TinyJudge: Unverifiable Constraint Alignment via Lightweight Specialist Ensembles
arXiv cs.CL · June 9, 2026
The paper introduces TinyJudge, a framework that uses an ensemble of specialized tiny language models (0.6B parameters) to provide lightweight, high-precision rewards for soft, unverifiable constraints in LLM instruction following. The approach targets two bottlenecks of traditional LLM-as-a-judge methods for constraint alignment: reward hacking and high computational overhead.