RESEARCH

TinyJudge: Unverifiable Constraint Alignment via Lightweight Specialist Ensembles

arXiv cs.CL · June 9, 2026

The paper introduces TinyJudge, a framework that uses an ensemble of specialized tiny language models (0.6B parameters) to produce lightweight, high-precision rewards for soft, unverifiable constraints in LLM instruction following. The approach targets two bottlenecks of traditional LLM-as-a-judge methods for constraint alignment: reward hacking and high computational overhead.
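To make the idea of a specialist ensemble concrete, here is a minimal sketch, not the paper's implementation. It assumes each specialist scores one soft constraint in [0, 1] and that the scores are simply averaged into a scalar reward. The summary does not say how TinyJudge aggregates its specialists, so the averaging rule, the names `concise_judge`, `polite_judge`, and `ensemble_reward`, and the keyword heuristics standing in for the 0.6B models are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical interface: a specialist maps (instruction, response) to a score in [0, 1].
Specialist = Callable[[str, str], float]


def concise_judge(instruction: str, response: str) -> float:
    """Toy stand-in for a 'conciseness' specialist (a real one would be a tiny LM)."""
    return 1.0 if len(response.split()) <= 20 else 0.0


def polite_judge(instruction: str, response: str) -> float:
    """Toy stand-in for a 'politeness' specialist (keyword check, illustration only)."""
    text = response.lower()
    return 1.0 if ("please" in text or "thank" in text) else 0.0


def ensemble_reward(
    instruction: str, response: str, specialists: Dict[str, Specialist]
) -> float:
    """Average the per-constraint scores into one reward (assumed aggregation rule)."""
    if not specialists:
        raise ValueError("at least one specialist is required")
    scores = [judge(instruction, response) for judge in specialists.values()]
    return sum(scores) / len(scores)


SPECIALISTS: Dict[str, Specialist] = {
    "concise": concise_judge,
    "polite": polite_judge,
}
```

Calling `ensemble_reward(instruction, response, SPECIALISTS)` returns a single float that a reinforcement learning loop could use as the reward for one sampled response. Keeping one small scorer per constraint means each can be swapped or retrained independently, which is the design intuition this sketch tries to illustrate.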

Tiny Models · Model Alignment · LLMs · Reinforcement Learning · Constraint Alignment
