Skip a Layer or Loop It? Learning Program-of-Layers in LLMs
This research proposes "program-of-layers (PoLar)" for LLMs, in which pretrained layers can be dynamically skipped or looped at inference time, matching or improving accuracy with shorter execution paths. A lightweight prediction network learns to generate these customized programs, and the approach improves performance on mathematical reasoning benchmarks.
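To make the skip-or-loop idea concrete, here is a minimal sketch of executing such a program over a fixed stack of layers. It assumes a program can be encoded as one repeat count per layer (0 to skip, 1 for the usual single pass, 2 or more to loop); that encoding, the names `run_program` and `path_length`, and the toy scalar "layers" are illustrative assumptions, not the paper's actual format or prediction network.

```python
from typing import Callable, Sequence

# A "layer" here is any function from a hidden state to a hidden state.
# Real transformer layers act on tensors; floats keep the sketch runnable.
Layer = Callable[[float], float]


def run_program(layers: Sequence[Layer], repeats: Sequence[int], x: float) -> float:
    """Apply layer i exactly repeats[i] times, in order.

    Assumed encoding (hypothetical): 0 skips the layer, 1 runs it once,
    and larger values loop it.
    """
    if len(layers) != len(repeats):
        raise ValueError("need exactly one repeat count per layer")
    for layer, count in zip(layers, repeats):
        if count < 0:
            raise ValueError("repeat counts must be non-negative")
        for _ in range(count):
            x = layer(x)
    return x


def path_length(repeats: Sequence[int]) -> int:
    """Number of layer applications the program performs."""
    return sum(repeats)


# Three toy layers standing in for pretrained, frozen layers.
toy_layers = [
    lambda h: h + 1.0,
    lambda h: h * 2.0,
    lambda h: h - 3.0,
]

default_program = [1, 1, 1]  # the standard forward pass
custom_program = [2, 0, 1]   # loop layer 0 twice, skip layer 1

default_out = run_program(toy_layers, default_program, 1.0)
custom_out = run_program(toy_layers, custom_program, 1.0)
```

In this encoding the layer weights are untouched; only the execution order changes. A predictor like the one described above would, under this assumption, output the `repeats` vector for each input, and `path_length` gives the cost to compare against the default pass.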