Machine Learning Theory

RESEARCH · arXiv CS.LG · 5/8/2026

Are Flat Minima an Illusion?

This paper challenges the conventional view that flat minima inherently lead to better generalization, showing that function-preserving reparameterization can drastically alter a minimum's perceived sharpness. It introduces "weakness"—a reparameterization-invariant measure based on what the network does—as the actual driver of generalization, proving its minimax optimality and correlation with PAC-Bayes bounds.
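The reparameterization effect described in the summary can be reproduced on a toy problem. The sketch below is not the paper's construction and does not implement "weakness"; it assumes a hypothetical two-parameter model f(x) = a * b * x and uses the trace of the loss Hessian as one common sharpness proxy. Rescaling (a, b) to (alpha * a, b / alpha) leaves the function unchanged while the measured sharpness changes substantially.

```python
def loss(a, b, xs, ys):
    """Mean squared error of the toy model f(x) = a * b * x."""
    return sum((a * b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def hessian_trace(a, b, xs, ys, h=1e-3):
    """Trace of the loss Hessian in (a, b), via central finite differences."""
    l0 = loss(a, b, xs, ys)
    d2a = (loss(a + h, b, xs, ys) - 2 * l0 + loss(a - h, b, xs, ys)) / h**2
    d2b = (loss(a, b + h, xs, ys) - 2 * l0 + loss(a, b - h, xs, ys)) / h**2
    return d2a + d2b


# Toy data with target y = 2x, so every (a, b) with a * b = 2 is a global minimum.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [2.0 * x for x in xs]

# One minimum, and a function-preserving rescaling of it (alpha is an arbitrary choice).
a, b = 1.0, 2.0
alpha = 10.0
a2, b2 = alpha * a, b / alpha

sharp_before = hessian_trace(a, b, xs, ys)
sharp_after = hessian_trace(a2, b2, xs, ys)

print("same function:", abs(a * b - a2 * b2) < 1e-12)
print("sharpness before:", round(sharp_before, 2))
print("sharpness after: ", round(sharp_after, 2))
```

For this toy model the Hessian trace at a minimum equals 2 * mean(x^2) * (a^2 + b^2), so the two parameterizations give roughly 25 and 500 even though they compute the same function.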

RESEARCH · arXiv CS.AI · 28d ago

On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective

This research proposes distinguishing between capability elicitation and capability creation in large language model post-training. It argues that elicitation reweights existing behaviors within a model's accessible support, while creation changes that support itself, developing this through a free-energy view.
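One way to picture the elicitation side of this distinction is the standard KL-regularized (free-energy) objective. This is an illustrative assumption, not necessarily the paper's formulation: the symbols p_0 (base model distribution), r (reward), and beta (temperature) are introduced here only for the sketch.

```latex
q^{*} \;=\; \arg\max_{q}\; \mathbb{E}_{y \sim q}\,[\,r(y)\,] \;-\; \beta\,\mathrm{KL}(q \,\|\, p_0)
\quad\Longrightarrow\quad
q^{*}(y) \;=\; \frac{p_0(y)\, e^{r(y)/\beta}}{\sum_{y'} p_0(y')\, e^{r(y')/\beta}}
```

Because q*(y) = 0 wherever p_0(y) = 0, an update of this form can only reweight behaviors already inside the base model's support. Under this reading, creation would correspond to a change in the support of p_0 itself.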
