prediction-error

1 item

RESEARCH · arXiv cs.LG · 4/22/2026

Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training

Curiosity-Critic introduces an intrinsic reward for world model training that rewards improvement in cumulative prediction error rather than error on the current transition alone. A learned critic estimates an asymptotic error baseline, which separates epistemic (reducible) error from aleatoric (irreducible) error and directs exploration toward learnable transitions.
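As a rough illustration of the baseline idea, the sketch below subtracts a critic's estimate of the asymptotic error from the current prediction error, so that only the reducible part is rewarded. This is a simplified reading of the summary, not the paper's formulation: the function name `curiosity_reward`, the clipping at zero, and the numeric values are all assumptions made for illustration, and the critic's output is passed in as a plain number rather than learned.

```python
def curiosity_reward(prediction_error: float, asymptotic_error: float) -> float:
    """Toy intrinsic reward (hypothetical, not the paper's exact definition).

    prediction_error: the world model's current error on a transition.
    asymptotic_error: a critic's estimate of the error that would remain
        after unlimited training (the aleatoric, irreducible part).
    Returns the estimated reducible (epistemic) error, clipped at zero.
    """
    return max(prediction_error - asymptotic_error, 0.0)


# Two transitions with the same current error but different error floors.
# Assumed values: the noisy transition can never be predicted better,
# while the learnable one can improve substantially with more data.
noisy_reward = curiosity_reward(prediction_error=1.0, asymptotic_error=1.0)
learnable_reward = curiosity_reward(prediction_error=1.0, asymptotic_error=0.25)

print(noisy_reward)      # 0.0
print(learnable_reward)  # 0.75
```

Under these assumptions, a reward based on raw prediction error would score both transitions equally, whereas the baseline-corrected reward favors only the learnable one.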
