Regularized Centered Emphatic Temporal Difference Learning
arXiv CS.AI · May 7, 2026
This paper introduces Regularized Emphatic Temporal-Difference Learning (RETD) to address the trade-off among stability, projection geometry, and variance in off-policy temporal-difference learning. The method regularizes the auxiliary centering recursion so that the ETD key matrix remains positive definite, and the paper proves that the resulting algorithm converges.