RESEARCH · arXiv CS.AI · 5/7/2026
Regularized Centered Emphatic Temporal Difference Learning
This paper introduces Regularized Emphatic Temporal-Difference Learning (RETD) to address the trade-off among stability, projection geometry, and variance in off-policy temporal-difference learning. The method regularizes the auxiliary centering recursion so that the ETD key matrix remains positive definite, and the authors prove that it converges.
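For background, the sketch below shows one step of standard emphatic TD with linear features and lambda = 0, the baseline that methods in this family build on. It is not the paper's regularized method: the summary does not specify the centering recursion or its regularizer, so neither appears here. The function name `etd0_step`, its parameters, and the default values are illustrative assumptions.

```python
def etd0_step(theta, f_prev, rho_prev, phi, phi_next, reward, rho,
              gamma=0.9, alpha=0.01, interest=1.0):
    """One update of standard ETD(0) with linear value estimates.

    theta, phi, phi_next are equal-length lists of floats.
    rho and rho_prev are importance-sampling ratios pi(a|s) / mu(a|s)
    for the current and previous step (hypothetical inputs supplied by the caller).
    """
    # Follow-on trace: F_t = gamma * rho_{t-1} * F_{t-1} + i_t
    f = gamma * rho_prev * f_prev + interest

    # Linear value estimates and the one-step TD error
    v = sum(t * x for t, x in zip(theta, phi))
    v_next = sum(t * x for t, x in zip(theta, phi_next))
    delta = reward + gamma * v_next - v

    # With lambda = 0 the emphasis M_t equals the follow-on trace F_t
    new_theta = [t + alpha * f * rho * delta * x for t, x in zip(theta, phi)]
    return new_theta, f


# Usage: start with f_prev = 0 so the first trace equals the interest.
theta, f = etd0_step([0.0, 0.0], 0.0, 1.0, [1.0, 0.0], [0.0, 1.0],
                     reward=1.0, rho=1.0, gamma=0.9, alpha=0.1)
```

The follow-on trace `f` is a running product of importance ratios, so it can grow large under off-policy sampling; this is the variance side of the trade-off the summary refers to.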