← heapsort

Regularized Centered Emphatic Temporal Difference Learning

arXiv CS.AI · May 7, 2026

This paper introduces Regularized Emphatic Temporal-Difference Learning (RETD) to address the trade-off among stability, projection geometry, and variance in off-policy temporal-difference learning. The method regularizes the auxiliary centering recursion so that the ETD key matrix remains positive definite, and the paper proves that the resulting algorithm converges.
