deep-rl — AI articles, news & research

RESEARCH · arXiv cs.LG · 12d ago

Self-Play Reinforcement Learning under Imperfect Information in Big 2

This study develops a self-play reinforcement learning framework for the imperfect-information card game Big 2. It reports that PPO outperforms the value-approximating agents it is compared against, and that PPO benefits from entropy regularization and from self-play against the current policy.
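For readers unfamiliar with the term, entropy regularization in PPO can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `ppo_loss_with_entropy` and the default values `clip_eps=0.2` and `ent_coef=0.01` are assumptions chosen for the example, and it assumes illegal moves simply carry zero probability.

```python
import math

def ppo_loss_with_entropy(new_logp, old_logp, advantages, action_probs,
                          clip_eps=0.2, ent_coef=0.01):
    """Clipped PPO surrogate loss minus an entropy bonus (a value to minimize).

    Hypothetical helper for illustration; hyperparameters are assumed,
    not taken from the study.
    """
    surrogate = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated policy and the policy
        # that generated the self-play data.
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (minimum) of the unclipped and clipped objectives.
        surrogate.append(min(ratio * adv, clipped_ratio * adv))
    policy_loss = -sum(surrogate) / len(surrogate)

    # Mean Shannon entropy of the action distributions; zero-probability
    # (e.g. masked) actions contribute nothing.
    entropies = [-sum(p * math.log(p) for p in probs if p > 0.0)
                 for probs in action_probs]
    entropy = sum(entropies) / len(entropies)

    # Subtracting the entropy term rewards policies that keep exploring.
    total_loss = policy_loss - ent_coef * entropy
    return total_loss, policy_loss, entropy
```

A larger `ent_coef` pushes the policy toward more uniform action choices, which is one common way to keep a self-play agent from collapsing onto a narrow strategy too early.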

reinforcement-learning · self-play · imperfect-information-games