Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees
This research addresses decision-making in environments shaped by strategic adversaries or other external factors, where traditional policies can fail catastrophically in safety-critical settings. It proposes an optimistic policy learning approach that accounts for these interactions and provides both regret and violation guarantees.