RESEARCH · arXiv CS.AI · 5/7/2026
Agent Island: A Saturation- and Contamination-Resistant Benchmark from Multiagent Games
Agent Island is a new multiagent simulation environment for language models. It serves as a dynamic benchmark designed to mitigate saturation and contamination, ranking models such as openai/gpt-5.5 by their performance in games that involve cooperation, conflict, and persuasion.