Open-world AI — AI articles, news & research

RESEARCH · arXiv cs.LG · 19d ago

GROW: Aligning GRPO with State-Action Modeling for Open-World VLM Agents

This paper introduces GROW, a reinforcement learning (RL) framework for open-world vision-language model (VLM) agents that addresses limitations of existing supervised fine-tuning (SFT) methods. It adapts Group Relative Policy Optimization (GRPO) by decomposing each trajectory into state-action samples instead of treating the whole trajectory as a single unit.
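
To make the idea concrete, here is a minimal sketch of how a group-relative advantage might be attached to individual state-action samples. It is not the paper's implementation. The function names, the toy trajectories, and the choice to give every step the advantage of its parent trajectory are all assumptions made for illustration. Only the group normalization (reward minus group mean, divided by group standard deviation) follows the standard GRPO formulation.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style normalization of rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def to_state_action_samples(trajectories, rewards):
    """Flatten trajectories into (state, action, advantage) samples.

    Assumption for illustration: each step inherits the advantage of the
    trajectory it came from. The paper may assign credit differently.
    """
    advantages = group_relative_advantages(rewards)
    samples = []
    for trajectory, advantage in zip(trajectories, advantages):
        for state, action in trajectory:
            samples.append((state, action, advantage))
    return samples

# Hypothetical group of two rollouts for the same task.
group = [
    [("s0", "move"), ("s1", "mine")],  # rollout that succeeded
    [("s0", "jump")],                  # rollout that failed
]
samples = to_state_action_samples(group, rewards=[1.0, 0.0])
```

Under these assumptions, both steps of the successful rollout receive a positive advantage and the single step of the failed rollout receives a negative one, so a policy update can be computed per state-action sample.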

VLM Agents · Policy optimization · Open-world AI · Reinforcement learning