ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction

arXiv CS.CL · May 13, 2026

ReVision introduces a method to scale computer-use agents by reducing temporal visual redundancy in interaction trajectories. It employs a learned patch selector to remove redundant visual tokens, cutting token usage by approximately 46% and improving efficiency for multimodal language models across benchmarks.
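To make the idea of temporal visual redundancy concrete, here is a minimal sketch in Python. It is not ReVision's learned patch selector: it swaps in a simple hand-written heuristic that drops any patch whose pixels barely changed from the previous screenshot. The names `patch_changed`, `select_patches`, `token_savings`, and the `threshold` value are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Patch = Sequence[float]   # flattened pixel values of one image patch
Frame = List[Patch]       # one screenshot, represented as a list of patches


def patch_changed(prev: Patch, curr: Patch, threshold: float) -> bool:
    """Return True when the mean absolute pixel difference exceeds threshold."""
    diff = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
    return diff > threshold


def select_patches(frames: List[Frame], threshold: float = 0.05) -> List[List[int]]:
    """For each frame, list the indices of the patches to keep.

    Assumption: a fixed pixel-difference threshold stands in for the
    learned selector described in the summary.
    """
    kept: List[List[int]] = []
    for t, frame in enumerate(frames):
        if t == 0:
            # The first screenshot has no predecessor, so keep every patch.
            kept.append(list(range(len(frame))))
            continue
        prev = frames[t - 1]
        kept.append(
            [i for i, p in enumerate(frame) if patch_changed(prev[i], p, threshold)]
        )
    return kept


def token_savings(frames: List[Frame], kept: List[List[int]]) -> float:
    """Fraction of patch tokens removed across the whole trajectory."""
    total = sum(len(f) for f in frames)
    used = sum(len(k) for k in kept)
    return 1.0 - used / total


# Toy trajectory: three screenshots of four patches each; only patch 2
# changes, and only between the first and second screenshot.
frames = [
    [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 0.0)],
    [(0.0, 0.0), (1.0, 1.0), (0.9, 0.9), (0.0, 0.0)],
    [(0.0, 0.0), (1.0, 1.0), (0.9, 0.9), (0.0, 0.0)],
]
kept = select_patches(frames)
savings = token_savings(frames, kept)
```

In this toy trajectory the savings are large because almost nothing changes between screenshots; the figure has no bearing on the roughly 46% reported for ReVision, which comes from a learned selector evaluated on real interaction trajectories.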
