
SaFeR-Steer: Evolving Multi-Turn MLLMs via Synthetic Bootstrapping and Feedback Dynamics

arXiv cs.LG · April 21, 2026

SaFeR-Steer is a framework for improving the safety alignment of multimodal large language models (MLLMs) in multi-turn dialogues, where unsafe intent can escalate across turns and safety behavior can decay over long contexts. It combines synthetic bootstrapping with feedback dynamics, and the work also releases the STEER dataset for training and evaluation.

Safety, security, MLLMs, multi-turn, alignment
