
attention mechanisms

28 items

RESEARCH · arXiv CS.LG · 4/21/2026

CGCMA: Conditionally-Gated Cross-Modal Attention for Event-Conditioned Asynchronous Fusion

This paper studies asynchronous alignment in multimodal learning, where a dense primary stream must be fused with sporadic external context, so the model has to reason explicitly about how fresh and how trustworthy that context is. It proposes CGCMA (Conditionally-Gated Cross-Modal Attention), a model that separates text-conditioned grounding from lag-aware trust control, and tests it on cryptocurrency markets.

RESEARCH · arXiv CS.AI · 29d ago

Where Reliability Lives in Vision-Language Models: A Mechanistic Study of Attention, Hidden States, and Causal Circuits

This research tests the "Attention-Confidence Assumption" in Vision-Language Models (VLMs) and finds that attention structure is a near-zero predictor of correctness. The study uses a unified mechanistic pipeline (VLM Reliability Probe) to analyze attention, generation dynamics, and hidden-state geometry across three VLM families.

RESEARCH · arXiv CS.AI · 13d ago

LaneRoPE: Positional Encoding for Collaborative Parallel Reasoning and Generation

LaneRoPE is a technique for parallel Large Language Model (LLM) generation that lets multiple sequences coordinate and collaborate at test time. It combines an inter-sequence attention mask with a RoPE extension that injects positional information, and it shows promising results on mathematical reasoning tasks.

RESEARCH · arXiv CS.CL · 5/6/2026

How Language Models Process Negation

This study investigates how Large Language Models (LLMs) mechanistically process negation. It finds that open-weight models possess internal components that process negation correctly, even though the models often give wrong answers. The poor accuracy is attributed to late-layer attention that promotes simple shortcuts. The models implement two mechanisms: attending to negated phrases and directly constructing representations of negative phrases.
