
How Language Models Process Negation

arXiv CS.CL · May 6, 2026

This study investigates how large language models (LLMs) mechanistically process negation. It finds that even open-weight models contain internal components that process negation correctly, although the models often give wrong answers. The poor accuracy is attributed to late-layer attention that promotes simple shortcuts. The models implement two mechanisms: attending to the negated phrase, and directly constructing a representation of the negative phrase.
