Fair Outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions

arXiv cs.AI · May 18, 2026

This paper examines the gap between the fair outputs of large language models (LLMs) and the biases latent in their internal representations, in high-stakes decisions such as mortgage underwriting. It reports that even when LLMs show no bias in their outputs, they retain and amplify internal demographic representations that can cause decision reversals, and that this latent bias is asymmetric.

LLM bias · machine learning · causality · AI ethics · fairness
