Can We Locate and Prevent Stereotypes in LLMs?

arXiv cs.CL · April 23, 2026

This study investigates where stereotypes reside in LLMs such as GPT-2 Small and Llama 3.2. It examines individual neuron activations and attention heads to map "bias fingerprints" and offers initial insights for mitigation.

neural networks, LLMs, bias detection, bias mitigation, AI ethics