RESEARCH28
Can We Locate and Prevent Stereotypes in LLMs?
arXiv cs.CL · April 23, 2026
This study investigates where stereotypes reside in LLMs such as GPT-2 Small and Llama 3.2. It examines individual neuron activations and attention heads to map "bias fingerprints" and offers initial insights for mitigation.