RESEARCH · arXiv CS.CL · 4/23/2026
Can We Locate and Prevent Stereotypes in LLMs?
This study investigates where stereotypes reside in LLMs such as GPT-2 Small and Llama 3.2. It examines individual neuron activations and attention heads to map "bias fingerprints" and offers initial insights for mitigation.