ABLE: Representing and Mapping LLMs via Attribution-Based Large-model Embedding
ABLE (Attribution-Based Large-model Embedding) introduces a framework for representing large language models by leveraging interpretability space through attribution-based embeddings. It addresses challenges in systematic model comparison by aggregating gradient-based feature attributions to capture model-specific input-sensitivity patterns.




