← heapsort

Evaluation Revisited: A Taxonomy of Evaluation Concerns in Natural Language Processing

arXiv CS.CL · April 30, 2026

Prompted by recent LLM advances, this paper conducts a scoping review of NLP's long history of methodological reflection on evaluation concerns. It develops a taxonomy, synthesizing recurring positions and trade-offs, and provides a structured checklist to support deliberate evaluation design and interpretation.
