confidence estimation

2 items

RESEARCH · arXiv CS.CL · 5/4/2026

Confidence Estimation in Automatic Short Answer Grading with LLMs

This work investigates confidence estimation for Large Language Models (LLMs) in Automatic Short Answer Grading (ASAG), a capability essential for human-AI collaboration in education. It compares model-based confidence estimation strategies and proposes a hybrid framework to address their limitations.

education LLMs AI grading human-AI interaction

RESEARCH · arXiv CS.CL · 21d ago

Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

This paper introduces Stepwise Confidence Attribution (SCA), a framework for closed-source LLMs that diagnoses multi-step reasoning failures by assigning step-level confidence. SCA applies the Information Bottleneck principle, flagging deviations from consensus structures as potential errors, and proposes two complementary methods: NIBS and GIBS.

LLMs information bottleneck reasoning confidence estimation