Large Language Models (LLMs)

3 items

RESEARCH · arXiv CS.CL · 4/15/2026

Robust Explanations for User Trust in Enterprise NLP Systems

This research proposes a unified black-box framework for evaluating the robustness of token-level explanations, with the aim of improving user trust in enterprise NLP systems, especially during migration to LLMs. It operationalizes robustness as the top-token flip rate under realistic perturbations and systematically compares encoder and decoder architectures, including BERT, RoBERTa, Qwen, and Llama.

model robustness · Explainable AI (XAI) · User Trust · Large Language Models (LLMs)

ARTICLE · DEV.to AI · 4/19/2026

Gemma-4 Deployment Woes, `easyaligner` for Audio, & Claude Enterprise Privacy

This article covers practical challenges in deploying Google's Gemma-4 model, introduces `easyaligner`, a new open-source tool for speech-text alignment, and discusses data privacy considerations for Claude Enterprise users.

Open Source · MLOps · data privacy · Large Language Models (LLMs)

RESEARCH · arXiv CS.CL · 4/27/2026

Outcome Rewards Do Not Guarantee Verifiable or Causally Important Reasoning

This paper investigates whether outcome rewards in reinforcement learning for chain-of-thought reasoning guarantee that an LLM's reasoning is verifiable or causally important to its answer. The authors introduce two metrics, Causal Importance of Reasoning (CIR) and Sufficiency of Reasoning (SR), and find that reinforcement learning with verifiable rewards (RLVR) improves accuracy but does not reliably improve CIR or SR. A small amount of supervised fine-tuning (SFT) can remedy these shortfalls.

reinforcement learning · AI training · Large Language Models (LLMs) · Model Evaluation