robustness

14 items

RESEARCH · arXiv CS.LG · 4/13/2026

Robust Reasoning Benchmark

This study proposes a new perturbation pipeline to evaluate the robustness of LLM reasoning, applying it to the AIME 2024 dataset. While frontier models show resilience, open-weight models suffer catastrophic accuracy drops, exposing structural fragility and potential issues with working memory or mechanical parsing.

robustness LLMs Model Evaluation Reasoning

RESEARCH · arXiv CS.AI · 4d ago

Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges

This study examines the stability and manipulability of LLM judges in evaluation pipelines, finding that while they are stable under neutral reevaluation, they become reversible under targeted post-decision challenge. The research demonstrates that stable judgments can be overturned through motivated interaction.

robustness LLMs evaluation Benchmarking

ARTICLE · DEV.to AI · 4/8/2026

Announcing the OpenAI Safety Fellowship

The OpenAI Safety Fellowship is a research program focused on AI safety, addressing critical aspects such as robustness, interpretability, and alignment with human values. The text details its objectives and technical components, such as adversarial training and explainability techniques.

robustness OpenAI interpretability alignment

RESEARCH · arXiv CS.CL · 5d ago

A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models

A large-scale empirical study assesses the robustness of linguistic signals for characterizing AI-generated text. The analysis shows that classifiers based solely on linguistic features reliably distinguish AI-generated from human-written text, highlighting lexical richness as a robust indicator.

robustness LLMs AI-generated text text detection

RESEARCH · arXiv CS.LG · 4/22/2026

The Cost of Relaxation: Evaluating the Error in Convex Neural Network Verification

This paper evaluates the worst-case divergence between original neural networks and their convex relaxations, which are used in verification systems to improve performance at the cost of soundness. The study provides analytical upper and lower bounds for the error, demonstrating it grows exponentially with network depth and linearly with the input's radius.

robustness neural networks mathematical analysis Verification

RESEARCH · arXiv CS.LG · 29d ago

Robustness of Refugee-Matching Gains to Off-Policy Evaluation Choices

This paper demonstrates the stability of counterfactual impact evaluation results in the context of refugee matching in the United States, using a range of off-policy evaluation methods. The impact estimates remain consistent in magnitude and statistically significant, confirming original findings.

robustness impact evaluation off-policy evaluation refugee matching

RESEARCH · DEV.to AI · 5/5/2026

Robust Invisible Video Watermarking with Attention

This research presents a novel robust invisible video watermarking method that leverages attention mechanisms to enhance imperceptibility and resilience against attacks.

robustness video watermarking deep learning security

RESEARCH · arXiv CS.LG · 17d ago

Double descent for least-squares interpolation on contaminated data: A simulation study

This research investigates the "double descent" phenomenon in overparametrized models, which allows for improved generalization despite classical overfitting concerns. The study specifically explores this effect in linear regression with contaminated training data, finding that significant overparametrization enables double descent even when the training data are contaminated.

robustness double descent machine learning overfitting

RESEARCH · arXiv CS.CL · 7d ago

A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models

A new multi-domain red teaming framework was developed to evaluate the safety, robustness, and fairness of medical Large Language Models (LLMs) across 690 clinically grounded scenarios. The research revealed substantial performance variance and critical failures in safety-critical scenarios, even in high-performing systems.

robustness Safety Healthcare security

RESEARCH · arXiv CS.AI · 7d ago

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

This paper introduces a missing layer in optimization pipelines to address the post-solve robustness gap in Mixed-Integer Linear Programming (MILP) decision engines. It formalizes an epsilon-near-optimal feasible neighborhood and solution smoothness to assess how far a solved incumbent can be trusted under parameter perturbations.

robustness Optimization Perturbations Decision Engines

RESEARCH · arXiv CS.AI · 26d ago

Think Twice, Act Once: Verifier-Guided Action Selection For Embodied Agents

This paper proposes Verifier-Guided Action Selection (VegAS), a test-time framework to enhance the robustness of MLLM-based embodied agents. It uses a generative verifier to identify the most reliable action choice from an ensemble of candidates.

robustness MLLM embodied agents Verification

RESEARCH · arXiv CS.CL · 14d ago

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs

EchoDistill is an alignment-based self-distillation framework designed to make Audio Large Language Models (ALLMs) robust to real-world noise. It leverages a frozen clean-audio teacher to guide an inference-time noisy-audio student, optimizing responses via group-relative policy optimization and token-level consistency.

robustness Audio LLMs machine learning Self-Distillation

RESEARCH · arXiv CS.LG · 4/8/2026

Learning Stable Predictors from Weak Supervision under Distribution Shift

This research paper formalizes "supervision drift" in CRISPR-Cas13d experiments, analyzing model robustness under distribution shift, including when the supervision mechanism itself changes. Using a non-IID benchmark, it demonstrates good in-domain performance, but failure in temporal transfer and only partial success in transfer across cell lines.

robustness distribution shift Transfer Learning machine learning

RESEARCH · arXiv CS.AI · 5/6/2026

Stable Agentic Control: Tool-Mediated LLM Architecture for Autonomous Cyber Defense

The paper introduces a tool-mediated LLM architecture for autonomous cyber defense, designed to provide formal guarantees for high-stakes decision-making under adversarial pressure. It certifies controllability, observability, and Input-to-State Stability (ISS) robustness through a machine-checked Lyapunov function, demonstrating its effectiveness on real enterprise attack graphs.

robustness cybersecurity security formal methods