Natural Language Processing

168 items

RESEARCH · arXiv CS.CL · 7d ago

CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards

This paper proposes CSRP, a three-stage framework for Chinese Grammatical Error Correction (CGEC) using Large Language Models (LLMs). It addresses the limitations of general-purpose models and of metric optimization through continual pre-training, Chain-of-Thought supervised fine-tuning (SFT), and policy optimization with efficiency-aware rewards that penalize unnecessary edits. It reports state-of-the-art performance on the NACGEC benchmark.

reinforcement learning Grammar Correction Natural Language Processing AI Research
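
The idea of an efficiency-aware reward in the CSRP summary above can be illustrated with a minimal sketch. This is not the paper's reward: the function names, the character-level edit distance, and the `penalty` weight are all assumptions chosen for illustration. The reward scores closeness to the reference and subtracts a penalty for edits beyond the number the reference itself requires.

```python
def levenshtein(a, b):
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def efficiency_aware_reward(source, hypothesis, reference, penalty=0.5):
    """Hypothetical reward: correctness minus a penalty for excess edits."""
    # Correctness term: 1.0 when the hypothesis equals the reference.
    longest = max(len(hypothesis), len(reference), 1)
    correctness = 1.0 - levenshtein(hypothesis, reference) / longest
    # Efficiency term: edits made beyond what the reference needed.
    needed = levenshtein(source, reference)
    made = levenshtein(source, hypothesis)
    excess = max(0, made - needed) / max(len(source), 1)
    return correctness - penalty * excess
```

Under these assumptions, a hypothesis that matches the reference exactly scores 1.0, while one that rewrites more of the sentence than necessary scores lower on both terms.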

RESEARCH · arXiv CS.CL · 7d ago

lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

This paper describes a system for SemEval-2026 Task 1, which focuses on constrained humor generation. The approach uses a

evaluation Natural Language Processing humor generation AI Research

RESEARCH · arXiv CS.AI · 28d ago

SkillLens: Adaptive Multi-Granularity Skill Reuse for Cost-Efficient LLM Agents

SkillLens is a hierarchical skill-evolution framework for LLM agents that organizes and reuses skills at mixed granularity. It allows agents to directly reuse compatible subskills while adapting only locally mismatched parts, optimizing cost-efficiency and relevance.

Skill reuse LLM Agents AI frameworks Natural Language Processing

RESEARCH · arXiv CS.CL · 25d ago

Merging Methods for Multilingual Knowledge Editing for Large Language Models: An Empirical Odyssey

This paper investigates the effectiveness of vector merging methods for multilingual knowledge editing (MKE) in Large Language Models, focusing on reducing interference between language-specific edits. Evaluating six merging variants across two LLMs, two editing methods, and 12 languages on the MzsRE benchmark, it finds vector summation with shared covariance to be the most reliable overall strategy.

multilingual LLMs Natural Language Processing Vector Merging Knowledge Editing

RESEARCH · arXiv CS.CL · 26d ago

TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models

TimelineReasoner is a novel framework that leverages Large Reasoning Models (LRMs) to advance timeline summarization, moving beyond passive Large Language Model (LLM) generation. It employs a two-stage, reasoning-driven process—Global Cognition and Detail Exploration—to actively extract and refine structured timelines from unstructured online news content.

timeline-summarization Natural Language Processing Reasoning large language models

RESEARCH · arXiv CS.CL · 22d ago

DiscoExplorer: An Open Interface for the Study of Multilingual Discourse Relations

DiscoExplorer introduces an open-source web interface designed to facilitate the study and cross-linguistic comparison of discourse relations across 16 languages. This tool addresses the complexity of relevant data and the lack of accessible interfaces in computational linguistics and pragmatics by providing query, search, and visualization features.

Discourse Relations Open Source Natural Language Processing Computational Linguistics

RESEARCH · arXiv CS.AI · 26d ago

State-Centric Decision Process

The State-Centric Decision Process (SDP) is a new framework addressing the lack of runtime structure in language environments, such as web browsers, which emit raw text instead of states. It enables an agent to construct missing MDP inputs, like state space and certified transitions, by taking actions and checking observations against natural-language predicates.

Decision Processes reinforcement learning Natural Language Processing AI agents

RESEARCH · arXiv CS.CL · 18d ago

Residual Skill Optimization for Text-to-SQL Ensembles

DivSkill-SQL introduces a residual skill optimization framework to build complementary Text-to-SQL ensembles, improving accuracy by targeting marginal contributions to Pass@K. It achieves significant accuracy gains on Spider2-Lite for Snowflake and BigQuery over existing ensemble baselines.

ensemble methods Text-to-SQL machine learning Natural Language Processing
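
To make "marginal contribution to Pass@K" concrete, here is a minimal sketch of greedy ensemble selection. It is an illustrative assumption, not the DivSkill-SQL method: `greedy_ensemble`, `solved_by`, and the model names are hypothetical. Each candidate is scored by how many problems it solves that the ensemble chosen so far does not.

```python
def greedy_ensemble(solved_by, k):
    """Pick up to k members, each maximizing newly solved problems.

    solved_by maps a (hypothetical) member name to the set of problem
    ids it solves. Returns the chosen members and the covered problems.
    """
    chosen, covered = [], set()
    candidates = dict(solved_by)
    for _ in range(min(k, len(candidates))):
        # Marginal contribution: problems solved that are not yet covered.
        best = max(candidates, key=lambda m: len(candidates[m] - covered))
        if not candidates[best] - covered:
            break  # no remaining member adds anything new
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered


members = {"a": {1, 2, 3}, "b": {1, 2, 3, 4}, "c": {5}}
chosen, covered = greedy_ensemble(members, k=2)
```

In this toy example the selection prefers `c` over `a` as the second member: `a` is more accurate on its own, but everything it solves is already covered by `b`.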

RESEARCH · arXiv CS.CL · 6d ago

IdiomX: A Multilingual Benchmark for Idiom Understanding, Retrieval, and Interpretation

IdiomX is a large-scale multilingual benchmark introduced to address the challenges of idiomatic expressions in natural language processing. It contains over 190K contextualized examples spanning 12K+ idioms with aligned semantic representations in English, Arabic, and French.

language models Natural Language Processing datasets Benchmarks

RESEARCH · arXiv CS.CL · 14d ago

Raon-Speech Technical Report

Raon-Speech is a top-performing 9B-parameter speech language model (SpeechLM) for English and Korean speech understanding, answering, and generation, achieving strong overall results across 42 benchmarks. It successfully transforms a pre-trained LLM into a SpeechLM while preserving strong text capabilities through specific training stages.

multimodal AI Benchmarking Natural Language Processing large language models

RESEARCH · arXiv CS.CL · 15d ago

Knowledge Distillation for Low-Resource Open-source Text-to-SQL Model

This paper proposes a knowledge-aware Text-to-SQL framework to convert natural language questions into executable SQL queries, even in low-resource settings. It addresses challenges like scarce annotated data and opaque schema definitions by injecting task-specific knowledge into both training and inference.

Knowledge Distillation Text-to-SQL Low-Resource AI Natural Language Processing

RESEARCH · arXiv CS.AI · 15d ago

PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

This research paper introduces 'PathCal', investigating the distinct functional roles and timing of reflection markers in Large Reasoning Language Models' Chain-of-Thought trajectories. It reveals that markers like 'wait' or 'but' differ significantly in their impact on accuracy and generation length, challenging previous coarse-grained approaches.

Natural Language Processing Chain-of-Thought Reasoning large language models

RESEARCH · arXiv CS.CL · 15d ago

Query-Adaptive Semantic Chunking for Retrieval-Augmented Generation: A Dynamic Strategy with Contextual Window Expansion

This paper introduces Query-Adaptive Semantic Chunking (QASC), a dynamic strategy for Retrieval-Augmented Generation (RAG) systems that integrates user queries into document segmentation. QASC employs cosine similarity scoring, contextual window expansion, and chunk-level score aggregation to optimize context retrieval, addressing limitations of fixed chunking methods.

RAG Natural Language Processing Information Retrieval Semantic Chunking
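
The three steps named in the QASC summary above (cosine similarity scoring, contextual window expansion, chunk-level score aggregation) can be sketched as follows. This is a toy under stated assumptions, not the paper's implementation: the bag-of-words `embed` stands in for a real sentence encoder, and `query_adaptive_chunks`, `threshold`, `window`, and mean aggregation are hypothetical choices.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real system uses an encoder)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def query_adaptive_chunks(sentences, query, threshold=0.2, window=1):
    q = embed(query)
    # Step 1: score every sentence against the query.
    scores = [cosine(embed(s), q) for s in sentences]
    # Step 2: window expansion keeps neighbours of each matching sentence.
    keep = set()
    for i, score in enumerate(scores):
        if score >= threshold:
            keep.update(range(max(0, i - window),
                              min(len(sentences), i + window + 1)))
    # Step 3: merge consecutive kept sentences into chunks and
    # aggregate their sentence scores (mean) into one chunk score.
    runs, current = [], []
    for i in range(len(sentences)):
        if i in keep:
            current.append(i)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    chunks = [(" ".join(sentences[i] for i in run),
               sum(scores[i] for i in run) / len(run)) for run in runs]
    return sorted(chunks, key=lambda c: c[1], reverse=True)
```

Unlike fixed-size chunking, the chunk boundaries here depend on the query: the same document yields different chunks for different questions.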

RESEARCH · arXiv CS.CL · 6d ago

Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States

This paper finds that linear probes, often used to identify distinct reasoning representations in LLM hidden states, actually detect task format rather than reasoning mode. The high probe accuracy observed on benchmarks with Qwen3-14B vanished once format variables were controlled for, suggesting that reasoning is largely shared across modes and is not functionally linked to hidden-state geometry.

Benchmarking Natural Language Processing Model Analysis AI Research

RESEARCH · arXiv CS.CL · 8d ago

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

This research paper investigates global narrative dominance in Large Language Models (LLMs), where local cultural knowledge is often overshadowed by global narratives. It introduces the CulturalNB dataset for Bengali cultural contexts and demonstrates that questions asked in English tend to increase global substitution and institutional framing, reducing local perspective coverage.

Dataset Cross-lingual Cultural Bias Natural Language Processing

RESEARCH · arXiv CS.AI · 15d ago

NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic

NeuroNL2LTL is a neurosymbolic architecture that unifies learned translation with formal verification to translate natural language into Linear Temporal Logic. It employs verifier-in-the-loop training, where verification outcomes serve as reward signals for reinforcement learning, optimizing for formal correctness.

reinforcement learning Neurosymbolic AI Formal verification Natural Language Processing

RESEARCH · arXiv CS.CL · 6d ago

Translating Classical Poetry into Modern Prose

Padyam2Gadyam is a new dataset for poem-to-prose translation, covering the translation of 13th- to 17th-century classical Telugu poetry into contemporary Telugu and English prose. An evaluation of five Large Language Models on this dataset found that their overall performance leaves significant room for improvement.

poetry LLMs Translation Natural Language Processing

RESEARCH · arXiv CS.AI · 12d ago

Soro: A Lightweight Foundation Model and Chatbot for Tajik

Soro is a family of Tajik-specialized conversational large language models (LLMs) designed for deployment in Tajikistan under tight compute constraints. Developed from open-weight Gemma 3 checkpoints and continually pretrained on a 1.9-billion-token Tajik corpus, it substantially outperforms baselines on new Tajik benchmarks.

Tajik Language Benchmarking Chatbot Natural Language Processing

ARTICLE · DEV.to AI · 26d ago

Helping ChatGPT better recognize context in sensitive conversations

This technical analysis explores enhancing ChatGPT's ability to recognize context in sensitive conversations, which is crucial for accurate and empathetic responses. It highlights current limitations such as a lack of domain-specific knowledge and insufficient understanding of nuances, aiming to find technical solutions for these challenges.

Contextual Understanding ChatGPT machine learning Natural Language Processing

DOC · DEV.to AI · 4/26/2026

GPT-5.5 System Card

The GPT-5.5 System Card from OpenAI details a transformer-based language model, building upon GPT-3 with an emphasis on scaling and fine-tuning. Its architecture is primarily decoder-only, utilizing self-attention mechanisms and feed-forward networks.

AI architecture Natural Language Processing large language models