LLM

RESEARCH · arXiv CS.CL · 5/7/2026

FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals

This paper details a submission to SemEval-2026 Task 13 that focuses on lightweight detection of LLM-generated code using stylometric signals. The approach employs ratio-based features, parsing engines, and language classifiers, and is computationally efficient, with near-instant inference time.

RESEARCH · arXiv CS.CL · 22d ago

Exploring Lightweight Large Language Models for Court View Generation

The research explores the capabilities of lightweight Large Language Models (LLMs) in Criminal Court View Generation (CVG) and their impact on charge prediction within Legal AI. It systematically investigates architectural differences and model size, compares these models with Deep Neural Networks, and introduces the CVGEvalKit framework for evaluation.

RESEARCH · arXiv CS.CL · 8d ago

CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards

This paper proposes CSRP, a three-stage framework for Chinese Grammatical Error Correction (CGEC) using Large Language Models (LLMs). CSRP addresses challenges of general-purpose models and metric optimization with continual pre-training, Chain-of-Thought SFT, and policy optimization with efficiency-aware rewards that penalize unnecessary edits, achieving state-of-the-art performance on the NACGEC benchmark.

RESEARCH · arXiv CS.AI · 18d ago

The Shape of Testimony: A Scalable Framework for Oral History Archive Comparison

This study presents a scalable computational framework for comparing oral history archives, specifically focusing on Holocaust survivor testimonies. By leveraging LLM-based analysis, discourse segmentation, and topic modeling, it quantifies the "structuredness" of testimonies. The research largely corroborates earlier distinctions while revealing significant overlaps between collections.

RESEARCH · arXiv CS.LG · 21d ago

Theory-optimal Quantization Based on Flatness

This research models the relationship between quantization error and outliers in Large Language Models (LLMs) and introduces a new metric, Flatness, to quantify outlier distribution. Based on this metric, it derives a theoretically optimal solution and proposes Bidirectional Diagonal Quantization (BDQ) for post-training quantization.

RESEARCH · arXiv CS.LG · 28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

RESEARCH · arXiv CS.CL · 27d ago

BoostTaxo: Zero-Shot Taxonomy Induction via Boosting-Style Agentic Reasoning and Constraint-Aware Calibration

BoostTaxo introduces a novel boosting-style LLM framework designed for zero-shot taxonomy induction, aiming to overcome limitations in generalization and efficiency of existing methods. It refines taxonomy construction through a coarse-to-fine parent identification process, leveraging retrieval-augmented definition refinement and hybrid candidate selection.

RESEARCH · arXiv CS.AI · 21d ago

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

This position paper advocates for developing systematic methodologies to generate synthetic sequences, termed 'data probes,' to fundamentally understand how data characteristics affect LLM performance across various stages. The aim is to move beyond current compute-intensive empirical approaches by providing a principled way to comprehend model behavior.

RESEARCH · arXiv CS.LG · 12d ago

Representation Signatures and Risk-Feedback Alignment in LLM Trading Agents

This research investigates the behavioral alignment and representation dynamics of large language model (LLM) agents in financial decision environments. Using TradeArena, the study finds measurable pre-failure signatures: planning embeddings drift and fused plan-risk representations separate before drawdowns, indicating effective-rank contraction.

RESEARCH · arXiv CS.LG · 15d ago

Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

This paper investigates truthful online preference aggregation for fine-tuning Large Language Models (LLMs) in mobile crowdsourcing. It proposes a novel online weighted aggregation mechanism to address strategic misreporting by workers, modeling the process as a dynamic Bayesian game. The goal is to overcome the limitations of existing approaches, which fail to identify the most accurate worker and incur linear regret.
