large language models

265 items

RESEARCH · DEV.to AI · 4/26/2026

AI Legal Tech: A Scan of the Chinese Market in H1 2026

China's AI-driven Legal-Tech market is rapidly expanding in H1 2026, reaching 4.8 billion RMB with 78% growth, fueled by large AI models and supportive regulatory policies. Key segments include contract review, legal retrieval, and indictment generation, with domestic LLMs achieving high accuracy.

Regulatory policy AI China market large language models

RESEARCH · DEV.to AI · 4/16/2026

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

This content introduces "ExpertPrompting," a novel method for instructing Large Language Models to behave as distinguished experts. It focuses on enhancing the specialized knowledge and performance of AI models through advanced prompting techniques.

AI models prompt-engineering large language models

ARTICLE · DEV.to AI · 4/19/2026

The Personal Small Model (PSM): Memory as a Learned Cognitive Primitive

This content critiques the current assumption that AI memory is a storage problem, proposing an alternative architecture inspired by human memory specialization. It introduces the Personal Small Model (PSM), a small model trained to master memory operations like relevance gating.

specialized AI models cognitive architecture AI Memory Systems large language models

NEWS · DEV.to AI · 8d ago

Claude Opus 4.8: Dynamic Workflows and Parallel Subagents

Anthropic launched Claude Opus 4.8, introducing dynamic workflows that enable hundreds of parallel subagents for complex tasks. This version shows significant improvements in benchmarks like SWE-bench Verified and USAMO, with unchanged standard pricing and a new, more affordable fast mode.

AI models Anthropic benchmarks large language models

DOC · DEV.to AI · 4/24/2026

How to implement Claude conversation history without storing everything (token-efficient pattern)

This content addresses a common mistake in Claude-powered app development: sending the full conversation history with every request, leading to high token costs. It proposes a token-efficient pattern to manage conversation history, ensuring functionality while controlling API spend.

Optimization Claude API large language models
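
The summary above does not spell out the pattern itself, so the following is only a minimal sketch of one common approach: keep the most recent turns that fit a token budget instead of resending everything. The names `Turn`, `estimate_tokens`, and `trim_history` are hypothetical, and the 4-characters-per-token estimate is a rough assumption, not something taken from the article or from any API.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def trim_history(turns: list[Turn], budget: int) -> list[Turn]:
    """Keep the newest turns whose estimated token total fits in `budget`."""
    kept: list[Turn] = []
    used = 0
    # Walk backwards so the most recent turns are kept first.
    for turn in reversed(turns):
        cost = estimate_tokens(turn.content)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()
    # Assumption: the trimmed history should start with a user turn,
    # so drop any leading assistant turns left over from the cut.
    while kept and kept[0].role != "user":
        kept.pop(0)
    return kept
```

If even the newest turn exceeds the budget, the function returns an empty list; a real application would probably pair this with a running summary of the dropped turns.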

RESEARCH · Anthropic (YouTube) · 5/7/2026

Translating Claude’s thoughts into language

This video explores how the internal processes, or "thoughts," of an AI model such as Claude can be translated into understandable language. It investigates how the model's complex operations can be interpreted and expressed to better understand its reasoning.

cognitive AI Natural Language Processing interpretability AI

NEWS · DEV.to AI · 5/3/2026

Together AI Free API: Run Llama 3.3, DeepSeek R1, and FLUX Image Generation for Free in 2026

Together AI is offering free API access to models including Llama 3.3 and DeepSeek R1, plus FLUX for image generation. The free access is available in 2026, letting developers use these models at no cost.

image generation API Free Access Together AI

ARTICLE · DEV.to AI · 4/19/2026

I Built an AI Memory System. Then I Forgot About It.

The author built an AI memory system for Claude that has been running since February. This retrospective explores how the system became self-sufficient and integrated, reducing the need for constant maintenance and intervention from the creator.

knowledge graphs AI Memory Systems personal projects large language models

RESEARCH · arXiv CS.CL · 4/15/2026

Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision

Self-Distillation Zero (SD-Zero) is a novel post-training method designed to be more training sample-efficient than traditional reinforcement learning, without requiring external teachers or high-quality demonstrations. It operates by having a single model act as both a Generator and a Reviser, using the Reviser's improved responses and token distributions to provide dense supervision for the Generator through on-policy self-distillation.

reinforcement learning post-training Dense Supervision Self-Distillation

RESEARCH · arXiv CS.CL · 4/15/2026

Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models

This research systematically investigates the Identifiable Victim Effect (IVE) in Large Language Models, a cognitive bias where specific, narratively described victims receive more resources than statistically characterized groups. The large-scale empirical study across 16 frontier LLMs determines if these systems inherit human affective irrationalities in critical applications like humanitarian triage and content moderation.

Identifiable Victim Effect cognitive bias AI ethics large language models

RESEARCH · arXiv CS.LG · 4/15/2026

Disposition Distillation at Small Scale: A Three-Arc Negative Result

This paper details an attempt to distill behavioral dispositions into small language models (0.6B-2.3B parameters) through a distillation pipeline. Initial reported gains were later falsified due to evaluation artifacts, resulting in a negative outcome for the core hypothesis and leading to three subsequent arcs of investigation.

Negative Results Model Distillation Behavioral Dispositions large language models

RESEARCH · arXiv CS.LG · 4/15/2026

A Layer-wise Analysis of Supervised Fine-Tuning

This research analyzes Supervised Fine-Tuning (SFT), revealing that instruction-following capabilities emerge distinctly across layers: middle layers are stable while final layers are highly sensitive. Leveraging this, the authors propose Mid-Block Efficient Tuning, which updates critical intermediate layers, outperforming standard LoRA with reduced parameter overhead.

Supervised Fine-Tuning Layer-wise Analysis Catastrophic Forgetting large language models

RESEARCH · arXiv CS.AI · 4/25/2026

Adaptive Test-Time Compute Allocation with Evolving In-Context Demonstrations

This work introduces an innovative framework for adaptive test-time compute allocation, jointly adjusting where computation is spent and how generation is performed. The method uses a warm-up phase to identify easy queries and then concentrates further computation on unresolved queries, reshaping generation distributions with evolving in-context demonstrations.

deep learning machine learning in-context learning AI
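
The two-phase idea in the summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: a query counts as "easy" when all warm-up samples agree, the remaining budget is split evenly across unresolved queries, and resolved answers are appended as in-context demonstrations. `allocate_compute` and the `sample` callable are hypothetical names.

```python
from collections import Counter
from typing import Callable

Demo = tuple[str, str]                       # (query, answer) pair
Sampler = Callable[[str, list[Demo]], str]   # hypothetical model call


def allocate_compute(
    queries: list[str],
    sample: Sampler,
    warmup: int = 3,          # must be >= 1
    total_budget: int = 12,   # total number of model calls allowed
) -> dict[str, str]:
    demos: list[Demo] = []
    answers: dict[str, str] = {}
    unresolved: dict[str, Counter] = {}
    spent = 0

    # Warm-up phase: a few samples per query.
    for q in queries:
        votes = Counter(sample(q, demos) for _ in range(warmup))
        spent += warmup
        top, count = votes.most_common(1)[0]
        if count == warmup:
            # Assumption: unanimous agreement marks the query as easy.
            answers[q] = top
            demos.append((q, top))
        else:
            unresolved[q] = votes

    # Second phase: spend what is left only on unresolved queries,
    # conditioning on the demonstrations gathered so far.
    if unresolved:
        extra = max(0, total_budget - spent) // len(unresolved)
        for q, votes in unresolved.items():
            votes.update(sample(q, demos) for _ in range(extra))
            top, _ = votes.most_common(1)[0]
            answers[q] = top
            demos.append((q, top))
    return answers
```

Majority voting is used here only as a simple stand-in for however the method decides a query is resolved.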

RESEARCH · arXiv CS.AI · 4/13/2026

Model Space Reasoning as Search in Feedback Space for Planning Domain Generation

This research investigates using an agentic language model feedback framework to generate high-quality planning domains from augmented natural language descriptions. It evaluates the impact of various symbolic feedback mechanisms, like landmarks and plan validation output, in conjunction with heuristic search over model space to optimize domain quality.

Symbolic AI Agentic AI AI Planning Feedback Systems

RESEARCH · arXiv CS.LG · 4/13/2026

Distributionally Robust Token Optimization in RLHF

To address LLMs' susceptibility to failures from small prompt shifts, especially in multi-step reasoning, researchers propose Distributionally Robust Token Optimization (DRTO). This approach combines token-level Reinforcement Learning from Human Feedback (RLHF) with Distributionally Robust Optimization (DRO) to enhance consistency under distribution shifts, showing improvements on mathematical reasoning benchmarks.

DRO LLMs RLHF Distributionally Robust Optimization

RESEARCH · arXiv CS.CL · 5/1/2026

Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling

This paper introduces the Length Value Model (LenVM), a novel token-level framework for modeling the remaining generation length in autoregressive models. By formulating length modeling as a value estimation problem, LenVM provides an annotation-free, scalable, and effective signal for LLMs and VLMs, improving performance on exact length matching tasks.

deep learning Model Architecture computer vision large language models

RESEARCH · arXiv CS.CL · 4/14/2026

Self-Calibrating Language Models via Test-Time Discriminative Distillation

Large language models are often overconfident, expressing high certainty even when incorrect. This paper introduces SECL, a test-time training pipeline that exploits a self-supervised signal to improve calibration without requiring labeled data or human supervision.

Calibration self-supervision Overconfidence large language models

RESEARCH · arXiv CS.AI · 4/27/2026

Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models

This content introduces a novel concept, 'Background Temperature', to characterize the hidden randomness present in Large Language Models.

LLMs machine learning randomness large language models

RESEARCH · arXiv CS.CL · 4/30/2026

SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding

SpecTr-GBV is a novel speculative decoding method that unifies multi-draft and greedy block verification to accelerate language model inference. It formulates the verification step as an optimal transport problem, improving both theoretical efficiency and empirical performance by achieving the optimal expected acceptance length.

large language models inference optimization Speculative Decoding AI research

RESEARCH · arXiv CS.CL · 4/9/2026

Hallucination as output-boundary misclassification: a composite abstention architecture for language models

This paper frames hallucination in large language models as a classification error and proposes a composite intervention that combines instruction-based refusal with a structural abstention gate. The gate uses a support-deficit score derived from signals such as self-consistency and citation coverage, but the controlled evaluation showed that no single mechanism was sufficient to fully mitigate the problem.

hallucination Abstention Architectures large language models AI safety
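
As a rough illustration of the abstention gate described in the entry above, the sketch below combines self-consistency and citation coverage into a single support-deficit score and abstains when the deficit is too high. The equal weighting, the 0.4 threshold, and every function name are assumptions made for the example; the paper's actual scoring is not given in this listing.

```python
from collections import Counter

ABSTAIN = "I don't have enough support to answer."


def self_consistency(samples: list[str]) -> float:
    """Fraction of sampled answers that match the most common answer."""
    if not samples:
        return 0.0
    counts = Counter(s.strip().lower() for s in samples)
    return counts.most_common(1)[0][1] / len(samples)


def citation_coverage(claims: list[str], cited: set[str]) -> float:
    """Fraction of claims that are backed by a citation."""
    if not claims:
        return 1.0  # nothing asserted, nothing left uncovered
    return sum(claim in cited for claim in claims) / len(claims)


def support_deficit(consistency: float, coverage: float, w: float = 0.5) -> float:
    # Assumption: a weighted average of the two signals, inverted so
    # that a higher value means less support.
    return 1.0 - (w * consistency + (1.0 - w) * coverage)


def gate(answer: str, samples: list[str], claims: list[str],
         cited: set[str], threshold: float = 0.4) -> str:
    """Return the answer, or abstain when the support deficit is too high."""
    deficit = support_deficit(self_consistency(samples),
                              citation_coverage(claims, cited))
    return answer if deficit <= threshold else ABSTAIN
```

In line with the summary's finding, a gate like this would be combined with instruction-based refusal rather than used alone.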