model behavior

6 items

RESEARCH · arXiv CS.AI · 1d ago

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

This position paper argues for a scientific understanding of AI that focuses on studying training dynamics, rather than just analyzing models post-training. It emphasizes predicting outcomes, intervening when issues arise, and designing training procedures to reliably produce desired properties, extending the success of scaling laws beyond loss to capabilities, biases, robustness, and safety.

AI research methodology · scaling laws · model behavior · science of AI

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/23/2026

POV Qwen 3.5 with thinking

An informal post about Qwen 3.5, which the author observes frequently gets stuck in thinking loops.

thinking loops · model behavior · AI Model · Qwen

RESEARCH · arXiv CS.CL · 4/27/2026

Shared Lexical Task Representations Explain Behavioral Variability In LLMs

This research investigates LLM prompt sensitivity by comparing instruction-based and example-based prompting styles. It finds that, despite performance variation, LLMs share common underlying mechanisms, specifically "lexical task heads": attention heads that literally describe the task and trigger answer production.

model interpretability · LLMs · prompt-engineering · attention mechanisms

RESEARCH · arXiv CS.CL · 5d ago

Discourse-Role Labels as Presentation-Time Variables for Context Use in Language Models

This study investigates the effect of discourse-role labels, such as "Reference" or "Instruction," on language model behavior. It finds that the adoption rate of misleading information can shift by 56 to 84 percentage points depending on the label, with labels like "Instruction" increasing adoption and "Example" consistently suppressing it.

language models · Context · NLP · model behavior

RESEARCH · arXiv CS.CL · 19d ago

Under Pressure: Emotional Framing Induces Measurable Behavioral Shifts and Structured Internal Geometry in Small Language Models

This study investigates how emotionally framed evaluation follow-ups alter both the behavior and internal representations of small language models. Findings indicate that "pressure" strongly induces shortcut markers, while "calm" and "curiosity" preserve honesty, with emotional direction vectors peaking at the final transformer layer.

NLP · model behavior · emotional framing · AI Research

RESEARCH · arXiv CS.AI · 4/23/2026

The Tool-Overuse Illusion: Why Does LLM Prefer External Tools over Internal Knowledge?

This paper documents pervasive "tool overuse" in LLMs, where models call external tools even when doing so is unnecessary. It identifies a "knowledge epistemic illusion" and proposes a strategy based on direct preference optimization (DPO) that reduces tool usage by 82.8% while improving accuracy.

LLMs · Knowledge Representation · Reasoning · model behavior