
large language models

265 items

RESEARCH · arXiv CS.CL · 9d ago

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

This research paper investigates global narrative dominance in Large Language Models (LLMs), where local cultural knowledge is often overshadowed by global narratives. It introduces the CulturalNB dataset for Bengali cultural contexts and demonstrates that questions asked in English tend to increase global substitution and institutional framing, reducing local perspective coverage.

27
RESEARCH · arXiv CS.CL · 16d ago

Evaluating Large Language Models in a Complex Hidden Role Game

This research quantifies the deceptive potential of Large Language Models (LLMs) in the social deduction game Secret Hitler, introducing novel metrics and an open-source framework. The study benchmarks LLMs against rule-based algorithms and human games, revealing a gap between conversational ability and strategic depth, and showing that reasoning-enhancement techniques can worsen performance for fascist roles.

27
RESEARCH · arXiv CS.CL · 13d ago

EvoSpec: Evolving Speculative Decoding via Real-Time Vocabulary and Parameter Adaptation

EvoSpec introduces a framework for real-time evolution of draft models in speculative decoding for Large Language Models, addressing the bottleneck of large vocabulary sizes. It uses dynamic vocabulary and parameter adaptation, employing a context-aware mechanism and a lightweight online alignment strategy to improve acceptance rates and minimize distributional gaps.

27
ARTICLE · DEV.to AI · 4/25/2026

I Audited a Business's AI Visibility Across Four Platforms. The Results Were Worse Than Expected.

This article describes an AI visibility audit conducted for a business across four platforms (ChatGPT, Claude, Gemini, and Perplexity), which found that traditional SEO optimization for Google is insufficient. The audit tested how AI models represent the business through both general category queries and specific brand queries, and points to a significant gap in current optimization strategies for AI platforms.

27
RESEARCH · arXiv CS.CL · 4/6/2026

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

This research presents SWAY, a new unsupervised computational linguistic metric for measuring sycophancy in Large Language Models (LLMs), which is the tendency to align responses with the user's stance. The study uses a counterfactual prompting mechanism and proposes a mitigation strategy based on considering opposing premises to reduce this bias.

27
RESEARCH · arXiv CS.CL · 4/30/2026

Generative AI-Based Virtual Assistant using Retrieval-Augmented Generation: An evaluation study for bachelor projects

This paper evaluates a Generative AI-based virtual assistant utilizing Retrieval-Augmented Generation (RAG) to support Maastricht University students with project regulations. The system aims to address challenges like hallucinations and provide accurate, context-specific responses by integrating domain-specific knowledge.

27