
LLM

612 items

DOC DEV.to AI · 16d ago

Local LLM Setup Guide (v10)

This guide provides practical steps for setting up Large Language Models (LLMs) locally on a Linux system, detailing hardware requirements and performance benchmarks. It compares frameworks like llama.cpp, Ollama, vLLM, and LocalAI, recommending llama.cpp with setup instructions for model deployment.

27
RESEARCH arXiv CS.LG · 4/28/2026

KARL: Mitigating Hallucinations in LLMs via Knowledge-Boundary-Aware Reinforcement Learning

KARL is a novel framework designed to mitigate hallucinations in large language models by enabling them to appropriately abstain from questions beyond their knowledge. It achieves this through a Knowledge-Boundary-Aware Reward that dynamically estimates the model's knowledge and a Two-Stage RL Training Strategy that prevents excessive caution.

27
RESEARCH arXiv CS.LG · 4/28/2026

Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation

This research challenges the assumption that Parameter-Efficient Fine-Tuning (PEFT) equates to memory efficiency for on-device LLMs, showing existing methods can still lead to out-of-memory errors. It introduces LARS (Low-memory Activation-Rank Subspace), a novel framework that decouples memory consumption from sequence length by constraining the activation subspace, achieving an average 33.54% memory footprint reduction.

27
RESEARCH arXiv CS.CL · 5/5/2026

Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect

This paper introduces XHS-SCoRE, a reader-grounded benchmark for detecting whether a text-only Xiaohongshu (RedNote) post elicits upward, downward, or neutral social comparison. The study finds a consistent mismatch between generation fluency and detection ability: LLMs produce social-comparison triggers that they fail to detect robustly.

27
RESEARCH arXiv CS.CL · 4/9/2026

LLM-Augmented Knowledge Base Construction For Root Cause Analysis

This study evaluates Large Language Model (LLM) methodologies (Fine-Tuning, RAG, and a Hybrid approach) for building a Root Cause Analysis (RCA) knowledge base from support tickets. Experiments on a real industrial dataset show that the generated knowledge base speeds up RCA tasks and improves network resilience.

27
RESEARCH arXiv CS.LG · 5/1/2026

Detecting Clinical Discrepancies in Health Coaching Agents: A Dual-Stream Memory and Reconciliation Architecture

LLM agents in healthcare face the challenge of reconciling patient self-reports (prone to bias) and electronic health records (validated but often stale). This research introduces a dual-stream memory architecture to strictly separate and reconcile these sources, detecting discrepancies to enhance clinical safety.

27