large language models

262 items

NEWS · ↑ trending · Hacker News (AI) · 1d ago

Apple reveals new AI architecture built around Google Gemini models

Apple has unveiled a new AI architecture built around Google Gemini models, signifying a major collaboration in artificial intelligence. This development aims to enhance AI capabilities across Apple's devices.

AI architecture Apple AI Google Gemini large language models

ARTICLE · DEV.to AI · 2h ago

Claude Fable 5 dropped this morning. By noon, 13 of my 31 production skills were quietly obsolete.

A developer recounts how Anthropic's Claude Fable 5 release rendered 13 of their 31 production AI skills obsolete due to changes in prompting and API behavior. Old instructions, previously effective, now actively degrade the new model's output quality, necessitating a complete re-evaluation of their autonomous agent fleet.

prompt engineering model migration autonomous agents large language models

RESEARCH · DEV.to AI · 4/24/2026

Kimi K2.6 Benchmark: Results vs GPT-5.4, Claude, Gemini, and K2.5

This post compares Kimi K2.6 benchmark results against GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, and Kimi K2.5, using a standardized reference table. K2.6 performs strongly on coding and agentic tasks, clearly ahead of its predecessor and closing the gap with frontier proprietary models.

AI models Benchmarks Kimi large language models

RESEARCH · arXiv CS.CL · 1d ago

Signal-Driven Observation for Long-Horizon Web Agents

Long-horizon web agents experience progressive context degradation by ingesting raw DOM trees at every action step, eroding reasoning before tasks complete. Signal-Driven Observation (SDO) is proposed, where a dedicated sub-call reads the full DOM but returns only task-relevant elements, re-invoked by lightweight signals, to optimize observation and compression.

Observation compression large language models Context management AI agents
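
The observation pattern summarized above can be illustrated with a small sketch. It is not the paper's implementation: the keyword filter stands in for the dedicated sub-call (which the summary describes as reading the full DOM), and the "signal" is assumed here to be a change in a hash of the DOM. The names `SignalDrivenObserver`, `select_relevant`, and `dom_fingerprint` are hypothetical.

```python
import hashlib


def select_relevant(dom_elements, task_keywords):
    """Stand-in for the dedicated sub-call: keep elements mentioning a task keyword."""
    keywords = [k.lower() for k in task_keywords]
    return [el for el in dom_elements if any(k in el.lower() for k in keywords)]


def dom_fingerprint(dom_elements):
    """Cheap signal (an assumption of this sketch): a hash of the serialized DOM."""
    return hashlib.sha256("\n".join(dom_elements).encode("utf-8")).hexdigest()


class SignalDrivenObserver:
    """Re-runs the expensive selection only when the signal reports a change."""

    def __init__(self, task_keywords):
        self.task_keywords = task_keywords
        self._last_fingerprint = None
        self._cached = []
        self.sub_calls = 0  # how many times the selection sub-call actually ran

    def observe(self, dom_elements):
        fingerprint = dom_fingerprint(dom_elements)
        if fingerprint != self._last_fingerprint:
            # Signal fired: read the full DOM, keep only task-relevant elements.
            self._cached = select_relevant(dom_elements, self.task_keywords)
            self._last_fingerprint = fingerprint
            self.sub_calls += 1
        return self._cached


page = ["<nav>Home</nav>", "<button>Add to cart</button>", "<footer>Legal</footer>"]
observer = SignalDrivenObserver(["cart"])
first = observer.observe(page)   # selection runs
second = observer.observe(page)  # unchanged page: cached observation is reused
```

The agent's context only ever receives the filtered list, so unchanged pages cost no additional selection work in this sketch.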

NEWS · Anthropic (YouTube) · 6h ago

Introducing Claude Fable 5

Anthropic's video announces Claude Fable 5. The listing gives no further details about the new model.

Claude Anthropic AI model large language models

RESEARCH · arXiv CS.CL · 19h ago

Evaluating Hallucinations in Domain-Adapted Large Language Models

This study investigates hallucinations in domain-adapted Large Language Models, specifically Llama-2 fine-tuned with the Lamini dataset. It found that while the model excels in training-similar tasks, its ability to reason about and recall new domain-specific information is limited, leading to hallucinations and a tendency for over-generation.

Llama-2 hallucinations Domain Adaptation large language models

RESEARCH · arXiv CS.AI · 19h ago

Some hypotheses on how chatbots work in problem-solving-driven conversations. Large Language Models as confirmation of the Innovation Illusion

This article examines the nature of chatbots, particularly Large Language Models, as problem-solving conversational partners, drawing on Aggregation Dynamics, Cognitive Linguistics, Neuropsychology, and Psychology. It hypothesizes that LLM training datasets only partially imitate human thinking and understanding, encoding artificial metaphorical problem propagations.

chatbots cognitive science large language models linguistics

RESEARCH · arXiv CS.CL · 19h ago

Community-Specific Slang and Entity Detection via Semantic Shift in Fine-Tuned Language Models

This research proposes an unsupervised method to identify community-specific slang and unique entities by analyzing the magnitude of semantic shift. Semantic shift is defined as the evolution of a word's encoded representation after fine-tuning a pre-trained Large Language Model (LLM) on a community-specific text corpus.

online-communities semantic-shift natural language processing large language models
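
As a rough illustration of the idea summarized above, the sketch below ranks words by how far their vectors move between a base model and a fine-tuned one. Cosine distance is an assumption of this sketch (the summary does not say which measure the paper uses), and the toy vectors and the names `cosine_distance` and `rank_by_semantic_shift` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_semantic_shift(base, tuned):
    """Rank words shared by both vocabularies by shift magnitude, largest first."""
    shared = base.keys() & tuned.keys()
    shifts = {word: cosine_distance(base[word], tuned[word]) for word in shared}
    return sorted(shifts.items(), key=lambda item: item[1], reverse=True)


# Toy 2-D representations before and after fine-tuning (made-up values).
base_vectors = {"ape": [1.0, 0.0], "moon": [0.0, 1.0], "the": [1.0, 1.0]}
tuned_vectors = {"ape": [0.0, 1.0], "moon": [0.6, 0.8], "the": [1.0, 1.0]}

ranking = rank_by_semantic_shift(base_vectors, tuned_vectors)
```

Words at the top of `ranking` are the candidates for community-specific slang or entities; a function word such as "the" should barely move.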

RESEARCH · arXiv CS.CL · 19h ago

Implicit Causal Graph Construction in Text via Chain Discovery

This paper investigates implicit causal graph construction from text by inferring intermediate causal events using Large Language Models (LLMs). It compares end-to-end graph construction with causal chain discovery methods and evaluates the validity of inferred causal relations against a manually curated database.

text analysis natural language processing graph theory large language models

ARTICLE · DEV.to AI · 1d ago

GEO (Generative Engine Optimization): How to Get ChatGPT, Perplexity, and Gemini to Recommend Your Business

This article introduces Generative Engine Optimization (GEO) as a new strategy for businesses to ensure their content is recommended by LLMs like ChatGPT, Perplexity, and Gemini. This shift is critical as users increasingly seek immediate, synthesized answers from conversational AI, moving away from traditional search engine results.

ChatGPT Generative Engine Optimization large language models SEO

DOC · ↑ trending · Reddit r/LocalLLaMA · 27d ago

AIDC-AI/Ovis2.6-80B-A3B · Hugging Face

Ovis2.6-80B-A3B is introduced as the latest advancement in Multimodal Large Language Models (MLLMs), upgrading to a Mixture-of-Experts (MoE) architecture for superior multimodal performance at reduced serving costs. It also brings significant improvements in long-context and high-resolution understanding, visual reasoning, and information-dense document comprehension.

AI models multimodal AI Mixture of Experts large language models

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 4/22/2026

Personal Eval follow-up: Gemma4 26B MoE (Q8) vs Qwen3.5 27B Dense vs Gemma4 31B Dense Compared

This follow-up compares Gemma4 26B MoE (Q8), Qwen3.5 27B Dense, and Gemma4 31B Dense models, including previous Qwen 3.6 35B and Gemma 4 26B (Q4) results. The analysis benchmarks their performance, highlighting the impact of 8-bit quantization and the effectiveness of different model architectures.

Benchmarking Gemma model comparison quantization

RESEARCH · ↑ trending · Reddit r/LocalLLaMA · 4/10/2026

National University of Singapore Presents "DMax": A New Paradigm For Diffusion Language Models (dLLMs) Enabling Aggressive Parallel Decoding.

DMax is a new paradigm for efficient diffusion language models (dLLMs) that mitigates error accumulation in parallel decoding. It enables aggressive parallelism by reformulating decoding as a progressive self-refinement process and introducing a unified training strategy.

Diffusion Models Parallel Decoding natural language processing AI

NEWS · ↑ trending · Reddit r/LocalLLaMA · 4/24/2026

Deepseek V4 Flash and Non-Flash Out on HuggingFace

DeepSeek has released the Flash and non-Flash versions of its DeepSeek V4 models on Hugging Face. The collection gives the AI community direct access to DeepSeek's latest models.

AI models DeepSeek V4 large language models model release

DOC · ↑ trending · Reddit r/LocalLLaMA · 5/6/2026

Qwen3.6-27B with MTP grafted on Unsloth UD XL: 2.5x throughput via unmerged llama.cpp PR

This post details an implementation of Multi-Token Prediction (MTP) with quantized GGUFs for Qwen3.6-27B, using Unsloth's UD XL quantizations with Q8_0 MTP layers grafted on top, for a reported 2.5x throughput increase. The author shares the grafted GGUF files, the raw MTP layer source, and a conversion script, along with instructions for a custom llama.cpp build that incorporates speculative decoding support from an unmerged PR.

Multi-Token Prediction llama.cpp quantization large language models

ARTICLE · ↑ trending · Reddit r/LocalLLaMA · 4/16/2026

PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on.

Qwen 3.6 now ships with a new `preserve_thinking` flag that addresses the KV cache invalidation issue by maintaining the model's full reasoning context. This feature is particularly beneficial for agent scenarios, enhancing decision consistency and optimizing token consumption and KV cache utilization.

large language models model optimization Qwen AI agents

ARTICLE · ↑ trending · Hacker News (AI) · 11d ago

DeepSeek Slashes AI Costs to Cents

DeepSeek has dramatically reduced the costs of AI inference, bringing them down to mere cents. This development makes AI technology more accessible and economically viable for a wider range of applications.

DeepSeek AI costs inference cost reduction

CASE · ↑ trending · Reddit r/LocalLLaMA · 5/1/2026

16x Spark Cluster (Build Update)

This update details the successful build of a 16x Nvidia DGX Spark cluster, configured for high-speed fabric and unified memory. The setup involved standard node provisioning and custom scripting for network optimization, aiming to maximize unified memory capacity for serving large language models like GLM-5.1-NVFP4, DeepSeek, and Kimi.

AI hardware unified memory cluster computing large language models

ARTICLE · ↑ trending · Hacker News (AI) · 11d ago

Notes from the Mistral AI Now Summit in Paris

This article provides key notes and insights from the Mistral AI Now Summit, held in Paris. It covers the highlights and relevant announcements made during the event.

AI Events Mistral AI large language models AI Summit

ARTICLE · ↑ trending · Hacker News (AI) · 11d ago

Liquid AI reveals 8B-A1B MoE trained on 38T

Liquid AI has unveiled its new 8B-A1B MoE model, trained on 38 trillion tokens, representing a significant advancement in AI model development. This release showcases the company's progress in advanced AI architectures.

AI models Mixture of Experts large language models AI development