LLM

609 items

DOC · Google for Developers (YouTube) · 21h ago

Gemma Playground: Robot Duck

This video explores the Gemma Playground through a 'Robot Duck' example application, demonstrating the Gemma model's capabilities in a practical scenario.

46
RESEARCH · arXiv CS.LG · 21h ago

Enabling KV Caching of Shared Prefix for Diffusion Language Models

The paper introduces "bicache", the first KV caching technique for shared prefixes in diffusion language models (DLMs), addressing cases where existing LLM caching methods fail because of DLMs' bidirectional attention. The approach aims to unlock high-throughput DLM serving by exploiting the observation that shared-prefix KVs are stable in shallow layers.

46
RESEARCH · arXiv CS.CL · 21h ago

GraphLoRA: Structure-Aware Low-Rank Adaptation for Large Language Model Recommendation

GraphLoRA proposes a novel framework for Large Language Model Recommendation (LLMRec) that integrates structural information with textual semantics. It achieves this by embedding a trainable graph message-passing network within the low-rank adaptation pathway, allowing collaborative topology to explicitly guide parameter updates.

46
RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/24/2026

New project about llm hallucination [P]

This post introduces a new side project and its GitHub repository, aimed at mitigating LLM hallucination through a contrastive sampling and selective training method. The core idea treats hallucination as a preference problem, using self-generated negative samples and divergence-based, gated learning to reinforce correct answers and suppress wrong ones.

45
RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/9/2026

[R] Forced Depth Consideration Reduces Type II Errors in LLM Self-Classification: Evidence from an Exploration Prompting Ablation Study - (200 trap prompts, 4 models, 8 Step-0 variants) [R]

This study addresses Type II errors in task classification by LLMs, where seemingly simple prompts require deep understanding. The research showed that open-ended exploration prompts ("What's really going on here?") significantly reduce these errors compared with direct extraction prompts.

45
RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/15/2026

Trained a Qwen2.5-0.5B-Instruct bf16 model on Reddit post summarization task with GRPO written from scratch in PyTorch - updates! [P]

The author successfully trained a Qwen2.5-0.5B-Instruct model for Reddit post summarization using GRPO, achieving an average rollout length of 64 tokens with combined quality and length rewards. The experiment, run on a Mac Mini cluster, uses an LLM-as-a-Judge (GPT-5) for evaluation and plans future iterations with adjusted reward functions.

44
CASE · ↑ trending · Reddit r/LocalLLaMA · 4/17/2026

Qwen3.6 is incredible with OpenCode!

The user praises Qwen3.6, run through OpenCode, as an "incredible" local setup for complex coding tasks, highlighting its effectiveness in implementing RLS across a multi-language codebase. While not perfect, its ability to iterate on compiler errors makes it a viable alternative to tools like Claude Code for daily use.

44
CASE · ↑ trending · Reddit r/LocalLLaMA · 4/17/2026

Qwen3.6. This is it.

A user recounts their experience with the Qwen3.6 model, which successfully built and tested a tower defense game, demonstrating the ability to identify and fix its own bugs. The AI confirmed builds using screenshots, astonishing the user with its advanced capabilities.

43