LLMs

720 items

RESEARCH · arXiv CS.AI · 6d ago

StepPRM-RTL: Stepwise Process-Reward Guided LLM Fine-Tuning for Enhanced RTL Synthesis

StepPRM-RTL is a novel framework that enhances LLM-based RTL code generation by combining stepwise trajectory modeling, process-reward modeling (PRM), and retrieval-augmented fine-tuning (RAFT). It uses dense feedback from a PRM to guide reinforcement-style updates and Monte Carlo Tree Search (MCTS) to enrich the training dataset.

29
ARTICLE · DEV.to AI · 4d ago

This article compares cost-effective alternatives to GPT-4o and shows how other AI models can offer significant savings for developers. It provides direct cost comparisons, highlighting options such as DeepSeek V4 Flash and Qwen3-32B.

29
DOC · ML Mastery · 5d ago

Using Scikit-LLM with Open-Source LLMs

This article provides a tutorial on integrating locally hosted open-source large language models such as Mistral, Gemma, and Llama 3 for language tasks like text classification. It demonstrates how to achieve this for free using Ollama and the Scikit-LLM Python library.

29
RESEARCH · arXiv CS.LG · 4/22/2026

Towards Understanding the Robustness of Sparse Autoencoders

This research examines how Sparse Autoencoders (SAEs) affect the robustness of Large Language Models (LLMs) against jailbreak attacks. Integrating pretrained SAEs at inference time reduces jailbreak success rates by up to 5x and decreases cross-model attack transferability across various LLM families.

29
RESEARCH · arXiv CS.CL · 4/9/2026

Consistency-Guided Decoding with Proof-Driven Disambiguation for Three-Way Logical Question Answering

This paper presents CGD-PD, a lightweight layer for large language models (LLMs) that improves three-way logical question answering (True/False/Unknown). It addresses recurring failures such as negation inconsistency and epistemic 'Unknown' predictions, using consistent decisions and proof-driven disambiguation for greater accuracy.

29
RESEARCH · arXiv CS.CL · 9d ago

Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study

This paper presents a zero-shot multi-label topic classification framework and systematically investigates how per-article knowledge graph augmentation affects its performance. The authors test eight methods across fifteen LLMs and eight multi-label datasets, finding that keyword-enhanced classification is the best-performing method in the base framework.

29