
ASR

11 items

NEWS · ↑ trending · Reddit r/LocalLLaMA · 4/12/2026

mtmd: qwen3 audio support (qwen3-omni and qwen3-asr)

Audio input is now supported for two Qwen3 variants: `qwen3-omni-moe` (multimodal, accepting both vision and audio input) and `qwen3-asr` (automatic speech recognition). GGUF models for Qwen3-Omni (30B variants) and Qwen3-ASR (1.7B and 0.6B) are available on Hugging Face.

42
ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/10/2026

Building a chatbot with ASR [P]

A developer is looking for the best ASR approach for integrating speech-to-text into a chatbot. Budget and security constraints lead them to prefer self-hosted models such as Whisper over external APIs. They ask for insights on the trade-offs between local models and APIs, on performance, and on ease of deployment for an MVP launch.

35
RESEARCH · arXiv CS.CL · 4/16/2026

A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation

This paper introduces a proactive EMR assistant for doctor-patient dialogue, designed to overcome limitations of passive systems by integrating streaming ASR, belief stabilization, and action planning. The system was evaluated in a preliminary controlled setting, achieving an F1 of 0.84 and Recall@5 of 0.87.

27