Natural Language Processing

168 items

ARTICLE·DEV.to AI·15d ago

GLM-4: The Chinese-English Bilingual Workhorse You Didn't Know You Needed

GLM-4 is a Chinese-English bilingual AI model from Tsinghua University / Zhipu AI. Unlike most models, which are English-centric, it is optimized from the ground up for both languages. It features a Mixture of Experts architecture for fast inference, a long context of up to 128K tokens, and a focus on function calling and agent workflows.

bilingual AI Function Calling Natural Language Processing Mixture of Experts

ARTICLE·DeepLearning.AI (YouTube)·18d ago

Semantic Search Starts With Embeddings

This video explains that semantic search begins with embeddings, and covers the technical foundation behind meaning-based information retrieval.

Natural Language Processing semantic search embeddings AI
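
A minimal sketch of how embedding-based retrieval typically works, assuming documents and the query have already been mapped to vectors. The three-dimensional vectors and document strings below are hypothetical toy values, not output from any real embedding model, and cosine similarity is one common scoring choice rather than something the video is confirmed to use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (document, score) pairs sorted from most to least similar."""
    scored = [(doc, cosine_similarity(query_vec, vec)) for doc, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings (assumption: a real system would get these from a model).
doc_vecs = {
    "how to reset a password": [0.9, 0.1, 0.0],
    "recover a locked account": [0.8, 0.3, 0.1],
    "best pasta recipes": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # stands in for a query such as "I forgot my login"

ranking = rank_documents(query_vec, doc_vecs)
for doc, score in ranking:
    print(f"{score:.3f}  {doc}")
```

The query shares no keywords with the account-related documents, yet both rank above the unrelated one because their vectors point in a similar direction.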

ARTICLE·DEV.to AI·4/19/2026

Attention Mechanisms: Stop Compressing, Start Looking Back

This article delves into the limitations of LSTMs in maintaining context, even with their improved memory capabilities over vanilla RNNs. The author uses a personal experience of learning English to illustrate the three specific problems LSTMs still don't solve, setting the stage for discussing attention mechanisms.

deep learning attention mechanisms Natural Language Processing

ARTICLE·Google AI Blog·21d ago

How AI Mode is changing the way people search in the U.S.

One year post-launch, AI Mode is changing U.S. search behavior as users increasingly shift from traditional keyword-based queries to natural language input. This indicates a significant transformation in how people interact with search engines.

user behavior Natural Language Processing search-technology AI

RESEARCH·Anthropic (YouTube)·5/7/2026

Translating Claude’s thoughts into language

This video explores how the internal processes, or "thoughts," of an AI model such as Claude can be translated into understandable language. It investigates how the model's complex operations can be interpreted and expressed to better understand its reasoning.

cognitive AI Natural Language Processing interpretability AI

ARTICLE·DEV.to AI·4/17/2026

Error Genome: Teaching Your AI System to Learn from Failures

The author built an AI customer support system, Nova, which achieved significant success by focusing on learning from its mistakes rather than solely on minimizing errors. This approach, termed "Error Genome," led to a 40% reduction in error rates and a 20% increase in overall system accuracy.

customer service AI machine learning Natural Language Processing error analysis

NEWS·AWS Machine Learning Blog·5/4/2026

Generate dashboards from natural language prompts in Amazon Quick

Amazon QuickSight now generates complete multi-sheet dashboards from natural language prompts. This feature allows BI professionals to create production-ready analyses in minutes, significantly reducing manual setup time.

Amazon QuickSight AI automation Natural Language Processing Dashboards

RESEARCH·DEV.to AI·5/10/2026

Neural Language Correction with Character-Based Attention

This research introduces a novel approach to neural language correction leveraging character-based attention mechanisms. The method aims to improve the accuracy and robustness of automatically correcting grammatical and spelling errors in text.

neural networks deep learning attention mechanisms Natural Language Processing

RESEARCH·arXiv CS.CL·4/15/2026

Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs

This paper introduces the Weighted Syntactic and Semantic Context Assessment Summary (wSSAS), a deterministic framework to optimize text categorization using LLMs. It addresses LLM limitations by organizing text hierarchically and employing a Signal-to-Noise Ratio (SNR) to focus on high-value semantic features.

LLMs data integrity Text Categorization Natural Language Processing

RESEARCH·arXiv CS.CL·5/5/2026

Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect

This paper introduces XHS-SCoRE, a reader-grounded benchmark for detecting if a text-only Xiaohongshu (RedNote) post elicits upward, downward, or neutral social comparison. The study finds a consistent mismatch between LLM generation fluency and reliable detection ability, indicating that LLMs generate social-comparison triggers they fail to robustly detect.

Benchmarking Natural Language Processing social comparison AI

RESEARCH·arXiv CS.CL·5/5/2026

Controlled Paraphrase Geometry in Sentence Embedding Space: Local Manifold Modeling and Latent Probing

This paper investigates the local geometry of embedding clouds induced by controlled classes of semantically close sentences. The authors introduce a local geometric modeling scheme and a latent probing procedure for representation-space analysis and local manifold modeling.

Latent Space sentence embeddings Natural Language Processing Manifold Learning

RESEARCH·arXiv CS.CL·4/10/2026

TR-EduVSum: A Turkish-Focused Dataset and Consensus Framework for Educational Video Summarization

This study presents the TR-EduVSum dataset, focused on Turkish educational videos, and proposes the AutoMUP method. The method generates gold-standard summaries automatically and reproducibly from multiple human summaries, using meaning-unit clustering and statistical consensus modeling.

Dataset consensus framework educational video summarization machine learning

RESEARCH·arXiv CS.CL·5/5/2026

Compared to What? Baselines and Metrics for Counterfactual Prompting

This work argues that observed effects from "counterfactual prompting" in LLMs cannot be attributed to a targeted factor without accounting for meaning-preserving text modifications that establish general model sensitivity. The research shows that prediction flip rates when surgically changing patient gender are statistically indistinguishable from rates induced by simply paraphrasing inputs, suggesting that special sensitivity to patient gender cannot be concluded.

counterfactual prompting model robustness AI bias Natural Language Processing
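
The comparison described above can be sketched as follows: compute the prediction flip rate for the targeted edit, compute it again for a meaning-preserving paraphrase baseline, and check whether the two differ by more than chance. The labels and predictions are hypothetical toy data, and the two-proportion z-test is an assumed stand-in; the summary does not say which statistical test the authors used.

```python
import math

def flip_rate(original_preds, modified_preds):
    """Fraction of inputs whose prediction changed after the modification."""
    flips = sum(o != m for o, m in zip(original_preds, modified_preds))
    return flips / len(original_preds)

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two proportions (pooled variance)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se if se > 0 else 0.0

# Hypothetical model predictions on ten cases (toy data, not from the paper).
original       = ["treat", "treat", "refer", "treat", "refer", "treat", "refer", "treat", "treat", "refer"]
gender_swapped = ["treat", "refer", "refer", "treat", "refer", "treat", "refer", "treat", "treat", "treat"]
paraphrased    = ["treat", "treat", "refer", "refer", "refer", "treat", "treat", "treat", "refer", "refer"]

targeted = flip_rate(original, gender_swapped)   # effect of the counterfactual edit
baseline = flip_rate(original, paraphrased)      # general sensitivity to rewording
z = two_proportion_z(targeted, len(original), baseline, len(original))

print(f"targeted flip rate: {targeted:.2f}")
print(f"baseline flip rate: {baseline:.2f}")
print(f"z statistic: {z:.2f}")
```

When the absolute z statistic stays below the usual 1.96 threshold, as in this toy data, the targeted edit cannot be distinguished from ordinary sensitivity to rewording.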

RESEARCH·arXiv CS.CL·4/27/2026

An End-to-End Ukrainian RAG for Local Deployment. Optimized Hybrid Search and Lightweight Generation

This paper introduces a highly efficient Retrieval-Augmented Generation (RAG) system specifically for Ukrainian document question answering, which achieved 2nd place in the UNLP 2026 Shared Task. It features a custom hybrid search and a specialized Ukrainian language model, compressed for high-quality, verifiable local deployment on resource-constrained hardware.

Ukrainian language RAG Natural Language Processing Local AI

RESEARCH·arXiv CS.CL·4/9/2026

Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models

This paper introduces Text2DistBench, a new benchmark for evaluating the ability of LLMs to infer distributional knowledge from natural language. Unlike traditional benchmarks, it focuses on real-world tasks such as estimating sentiment proportions or identifying frequent topics in text collections like YouTube comments.

Distributional Information Reading Comprehension LLMs Benchmarking

RESEARCH·arXiv CS.CL·4/30/2026

MATH-PT: A Math Reasoning Benchmark for European and Brazilian Portuguese

This paper introduces MATH-PT, a novel dataset of 1,729 mathematical problems in European and Brazilian Portuguese, to address the linguistic bias in LLM mathematical reasoning evaluations. The benchmark reveals that frontier reasoning models achieve strong performance in multiple-choice questions but their performance decreases for open-ended questions.

Dataset mathematical reasoning LLMs Benchmarking

RESEARCH·arXiv CS.CL·5/1/2026

BatteryPass-12K: The First Dataset for the Novel Digital Battery Passport Conformance Task

This paper introduces BatteryPass-12K, the first public dataset for the novel task of digital battery passport (DBP) conformance classification, addressing a critical need before new EU regulations. It benchmarks 22 language models, finding that "Thinking models" like GPT-5.4 achieve the best performance, and few-shot examples significantly enhance results on this challenging task.

evaluation Benchmarking Natural Language Processing datasets

RESEARCH·arXiv CS.CL·4/16/2026

A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation

This paper introduces a proactive EMR assistant for doctor-patient dialogue, designed to overcome limitations of passive systems by integrating streaming ASR, belief stabilization, and action planning. The system was evaluated in a preliminary controlled setting, achieving an F1 of 0.84 and Recall@5 of 0.87.

Natural Language Processing ASR healthcare AI medical AI
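
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how F1 and Recall@5 are commonly computed. The summary does not say what the paper scores them on, so the extracted fields and ranked suggestions below are hypothetical, and Recall@k is defined here as the share of relevant items found in the top k (other definitions exist).

```python
def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two sets of items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

def recall_at_k(ranked_items, relevant_items, k=5):
    """Share of relevant items that appear in the top-k ranked results."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)

# Hypothetical example: fields extracted from a dialogue vs. a reference record.
extracted = ["fever", "cough", "rash"]
reference = ["fever", "cough", "headache"]
f1 = f1_score(extracted, reference)

# Hypothetical example: ranked next-action suggestions vs. the clinically relevant ones.
suggestions = ["ask_duration", "ask_allergies", "order_cbc", "ask_travel", "ask_meds", "order_xray"]
needed = ["ask_allergies", "order_xray"]
r_at_5 = recall_at_k(suggestions, needed, k=5)

print(f"F1: {f1:.2f}, Recall@5: {r_at_5:.2f}")
```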

RESEARCH·arXiv CS.CL·4/30/2026

CogRAG+: Cognitive-Level Guided Diagnosis and Remediation of Memory and Reasoning Deficiencies in Professional Exam QA

CogRAG+ is a training-free framework designed to diagnose and remediate memory and reasoning deficiencies in large language models for professional exam QA. It decouples and aligns retrieval and reasoning with human cognitive hierarchies, employing Reinforced Retrieval and cognition-stratified Constrained Reasoning to enhance accuracy and consistency.

Retrieval Augmented Generation Natural Language Processing AI Reasoning large language models

RESEARCH·arXiv CS.CL·4/27/2026

Optimal Question Selection from a Large Question Bank for Clinical Field Recovery in Conversational Psychiatric Intake

This research paper addresses optimal question selection for information gathering in psychiatric intake using conversational AI. It introduces a benchmark with 655 questions and synthetic vignettes, evaluating LLM-guided adaptive policies.

Healthcare Natural Language Processing Conversational AI LLM