Natural Language Processing

RESEARCH·arXiv CS.CL·4/20/2026

DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation

DALM (Domain-Algebraic Language Model) is proposed to address knowledge interference in LLMs by replacing unconstrained generation with structured denoising over a domain lattice. It uses a three-phase generation path (domain, relation, concept uncertainty) under algebraic constraints, requiring a domain lattice, relation typing, and fiber partition to prevent cross-domain contamination.

language models machine learning Natural Language Processing AI Research

RESEARCH·arXiv CS.CL·4/17/2026

Hierarchical Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text

This paper introduces H-TechniqueRAG, a novel hierarchical Retrieval-Augmented Generation (RAG) framework designed to improve the annotation of adversarial techniques in Cyber Threat Intelligence (CTI) text. It addresses the limitation of flat RAG approaches by incorporating the inherent tactic-technique taxonomy of the MITRE ATT&CK framework through a two-stage retrieval mechanism.

cybersecurity RAG Natural Language Processing MITRE ATT&CK

RESEARCH·arXiv CS.CL·4/22/2026

Syntax as a Rosetta Stone: Universal Dependencies for In-Context Coptic Translation

This paper introduces a novel in-context learning approach for low-resource Coptic-to-English machine translation, augmenting inputs with syntactic information from Universal Dependencies parses. Combining this syntactic data with dictionary-based glosses achieves significant gains and sets a new state of the art.

universal-dependencies Natural Language Processing machine translation in-context learning

RESEARCH·arXiv CS.CL·4/22/2026

Probing for Reading Times

This research probes language model representations for human reading times across five languages, comparing them against scalar predictors. It finds that early layers of language models outperform traditional surprisal in predicting early-pass reading measures, suggesting an alignment between model depth and human cognitive processing stages.

language models human-computer interaction cognitive science Natural Language Processing

RESEARCH·DEV.to AI·4/21/2026

Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

This research proposes a novel method to enhance video-text retrieval by integrating multi-stream corpus alignment. It also introduces a Dual Softmax Loss function to further improve the accuracy and efficiency of matching video content with textual descriptions.

machine learning computer vision Natural Language Processing Information Retrieval

DOC·DEV.to AI·6d ago

Email Spam Classifier with Streamlit and Docker

This guide details an end-to-end Machine Learning pipeline for email spam classification. It compares Naive Bayes and RoBERTa models, visualizes with Streamlit, and deploys using Docker.

Docker Streamlit machine learning Natural Language Processing

NEWS·DEV.to AI·29d ago

We gave actual claws to Openclaw agent and it flies a drone now

The Openclaw agent, which recently went viral for controlling a drone with a natural language prompt, can now autonomously control drones via MAVLink on Dimensional. This open-source development allows the agent to handle perception, tracking, and flight control from a single natural language query.

Open Source Autonomous systems Natural Language Processing robotics

RESEARCH·arXiv CS.CL·4/13/2026

Uncertainty Estimation for the Open-Set Text Classification systems

This paper focuses on accurate uncertainty estimation for open-set text classification (OSTC) systems, where text samples can be classified into existing classes or rejected as unknown. It adapts the Holistic Uncertainty Estimation (HolUE) method for the text domain to capture text and gallery uncertainties, and proposes a new OSTC benchmark.

machine learning Natural Language Processing trustworthy AI Uncertainty Estimation

RESEARCH·arXiv CS.AI·29d ago

More Thinking, More Bias: Length-Driven Position Bias in Reasoning Models

New research indicates that position bias in reasoning models, such as those that use chain-of-thought, scales with the length of the reasoning trajectory. This effect was observed across various model configurations and benchmarks, suggesting that "more thinking" can exacerbate certain biases.

AI bias Natural Language Processing reasoning models Machine learning research

RESEARCH·arXiv CS.CL·21d ago

SKG-Eval: Stateful Evaluation of Multi-Turn Dialogue via Incremental Semantic Knowledge Graphs

SKG-Eval addresses the challenge of evaluating multi-turn dialogue systems by modeling dialogue as an evolving Semantic Knowledge Graph (SKG). This framework incrementally updates the graph through structured triple extraction to detect long-range issues like contradiction and inconsistency, offering improved evaluation beyond turn-isolated representations.

Knowledge Graphs Natural Language Processing Evaluation Metrics dialogue systems

RESEARCH·arXiv CS.CL·7d ago

Cognitive-Linguistic Indicators of Depression in Online Communities: Analysed by DistilBERT and Holographic Reduced Representation

This paper investigates whether combining cognitively grounded linguistic features with transformer-based embeddings improves automated detection of depression in online text. The study compares a TF-IDF baseline model with a hybrid DistilBERT-HRR model, showing the latter achieves a significantly higher macro F1 score of 0.94.

online-communities depression detection machine learning Natural Language Processing

ARTICLE·DEV.to AI·5/7/2026

The Transformer: The Architecture Behind Modern AI

The Transformer architecture, introduced by Vaswani et al. in 2017, marked a pivotal shift in AI from sequential processing to parallel understanding, primarily through its attention mechanism. This innovation allows models to process meaning and context simultaneously, akin to thinking directly in a language rather than translating word by word.

AI architecture Attention Mechanism Transformer machine learning

RESEARCH·DEV.to AI·25d ago

A Survey on Gender Bias in Natural Language Processing

A survey on gender bias in Natural Language Processing analyzes how gender stereotypes are perpetuated in AI models. The study discusses methods to mitigate these biases and explores challenges in creating more equitable NLP systems.

AI bias Natural Language Processing AI ethics gender bias

ARTICLE·DEV.to AI·23d ago

Understanding How ChatGPT Generates Images: A Deep Dive into AI Creativity

This article explores how ChatGPT contributes to image generation, the underlying technologies, and the implications for developers, artists, and businesses. The ability to create visuals from textual descriptions streamlines processes and democratizes artistry, enhancing productivity.

AI Creativity ChatGPT image generation Natural Language Processing

RESEARCH·DEV.to AI·5/1/2026

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

This post discusses Deep Dyna-Q, an approach that integrates planning into dialogue policy learning for conversational AI systems. The focus is on optimizing the task-completion process through spoken interaction with AI.

reinforcement learning Natural Language Processing AI algorithms dialogue systems

RESEARCH·DEV.to AI·4/27/2026

Using WordNet to Complement Training Information in Text Categorization

This post discusses using WordNet to complement training information in text categorization. It focuses on leveraging semantic data from WordNet to improve the accuracy of text classification models.

Text Categorization machine learning Natural Language Processing WordNet

RESEARCH·DEV.to AI·26d ago

Generative Simulation Benchmarking for heritage language revitalization programs for extreme data sparsity scenarios

The text discusses the challenge of building language models for critically endangered heritage languages under extreme data sparsity scenarios. The author recounts their personal experience with a minuscule dataset for a language like Halkomelem, highlighting the need for novel approaches for such situations.

language models Natural Language Processing Data Sparsity endangered languages

CASE·AWS Machine Learning Blog·12d ago

Training Azerbaijani language models on Amazon SageMaker AI

Azercell Telecom partnered with the AWS Generative AI Innovation Center to develop an Azerbaijani large language model (LLM) on Amazon SageMaker AI. This six-week collaboration established a production-ready framework for telecom use cases and a customer-facing chatbot, overcoming data scarcity and linguistic complexity challenges.

Telecommunications Natural Language Processing Amazon SageMaker Generative AI

RESEARCH·DEV.to AI·4/25/2026

JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis

The JSUT corpus is a free, large-scale Japanese speech dataset designed for end-to-end speech synthesis research. It provides valuable resources for developing advanced AI models in speech technology for the Japanese language.

japanese language speech synthesis machine learning Natural Language Processing

NEWS·DEV.to AI·4/19/2026

Claude Code's Playwright MCP Server: Generate Web Tests from Natural Language

Claude Code now integrates with Playwright via a dedicated Model Context Protocol (MCP) server, allowing the generation of complete test automation from natural language prompts. This direct bridge empowers developers to describe test scenarios, have Claude write and execute Playwright code, and report results all within the terminal.

Claude Code Natural Language Processing Playwright AI