
AI security

72 items

RESEARCH · arXiv CS.CL · 13h ago

Can Multi-Agent LLMs Identify Their Peers? Stylometric Fingerprinting in Role-Constrained Political Analysis

This paper systematically investigates whether multi-agent LLMs can identify the original model family behind political analysis texts, even when prompt-level anonymization is applied. It evaluates three classifier approaches (LLM zero-shot, few-shot, and fine-tuned T5-base) on a five-class attribution task to assess the sufficiency of anonymization as a mitigation for peer-preservation bias.

53
ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/20/2026

Runtime security for AI agents: risk scoring, policy enforcement, and rollback for production agent pipeline [P]

This post introduces a runtime security system for AI agents, designed to prevent unintended actions, PII leaks, and infinite loops in production. It scores risk in real time across five dimensions (action type, resource sensitivity, blast radius, frequency, and context deviation) and pairs that score with policy enforcement and rollback capabilities.

42
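To make the five-dimension scoring described in the item above concrete, here is a minimal sketch of a weighted risk score with threshold-based policy decisions. The dimension names come from the summary; the weights, the thresholds, and the names `riskScore` and `decide` are illustrative assumptions, not the project's actual API.

```typescript
// Hypothetical sketch: weights and thresholds are assumptions, not taken from the post.
type Dimension =
  | "actionType"
  | "resourceSensitivity"
  | "blastRadius"
  | "frequency"
  | "contextDeviation";

// Each signal is expected to be normalized to the range [0, 1].
type RiskSignals = Record<Dimension, number>;

// Assumed weights; they sum to 1 so the combined score also stays in [0, 1].
const WEIGHTS: RiskSignals = {
  actionType: 0.25,
  resourceSensitivity: 0.25,
  blastRadius: 0.2,
  frequency: 0.1,
  contextDeviation: 0.2,
};

type Decision = "allow" | "review" | "block";

// Weighted sum of the five signals, clamping each one to [0, 1] first.
function riskScore(signals: RiskSignals): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    const clamped = Math.min(1, Math.max(0, signals[dim]));
    total += WEIGHTS[dim] * clamped;
  }
  return total;
}

// Map the score to a policy decision using assumed thresholds.
function decide(signals: RiskSignals, reviewAt = 0.4, blockAt = 0.7): Decision {
  const score = riskScore(signals);
  if (score >= blockAt) return "block";
  if (score >= reviewAt) return "review";
  return "allow";
}
```

A real system would likely derive each signal from the agent's pending action and its history; the linear weighting here is only one of several plausible ways to combine them.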
ARTICLE · DEV.to AI · 4/16/2026

NEW PROMPT INJECTION

This article by Karen Tonoyan introduces the concept of Narrative Drift Injection (NDI) as a new dimension of prompt injection. Unlike classic attacks, NDI manipulates the AI model by drawing it into a narrative it co-creates, causing it to lose vigilance at the session level.

31
ARTICLE · DEV.to AI · 4/17/2026

Why Cursor Keeps Writing Prototype Pollution Into Your JS

This article highlights how AI editors, specifically Cursor, reproduce a dangerous recursive merge pattern from pre-2019 training data, leading to "prototype pollution" vulnerabilities in JavaScript. This security flaw allows attackers to inject properties onto `Object.prototype`, affecting all objects, and was previously identified in `lodash` (CVE-2019-10744).

28
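As an illustration of the recursive merge issue summarized in the item above, the sketch below shows one common mitigation: skipping the keys that can reach `Object.prototype` during a deep merge. The helper names `safeMerge` and `isPlainObject` are hypothetical, and this is not claimed to be the fix the article or `lodash` uses.

```typescript
type PlainObject = Record<string, unknown>;

// Keys that can be used to reach or replace an object's prototype.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `source` into `target`, ignoring prototype-related keys.
function safeMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) continue; // the guard a naive merge lacks

    const incoming = source[key];
    // Only reuse values that are own properties of target, never inherited ones.
    const existing = Object.prototype.hasOwnProperty.call(target, key)
      ? target[key]
      : undefined;

    if (isPlainObject(incoming)) {
      const base: PlainObject = isPlainObject(existing) ? existing : {};
      target[key] = safeMerge(base, incoming);
    } else {
      target[key] = incoming;
    }
  }
  return target;
}

// Usage: JSON.parse creates "__proto__" as an own property, which a naive
// recursive merge would follow into Object.prototype.
const untrusted = JSON.parse('{"__proto__": {"polluted": true}, "a": {"b": 1}}') as PlainObject;
const result = safeMerge({}, untrusted);
console.log(JSON.stringify(result));
```

Other defenses, such as merging into objects created with `Object.create(null)` or validating input against a schema, can be combined with the key filter.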
ARTICLE · DEV.to AI · 4/8/2026

The OpenClaw Security Crisis: 135,000 Exposed AI Agents and the Runtime Governance Gap

On February 3, 2026, a severe vulnerability (CVE-2026-25253, CVSS 8.8) allowing remote code execution was disclosed in OpenClaw, an open-source AI agent. This led to the discovery of 138 vulnerabilities in 63 days, with more than 135,000 OpenClaw instances publicly exposed worldwide, many without authentication.

28
ARTICLE · DEV.to AI · 4/15/2026

A Complete Guide to Securing AI-Generated Code: From Pre-LLM Sanitization to AI-Native SAST (2026)

This article analyzes the security risks of AI coding assistants such as GitHub Copilot and highlights two main ones: generated code that contains security flaws, and sensitive data (API keys, PII) exposed when developers paste their code into AI tools. It notes that while most security teams address the former, few have a plan for the data leakage inherent in the latter.

28