LLM vulnerabilities

3 items

RESEARCH · DEV.to AI · 19d ago

One hidden neuron can disable safety guards

This study reports that the safety guards of large language models can be disabled by flipping a single hidden neuron. The intervention works across several model families and scales, which challenges the assumption that alignment is robustly distributed throughout the network.

LLM vulnerabilities · security · AI safety

ARTICLE · DEV.to AI · 4/17/2026

The Prompt-Injection Bug That Took Down My Agent for 6 Hours

The author describes a 6-hour outage of their AI content agent, caused by an indirect prompt injection that entered through an unvalidated research file. During the outage the agent generated 47 identical, unfinished drafts, which shows why AI systems need to validate their inputs.

LLM vulnerabilities · prompt injection · AI security · AI agents

ARTICLE · DEV.to AI · 4/13/2026

Corpus poisoning and indirect prompt injection against RAG-based SOC assistants benchmark results (80% and 100% ASR respectively)

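This article demonstrates corpus poisoning and indirect prompt injection against a RAG-based AI assistant for security operations centers (SOC). In the benchmark, the reported attack success rate (ASR) is 80% for corpus poisoning and 100% for indirect prompt injection, which indicates that such systems are vulnerable to both attacks.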

LLM vulnerabilities · Corpus Poisoning · RAG · prompt injection