red-teaming

6 items

RESEARCH · arXiv CS.CL · 15d ago

How Far Will They Go? Red-Teaming Online Influence with Large Language Models

This research proposes an empirical red-teaming framework for evaluating how well locally deployed open-source large language models (LLMs) can support political influence campaigns, with a focus on information integrity. It measures "LLM Overton Windows" and quantifies how natural-language jailbreaks expand the range of political opinions a model can express, revealing systematic asymmetries in political expressivity.

NEWS · DEV.to AI · 25d ago

Agentic AI Red Teaming Emerges as Defence Against AI-Speed Attack Chains

Sweet Security has launched "Sweet Attack", a continuous agentic AI red-teaming platform designed to counter the growing asymmetry between AI-assisted attackers and human defenders. The platform uses live runtime telemetry from customer environments to identify genuinely exploitable attack chains. The launch signals an industry shift towards autonomous AI agents in security.
