online influence — AI articles, news & research

RESEARCH · arXiv cs.CL · 15d ago

How Far Will They Go? Red-Teaming Online Influence with Large Language Models

This research proposes an empirical red-teaming framework for evaluating how far locally deployed open-source large language models (LLMs) can support political influence campaigns, with a focus on information integrity. It measures "LLM Overton Windows" and quantifies how natural-language jailbreaks expand the range of political opinions that models can express, revealing systematic asymmetries in political expressivity.

red-teaming · security · online influence · misinformation