Somewhere in a research building in Santiago de Compostela, a group of computer scientists has spent an inordinate amount of time debating what constitutes persuasion. Not the obvious kind. Not the used-car-salesman kind. The more subdued version, the one that slides into a political tweet or news headline and changes your perspective before you’ve finished your coffee. Last year, they released a thorough and methodical survey of how machine learning and natural language processing are being trained to detect this kind of behavior. Reading it is like watching someone attempt to weigh smoke.
The idea seems fairly straightforward. Online platforms have made persuasion quicker, cheaper, and harder to trace. So scientists are developing models that scan text for linguistic clues of manipulation, propaganda, loaded framing, and emotional bait. In their Frontiers paper, Davide Bassi and his colleagues divide these efforts into two groups. One relies on linguistics, examining particular words and stylistic patterns; its results are easy to explain but narrow in scope. The other relies on deep learning and argumentation, which can scale beautifully but tends to act like a black box. No one seems thrilled with the trade-off.

Reading the literature gives the impression that this field is still figuring out what it is measuring. Propaganda is not the same as persuasion. Manipulation is not the same as propaganda. An intense argument is not the same as manipulation. The terms become hazier still across Arabic, Spanish, English, and the dozens of other languages currently being fed into these models. A different team, working on Arabic text for the 2023 ArAIEval challenge, built on the multilingual model XLM-RoBERTa and reported a micro F1 score of 0.64 for identifying persuasive strategies. Not spectacular, but decent. The number says something about how slippery the task is.
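For readers unfamiliar with the metric, micro F1 pools every correct and incorrect label decision across all texts before computing precision and recall, so common techniques weigh more heavily than rare ones. The sketch below illustrates the arithmetic only. It is not the challenge's official scorer, and the technique names and toy data are invented for the example.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 for multi-label data.

    gold and predicted are parallel lists of label sets, one per text.
    Counts are pooled over all texts before precision/recall are computed.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels correctly predicted
        fp += len(p - g)   # labels predicted but not in the gold set
        fn += len(g - p)   # gold labels the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: four short texts, label names made up.
gold = [
    {"loaded_language"},
    {"name_calling", "exaggeration"},
    set(),
    {"appeal_to_fear"},
]
predicted = [
    {"loaded_language", "exaggeration"},
    {"name_calling"},
    set(),
    {"doubt"},
]

score = micro_f1(gold, predicted)
print(f"micro F1 = {score:.2f}")
```

On this toy data the model gets two labels right, adds two that are wrong, and misses two, which is the kind of mixed record a middling score summarizes.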
It’s remarkable how frequently the developers of these systems sound more like concerned philosophers than engineers. Their worry is that, to a neural network, a well-crafted argument and a piece of propaganda can look uncomfortably alike. The Bassi paper candidly acknowledges that existing models are unable to distinguish between vicious and virtuous persuasion. A teacher persuading students to read more uses some of the same rhetorical techniques as a political operative promoting a conspiracy. Sometimes the algorithm is unable to tell the two apart. To be honest, we can’t always either.
More recent work is making progress. SafePersuasion, a dataset that attempts to distinguish between manipulation and logical persuasion across a layered taxonomy, was introduced in a 2025 paper. Earlier this year, a different group at the University of Illinois published a thorough survey arguing that as large language models become more adept at producing persuasive content, the field needs to consider safety, fairness, and the dangers of machines that can persuade people. There is also a growing body of research on identifying persuasion attacks produced by GPT-4 and Llama variants, a sentence that five years ago would have sounded like science fiction.
It’s difficult to ignore the irony. The same technology that makes synthetic persuasion easier to produce is being asked to protect us from it. Researchers at Rutgers, Caen, Padua, and Urbana-Champaign are all using potentially weaponizable tools to pursue the same elusive goal. It’s still unclear whether any of this can move as quickly as a viral post. The work is serious. The stakes are clear. The outcome is less so. For now, most of those decoding persuasion are simply trying to keep up with those deploying it.

