Machine Learning

Coordinated Reply Attacks in Influence Operations: Characterization and Detection

This paper presents a machine learning framework that detects tweets targeted by coordinated replies, as well as the replier accounts involved in such attacks.

Coordinated Reply Attacks: Characterization and Detection

Code repository for the ICWSM 2025 paper on coordinated reply attacks. Includes ML classifiers to detect targeted tweets (AUC 0.88) and coordinated accounts (AUC 0.97).

How to Handle Extremely Imbalanced Datasets

Undersampling. One way of handling an imbalanced dataset is to reduce the number of observations in every class except the minority class. The best-known algorithm in this group is random undersampling, which removes samples from the targeted classes at random. Undersampling methods fall into two groups according to their strategy: prototype generation methods and prototype selection methods.

Prototype generation. Given an original dataset $S$, prototype generation algorithms produce a new set $S'$ where $|S'| < |S|$ and $S' \not\subset S$. These techniques reduce the number of samples in the targeted classes, but the remaining samples are generated from the original set rather than selected from it. ...
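
The random undersampling described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the API of any particular library: the helper name `random_undersample`, the `seed` parameter, the toy data, and the choice to shrink every class down to the minority class size are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def random_undersample(X, y, seed=0):
    """Hypothetical helper: randomly drop rows so every class matches the minority size."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    # Assumed target: every class is reduced to the minority class count.
    n_min = min(len(indices) for indices in by_class.values())

    keep = []
    for indices in by_class.values():
        # Sampling without replacement; the minority class keeps all of its rows.
        keep.extend(rng.sample(indices, n_min))
    keep.sort()

    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (made up for illustration): 95 majority rows, 5 minority rows.
X = [[float(i), float(i) * 2.0] for i in range(100)]
y = [0] * 95 + [1] * 5

X_res, y_res = random_undersample(X, y)
print(sorted(Counter(y_res).items()))  # [(0, 5), (1, 5)]
```

Every row in `X_res` is copied unchanged from `X`, so the result is a subset of the original data. That is the opposite of prototype generation, where the reduced set is built from new points derived from the original samples.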