Abstract:
Social media platforms can expose influential trends in many
aspects of everyday life. However, the trends they represent can be
contaminated by disinformation. Social bots are a significant source of
disinformation on social media and can pose serious cyber threats to society
and public opinion. This research aims to develop machine learning models
that detect bots based on user profiles extracted from tweet text.
A user profile captures the user's personal information,
such as age, gender, education, and personality. In this work, the user's
profile is constructed from the user's online posts. This work makes three
main contributions. First, we aim to improve bot detection through
machine learning models based on the user's personal information inferred from
the user's online posts. The similarity of personal information between
two online posts makes it difficult to differentiate a bot from a
human user. However, in this research, we turn this similarity of personal
information between two online posts into an advantage for the new bot
detection model. The proposed approach builds user profiles from personal
information such as age, personality, gender, and education inferred from the
user's online posts, and uses a machine learning model to detect social
bots with high prediction accuracy based on this personal information. Second,
we create a new public data set containing user profiles for more than 6900
Twitter accounts in the Cresci 2017~\cite{cresci-etal-2017-paradigm} data set.
All user profiles are extracted from the users' posts on Twitter.
Third, for the first time, this paper uses a deep contextualized word
embedding model, ELMo~\cite{peters-2018-deep}, for the social media bot
detection task.