A Combined Text-Based and Metadata-Based Deep-Learning Framework for the Detection of Spam Accounts on the Social Media Platform Twitter

Atheer S. Alhassun; Murad A. Rassam

LAPSE:2023.2679

Published Article

February 21, 2023

Abstract
Social networks have become an integral part of our daily lives, and their rapid growth has increased how much we communicate through them. Twitter is one of the most popular networks in the Middle East. Like other social media platforms, Twitter is vulnerable to spam accounts that spread malicious content. Arab countries are among the most targeted, possibly because of the lack of effective technologies that support the Arabic language. In addition, Arabic is a complex language with extensive grammar rules and many dialects, which presents challenges when extracting text data. Innovative methods to combat spam on Twitter have been the subject of many recent studies. This paper addressed the detection of Arabic spam accounts on Twitter by collecting an Arabic dataset suitable for spam detection. The dataset included premium features obtained through the Twitter premium API. Data labeling was conducted by flagging suspended accounts. A combined framework based on deep-learning methods was proposed, with several advantages, including more accurate and faster results with lower computational demands. Two types of data were used: text-based data, processed by a convolutional neural network (CNN) model, and metadata, processed by a simple neural network model. The outputs of the two models were combined to classify accounts as spam or not spam. The results showed that the proposed framework achieved an accuracy of 94.27% with the combined model using premium feature data, outperforming the best models tested thus far in the literature.
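
The two-branch design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the random untrained weights, the example metadata values, and the choice to fuse the branches by concatenation followed by a sigmoid unit are all assumptions made for illustration, since the abstract states only that the outputs of the two models are combined. All names (`text_branch`, `metadata_branch`, `combined_score`, `PARAMS`) are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these from data."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def relu(x):
    return max(0.0, x)

def text_branch(token_embeddings, filters, width):
    """CNN-style branch: 1D convolution over token positions, ReLU, global max pooling."""
    features = []
    for filt in filters:  # each filter holds width * embed_dim weights
        best = 0.0
        for start in range(len(token_embeddings) - width + 1):
            window = [v for tok in token_embeddings[start:start + width] for v in tok]
            best = max(best, relu(sum(w * v for w, v in zip(filt, window))))
        features.append(best)
    return features

def metadata_branch(metadata, weights):
    """Simple dense layer with ReLU over numeric account metadata."""
    return [relu(sum(w * v for w, v in zip(row, metadata))) for row in weights]

def combined_score(token_embeddings, metadata, params):
    """Concatenate both branch outputs (assumed fusion) and apply a sigmoid unit."""
    fused = (text_branch(token_embeddings, params["filters"], params["width"])
             + metadata_branch(metadata, params["meta_weights"]))
    z = sum(w * v for w, v in zip(params["out_weights"], fused)) + params["out_bias"]
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical dimensions: 4-dim embeddings, filter width 2, 3 filters,
# 5 metadata features, 3 hidden units in the metadata branch.
PARAMS = {
    "width": 2,
    "filters": rand_matrix(3, 2 * 4),
    "meta_weights": rand_matrix(3, 5),
    "out_weights": [random.uniform(-0.1, 0.1) for _ in range(3 + 3)],
    "out_bias": 0.0,
}

tokens = rand_matrix(6, 4)             # stand-in for 6 embedded tweet tokens
metadata = [0.2, 0.9, 0.1, 0.0, 0.5]   # stand-in for scaled account metadata
score = combined_score(tokens, metadata, PARAMS)
label = "spam" if score >= 0.5 else "not spam"
print(f"spam probability (untrained weights): {score:.3f} -> {label}")
```

Because the weights are random, the printed probability carries no meaning; the sketch shows only how a text branch and a metadata branch can feed a single spam/not-spam decision.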

Record ID

LAPSE:2023.2679

Keywords

Arabic spam account, deep convolution neural networks, deep learning, online social network, spam detection

Subject

Numerical Methods and Statistics

Suggested Citation

Alhassun AS, Rassam MA. A Combined Text-Based and Metadata-Based Deep-Learning Framework for the Detection of Spam Accounts on the Social Media Platform Twitter. (2023). LAPSE:2023.2679

Author Affiliations

Alhassun AS: Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
Rassam MA: Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia; Faculty of Engineering and Information Technology, Taiz University, Taiz 6803, Yemen

Journal Name

Processes

Volume

10

Issue

3

First Page

439

Year

2022

Publication Date

2022-02-22