
Durham Research Online
When the timeline meets the pipeline: A survey on automated cyberbullying detection

Elsafoury, Fatma and Katsigiannis, Stamos and Pervez, Zeeshan and Ramzan, Naeem (2021) 'When the timeline meets the pipeline: A survey on automated cyberbullying detection', IEEE Access, 9, pp. 103541-103563.


Web 2.0 helped user-generated platforms to spread widely. Unfortunately, it also allowed for cyberbullying to spread. Cyberbullying has negative effects that could lead to cases of depression and low self-esteem. It has become crucial to develop tools for automated cyberbullying detection. The research on developing these tools has been growing over the last decade, especially with the recent advances in machine learning and natural language processing. Given the large body of work on this topic, it is vital to critically review the literature on cyberbullying within the context of these latest advances. In this paper, we survey the automated detection of cyberbullying. Our survey sheds light on some challenges and limitations for the field. The challenges range from defining cyberbullying, data collection, and feature representation to model selection, training, and evaluation. We also provide some suggestions for improving the task of cyberbullying detection. In addition to the survey, we propose to improve the task of cyberbullying detection by addressing some of the raised limitations: 1) Using recent contextual language models like BERT for the detection of cyberbullying; 2) Using slang-based word embeddings to generate better representations of the cyberbullying-related datasets. Our results show that BERT outperforms state-of-the-art cyberbullying detection models and deep learning models. The results also show that deep learning models initialized with slang-based word embeddings outperform deep learning models initialized with traditional word embeddings.

Item Type: Article
Full text: (AM) Accepted Manuscript
Available under License - Creative Commons Attribution 4.0.
Publisher statement: CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via to obtain full-text articles and stipulations in the API documentation.
Date accepted: 17 July 2021
Date deposited: 29 July 2021
Date of first online publication: 21 July 2021
Date first made open access: 29 July 2021
