Revue de l'Information Scientifique et Technique
Volume 25, Number 1, Pages 1-12
2020-12-22
Authors: Geet D’sa Ashwin, Illina Irina, Fohr Dominique
In the Internet age, information flows have grown rapidly and digital communication has increased with them. Hatred that once spread mainly through verbal communication has quickly moved online. Social media and community forums, which allow people to discuss and express their opinions, are becoming platforms for disseminating hate messages. Many countries have passed laws to prevent online hate speech, and these laws hold the companies that run social media responsible when they fail to remove it. However, manual analysis of hate speech on online platforms is infeasible: the volume of data makes it expensive and time-consuming. It is therefore important to process user content automatically in order to detect and remove hate speech from online media. In this work, we propose solutions to the problem of automatically detecting hate messages. We perform hate speech classification using word embedding representations and Deep Neural Networks (DNN), and we compare fastText and BERT (Bidirectional Encoder Representations from Transformers) embeddings. We perform classification using two approaches: (a) using word embeddings as input to Support Vector Machines (SVM) and DNN-based classifiers; (b) fine-tuning a BERT model for classification on a task-specific corpus. Among the DNN-based classifiers, we compare Convolutional Neural Networks (CNN), Bidirectional Long Short-Term Memory (Bi-LSTM) networks, and Convolutional Recurrent Neural Networks (CRNN). Classification was performed on a Twitter dataset with three classes: hate, offensive, and neither. Compared to the feature-based approaches, the BERT fine-tuning approach obtained a relative improvement of 16% in macro-average F1-measure and 5.3% in weighted F1-measure.
Keywords: natural language processing; classification; deep neural network; embedding; hate speech.
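The abstract reports results in both macro-average and weighted F1-measure. These can diverge sharply when classes are imbalanced. The following is a minimal sketch of how the two averages differ for the three-class setting (hate, offensive, neither). The function names and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter

# Class names follow the abstract; everything else here is hypothetical.
LABELS = ["hate", "offensive", "neither"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(per_class_f1(y_true, y_pred, l) for l in labels) / len(labels)


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Mean of per-class F1 weighted by each class's share of true labels."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(support[l] / total * per_class_f1(y_true, y_pred, l)
               for l in labels)


# Toy, invented labels: the rare "hate" class is never predicted correctly.
y_true = ["hate", "offensive", "offensive", "offensive", "neither", "neither"]
y_pred = ["offensive", "offensive", "offensive", "offensive", "neither", "hate"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

In this toy case the minority class is always missed, which pulls the macro average down much more than the weighted one. This is one plausible reading of why a gain on a minority class would show up more strongly in the macro-average figure than in the weighted figure.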