Towards Explainable Sexism Detection in Social Media: An Ensemble Approach with Human Rationalizations

Hadi Mohammadi

Department of Methodology and Statistics, Utrecht University, The Netherlands.

Ayoub Bagheri

Department of Methodology and Statistics, Utrecht University, The Netherlands.

Anastasia Giachanou

Department of Methodology and Statistics, Utrecht University, The Netherlands.

Sexist speech is a pervasive issue on online social media platforms. Although increasingly sophisticated models for detecting sexist speech continue to be developed, their explainability and interpretability remain limited. Most models treat online sexism detection as a binary task, which overlooks the multifaceted nature of sexist content and provides no clear reasoning for why a piece of content is classified as sexist. To address this issue, we present an ensemble model built on diverse pre-trained models. The ensemble extracts relational information from the text with three pre-trained models (BERT, XLM-RoBERTa, DistilBERT) and passes it through a Convolutional Neural Network (CNN) architecture. This design allows multiple techniques to be integrated, improving the model's robustness and generalizability. To foster transparency, we apply explainable artificial intelligence techniques to determine how individual tokens and the various model components influence the decision-making process. Moreover, we incorporate a feedback loop in which humans review and validate the model's predictions and explanations. This iterative approach improves the plausibility of the model's reasoning and its alignment with human judgment. We use explainability metrics to evaluate the model continually. The result is a more reliable and interpretable model whose decisions align with human evaluation. We also contribute a comprehensive dataset for sexism detection. Enriched with annotations and interpretability features, the dataset offers valuable insights into decision-making processes and serves as a robust foundation for further research.
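The pipeline described above (ensemble prediction, token-level explanation, comparison with human rationales) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify how member predictions are combined, which explanation technique is applied, or which explainability metrics are used. Here, simple lexicon scorers stand in for the fine-tuned BERT, XLM-RoBERTa, and DistilBERT members, soft voting stands in for the ensemble combination, leave-one-out occlusion stands in for the explanation technique, and Jaccard overlap stands in for an agreement metric between model and human rationales. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Set

# A scorer maps a token list to a probability that the text is sexist.
Scorer = Callable[[List[str]], float]


def make_lexicon_scorer(weights: Dict[str, float]) -> Scorer:
    """Hypothetical stand-in for one fine-tuned encoder with a CNN head."""
    def score(tokens: List[str]) -> float:
        total = sum(weights.get(t.lower(), 0.0) for t in tokens)
        return min(1.0, max(0.0, total))  # clamp to [0, 1]
    return score


def ensemble_score(scorers: List[Scorer], tokens: List[str]) -> float:
    """Soft voting (assumed): average the members' probabilities."""
    return sum(s(tokens) for s in scorers) / len(scorers)


def occlusion_attributions(scorers: List[Scorer], tokens: List[str]) -> List[float]:
    """Leave-one-out attribution (assumed technique): the drop in the
    ensemble score when a single token is removed from the input."""
    base = ensemble_score(scorers, tokens)
    return [
        base - ensemble_score(scorers, tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def top_k_rationale(attributions: List[float], k: int) -> Set[int]:
    """Indices of the k tokens with the highest attribution."""
    order = sorted(range(len(attributions)),
                   key=lambda i: attributions[i], reverse=True)
    return set(order[:k])


def rationale_jaccard(model: Set[int], human: Set[int]) -> float:
    """Jaccard overlap between model and human rationales (assumed metric)."""
    if not model and not human:
        return 1.0
    return len(model & human) / len(model | human)


# Toy data: placeholder tokens and made-up weights, not real model output.
members = [
    make_lexicon_scorer({"flagged_term": 0.75}),
    make_lexicon_scorer({"flagged_term": 0.5}),
    make_lexicon_scorer({"flagged_term": 0.25}),
]
tokens = ["some", "flagged_term", "text"]

attributions = occlusion_attributions(members, tokens)
model_rationale = top_k_rationale(attributions, k=1)
human_rationale = {1}  # hypothetical annotator highlighted token 1

print("ensemble score:", ensemble_score(members, tokens))
print("attributions:", attributions)
print("agreement:", rationale_jaccard(model_rationale, human_rationale))
```

In a human-in-the-loop setting such as the one described, a low agreement score for an example could be used to route it back to annotators for review, though how the feedback is actually incorporated is not detailed in the abstract.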

CLIN33
The 33rd Meeting of Computational Linguistics in The Netherlands (CLIN 33)
UAntwerpen City Campus: Building R
Rodestraat 14, Antwerp, Belgium
22 September 2023