Personality Style Recognition via Machine Learning: Identifying Anaclitic and Introjective Personality Styles from Patients’ Speech

Semere Kiros Bitew

IDLab, Department of Information Technology, Ghent University – IMEC, Ghent, Belgium

Vincent Schelstraete

TechWolf, Ghent, Belgium

Klim Zaporojets

IDLab, Department of Information Technology, Ghent University – IMEC, Ghent, Belgium

Kimberly Van Nieuwenhove

Department of Psycho-analysis and Clinical Consulting, Ghent University, Ghent, Belgium

Reitske Meganck

IDLab, Department of Information Technology, Ghent University – IMEC, Ghent, Belgium

Chris Develder

IDLab, Department of Information Technology, Ghent University – IMEC, Ghent, Belgium

Patients' personality is considered crucial for disentangling the heterogeneity observed in psychopathology. Since it has been demonstrated that personality traits are reflected in the language a patient uses, we hypothesize that the personality type can be inferred automatically and directly from speech utterances, potentially more accurately than through a traditional questionnaire-based approach explicitly designed for personality classification. To validate this hypothesis, we adopt natural language processing (NLP) and standard machine learning tools for classification. We test this on a Dutch dataset of recorded clinical diagnostic interviews (CDI) with a sample of 79 patients diagnosed with major depressive disorder (MDD), a condition for which differentiated treatment based on personality styles has been advocated, each classified as having an anaclitic or an introjective personality style. We start by analyzing the interviews to see which linguistic features are associated with each style, in order to gain a better understanding of the styles. Then, we develop automatic classifiers based on (a) standardized questionnaire responses; (b) basic text features, i.e., TF-IDF (term frequency-inverse document frequency) scores of words and word sequences; (c) more advanced text features, using LIWC (linguistic inquiry and word count) and context-aware features using BERT (bidirectional encoder representations from transformers); and (d) audio features. We find that automated classification with language-derived features (i.e., based on LIWC) significantly outperforms questionnaire-based classification models. Furthermore, the best performance is achieved by combining LIWC with the questionnaire features. This suggests that more work should be put into developing linguistically based automated techniques for characterizing personality; however, questionnaires should still be used to complement such methods.
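To make the "basic text features" in (b) concrete, the following is a minimal sketch of TF-IDF scoring over words and word sequences (here, unigrams and bigrams). It is an illustration only, not the authors' pipeline: the function names `ngrams` and `tfidf`, the whitespace tokenization, the unsmoothed weighting tf * log(N / df), and the toy English transcripts are all assumptions introduced here.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(documents, n_max=2):
    """Map each document to {term: tf * log(N / df)} (assumed, unsmoothed variant)."""
    # Count n-gram occurrences per document (naive whitespace tokenization).
    term_counts = [Counter(ngrams(doc.lower().split(), n_max)) for doc in documents]
    n_docs = len(documents)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    vectors = []
    for counts in term_counts:
        total = sum(counts.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy transcripts, not taken from the study's data.
transcripts = [
    "i feel alone without them",
    "i feel i failed again",
]
vectors = tfidf(transcripts)
```

Terms shared by every transcript (such as "i feel") receive a weight of zero under this variant, while terms specific to one transcript receive positive weights; the resulting sparse vectors could then be passed to any standard classifier.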

CLIN33

The 33rd Meeting of Computational Linguistics in The Netherlands (CLIN 33)

UAntwerpen City Campus: Building R

Rodestraat 14, Antwerp, Belgium

22 September 2023