Textgain / University of Antwerp (CLiPS)
Textgain
University of Antwerp
This research examines topic identification and generation in news texts through a comparative study of Large Language Models (LLMs) and human participants from Belgium, the USA, and Ukraine. In the first experiment, 110 participants of diverse backgrounds assigned topics to three news texts each. The findings revealed significant variations in topic assignment and naming, indicating a need for new evaluation metrics that move beyond simple binary matches. In the second experiment, seven native speakers and two LLMs generated topics for seven news texts. These generated topics were then anonymously assessed by a jury of three experts against the criteria of relevance, completeness, and clarity. The results shed light on the potential use of LLMs for topic detection and underscore the subjective nature of news topic identification by human annotators. The study highlights the need to acknowledge and accommodate the inherent diversity and subjectivity in topic identification, particularly when applying LLMs to topic detection and naming.