Gender Marketing of Toys on Amazon.com: A text mining analysis of product descriptions

Mariia Zamyrova

Radboud University, Nijmegen, the Netherlands

For decades, toys have been a highly gender-segregated product. This division is reinforced by gender marketing, which is reflected in toy packaging and in the language used to advertise toys. This research extends the 2015 study by Owen and Padron on gender stereotypes in toys. The authors analyzed the texts on the packaging of 74 action figures categorized as ‘for boys’ or ‘for girls’, and compared them by the number of words belonging to categories such as power, science/technology and physical appearance.

This project examines how the properties of the text used to describe toys on Amazon.com differ between toys for girls, toys for boys and toys for both. It broadens the scope of Owen and Padron’s study by considering a wider range of toys. It also compares text properties not only across gender classes but also across release dates, which gives an overview of how gendered language in toy descriptions has evolved over the years. The study’s main novelty, however, is that it uses machine learning and natural language processing to automate an analysis that previous research carried out manually.

We analyzed 6804 toys released in the years 2000-2019. The dataset contained Amazon.com metadata for each toy, such as product description text and release date. As the data did not indicate who each toy is for, we first trained an MLP classifier to label the toys as ‘for boys’, ‘for girls’ or ‘for both boys and girls’ based on the features of their description text. Second, we analyzed how the text feature values differ between the three classes and over the years. The text features include general ones, such as average sentence and word length and the frequencies of punctuation marks, digits and POS tags, and a more specific set: frequencies of words from topical categories. These categories cover a variety of topics, from emotion-related ones like ‘aggression’ to miscellaneous ones like ‘family’ and ‘sports’. We computed the average of each feature per class and then examined the trends of those averages over the years. Two-sided pairwise t-tests were used to confirm the significance of differences between classes.
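As a rough illustration of the feature step described above, the following minimal Python sketch computes a few of the named features (average word and sentence length, digit frequency, topical-category word frequency) and averages them per class and year. It is not the study's implementation: the `CATEGORIES` lexicon, the function names and the tokenization rules are all assumptions made for illustration, and POS-tag features and the classifier are omitted.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical topical lexicon (assumption): the study's actual categories
# and word lists are not given in this abstract.
CATEGORIES = {
    "appearance": {"pretty", "beautiful", "fashion"},
    "anger": {"rage", "fury", "angry"},
}


def text_features(text):
    """Return a dict of simple features for one product description."""
    # Naive tokenization (assumption): alphabetic words, sentences split on . ! ?
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)
    n_words = max(len(words), 1)
    feats = {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sent_len": len(words) / max(len(sentences), 1),
        "digit_freq": sum(c.isdigit() for c in text) / n_chars,
    }
    # Share of words that belong to each topical category.
    for name, vocab in CATEGORIES.items():
        feats[f"cat_{name}"] = sum(w in vocab for w in words) / n_words
    return feats


def mean_by_class_and_year(records):
    """records: iterable of (label, year, text) tuples.

    Returns {(label, year): {feature_name: mean_value}}.
    """
    grouped = defaultdict(list)
    for label, year, text in records:
        grouped[(label, year)].append(text_features(text))
    return {
        key: {f: mean(row[f] for row in rows) for f in rows[0]}
        for key, rows in grouped.items()
    }
```

Under these assumptions, the per-class lists of feature values could then be compared pairwise with a two-sided t-test from a statistics library such as SciPy (`scipy.stats.ttest_ind`); which test variant the study used is not stated here.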

The results show that descriptions of toys ‘for girls’ contain more words connected to appearance, femininity and domestic chores like cleaning, while descriptions of toys ‘for boys’ contain more ‘anger’ words. The study also found that descriptions of toys ‘for boys’ contain significantly more digits than those of toys ‘for girls’, which could be connected to the existing imbalance in how STEM toys are marketed to the two genders. In future work, the description texts of specific toy types could be compared to see which toys contribute most to the gender differences.

References:

Owen, P. R. and Padron, M. (2015). “The language of toys: Gendered language in toy advertisements”. Journal of Research on Women and Gender, 6(1), 67-80. https://digital.library.txstate.edu/handle/10877/12878

CLIN33
The 33rd Meeting of Computational Linguistics in The Netherlands (CLIN 33)
UAntwerpen City Campus: Building R
Rodestraat 14, Antwerp, Belgium
22 September 2023