Institute for Computing and Information Sciences, Radboud University, Nijmegen, Netherlands
Writing fanfiction based on original fiction is a well-known way for fans to interact with the source material and express their (dis)satisfaction with it. In this work, we investigate the relationship between canon and fanfiction by answering the question: What can topic modeling of fanfiction tell us about fans' views on the source material? Furthermore, we analyze whether fans agree with how the canon content develops. The majority of the research conducted in this domain takes the form of interviews. We instead analyze the featured topics and characters in fanfiction in an automated way using topic modeling, which provides a foundation for future studies of similar problems.
We analyze the frequently occurring themes in Supernatural fanfiction by applying topic modeling to 42,000 fanfiction summaries. Using Latent Dirichlet Allocation (LDA), we perform 10 passes over the corpus and extract 20 groups of words, each represented by its 10 top words, and assign a topic to each group. We also perform a temporal analysis by learning four topic models, one for each of four time periods of the show's run, to track the evolution of the fans' involvement and of the predominant themes that interest them, as told through their fanfiction. We label the topics by hand, drawing on the domain knowledge of the author.
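The pipeline described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the implementation used in this work: it swaps in collapsed Gibbs sampling for LDA inference (the text does not name an inference method or library), so a "pass" here means one full Gibbs sweep over the corpus. The function names, the hyperparameters `alpha` and `beta`, the period labels, and the toy token lists are all hypothetical and chosen only for illustration.

```python
import random


def fit_lda(docs, num_topics=20, passes=10, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Assumption: Gibbs sampling stands in for whatever inference the
    original study used; alpha and beta are illustrative defaults.
    Returns the vocabulary and the topic-by-word count matrix.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(passes):  # one pass = one sweep over all tokens
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return vocab, topic_word


def top_words(vocab, topic_word, n=10):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


def topics_per_period(docs_by_period, num_topics=20, passes=10, top_n=10, seed=0):
    """Learn one independent topic model per time period."""
    result = {}
    for period, docs in docs_by_period.items():
        vocab, topic_word = fit_lda(docs, num_topics, passes, seed=seed)
        result[period] = top_words(vocab, topic_word, top_n)
    return result


# Toy, invented token lists standing in for preprocessed summaries.
toy_docs_by_period = {
    "period_1": [["hunt", "ghost", "road"], ["brother", "hunt", "car"]],
    "period_2": [["angel", "heaven", "fall"], ["angel", "brother", "war"]],
}
for period, topics in topics_per_period(
    toy_docs_by_period, num_topics=2, passes=10, top_n=3
).items():
    print(period, topics)
```

The printed word groups would then be inspected and labeled by hand, as described above; on real data the call would use `num_topics=20`, `passes=10`, and `top_n=10` with one entry per time period.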
We report a divergence from the source material, expressed through the overrepresentation of secondary characters and changes in character dynamics, while the plot and setting remain close to the canon. Moreover, we identify queerification of the source material. Our results agree with the findings of previous non-automated research, which suggests that topic modeling is well suited to tasks in the domain of fanfiction. Future work can, for instance, apply topic modeling to the full text of fanfiction, weight the topics by the number of reads, and explore fanfiction per genre rather than per individual media property.