TNO
TNO and University of Leiden
TNO
TNO
Given the salience of the concept of causality in natural language and the many ways in which causality can be expressed in language, causality extraction is an important and challenging topic in the field of NLP. We present a fully developed causality extraction pipeline that comprises two steps: 1) identifying whether a given sentence has a causal meaning, and 2) extracting the relevant causal fragments (cause and effect) from the causal sentences. We use state-of-the-art methods, such as the UniCausal BERT model (Tan et al., 2022-1; Tan et al., 2022-2) and (Chat)GPT, to perform these tasks. From our experiments, we conclude that 1) it is possible to build a pipeline that effectively extracts causal spans from raw text, and 2) while rule-based methods perform best for extracting explicit causality, the more advanced approach of using a fine-tuned BERT model performs better when causality is expressed implicitly.
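The two-step pipeline can be illustrated with a minimal rule-based sketch for the explicit case; the connective list and regular expressions below are hypothetical simplifications, not the actual system (which uses UniCausal/BERT and GPT for the implicit case).

```python
import re

# Illustrative connectives only; the real pipeline covers many more patterns.
CAUSAL_CONNECTIVES = ["because", "due to", "leads to", "results in", "caused by"]

def is_causal(sentence: str) -> bool:
    """Step 1: flag a sentence as explicitly causal if it contains a connective."""
    s = sentence.lower()
    return any(c in s for c in CAUSAL_CONNECTIVES)

def extract_fragments(sentence: str):
    """Step 2: split an explicit causal sentence into (cause, effect) spans."""
    m = re.search(r"(.+?)\s+because\s+(.+)", sentence, flags=re.IGNORECASE)
    if m:  # pattern "<effect> because <cause>"
        return m.group(2).strip(" ."), m.group(1).strip(" .")
    m = re.search(r"(.+?)\s+(?:leads to|results in)\s+(.+)", sentence, flags=re.IGNORECASE)
    if m:  # pattern "<cause> leads to / results in <effect>"
        return m.group(1).strip(" ."), m.group(2).strip(" .")
    return None

sentence = "The river flooded because heavy rain fell for days."
if is_causal(sentence):
    cause, effect = extract_fragments(sentence)
    # cause: "heavy rain fell for days", effect: "The river flooded"
```

As the abstract notes, such rules work well for explicit connectives but miss implicitly expressed causality, which is where the fine-tuned BERT model takes over.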
Additionally, we show how these causal fragments can be used to semi-automatically create a (causal) knowledge graph. Such a knowledge graph gives a quick overview of the causal relations within a text, saving the user the time needed to read through it in full. A next step in our research is to perform a user experiment to validate this claim.
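Turning extracted (cause, effect) pairs into a knowledge graph can be sketched as building a directed adjacency map; the example pairs below are hypothetical, and a real implementation would likely use a graph library rather than plain dictionaries.

```python
from collections import defaultdict

def build_causal_graph(pairs):
    """Build a directed cause -> effect adjacency map from extracted fragments."""
    graph = defaultdict(set)
    for cause, effect in pairs:
        graph[cause].add(effect)
    return graph

# Hypothetical fragments as they might come out of the extraction pipeline.
pairs = [
    ("heavy rain", "flooding"),
    ("flooding", "crop damage"),
    ("heavy rain", "landslides"),
]
graph = build_causal_graph(pairs)
# graph["heavy rain"] contains both "flooding" and "landslides"
```

Chaining edges in this structure (heavy rain → flooding → crop damage) is what lets the graph summarize causal relations across a whole text at a glance.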
References:
* Tan, F. A., Hürriyetoğlu, A., Caselli, T., Oostdijk, N., Nomoto, T., Hettiarachchi, H., ... & Hu, T. (2022-1). The causal news corpus: Annotating causal relations in event sentences from news. arXiv preprint arXiv:2204.11714.
* Tan, F. A., Zuo, X., & Ng, S. K. (2022-2). UniCausal: Unified benchmark and model for causal text mining. arXiv preprint arXiv:2208.09163.