Radboud University
After last year’s developments, we can all pack up and go home. ChatGPT has shown that NLP is a solved problem, right? Well, that may be how it seems to many, but we are not quite there yet. Still, the results are so impressive that we should investigate how such models can help us in our work.
Most attention has gone to how well ChatGPT is able to produce text. This is appropriate, as producing text is what the model was trained for. After all, it is a GENERATIVE Pre-trained Transformer. However, in order to generate well, the model must contain quite a lot of linguistic knowledge, which makes us wonder how well it would do at analysis.
One way to study language is to look at language use in large text corpora. The main problem there is finding the items we are interested in. The standard solution is annotation: we add markup that indicates various linguistic properties of words, sentences, etc. But that comes with its own problems. Human annotation is costly and error-prone, and for many tasks machine annotation is not good enough, as the software ultimately does not understand what it is doing. In this paper, we examine whether ChatGPT also represents a jump forward here.
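To make the notion of machine annotation concrete, the sketch below produces token-level annotations with a conventional pre-ChatGPT pipeline. spaCy and its small English model are used here purely as illustrative stand-ins, not as the specific tools evaluated in this paper.

```python
# A minimal sketch of conventional machine annotation (spaCy is an
# illustrative example; the paper does not commit to a specific pipeline).
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline
doc = nlp("The committee rejected the proposal because it was too vague.")

# Token-level annotations: part of speech, dependency relation, and head.
for token in doc:
    print(f"{token.text}\t{token.pos_}\t{token.dep_}\t{token.head.text}")
```

Such a pipeline is fast and cheap, but its output still needs the kind of scrutiny described above, since the software has no understanding of the text it is labelling.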
We will test how well ChatGPT can analyze modern English text in order to provide various annotations for linguistic research, such as word senses, dependency relations, coreference chains, discourse relations, text structure and argumentation. Evaluation will be by manual checks and, where appropriate, by comparing ChatGPT’s results with those of systems that were considered state of the art before ChatGPT’s arrival.
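For the ChatGPT side of such a comparison, annotations can be requested programmatically. The sketch below is illustrative only: the model name, prompt wording and output format are assumptions for the sake of the example, not the exact setup used in this study.

```python
# A sketch of querying a ChatGPT-family model for the same token-level
# annotations, so its output can be compared with a conventional pipeline.
# Model name and prompt are placeholders, not the configuration of this paper.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

sentence = "The committee rejected the proposal because it was too vague."
prompt = (
    "Annotate the following sentence. For every token give its part of "
    "speech, its syntactic head and the dependency relation, one token per "
    "line, tab-separated.\n\n" + sentence
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model identifier
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic output eases manual checking
)
print(response.choices[0].message.content)
```

Setting the temperature to zero keeps the output as reproducible as possible, which simplifies both the manual checks and the comparison against the output of earlier analysis systems.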