TNO
[NOTE: REPEAT OF PRESENTATION TO BE GIVEN AT ICAIL (JUNE 2023) AND PAPER TO BE PUBLISHED IN CONFERENCE PROCEEDINGS]
Legal documents, and law texts in particular, are not easy for humans
to understand. Their specialized terminology and sentence constructions
also make them difficult for machines to interpret.
In this paper, we present a publicly available
benchmark dataset containing Dutch law texts which can be used
to train AI models that assist people tasked with
interpreting legal texts. The dataset can also be used in a
broader context, such as semantic role labeling of Dutch (legal)
texts. Our dataset contains 4463 annotated sentences from 55 different
Dutch laws, in which four roles are annotated by human
annotators: action, actor, object and recipient. The inter-annotator
agreement is substantial (𝜅=0.75). In experiments with a rule-based
and a transformer-based method, the transformer-based
method performs well on the dataset (accuracy > 0.8).
These results indicate that actions, actors, objects and recipients
in legal texts can be predicted reliably. This can help people tasked
with the formal interpretation of legal texts.
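
To make the reported agreement figure concrete, the sketch below computes a kappa score for two annotators who label the same tokens with the four roles. It is only an illustration under stated assumptions: the abstract does not say which kappa variant or which unit of comparison was used, so Cohen's kappa over token-level labels is assumed here, and the "O" label for unannotated tokens, the function name and the toy data are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: agreement is measured per token; the paper may use a
    different unit (e.g. spans) or a different kappa variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example (invented data, not taken from the dataset).
annotator_1 = ["actor", "action", "object", "object", "recipient", "O"]
annotator_2 = ["actor", "action", "object", "O", "recipient", "O"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over plain percentage agreement when some labels are much more frequent than others.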