Hogeschool van Amsterdam
University of Groningen
Leiden University
University of Amsterdam
Radboud University
University of Groningen
University of Amsterdam
University of Groningen
Leiden University
The rapid advancement and widespread adoption of task-based conversational agents are revolutionizing various industries. These agents help clients of online retailers and service providers achieve their goals quickly and efficiently. Adopting Large-scale pre-trained Language Models (LLMs) as back-ends for task-based conversational agents can significantly enhance their response generation capabilities and enable more fluent, context-aware interactions with users. However, the emerging capabilities of LLMs are driven by scaling up model sizes and training on large datasets, which poses challenges for small and medium-sized enterprises (SMEs).
SMEs often lack the data and computational resources to train and deploy large models effectively. This problem is further exacerbated by domain-specific challenges: in specialized topic domains (e.g., insurance, telecommunications), a lack of data and domain knowledge limits the development of task-based agents. While using agents offered by larger corporations is an option, it comes with downsides such as economic dependency, limited safety and privacy assurances, insufficient domain specificity, and a lack of transparency. As a result, economic stakeholders in the Dutch ecosystem prefer to collectively develop open-source agents. This has led to the creation of the LESSEN consortium, which brings together academia, economic stakeholders from various retail and service domains, and governmental stakeholders to democratize conversational AI technology for the Dutch language.
The LESSEN project has three main objectives. The first is to create new network architectures that are compute- and data-efficient. In this research line, we will look for ways to make more efficient use of existing LLMs. A promising method is Parameter-Efficient Fine-Tuning (PEFT), which fine-tunes only a small number of task-specific parameters while freezing most parameters of the LLM. The second objective is to devise methods for domain adaptation and data augmentation. The myriad of (lesser-resourced) domains, tasks, and scenarios creates a high demand for data. One approach is to apply transfer learning methods to adapt conversational applications to low-resource contexts. Another is to generate synthetic data for training and evaluation. The third and final objective is to ensure the safety, privacy, and transparency of conversational agents. With their increasing capacity, conversational agents are more prone to hallucination, harmful content generation, and malicious manipulation. Their ability to preserve individuals' private information is also questionable. Furthermore, current transformer architectures are black boxes that lack transparency and explainability. We will therefore address the challenges of making conversational agents transparent, safe, and privacy-preserving.
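To make the PEFT idea concrete, the sketch below (plain NumPy, with hypothetical layer dimensions; not the project's actual implementation) illustrates a LoRA-style adapter, one popular PEFT variant: the pretrained weight matrix W stays frozen, and only two small low-rank factors A and B are trainable, shrinking the number of tunable parameters from d·k to r·(d+k).

```python
import numpy as np

# Hypothetical dimensions for a single frozen projection layer of an LLM.
d, k, r = 4096, 4096, 8  # input dim, output dim, LoRA rank (assumed values)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # pretrained weight: frozen, never updated
A = rng.standard_normal((d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, k))                     # trainable; zero init => adapter starts as a no-op

def forward(x):
    # Frozen path plus low-rank update: x @ (W + A @ B), computed cheaply.
    return x @ W + (x @ A) @ B

full_params = d * k            # parameters touched by full fine-tuning
lora_params = r * (d + k)      # parameters touched by LoRA fine-tuning
print(f"full fine-tuning: {full_params:,} parameters")
print(f"LoRA fine-tuning: {lora_params:,} parameters "
      f"({100 * lora_params / full_params:.2f}% of full)")
```

With these (assumed) dimensions, the adapter trains roughly 0.4% of the layer's parameters, which is what makes fine-tuning feasible on SME-scale hardware.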
Because industrial partners are involved, the techniques we develop will be grounded in real-world use cases. Moreover, the impact of the project will not be limited to the parties involved: we aim to democratize conversational AI technology for lesser-resourced languages and domains, and we will make the technology accessible to diverse economic stakeholders in the Dutch ecosystem by open-sourcing our algorithms.