Robustness Evaluation of the German Extractive Question Answering Task

  • To ensure reliable performance of Question Answering (QA) systems, evaluation of robustness is crucial. Common evaluation benchmarks typically report only performance metrics, such as Exact Match (EM) and the F1 score. However, these benchmarks overlook critical factors for the deployment of QA systems. This oversight can result in systems that are vulnerable to minor perturbations in the input, such as typographical errors. While several methods have been proposed to test the robustness of QA models, there has been minimal exploration of these approaches for languages other than English. This study focuses on the robustness evaluation of German-language QA models, extending methodologies previously applied primarily to English. The objective is to nurture the development of robust models by defining an evaluation method specifically tailored to the German language. We assess the applicability of perturbations used in English QA models for German and perform a comprehensive experimental evaluation with eight models. The results show that all models are vulnerable to character-level perturbations. Additionally, the comparison of monolingual and multilingual models suggests that the former are less affected by character-level and word-level perturbations.
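To make the terms in the abstract concrete, the following is a minimal sketch of SQuAD-style Exact Match and token-level F1 scoring, together with one possible character-level perturbation (swapping adjacent interior letters). It is an illustration only and not the evaluation code of the paper: the function names, the normalization rules, and the choice of perturbation are assumptions made for this example.

```python
import random
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace.

    Assumption: German articles are not stripped, unlike English SQuAD scoring.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer span."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def swap_adjacent_chars(text, rate, rng):
    """Hypothetical typo perturbation: in roughly `rate` of the purely
    alphabetic words with at least four letters, swap two adjacent interior
    characters. The first and last letter of each word stay in place."""
    perturbed = []
    for word in text.split(" "):
        if len(word) >= 4 and word.isalpha() and rng.random() < rate:
            i = rng.randrange(1, len(word) - 2)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        perturbed.append(word)
    return " ".join(perturbed)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the perturbation is reproducible
    question = "Wann wurde Berlin gegründet"
    print(swap_adjacent_chars(question, rate=1.0, rng=rng))
    print(exact_match("Berlin.", "berlin"))
    print(f1_score("in Berlin", "Berlin"))
```

A robustness check in this spirit would score a model's answers on the original questions and on the perturbed ones, then compare the two sets of EM and F1 values.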

Metadata
Document Type: Conference Object
Language: English
Author: Shalaka Satheesh, Katharina Beckh, Katrin Klug, Héctor Allende-Cid, Sebastian Houben, Teena Hassan
Parent Title (English): Rambow, Wanner et al. (Eds.): Proceedings of the 31st International Conference on Computational Linguistics, January 19-24, 2025, Abu Dhabi, UAE
Number of pages: 17
First Page: 1785
Last Page: 1801
URN: urn:nbn:de:hbz:1044-opus-88935
URL: https://aclanthology.org/2025.coling-main.121/
Publisher: Association for Computational Linguistics
Publishing Institution: Hochschule Bonn-Rhein-Sieg
Date of first publication: 2025/01/31
Copyright: © 1963–2025 ACL; Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
Departments, institutes and facilities: Fachbereich Informatik; Institut für Technik, Ressourcenschonung und Energieeffizienz (TREE); Institut für KI und Autonome Systeme (A2S)
Dewey Decimal Classification (DDC): 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 006 Special computer methods
Entry in this database: 2025/02/20
Licence: Creative Commons - CC BY - Attribution 4.0 International