006 Special computer methods
This paper addresses the classification of Arabic text data in the field of Natural Language Processing (NLP), with a particular focus on Natural Language Inference (NLI) and Contradiction Detection (CD). Arabic is considered a resource-poor language: few data sets are available, which in turn limits the availability of NLP methods. To overcome this limitation, we create a dedicated data set from publicly available resources. We then train and evaluate transformer-based machine learning models. We find that a language-specific model (AraBERT) performs competitively with state-of-the-art multilingual approaches when we apply linguistically informed pre-training methods such as Named Entity Recognition (NER). To our knowledge, this is the first large-scale evaluation for this task in Arabic, as well as the first application of multi-task pre-training in this context.
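The abstract does not spell out how NLI and CD relate as classification tasks. One common framing, assumed here purely for illustration, treats CD as a binary collapse of the three-way NLI labels. The minimal sketch below shows that mapping together with a plain accuracy metric; the label names, the `SentencePair` type, the helper functions, and the toy English sentence pairs are all hypothetical and are not taken from the paper or its data set.

```python
from dataclasses import dataclass

# Hypothetical label names; the paper's data set may use different ones.
NLI_LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class SentencePair:
    """A premise/hypothesis pair with a three-way NLI label (assumed schema)."""
    premise: str
    hypothesis: str
    nli_label: str  # expected to be one of NLI_LABELS


def to_contradiction_label(nli_label: str) -> int:
    """Collapse a three-way NLI label into a binary CD label (1 = contradiction)."""
    if nli_label not in NLI_LABELS:
        raise ValueError(f"unknown NLI label: {nli_label!r}")
    return int(nli_label == "contradiction")


def accuracy(gold: list[int], predicted: list[int]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy pairs, invented for illustration only.
pairs = [
    SentencePair("The shop opens at nine.", "The shop opens in the morning.", "entailment"),
    SentencePair("The shop opens at nine.", "The shop sells books.", "neutral"),
    SentencePair("The shop opens at nine.", "The shop is closed all day.", "contradiction"),
]

cd_labels = [to_contradiction_label(p.nli_label) for p in pairs]
```

In a real experiment, the gold CD labels derived this way would be compared against the predictions of a fine-tuned transformer classifier; `accuracy` stands in for whichever evaluation metric the study actually reports.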