006 Spezielle Computerverfahren
Refine
H-BRS Bibliography
- no (3)
Document Type
- Article (2)
- Doctoral Thesis (1)
Language
- English (3)
Has Fulltext
- no (3)
Keywords
- Machine learning (3) (remove)
This dissertation presents a probabilistic state estimation framework for integrating data-driven machine learning models and a deformable facial shape model in order to estimate continuous-valued intensities of 22 different facial muscle movements, known as Action Units (AU), defined in the Facial Action Coding System (FACS). A practical approach is proposed and validated for integrating class-wise probability scores from machine learning models within a Gaussian state estimation framework. Furthermore, driven mass-spring-damper models are applied for modelling the dynamics of facial muscle movements. Both facial shape and appearance information are used for estimating AU intensities, making it a hybrid approach. Several features are designed and explored to help the probabilistic framework to deal with multiple challenges involved in automatic AU detection. The proposed AU intensity estimation method and its features are evaluated quantitatively and qualitatively using three different datasets containing either spontaneous or acted facial expressions with AU annotations. The proposed method produced temporally smoother estimates that facilitate a fine-grained analysis of facial expressions. It also performed reasonably well, even though it simultaneously estimates intensities of 22 AUs, some of which are subtle in expression or resemble each other closely. The estimated AU intensities tended to the lower range of values, and were often accompanied by a small delay in onset. This shows that the proposed method is conservative. In order to further improve performance, state-of-the-art machine learning approaches for AU detection could be integrated within the proposed probabilistic AU intensity estimation framework.
It is only a matter of time until autonomous vehicles become ubiquitous; however, human driving supervision will remain a necessity for decades. To assess the drive's ability to take control over the vehicle in critical scenarios, driver distractions can be monitored using wearable sensors or sensors that are embedded in the vehicle, such as video cameras. The types of driving distractions that can be sensed with various sensors is an open research question that this study attempts to answer. This study compared data from physiological sensors (palm electrodermal activity (pEDA), heart rate and breathing rate) and visual sensors (eye tracking, pupil diameter, nasal EDA (nEDA), emotional activation and facial action units (AUs)) for the detection of four types of distractions. The dataset was collected in a previous driving simulation study. The statistical tests showed that the most informative feature/modality for detecting driver distraction depends on the type of distraction, with emotional activation and AUs being the most promising. The experimental comparison of seven classical machine learning (ML) and seven end-to-end deep learning (DL) methods, which were evaluated on a separate test set of 10 subjects, showed that when classifying windows into distracted or not distracted, the highest F1-score of 79%; was realized by the extreme gradient boosting (XGB) classifier using 60-second windows of AUs as input. When classifying complete driving sessions, XGB's F1-score was 94%. The best-performing DL model was a spectro-temporal ResNet, which realized an F1-score of 75%; when classifying segments and an F1-score of 87%; when classifying complete driving sessions. Finally, this study identified and discussed problems, such as label jitter, scenario overfitting and unsatisfactory generalization performance, that may adversely affect related ML approaches.