Population ageing and the growing prevalence of disability have resulted in a growing need for personal care and assistance. The insufficient supply of personal care workers and the rising costs of long-term care have turned this phenomenon into a greater social concern. This has resulted in growing interest in assistive technology in general, and assistive robots in particular, as a means of substituting or supplementing the care provided by humans, and as a means of increasing the independence and overall quality of life of persons with special needs. Although many assistive robots have been developed in research labs worldwide, very few are commercially available. One of the reasons for this is cost. One way of optimising cost is to develop solutions that address specific needs of users. As a precursor to this, it is important to identify gaps between what users need and what the technology (assistive robots) currently provides. This information is obtained through technology mapping.
The current literature lacks a mapping between user needs and assistive robots, at the level of individual systems. The user needs are not expressed in uniform terminology across studies, which makes comparison of results difficult. In this research work, we have illustrated the technology mapping of assistive robots using the International Classification of Functioning, Disability and Health (ICF). ICF provides standard terminology for expressing user needs in detail. Expressing the assistive functions of robots also in ICF terminology facilitates communication between different stakeholders (rehabilitation professionals, robotics researchers, etc.).
We also investigated existing taxonomies for assistive robots. It was observed that there is no widely accepted taxonomy for classifying assistive robots. However, there exists an international standard, ISO 9999, which classifies commercially available assistive products. The applicability of the latest revision of the ISO 9999 standard for classifying mobility assistance robots has been studied. A partial classification of assistive robots based on ISO 9999 is suggested. The taxonomy and technology mapping are illustrated with the help of four robots that have the potential to provide mobility assistance. These are the SmartCane, the SmartWalker, MAid and Care-O-bot® 3. SmartCane, SmartWalker and MAid provide assistance by supporting physical movement. Care-O-bot® 3 provides assistance by reducing the need to move.
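As a rough illustration of such a technology mapping, the sketch below encodes the four example robots together with candidate ICF activity codes and ISO 9999 product classes. All codes and class labels shown are assumptions for illustration only and would need to be verified against the current revisions of both classifications.

```python
# Illustrative technology mapping: assistive robots -> ICF activity codes
# and ISO 9999 product classes. All codes below are placeholders and must
# be checked against the official classifications.

ROBOT_MAPPING = {
    "SmartCane": {
        "assistance_type": "supports physical movement",
        "icf_activities": ["d450 Walking"],                                # assumed ICF code
        "iso9999_class": "12 03 Walking aids manipulated by one arm",      # assumed class
    },
    "SmartWalker": {
        "assistance_type": "supports physical movement",
        "icf_activities": ["d450 Walking", "d465 Moving around using equipment"],
        "iso9999_class": "12 06 Walking aids manipulated by both arms",    # assumed class
    },
    "MAid": {
        "assistance_type": "supports physical movement",
        "icf_activities": ["d465 Moving around using equipment"],
        "iso9999_class": "12 23 Powered wheelchairs",                      # assumed class
    },
    "Care-O-bot 3": {
        "assistance_type": "reduces the need to move",
        "icf_activities": ["d430 Lifting and carrying objects"],
        "iso9999_class": None,  # no obvious ISO 9999 class; illustrates a taxonomy gap
    },
}

def robots_for_icf_code(code_prefix: str) -> list[str]:
    """Return robots whose mapped ICF activities start with the given code."""
    return [
        name
        for name, entry in ROBOT_MAPPING.items()
        if any(act.startswith(code_prefix) for act in entry["icf_activities"])
    ]

if __name__ == "__main__":
    print(robots_for_icf_code("d450"))  # e.g. ['SmartCane', 'SmartWalker']
```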
Emotional communication is a key element of habilitation care for persons with dementia. It is therefore highly preferable that assistive robots used to supplement human care for persons with dementia possess the ability to recognize and respond to emotions expressed by those being cared for. Facial expressions are one of the key modalities through which emotions are conveyed. This work focuses on computer vision-based recognition of facial expressions of emotions conveyed by the elderly.
Although there has been much work on automatic facial expression recognition, the algorithms have been experimentally validated primarily on young faces. Facial expressions on older faces have been entirely excluded. This is because the facial expression databases available and used in facial expression recognition research so far do not contain images of facial expressions of people above the age of 65 years. To overcome this problem, we adopt a recently published database, namely the FACES database, which was developed to address exactly the same problem in the area of human behavioural research. The FACES database contains 2052 images of six different facial expressions, with almost identical and systematic representation of the young, middle-aged and older age-groups.
In this work, we evaluate and compare the performance of two existing image-based approaches for facial expression recognition across a broad age spectrum ranging from 19 to 80 years. The evaluated systems use Gabor filters and uniform local binary patterns (LBP) for feature extraction, and AdaBoost.MH with a multi-threshold stump learner for expression classification. We have experimentally validated the hypotheses that facial expression recognition systems trained only on young faces perform poorly on middle-aged and older faces, and that such systems confuse ageing-related facial features on neutral faces with other expressions of emotions. We also found that, among the three age-groups, the middle-aged group provides the best generalization performance across the entire age spectrum. The performance of the systems was also compared to that of humans in recognizing facial expressions of emotions. Some similarities were observed, such as difficulty in recognizing the expressions on older faces, and difficulty in recognizing the expression of sadness.
The findings of our work establish the need for developing approaches for facial expression recognition that are robust to the effects of ageing on the face. The scientific results of our work can be used as a basis to guide future research in this direction.
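The following sketch illustrates the kind of uniform-LBP feature pipeline described in the abstract above, using standard scikit-image and scikit-learn components. The original systems used AdaBoost.MH with a multi-threshold stump learner; plain scikit-learn AdaBoost with decision stumps is substituted here, and the dataset variables are hypothetical stand-ins for a FACES-like corpus.

```python
# Sketch of a uniform-LBP feature pipeline with a boosted classifier.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import AdaBoostClassifier

P, R = 8, 1  # LBP neighbourhood: 8 samples on a radius-1 circle

def lbp_histogram_features(face: np.ndarray, grid: int = 7) -> np.ndarray:
    """Divide an aligned grayscale face into grid x grid blocks and
    concatenate the uniform-LBP histograms of all blocks."""
    lbp = local_binary_pattern(face, P, R, method="uniform")
    h, w = lbp.shape
    bh, bw = h // grid, w // grid
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = lbp[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            hist, _ = np.histogram(block, bins=P + 2, range=(0, P + 2), density=True)
            feats.append(hist)
    return np.concatenate(feats)

# X_faces: list of aligned grayscale face images, y: expression labels.
# Both are hypothetical variables standing in for a FACES-like dataset.
def train_expression_classifier(X_faces, y):
    X = np.stack([lbp_histogram_features(f) for f in X_faces])
    # Default AdaBoost base learner is a depth-1 decision stump.
    clf = AdaBoostClassifier(n_estimators=200)
    return clf.fit(X, y)
```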
Introduction. The experience of pain is regularly accompanied by facial expressions. The gold standard for analyzing these facial expressions is the Facial Action Coding System (FACS), which provides so-called action units (AUs) as parametrical indicators of facial muscular activity. Particular combinations of AUs appear to be pain-indicative. The manual coding of AUs is, however, too time- and labor-intensive for clinical practice. New developments in automatic facial expression analysis promise to enable automatic detection of AUs, which might be used for pain detection. Objective. Our aim is to compare manual with automatic AU coding of facial expressions of pain. Methods. FaceReader7 was used for automatic AU detection. Using videos of 40 participants (20 younger, mean age 25.7 years; 20 older, mean age 52.1 years) undergoing experimentally induced heat pain, we compared the performance of FaceReader7 against manually coded AUs as the gold-standard labels. Percentages of correctly and falsely classified AUs were calculated, and sensitivity/recall, precision, and overall agreement (F1) were computed as indicators of congruency. Results. The automatic coding of AUs showed only poor to moderate outcomes regarding sensitivity/recall, precision, and F1. Congruency was better for younger than for older faces, and better for pain-indicative AUs than for other AUs. Conclusion. At present, automatic analyses of genuine facial expressions of pain may qualify at best as semiautomatic systems, which require further validation by human observers before they can be used to validly assess facial expressions of pain.
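The congruency indicators named above can be computed per AU from frame-wise binary codings; a minimal sketch (with hypothetical example codings, not FaceReader7 output) follows.

```python
# Minimal sketch of frame-wise agreement metrics for one AU, comparing
# automatic detections against manual FACS coding.
# `manual` and `auto` are hypothetical binary arrays (1 = AU present in a frame).
import numpy as np

def au_agreement(manual: np.ndarray, auto: np.ndarray) -> dict:
    tp = np.sum((manual == 1) & (auto == 1))
    fp = np.sum((manual == 0) & (auto == 1))
    fn = np.sum((manual == 1) & (auto == 0))
    recall = tp / (tp + fn) if tp + fn else 0.0      # sensitivity/recall
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # overall agreement (F1)
    return {"recall": recall, "precision": precision, "f1": f1}

# Example: one AU coded over six frames
print(au_agreement(np.array([1, 1, 0, 0, 1, 0]), np.array([1, 0, 0, 1, 1, 0])))
```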
In recent years, the ability of intelligent systems to be understood by developers and users has received growing attention. This holds in particular for social robots, which are supposed to act autonomously in the vicinity of human users and are known to raise peculiar, often unrealistic attributions and expectations. However, explainable models that, on the one hand, allow a robot to generate lively and autonomous behavior and, on the other, enable it to provide human-compatible explanations for this behavior are missing. In order to develop such a self-explaining autonomous social robot, we have equipped a robot with its own needs that autonomously trigger intentions and proactive behavior, and that form the basis for understandable self-explanations. Previous research has shown that undesirable robot behavior is rated more positively after receiving an explanation. We thus aim to equip a social robot with the capability to automatically generate verbal explanations of its own behavior by tracing its internal decision-making routes. The goal is to generate social robot behavior in a way that is generally interpretable, and therefore explainable on a socio-behavioral level, thereby increasing users' understanding of the robot's behavior. In this article, we present a social robot interaction architecture designed to autonomously generate social behavior and self-explanations. We set out requirements for explainable behavior generation architectures and propose a socio-interactive framework for behavior explanations in social human-robot interactions that enables explaining and elaborating according to users' needs for explanation as they emerge within an interaction. We then introduce an interactive explanation dialog flow concept that incorporates empirically validated explanation types. These concepts are realized within the interaction architecture of a social robot and integrated with its dialog processing modules. We present the components of this interaction architecture and explain how they are integrated to autonomously generate social behaviors as well as verbal self-explanations. Lastly, we report results from a qualitative evaluation of a working prototype in a laboratory setting, showing that (1) the robot is able to autonomously generate naturalistic social behavior, and (2) the robot is able to verbally self-explain its behavior to the user in line with users' requests.
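To make the idea of tracing internal decision-making routes into verbal self-explanations concrete, here is a deliberately simplified, hypothetical sketch; the class names and the explanation template are illustrative and do not reflect the architecture's actual API.

```python
# Hypothetical sketch: a need-triggered intention is recorded together with its
# decision trace, and a template turns that trace into a verbal why-explanation.
from dataclasses import dataclass

@dataclass
class Need:
    name: str        # e.g. "social contact"
    level: float     # current satisfaction in [0, 1]
    threshold: float # below this, the need triggers an intention

@dataclass
class Intention:
    behavior: str    # e.g. "approach the user and start small talk"
    triggered_by: Need

def explain(intention: Intention) -> str:
    """Generate a simple 'why' explanation from the decision trace."""
    need = intention.triggered_by
    return (f"I decided to {intention.behavior} because my need for "
            f"{need.name} dropped to {need.level:.1f}, which is below my "
            f"threshold of {need.threshold:.1f}.")

social = Need(name="social contact", level=0.2, threshold=0.4)
print(explain(Intention(behavior="approach the user and start small talk",
                        triggered_by=social)))
```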
BACKGROUND
Given the unreliability of self-report in patients with dementia, pain assessment should also rely on the observation of pain behaviors, such as facial expressions. Ideal observers would be well trained and would observe the patient continuously in order to pick up any pain-indicative behavior, requirements that go beyond what is realistically possible in pain care. Therefore, the need for video-based pain detection systems has been repeatedly voiced. Such systems would allow for constant monitoring of pain behaviors and thereby for a timely adjustment of pain management in these fragile patients, who are often undertreated for pain.
METHODS
In this road map paper we describe an interdisciplinary approach to developing such a video-based pain detection system. The development starts with the selection of appropriate video material of people in pain as well as the development of technical methods to capture their faces. Individual facial motions are then automatically extracted according to an international coding system, and computer algorithms are trained to detect the pain-indicative combinations and timing of those motions.
RESULTS/CONCLUSION
We hope to encourage colleagues to join forces and to inform end-users about an imminent solution to a pressing pain-care problem. In the near future, implementation of such systems can be foreseen to monitor immobile patients in intensive and postoperative care situations.
A device includes an input for sequential data associated with a face; a predictor configured to predict facial parameters; and a corrector configured to correct the predicted facial parameters on the basis of input data containing geometric measurements and other information. A related method and a related computer program are also disclosed.
This paper describes a dynamic, model-based approach for estimating the intensities of 22 out of 44 different basic facial muscle movements. These movements are defined as Action Units (AU) in the Facial Action Coding System (FACS) [1]. The maximum facial shape deformations that can be caused by the 22 AUs are represented as vectors in an anatomically based, deformable, point-based face model. The amount of deformation along each vector represents the corresponding AU intensity, whose valid range is [0, 1]. An Extended Kalman Filter (EKF) with state constraints is used to estimate the AU intensities. The focus of this paper is on the modeling of constraints in order to impose the anatomically valid AU intensity range of [0, 1]. Two process models are considered, namely constant velocity and driven mass-spring-damper. The results show the temporal smoothing and disambiguation effect of the constrained EKF approach when compared to the frame-by-frame model fitting approach ‘Regularized Landmark Mean-Shift (RLMS)’ [2]. This effect led to a more than 35% increase in performance on a database of posed facial expressions.
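The sketch below is a one-dimensional analogue of the constrained estimation idea: a linear Kalman filter with a constant-velocity process model whose intensity estimate is projected back into [0, 1] after each update. The actual method operates on a full deformable face model with an EKF; the noise parameters and frame rate here are assumptions.

```python
# Simplified, illustrative sketch of constrained intensity tracking for a single AU.
import numpy as np

dt = 1.0 / 30.0                           # assumed frame rate: 30 fps
F = np.array([[1.0, dt], [0.0, 1.0]])     # constant-velocity model, state: [intensity, velocity]
H = np.array([[1.0, 0.0]])                # we observe (noisy) intensity only
Q = np.diag([1e-4, 1e-3])                 # process noise (assumed)
R = np.array([[5e-2]])                    # measurement noise (assumed)

x = np.array([0.0, 0.0])                  # initial state
P = np.eye(2)

def step(x, P, z):
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    # Constraint handling: project the intensity back into its valid range [0, 1]
    x[0] = np.clip(x[0], 0.0, 1.0)
    return x, P

for z in [0.05, 0.3, 0.7, 1.2, 0.9]:      # noisy per-frame measurements
    x, P = step(x, P, z)
    print(f"estimated intensity: {x[0]:.2f}")
```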
Towards explaining deep learning networks to distinguish facial expressions of pain and emotions
(2018)
Deep learning networks are successfully used for object and face recognition in images and videos. However, current procedures are only suitable to a limited extent for applying such networks in practice, for example as a pain recognition tool in hospitals. The advantage of deep learning methods is that they can learn complex non-linear relationships between raw data and target classes without being limited to a set of hand-crafted features provided by humans. The disadvantage, however, is that due to the complexity of these networks it is not possible to interpret the knowledge that is stored inside the network: it is a black-box learning procedure. Explainable Artificial Intelligence (AI) approaches mitigate this problem by extracting explanations for decisions and representing them in a human-interpretable form. The aim of this paper is to investigate the explainable AI method Layer-wise Relevance Propagation (LRP) and apply it to explain how a deep learning network distinguishes facial expressions of pain from facial expressions of emotions such as happiness and disgust.
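As a rough illustration of the relevance propagation idea, the following NumPy sketch implements the LRP epsilon-rule for fully connected layers on a toy two-layer network; the weights and layer sizes are hypothetical and are not those of the network studied in the paper.

```python
# Minimal sketch of the LRP epsilon-rule for dense layers: the output score is
# redistributed layer by layer back to the input features.
import numpy as np

def lrp_dense(a, w, b, relevance_out, eps=1e-6):
    """Propagate relevance from a dense layer's output to its input (eps-rule).
    a: input activations (n_in,), w: weights (n_in, n_out), b: bias (n_out,)."""
    z = a @ w + b                              # pre-activations of the layer
    z = z + eps * np.sign(z)                   # stabiliser
    s = relevance_out / z                      # relevance per unit of pre-activation
    return a * (w @ s)                         # relevance of each input neuron

# Toy two-layer network: 4 inputs -> 3 hidden (ReLU) -> 1 output score
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
w2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

x = rng.normal(size=4)
h = np.maximum(0.0, x @ w1 + b1)               # forward pass
y = h @ w2 + b2

r_hidden = lrp_dense(h, w2, b2, relevance_out=y)       # start from the output score
r_input = lrp_dense(x, w1, b1, relevance_out=r_hidden)
print("input relevances:", r_input)            # which inputs drove the score
```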
Towards self-explaining social robots. Verbal explanation strategies for a needs-based architecture
(2019)
In order to establish long-term relationships with users, social companion robots and their behaviors need to be comprehensible. Purely reactive behavior such as answering questions or following commands can be readily interpreted by users. However, the robot's proactive behaviors, included in order to increase liveliness and improve the user experience, often raise a need for explanation. In this paper, we provide a concept to produce accessible “why-explanations” for the goal-directed behavior an autonomous, lively robot might produce. To this end we present an architecture that provides reasons for behaviors in terms of comprehensible needs and strategies of the robot, and we propose a model for generating different kinds of explanations.
Towards an Interaction-Centered and Dynamically Constructed Episodic Memory for Social Robots
(2020)
It is only a matter of time until autonomous vehicles become ubiquitous; however, human driving supervision will remain a necessity for decades. To assess the driver's ability to take control of the vehicle in critical scenarios, driver distraction can be monitored using wearable sensors or sensors embedded in the vehicle, such as video cameras. Which types of driving distraction can be sensed with which sensors is an open research question that this study attempts to answer. This study compared data from physiological sensors (palm electrodermal activity (pEDA), heart rate and breathing rate) and visual sensors (eye tracking, pupil diameter, nasal EDA (nEDA), emotional activation and facial action units (AUs)) for the detection of four types of distraction. The dataset was collected in a previous driving-simulation study. The statistical tests showed that the most informative feature/modality for detecting driver distraction depends on the type of distraction, with emotional activation and AUs being the most promising. The experimental comparison of seven classical machine learning (ML) and seven end-to-end deep learning (DL) methods, evaluated on a separate test set of 10 subjects, showed that when classifying windows as distracted or not distracted, the highest F1-score of 79% was achieved by the extreme gradient boosting (XGB) classifier using 60-second windows of AUs as input. When classifying complete driving sessions, XGB's F1-score was 94%. The best-performing DL model was a spectro-temporal ResNet, which achieved an F1-score of 75% when classifying segments and 87% when classifying complete driving sessions. Finally, this study identified and discussed problems, such as label jitter, scenario overfitting and unsatisfactory generalization performance, that may adversely affect related ML approaches.
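The window-based classification step can be illustrated with a short sketch: AU intensity time-series are summarised over 60-second windows and fed to an XGBoost classifier. The feature choice (per-window mean and standard deviation), the frame rate, and the data arrays are assumptions standing in for the actual study pipeline.

```python
# Illustrative sketch: windowed AU features and a gradient boosting classifier.
import numpy as np
from xgboost import XGBClassifier

FPS = 30                      # assumed camera frame rate
WINDOW = 60 * FPS             # 60-second windows

def window_features(au_series: np.ndarray) -> np.ndarray:
    """au_series: (n_frames, n_AUs) AU intensities for one driving session.
    Returns per-window mean and standard deviation of each AU."""
    n = au_series.shape[0] // WINDOW
    feats = []
    for i in range(n):
        w = au_series[i * WINDOW:(i + 1) * WINDOW]
        feats.append(np.concatenate([w.mean(axis=0), w.std(axis=0)]))
    return np.stack(feats)

# Hypothetical data: 20 minutes of 17 AU intensities and per-window labels
rng = np.random.default_rng(0)
aus = rng.random((20 * 60 * FPS, 17))
X = window_features(aus)
y = rng.integers(0, 2, size=len(X))        # 1 = distracted, 0 = not distracted

clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
clf.fit(X, y)
print(clf.predict(X[:5]))
```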
This dissertation presents a probabilistic state estimation framework for integrating data-driven machine learning models and a deformable facial shape model in order to estimate continuous-valued intensities of 22 different facial muscle movements, known as Action Units (AU), defined in the Facial Action Coding System (FACS). A practical approach is proposed and validated for integrating class-wise probability scores from machine learning models within a Gaussian state estimation framework. Furthermore, driven mass-spring-damper models are applied for modelling the dynamics of facial muscle movements. Both facial shape and appearance information are used for estimating AU intensities, making it a hybrid approach. Several features are designed and explored to help the probabilistic framework deal with the multiple challenges involved in automatic AU detection. The proposed AU intensity estimation method and its features are evaluated quantitatively and qualitatively on three different datasets containing either spontaneous or acted facial expressions with AU annotations. The proposed method produced temporally smoother estimates that facilitate a fine-grained analysis of facial expressions. It also performed reasonably well, even though it simultaneously estimates the intensities of 22 AUs, some of which are subtle in expression or closely resemble each other. The estimated AU intensities tended toward the lower range of values and were often accompanied by a small delay in onset, indicating that the proposed method is conservative. In order to further improve performance, state-of-the-art machine learning approaches for AU detection could be integrated within the proposed probabilistic AU intensity estimation framework.
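Two of the ingredients described above can be sketched in simplified form: a discretised driven mass-spring-damper process model for one AU, and a heuristic mapping from a classifier's probability score to a Gaussian pseudo-measurement. All parameter values and the mapping itself are illustrative assumptions, not the dissertation's actual formulation.

```python
# Hedged sketch of (1) a driven mass-spring-damper process model for one AU and
# (2) turning a class probability into a pseudo-measurement whose variance grows
# as the score becomes uncertain. Parameter values are assumptions.
import numpy as np

dt, m, k, c = 1 / 30, 1.0, 25.0, 8.0      # time step, mass, stiffness, damping (assumed)

# Continuous dynamics x_dot = A_c x + B_c u with state x = [intensity, velocity]
A_c = np.array([[0.0, 1.0], [-k / m, -c / m]])
B_c = np.array([[0.0], [1.0 / m]])
# Simple forward-Euler discretisation (sufficient for illustration)
F = np.eye(2) + dt * A_c
B = dt * B_c

def pseudo_measurement(p: float, v_min=0.01, v_max=0.5):
    """Map a class probability p in [0, 1] to an intensity measurement and a
    variance that grows as p approaches 0.5 (most uncertain)."""
    z = p                                  # treat the score as the measured intensity
    var = v_min + (v_max - v_min) * (1.0 - abs(2 * p - 1.0))
    return z, var

x = np.array([0.0, 0.0])
for p in [0.1, 0.6, 0.9, 0.95]:            # hypothetical classifier scores per frame
    u = np.array([p])                      # drive the spring toward the score
    x = F @ x + B @ u
    z, var = pseudo_measurement(p)
    print(f"predicted intensity {x[0]:.2f}, measurement {z:.2f} (var {var:.2f})")
```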
This document presents a summary of the author's dissertation. The dissertation [Ha20] proposed and validated a novel, hybrid approach for estimating the intensity of facial muscle movements (Action Units (AU)). The approach is based on Gaussian state estimation and combines a deformable, AU-based facial shape model, a viscoelastic model of facial muscle movement, several appearance-based AU classifiers, and a facial landmark detection method. Several extensions were proposed and integrated into the state estimation framework in order to cope with person-specific characteristics as well as technical and practical challenges. The AU intensity estimates produced with the proposed method were used for automatic pain detection and for the analysis of driver distraction.
Time for an Explanation: A Mini-Review of Explainable Physio-Behavioural Time-Series Classification
(2024)
Time-series classification is growing in importance as device proliferation has led to the collection of an abundance of sensor data. Although black-box models, whose internal workings are difficult to understand, are a common choice for this task, their use in safety-critical domains has raised calls for greater transparency. In response, researchers have begun employing explainable artificial intelligence together with physio-behavioural signals in the context of real-world problems. Hence, this paper examines the current literature in this area and contributes principles for future research to overcome the limitations of the reviewed works.
The workshop XAI for U aims to address the critical need for transparency in Artificial Intelligence (AI) systems that integrate into our daily lives through mobile systems, wearables, and smart environments. Despite advances in AI, many of these systems remain opaque, making it difficult for users, developers, and stakeholders to verify their reliability and correctness. This workshop addresses the pressing need for enabling Explainable AI (XAI) tools within Ubiquitous and Wearable Computing and highlights the unique challenges that come with it, such as XAI that deals with time-series and multimodal data, XAI that explains interconnected machine learning (ML) components, and XAI that provides user-centered explanations. The workshop aims to foster collaboration among researchers in related domains, share recent advancements, address open challenges, and propose future research directions to improve the applicability and development of XAI in Ubiquitous, Pervasive, and Wearable Computing, and thereby seeks to enhance user trust, understanding, interaction, and adoption, ensuring that AI-driven solutions are not only more explainable but also more aligned with ethical standards and user expectations.