006 Spezielle Computerverfahren (Special computer methods)
Departments, institutes and facilities
- Fachbereich Informatik (36)
- Institute of Visual Computing (IVC) (17)
- Institut für Technik, Ressourcenschonung und Energieeffizienz (TREE) (8)
- Institut für Verbraucherinformatik (IVI) (8)
- Fachbereich Wirtschaftswissenschaften (7)
- Institut für Sicherheitsforschung (ISF) (5)
- Fachbereich Ingenieurwissenschaften und Kommunikation (3)
- Graduierteninstitut (3)
- Institut für KI und Autonome Systeme (A2S) (2)
- Fachbereich Angewandte Naturwissenschaften (1)
Document Type
- Conference Object (35)
- Article (18)
- Part of a Book (6)
- Doctoral Thesis (4)
- Preprint (4)
- Research Data (2)
- Report (2)
- Book (monograph, edited volume) (1)
- Patent (1)
Year of publication
Language
- English (73)
Has Fulltext
- no (73)
Keywords
- Augmented Reality (3)
- Machine Learning (3)
- Machine learning (3)
- deep learning (3)
- facial expression analysis (3)
- Automatic pain detection (2)
- Dementia (2)
- Explainable artificial intelligence (2)
- Robotics (2)
- Virtual Reality (2)
Vection underwater
(2022)
Using Visual and Auditory Cues to Locate Out-of-View Objects in Head-Mounted Augmented Reality
(2021)
Towards self-explaining social robots. Verbal explanation strategies for a needs-based architecture
(2019)
In order to establish long-term relationships with users, social companion robots and their behaviors need to be comprehensible. Purely reactive behavior, such as answering questions or following commands, can be readily interpreted by users. However, the robot's proactive behaviors, included to increase liveliness and improve the user experience, often raise a need for explanation. In this paper, we present a concept for producing accessible “why-explanations” of the goal-directed behavior an autonomous, lively robot might show. To this end, we present an architecture that provides reasons for behaviors in terms of comprehensible needs and strategies of the robot, and we propose a model for generating different kinds of explanations.
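The needs-and-strategies idea can be illustrated with a small sketch; the need and strategy vocabulary and the phrasing below are illustrative assumptions, not the paper's actual architecture:

```python
# Hypothetical sketch of a needs-based "why-explanation"; the fields and the
# wording are illustrative, not taken from the paper.
from dataclasses import dataclass

@dataclass
class BehaviorRecord:
    behavior: str   # observable action the robot performed
    need: str       # internal need that triggered the intention
    strategy: str   # strategy chosen to satisfy that need

def why_explanation(record: BehaviorRecord) -> str:
    """Verbalize the decision route behind a proactive behavior."""
    return (f"I {record.behavior} because my need for {record.need} was high, "
            f"and {record.strategy} is my current strategy to satisfy it.")

if __name__ == "__main__":
    rec = BehaviorRecord(behavior="suggested playing a quiz",
                         need="social interaction",
                         strategy="proposing a shared activity")
    print(why_explanation(rec))
```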
This dissertation presents a probabilistic state estimation framework for integrating data-driven machine learning models and a deformable facial shape model in order to estimate continuous-valued intensities of 22 different facial muscle movements, known as Action Units (AU), defined in the Facial Action Coding System (FACS). A practical approach is proposed and validated for integrating class-wise probability scores from machine learning models within a Gaussian state estimation framework. Furthermore, driven mass-spring-damper models are applied for modelling the dynamics of facial muscle movements. Both facial shape and appearance information are used for estimating AU intensities, making it a hybrid approach. Several features are designed and explored to help the probabilistic framework to deal with multiple challenges involved in automatic AU detection. The proposed AU intensity estimation method and its features are evaluated quantitatively and qualitatively using three different datasets containing either spontaneous or acted facial expressions with AU annotations. The proposed method produced temporally smoother estimates that facilitate a fine-grained analysis of facial expressions. It also performed reasonably well, even though it simultaneously estimates intensities of 22 AUs, some of which are subtle in expression or resemble each other closely. The estimated AU intensities tended to the lower range of values, and were often accompanied by a small delay in onset. This shows that the proposed method is conservative. In order to further improve performance, state-of-the-art machine learning approaches for AU detection could be integrated within the proposed probabilistic AU intensity estimation framework.
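The probabilistic fusion described above can be sketched for a single AU as a Kalman filter whose process model is a driven mass-spring-damper; the constants, noise settings, and the mapping of classifier scores to intensities are assumptions for illustration only:

```python
# Minimal sketch (assumed parameters) of fusing a classifier's probability
# score with driven mass-spring-damper dynamics in a Kalman filter, for a
# single AU; the dissertation estimates 22 AUs jointly.
import numpy as np

dt, m, k, c = 1 / 30, 1.0, 20.0, 6.0             # frame step and damper constants (illustrative)
A = np.array([[1.0, dt],
              [-k / m * dt, 1.0 - c / m * dt]])  # discretized mass-spring-damper
B = np.array([[0.0], [dt / m]])                  # drive (muscle activation) input
H = np.array([[1.0, 0.0]])                       # only the intensity is observed
Q = np.diag([1e-4, 1e-2])                        # process noise
R = np.array([[0.05]])                           # measurement noise of the score

x = np.zeros((2, 1))                             # state: [intensity, velocity]
P = np.eye(2)

def kalman_step(x, P, u, z):
    # Predict with the driven mass-spring-damper model.
    x = A @ x + B * u
    P = A @ P @ A.T + Q
    # Update with the classifier-derived intensity measurement z.
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

for frame_score in [0.1, 0.4, 0.8, 0.9, 0.7]:    # per-frame probability scores
    z = 5.0 * frame_score                         # map score to a FACS-like 0-5 intensity
    x, P = kalman_step(x, P, u=0.0, z=z)
    print(x[0, 0])
```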
Towards explaining deep learning networks to distinguish facial expressions of pain and emotions
(2018)
Deep learning networks are successfully used for object and face recognition in images and videos. Yet for practical applications, for example as a pain recognition tool in hospitals, the current procedures are only suitable to a limited extent. The advantage of deep learning methods is that they can learn complex non-linear relationships between raw data and target classes without being limited to a set of hand-crafted features provided by humans. However, the disadvantage is that, due to the complexity of these networks, it is not possible to interpret the knowledge stored inside the network: it is a black-box learning procedure. Explainable Artificial Intelligence (AI) approaches mitigate this problem by extracting explanations for decisions and representing them in a human-interpretable form. The aim of this paper is to investigate the explainable AI method Layer-wise Relevance Propagation (LRP) and apply it to explain how a deep learning network distinguishes facial expressions of pain from facial expressions of emotions such as happiness and disgust.
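As a rough illustration of the LRP idea the paper investigates, the epsilon rule can be applied to a tiny two-layer ReLU network; the weights, layer sizes, and the "pain" output neuron below are placeholders, not the paper's trained network:

```python
# Sketch of the LRP epsilon rule on a tiny two-layer ReLU network; weights
# and inputs are random, only the relevance redistribution scheme matches
# the method under investigation.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 6)), np.zeros(6)    # input -> hidden
W2, b2 = rng.normal(size=(6, 2)), np.zeros(2)    # hidden -> output (e.g. pain vs. emotion)

def relu(x):
    return np.maximum(x, 0.0)

def lrp_epsilon(x, eps=1e-6):
    # Forward pass, keeping activations of every layer.
    a1 = relu(x @ W1 + b1)
    out = a1 @ W2 + b2
    # Start relevance propagation from the "pain" output neuron only.
    R_out = np.zeros_like(out)
    R_out[0] = out[0]
    # Redistribute relevance layer by layer with the epsilon rule.
    z2 = out + eps * np.sign(out)
    R_hidden = a1 * ((W2 * (R_out / z2)).sum(axis=1))
    z1 = x @ W1 + b1 + eps * np.sign(x @ W1 + b1)
    R_input = x * ((W1 * (R_hidden / z1)).sum(axis=1))
    return R_input                                # per-input-feature relevance

print(lrp_epsilon(rng.normal(size=4)))
```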
Towards an Interaction-Centered and Dynamically Constructed Episodic Memory for Social Robots
(2020)
The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs. This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler (INDRA) consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against two baseline models trained on either one of the modalities (i.e., text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.083. Additionally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. Finally, the source code and pre-trained STonKGs models are available at https://github.com/stonkgs/stonkgs and https://huggingface.co/stonkgs/stonkgs-150k.
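A hedged sketch of the multimodal input assembly the abstract describes (not the actual STonKGs code; see the linked repositories for that) could look like the following, with token names, separators, and the example triple chosen purely for illustration:

```python
# Illustrative assembly of a combined text + KG-triple input sequence; the
# special tokens and the example triple are assumptions.
def build_multimodal_input(text_tokens, triple, max_text_len=256):
    """triple is a (subject, relation, object) tuple from the knowledge graph."""
    text_part = ["[CLS]"] + list(text_tokens[:max_text_len]) + ["[SEP]"]
    kg_part = list(triple) + ["[SEP]"]
    tokens = text_part + kg_part
    # Segment ids let the Transformer distinguish the two modalities.
    segment_ids = [0] * len(text_part) + [1] * len(kg_part)
    return tokens, segment_ids

tokens, segments = build_multimodal_input(
    ["IL6", "induces", "STAT3", "phosphorylation"],
    ("IL6", "increases_phosphorylation", "STAT3"))
print(tokens)
print(segments)
```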
In recent years, the ability of intelligent systems to be understood by developers and users has received growing attention. This holds in particular for social robots, which are supposed to act autonomously in the vicinity of human users and are known to raise peculiar, often unrealistic attributions and expectations. However, explainable models that, on the one hand, allow a robot to generate lively and autonomous behavior and, on the other, enable it to provide human-compatible explanations for this behavior are missing. In order to develop such a self-explaining autonomous social robot, we have equipped a robot with its own needs that autonomously trigger intentions and proactive behavior, and form the basis for understandable self-explanations. Previous research has shown that undesirable robot behavior is rated more positively after receiving an explanation. We thus aim to equip a social robot with the capability to automatically generate verbal explanations of its own behavior, by tracing its internal decision-making routes. The goal is to generate social robot behavior in a way that is generally interpretable, and therefore explainable on a socio-behavioral level, increasing users' understanding of the robot's behavior. In this article, we present a social robot interaction architecture, designed to autonomously generate social behavior and self-explanations. We set out requirements for explainable behavior generation architectures and propose a socio-interactive framework for behavior explanations in social human-robot interactions that enables explaining and elaborating according to users' needs for explanation that emerge within an interaction. Consequently, we introduce an interactive explanation dialog flow concept that incorporates empirically validated explanation types. These concepts are realized within the interaction architecture of a social robot, and integrated with its dialog processing modules. We present the components of this interaction architecture and explain their integration to autonomously generate social behaviors as well as verbal self-explanations. Lastly, we report results from a qualitative evaluation of a working prototype in a laboratory setting, showing that (1) the robot is able to autonomously generate naturalistic social behavior, and (2) the robot is able to verbally self-explain its behavior to the user in line with users' requests.
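The interactive explanation dialog flow could be sketched roughly as a mapping from a user's follow-up request to an explanation type; the trigger words and explanation templates below are assumptions, not the empirically validated types from the article:

```python
# Illustrative dialog-flow sketch: explanation types and trigger phrases are
# assumptions for demonstration purposes only.
EXPLANATIONS = {
    "why":  lambda ctx: f"I {ctx['behavior']} because my need for {ctx['need']} was high.",
    "what": lambda ctx: f"I am currently trying to {ctx['goal']}.",
    "how":  lambda ctx: f"I chose this behavior through my strategy '{ctx['strategy']}'.",
}

def explain_on_request(user_utterance: str, ctx: dict) -> str:
    """Pick an explanation type based on the user's follow-up request."""
    for trigger, template in EXPLANATIONS.items():
        if trigger in user_utterance.lower():
            return template(ctx)
    return "Would you like to know why, what, or how I did that?"

ctx = {"behavior": "moved closer to you", "need": "social interaction",
       "goal": "start a conversation", "strategy": "approach the user"}
print(explain_on_request("Why did you do that?", ctx))
```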
Selection Performance and Reliability of Eye and Head Gaze Tracking Under Varying Light Conditions
(2024)
In this paper, we introduce an optical sensor system that is integrated into an industrial push-button. When the button is pressed, the sensor classifies the material in contact with it into different material categories on the basis of the material's so-called "spectral signature". An approach for a safety sensor system at circular table saws based on the same principle was previously presented at SIAS 2007. This contactless sensor is able to distinguish reliably between skin, textiles, leather and various other kinds of materials. A typical application for this intelligent push-button is its use at potentially dangerous machines whose operating instructions either prohibit or require the wearing of gloves during work at the machine. An example of machines at which no gloves are allowed are pillar drilling machines, because of the risk of getting caught in the drill chuck and being drawn in by the machine; in many cases this causes very serious hand injuries. Depending on the application, the sensor system integrated into the push-button can be flexibly configured in software to prevent the operator from accidentally starting a machine with or without gloves, which can significantly decrease the risk of severe accidents. Two-hand controls in particular invite manipulation for easier handling. By equipping both push-buttons of a two-hand control with material classification capabilities, the user is forced to operate the controls with his bare fingers; this prevents the manipulation of a two-hand control with a simple rod-like device.
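As a hedged illustration of classification by spectral signature (not the actual sensor firmware), a measured signature could be matched to the nearest reference signature; all wavelengths, reference values, and the rejection threshold are invented for the example:

```python
# Nearest-reference classification of a spectral signature; reference values
# and the threshold are made up for illustration.
import numpy as np

# Normalized reference reflectance at a few wavelengths (illustrative values).
REFERENCES = {
    "skin":    np.array([0.35, 0.45, 0.60, 0.70]),
    "textile": np.array([0.55, 0.50, 0.48, 0.47]),
    "leather": np.array([0.20, 0.25, 0.30, 0.33]),
}

def classify_material(signature, reject_threshold=0.15):
    signature = np.asarray(signature, dtype=float)
    signature = signature / np.linalg.norm(signature)
    best, best_dist = "other", np.inf
    for name, ref in REFERENCES.items():
        dist = np.linalg.norm(signature - ref / np.linalg.norm(ref))
        if dist < best_dist:
            best, best_dist = name, dist
    return best if best_dist < reject_threshold else "other"

# E.g. only enable the machine start when bare skin is detected on the button.
print(classify_material([0.34, 0.46, 0.61, 0.69]))
```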
BACKGROUND
Given the unreliable self-report in patients with dementia, pain assessment should also rely on the observation of pain behaviors, such as facial expressions. Ideal observers would need to be well trained and to observe the patient continuously in order to pick up any pain-indicative behavior, requirements that go beyond the realistic possibilities of pain care. Therefore, the need for video-based pain detection systems has been repeatedly voiced. Such systems would allow for constant monitoring of pain behaviors and thereby for a timely adjustment of pain management in these fragile patients, who are often undertreated for pain.
METHODS
In this road map paper we describe an interdisciplinary approach to develop such a video-based pain detection system. The development starts with the selection of appropriate video material of people in pain as well as the development of technical methods to capture their faces. Furthermore, single facial motions are automatically extracted according to an international coding system. Computer algorithms are trained to detect the combination and timing of those motions, which are pain-indicative.
RESULTS/CONCLUSION
We hope to encourage colleagues to join forces and to inform end-users about an imminent solution of a pressing pain-care problem. For the near future, implementation of such systems can be foreseen to monitor immobile patients in intensive and postoperative care situations.
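The final step of the road map, training algorithms on the combination and timing of pain-indicative facial motions, might be sketched with synthetic data as follows; the features, labels, and classifier choice are illustrative assumptions:

```python
# Minimal sketch with synthetic data: a classifier over per-window AU
# features (which AUs co-occur and for how long). Feature layout and labels
# are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n_windows, n_aus = 200, 10

# Features per time window: mean intensity and active duration of each AU.
X = rng.random((n_windows, 2 * n_aus))
# Synthetic labels: windows where two stand-in AU features co-occur count as pain.
y = ((X[:, 3] > 0.5) & (X[:, 6] > 0.5)).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```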
Computer graphics research strives to synthesize images of high visual realism that are indistinguishable from real visual experiences. While modern image synthesis approaches make it possible to create digital images of astonishing complexity and beauty, processing resources remain a limiting factor. Here, rendering efficiency is a central challenge involving a trade-off between visual fidelity and interactivity. For that reason, there is still a fundamental difference between the perception of the physical world and computer-generated imagery. At the same time, advances in display technologies drive the development of novel display devices. The dynamic range, the pixel densities, and refresh rates are constantly increasing. Display systems address a larger visual field by covering a wider field of view, either because of their size or in the form of head-mounted devices. Current research prototypes range from stereo and multi-view systems and head-mounted devices with adaptable lenses to retinal projection and lightfield/holographic displays. Computer graphics has to keep pace, as driving these devices presents immense challenges, most of which are currently unsolved. Fortunately, the human visual system has certain limitations, which means that providing the highest possible visual quality is not always necessary. Visual input passes through the eye's optics, is filtered, and is processed at higher level structures in the brain. Knowledge of these processes helps to design novel rendering approaches that allow the creation of images at a higher quality and within a reduced time-frame. This thesis presents state-of-the-art research and models that exploit the limitations of perception in order to increase visual quality and to reduce workload alike, a concept we call perception-driven rendering. This research results in several practical rendering approaches that allow some of the fundamental challenges of computer graphics to be tackled. By using different tracking hardware, display systems, and head-mounted devices, we show the potential of each of the presented systems. The capturing of specific processes of the human visual system can be improved by combining multiple measurements using machine learning techniques. Different sampling, filtering, and reconstruction techniques aid the visual quality of the synthesized images. An in-depth evaluation of the presented systems, including benchmarks, comparative examination with image metrics, and user studies and experiments, demonstrated that the methods introduced are visually superior or on the same qualitative level as ground truth, whilst having a significantly reduced computational complexity.
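One perception-driven idea touched on above, allocating rendering effort according to gaze, can be sketched as an eccentricity-dependent sample budget; the falloff model and its constants are simplified assumptions rather than the thesis's actual methods:

```python
# Sketch of gaze-contingent sample allocation: more samples near the tracked
# gaze point, fewer in the periphery. The exponential falloff and constants
# are assumptions for illustration.
import numpy as np

def samples_per_pixel(px, py, gaze, fovea_radius=50.0, min_spp=0.25, max_spp=4.0):
    """Allocate more samples near the gaze point, fewer in the periphery."""
    eccentricity = np.hypot(px - gaze[0], py - gaze[1])
    falloff = np.exp(-eccentricity / (2.0 * fovea_radius))
    return min_spp + (max_spp - min_spp) * falloff

# Sample budget over a small image with the gaze at its centre.
h, w, gaze = 120, 160, (80.0, 60.0)
ys, xs = np.mgrid[0:h, 0:w]
budget = samples_per_pixel(xs, ys, gaze)
print("total samples:", budget.sum(), "vs. uniform 4 spp:", 4.0 * h * w)
```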