Loading of shipping containers for dairy products often includes a press-fit task, which involves manually stacking milk cartons in a container without using pallets or packaging. Automating this task with a mobile manipulator can reduce worker strain and enhance the efficiency and safety of the container loading process. This paper proposes an approach called Adaptive Compliant Control with Integrated Failure Recovery (ACCIFR), which enables a mobile manipulator to reliably perform the press-fit task. We base the approach on a demonstration-based compliant control framework and integrate a monitoring and failure recovery mechanism into it for successful task execution. Concretely, we monitor the execution through distance and force feedback, detect collisions while the robot is performing the press-fit task, and use wrench measurements to classify the direction of collision; this information informs the subsequent recovery process. We evaluate the method on a miniature container setup, considering variations in the (i) starting position of the end effector, (ii) goal configuration, and (iii) object grasping position. The results demonstrate that the proposed approach outperforms the baseline demonstration-based learning framework regarding adaptability to environmental variations and the ability to recover from collision failures, making it a promising solution for practical press-fit applications.
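The recovery step hinges on classifying the collision direction from wrench measurements. A minimal sketch of that idea follows; the axis conventions and force threshold are assumptions for illustration, not values from the paper.

```python
import numpy as np

FORCE_THRESHOLD = 5.0  # N; collision detection threshold (assumed value)

def classify_collision_direction(wrench):
    """Classify the contact direction from a wrench [Fx, Fy, Fz, Tx, Ty, Tz]
    expressed in the end-effector frame (frame convention assumed)."""
    fx, fy = wrench[0], wrench[1]
    if np.hypot(fx, fy) < FORCE_THRESHOLD:
        return None  # lateral forces too small: no collision detected
    # The dominant lateral force component indicates the contact side.
    if abs(fx) >= abs(fy):
        return "front" if fx < 0 else "back"
    return "left" if fy < 0 else "right"

print(classify_collision_direction([-8.2, 1.3, 0.4, 0.0, 0.1, 0.0]))  # 'front'
```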
TSEM: Temporally-Weighted Spatiotemporal Explainable Neural Network for Multivariate Time Series
(2023)
Saliency methods are frequently used to explain Deep Neural Network-based models. Adebayo et al.'s work on evaluating saliency methods for classification models illustrates that certain explanation methods fail the model and data randomization tests. However, by extending the tests to various state-of-the-art object detectors, we illustrate that the ability to explain a model depends more on the model itself than on the explanation method. We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations, both for object classification and bounding box decisions, using Guided Backpropagation, Integrated Gradients, and their SmoothGrad versions, together with Faster R-CNN, SSD, and EfficientDet-D0, trained on COCO. In addition, the sensitivity of the explanation method to model parameters and data labels varies class-wise, motivating sanity checks for each class. We find that EfficientDet-D0 is the most interpretable model independent of the saliency method, passing the sanity checks with few problems.
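For readers unfamiliar with the randomization tests, the sketch below illustrates the core procedure on a plain classifier (the paper's detector setting is more involved): compute attributions, randomize weights, and check that the attributions change. The calls are standard captum/torchvision APIs; randomizing only the final layer is a simplification of the cascading variant.

```python
import torch
from torchvision.models import resnet18
from captum.attr import IntegratedGradients
from scipy.stats import spearmanr

# Model randomization sanity check (after Adebayo et al.): saliency maps
# should change when model weights are randomized.
model = resnet18(weights="IMAGENET1K_V1").eval()
x = torch.rand(1, 3, 224, 224)
target = 0  # arbitrary class index

ig = IntegratedGradients(model)
attr_trained = ig.attribute(x, target=target).detach().flatten()

# Randomize the final layer (a cascading test would continue layer by layer).
torch.nn.init.normal_(model.fc.weight)
attr_random = ig.attribute(x, target=target).detach().flatten()

rho, _ = spearmanr(attr_trained.abs().numpy(), attr_random.abs().numpy())
print(f"rank correlation after randomization: {rho:.3f}")  # low = test passed
```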
LiDAR-based Indoor Localization with Optimal Particle Filters using Surface Normal Constraints
(2023)
In the design of robot skills, the focus generally lies on increasing the flexibility and reliability of the robot execution process; however, typical skill representations are not designed for analysing execution failures if they occur or for explicitly learning from failures. In this paper, we describe a learning-based hybrid representation for skill parameterisation called an execution model, which considers execution failures to be a natural part of the execution process. We then (i) demonstrate how execution contexts can be included in execution models, (ii) introduce a technique for generalising models between object categories by combining generalisation attempts performed by a robot with knowledge about object similarities represented in an ontology, and (iii) describe a procedure that uses an execution model for identifying a likely hypothesis of a parameterisation failure. The feasibility of the proposed methods is evaluated in multiple experiments performed with a physical robot in the context of handle grasping, object grasping, and object pulling. The experimental results suggest that execution models contribute towards avoiding execution failures, but also represent a first step towards more introspective robots that are able to analyse some of their execution failures in an explicit manner.
MOTIVATION
The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited.
RESULTS
To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs). This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations in a shared embedding space. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler (INDRA) consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against three baseline models trained on either one of the modalities (i.e., text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms the baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.084 (i.e., from 0.881 to 0.965). Finally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications.
AVAILABILITY
We make the source code and the Python package of STonKGs available at GitHub (https://github.com/stonkgs/stonkgs) and PyPI (https://pypi.org/project/stonkgs/). The pre-trained STonKGs models and the task-specific classification models are respectively available at https://huggingface.co/stonkgs/stonkgs-150k and https://zenodo.org/communities/stonkgs.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
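A hedged loading sketch: the abstract names the checkpoint stonkgs/stonkgs-150k on Hugging Face. Whether it loads through the generic transformers auto-classes, rather than the dedicated classes in the stonkgs Python package, is an assumption here; see the GitHub repository for the supported interface.

```python
from transformers import AutoTokenizer, AutoModel

# Assumption: the published checkpoint is loadable via the standard
# transformers API. The stonkgs package itself provides dedicated classes.
tokenizer = AutoTokenizer.from_pretrained("stonkgs/stonkgs-150k")
model = AutoModel.from_pretrained("stonkgs/stonkgs-150k")
print(model.config.model_type)
```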
Robots applied in therapeutic scenarios, for instance in the therapy of individuals with Autism Spectrum Disorder, are sometimes used for imitation learning activities in which a person needs to repeat motions performed by the robot. To simplify the task of incorporating new types of motions that a robot can perform, it is desirable that the robot has the ability to learn motions by observing demonstrations from a human, such as a therapist. In this paper, we investigate an approach for acquiring motions from skeleton observations of a human, which are collected by a robot-centric RGB-D camera. Given a sequence of observations of various joints, the joint positions are mapped to match the configuration of a robot before being executed by a PID position controller. We evaluate the method, in particular the reproduction error, by performing a study with QTrobot in which the robot acquired different upper-body dance moves from multiple participants. The results indicate the method's overall feasibility, but also show that the reproduction quality is affected by noise in the skeleton observations.
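A minimal sketch of the final control stage: a textbook PID position controller driving a joint towards a skeleton-derived target. The gains and the toy first-order plant are illustrative assumptions, not the robot's actual dynamics.

```python
class PIDController:
    """Minimal PID position controller of the kind used to drive robot
    joints to skeleton-derived target positions (gains are assumed)."""
    def __init__(self, kp=1.0, ki=0.0, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, current, dt):
        error = target - current
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a single joint towards a mapped skeleton joint angle (illustrative).
pid, angle, dt = PIDController(), 0.0, 0.02
for _ in range(100):
    angle += pid.step(1.2, angle, dt) * dt  # simplistic first-order plant
print(round(angle, 3))  # approaches the 1.2 rad target
```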
In robot-assisted therapy for individuals with Autism Spectrum Disorder, the workload of therapists during a therapeutic session is increased if they have to control the robot manually. To allow therapists to focus on the interaction with the person instead, the robot should be more autonomous, namely it should be able to interpret the person's state and continuously adapt its actions according to their behaviour. In this paper, we develop a personalised robot behaviour model that can be used in the robot decision-making process during an activity; this behaviour model is trained with the help of a user model that has been learned from real interaction data. We use Q-learning for this task; the results demonstrate that the policy requires about 10,000 iterations to converge. We thus investigate policy transfer for improving the convergence speed; we show that this is a feasible solution, but an inappropriate initial policy can lead to a suboptimal final return.
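For reference, the tabular Q-learning update used for such a behaviour model looks as follows; the states, actions, and stand-in reward below are placeholders rather than the paper's user-model-based setup.

```python
import numpy as np

n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def env_step(s, a):
    # Stand-in for the learned user model's response and reward.
    return int(rng.integers(n_states)), float(rng.normal(loc=float(a == s % n_actions)))

s = 0
for _ in range(10_000):  # the paper reports convergence around this scale
    a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
    s_next, r = env_step(s, a)
    # Standard Q-learning update rule.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```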
TSEM: Temporally Weighted Spatiotemporal Explainable Neural Network for Multivariate Time Series
(2022)
Deep learning has become a one-size-fits-all solution for technical and business domains thanks to its flexibility and adaptability. However, it is implemented using opaque models, which undermines the trustworthiness of their outcomes. In order to better understand the behavior of a system, particularly one driven by time series, a look inside a deep learning model using so-called post-hoc eXplainable Artificial Intelligence (XAI) approaches is important. There are two major types of XAI for time series data, namely model-agnostic and model-specific; a model-specific approach is considered in this work. While other approaches employ either Class Activation Mapping (CAM) or the attention mechanism, we merge the two strategies into a single system, simply called the Temporally Weighted Spatiotemporal Explainable Neural Network for Multivariate Time Series (TSEM). TSEM combines the capabilities of RNN and CNN models in such a way that RNN hidden units are employed as attention weights for the temporal axis of the CNN feature maps. The results show that TSEM outperforms XCM and is similar to STAM in terms of accuracy, while also satisfying a number of interpretability criteria, including causality, fidelity, and spatiotemporality.
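A compact sketch of the core mechanism, RNN activations acting as temporal attention over CNN feature maps, is given below; layer sizes are placeholders, not the published TSEM architecture.

```python
import torch
import torch.nn as nn

class TemporalAttentionSketch(nn.Module):
    """Illustrative sketch of TSEM's core idea: RNN hidden activations act
    as attention weights along the temporal axis of CNN feature maps."""
    def __init__(self, n_vars, n_filters=32, n_classes=3):
        super().__init__()
        self.conv = nn.Conv1d(n_vars, n_filters, kernel_size=5, padding=2)
        self.rnn = nn.GRU(n_vars, 1, batch_first=True)
        self.head = nn.Linear(n_filters, n_classes)

    def forward(self, x):                            # x: (batch, n_vars, time)
        feats = torch.relu(self.conv(x))             # (batch, filters, time)
        h, _ = self.rnn(x.transpose(1, 2))           # (batch, time, 1)
        attn = torch.softmax(h.squeeze(-1), dim=1)   # temporal weights
        weighted = feats * attn.unsqueeze(1)         # weight each time step
        return self.head(weighted.sum(dim=2)), attn  # attn doubles as explanation

model = TemporalAttentionSketch(n_vars=6)
logits, attn = model(torch.randn(8, 6, 100))
```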
Applications of underwater robots are on the rise; most of them depend on sonar for underwater vision, but the lack of strong perception capabilities limits them in this task. An important issue in sonar perception is matching image patches, which can enable other techniques like localization, change detection, and mapping. There is a rich literature for this problem in color images, but it is lacking for acoustic images, due to the physics that produce these images. In this paper, we improve on our previous results for this problem (Valdenegro-Toro et al., 2017): instead of modeling features manually, a Convolutional Neural Network (CNN) learns a similarity function and predicts if two input sonar images are similar or not. With the objective of further improving the sonar image matching problem, state-of-the-art CNN architectures are evaluated on the Marine Debris dataset, namely DenseNet and VGG, with a siamese or two-channel architecture, and contrastive loss. To ensure a fair evaluation of each network, thorough hyper-parameter optimization is executed. We find that the best performing models are the DenseNet two-channel network with 0.955 AUC, VGG-Siamese with contrastive loss at 0.949 AUC, and DenseNet-Siamese with 0.921 AUC. By ensembling the top-performing DenseNet two-channel and DenseNet-Siamese models, the overall highest prediction accuracy obtained is 0.978 AUC, showing a large improvement over the 0.91 AUC in the state of the art.
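The contrastive loss mentioned above takes the standard form: similar pairs are pulled together and dissimilar pairs are pushed apart beyond a margin. A sketch (the margin value is an assumption):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb1, emb2, label, margin=1.0):
    """Standard contrastive loss for siamese networks: similar pairs
    (label=1) are pulled together, dissimilar pairs (label=0) are pushed
    beyond the margin."""
    dist = F.pairwise_distance(emb1, emb2)
    return (label * dist.pow(2)
            + (1 - label) * F.relu(margin - dist).pow(2)).mean()

e1, e2 = torch.randn(16, 128), torch.randn(16, 128)  # embedding pairs
labels = torch.randint(0, 2, (16,)).float()
print(contrastive_loss(e1, e2, labels))
```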
The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs. This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler (INDRA) consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against two baseline models trained on either one of the modalities (i.e., text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.083. Additionally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. Finally, the source code and pre-trained STonKGs models are available at https://github.com/stonkgs/stonkgs and https://huggingface.co/stonkgs/stonkgs-150k.
Representation and Experience-Based Learning of Explainable Models for Robot Action Execution
(2021)
For robots acting in human-centered environments, the ability to improve based on experience is essential for reliable and adaptive operation; however, particularly in the context of robot failure analysis, experience-based improvement is only useful if robots are also able to reason about and explain the decisions they make during execution. In this paper, we describe and analyse a representation of execution-specific knowledge that combines (i) a relational model in the form of qualitative attributes that describe the conditions under which actions can be executed successfully and (ii) a continuous model in the form of a Gaussian process that can be used for generating parameters for action execution, but also for evaluating the expected execution success given a particular action parameterisation. The proposed representation is based on prior, modelled knowledge about actions and is combined with a learning process that is supervised by a teacher. We analyse the benefits of this representation in the context of two actions – grasping handles and pulling an object on a table; the experiments demonstrate that the joint relational-continuous model allows a robot to improve its execution based on experience, while reducing the severity of failures experienced during execution.
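The continuous part of such a representation can be illustrated with a few lines of scikit-learn: a Gaussian process maps action parameters to expected execution success, so candidate parameterisations can be scored before execution. The data below are synthetic.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 2))      # e.g. grasp pose offsets (assumed)
y = np.exp(-np.linalg.norm(X, axis=1))    # synthetic success signal

# GP success model: predicts expected success for unseen parameterisations.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(X, y)

candidates = rng.uniform(-1, 1, size=(100, 2))
mean, std = gp.predict(candidates, return_std=True)  # std = model uncertainty
best = candidates[np.argmax(mean)]                   # most promising parameters
print(best, mean.max())
```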
When an autonomous robot learns how to execute actions, it is of interest to know if and when the execution policy can be generalised to variations of the learning scenarios. This can inform the robot about the necessity of additional learning, as using incomplete or unsuitable policies can lead to execution failures. Generalisation is particularly relevant when a robot has to deal with a large variety of objects and in different contexts. In this paper, we propose and analyse a strategy for generalising parameterised execution models of manipulation actions over different objects based on an object ontology. In particular, a robot transfers a known execution model to objects of related classes according to the ontology, but only if there is no other evidence that the model may be unsuitable. This allows using ontological knowledge as prior information that is then refined by the robot’s own experiences. We verify our algorithm for two actions – grasping and stowing everyday objects – showing that the robot can deduce cases in which an existing policy can generalise to other objects and when additional execution knowledge has to be acquired.
In the field of service robots, dealing with faults is crucial to promote user acceptance. In this context, this work focuses on some specific faults which arise from the interaction of a robot with its real-world environment due to insufficient knowledge for action execution. In our previous work [1], we have shown that such missing knowledge can be obtained through learning by experimentation. The combination of symbolic and geometric models allows us to represent action execution knowledge effectively. However, we did not propose a suitable representation of the symbolic model. In this work, we investigate such a symbolic representation and evaluate its learning capability. The experimental analysis is performed on four use cases using four different learning paradigms. As a result, the symbolic representation together with the most suitable learning paradigm are identified.
Robot Action Diagnosis and Experience Correction by Falsifying Parameterised Execution Models
(2021)
When faced with an execution failure, an intelligent robot should be able to identify the likely reasons for the failure and adapt its execution policy accordingly. This paper addresses the question of how to utilise knowledge about the execution process, expressed in terms of learned constraints, in order to direct the diagnosis and experience acquisition process. In particular, we present two methods for creating a synergy between failure diagnosis and execution model learning. We first propose a method for diagnosing execution failures of parameterised action execution models, which searches for action parameters that violate a learned precondition model. We then develop a strategy that uses the results of the diagnosis process for generating synthetic data that are more likely to lead to successful execution, thereby increasing the set of available experiences to learn from. The diagnosis and experience correction methods are evaluated for the problem of handle grasping; we experimentally demonstrate the effectiveness of the diagnosis algorithm and show that corrected failed experiences can contribute towards improving the execution success of a robot.
In Robot-Assisted Therapy for children with Autism Spectrum Disorder, the therapists’ workload is increased due to the necessity of controlling the robot manually. The solution for this problem is to increase the level of autonomy of the system, namely the robot should interpret and adapt to the behaviour of the child under therapy. The problem that we are addressing is to develop a behaviour model to be used in the robot's decision-making process, which learns how to react adequately to certain reactions of the child. We propose the use of reinforcement learning for this task, where feedback for learning is obtained from the therapist’s evaluation of the robot’s behaviour.
Execution monitoring is essential for robots to detect and respond to failures. Since it is impossible to enumerate all failures for a given task, we learn from successful executions of the task to detect visual anomalies during runtime. Our method learns to predict the motions that occur during the nominal execution of a task, including camera and robot body motion. A probabilistic U-Net architecture is used to learn to predict optical flow, and the robot's kinematics and 3D model are used to model camera and body motion. The errors between the observed and predicted motion are used to calculate an anomaly score. We evaluate our method on a dataset of a robot placing a book on a shelf, which includes anomalies such as falling books, camera occlusions, and robot disturbances. We find that modeling camera and body motion, in addition to the learning-based optical flow prediction, results in an improvement of the area under the receiver operating characteristic curve from 0.752 to 0.804, and the area under the precision-recall curve from 0.467 to 0.549.
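The scoring idea reduces to comparing observed against predicted motion; a minimal sketch with synthetic flow fields follows (the paper's probabilistic U-Net prediction is replaced by a placeholder).

```python
import numpy as np

def anomaly_score(observed_flow, predicted_flow):
    """Compare observed optical flow against the predicted nominal motion
    (learned flow plus modeled camera/body motion) and report the mean
    endpoint error as the anomaly score."""
    error = np.linalg.norm(observed_flow - predicted_flow, axis=-1)
    return float(error.mean())

# flows: (H, W, 2) arrays of per-pixel (dx, dy) motion
rng = np.random.default_rng(0)
obs = rng.normal(size=(64, 64, 2))
pred = obs + 0.1 * rng.normal(size=(64, 64, 2))   # near-nominal execution
print(anomaly_score(obs, pred))                   # low score = nominal
```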
The dataset contains the following data from successful and failed executions of the Toyota HSR robot placing a book on a shelf.
- RGB images from the robot's head camera
- Depth images from the robot's head camera
- Rendered images of the robot's 3D model from the point of view of the robot's head camera
- Force-torque readings from a wrist-mounted force-torque sensor
- Joint efforts, velocities, and positions
- Extrinsic and intrinsic camera calibration parameters
- Frame-level anomaly annotations
The anomalies that occur during execution include:
- The manipulated book falling down
- Books on the shelf being disturbed significantly
- Camera occlusions
- The robot being disturbed by an external collision
The dataset is split into a train, validation and test set with the following number of trials:
- Train: 48 successful trials
- Validation: 6 successful trials
- Test: 60 anomalous trials and 7 successful trials
Property-Based Testing in Simulation for Verifying Robot Action Execution in Tabletop Manipulation
(2021)
An important prerequisite for the reliability and robustness of a service robot is ensuring the robot’s correct behavior when it performs various tasks of interest. Extensive testing is one established approach for ensuring behavioural correctness; this becomes even more important with the integration of learning-based methods into robot software architectures, as there are often no theoretical guarantees about the performance of such methods in varying scenarios. In this paper, we aim towards evaluating the correctness of robot behaviors in tabletop manipulation through automatic generation of simulated test scenarios in which a robot assesses its performance using property-based testing. In particular, key properties of interest for various robot actions are encoded in an action ontology and are then verified and validated within a simulated environment. We evaluate our framework with a Toyota Human Support Robot (HSR) which is tested in a Gazebo simulation. We show that our framework can correctly and consistently identify various failed actions in a variety of randomised tabletop manipulation scenarios, in addition to providing deeper insights into the type and location of failures for each designed property.
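In Python, property-based testing of this kind is commonly done with the hypothesis library; the sketch below shows the pattern with a hypothetical stand-in for a simulated place action (the framework's actual ontology-derived properties and Gazebo interface are not reproduced).

```python
from hypothesis import given, settings, strategies as st

TABLE_HEIGHT = 0.72  # metres (assumed)

def simulate_place_action(x, y):
    """Hypothetical stand-in for a simulated rollout of a place action."""
    return {"z": TABLE_HEIGHT, "on_table": abs(x) < 0.5 and abs(y) < 0.35}

@settings(max_examples=100)
@given(x=st.floats(-0.4, 0.4), y=st.floats(-0.3, 0.3))
def test_placed_object_rests_on_table(x, y):
    # Property: for any sampled placement pose, the object ends up on the
    # table at the expected height.
    result = simulate_place_action(x, y)
    assert result["on_table"]
    assert abs(result["z"] - TABLE_HEIGHT) < 0.01

test_placed_object_rests_on_table()  # hypothesis runs the randomised trials
```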
Low-Cost In-Hand Slippage Detection and Avoidance for Robust Robotic Grasping with Compliant Fingers
(2021)
Object detectors have improved considerably in recent years by using advanced CNN architectures. However, many detector hyper-parameters are generally tuned manually or used with the values set by the detector authors. Automatic hyper-parameter optimization has not been explored for improving the hyper-parameters of CNN-based object detectors. In this work, we propose the use of black-box optimization methods to tune the prior/default box scales in Faster R-CNN and SSD, using Bayesian Optimization, SMAC, and CMA-ES. We show that by tuning the input image size and prior box anchor scale, the mAP of Faster R-CNN increases by 2% on PASCAL VOC 2007, and that of SSD by 3%. On the COCO dataset with SSD, there are mAP improvements for medium and large objects, but mAP decreases by 1% for small objects. We also perform a regression analysis to find the significant hyper-parameters to tune.
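The search itself can be expressed with an off-the-shelf black-box optimizer; below is a sketch using scikit-optimize's gp_minimize, with a hypothetical evaluate_detector standing in for the actual training and evaluation run.

```python
from skopt import gp_minimize
from skopt.space import Real, Integer

def evaluate_detector(anchor_scale, input_size):
    """Hypothetical stand-in: real code would train and evaluate SSD or
    Faster R-CNN with these hyper-parameters and return the mAP."""
    return 0.6 - (anchor_scale - 0.5) ** 2 - ((input_size - 512) / 1000) ** 2

def objective(params):
    anchor_scale, input_size = params
    return -evaluate_detector(anchor_scale, input_size)  # skopt minimizes

space = [Real(0.1, 1.0, name="anchor_scale"),
         Integer(300, 600, name="input_size")]

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print(result.x, -result.fun)  # best hyper-parameters and their (toy) mAP
```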
An essential measure of autonomy in assistive service robots is adaptivity to the various contexts of human-oriented tasks, which are subject to subtle variations in task parameters that determine optimal behaviour. In this work, we propose an apprenticeship learning approach to achieving context-aware action generalization on the task of robot-to-human object hand-over. The procedure combines learning from demonstration and reinforcement learning: a robot first imitates a demonstrator’s execution of the task and then learns contextualized variants of the demonstrated action through experience. We use dynamic movement primitives as compact motion representations, and a model-based C-REPS algorithm for learning policies that can specify hand-over position, conditioned on context variables. Policies are learned using simulated task executions, before transferring them to the robot and evaluating emergent behaviours. We additionally conduct a user study involving participants assuming different postures and receiving an object from a robot, which executes hand-overs by either imitating a demonstrated motion, or adapting its motion to hand-over positions suggested by the learned policy. The results confirm the hypothesized improvements in the robot’s perceived behaviour when it is context-aware and adaptive, and provide useful insights that can inform future developments.
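The policy structure used with C-REPS can be pictured as a Gaussian whose mean is linear in the context; the sketch below shows that structure with placeholder weights (C-REPS itself would iteratively reweight sampled executions to update them).

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 4))        # maps context (4-dim) to position (3-dim)
sigma = 0.05 * np.eye(3)           # exploration covariance (assumed)

def sample_handover_position(context):
    """Contextual Gaussian policy: hand-over position sampled around a
    context-dependent mean, as in the C-REPS policy structure."""
    mean = W @ context
    return rng.multivariate_normal(mean, sigma)

context = np.array([1.0, 0.2, -0.1, 0.9])   # hypothetical posture features
print(sample_handover_position(context))
```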
When a robotic agent experiences a failure while acting in the world, it should be possible to discover why that failure has occurred, namely to diagnose the failure. In this paper, we argue that the diagnosability of robot actions, at least in a classical sense, is a feature that cannot be taken for granted since it strongly depends on the underlying action representation. We specifically define criteria that determine the diagnosability of robot actions. The diagnosability question is then analysed in the context of a handle manipulation action; we discuss two different representations of the action – a composite policy with a learned success model for the action parameters, and a neural network-based monolithic policy – which lie on opposite sides of the diagnosability spectrum. Through this comparison, we conclude that composite actions are more suited to explicit diagnosis, but representations with less prior knowledge are more flexible. This suggests that model learning may provide a balance between flexibility and diagnosability; however, data-driven diagnosis methods also need to be enhanced in order to deal with the complexity of modern robots.
Grasp verification is advantageous for autonomous manipulation robots as it provides the feedback required by higher-level planning components about successful task completion. However, a major obstacle in grasp verification is sensor selection. In this paper, we propose a vision-based grasp verification system using machine vision cameras, with the verification problem formulated as an image classification task. Machine vision cameras consist of a camera and a processing unit capable of on-board deep learning inference. Inference on this low-power hardware is done near the data source, reducing the robot's dependence on a centralized server, leading to reduced latency and improved reliability. Machine vision cameras provide deep learning inference capabilities using different neural accelerators. However, it is not clear from the documentation of these cameras what effect these neural accelerators have on performance metrics such as latency and throughput. To systematically benchmark these machine vision cameras, we propose a parameterized model generator that generates end-to-end models of Convolutional Neural Networks (CNNs). Using these generated models, we benchmark the latency and throughput of two machine vision cameras, the JeVois A33 and the Sipeed Maix Bit. Our experiments demonstrate that the selected machine vision camera and deep learning model can robustly verify grasps with 97% per-frame accuracy.
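The model generator idea can be sketched in a few lines: build CNNs from a small parameter space and time their forward passes. The generator below is illustrative; the paper's actual parameter space and on-camera deployment are not shown.

```python
import time
import torch
import torch.nn as nn

def generate_cnn(depth, width, n_classes=2, in_ch=3):
    """Parameterized end-to-end CNN generator in the spirit of the paper's
    benchmark models (parameter space is an assumption)."""
    layers, ch = [], in_ch
    for _ in range(depth):
        layers += [nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
        ch = width
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, n_classes)]
    return nn.Sequential(*layers)

model = generate_cnn(depth=4, width=32).eval()
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    start = time.perf_counter()
    for _ in range(50):
        model(x)
latency_ms = (time.perf_counter() - start) / 50 * 1000
print(f"mean per-frame latency: {latency_ms:.1f} ms")
```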
Efficient and comprehensive assessment of students' knowledge is an imperative task in any learning process. Short answer grading is one of the most successful methods of assessing the knowledge of students. Many supervised learning and deep learning approaches have been used to automate the task of short answer grading in the past. We investigate why assistive grading with active learning would be the next logical step in this task, as there is no absolute ground truth answer for any question and the task is very subjective in nature. We present a fast and easy method to harness the power of active learning and natural language processing in assisting the task of grading short answer questions. A web-based GUI is designed and implemented to incorporate an interactive short answer grading system. The experiments show that active learning saves the time and effort of graders in assessment and reaches the performance of supervised learning with a smaller amount of graded answers for training.
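The assistive loop reduces to uncertainty sampling: the model grades everything, and the human grades only the answers the model is least sure about. A synthetic-data sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16))                   # answer features (synthetic)
y = (X[:, 0] > 0).astype(int)                    # hidden "correct" grades

labeled = list(range(10))                        # small seed set of graded answers
for _ in range(5):                               # active learning rounds
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X)[:, 1]
    uncertainty = np.abs(proba - 0.5)            # closest to 0.5 = least sure
    query = int(np.argsort(uncertainty)[0])      # ask the grader about this one
    if query not in labeled:
        labeled.append(query)                    # grader supplies the label
print(clf.score(X, y))
```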
Comparative Evaluation of Pretrained Transfer Learning Models on Automatic Short Answer Grading
(2020)
Automatic Short Answer Grading (ASAG) is the process of grading student answers by computational approaches, given a question and the desired answer. Previous works implemented concept mapping and facet mapping, and some used conventional word embeddings for extracting semantic features; they extracted multiple features manually to train on the corresponding datasets. We use pretrained embeddings of the transfer learning models ELMo, BERT, GPT, and GPT-2 to assess their efficiency on this task. We train with a single feature, cosine similarity, extracted from the embeddings of these models. We compare the RMSE scores and correlation measurements of the four models with previous works on the Mohler dataset. Our work demonstrates that ELMo outperformed the other three models. We also briefly describe the four transfer learning models and conclude with the possible causes of the poor results of transfer learning models.
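The single grading feature is plain cosine similarity between embeddings; a sketch follows (random vectors stand in for ELMo/BERT/GPT/GPT-2 embeddings, and the 0-5 scaling mirrors the Mohler dataset's grade range).

```python
import numpy as np

def cosine_similarity(a, b):
    """The single feature used for grading: cosine similarity between the
    embedding of a student answer and of the reference answer."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
ref_answer = rng.random(768)        # stand-in for a reference-answer embedding
student_answer = rng.random(768)    # stand-in for a student-answer embedding
similarity = cosine_similarity(ref_answer, student_answer)
grade = similarity * 5.0            # scale to a 0-5 grade range
print(round(grade, 2))
```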
Deep learning models are extensively used in various safety-critical applications. Hence, these models need to be highly reliable in addition to being accurate. One way of achieving this is by quantifying uncertainty. Bayesian methods for uncertainty quantification (UQ) have been extensively studied for deep learning models applied to images, but have been less explored for 3D modalities such as point clouds, which are often used for robots and autonomous systems. In this work, we evaluate three uncertainty quantification methods, namely Deep Ensembles, MC-Dropout, and MC-DropConnect, on the DarkNet21Seg 3D semantic segmentation model and comprehensively analyze the impact of various parameters, such as the number of models in an ensemble or the number of forward passes, and drop probability values, on task performance and uncertainty estimate quality. We find that Deep Ensembles outperform the other methods in terms of both performance and uncertainty metrics, by a margin of 2.4% in mIoU and 1.3% in accuracy, while providing reliable uncertainty estimates for decision making.
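Of the three methods, MC-Dropout is the simplest to sketch: keep dropout active at test time and aggregate several stochastic forward passes. The toy classifier below stands in for DarkNet21Seg.

```python
import torch
import torch.nn as nn

# Toy classifier with dropout; stands in for the segmentation network.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                      nn.Dropout(p=0.2), nn.Linear(64, 3))
model.train()  # keep dropout active during inference (MC-Dropout)

x = torch.rand(1, 10)
with torch.no_grad():
    probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(30)])

mean_pred = probs.mean(dim=0)     # predictive distribution
uncertainty = probs.var(dim=0)    # disagreement across stochastic passes
print(mean_pred, uncertainty)
```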
Final Report on the BMBF-Funded Project Enabling Infrastructure for HPC-Applications (EI-HPC)
(2020)
Emotion and gender recognition from facial features are important properties of human empathy. Robots should also have these capabilities. For this purpose, we have designed special convolutional modules that allow a model to recognize emotions and gender with a considerably lower number of parameters, enabling real-time evaluation on a constrained platform. We report accuracies of 96% on the IMDB gender dataset and 66% on the FER-2013 emotion dataset, while requiring a computation time of less than 0.008 seconds on a Core i7 CPU. All our code, demos, and pre-trained architectures have been released under an open-source license in our repository at https://github.com/oarriaga/face_classification.
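The parameter saving comes from replacing standard convolutions with depthwise-separable ones; the Keras comparison below shows the effect on a single layer (sizes are illustrative only).

```python
from tensorflow import keras
from tensorflow.keras import layers

# Standard 3x3 convolution from 64 to 128 channels...
standard = keras.Sequential([layers.Input((48, 48, 64)),
                             layers.Conv2D(128, 3, padding="same")])
# ...versus its depthwise-separable counterpart.
separable = keras.Sequential([layers.Input((48, 48, 64)),
                              layers.SeparableConv2D(128, 3, padding="same")])
print(standard.count_params(), separable.count_params())
# 73,856 vs 8,896 parameters: roughly an 8x reduction for this layer
```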
Background: Virtual reality combined with spherical treadmills is used across species for studying neural circuits underlying navigation.
New Method: We developed an optical flow-based method for tracking treadmill ball motion in real time using a single high-resolution camera.
Results: Tracking accuracy and timing were determined using calibration data. Ball tracking was performed at 500 Hz and integrated with an open source game engine for virtual reality projection. The projection was updated at 120 Hz with a latency with respect to ball motion of 30 ± 8 ms.
Comparison with Existing Method(s): Optical flow-based tracking of treadmill motion is typically achieved using optical mice. The camera-based optical flow tracking system developed here is based on off-the-shelf components and offers control over the image acquisition and processing parameters. This results in flexibility with respect to tracking conditions – such as ball surface texture, lighting conditions, or ball size – as well as camera alignment and calibration.
Conclusions: A fast system for rotational ball motion tracking suitable for virtual reality animal behavior across different scales was developed and characterized.
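The flow-based tracking principle can be sketched with OpenCV's Farneback dense flow on synthetic frames; the paper's calibrated mapping from pixel flow to ball rotation angles is omitted here.

```python
import cv2
import numpy as np

# Synthetic frames: random ball-surface texture shifted sideways by 3 px,
# simulating a small ball rotation between consecutive camera frames.
prev = np.random.randint(0, 255, (240, 240), dtype=np.uint8)
curr = np.roll(prev, shift=3, axis=1)

# Dense Farneback optical flow (standard OpenCV parameter values).
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
mean_dx, mean_dy = flow[..., 0].mean(), flow[..., 1].mean()
print(f"estimated displacement per frame: ({mean_dx:.2f}, {mean_dy:.2f}) px")
```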
Tell Your Robot What To Do: Evaluation of Natural Language Models for Robot Command Processing
(2019)
The use of natural language to specify robot tasks is a convenient way to command robots. As a result, several models and approaches capable of understanding robot commands have been developed, which, however, complicates the choice of a suitable model for a given scenario. In this work, we present a comparative analysis and benchmarking of four natural language understanding models - Mbot, Rasa, LU4R, and ECG. In particular, we evaluate the performance of the models in understanding domestic service robot commands by recognizing the actions and any complementary information in them, in three use cases: the RoboCup@Home General Purpose Service Robot (GPSR) category 1 contest, GPSR category 2, and hospital logistics in the context of the ROPOD project.
When developing robot functionalities, finite state machines are commonly used due to their straightforward semantics and simple implementation. State machines are also a natural implementation choice when designing robot experiments, as they generally lead to reproducible program execution. In practice, the implementation of state machines can lead to significant code repetition and may necessitate unnecessary code interaction when reparameterisation is required. In this paper, we present a small Python library that allows state machines to be specified, configured, and dynamically created using a minimal domain-specific language. We illustrate the use of the library in three different use cases - scenario definition in the context of the RoboCup@Home competition, experiment design in the context of the ROPOD project, as well as specification transfer between robots.
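The flavour of such a minimal specification language can be sketched as a state machine defined as data; this mirrors the library's intent but does not reproduce its actual configuration format.

```python
# A state machine specified as data rather than as repeated boilerplate
# classes; states, actions, and transitions are declared in one place.
SM_SPEC = {
    "initial": "pick",
    "states": {
        "pick":  {"action": lambda: print("picking"),  "next": "place"},
        "place": {"action": lambda: print("placing"),  "next": "done"},
        "done":  {"action": lambda: print("finished"), "next": None},
    },
}

def run_state_machine(spec):
    """Interpret the declarative specification: run each state's action
    and follow its transition until a terminal state is reached."""
    state = spec["initial"]
    while state is not None:
        node = spec["states"][state]
        node["action"]()
        state = node["next"]

run_state_machine(SM_SPEC)
```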
Data-Driven Robot Fault Detection and Diagnosis Using Generative Models: A Modified SFDD Algorithm
(2019)
This paper presents a modification of the data-driven sensor-based fault detection and diagnosis (SFDD) algorithm for online robot monitoring. Our version of the algorithm uses a collection of generative models, in particular restricted Boltzmann machines, each of which represents the distribution of sliding window correlations between a pair of correlated measurements. We use such models in a residual generation scheme, where high residuals generate conflict sets that are then used in a subsequent diagnosis step. As a proof of concept, the framework is evaluated on a mobile logistics robot for the problem of recognising disconnected wheels; the evaluation demonstrates the feasibility of the framework (on the faulty data set, the models obtained 88.6% precision and 75.6% recall rates), but also shows that the monitoring results are influenced by the choice of distribution model and the model parameters as a whole.
For robots acting - and failing - in everyday environments, a predictable behaviour representation is important so that it can be utilised for failure analysis, recovery, and subsequent improvement. Learning from demonstration combined with dynamic motion primitives is one commonly used technique for creating models that are easy to analyse and interpret; however, mobile manipulators complicate such models since they need the ability to synchronise arm and base motions for performing purposeful tasks. In this paper, we analyse dynamic motion primitives in the context of a mobile manipulator - a Toyota Human Support Robot (HSR) - and introduce a small extension of dynamic motion primitives that makes it possible to perform whole body motion with a mobile manipulator. We then present an extensive set of experiments in which our robot was grasping various everyday objects in a domestic environment, where a sequence of object detection, pose estimation, and manipulation was required for successfully completing the task. Our experiments demonstrate the feasibility of the proposed whole body motion framework for everyday object manipulation, but also illustrate the necessity for highly adaptive manipulation strategies that make better use of a robot's perceptual capabilities.
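For orientation, a one-dimensional discrete dynamic motion primitive looks as follows; the whole-body extension additionally distributes the resulting motion across arm and base, a coordination step omitted here. Gains are illustrative.

```python
# Minimal 1D dynamic motion primitive: a spring-damper system pulled
# towards the goal, modulated by a learned forcing term (placeholder here).
K, D, tau, dt = 100.0, 20.0, 1.0, 0.01     # critically damped: D = 2*sqrt(K)
x, v, x0, g = 0.0, 0.0, 0.0, 1.0           # state, start, and goal
s, alpha = 1.0, 4.0                        # canonical system phase

def forcing(s):
    return 0.0                             # stand-in for the learned term

trajectory = []
for _ in range(200):
    a = (K * (g - x) - D * v + (g - x0) * forcing(s)) / tau
    v += a * dt
    x += v / tau * dt
    s += -alpha * s / tau * dt             # phase decays towards zero
    trajectory.append(x)
print(round(trajectory[-1], 3))            # converges towards the goal g
```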
In Sensor-based Fault Detection and Diagnosis (SFDD) methods, spatial and temporal dependencies among the sensor signals can be modeled to detect faults in the sensors, if the defined dependencies change over time. In this work, we model Granger causal relationships between pairs of sensor data streams to detect changes in their dependencies. We compare the method on simulated signals with the Pearson correlation, and show that the method elegantly handles noise and lags in the signals and provides appreciable dependency detection. We further evaluate the method using sensor data from a mobile robot by injecting both internal and external faults during operation of the robot. The results show that the method is able to detect changes in the system when faults are injected, but is also prone to detecting false positives. This suggests that this method can be used as a weak detection of faults, but other methods, such as the use of a structural model, are required to reliably detect and diagnose faults.
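Pairwise Granger testing of this kind is available off the shelf in statsmodels; a sketch with synthetic lagged signals follows (a previously significant dependency turning insignificant would be flagged as a potential fault).

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.roll(x, 2) + 0.1 * rng.normal(size=500)   # y lags x by two samples

# Test whether x Granger-causes y (column order: effect first, cause second).
data = pd.DataFrame({"y": y, "x": x})
results = grangercausalitytests(data[["y", "x"]], maxlag=4, verbose=False)
p_value = results[2][0]["ssr_ftest"][1]          # F-test p-value at lag 2
print(f"p-value at lag 2: {p_value:.4f}")        # small = dependency present
```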
In the field of service robots, dealing with faults is crucial to promote user acceptance. In this context, this work focuses on some specific faults which arise from the interaction of a robot with its real-world environment due to insufficient knowledge for action execution.
In our previous work [1], we have shown that such missing knowledge can be obtained through learning by experimentation. The combination of symbolic and geometric models allows us to represent action execution knowledge effectively. However, we did not propose a suitable representation of the symbolic model.
In this work, we investigate such a symbolic representation and evaluate its learning capability. The experimental analysis is performed on four use cases using four different learning paradigms. As a result, the symbolic representation together with the most suitable learning paradigm are identified.
The increasing complexity of tasks that robots are required to execute demands higher reliability of robotic platforms. For this, it is crucial for robot developers to consider fault diagnosis. In this study, a general non-intrusive fault diagnosis system for robotic platforms is proposed. A mini-PC is non-intrusively attached to a robot and used to detect and diagnose faults. The health data and diagnoses produced by the mini-PC are then standardized and transmitted to a remote PC. A storage device is also attached to the mini-PC for logging of health data in case of loss of communication with the remote PC. In this study, a hybrid fault diagnosis method is compared to consistency-based diagnosis (CBD), and CBD is selected to be deployed on the system. The proposed system is modular and can be deployed on different robotic platforms with minimal setup.
Robot deployment in realistic environments is challenging despite the fact that robots can be quite skilled at a large number of isolated tasks. One reason for this is that robots are rarely equipped with powerful introspection capabilities, which means that they cannot always deal with failures in an acceptable manner; in addition, manual diagnosis is often a tedious task that requires technicians to have a considerable set of robotics skills. In this paper, we discuss our ongoing efforts to address some of these problems. In particular, we (i) present our early efforts at developing a robotic black box and consider some factors that complicate its design, (ii) explain our component and system monitoring concept, and (iii) describe the necessity for remote monitoring and experimentation as well as our initial attempts at performing those. Our preliminary work opens a range of promising directions for making robots more usable and reliable in practice.
Robot deployment in realistic dynamic environments is a challenging problem despite the fact that robots can be quite skilled at a large number of isolated tasks. One reason for this is that robots are rarely equipped with powerful introspection capabilities, which means that they cannot always deal with failures in a reasonable manner; in addition, manual diagnosis is often a tedious task that requires technicians to have a considerable set of robotics skills.
In this paper, we propose and implement a general convolutional neural network (CNN) building framework for designing real-time CNNs. We validate our models by creating a real-time vision system which accomplishes the tasks of face detection, gender classification, and emotion classification simultaneously in one blended step using our proposed CNN architecture. After presenting the details of the training procedure setup, we proceed to evaluate on standard benchmark sets. We report accuracies of 96% on the IMDB gender dataset and 66% on the FER-2013 emotion dataset. Along with this, we also employ the recently introduced real-time guided back-propagation visualization technique. Guided back-propagation uncovers the dynamics of the weight changes and evaluates the learned features. We argue that the careful implementation of modern CNN architectures, the use of current regularization methods, and the visualization of previously hidden features are necessary in order to reduce the gap between slow-performing and real-time architectures. Our system has been validated by its deployment on a Care-O-bot 3 robot used during RoboCup@Home competitions. All our code, demos, and pre-trained architectures have been released under an open-source license in our public repository.
Current robot platforms are being employed to collaborate with humans in a wide range of domestic and industrial tasks. These environments require autonomous systems that are able to classify and communicate anomalous situations such as fires, injured persons, and car accidents, or, generally, any situation potentially dangerous for humans. In this paper, we introduce an anomaly detection dataset for robot applications as well as the design and implementation of a deep learning architecture that classifies and describes dangerous situations using only a single image as input. We report a classification accuracy of 97% and a METEOR score of 16.2. We will make the dataset publicly available after this paper is accepted.
While executing actions, service robots may experience external faults because of insufficient knowledge about the actions' preconditions. The possibility of encountering such faults can be minimised if symbolic and geometric precondition models are combined into a representation that specifies how and where actions should be executed. This work investigates the problem of learning such action execution models and the manner in which those models can be generalised. In particular, we develop a template-based representation of execution models, which we call delta models, and describe how symbolic template representations and geometric success probability distributions can be combined for generalising the templates beyond the problem instances on which they are created. Our experimental analysis, which is performed with two physical robot platforms, shows that delta models can describe execution-specific knowledge reliably, thus serving as a viable model for avoiding the occurrence of external faults.
Cognitive robotics aims at understanding biological processes, though it also has the potential to improve future robotic systems. Here we show how a biologically inspired model of motor control with neural fields can be augmented with additional components such that it is able to solve a basic robotics task, that of obstacle avoidance. While obstacle avoidance is a well-researched area, the focus here is on the extensibility of a biologically inspired framework. This work demonstrates how easily the biologically inspired system can be used to adapt to new tasks. This flexibility is thought to be a major hallmark of biological agents.
Autonomous mobile robots comprise several hardware and software components. These components interact with each other continuously in order to achieve autonomy. Due to the complexity of this task, a monumental responsibility is bestowed upon the developer to make sure that the robot is always operable. Hence, some means of detecting faults should be readily available. In this work, the aforementioned fault-detection system is a robotic black box (RBB) attached to the robot, which acquires all the relevant measurements of the system that are needed to achieve a fault-free robot. Due to limited computational and memory resources on board the RBB, a distributed diagnosis is proposed. That is, the fault diagnosis task (detection and isolation) is shared between an on-board component (the black box) and an off-board component (an external computer). The distribution of the diagnosis task allows for a non-intrusive method of detecting and diagnosing faults, in addition to the ability to remotely diagnose a robot and potentially issue a repair command. In addition to decomposing the diagnosis task and allowing remote diagnosability of the robot, another key feature of this work is the addition of expert human knowledge to aid in the fault detection process.
This paper presents the b-it-bots@Home team and its mobile service robot called Jenny – a service robot based on the Care-O-bot 3 platform manufactured by the Fraunhofer Institute for Manufacturing Engineering and Automation. An overview of the robot control architecture and its capabilities is presented. These capabilities refer to the functionalities added through research and projects carried out at Bonn-Rhein-Sieg University of Applied Sciences.
Robust Indoor Localization Using Optimal Fusion Filter For Sensors And Map Layout Information
(2014)
The ability to track moving people is a key aspect of autonomous robot systems in real-world environments. Whilst for many tasks knowing the approximate positions of people may be sufficient, the ability to identify unique people is needed to accurately count people in the real world. To accomplish the people counting task, a robust system for people detection, tracking and identification is needed.
In the field of domestic service robots, recovery from faults is crucial to promote user acceptance. In this context we focus in particular on some specific faults, which arise from the interaction of a robot with its real world environment. Even a well-modelled robot may fail to perform its tasks successfully due to unexpected situations, which occur while interacting. These situations occur as deviations of properties of the objects (manipulated by the robot) from their expected values. Hence, they are experienced by the robot as external faults.
Unexpected Situations in Service Robot Environment: Classification and Reasoning Using Naive Physics
(2014)
Robots that are able to carry out their tasks robustly in real-world environments are not only desirable but necessary if we want them to be more welcome to a wider audience. But very often they may fail to execute their actions successfully because of insufficient information about the behaviour of objects used in the actions.
Improving Robustness of Task Execution Against External Faults Using Simulation Based Approach
(2013)
Robots interacting in complex and cluttered environments may face unexpected situations, referred to as external faults, which prohibit the successful completion of their tasks. In order to function in a more robust manner, robots need to recognise these faults and learn how to deal with them in the future. We present a simulation-based technique to avoid external faults occurring during the execution of a robot's releasing actions. Our technique utilizes simulation to generate a set of labeled examples, which are used by a histogram algorithm to compute a safe region. A safe region consists of a set of releasing states of an object that correspond to successful performances of the action. This technique also suggests a general solution to avoid the occurrence of external faults not only for the current, observable object but also for any other object of the same shape but a different size.
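The histogram step can be pictured as follows: simulated releases yield labeled states, and a 2D histogram of the successful ones approximates the safe region. The data and the cell-count threshold below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
states = rng.uniform(0, 1, size=(1000, 2))        # (x, z) release positions
# Stand-in for the simulation label: releases near the centre succeed.
success = np.linalg.norm(states - 0.5, axis=1) < 0.25

# Histogram the successful release states over a grid of cells.
hist, xedges, zedges = np.histogram2d(states[success, 0], states[success, 1],
                                      bins=10, range=[[0, 1], [0, 1]])
safe_region = hist > 5   # cells with enough successful samples (assumed threshold)
print(safe_region.astype(int))
```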
We developed a scene text recognition system with active vision capabilities, namely: auto-focus, adaptive aperture control and auto-zoom. Our localization system is able to delimit text regions in images with complex backgrounds, and is based on an attentional cascade, asymmetric adaboost, decision trees and Gaussian mixture models. We think that text could become a valuable source of semantic information for robots, and we aim to raise interest in it within the robotics community. Moreover, thanks to the robot’s pan-tilt-zoom camera and to the active vision behaviors, the robot can use its affordances to overcome hindrances to the performance of the perceptual task. Detrimental conditions, such as poor illumination, blur, low resolution, etc. are very hard to deal with once an image has been captured and can often be prevented. We evaluated the localization algorithm on a public dataset and one of our own with encouraging results. Furthermore, we offer an interesting experiment in active vision, which makes us consider that active sensing in general should be considered early on when addressing complex perceptual problems in embodied agents.
This project investigated the viability of using the Microsoft Kinect in order to obtain reliable Red-Green-Blue-Depth (RGBD) information. This explored the usability of the Kinect in a variety of environments as well as its ability to detect different classes of materials and objects. This was facilitated through the implementation of Random Sample and Consensus (RANSAC) based algorithms and highly parallelized workflows in order to provide time sensitive results. We found that the Kinect provides detailed and reliable information in a time sensitive manner. Furthermore, the project results recommend usability and operational parameters for the use of the Kinect as a scientific research tool.
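A minimal RANSAC plane fit of the kind used for such RGBD processing is sketched below with synthetic points; the threshold and iteration count are typical values, not the project's.

```python
import numpy as np

def ransac_plane(points, n_iter=200, threshold=0.01, seed=0):
    """Minimal RANSAC plane fit: repeatedly fit a plane to three random
    points and keep the candidate with the most inliers."""
    rng = np.random.default_rng(seed)
    best_inliers, best_plane = 0, None
    for _ in range(n_iter):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                      # degenerate (collinear) sample
        normal /= norm
        d = -normal.dot(p1)
        inliers = int(np.sum(np.abs(points @ normal + d) < threshold))
        if inliers > best_inliers:
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers

pts = np.random.rand(2000, 3) * [1, 1, 0.02]   # mostly-planar synthetic cloud
(plane_normal, plane_d), count = ransac_plane(pts)
print(plane_normal, count)
```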
In the realm of service robots, recovery from faults is indispensable to foster user acceptance. Here, fault is to be understood not in the sense of robot-internal faults, but rather as interaction faults that occur while the robot is situated in and interacting with an environment (aka external faults). We reason along the most frequent failures in typical scenarios which we observed during real-world demonstrations and competitions using our Care-O-bot III robot. They take place in an apartment-like environment which is known as a closed world. We suggest four different (for now, ad hoc) fault categories caused by disturbances, imperfect perception, inadequate planning, or chaining of action sequences. The faults are categorized and then mapped to a handful of partly known, partly extended fault handling techniques. Among them, we applied qualitative reasoning, use of simulation as oracle, learning for planning (aka enhancement of plan operators) or, in future, case-based reasoning. Having laid out this frame, we mainly ask open questions related to the applicability of the presented approach, amongst them: how to find new categories, how to extend them, how to assure disjointness, and how to identify old and label new faults on the fly.
The work presented in this paper focuses on the comparison of well-known and new techniques for designing robust fault diagnosis schemes in the robot domain. The main challenge for fault diagnosis is to allow the robot to effectively cope not only with internal hardware and software faults but with external disturbances and errors from dynamic and complex environments as well.
The development of robot control programs is a complex task. Many robots differ in their electrical and mechanical structure, which is also reflected in the software. Specific robot software environments support program development, but they are mainly text-based and usually applied by experts in the field with profound knowledge of the target robot. This paper presents a graphical programming environment which aims to ease the development of robot control programs. In contrast to existing graphical robot programming environments, our approach focuses on the composition of parallel action sequences. The developed environment allows independent robot actions to be scheduled on parallel execution lines and provides mechanisms to avoid side-effects of parallel actions. It is platform-independent and based on the model-driven paradigm. The feasibility of our approach is shown by applying the sequencer to a simulated service robot and a robot for educational purposes.
This work presents a person independent pointing gesture recognition application. It uses simple but effective features for the robust tracking of the head and the hand of the user in an undefined environment. The application is able to detect if the tracking is lost and can be reinitialized automatically. The pointing gesture recognition accuracy is improved by the proposed fingertip detection algorithm and by the detection of the width of the face. The experimental evaluation with eight different subjects shows that the overall average pointing gesture recognition rate of the system for distances up to 250 cm (head to pointing target) is 86.63% (with a distance between objects of 23 cm). Considering just frontal pointing gestures for distances up to 250 cm the gesture recognition rate is 90.97% and for distances up to 194 cm even 95.31%. The average error angle is 7.28°.
With regard to performance, well-established SW-only design methodologies proceed by first making the initial specification run, then enhancing its functionality, and finally optimizing it. When designing Embedded Systems (EbS), this approach is not viable, since decisive design decisions, such as the estimation of required processing power or the identification of those parts of the specification which need to be delegated to dedicated HW, depend on the speed and fairness of the initial specification. We here propose a sequence of optimization steps embedded into the design flow, which enables a structured way to accelerate a given working EbS specification at different layers. This sequence of accelerations comprises algorithm selection, algorithm transformation, data transformation, implementation optimization, and finally HW acceleration. We analyze how all acceleration steps are influenced by the specific attributes of the underlying EbS. The overall acceleration procedure is explained and quantified using a real-life industrial example.
This paper presents an approach to estimate the ego-motion of a robot while moving. The employed sensor is a Time-of-Flight (ToF) camera, the SR3000 from Mesa Imaging. ToF cameras provide depth and reflectance data of the scene at high frame rates. The proposed method utilizes the coherence of depth and reflectance data of ToF cameras by detecting image features on reflectance data and estimating the motion on depth data. The motion estimate of the camera is fused with inertial measurements to gain higher accuracy and robustness. The result of the algorithm is benchmarked against reference poses determined by matching accurate 2D range scans. The evaluation shows that fusing the pose estimate with the data from the IMU improves the accuracy and robustness of the motion estimate against distorted measurements from the sensor.
The goal of this work is to develop an integration framework for a robotic software system which enables robotic learning by experimentation within a distributed and heterogeneous setting. To meet this challenge, the authors specified, defined, developed, implemented and tested a component-based architecture called XPERSIF. The architecture comprises loosely-coupled, autonomous components that offer services through their well-defined interfaces and form a service-oriented architecture. The Ice middleware is used in the communication layer. Additionally, the successful integration of the XPERSim simulator into the system has enabled simultaneous quasi-realtime observation of the simulation by numerous, distributed users.
Swedish wheeled mobile robots have remarkable mobility properties allowing them to rotate and translate at the same time. Being holonomic systems, their kinematics model results in the possibility of designing separate and independent position and heading trajectory tracking control laws. Nevertheless, if these control laws should be implemented in the presence of unaccounted actuator saturation, the resulting saturated linear and angular velocity commands could interfere with each other thus dramatically affecting the overall expected performance. Based on Lyapunov’s direct method, a position and heading trajectory tracking control law for Swedish wheeled robots is developed. It explicitly accounts for actuator saturation by using ideas from a prioritized task based control framework.