Technical Report / Hochschule Bonn-Rhein-Sieg University of Applied Sciences. Department of Computer Science
Publisher: Dean Prof. Dr. Sascha Alda
Hochschule Bonn-Rhein-Sieg University of Applied Sciences, Department of Computer Science
Sankt Augustin, Germany
ISSN 1869-5272
02-2018
When transmitting and storing data, a key question is how far the data can be compressed without losing their information content.
A measure of the information content of data is therefore of fundamental importance. About seventy years ago, C. E. Shannon introduced such a measure and thereby founded the teaching and research field of information theory, which has since contributed substantially to the design and realization of information and communication technologies. About twenty years later, A. N. Kolmogorov introduced a different measure of the information content of data. While Shannon's information theory is part of the curriculum of degree programs in mathematics, computer science, and electrical engineering, Kolmogorov's algorithmic information theory is far less well known and tends to be the subject of specialized courses.
In recent years, however, interest in this theory has been growing, particularly since the relevant literature reports successful practical applications of it. This report gives an introduction to the basic ideas of the theory and describes how it can be applied to selected problems in theoretical computer science.
The report can be used as lecture notes for introductory courses on algorithmic information theory, and as reading material for becoming familiar with the topic as a starting point for research and development work.
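The contrast between the two measures of information content can be illustrated with a small sketch. It is not taken from the report: the function names and test strings are hypothetical, and because Kolmogorov complexity is not computable, the length of a zlib-compressed representation is used here only as a crude upper-bound proxy for it.

```python
import math
import os
import zlib
from collections import Counter


def empirical_entropy_bits(data: bytes) -> float:
    """Shannon entropy, in bits per byte, of the byte frequencies in data."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_length(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (assumed proxy, upper bound only)."""
    return len(zlib.compress(data, 9))


# A highly regular string and a string of random bytes, both 10000 bytes long.
regular = b"ab" * 5000
random_like = os.urandom(10000)

for name, data in (("regular", regular), ("random", random_like)):
    print(name, round(empirical_entropy_bits(data), 3), compressed_length(data))
```

The byte-frequency entropy of the regular string is one bit per byte, which would suggest roughly 1250 bytes, yet a general-purpose compressor describes it in far fewer bytes because it exploits the repetition. The random bytes, in contrast, are expected to remain essentially incompressible.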
03-2018
Friction requires actuators to produce additional torque for a robot to move, which in turn increases energy consumption. We cannot eliminate friction, but we can make motions more energy efficient by considering friction effects in motion computations. Optimizing motions means computing efficient joint torques/accelerations based on the different friction torques imposed in each joint. Existing friction forces can be used to support certain types of arm motions, e.g. standing still.
Reducing the energy consumption of robot arms provides many benefits, such as longer battery life for mobile robots and less heat in motor systems.
The aim of this project is to extend an already available constrained hybrid dynamic solver by including static friction effects in the computation of energy-optimal motions. When the algorithm is extended to account for static friction factors, a convex optimization (maximization) problem must be solved.
The author of this hybrid dynamic solver briefly outlined an approach for including static friction forces in the computation of motions, but without a detailed derivation or an elaboration showing its correctness. The author also outlined an idea for improving the computational efficiency of the approach, again without providing its derivation.
In this project, the proposed approach for extending the originally formulated algorithm has been completely derived and evaluated in order to show its feasibility. The evaluation is conducted in a simulation environment with a one-DOF robot arm, and it shows correct results for the computed motions. Furthermore, this project presents the derivation of the outlined method for improving the computational efficiency of the extended solver.
03-2020
Human and robot tasks in household environments include actions such as carrying an object, cleaning a surface, etc. These tasks are performed by means of dexterous manipulation, and for humans, they are straightforward to accomplish. Moreover, humans perform these actions with reasonable accuracy and precision but with much less energy and stress on the actuators (muscles) than the robots do. The high agility in controlling their forces and motions is actually due to "laziness", i.e. humans exploit the existing natural forces and constraints to execute the tasks.
The above-mentioned properties of the human lazy strategy motivate us to relax the problem of controlling robot motions and forces, and solve it with the help of the environment. Therefore, in this work, we developed a lazy control strategy, i.e. task specification models and control architectures that relax several aspects of robot control by exploiting prior knowledge about the task and environment. The developed control strategy is realized in four different robotics use cases. In this work, the Popov-Vereshchagin hybrid dynamics solver is used as one of the building blocks in the proposed control architectures. An extension of the solver’s interface with the artificial Cartesian force and feed-forward joint torque task-drivers is proposed in this thesis.
To validate the proposed lazy control approach, an experimental evaluation was performed in a simulation environment and on a real robot platform.
01-2019
Interactive Object Detection
(2019)
The success of state-of-the-art object detection methods depends heavily on the availability of a large amount of annotated image data. Raw image data available from various sources are abundant but not annotated. Annotating image data is often costly, time-consuming, or requires expert help. In this work, a new paradigm of learning called Active Learning is explored, which uses user interaction to obtain annotations for a subset of the dataset. The goal of active learning is to achieve superior object detection performance with images that are annotated on demand. To realize the active learning method, the trade-off between the effort to annotate unlabeled data (annotation cost) and the performance of the object detection model is minimised.
A Random Forest-based method called Hough Forest is chosen as the object detection model, and the annotation cost is calculated as the predicted false positive and false negative rate. The framework is successfully evaluated on two Computer Vision benchmark datasets and two custom Carl Zeiss datasets. An evaluation of RGB, HoG and Deep features for the task is also presented.
Experimental results show that using Deep features with Hough Forest achieves the maximum performance. By employing Active Learning, it is demonstrated that performance comparable to the fully supervised setting can be achieved by annotating just 2.5% of the images. To this end, an annotation tool is developed for user interaction during Active Learning.
04-2015
Advanced driver assistance systems (ADAS) are technology systems and devices designed as an aid to the driver of a vehicle. One of the critical components of any ADAS is the traffic sign recognition module. For this module to achieve real-time performance, input images must be preprocessed by a traffic sign detection (TSD) algorithm that reduces the possible hypothesis space. The performance of the TSD algorithm is critical.
One of the best algorithms used for TSD is the Radial Symmetry Detector (RSD), which can detect both Circular [7] and Polygonal traffic signs [5]. This algorithm runs in real-time on high-end personal computers, but its computational performance must be improved in order to run in real-time on embedded computer platforms.
To improve the computational performance of the RSD, we propose a multiscale approach and the removal of a Gaussian smoothing filter used in this algorithm. We evaluate computation times, detection rates and false positive rates on a synthetic image dataset and on the German Traffic Sign Detection Benchmark (GTSDB) [29].
We observed significant speedups compared to the original algorithm. Our Improved Radial Symmetry Detector is up to 5.8 times faster than the original on detecting Circles, up to 3.8 times faster on Triangle detection, 2.9 times faster on Square detection and 2.4 times faster on Octagon detection. All of these measurements were observed with better detection and false positive rates than the original RSD.
When evaluated on the GTSDB, we observed smaller speedups, in the range of 1.6 to 2.3 times faster for Circle and Regular Polygon detection. For Circle detection we observed a lower detection rate than with the original algorithm, while for Regular Polygon detection we always observed better detection rates. False positive rates were high, in the range of 80% to 90%.
We conclude that our Improved Radial Symmetry Detector is a significant improvement of the Radial Symmetry Detector, both for Circle and Regular Polygon detection. We expect that our improved algorithm will pave the way towards real-time traffic sign detection and recognition on embedded computer platforms.
05-2015
Extraction of text information from visual sources is an important component of many modern applications, for example, extracting the text from traffic signs on a road scene in an autonomous vehicle. For natural images or road scenes this is an unsolved problem. In this thesis the use of the histogram of stroke widths (HSW) for character and non-character region classification is presented. Stroke widths are extracted using two methods: one is based on the Stroke Width Transform and the other on run lengths. The HSW is combined with two simple region features, the aspect and occupancy ratios, and a linear SVM is then used as classifier. One advantage of our method over the state of the art is that it is script-independent and can also be used to verify detected text regions with the purpose of reducing false positives. Our experiments on generated datasets of Latin, CJK, Hiragana and Katakana characters show that the HSW is able to correctly classify at least 90% of the character regions; a similar figure is obtained for non-character regions. This performance is also obtained when training the HSW with one script and testing with a different one, and even when characters are rotated. On the English and Kannada portions of the Chars74K dataset we obtained over 95% correctly classified character regions. The use of raycasting for text line grouping is also proposed. By combining it with our HSW-based character classifier, a text detector based on Maximally Stable Extremal Regions (MSER) was implemented. The text detector was evaluated on our own dataset of road scenes from the German Autobahn, where 65% precision and 72% recall with an F-score of 69% were obtained. Using the HSW as a text verifier increases precision while slightly reducing recall. Our HSW feature allows the building of a script-independent and low parameter count classifier for character and non-character regions.
01-2018
Motion capture, often abbreviated mocap, generally aims at recording any kind of motion -- be it from a person or an object -- and transforming it into a computer-readable format. Data recorded from (professional and non-professional) human actors in particular are typically used for analysis in e.g. medicine, sport sciences, or biomechanics to evaluate human motion across various factors. Motion capture is also widely used in the entertainment industry: in video games and films, realistic motion sequences and animations are generated through data-driven motion synthesis based on recorded motion (capture) data.
Although the amount of publicly available full-body motion capture data is growing, the research community still lacks a comparable corpus of specialty motion data such as prehensile movements for everyday actions. On the one hand, such data can be used to enrich (hand-over animation) full-body motion capture data, which are usually captured without hand motion data due to the drastic dimensional difference in articulation detail. On the other hand, it provides means to classify and analyse prehensile movements with or without respect to the concrete object manipulated and to transfer the acquired knowledge to other fields of research (e.g. from 'pure' motion analysis to robotics or biomechanics).
Therefore, the objective of this motion capture database is to provide well-documented, free motion capture data for research purposes.
The presented database GraspDB14 contains in total over 2000 prehensile movements of ten different non-professional actors interacting with 15 different objects. Each grasp was realised five times by each actor. The motions are named systematically, with an (anonymous) identifier for each actor as well as one for the object grasped or interacted with.
The data were recorded as joint angles (and raw 8-bit sensor data) which can be transformed into positional 3D data (3D trajectories of each joint).
In this document, we provide a detailed description of the GraspDB14 database as well as of its creation (for reproducibility).
Chapter 2 gives a brief overview of motion capture techniques and of freely available motion capture databases for both full-body motions and hand motions, followed by a short section on how such data is made useful and re-used. Chapter 3 describes the database recording process and details the recording setup and the recorded scenarios. It includes a list of objects and performed types of interaction. Chapter 4 covers the file formats used, their contents, and naming patterns. We provide various tools for parsing, conversion, and visualisation of the recorded motion sequences and document their usage in chapter 5.
01-2014
Design of a declarative language for task-oriented grasping and tool-use with dextrous robotic hands
(2014)
Manipulation tasks that appear simple for a human, such as transportation or tool use, are challenging to replicate in an autonomous service robot. Nevertheless, dextrous manipulation is an important aspect for a robot in many daily tasks. While it is possible to manufacture special-purpose hands for one specific task in industrial settings, a general-purpose service robot in households must have flexible hands which can adapt to many tasks. Intelligently using tools enables the robot to perform tasks more efficiently and even beyond the designed capabilities. In this work a declarative domain-specific language, called Grasp Domain Definition Language (GDDL), is presented that allows the specification of grasp planning problems independently of a specific grasp planner. This design goal resembles the idea of the Planning Domain Definition Language (PDDL). The specification of GDDL requires a detailed analysis of the research in grasping in order to identify best practices in different domains that contribute to a grasp. These domains describe, for instance, physical as well as semantic properties of objects and hands. Grasping always has a purpose, which is captured in the task domain definition. It enables the robot to grasp an object in a task-dependent manner. Suitable representations in these domains have to be identified and formalized, for which a domain-driven software engineering approach is applied. This kind of modeling allows the specification of constraints which guide the composition of domain entity specifications. The domain-driven approach fosters reuse of domain concepts, while the constraints enable the validation of models already at design time. A proof-of-concept implementation of GDDL in the GraspIt! grasp planner is developed. Preliminary results of this thesis have been published and presented at the IEEE International Conference on Robotics and Automation (ICRA).
01-2015
Rural areas often lack affordable broadband Internet connectivity, mainly due to the CAPEX and especially OPEX of traditional operator equipment [HEKN11]. This digital divide limits the access to knowledge, health care and other services for billions of people. Different approaches to close this gap were discussed in the last decade [SPNB08]. In most rural areas satellite bandwidth is expensive, and cellular networks (3G, 4G) as well as WiMAX suffer from the usually low population density, which makes it hard to amortize the costs of a base station [SPNB08].
05-2020
The ability to finely segment different instances of various objects in an environment forms a critical tool in the perception tool-box of any autonomous agent. Traditionally instance segmentation is treated as a multi-label pixel-wise classification problem. This formulation has resulted in networks that are capable of producing high-quality instance masks but are extremely slow for real-world usage, especially on platforms with limited computational capabilities. This thesis investigates an alternate regression-based formulation of instance segmentation to achieve a good trade-off between mask precision and run-time. Particularly the instance masks are parameterized and a CNN is trained to regress to these parameters, analogous to bounding box regression performed by an object detection network.
In this investigation, the instance segmentation masks in the Cityscapes dataset are approximated using irregular octagons, and an existing object detector network (i.e., SqueezeDet) is modified to regress to the parameters of these octagonal approximations. The resulting network is referred to as SqueezeDetOcta. At the image boundaries, object instances are only partially visible. Due to the convolutional nature of most object detection networks, special handling of the boundary adhering object instances is warranted. However, current object detection techniques do not appear to account for this and handle all object instances alike. To this end, this work proposes selectively learning only partial, untainted parameters of the bounding box approximation of the boundary adhering object instances. Anchor-based object detection networks like SqueezeDet and YOLOv2 have a discrepancy between the ground-truth encoding/decoding scheme and the coordinate space used for clustering to generate the prior anchor shapes. To resolve this disagreement, this work proposes clustering in a space defined by two coordinate axes representing the natural log transformations of the width and height of the ground-truth bounding boxes.
When both SqueezeDet and SqueezeDetOcta were trained from scratch, SqueezeDetOcta lagged behind the SqueezeDet network by a substantial ≈ 6.19 mAP. Further analysis revealed that the sparsity of the annotated data was the reason for this lackluster performance of the SqueezeDetOcta network. To mitigate this issue, transfer learning was used to fine-tune the SqueezeDetOcta network starting from the trained weights of the SqueezeDet network. When all the layers of SqueezeDetOcta were fine-tuned, it outperformed the SqueezeDet network paired with logarithmically extracted anchors by ≈ 0.77 mAP. In addition, the forward pass latencies of both SqueezeDet and SqueezeDetOcta are close to 19 ms. Boundary adhesion considerations during training resulted in an improvement of ≈ 2.62 mAP over the baseline SqueezeDet network. A SqueezeDet network paired with logarithmically extracted anchors improved on the baseline SqueezeDet network by ≈ 1.85 mAP.
In summary, this work demonstrates that, given sufficient finely annotated instance data, an existing object detection network can be modified to predict much finer approximations (i.e., irregular octagons) of the instance annotations, whilst having the same forward pass latency as the bounding box predicting network. The results support the use of logarithmically extracted anchors to boost the performance of anchor-based object detection networks. The results also show that the special handling of image boundary adhering object instances produces more performant object detectors.
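The idea of extracting anchor shapes by clustering in log space can be sketched as follows. This is an illustrative sketch only, not the thesis implementation: it assumes plain k-means with a simple deterministic initialisation, and the names `kmeans` and `log_space_anchors` as well as the toy boxes are hypothetical. Each cluster centre in (ln w, ln h) space maps back to the geometric mean of the widths and heights assigned to it.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means on 2D points with a deterministic initialisation (assumed choice)."""
    pts = sorted(points)
    # Pick k evenly spaced points from the sorted list as initial centres.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep empty clusters in place.
        new_centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def log_space_anchors(boxes_wh, k):
    """Cluster (width, height) pairs in (ln w, ln h) space and map centres back."""
    logs = [(math.log(w), math.log(h)) for w, h in boxes_wh]
    return [(math.exp(cx), math.exp(cy)) for cx, cy in kmeans(logs, k)]


# Toy ground-truth box sizes in pixels (made up for illustration).
boxes = [(10, 20), (12, 22), (11, 21), (100, 200), (110, 190), (105, 210)]
anchors = sorted(log_space_anchors(boxes, k=2))
print(anchors)
```

Clustering in this space treats relative rather than absolute size differences as distances, so a few very large boxes do not dominate the anchor shapes.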
03-2022
As cameras are ubiquitous in autonomous systems, object detection is a crucial task. Object detectors are widely used in applications such as autonomous driving, healthcare, and robotics. Given an image, an object detector outputs both the bounding box coordinates as well as classification probabilities for each object detected. The state-of-the-art detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use in safety-critical applications in particular. It is therefore crucial to explain the reason behind each detector decision in order to gain user trust, enhance detector performance, and analyze their failure.
Previous work fails to explain as well as evaluate both bounding box and classification decisions individually for various detectors. Moreover, no tools explain each detector decision, evaluate the explanations, and also identify the reasons for detector failures. This restricts the flexibility to analyze detectors. The main contribution presented here is an open-source Detector Explanation Toolkit (DExT). It is used to explain the detector decisions, evaluate the explanations, and analyze detector errors. The detector decisions are explained visually by highlighting the image pixels that most influence a particular decision. The toolkit implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. To the author’s knowledge, this is the first work to conduct extensive qualitative and novel quantitative evaluations of different explanation methods across various detectors. The qualitative evaluation incorporates a visual analysis of the explanations carried out by the author as well as a human-centric evaluation. The human-centric evaluation includes a user study to understand user trust in the explanations generated across various explanation methods for different detectors. Four multi-object visualization methods are provided to merge the explanations of multiple objects detected in an image as well as the corresponding detector outputs in a single image. Finally, DExT implements the procedure to analyze detector failures using the formulated approach.
The visual analysis illustrates that the ability to explain a model is more dependent on the model itself than the actual ability of the explanation method. In addition, the explanations are affected by the object explained, the decision explained, detector architecture, training data labels, and model parameters. The results of the quantitative evaluation show that the Single Shot MultiBox Detector (SSD) is more faithfully explained compared to other detectors regardless of the explanation methods. In addition, a single explanation method cannot generate more faithful explanations than other methods for both the bounding box and the classification decision across different detectors. Both the quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among selected methods across all detectors. Finally, a convex polygon-based multi-object visualization method provides more human-understandable visualization than other methods.
The author expects that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.
01-2012
In service robotics, hardly any task can be carried out without the involvement of objects, as in searching, fetching or delivering tasks. Service robots are supposed to efficiently capture object-related information in real-world scenes while, for instance, coping with clutter and noise, and also to be flexible and scalable enough to memorize a large set of objects. Besides object perception tasks like object recognition, where the object's identity is analyzed, object categorization is an important visual object perception cue that associates unknown object instances with a corresponding category based on, e.g., their appearance or shape. We present a pipeline from the detection of object candidates in a domestic scene, over their description, to the final shape categorization of the detected candidates. In order to detect object-related information in cluttered domestic environments, an object detection method is proposed that copes with multiple plane and object occurrences, as in cluttered scenes with shelves. Further, a surface reconstruction method based on Growing Neural Gas (GNG) in combination with a shape distribution-based descriptor is proposed to reflect shape characteristics of object candidates. Beneficial properties provided by the GNG, such as smoothing and denoising effects, support a stable description of the object candidates, which also leads towards a more stable learning of categories. Based on the presented descriptor, a dictionary approach combined with a supervised shape learner is presented to learn prediction models of shape categories.
Experimental results are shown for different shapes related to object shape categories that appear in domestic environments, such as cup, can, box, bottle, bowl, plate and ball. A classification accuracy of about 90% and a sequential execution time of less than two seconds for the categorization of an unknown object are achieved, which supports the proposed system design. Additional results are shown towards object tracking and false positive handling to enhance the robustness of the categorization. An initial approach towards incremental shape category learning is also proposed, which learns a new category based on the set of previously learned shape categories.
02-2020
Object detectors have improved considerably in recent years by using advanced Convolutional Neural Network (CNN) architectures. However, many detector hyper-parameters are generally not tuned and are used with the values set by the detector authors. Blackbox optimization methods have gained more attention in recent years because of their ability to optimize the hyper-parameters of various machine learning algorithms and deep learning models. However, these methods have not been explored for improving the hyper-parameters of CNN-based object detectors. In this research work, we propose the use of blackbox optimization methods such as Gaussian Process based Bayesian Optimization (BOGP), Sequential Model-based Algorithm Configuration (SMAC), and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to tune the hyper-parameters in Faster R-CNN and Single Shot MultiBox Detector (SSD). In Faster R-CNN, tuning the input image size and the prior box anchor scales and ratios using BOGP, SMAC, and CMA-ES increased the performance by around 1.5% in terms of Mean Average Precision (mAP) on PASCAL VOC. Tuning the anchor scales of SSD increased the mAP by 3% on the PASCAL VOC and marine debris datasets. On the COCO dataset with SSD, an mAP improvement is observed for medium and large objects, but mAP decreases by 1% for small objects. The experimental results show that blackbox optimization methods increase the mAP of the object detectors. Moreover, they achieved better results than the hand-tuned configurations in most cases.
03-2008
This thesis introduces and demonstrates a novel method for learning qualitative models of the world by an autonomous robot. The method makes it possible to generate qualitative models that can be used for prediction as well as for directing the experiments to improve the model. The qualitative models form the knowledge representation of the robot and consist of qualitative trees and non-deterministic finite automata. An efficient exploration algorithm that lets the robot collect the most relevant learning samples is also introduced. To demonstrate the use of the methodology, representation and algorithm, two experiments are described. The first experiment is conducted using a mobile robot and a ball, where the robot observes the ball and learns the effect of its actions on the observed attributes of the world. The second experiment is conducted using a mobile robot and five boxes, two non-movable boxes and three movable boxes. The robot experiments actively with the objects and observes the changes in the attributes of the world. The main difference between the two experiments is that the first one tries to learn by observation while the second tries to learn by experimentation. In both experiments the robot learns qualitative models from its actions and observations. Although the primary objective of the robot is to improve itself by being able to predict the outcome of its actions, the learned models were also used at each step of the learning process to direct the experiments so that the model converges to the final model as quickly as possible.
01-2022
Effective Neighborhood Feature Exploitation in Graph CNNs for Point Cloud Object-Part Segmentation
(2022)
Part segmentation is the task of semantic segmentation applied on objects and carries a wide range of applications from robotic manipulation to medical imaging. This work deals with the problem of part segmentation on raw, unordered point clouds of 3D objects. While pioneering works on deep learning for point clouds typically ignore taking advantage of local geometric structure around individual points, the subsequent methods proposed to extract features by exploiting local geometry have not yielded significant improvements either. In order to investigate further, a graph convolutional network (GCN) is used in this work in an attempt to increase the effectiveness of such neighborhood feature exploitation approaches. Most of the previous works also focus only on segmenting complete point cloud data. Considering the impracticality of such approaches, taking into consideration the real world scenarios where complete point clouds are scarcely available, this work proposes approaches to deal with partial point cloud segmentation.
In the attempt to better capture neighborhood features, this work proposes a novel method to learn regional part descriptors which guide and refine the segmentation predictions. The proposed approach helps the network achieve state-of-the-art performance of 86.4% mIoU on the ShapeNetPart dataset for methods which do not use any preprocessing techniques or voting strategies. In order to better deal with partial point clouds, this work also proposes new strategies to train and test on partial data. While achieving significant improvements compared to the baseline performance, the problem of partial point cloud segmentation is also viewed through an alternate lens of semantic shape completion.
Semantic shape completion networks not only help deal with partial point cloud segmentation but also enrich the information captured by the system by predicting complete point clouds with corresponding semantic labels for each point. To this end, a new network architecture for semantic shape completion is also proposed based on point completion network (PCN) which takes advantage of a graph convolution based hierarchical decoder for completion as well as segmentation. In addition to predicting complete point clouds, results indicate that the network is capable of reaching within a margin of 5% to the mIoU performance of dedicated segmentation networks for partial point cloud segmentation.
03-2019
Currently, a variety of methods exist for creating different types of spatio-temporal world models. Despite the numerous methods for this type of modeling, there exists no methodology for comparing the different approaches or their suitability for a given application, e.g. logistics robots. In order to establish a means for comparing and selecting the best-fitting spatio-temporal world modeling technique, a methodology and standard set of criteria must be established. To that end, state-of-the-art methods for this type of modeling will be collected, listed, and described. Existing methods used for evaluation will also be collected where possible.
Using the collected methods, new criteria and techniques will be devised to enable the comparison of various methods in a qualitative manner. Experiments will be proposed to further narrow and ultimately select a spatio-temporal model for a given purpose. An example network of autonomous logistics robots, ROPOD, will serve as a case study used to demonstrate the use of the new criteria. This will also serve to guide the design of future experiments that aim to select a spatio-temporal world modeling technique for a given task. ROPOD was specifically selected as it operates in a real-world, human-shared environment. This type of environment is desirable for experiments as it provides a unique combination of common and novel problems that arise when selecting an appropriate spatio-temporal world model. Using the developed criteria, a qualitative analysis will be applied to the selected methods to remove unfit options.
Then, experiments will be run on the remaining methods to provide comparative benchmarks. Finally, the results will be analyzed and recommendations to ROPOD will be made.
04-2020
A Comparative Study of Uncertainty Estimation Methods in Deep Learning Based Classification Models
(2020)
Deep learning models produce overconfident predictions even for misclassified data. This work aims to improve the safety guarantees of software-intensive systems that use deep-learning-based classification models for decision making by performing a comparative evaluation of different uncertainty estimation methods to identify possible misclassifications.
In this work, uncertainty estimation methods applicable to deep learning models are reviewed, and those which can be seamlessly integrated into existing deployed deep learning architectures are selected for evaluation. The different uncertainty estimation methods (deep ensembles, test-time data augmentation, and Monte Carlo dropout with its variants) are empirically evaluated on two standard datasets (CIFAR-10 and CIFAR-100) and two custom classification datasets (optical inspection and the RoboCup@Work dataset). A relative ranking of the methods is provided by evaluating the deep learning classifiers on various aspects such as uncertainty quality, classifier performance, and calibration. Standard metrics like entropy, cross-entropy, mutual information, and variance, combined with a rank-histogram-based method to identify uncertain predictions by thresholding on these metrics, are used to evaluate uncertainty quality.
The results indicate that Monte Carlo dropout combined with test-time data augmentation outperforms all other methods by identifying more than 95% of the misclassifications and representing uncertainty in the highest number of samples in the test set. It also yields better classifier performance and calibration, in terms of higher accuracy and lower Expected Calibration Error (ECE), respectively. A Python-based uncertainty estimation library for training and real-time uncertainty estimation of deep-learning-based classification models is also developed.
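To illustrate the metrics named in this abstract, the following minimal sketch computes predictive entropy and mutual information from several stochastic forward passes (as produced, for example, by Monte Carlo dropout) and an equal-width-bin Expected Calibration Error. It is not the library developed in the report; the function names, the natural-log base, and the default of 10 bins are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mc_uncertainty(samples):
    """Summarize T stochastic softmax outputs for a single input.

    samples: list of T probability vectors, one per forward pass.
    Returns (mean_probs, predictive_entropy, mutual_information).
    """
    t = len(samples)
    mean_probs = [sum(column) / t for column in zip(*samples)]
    predictive_entropy = entropy(mean_probs)
    expected_entropy = sum(entropy(s) for s in samples) / t
    # Mutual information: entropy of the mean minus the mean of the entropies.
    return mean_probs, predictive_entropy, predictive_entropy - expected_entropy


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (n_bins is an assumed default).

    confidences: top-class probability per sample, in [0, 1].
    correct: 1 if the prediction was right, else 0, per sample.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, hit))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(h for _, h in members) / len(members)
            ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

A prediction could then be flagged as uncertain by thresholding `predictive_entropy` or `mutual_information`; the threshold itself would have to be chosen per dataset.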
04-2012
The work presented in this paper focuses on the comparison of well-known and new fault-diagnosis algorithms in the robot domain. The main challenge for fault diagnosis is to allow the robot to cope effectively not only with internal hardware and software faults but also with external disturbances and errors from dynamic and complex environments. Based on a study of the literature covering fault-diagnosis algorithms, I selected four methods based on both linear and non-linear models, then analysed and implemented them in a mathematical robot model representing a four-wheeled OMNI robot. In experiments I tested the ability of the algorithms to detect and identify abnormal behaviour and to optimize the model parameters for the given training data. The final goal was to point out the strengths of each algorithm and to determine which method would best suit the demands of fault diagnosis for a particular robot.
02-2013
Realism and plausibility of computer-controlled entities in entertainment software have been enhanced by adding both static personalities and dynamic emotions. Here a generic model is introduced which allows the transfer of findings from real-life personality studies to a computational model. This information is used for decision making. The introduction of dynamic event-based emotions enables adaptive behavior patterns. The advantages of this new model have been validated with a four-way crossroad in a traffic simulation. Driving agents using the introduced model enhanced by dynamics were compared to agents based on static personality profiles and simple rule-based behavior. It has been shown that adding an adaptive dynamic factor to agents improves perceivable plausibility and realism. It also supports coping with extreme situations in a fair and understandable way.
05-2014
Digitization of a Pen-and-Paper Role-Playing Game with Transfer of Interactions into the Real World
(2015)
This work combines the master's project and the subsequent master's thesis of Antony Konstantinidis and Nicolas Kopp. Both were written in 2013 and 2014 and together give a comprehensive picture of software and game development and of the design of real-time applications. They also provide background from a wide range of areas, including mixed reality, storytelling, network design, and artificial intelligence.
01-2008
Research on autonomous artificial agents that adapt to and survive in changing, possibly hostile environments has gained momentum in recent years. Many such agents incorporate mechanisms to learn and acquire new knowledge from their environment, a feature that becomes fundamental to enabling the desired adaptation and accounting for the challenges that the environment poses. The issue of how to trigger such learning, however, has not been studied as thoroughly as its significance suggests. The solution explored here is based on the use of surprise (the reaction to unexpected events) as the mechanism that triggers learning. This thesis introduces a computational model of surprise that enables the robotic learner to experience surprise and start the acquisition of knowledge to explain it. A measure of surprise that combines elements from information theory and probability theory is presented. This measure offers a response to surprising situations faced by the robot that is proportional to the degree of unexpectedness of the event. The concepts of short- and long-term memory are investigated as factors that influence the resulting surprise. Short-term memory enables the robot to habituate to new, repeated surprises, and to “forget” about old ones, allowing them to become surprising again. Long-term memory contains knowledge that is known a priori or that has been previously learned by the robot. Such knowledge influences the surprise mechanism by applying a subsumption principle: if the available knowledge is able to explain the surprising event, any trigger of surprise is suppressed. The computational model of robotic surprise has been successfully applied to the domain of a robotic learner, specifically one that learns by experimentation.
A brief introduction to the context of this application is provided, as well as a discussion of related issues such as the relationship of the surprise mechanism to other components of the robot's conceptual architecture, the challenges presented by the specific learning paradigm used, and other components of the motivational structure of the agent.
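The interplay of surprise, habituation, forgetting, and suppression by prior knowledge described in this abstract can be sketched in a few lines. This is not the thesis's model: the class `SurpriseModel`, the use of Shannon surprisal (negative log probability), the exponentially decayed counts standing in for short-term memory, and the parameters `decay` and `eps` are all assumptions chosen for illustration.

```python
import math


class SurpriseModel:
    """Toy surprise measure with habituation and forgetting (illustrative only)."""

    def __init__(self, decay=0.9, eps=0.01, known_events=()):
        self.decay = decay              # assumed forgetting rate of short-term memory
        self.eps = eps                  # assumed smoothing so unseen events are not impossible
        self.counts = {}                # decayed event counts (short-term memory)
        self.known = set(known_events)  # events long-term knowledge can already explain

    def observe(self, event):
        """Return the surprise of `event` in bits, then update short-term memory."""
        total = sum(self.counts.values())
        # Smoothed relative frequency of the event among recent observations.
        p = (self.counts.get(event, 0.0) + self.eps) / (total + 1.0)
        # Subsumption: explainable events trigger no surprise.
        surprise = 0.0 if event in self.known else -math.log2(p)
        # Old observations fade, so a long-unseen event becomes surprising again.
        for key in self.counts:
            self.counts[key] *= self.decay
        self.counts[event] = self.counts.get(event, 0.0) + 1.0
        return surprise
```

Repeatedly calling `observe` with the same event yields decreasing values (habituation), while a long run of other events lets the first one regain its surprise (forgetting).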
01-2017
This paper describes the security mechanisms of several wireless building automation technologies, namely ZigBee, EnOcean, ZWave, KNX, FS20, and Home-Matic. It is shown that none of the technologies provides the level of security that should be expected in building automation systems. One of the conclusions drawn is that software embedded in systems that are built for a lifetime of twenty years or more needs to be updatable.
04-2022
In the field of automatic music generation, one of the greatest challenges is the consistent generation of pieces that are perceived positively by the majority of the audience, since there is no objective method to determine the quality of a musical composition. However, composing principles, which have been refined for millennia, have shaped the core characteristics of today's music. A hybrid music generation system, mlmusic, that incorporates various static, music-theory-based methods as well as data-driven subsystems is implemented to automatically generate pieces considered acceptable by the average listener. Initially, a MIDI dataset consisting of over 100 hand-picked pieces of various styles and complexities is analysed using basic music theory principles, and the abstracted information is fed into explicitly constrained LSTM networks. For chord progressions, each individual network is specifically trained on a given sequence length, while phrases are created by consecutively predicting the notes' offset, pitch, and duration. Using these outputs as a composition's foundation, additional musical elements, along with constrained recurrent rhythmic and tonal patterns, are statically generated. Although no survey regarding the pieces' reception could be carried out, the successful generation of numerous compositions of varying complexities suggests that the integration of these fundamentally distinct approaches might lead to success in other branches.
01-2009
Autonomous mobile robots need internal representations or models of their environment in order to act in a goal-directed manner, plan actions, and navigate effectively. Especially in situations where a robot cannot be provided with a manually constructed model, or in environments that change over time, the robot needs to be able to construct and maintain these models autonomously. To construct a model of an environment, multiple sensor readings have to be acquired and integrated into a single representation. Where the robot has to take these sensor readings is determined by an exploration strategy. The strategy allows the robot to sense all environmental structures and to construct a complete model of its workspace. Given a complete environment model, the task of inspection is to guide the robot to all modeled environmental structures in order to detect changes and to update the model if necessary. Informally stated, exploration and inspection provide the means for the robot to acquire as much information as possible by itself. Both exploration and inspection are highly integrated problems. In addition to the corresponding strategies, they require several abilities of a robotic system and comprise various problems from the field of mobile robotics, including Simultaneous Localization and Mapping (SLAM), motion planning and control, as well as reliable collision avoidance. The goal of this thesis is to develop and implement a complete system and a set of algorithms for robotic exploration and inspection. That is, instead of focussing on specific strategies, robotic exploration and inspection are addressed as the integrated problems that they are. Given the set of algorithms, a real mobile service robot has to be able to autonomously explore its workspace, construct a model of its workspace, and use this model in subsequent tasks, e.g. for navigating in the workspace or inspecting the workspace itself.
The algorithms need to be reliable, robust against environment dynamics and internal failures, and applicable online in real time on a real mobile robot. The resulting system should allow a mobile service robot to navigate effectively and reliably in a domestic environment and avoid all kinds of collisions. In the context of mobile robotics, domestic environments combine the characteristics of being cluttered, dynamic, and populated by humans and domestic animals. SLAM is addressed in terms of incremental range image registration, which provides efficient means to construct internal environment representations online while moving through the environment. Two registration algorithms are presented that can be applied to two-dimensional and three-dimensional data, together with several extensions and an incremental registration procedure. The algorithms are used to construct two different types of environment representations: memory-efficient sparse points and probabilistic reflection maps. For effective navigation in the robot’s workspace, different path planning algorithms are presented for the two types of environment representations. Furthermore, two motion controllers are described that allow a mobile robot to follow planned paths and to approach a target position and orientation. Finally, this thesis presents different exploration and inspection strategies that use the aforementioned algorithms to move the robot to previously unexplored or uninspected terrain and update the internal environment representations accordingly. These strategies are augmented with algorithms for detecting changes in the environment and for segmenting internal models into individual rooms. The resulting system performed very successfully in the 2008 and 2009 RoboCup@Home competitions.
03-2014
The aim of the research project described here was to develop a prototype bicycle simulator for use in road safety education and traffic safety training. The prototype is intended to be mobile and as universally usable as possible across different age groups and applications.