Technical Report / Hochschule Bonn-Rhein-Sieg University of Applied Sciences. Department of Computer Science
Publisher: Dean Prof. Dr. Sascha Alda
Hochschule Bonn-Rhein-Sieg University of Applied Sciences, Department of Computer Science
Sankt Augustin, Germany
ISSN 1869-5272
01-2012
In service robotics, few tasks can be carried out without involving objects; searching, fetching and delivering all depend on them. Service robots are expected to capture object-related information efficiently in real-world scenes, coping for instance with clutter and noise, while remaining flexible and scalable enough to memorize a large set of objects. Besides object perception tasks such as object recognition, where the object’s identity is analyzed, object categorization is an important visual perception cue that assigns unknown object instances to a corresponding category based on, for example, their appearance or shape. We present a pipeline that leads from the detection of object candidates in a domestic scene, through their description, to the final shape categorization of the detected candidates. To detect object-related information in cluttered domestic environments, an object detection method is proposed that copes with multiple planes and objects, as in cluttered scenes with shelves. Furthermore, a surface reconstruction method based on Growing Neural Gas (GNG), combined with a shape distribution-based descriptor, is proposed to reflect the shape characteristics of object candidates. Beneficial properties of the GNG, such as smoothing and denoising effects, support a stable description of the object candidates, which in turn leads to more stable learning of categories. Based on the presented descriptor, a dictionary approach combined with a supervised shape learner is presented to learn prediction models of shape categories.
Experimental results are shown for different shapes related to object shape categories that occur in domestic environments, such as cup, can, box, bottle, bowl, plate and ball. A classification accuracy of about 90% and a sequential execution time of less than two seconds for the categorization of an unknown object are achieved, which supports the proposed system design. Additional results on object tracking and false positive handling are shown, which enhance the robustness of the categorization. An initial approach to incremental shape category learning is also proposed, which learns a new category based on the set of previously learned shape categories.
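The abstract does not say which shape distribution the descriptor uses. As a rough illustration only, the sketch below assumes the common D2 variant (a histogram of distances between randomly chosen point pairs); the function name `d2_shape_distribution`, the bin count, the number of pairs and the normalisation by the longest sampled distance are all assumptions for this example, not details from the report.

```python
import math
import random


def d2_shape_distribution(points, bins=8, pairs=2000, seed=0):
    """Normalised histogram of distances between randomly chosen point pairs.

    Assumes at least two distinct points. Distances are scaled by the longest
    sampled distance so the histogram does not depend on object size.
    """
    rng = random.Random(seed)  # fixed seed keeps the descriptor reproducible
    distances = []
    for _ in range(pairs):
        a, b = rng.sample(points, 2)  # two different points of the surface
        distances.append(math.dist(a, b))
    longest = max(distances)
    histogram = [0] * bins
    for d in distances:
        index = min(int(d / longest * bins), bins - 1)
        histogram[index] += 1
    return [count / pairs for count in histogram]


# Example: the eight corners of a unit cube as a stand-in point set.
corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
descriptor = d2_shape_distribution(corners)
```

In a pipeline like the one described, the input points would presumably be the nodes of the reconstructed GNG surface rather than raw sensor points, which is where the smoothing effect mentioned above would enter.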
02-2012
The ability to detect people has become a crucial subtask, especially in robotic systems intended for use in public or domestic environments. Robots already provide services, for example in real home improvement markets, where they guide people to a desired product. In such a scenario many robot-internal tasks would benefit from knowing the number and positions of people in the vicinity. The navigation, for example, could treat them as dynamic moving objects and predict their next motion directions in order to compute a much safer path. Alternatively, the robot could specifically approach customers and offer its services. This requires detecting a person, or even a group of people, within a reasonable range in front of the robot. Challenges of such a real-world task include changing lighting conditions, a dynamic environment and different body shapes. In this thesis a 3D people detection approach based on point cloud data provided by the Microsoft Kinect is implemented and integrated on a mobile service robot. A top-down/bottom-up segmentation is applied to increase the system's flexibility and to provide the capability to detect people even if they are partially occluded. A feature set is proposed to detect people in various pose configurations and motions using a machine learning technique. The system can detect people up to a distance of 5 meters. The experimental evaluation compared different machine learning techniques and showed that standing people can be detected with a rate of 87.29% and sitting people with 74.94% using a Random Forest classifier. Certain objects caused several false detections. To eliminate those, a verification step is proposed which further evaluates the person's shape in 2D space. The detection component has been implemented as a sequential application (frame rate of 10 Hz) and as a parallel application (frame rate of 16 Hz).
Finally, the component has been embedded into a complete people search task which explores the environment, finds all people and approaches each detected person.
03-2013
The objective of this research project is to develop a user-friendly and cost-effective interactive input device that allows intuitive and efficient manipulation of 3D objects (6 DoF) in virtual reality (VR) visualization environments with flat projection walls. The project plans to develop an extended version of a laser pointer with multiple laser beams arranged in specific patterns. Using stationary cameras that observe projections of these patterns from behind the screens, an algorithm is to be developed for reconstructing the emitter’s absolute position and orientation in space. The laser pointer concept is an intuitive way of interaction that would provide the user with familiar, mobile and efficient navigation through a 3D environment. Navigating in a 3D world requires knowing the absolute position (x, y and z) and orientation (roll, pitch and yaw angles) of the device, a total of 6 degrees of freedom (DoF). Ordinary laser-based pointers, when captured on a flat surface with a video camera system and then processed, provide only x and y coordinates, effectively reducing the available input to 2 DoF. To overcome this problem, an additional set of multiple (invisible) laser pointers should be used in the pointing device. These laser pointers should be arranged so that the projection of their rays forms one fixed dot pattern when intersected with the flat surface of the projection screens. Images of such a pattern will be captured by a real-time camera-based system and then processed using mathematical re-projection algorithms. This would allow the reconstruction of the full absolute 3D pose (6 DoF) of the input device. Additionally, the system should support multi-user or collaborative work, allowing several users to interact with a virtual environment at the same time.
Possibilities to port processing algorithms into embedded processors or FPGAs will be investigated during this project as well.
04-2020
A Comparative Study of Uncertainty Estimation Methods in Deep Learning Based Classification Models
(2020)
Deep learning models produce overconfident predictions even for misclassified data. This work aims to improve the safety guarantees of software-intensive systems that use deep learning based classification models for decision making by performing comparative evaluation of different uncertainty estimation methods to identify possible misclassifications.
In this work, uncertainty estimation methods applicable to deep learning models are reviewed, and those that can be seamlessly integrated into existing deployed deep learning architectures are selected for evaluation. The selected uncertainty estimation methods (deep ensembles, test-time data augmentation, and Monte Carlo dropout with its variants) are empirically evaluated on two standard datasets (CIFAR-10 and CIFAR-100) and two custom classification datasets (an optical inspection dataset and the RoboCup@Work dataset). A relative ranking of the methods is provided by evaluating the deep learning classifiers on various aspects such as uncertainty quality, classifier performance and calibration. Standard metrics such as entropy, cross-entropy, mutual information and variance, combined with a rank histogram based method that identifies uncertain predictions by thresholding on these metrics, are used to evaluate uncertainty quality.
The results indicate that Monte Carlo dropout combined with test-time data augmentation outperforms all other methods by identifying more than 95% of the misclassifications and representing uncertainty in the highest number of samples in the test set. It also yields better classifier performance and calibration in terms of higher accuracy and lower Expected Calibration Error (ECE), respectively. A Python-based uncertainty estimation library for training and real-time uncertainty estimation of deep learning based classification models has also been developed.
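Two of the metrics named above, entropy and mutual information, can be computed from repeated stochastic predictions for a single input. The following is a minimal sketch under stated assumptions: the probability vectors are taken to come from T dropout passes, ensemble members or augmented copies, the function names are hypothetical and unrelated to the library mentioned in the abstract, and the fixed `entropy_threshold` merely stands in for the rank histogram based threshold selection, which is not reproduced here.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one class probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def uncertainty_from_samples(samples):
    """Summarise T stochastic predictions for one input.

    samples: list of T probability vectors of equal length.
    Returns (mean probabilities, predictive entropy, mutual information).
    """
    t = len(samples)
    n_classes = len(samples[0])
    mean_probs = [sum(s[c] for s in samples) / t for c in range(n_classes)]
    total = predictive_entropy(mean_probs)                       # total uncertainty
    expected = sum(predictive_entropy(s) for s in samples) / t   # average per-pass entropy
    mutual_info = total - expected                               # disagreement between passes
    return mean_probs, total, mutual_info


def flag_uncertain(samples, entropy_threshold):
    """Flag a prediction whose predictive entropy exceeds a chosen threshold."""
    _, entropy, _ = uncertainty_from_samples(samples)
    return entropy > entropy_threshold
```

When all passes agree, the mutual information is close to zero even if each pass is individually unsure; when passes contradict one another it grows, which is why the two quantities are usually reported together.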
01-2008
Research on autonomous artificial agents that adapt to and survive in changing, possibly hostile environments has gained momentum in recent years. Many such agents incorporate mechanisms to learn and acquire new knowledge from their environment, a feature that becomes fundamental to enabling the desired adaptation and to accounting for the challenges that the environment poses. The issue of how to trigger such learning, however, has not been studied as thoroughly as its significance suggests. The solution explored here is based on the use of surprise (the reaction to unexpected events) as the mechanism that triggers learning. This thesis introduces a computational model of surprise that enables the robotic learner to experience surprise and start acquiring knowledge to explain it. A measure of surprise that combines elements from information theory and probability theory is presented. This measure yields a response to surprising situations faced by the robot that is proportional to the degree of unexpectedness of the event. The concepts of short- and long-term memory are investigated as factors that influence the resulting surprise. Short-term memory enables the robot to habituate to new, repeated surprises and to “forget” old ones, allowing them to become surprising again. Long-term memory contains knowledge that is known a priori or that has been previously learned by the robot. Such knowledge influences the surprise mechanism by applying a subsumption principle: if the available knowledge is able to explain the surprising event, any trigger of surprise is suppressed. The computational model of robotic surprise has been successfully applied to the domain of a robotic learner, specifically one that learns by experimentation.
A brief introduction to the context of this application is provided, as well as a discussion of related issues such as the relationship of the surprise mechanism to other components of the robot's conceptual architecture, the challenges presented by the specific learning paradigm used, and other components of the motivational structure of the agent.
02-2023
Neuromorphic computing aims to mimic the computational principles of the brain in silico and has motivated research into event-based vision and spiking neural networks (SNNs). Event cameras (ECs) capture local, independent changes in brightness, and offer superior power consumption, response latencies, and dynamic ranges compared to frame-based cameras. SNNs replicate neuronal dynamics observed in biological neurons and propagate information in sparse sequences of “spikes”. Apart from biological fidelity, SNNs have demonstrated potential as an alternative to conventional artificial neural networks (ANNs), such as in reducing energy expenditure and inference time in visual classification. Although potentially beneficial for robotics, the novel event-driven and spike-based paradigms remain scarcely explored outside the domain of aerial robots.
To investigate the utility of brain-inspired sensing and data processing in a robotics application, we developed a neuromorphic approach to real-time, online obstacle avoidance on a manipulator with an onboard camera. Our approach adapts high-level trajectory plans with reactive maneuvers by processing emulated event data in a convolutional SNN, decoding neural activations into avoidance motions, and adjusting plans in a dynamic motion primitive formulation. We conducted simulated and real experiments with a Kinova Gen3 arm performing simple reaching tasks involving static and dynamic obstacles. Our implementation was systematically tuned, validated, and tested in sets of distinct task scenarios, and compared to a non-adaptive baseline through formalized quantitative metrics and qualitative criteria.
The neuromorphic implementation facilitated reliable avoidance of imminent collisions in most scenarios, with 84% and 92% median success rates in simulated and real experiments, where the baseline consistently failed. Adapted trajectories were qualitatively similar to baseline trajectories, indicating low impacts on safety, predictability and smoothness criteria. Among notable properties of the SNN were the correlation of processing time with the magnitude of perceived motions (captured in events) and robustness to different event emulation methods. Preliminary tests with a DAVIS346 EC showed similar performance, validating our experimental event emulation method. These results motivate future efforts to incorporate SNN learning, utilize neuromorphic processors, and target other robot tasks to further explore this approach.
01-2011
This thesis presents the implementation and validation of image processing problems in hardware in order to estimate the gain in performance and precision. It compares an implementation of the addressed problem on a Field Programmable Gate Array (FPGA) with a software implementation for a General Purpose Processor (GPP) architecture. For both solutions, the implementation cost of their development is an important aspect of the validation. The analysis of the flexibility and extensibility that can be achieved by a modular implementation of the FPGA design was another major aspect. This work builds on approaches from previous work, which included the detection of Binary Large OBjects (BLOBs) in static images and continuous video streams [13, 15]. One problem addressed in this work is the tracking of the detected BLOBs in continuous image material. This has been implemented for the FPGA platform and the GPP architecture. Both approaches have been compared with respect to performance and precision. This research project is motivated by the MI6 project of the Computer Vision research group at the Bonn-Rhein-Sieg University of Applied Sciences. The intent of the MI6 project is to track a user in an immersive environment. The proposed solution is to attach a light-emitting device to the user and to track the light dots it creates on the projection surface of the immersive environment. Having the center points of those light dots would allow the estimation of the user’s position and orientation. One major issue that makes computer vision problems computationally expensive is the large amount of data that has to be processed in real time. Therefore, one major target for the implementation was a processing speed of more than 30 frames per second. This would allow the system to give feedback to the user with a response time that is faster than human visual perception.
One problem that comes with the idea of using a light-emitting device to represent the user is the precision error. Depending on the resolution of the tracked projection surface of the immersive environment, a single pixel might have a size measured in cm². A precision error of only a few pixels might therefore lead to an offset of several cm in the estimated user position. In this research work, a detection and tracking system for BLOBs has been developed and validated on a Cyclone II FPGA from Altera. The system supports different input devices for image acquisition and can perform detection and tracking for five to eight BLOBs. A further extension of the design has been evaluated and is possible with some constraints. Additional modules have been designed for compressing the image data based on run-length encoding and for sub-pixel precision of the computed BLOB center points. For comparison with the FPGA approach to BLOB tracking, a similar implementation in software using a multi-threaded approach has been realized. The system can transmit the detection or tracking results on two available communication interfaces, USB and RS232. The analysis of the hardware solution showed a precision for BLOB detection and tracking similar to that of the software approach. One problem is the strong increase in allocated resources when extending the system to process more BLOBs. With one of the target platforms used, the DE2-70 board from Altera, the BLOB detection could be extended to process up to thirty BLOBs. The implementation of the tracking approach in hardware required much more effort than the software solution. In this case, designing high-level problems in hardware is more expensive than implementing them in software. The search and match steps of the tracking approach could be realized more efficiently and reliably in software.
The additional pre-processing modules for sub-pixel precision and run-length-encoding helped to increase the system’s performance and precision.
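The two pre-processing ideas mentioned above can be illustrated in software. This is only a sketch under stated assumptions: the abstract does not describe how the FPGA modules work internally, so the run representation `(start_column, length)` and the use of an intensity-weighted centroid for sub-pixel precision are choices made for this example, and the function names are hypothetical.

```python
def run_length_encode(row):
    """Encode one binary image row as (start_column, length) runs of foreground pixels."""
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                        # a new run begins
        elif not pixel and start is not None:
            runs.append((start, x - start))  # the current run ends
            start = None
    if start is not None:                    # run reaching the right image border
        runs.append((start, len(row) - start))
    return runs


def blob_center(image):
    """Intensity-weighted centroid (x, y) of a grayscale image containing one BLOB.

    Weighting by intensity gives a center between pixel positions, i.e. a
    sub-pixel estimate. Returns None for an all-dark image.
    """
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    if total == 0:
        return None
    return sum_x / total, sum_y / total


runs = run_length_encode([0, 1, 1, 0, 1])                      # [(1, 2), (4, 1)]
center = blob_center([[0, 0, 0], [0, 10, 30], [0, 0, 0]])      # (1.75, 1.0)
```

In the second example the brighter pixel pulls the center to x = 1.75, between the two lit pixels, which a purely binary centroid would place at 1.5.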
03-2015
Agent systems are used in many ways, yet current implementations are primarily able to reproduce only rule-compliant or “scripted” behavior, even when randomized methods are employed. A realistic representation, however, also requires deviations from the rules that occur not at random but depending on context. Within this research project, a realistic road traffic simulator was implemented that uses a system for cognitive agents, defined in detail, to generate these irregular behaviors as well, and thus simulates realistic traffic behavior for use in VR applications. By extending the agents with psychological personality profiles based on the “Five-Factor Model”, the agents show individualized and at the same time consistent behavior patterns. A dynamic emotion model additionally provides situation-dependent adaptation of behavior, e.g. during long waiting times. Since the detailed simulation of cognitive processes, personality influences and emotional states demands considerable computing power, a multi-layered simulation approach was developed that allows the level of detail of the computation and representation of each agent to be changed in stages during the simulation, so that all agents in the system can be simulated consistently. In several evaluation iterations in an existing VR application, the applicant's FIVIS bicycle riding simulator, it was impressively demonstrated that the implemented concepts address the originally formulated research questions convincingly and efficiently.
05-2020
The ability to finely segment different instances of various objects in an environment forms a critical tool in the perception tool-box of any autonomous agent. Traditionally instance segmentation is treated as a multi-label pixel-wise classification problem. This formulation has resulted in networks that are capable of producing high-quality instance masks but are extremely slow for real-world usage, especially on platforms with limited computational capabilities. This thesis investigates an alternate regression-based formulation of instance segmentation to achieve a good trade-off between mask precision and run-time. Particularly the instance masks are parameterized and a CNN is trained to regress to these parameters, analogous to bounding box regression performed by an object detection network.
In this investigation, the instance segmentation masks in the Cityscapes dataset are approximated using irregular octagons, and an existing object detector network (i.e., SqueezeDet) is modified to regress to the parameters of these octagonal approximations. The resulting network is referred to as SqueezeDetOcta. At the image boundaries, object instances are only partially visible. Due to the convolutional nature of most object detection networks, special handling of object instances that adhere to the image boundary is warranted. However, current object detection techniques do not appear to account for this and handle all object instances alike. To this end, this work proposes selectively learning only the partial, untainted parameters of the bounding box approximation of boundary-adhering object instances. Anchor-based object detection networks such as SqueezeDet and YOLOv2 have a discrepancy between the ground-truth encoding/decoding scheme and the coordinate space used for clustering to generate the prior anchor shapes. To resolve this disagreement, this work proposes clustering in a space defined by two coordinate axes representing the natural log transformations of the width and height of the ground-truth bounding boxes.
When both SqueezeDet and SqueezeDetOcta were trained from scratch, SqueezeDetOcta lagged behind the SqueezeDet network by a substantial ≈ 6.19 mAP. Further analysis revealed that the sparsity of the annotated data was the reason for this weak performance of the SqueezeDetOcta network. To mitigate this issue, transfer learning was used to fine-tune the SqueezeDetOcta network starting from the trained weights of the SqueezeDet network. When all the layers of SqueezeDetOcta were fine-tuned, it outperformed the SqueezeDet network paired with logarithmically extracted anchors by ≈ 0.77 mAP. In addition, the forward pass latencies of both SqueezeDet and SqueezeDetOcta are ≈ 19 ms. Taking boundary adhesion into account during training resulted in an improvement of ≈ 2.62 mAP over the baseline SqueezeDet network. A SqueezeDet network paired with logarithmically extracted anchors improved on the performance of the baseline SqueezeDet network by ≈ 1.85 mAP.
In summary, this work demonstrates that if given sufficient fine instance annotated data, an existing object detection network can be modified to predict much finer approximations (i.e., irregular octagons) of the instance annotations, whilst having the same forward pass latency as that of the bounding box predicting network. The results justify the merits of logarithmically extracted anchors to boost the performance of any anchor-based object detection network. The results also showed that the special handling of image boundary adhering object instances produces more performant object detectors.
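The idea of extracting anchors in log space can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes plain k-means with Euclidean distance on (ln w, ln h), a simple deterministic initialisation, and `k` no larger than the number of boxes; the function name `log_space_anchors` is hypothetical.

```python
import math


def log_space_anchors(boxes, k, iterations=20):
    """Cluster (width, height) pairs with k-means in (ln w, ln h) space.

    boxes: list of positive (width, height) pairs; k must not exceed len(boxes).
    Returns k anchor shapes (width, height), sorted by width.
    """
    points = [(math.log(w), math.log(h)) for w, h in boxes]
    # Deterministic initialisation: spread the initial centres over boxes sorted by area.
    ordered = sorted(points, key=lambda p: p[0] + p[1])
    step = len(ordered) / k
    centres = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centres[i][0]) ** 2 + (p[1] - centres[i][1]) ** 2,
            )
            groups[nearest].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    # Transform the centres back from log space to pixel units.
    return sorted((math.exp(x), math.exp(y)) for x, y in centres)


anchors = log_space_anchors(
    [(10, 20), (12, 18), (11, 22), (100, 50), (110, 55), (90, 45)], k=2
)
```

Averaging in log space yields the geometric mean of the widths and heights in each cluster, so an anchor is judged by ratios rather than absolute pixel differences. That matches detectors that decode a box size as the anchor size multiplied by an exponential of the network output, as YOLOv2 does.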
05-2017
The cutting sticks problem is, in its general formulation, an NP-complete problem with potential applications in logistics. Under the assumption that P is not equal to NP (P != NP), no efficient, i.e. polynomial-time, algorithms exist for solving the general problem.
This paper presents approaches with which certain instances of the problem can be computed efficiently. Parameters that are important for the computation are characterized, and their relationships to one another are analyzed.
02-2016
The optimal goal for a logistics warehouse is a high utilization of the transport system. This raises the question of how to select the orders that are processed simultaneously within the warehouse without causing congestion, blockages or overloads. This selection process is also referred to as path packing. This master's thesis examines path packing at the graph-theoretical level and compares various greedy heuristics, an optimal solution based on linear programming, and a combined approach. The approaches are evaluated in terms of measured running times and utilization on test data generated with different randomization schemes.
01-2009
Autonomous mobile robots need internal representations or models of their environment in order to act in a goal-directed manner, plan actions and navigate effectively. Especially in situations where a robot cannot be provided with a manually constructed model, or in environments that change over time, the robot needs the ability to construct and maintain models autonomously. To construct a model of an environment, multiple sensor readings have to be acquired and integrated into a single representation. Where the robot has to take these sensor readings is determined by an exploration strategy. The strategy allows the robot to sense all environmental structures and to construct a complete model of its workspace. Given a complete environment model, the task of inspection is to guide the robot to all modeled environmental structures in order to detect changes and to update the model if necessary. Informally stated, exploration and inspection provide the means for the robot to acquire as much information as possible by itself. Both exploration and inspection are highly integrated problems. In addition to the corresponding strategies, they require several abilities of a robotic system and comprise various problems from the field of mobile robotics, including Simultaneous Localization and Mapping (SLAM), motion planning and control, as well as reliable collision avoidance. The goal of this thesis is to develop and implement a complete system and a set of algorithms for robotic exploration and inspection. That is, instead of focusing on specific strategies, robotic exploration and inspection are addressed as the integrated problems that they are. Given this set of algorithms, a real mobile service robot has to be able to autonomously explore its workspace, construct a model of it and use this model in subsequent tasks, e.g. for navigating in the workspace or inspecting the workspace itself.
The algorithms need to be reliable, robust against environment dynamics and internal failures and applicable online in real-time on a real mobile robot. The resulting system should allow a mobile service robot to navigate effectively and reliably in a domestic environment and avoid all kinds of collisions. In the context of mobile robotics, domestic environments combine the characteristics of being cluttered, dynamic and populated by humans and domestic animals. SLAM is going to be addressed in terms of incremental range image registration which provides efficient means to construct internal environment representations online while moving through the environment. Two registration algorithms are presented that can be applied on two-dimensional and three-dimensional data together with several extensions and an incremental registration procedure. The algorithms are used to construct two different types of environment representations, memory-efficient sparse points and probabilistic reflection maps. For effective navigation in the robot’s workspace, different path planning algorithms are going to be presented for the two types of environment representations. Furthermore, two motion controllers will be described that allow a mobile robot to follow planned paths and to approach a target position and orientation. Finally this thesis will present different exploration and inspection strategies that use the aforementioned algorithms to move the robot to previously unexplored or uninspected terrain and update the internal environment representations accordingly. These strategies are augmented with algorithms for detecting changes in the environment and for segmenting internal models into individual rooms. The resulting system performed very successfully in the 2008 and 2009 RoboCup@Home competitions.
01-2016
The cutting sticks problem is an NP-complete problem with potential applications in logistics. Basic definitions for treating the problem and previous approaches to solving it are reviewed and supplemented by some new results. The focus is in particular on ideas for an algorithmic solution of the problem or of variants of it.
02-2020
Object detectors have improved considerably in recent years through the use of advanced Convolutional Neural Network (CNN) architectures. However, many detector hyper-parameters are generally not tuned and are used with the values set by the detector authors. Blackbox optimization methods have gained attention in recent years because of their ability to optimize the hyper-parameters of various machine learning algorithms and deep learning models. However, these methods have not been explored for improving the hyper-parameters of CNN-based object detectors. In this research work, we propose the use of blackbox optimization methods such as Gaussian Process based Bayesian Optimization (BOGP), Sequential Model-based Algorithm Configuration (SMAC), and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to tune the hyper-parameters of Faster R-CNN and the Single Shot MultiBox Detector (SSD). In Faster R-CNN, tuning the input image size and the prior box anchor scales and ratios using BOGP, SMAC, and CMA-ES increased performance by around 1.5% in terms of Mean Average Precision (mAP) on PASCAL VOC. Tuning the anchor scales of SSD increased the mAP by 3% on the PASCAL VOC and marine debris datasets. On the COCO dataset with SSD, an mAP improvement is observed for medium and large objects, but mAP decreases by 1% for small objects. The experimental results show that blackbox optimization methods increase the mAP of the object detectors by optimizing their hyper-parameters. Moreover, they achieved better results than the hand-tuned configurations in most cases.
03-2019
Currently, a variety of methods exist for creating different types of spatio-temporal world models. Despite the numerous methods for this type of modeling, there exists no methodology for comparing the different approaches or their suitability for a given application e.g. logistics robots. In order to establish a means for comparing and selecting the best-fitting spatio-temporal world modeling technique, a methodology and standard set of criteria must be established. To that end, state-of-the-art methods for this type of modeling will be collected, listed, and described. Existing methods used for evaluation will also be collected where possible.
Using the collected methods, new criteria and techniques will be devised to enable the comparison of various methods in a qualitative manner. Experiments will be proposed to further narrow and ultimately select a spatio-temporal model for a given purpose. An example network of autonomous logistic robots, ROPOD, will serve as a case study used to demonstrate the use of the new criteria. This will also serve to guide the design of future experiments that aim to select a spatio-temporal world modeling technique for a given task. ROPOD was specifically selected as it operates in a real-world, human shared environment. This type of environment is desirable for experiments as it provides a unique combination of common and novel problems that arise when selecting an appropriate spatio-temporal world model. Using the developed criteria, a qualitative analysis will be applied to the selected methods to remove unfit options.
Then, experiments will be run on the remaining methods to provide comparative benchmarks. Finally, the results will be analyzed and recommendations to ROPOD will be made.
02-2019
Multi-robot systems (MRS) are capable of performing a set of tasks by dividing them among the robots in the fleet. One of the challenges of working with multi-robot systems is deciding which robot should execute each task. Multi-robot task allocation (MRTA) algorithms address this problem by explicitly assigning tasks to robots with the goal of maximizing the overall performance of the system. The indoor transportation of goods is a practical application of multi-robot systems in the area of logistics. The ROPOD project works on developing multi-robot system solutions for logistics in hospital facilities. The correct selection of an MRTA algorithm is crucial for enhancing transportation tasks. Several multi-robot task allocation algorithms exist in the literature, but only a few experimental comparative analyses have been performed. This project analyzes and assesses the performance of MRTA algorithms for allocating supply cart transportation tasks to a fleet of robots. We conducted a qualitative analysis of MRTA algorithms, selected the most suitable ones based on the ROPOD requirements, implemented four of them (MURDOCH, SSI, TeSSI, and TeSSIduo), and evaluated the quality of their allocations using a common experimental setup and 10 experiments. Our experiments include off-line and semi on-line allocation of tasks as well as scalability tests, and use virtual robots implemented as Docker containers. This design should facilitate deployment of the system on the physical robots. Our experiments indicate that TeSSI and TeSSIduo best suit the ROPOD requirements. Both use temporal constraints to build task schedules and run in polynomial time, which allows them to scale well with the number of tasks and robots. TeSSI distributes the tasks among more robots in the fleet, while TeSSIduo tends to use a lower percentage of the available robots.
Subsequently, we have integrated TeSSI and TeSSIduo to perform multi-robot task allocation for the ROPOD project.
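As an illustration of the auction-based allocation idea behind SSI (sequential single-item auctions), the following is a minimal Python sketch, not the implementation used in the report. It assumes a hypothetical setup in which robots and tasks are 2D points and a bid is simply the straight-line distance from a robot's last planned position to the task; the function name `ssi_allocate` and the data layout are assumptions made for this example. Temporal constraints, which TeSSI and TeSSIduo add, are not modeled.

```python
from math import dist


def ssi_allocate(robots, tasks):
    """Allocate tasks with a simplified sequential single-item auction.

    robots: dict robot_id -> (x, y) start position (hypothetical layout)
    tasks:  dict task_id  -> (x, y) task location (hypothetical layout)
    Returns dict robot_id -> list of task_ids in allocation order.
    """
    position = dict(robots)              # last planned position of each robot
    schedule = {r: [] for r in robots}
    unallocated = dict(tasks)

    while unallocated:
        # One auction round: every robot bids on every unallocated task.
        # The bid is the travel distance from the robot's last planned
        # position (an assumed, simplified cost). The lowest bid wins;
        # ties are broken by robot id, then task id.
        _, robot, task = min(
            (dist(position[r], loc), r, t)
            for r in position
            for t, loc in unallocated.items()
        )
        schedule[robot].append(task)
        position[robot] = unallocated.pop(task)   # exactly one task per round

    return schedule


if __name__ == "__main__":
    robots = {"r1": (0, 0), "r2": (10, 0)}
    tasks = {"t1": (1, 0), "t2": (9, 0), "t3": (2, 0)}
    print(ssi_allocate(robots, tasks))
```

Because exactly one task is awarded per round and each round compares at most (robots × remaining tasks) bids, this sketch runs in polynomial time in the number of robots and tasks.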
01-2023
This thesis investigates the benefit of rubrics for grading short answers using an active learning mechanism. Automating short answer grading using Natural Language Processing (NLP) is one of the active research areas in the education domain. It could save time for the evaluator, who could then invest more time in preparing lectures. Most research on short answer grading has treated it as a similarity task between reference and student answers. However, grading based on reference answers does not account for partial grades and does not provide feedback. In addition, fully automatic grading tries to replace the evaluator. Using rubrics for short answer grading with active learning eliminates the drawbacks mentioned earlier.
Initially, the proposed approach is evaluated on the Mohler dataset, which is popularly used to benchmark such methods. This phase is used to determine the parameters for the proposed approach. With the selected parameters, the approach exceeds the performance of current state-of-the-art (SOTA) methods, achieving a Pearson correlation of 0.63 and a Root Mean Square Error (RMSE) of 0.85. The proposed approach has surpassed the SOTA methods by almost 4%.
Finally, the benchmarked approach is used to grade short answers based on rubrics instead of reference answers. The proposed approach evaluates short answers from the Autonomous Mobile Robot (AMR) dataset to provide scores and feedback (formative assessment) based on the rubrics. The average performance on this dataset is a Pearson correlation of 0.61 and an RMSE of 0.83. Thus, this research shows that rubrics-based grading achieves formative assessment without compromising performance. In addition, the rubrics have the advantage of generalizability to all answers.
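The two metrics reported above can be computed as in the following minimal Python sketch. The function names and the example scores are made up for illustration and are not taken from the thesis or its datasets.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(predicted, actual):
    """Root mean square error between predicted and actual scores."""
    n = len(predicted)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


if __name__ == "__main__":
    # Hypothetical grades: model predictions vs. evaluator scores.
    predicted = [4.0, 3.5, 2.0, 5.0]
    actual = [4.5, 3.0, 2.5, 5.0]
    print(pearson(predicted, actual), rmse(predicted, actual))
```

A higher Pearson correlation means the predicted grades rank and scale answers more like the evaluator does, while a lower RMSE means the predicted grades are numerically closer to the evaluator's grades.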
01-2014
Design of a declarative language for task-oriented grasping and tool-use with dextrous robotic hands
(2014)
Manipulation tasks that appear simple for a human, such as transportation or tool use, are challenging to replicate in an autonomous service robot. Nevertheless, dextrous manipulation is an important aspect for a robot in many daily tasks. While it is possible to manufacture special-purpose hands for one specific task in industrial settings, a general-purpose service robot in households must have flexible hands which can adapt to many tasks. Intelligently using tools enables the robot to perform tasks more efficiently and even beyond the designed capabilities. In this work a declarative domain-specific language, called Grasp Domain Definition Language (GDDL), is presented that allows the specification of grasp planning problems independently of a specific grasp planner. This design goal resembles the idea of the Planning Domain Definition Language (PDDL). The specification of GDDL requires a detailed analysis of the research in grasping in order to identify best practices in different domains that contribute to a grasp. These domains describe, for instance, physical as well as semantic properties of objects and hands. Grasping always has a purpose, which is captured in the task domain definition. It enables the robot to grasp an object in a task-dependent manner. Suitable representations in these domains have to be identified and formalized, for which a domain-driven software engineering approach is applied. This kind of modeling allows the specification of constraints which guide the composition of domain entity specifications. The domain-driven approach fosters reuse of domain concepts, while the constraints enable the validation of models already at design time. A proof-of-concept implementation of GDDL in the GraspIt! grasp planner is developed. Preliminary results of this thesis have been published and presented at the IEEE International Conference on Robotics and Automation (ICRA).
03-2022
As cameras are ubiquitous in autonomous systems, object detection is a crucial task. Object detectors are widely used in applications such as autonomous driving, healthcare, and robotics. Given an image, an object detector outputs both the bounding box coordinates and the classification probabilities for each object detected. State-of-the-art detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use, particularly in safety-critical applications. It is therefore crucial to explain the reason behind each detector decision in order to gain user trust, enhance detector performance, and analyze detector failures.
Previous work fails to explain as well as evaluate both bounding box and classification decisions individually for various detectors. Moreover, no tools explain each detector decision, evaluate the explanations, and also identify the reasons for detector failures. This restricts the flexibility to analyze detectors. The main contribution presented here is an open-source Detector Explanation Toolkit (DExT). It is used to explain the detector decisions, evaluate the explanations, and analyze detector errors. The detector decisions are explained visually by highlighting the image pixels that most influence a particular decision. The toolkit implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. To the author’s knowledge, this is the first work to conduct extensive qualitative and novel quantitative evaluations of different explanation methods across various detectors. The qualitative evaluation incorporates a visual analysis of the explanations carried out by the author as well as a human-centric evaluation. The human-centric evaluation includes a user study to understand user trust in the explanations generated across various explanation methods for different detectors. Four multi-object visualization methods are provided to merge the explanations of multiple objects detected in an image as well as the corresponding detector outputs in a single image. Finally, DExT implements the procedure to analyze detector failures using the formulated approach.
The visual analysis illustrates that the ability to explain a model is more dependent on the model itself than the actual ability of the explanation method. In addition, the explanations are affected by the object explained, the decision explained, detector architecture, training data labels, and model parameters. The results of the quantitative evaluation show that the Single Shot MultiBox Detector (SSD) is more faithfully explained compared to other detectors regardless of the explanation methods. In addition, a single explanation method cannot generate more faithful explanations than other methods for both the bounding box and the classification decision across different detectors. Both the quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among selected methods across all detectors. Finally, a convex polygon-based multi-object visualization method provides more human-understandable visualization than other methods.
The author expects that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.
05-2014
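Digitalisierung eines Pen-&-Paper-Rollenspiels mit Übertragung von Interaktionen in die reale Welt [Digitization of a pen-and-paper role-playing game with transfer of interactions into the real world]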
(2015)
This work combines the master's project and the subsequent master's thesis of Antony Konstantinidis and Nicolas Kopp. These works were written in 2013 and 2014. Together they give a comprehensive picture of software and game development and of the design of real-time applications, and they provide background from a wide range of areas in mixed reality, storytelling, network design, and artificial intelligence.
02-2013
Realism and plausibility of computer-controlled entities in entertainment software have been enhanced by adding both static personalities and dynamic emotions. Here a generic model is introduced which allows the transfer of findings from real-life personality studies to a computational model. This information is used for decision making. The introduction of dynamic event-based emotions enables adaptive behavior patterns. The advantages of this new model have been validated with a four-way crossroad in a traffic simulation. Driving agents using the introduced model enhanced by dynamics were compared to agents based on static personality profiles and simple rule-based behavior. It has been shown that adding an adaptive dynamic factor to agents improves perceivable plausibility and realism. It also supports coping with extreme situations in a fair and understandable way.
01-2022
Effective Neighborhood Feature Exploitation in Graph CNNs for Point Cloud Object-Part Segmentation
(2022)
Part segmentation is the task of semantic segmentation applied to objects and has a wide range of applications from robotic manipulation to medical imaging. This work deals with the problem of part segmentation on raw, unordered point clouds of 3D objects. While pioneering works on deep learning for point clouds typically do not exploit the local geometric structure around individual points, the subsequent methods proposed to extract features by exploiting local geometry have not yielded significant improvements either. In order to investigate further, a graph convolutional network (GCN) is used in this work in an attempt to increase the effectiveness of such neighborhood feature exploitation approaches. Most of the previous works also focus only on segmenting complete point cloud data. Such approaches are impractical in real-world scenarios, where complete point clouds are rarely available; this work therefore proposes approaches to deal with partial point cloud segmentation.
In an attempt to better capture neighborhood features, this work proposes a novel method to learn regional part descriptors which guide and refine the segmentation predictions. The proposed approach helps the network achieve state-of-the-art performance of 86.4% mIoU on the ShapeNetPart dataset among methods which do not use any preprocessing techniques or voting strategies. In order to better deal with partial point clouds, this work also proposes new strategies to train and test on partial data. While achieving significant improvements compared to the baseline performance, the problem of partial point cloud segmentation is also viewed through an alternate lens of semantic shape completion.
Semantic shape completion networks not only help deal with partial point cloud segmentation but also enrich the information captured by the system by predicting complete point clouds with corresponding semantic labels for each point. To this end, a new network architecture for semantic shape completion is also proposed, based on the point completion network (PCN), which takes advantage of a graph convolution based hierarchical decoder for completion as well as segmentation. In addition to predicting complete point clouds, results indicate that the network comes within 5% of the mIoU performance of dedicated segmentation networks for partial point cloud segmentation.
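For readers unfamiliar with the mIoU metric quoted above, the following minimal Python sketch computes the mean intersection over union for the per-point part labels of a single shape. The function name `shape_miou` and the convention of scoring a part that is absent from both prediction and ground truth as 1.0 are assumptions for this illustration; the exact evaluation protocol used in the thesis may differ.

```python
def shape_miou(predicted, truth, part_labels):
    """Mean IoU over the given part labels for one point cloud.

    predicted, truth: equally long sequences of per-point part labels
    part_labels: the part labels that belong to this shape's category
    """
    ious = []
    for part in part_labels:
        intersection = sum(
            1 for p, t in zip(predicted, truth) if p == part and t == part
        )
        union = sum(
            1 for p, t in zip(predicted, truth) if p == part or t == part
        )
        # Assumed convention: a part missing from both prediction and
        # ground truth counts as a perfect match.
        ious.append(1.0 if union == 0 else intersection / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical four-point shape with two parts (labels 0 and 1).
    print(shape_miou([0, 0, 1, 1], [0, 1, 1, 1], part_labels=[0, 1]))
```

A dataset-level score would then typically be obtained by averaging such per-shape values over all test shapes.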
06-2017
In its general formulation, the cutting sticks problem is an NP-complete problem with potential applications in logistics. Under the assumption that P is not equal to NP (P != NP), there are no efficient, i.e. polynomial-time, algorithms for solving the general problem.
This paper presents efficient solutions for a number of instances.
02-2018
In the transmission and storage of data, a key question is to what extent the data can be compressed without losing its information content.
A measure of the information content of data is therefore of fundamental importance. About seventy years ago, C. E. Shannon introduced such a measure and thereby founded the teaching and research field of information theory, which has since contributed substantially to the design and realization of information and communication technologies. About twenty years later, A. N. Kolmogorov introduced a different measure of the information content of data. While Shannon's information theory is part of the curriculum of degree programs in mathematics, computer science, and electrical engineering, Kolmogorov's algorithmic information theory is far less well known and tends to be the subject of specialized courses.
In recent years, however, interest in this theory has been growing, especially since the relevant literature reports successful practical applications of it. The present work gives an introduction to the fundamental ideas of this theory and describes its possible applications to selected problems in theoretical computer science.
The text can be used as lecture notes for introductory courses on algorithmic information theory, and as reading material for becoming familiar with the topic as a starting point for research and development work.
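The difference between the two measures of information content can be illustrated with a small Python sketch. It is not taken from the report. It computes the empirical Shannon entropy of a byte string from its symbol frequencies and uses the length of a zlib-compressed encoding as a rough stand-in for a description length. Kolmogorov complexity itself is not computable, so the compressed length is only an assumed, compressor-dependent upper-bound proxy; the function names are hypothetical.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per symbol, from byte frequencies."""
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data.

    Used here only as a computable proxy for description length;
    it is not the Kolmogorov complexity.
    """
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    regular = b"ab" * 500          # highly regular 1000-byte string
    random_bytes = os.urandom(1000)  # 1000 bytes with no expected regularity
    print(shannon_entropy(regular), compressed_length(regular))
    print(shannon_entropy(random_bytes), compressed_length(random_bytes))
```

The regular string has an empirical entropy of exactly 1 bit per symbol because both symbols are equally frequent, yet it compresses to a small fraction of its length since it has a short description (repeat "ab" 500 times). A frequency-based measure does not see this regularity, whereas a description-length view does.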
03-2014
The goal of the research project described here was the development of a prototype bicycle riding simulator for use in traffic education and road safety training. The prototype is intended to be usable as universally as possible across different age groups and applications, and to be mobile.