pub H-BRS | Fachbereich Informatik

Computer Assisted Short Answer Grading with Rubrics using Active Learning (2023)

This thesis investigates the benefit of rubrics for grading short answers using an active learning mechanism. Automating short answer grading using Natural Language Processing (NLP) is one of the active research areas in the education domain. This could save time for the evaluator and invest more time in preparing for the lecture. Most of the research on short answer grading was treated as a similarity task between reference and student answers. However, grading based on reference answers does not account for partial grades and does not provide feedback. Also, the grading is automatic that tries to replace the evaluator. Hence, using rubrics for short answer grading with active learning eliminates the drawbacks mentioned earlier. Initially, the proposed approach is evaluated on the Mohler dataset, popularly used to benchmark the methodology. This phase is used to determine the parameters for the proposed approach. Therefore, the approach with the selected parameter exceeds the performance of current State-Of-The-Art (SOTA) methods resulting in the Pearson correlation value of 0.63 and Root Mean Square Error (RMSE) of 0.85. The proposed approach has surpassed the SOTA methods by almost 4%. Finally, the benchmarked approach is used to grade the short answer based on rubrics instead of reference answers. The proposed approach evaluates short answers from Autonomous Mobile Robot (AMR) dataset to provide scores and feedback (formative assessment) based on the rubrics. The average performance of the dataset results in the Pearson correlation value of 0.61 and RMSE of 0.83. Thus, this research has proven that rubrics-based grading achieves formative assessment without compromising performance. In addition, the rubrics have the advantage of generalizability to all answers.

Evaluation of Drift Detection Techniques for Automated Machine Learning Pipelines (2023)

Abdelwahab, Hammam

Machine learning-based solutions are frequently adapted in several applications that require big data in operations. The performance of a model that is deployed into operations is subject to degradation due to unanticipated changes in the flow of input data. Hence, monitoring data drift becomes essential to maintain the model’s desired performance. Based on the conducted review of the literature on drift detection, statistical hypothesis testing enables to investigate whether incoming data is drifting from training data. Because Maximum Mean Discrepancy (MMD) and Kolmogorov-Smirnov (KS) have shown to be reliable distance measures between multivariate distributions in the literature review, both were selected from several existing techniques for experimentation. For the scope of this work, the image classification use case was experimented with using the Stream-51 dataset. Based on the results from different drift experiments, both MMD and KS showed high Area Under Curve values. However, KS exhibited faster performance than MMD with fewer false positives. Furthermore, the results showed that using the pre-trained ResNet-18 for feature extraction maintained the high performance of the experimented drift detectors. Furthermore, the results showed that the performance of the drift detectors highly depends on the sample sizes of the reference (training) data and the test data that flow into the pipeline’s monitor. Finally, the results also showed that if the test data is a mixture of drifting and non-drifting data, the performance of the drift detectors does not depend on how the drifting data are scattered with the non-drifting ones, but rather their amount in the test set

Multi-modal Emotion Categorization in Oral History Interviews (2023)

Viswanath, Anargh

This thesis proposes a multi-label classification approach using the Multimodal Transformer (MulT) [80] to perform multi-modal emotion categorization on a dataset of oral histories archived at the Haus der Geschichte (HdG). Prior uni-modal emotion classification experiments conducted on the novel HdG dataset provided less than satisfactory results. They uncovered issues such as class imbalance, ambiguities in emotion perception between annotators, and lack of representative training data to perform transfer learning [28]. Hence, the objectives of this thesis were to achieve better results by performing a multi-modal fusion and resolving the problems arising from class imbalance and annotator-induced bias in emotion perception. A further objective was to assess the quality of the novel HdG dataset and benchmark the results using SOTA techniques. Through a literature survey on the challenges, models, and datasets related to multi-modal emotion recognition, we created a methodology utilizing the MulT along with a multi-label classification approach. This approach produced a considerable improvement in the overall emotion recognition by obtaining an average AUC of 0.74 and Balanced-accuracy of 0.70 on the HdG dataset, which is comparable to state-of-the-art (SOTA) results on other datasets. In this manner, we were also able to benchmark the novel HdG dataset as well as introduce a novel multi-annotator learning approach to understand each annotator’s relative strengths and weaknesses for emotion perception. Our evaluation results highlight the potential benefits of the novel multi-annotator learning approach in improving overall performance by resolving the problems arising from annotator-induced bias and variation in the perception of emotions. Complementing these results, we performed a further qualitative analysis of the HdG annotations with a psychologist to study the ambiguities found in the annotations. We conclude that the ambiguities in annotations may have resulted from a combination of several socio-psychological factors and systemic issues associated with the process of creating these annotations. As these problems are also present in most multi-modal emotion recognition datasets, we conclude that the domain could benefit from a set of annotation guidelines to create standardized datasets.

Continual Learning in Object Detection (2023)

Tran Tien, Huy

Object detection concerns the classification and localization of objects in an image. To cope with changes in the environment, such as when new classes are added or a new domain is encountered, the detector needs to update itself with the new information while retaining knowledge learned in the past. Previous works have shown that training the detector solely on new data would produce a severe "forgetting" effect, in which the performance on past tasks deteriorates through each new learning phase. However, in many cases, storing and accessing past data is not possible due to privacy concerns or storage constraints. This project aims to investigate promising continual learning strategies for object detection without storing and accessing past training images and labels. We show that by utilizing the pseudo-background trick to deal with missing labels, and knowledge distillation to deal with missing data, the forgetting effect can be significantly reduced in both class-incremental and domain-incremental scenarios. Furthermore, an integration of a small latent replay buffer can result in a positive backward transfer, indicating the enhancement of past knowledge when new knowledge is learned.

DExT: Detector Explanation Toolkit for Explaining Multiple Detections Using Saliency Methods (2022)

Padmanabhan, Deepan Chakravarthi

As cameras are ubiquitous in autonomous systems, object detection is a crucial task. Object detectors are widely used in applications such as autonomous driving, healthcare, and robotics. Given an image, an object detector outputs both the bounding box coordinates as well as classification probabilities for each object detected. The state-of-the-art detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use in safety-critical applications in particular. It is therefore crucial to explain the reason behind each detector decision in order to gain user trust, enhance detector performance, and analyze their failure. Previous work fails to explain as well as evaluate both bounding box and classification decisions individually for various detectors. Moreover, no tools explain each detector decision, evaluate the explanations, and also identify the reasons for detector failures. This restricts the flexibility to analyze detectors. The main contribution presented here is an open-source Detector Explanation Toolkit (DExT). It is used to explain the detector decisions, evaluate the explanations, and analyze detector errors. The detector decisions are explained visually by highlighting the image pixels that most influence a particular decision. The toolkit implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. To the author’s knowledge, this is the first work to conduct extensive qualitative and novel quantitative evaluations of different explanation methods across various detectors. The qualitative evaluation incorporates a visual analysis of the explanations carried out by the author as well as a human-centric evaluation. The human-centric evaluation includes a user study to understand user trust in the explanations generated across various explanation methods for different detectors. Four multi-object visualization methods are provided to merge the explanations of multiple objects detected in an image as well as the corresponding detector outputs in a single image. Finally, DExT implements the procedure to analyze detector failures using the formulated approach. The visual analysis illustrates that the ability to explain a model is more dependent on the model itself than the actual ability of the explanation method. In addition, the explanations are affected by the object explained, the decision explained, detector architecture, training data labels, and model parameters. The results of the quantitative evaluation show that the Single Shot MultiBox Detector (SSD) is more faithfully explained compared to other detectors regardless of the explanation methods. In addition, a single explanation method cannot generate more faithful explanations than other methods for both the bounding box and the classification decision across different detectors. Both the quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among selected methods across all detectors. Finally, a convex polygon-based multi-object visualization method provides more human-understandable visualization than other methods. The author expects that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.

Object Detection in Dense Volume Data (2022)

Sharma, Ramit

This project focuses on object detection in dense volume data. There are several types of dense volume data, namely Computed Tomography (CT) scan, Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI). This work focuses on CT scans. CT scans are not limited to the medical domain; they are also used in industries. CT scans are used in airport baggage screening, assembly lines, and the object detection systems in these places should be able to detect objects fast. One of the ways to address the issue of computational complexity and make the object detection systems fast is to use low-resolution images. Low-resolution CT scanning is fast. The entire process of scanning and detection can be made faster by using low-resolution images. Even in the medical domain, to reduce the rad iation dose, the exposure time of the patient should be reduced. The exposure time of patients could be reduced by allowing low-resolution CT scans. Hence it is essential to find out which object detection model has better accuracy as well as speed at low-resolution CT scans. However, the existing approaches did not provide details about how the model would perform when the resolution of CT scans is varied. Hence in this project, the goal is to analyze the impact of varying resolution of CT scans on both the speed and accuracy of the model. Three object detection models, namely RetinaNet, YOLOv3, and YOLOv5, were trained at various resolutions. Among the three models, it was found that YOLOv5 has the best mAP and f1 score at multiple resolutions on the DeepLesion dataset. RetinaNet model h as the least inference time on the DeepLesion dataset. From the experiments, it could be asserted that sacrificing mean average precision (mAP) to improve inference time by reducing resolution is feasible.

PillarFPN: A Feature Pyramid Extension to Pointpillars for 3D Object Detection (2022)

Munshi, Harsh

In the field of autonomous robotics, sensors have played a major role in defining the scope of technology and to a great extent, limitations of it as well. This cycle of constant updates and hence technological advancement has made given birth to some serious industries which were once inconceivable. Industries like autonomous driving which has a serious impact on safety and security of people, also has an equally harsh implication on the dynamics and economics of the market. With sensors like LiDAR and RADAR delivering 3D measurements as point clouds, there is a necessity to process the raw measurements directly and many research groups are working on the same. A sizable research has gone in solving the task of object detection on 2D images. In this thesis we aim to develop a LiDAR based 3D object detection scheme. We combine the ideas of PointPillars and feature pyramid networks from 2D vision to propose Pillar-FPN. The proposed method directly takes 3D point clouds as input and outputs a 3D bounding box. Our pipeline consists of multiple variations of proposed Pillar-FPN at the feature fusion level that are described in the results section. We have trained our model on the KITTI train dataset and evaluated it on KITTI validation dataset.

Generative Models for the Analysis of Dynamical Systems with Applications (2022)

Preciado Grijalva, Alan

High-dimensional and multi-variate data from dynamical systems such as turbulent flows and wind turbines can be analyzed with deep learning due to its capacity to learn representations in lower-dimensional manifolds. Two challenges of interest arise from data generated from these systems, namely, how to anticipate wind turbine failures and how to better understand air flow through car ventilation systems. There are deep neural network architectures that can project data into a lower-dimensional space with the goal of identifying and understanding patterns that are not distinguishable in the original dimensional space. Learning data representations in lower dimensions via non-linear mappings allows one to perform data compression, data clustering (for anomaly detection), data reconstruction and synthetic data generation. In this work, we explore the potential that variational autoencoders (VAE) have to learn low-dimensional data representations in order to tackle the problems posed by the two dynamical systems mentioned above. A VAE is a neural network architecture that combines the mechanisms of the standard autoencoder and variational bayes. The goal here is to train a neural network to minimize a loss function defined by a reconstruction term together with a variational term defined as a Kulback-Leibler (KL) divergence. The report discusses the results obtained for the two different data domains: wind turbine time series and turbulence data from computational fluid dynamics (CFD) simulations. We report on the reconstruction, clustering and unsupervised anomaly detection of wind turbine multi-variate time series data using a variant of a VAE called Variational Recurrent Autoencoder (VRAE). We trained a VRAE to cluster normal and abnormal wind turbine series (two class problem) as well as normal and multiple abnormal series (multi-class problem). We found that the model is capable of distinguishing between normal and abnormal cases by reducing the dimensionality of the input data and projecting it to two dimensions using techniques such as Principal Component Analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). A set of anomaly scoring methods is applied on top of these latent vectors in order to compute unsupervised clustering. We have achieved an accuracy of up to 96% with the KM eans + + algorithm. We also report the data reconstruction and generation results of two dimensional turbulence slices corresponding to CFD simulation of a HVAC air duct. For this, we have trained a Convolutional Variational Autoencoder (CVAE). We have found that the model is capable of reconstructing laminar flows up to a certain degree of resolution as well generating synthetic turbulence data from the learned latent distribution.

Integration eines Chatbots in ein Stadtportal - lohnt sich das? (2020)

Bertling, Alexander

Intelligente Dialogsysteme – Chatbots – werden immer häufiger als virtuelle Ansprechpartner von Unternehmen und Institutionen eingesetzt. Auf Basis einer Wissensdatenbank können Chatbots einen größeren Anteil von Kundenanfragen automatisiert beantworten. Analog ist der Einsatz von Chatbots als digitaler Ansprechpartner öffentlicher Verwaltungen denkbar. Sie könnten Bürgern helfen, sich innerhalb der behördlichen Strukturen zu orientieren und Verwaltungsleistungen effizient und effektiv in Anspruch zu nehmen. Diese Arbeit überprüft den Einsatz eines Chatbots in der öffentlichen Verwaltung hinsichtlich der entstehenden Kosten und des erwartbaren Nutzens. Auf Basis einer umfangreichen Literaturauswertung und der prototypischen Realisierung eines Chatbots für ein Stadtportal werden dabei Herausforderungen dieser Anwendungsdomäne herausgearbeitet, konkrete Funktionsweise und Implementierungsstrategien von Chatbots erörtert und einige Erfolgsfaktoren formuliert, die den Kern einer Handlungsempfehlung für Entscheidungsträger öffentlicher Verwaltungen bilden.

Evaluation von Ansätzen zur skalierten Ausführung von workflowbasierten Anwendungen am Beispiel einer Audio-Mining Anwendung (2019)

Luhmer, David

Die letzten zwei Jahrzehnte wurden durch das exponentielle Wachstum der zur Verfügung stehenden Daten geprägt. Täglich produzieren Menschen und Maschinen mehr und mehr Daten, die oftmals in verteilten Datenspeichern abgelegt werden. Anwendungsgebiete lassen sich beispielsweise in der Physik und Astronomie finden, wo immense Datenmengen von Teilchenbeschleunigern oder Satelliten erzeugt werden, die gespeichert und verarbeitet werden müssen. Aus diesen Datenmengen können weder vom Menschen direkt noch durch traditionelle Analysemethoden neue Erkenntnisse gewonnen werden. Zur Verarbeitung dieser Datenmassen sind parallele sowie verteilte Datenanalyseverfahren notwendig. [MTT18,NEKH+18]

Automatic Sentence Generation Using Generative Adversarial Neural Networks (2019)

Ovchinnikova, Evgeniya

This work aims to create a natural language generation (NLG) base for further development of systems for automatic examination questions generation and automatic summarization in Hochschule Bonn-Rhein-Sieg and Fraunhofer IAIS, respectively. Nowadays both tasks are very relevant. The first can significantly simplify the university teachers' work and the second to be of assistance for a faster retrieval of knowledge from an excessively large amount of information that people often work with. We focus on the search for an efficient and robust approach to the controlled NLG problem. Therefore, though the initial idea of the project was the usage of the generative adversarial neural networks (GANs), we switched our attention to more robust and easily-controllable autoencoders. Thus, in this work we implement an autoencoder for unsupervised discovery of latent space representations of text, and show the ability of the system to generate new sentences based on this latent space. Apart from that, we apply Gaussian mixture techniques in order to obtain meaningful text clusters and thereby try to create a tool that would allow us to generate sentences relevant to the semantics of the Gaussian clusters, e.g. positive or negative reviews or examination questions on certain topic. The developed system is tested on several datasets and compared to GANs' performance.

Generalization of SELU to CNN (2019)

Ha, Bach

Neural network based object detectors are able to automatize many difficult, tedious tasks. However, they are usually slow and/or require powerful hardware. One main reason is called Batch Normalization (BN) [1], which is an important method for building these detectors. Recent studies present a potential replacement called Self-normalizing Neural Network (SNN) [2], which at its core is a special activation function named Scaled Exponential Linear Unit (SELU). This replacement seems to have most of BNs benefits while requiring less computational power. Nonetheless, it is uncertain that SELU and neural network based detectors are compatible with one another. An evaluation of SELU incorporated networks would help clarify that uncertainty. Such evaluation is performed through series of tests on different neural networks. After the evaluation, it is concluded that, while indeed faster, SELU is still not as good as BN for building complex object detector networks.

Einführung eines (Halb-)automatisierten Datenerfassungssystems von Blutproben in die Forschungsumgebung des Deutschen Zentrums für Luft- und Raumfahrt e. V. (DLR) (2019)

Zok, Thomas Tobias

Das Deutsche Zentrum für Luft- und Raumfahrt (DLR) führt viele Forschungen und Studien im Bereich der Luft- und Raumfahrt durch. Dabei spielen die Studien für die Gesundheit und Medizin auch eine sehr wichtige Rolle bei der DLR. Zu diesem Zweck führt die DLR die Artificial Gravity bed rest study (AGBRESA) im Auftrag der European Space Agency (esa) und in Kooperation der NASA durch. In dieser Studie werden die negativen Auswirkungen der Schwerelosigkeit auf dem Menschen im Weltall simuliert. Dabei werden Experimente durchgeführt, um die negative Auswirkungen entgegenzuwirken. Die Ergebnisse der Experimente werden in der DLR digital, aber auch auf Papier dokumentiert. In diesem Master-Projekt habe ich nun die Aufgabe, die Papierprotokolle für den Bereich der Blutabnahme und der Labordokumentation in eine digitale Form zu ersetzen.

Estimation of Prediction Uncertainty for Semantic Scene Labeling Using Bayesian Approximation (2018)

Ajmera, Anand

With the advancement in technology, autonomous and assisted driving are close to being reality. A key component of such systems is the understanding of the surrounding environment. This understanding about the environment can be attained by performing semantic labeling of the driving scenes. Existing deep learning based models have been developed over the years that outperform classical image processing algorithms for the task of semantic labeling. However, the existing models only produce semantic predictions and do not provide a measure of uncertainty about the predictions. Hence, this work focuses on developing a deep learning based semantic labeling model that can produce semantic predictions and their corresponding uncertainties. Autonomous driving needs a real-time operating model, however the Full Resolution Residual Network (FRRN) [4] architecture, which is found as the best performing architecture during literature search, is not able to satisfy this condition. Hence, a small network, similar to FRRN, has been developed and used in this work. Based on the work of [13], the developed network is then extended by adding dropout layers and the dropouts are used during testing to perform approximate Bayesian inference. The existing works on uncertainties, do not have quantitative metrics to evaluate the quality of uncertainties estimated by a model. Hence, the area under curve (AUC) of the receiver operating characteristic (ROC) curves is proposed and used as an evaluation metric in this work. Further, a comparative analysis about the influence of dropout layer position, drop probability and the number of samples, on the quality of uncertainty estimation is performed. Finally, based on the insights gained from the analysis, a model with optimal configuration of dropout is developed. It is then evaluated on the Cityscape dataset and shown to be outperforming the baseline model with an AUC-ROC of about 90%, while the latter having AUC-ROC of about 80%.

Segmentierung von multimodalen Laserscanner-Daten im Außenbereich und Terrain-Klassifikation (2017)

Doll, Thomas

In der vorliegenden Arbeit wird ein Verfahren zur Segmentierung von Außenszenen und Terrain-Klassifkation entwickelt. Dazu werden 360 Grad-Laserscanner-Aufnahmen von Straßen, Gebäudefassaden und Waldwegen aufgenommen. Von diesen Aufnahmen werden verschiedene visuelle Repräsentationen in 2D erstellt. Dazu werden die Distanzinformationen und Winkelübergänge der Polarkoordinaten, die Remissionswerte und der Normalenvektor eingesetzt. Die Berechnung des Normalenvektors wird über ein modernes Verfahren mit einerniedrigen Laufzeit durchgeführt. Anschließend werden Oberflächeneigenschaften innerhalb einer Punktwolke analysiert und vier Klassen unterschieden: Untergrund, Vegetation, Hindernis und Himmel. Die Segmentierung und Klassifkation geschieht in einem Schritt. Dazuwird die Varianz auf den N ormalen über eine Filtermaske berechnet und ein Deskriptor erstellt. Der Deskriptor beinhaltet die Normalenvektoren und die Normalenvarianz fürdie x-, y- und z-Achse. Die Ergebnisse werden als Überblendung auf dem Remissionsbilddargestellt. Die Auswertung wird über eigens erstellte Ground-Truth-Daten vorgenommen. Dazu wird das Remissionsbild genutzt und der Ground-Truth mit verschiedenen Farben eingezeichnet. Die Klassifkationsergebnisse sind in Precision-Recall-Diagrammen dargestellt.

Entwicklung und Integration einer Lucene-basierten Suchfunktion zur interaktiven Selektion von Echtzeitdaten im maritimen Lagebildsystem iLEXX (2017)

Mechlaoui, Ammer

Das Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme (IAIS) betreibt seit mehreren Jahren auf dem Campus Schloss Birlinghoven in Sankt Augustin angewandte Forschung in den Bereichen Multisensordatenanalyse und Datenvisualisierung. Im Rahmen einer mehrjährigen Kooperation zwischen dem Fraunhofer-IAIS und der Wehrtechnischen Dienststelle 71 (WTD71) wurde das Seeraumüberwachungssystem iLEXX entwickelt. Es soll den Benutzer auf auffällige Situationen hinweisen und ihm kontextabhängig alle notwendigen Handlungsoptionen zur weiteren Aufklärung der Situation oder der Abwehr einer Bedrohung aufzeigen. Das iLEXX-System verarbeitet eine Vielzahl von Sensordaten und Ereignissen. Abhängig vom Szenario kommen hier mehrere tausend Updates pro Sekunde zusammen, die in Echtzeit vorverarbeitet und visualisiert werden müssen.

Workflow for Automatic i-Vector based Speaker Identification on German Parliament Speakers (2017)

Åkermark, Gunnar

In order to help journalists investigate inside large audiovisual archives, as maintained by news broadcast agencies, the multimedia data must be indexed by text-based search engies. By automatically creating a transcript through automatic speech recognition (ASR), the spoken word becomes accessible to text search, and queries for keywords are made possible. But stil, important contextual information like the identity of the speaker is not captured. Especially when gathering original footage in the political domain, the identity of the speaker can be the most important query constraint, although this name may not be prominent in the words spoken. It is thus desireable to have this information provided explicitely to the search engine. To provide this information, the archive must be an alyzed by automatic Speaker Identification (SID). While this research topic has seen substantial gains in accuracy and robustness over last years, it has not yet established itself as a helpful, large-scale tool outside the research community. This thesis sets out to establish a workflow to provide automatic speaker identification. Its application is to help journalists searching on speeches given in the German parliament (Bundestag). This is a contribution to the News-Stream 3.0 project, a BMBF funded research project that addresses accessibility of various data sources for journalists.

Anwendung und Untersuchung von Path-Packing in der Lagerlogistik (2016)

Büchel, Alexander

Das Optimalziel für ein Logistiklager ist eine hohe Auslastung des Transportsystems. Es stellt sich somit die Frage nach der Auswahl der Aufträge, die gleichzeitig innerhalb des Lagers abgearbeitet werden, ohne Staus, Blockaden oder Überlastungen entstehen zu lassen. Dieser Auswahlprozess wird auch als Path-Packing bezeichnet. Diese Masterthesis untersucht das Path-Packing auf graphentheoretischer Ebene und stellt verschiedene Greedy-Heuristiken, eine Optimallösung auf Basis der Linearen Programmierung sowie einen kombinierten Ansatz gegenüber. Die Ansätze werden anhand von Messzeiten und Auslastungen unterschiedlich randomisiert erstellter Testdaten ausgewertet.

Sandboxing remote code execution in the distributed system RCE (2016)

Stammerjohann, Marc Julian

Scientists and engineers are using a distributed system Remote Component Environment (RCE) to design and simulate complex systems like airplanes, ships and satellites. During the simulation, RCE executes local and remote code. Remote code is classified as untrusted code. The execution of remote code comprises potential security risks for the host system of RCE. Additionally, RCE provides full access to system resources. The objective of this thesis is to implement a sandbox prototype to reduce the vulnerability of RCE during the execution of remote code.

Semantic Image Segmentation Combining Visible and Near-Infrared Channels with Depth Information (2015)

Velte, Maurice

Image understanding is a vital task in computer vision that has many applications in areas such as robotics, surveillance and the automobile industry. An important precondition for image understanding is semantic image segmentation, i.e. the correct labeling of every image pixel with its corresponding object name or class. This thesis proposes a machine learning approach for semantic image segmentation that uses images from a multi-modal camera rig. It demonstrates that semantic segmentation can be improved by combining different image types as inputs to a convolutional neural network (CNN), when compared to a single-image approach. In this work a multi-channel near-infrared (NIR) image, an RGB image and a depth map are used. The detection of people is further improved by using a skin image that indicates the presence of human skin in the scene and is computed based on NIR information. It is also shown that segmentation accuracy can be enhanced by using a class voting method based on a superpixel pre-segmentation. Models are trained for 10-class, 3-class and binary classification tasks using an original dataset. Compared to the NIR-only approach, average class accuracy is increased by 7% for 10-class, and by 22% for 3-class classification, reaching a total of 48% and 70% accuracy, respectively. The binary classification task, which focuses on the detection of people, achieves a classification accuracy of 95% and true positive rate of 66%. The report at hand describes the proposed approach and the encountered challenges and shows that a CNN can successfully learn and combine features from multi-modal image sets and use them to predict scene labeling.

Affordance-Based Action Abstraction in Robot Planning (2013)

Höller, Daniel

This work extends the affordance-inspired robot control architecture introduced in the MACS project [35] and especially its approach to integrate symbolic planning systems given in [24] by providing methods to automated abstraction of affordances to high-level operators. It discusses how symbolic planning instances can be generated automatically based on these operators and introduces an instantiation method to execute the resulting plans. Preconditions and effects of agent behaviour are learned and represented in Gärdenfors conceptual spaces framework. Its notion of similarity is used to group behaviours to abstract operators based on the affordance-inspired, function-centred view on the environment. Ways on how the capabilities of conceptual spaces to map subsymbolic to symbolic representations to generate PDDL planning domains including affordance-based operators are discussed. During plan execution, affordance-based operators are instantiated by agent behaviour based on the situation directly before its execution. The current situation is compared to past ones and the behaviour that has been most successful in the past is applied. Execution failures can be repaired by action substitution. The concept of using contexts to dynamically change dimension salience as introduced by Gärdenfors is realized by using techniques from the field of feature selection. The approach is evaluated using a 3D simulation environment and implementations of several object manipulation behaviours.

Automated testing of distributed systems using on-demand virtual infrastructure (2013)

Kroll, Phillip

Distributed systems comprise distributed computing systems, distributed information systems, and distributed pervasive systems. They are often very complex and their implementation is challenging. Intensive and continuous testing is indispensable to ensure reliability and high quality of a distributed system. The testing process should have a high degree of automation, not only on lower levels (i.e. unit and module testing), but also on higher testing levels (e.g. system, integration, and acceptance tests). To achieve automation on higher testing levels virtual infrastructure components (e.g. virtual machines, virtual networks) that are offered as a Service (IaaS) can be employed. The elasticity of on-demand computation resources fits well together with the varying resource demands of automated test execution. A methodology for automated acceptance testing of distributed systems that uses virtual infrastructure is presented. It is founded on a task-oriented model that is used to abstract concurrency and asynchronous, remote communication in distributed systems. The model is used as groundwork for a domain-specific language that allows expressing tests for distributed systems in the form of scenarios. On the one hand, test scenarios are executable and, therefore, fully automated. On the other hand, test scenarios represent requirements to the system under test making an automated, example-based verification possible. A prototypical implementation is used to apply the developed methodology in the context of two different case studies. The first case study uses RCE as an example of a distributed, workflow-driven integration environment for scientific computing. The second one uses MongoDB as an example of a document-oriented database system that offers distributed data storage through master-slave replication. The results of the experimental evaluation indicate that the developed acceptance testing methodology is a useful approach to design, build, and execute tests for distributed systems with high quality and a high degree of automation.

Semantische Suche in Wissensportalen: Konzeption und Evaluation der Erweiterung eines Suchframeworks um semantische Technologien (2013)

Schäfer, Thorsten

Für die Durchführung größerer Projekte innerhalb des DLR ist es häufig notwendig, dass sich Wissenschaftler fachübergreifend in Themengebiete einarbeiten müssen. Im Rahmen dieser Einarbeitung führen Wissenschaftler Recherchen in fremden Fachbereichen durch. Das DLR hat zu diesem Zweck das Wissensportal KnowledgeFinder entwickelt. Dieses Framework setzt klassische Suchverfahren zum Auffinden von Informationen in beliebigen Datenbeständen ein. Wenn Wissenschaftler in fremden Fachbereichen recherchieren, dann fällt es ihnen aufgrund des oberflächlichen Einblicks oftmals schwer, zielgerichtet nach Informationen zu suchen. Die im KnowledgeFinder eingesetzten klassischen Suchverfahren, die auf textueller und struktureller Ähnlichkeit basieren, können bei diesen unspezifischen Suchanfragen nur bedingt beim Auffinden von relevanten Informationen helfen. Aufgrund von Mehrdeutigkeiten und unterschiedlichen Kontexten stoße solche Verfahren oftmals an ihre Grenzen. Semantische Technologien haben zum Ziel diesen Mangel zu beheben. Hier wird neben der textuellen und strukturellen Ähnlichkeit zusätzlich die Dimension der Bedeutung betrachtet. In dieser Masterthesis wurde untersucht, ob die Suchergebnisqualität des KnowledgeFinder durch den Einsatz semantischer Technologien verbessert werden kann. Innerhalb einer Machbarkeitsstudie wurde dazu das KnowledgeFinder Framework um semantische Suchverfahren erweitert. Diese Verfahren sollen die fachübergreifende Recherche von DLR-Wissenschaftlern erleichtern, indem sie ihnen helfen, passende Suchergebnisse in den entsprechenden Fachbereichen zu finden.

Improving reliability of mobile manipulators against unknown external faults (2012)

Akhtar, Naveed

A robot (e.g. mobile manipulator) that interacts with its environment to perform its tasks, often faces situations in which it is unable to achieve its goals despite perfect functioning of its sensors and actuators. These situations occur when the behavior of the object(s) manipulated by the robot deviates from its expected course because of unforeseeable ircumstances. These deviations are experienced by the robot as unknown external faults. In this work we present an approach that increases reliability of mobile manipulators against the unknown external faults. This approach focuses on the actions of manipulators which involve releasing of an object. The proposed approach, which is triggered after detection of a fault, is formulated as a three-step scheme that takes a definition of a planning operator and an example simulation as its inputs. The planning operator corresponds to the action that fails because of the fault occurrence, whereas the example simulation shows the desired/expected behavior of the objects for the same action. In its first step, the scheme finds a description of the expected behavior of the objects in terms of logical atoms (i.e. description vocabulary). The description of the simulation is used by the second step to find limits of the parameters of the manipulated object. These parameters are the variables that define the releasing state of the object. Using randomly chosen values of the parameters within these limits, this step creates different examples of the releasing state of the object. Each one of these examples is labelled as desired or undesired according to the behavior exhibited by the object (in the simulation), when the object is released in the state corresponded by the example. The description vocabulary is also used in labeling the examples autonomously. In the third step, an algorithm (i.e. N-Bins) uses the labelled examples to suggest the state for the object in which releasing it avoids the occurrence of unknown external faults. The proposed N-Bins algorithm can also be used for binary classification problems. Therefore, in our experiments with the proposed approach we also test its prediction ability along with the analysis of the results of our approach. The results show that under the circumstances peculiar to our approach, N-Bins algorithm shows reasonable prediction accuracy where other state of the art classification algorithms fail to do so. Thus, N-Bins also extends the ability of a robot to predict the behavior of the object to avoid unknown external faults. In this work we use simulation environment OPENRave that uses physics engine ODE to simulate the dynamics of rigid bodies.

Continuous Integration für die Entwicklung einer numerischen Bibliothek (2012)

Büchel, Alexander

Um eine Software fertigzustellen und dem Endkunden zu übergeben muss zunächst der Entwicklungsprozess durchschritten werden. Das zügige Durchlaufen dieses Entwicklungsprozesses ist besonders für den Endkunden von entscheidender Bedeutung, da die Wartezeit auf das Softwareprodukt für ihn reduziert wird. Problematisch könnte beispielsweise dabei ein modulares Vorgehen werden, wenn zunächst alle einzelnen Teilkomponenten eines Softwareproduktes entwickelt und diese daraufhin in einer anschließenden Phase, auch Integrationsphase genannt, zusammengefügt würden. Die Länge dieser Integrationsphase kann nur schwer vorausgesagt werden, so dass weder das Entwicklerteam noch der Endkunde wissen, wie lang die Fertigstellung des Produktes dauern wird. Dabei entsteht ein weiterer Nachteil. Da die Komponenten separat voneinander entwickelt werden, könnte es passieren, dass diese beim finalen Zusammenfügen nicht kompatibel sein und müssten, falls notwendig, angepasst werden. Die Folge wäre eine Verschwendung von personellen und somit auch finanziellen Ressourcen seitens des entwickelnden Unternehmens.

Open Access

Fachbereich Informatik

Refine

H-BRS Bibliography

Departments, institutes and facilities

Document Type

Year of publication

Language

Has Fulltext

Keywords

62 search hits