Fachbereich Informatik
Refine
H-BRS Bibliography
- yes (32)
Departments, institutes and facilities
Document Type
- Master's Thesis (22)
- Bachelor Thesis (5)
- Report (4)
- Study Thesis (1)
Year of publication
Language
- English (32) (remove)
Keywords
- Emergency support system (2)
- Mobile sensors (2)
- 3D-Scanner (1)
- ASAG (1)
- Active Learning (1)
- Alize (1)
- Automation (1)
- Batch Normalization (1)
- Bildverarbeitung (1)
- Bounding box explanations (1)
Machine learning-based solutions are frequently adapted in several applications that require big data in operations. The performance of a model that is deployed into operations is subject to degradation due to unanticipated changes in the flow of input data. Hence, monitoring data drift becomes essential to maintain the model’s desired performance. Based on the conducted review of the literature on drift detection, statistical hypothesis testing enables to investigate whether incoming data is drifting from training data. Because Maximum Mean Discrepancy (MMD) and Kolmogorov-Smirnov (KS) have shown to be reliable distance measures between multivariate distributions in the literature review, both were selected from several existing techniques for experimentation. For the scope of this work, the image classification use case was experimented with using the Stream-51 dataset. Based on the results from different drift experiments, both MMD and KS showed high Area Under Curve values. However, KS exhibited faster performance than MMD with fewer false positives. Furthermore, the results showed that using the pre-trained ResNet-18 for feature extraction maintained the high performance of the experimented drift detectors. Furthermore, the results showed that the performance of the drift detectors highly depends on the sample sizes of the reference (training) data and the test data that flow into the pipeline’s monitor. Finally, the results also showed that if the test data is a mixture of drifting and non-drifting data, the performance of the drift detectors does not depend on how the drifting data are scattered with the non-drifting ones, but rather their amount in the test set
Estimation of Prediction Uncertainty for Semantic Scene Labeling Using Bayesian Approximation
(2018)
With the advancement in technology, autonomous and assisted driving are close to being reality. A key component of such systems is the understanding of the surrounding environment. This understanding about the environment can be attained by performing semantic labeling of the driving scenes. Existing deep learning based models have been developed over the years that outperform classical image processing algorithms for the task of semantic labeling. However, the existing models only produce semantic predictions and do not provide a measure of uncertainty about the predictions. Hence, this work focuses on developing a deep learning based semantic labeling model that can produce semantic predictions and their corresponding uncertainties. Autonomous driving needs a real-time operating model, however the Full Resolution Residual Network (FRRN) [4] architecture, which is found as the best performing architecture during literature search, is not able to satisfy this condition. Hence, a small network, similar to FRRN, has been developed and used in this work. Based on the work of [13], the developed network is then extended by adding dropout layers and the dropouts are used during testing to perform approximate Bayesian inference. The existing works on uncertainties, do not have quantitative metrics to evaluate the quality of uncertainties estimated by a model. Hence, the area under curve (AUC) of the receiver operating characteristic (ROC) curves is proposed and used as an evaluation metric in this work. Further, a comparative analysis about the influence of dropout layer position, drop probability and the number of samples, on the quality of uncertainty estimation is performed. Finally, based on the insights gained from the analysis, a model with optimal configuration of dropout is developed. It is then evaluated on the Cityscape dataset and shown to be outperforming the baseline model with an AUC-ROC of about 90%, while the latter having AUC-ROC of about 80%.
A robot (e.g. mobile manipulator) that interacts with its environment to perform its tasks, often faces situations in which it is unable to achieve its goals despite perfect functioning of its sensors and actuators. These situations occur when the behavior of the object(s) manipulated by the robot deviates from its expected course because of unforeseeable ircumstances. These deviations are experienced by the robot as unknown external faults. In this work we present an approach that increases reliability of mobile manipulators against the unknown external faults. This approach focuses on the actions of manipulators which involve releasing of an object. The proposed approach, which is triggered after detection of a fault, is formulated as a three-step scheme that takes a definition of a planning operator and an example simulation as its inputs. The planning operator corresponds to the action that fails because of the fault occurrence, whereas the example simulation shows the desired/expected behavior of the objects for the same action. In its first step, the scheme finds a description of the expected behavior of the objects in terms of logical atoms (i.e. description vocabulary). The description of the simulation is used by the second step to find limits of the parameters of the manipulated object. These parameters are the variables that define the releasing state of the object.
Using randomly chosen values of the parameters within these limits, this step creates different examples of the releasing state of the object. Each one of these examples is labelled as desired or undesired according to the behavior exhibited by the object (in the simulation), when the object is released in the state corresponded by the example. The description vocabulary is also used in labeling the examples autonomously. In the third step, an algorithm (i.e. N-Bins) uses the labelled examples to suggest the state for the object in which releasing it avoids the occurrence of unknown external faults.
The proposed N-Bins algorithm can also be used for binary classification problems. Therefore, in our experiments with the proposed approach we also test its prediction ability along with the analysis of the results of our approach. The results show that under the circumstances peculiar to our approach, N-Bins algorithm shows reasonable prediction accuracy where other state of the art classification algorithms fail to do so. Thus, N-Bins also extends the ability of a robot to predict the behavior of the object to avoid unknown external faults. In this work we use simulation environment OPENRave that uses physics engine ODE to simulate the dynamics of rigid bodies.
This thesis work presents the implementation and validation of image processing problems in hardware to estimate the performance and precision gain. It compares the implementation for the addressed problem on a Field Programmable Gate Array (FPGA) with a software implementation for a General Purpose Processor (GPP) architecture. For both solutions the implementation costs for their development is an important aspect in the validation. The analysis of the flexibility and extendability that can be achieved by a modular implementation for the FPGA design was another major aspect. This work is based upon approaches from previous work, which included the detection of Binary Large OBjects (BLOBs) in static images and continuous video streams [13, 15]. One addressed problem of this work is the tracking of the detected BLOBs in continuous image material. This has been implemented for the FPGA platform and the GPP architecture. Both approaches have been compared with respect to performance and precision. This research project is motivated by the MI6 project of the Computer Vision research group, which is located at the Bonn-Rhein-Sieg University of Applied Sciences. The intent of the MI6 project is the tracking of a user in an immersive environment. The proposed solution is to attach a light emitting device to the user for tracking the created light dots on the projection surface of the immersive environment. Having the center points of those light dots would allow the estimation of the user’s position and orientation. One major issue that makes Computer Vision problems computationally expensive is the high amount of data that has to be processed in real-time. Therefore, one major target for the implementation was to get a processing speed of more than 30 frames per second. This would allow the system to realize feedback to the user in a response time which is faster than the human visual perception. One problem that comes with the idea of using a light emitting device to represent the user, is the precision error. Dependent on the resolution of the tracked projection surface of the immersive environment, a pixel might have a size in cm2. Having a precision error of only a few pixels, might lead to an offset in the estimated user’s position of several cm. In this research work the development and validation of a detection and tracking system for BLOBs on a Cyclone II FPGA from Altera has been realized. The system supports different input devices for the image acquisition and can perform detection and tracking for five to eight BLOBs. A further extension of the design has been evaluated and is possible with some constraints. Additional modules for compressing the image data based on run-length encoding and sub-pixel precision for the computed BLOB center-points have been designed. For the comparison of the FPGA approach for BLOB tracking a similar implementation in software using a multi-threaded approach has been realized. The system can transmit the detection or tracking results on two available communication interfaces, USB and RS232. The analysis of the hardware solution showed a similar precision for the BLOB detection and tracking as the software approach. One problem is the strong increase of the allocated resources when extending the system to process more BLOBs. With one of the applied target platforms, the DE2-70 board from Altera, the BLOB detection could be extended to process up to thirty BLOBs. The implementation of the tracking approach in hardware required much more effort than the software solution. The design of high level problems in hardware for this case are more expensive than the software implementation. The search and match steps in the tracking approach could be realized more efficiently and reliably in software. The additional pre-processing modules for sub-pixel precision and run-length-encoding helped to increase the system’s performance and precision.
Robots integrated into a social environment with humans need the ability to locate persons in their surrounding area. This is also the case for the WelcomeBot which is developed at the Fraunhofer Institute IAIS. In the future, the robot should follow persons in the buildings and guide them to certain areas. Therefore, it needs the capability to detect and track a person in the environment. In this master thesis, an approach for fast and reliable tracking of a person via a mobile robotic platform is presented. Based on the investigation of different methods and sensors, a laser scanner and a camera are selected as the primary two sensors.
Nowadays perception is still an up-to-date scienti fic issue on a mobile robot system. This thesis introduces an approach on how to recognize objects, namely numbers, using a digital camera on a Volksbot robot. The robot used in this thesis has been specifi cally designed for the SICK robot day. The development of the vision algorithm was done in two stages: the region of interest detection and the actual number recognition. Diff erent algorithms had been tested and evaluated and the Canny edge detector with contour finding has been proven to be the best choice for the region of interest detection and the Tesseract OCR engine was the best decision for number recognition. To integrate the vision component on an existing robot system, ROS was used. This thesis also discusses the integration of the EPOS motor controller into ROS.
In the eld of accessing and visualization mobile sensors and their recorded data, di erent approaches were realized. The OGC1 Sensor observation Service supplies a standard to access these information, stored on servers. To be able to access these servers, an interface must be developed and implemented. The result should be a con gurable development framework for web-based GIS clients supporting the OGC sensor observation services. In particular the framework should allow continuous position updates of mobile sensors. Visualization features like charts, bounding boxes of sensors and data series should be included.
Grid services will form the base for future computational Grids. Web Services, have been extended to build Grid services. Grid Services are dened in the Open Grid Service Architecture (OGSA). The Globus Alliance has released a Web Service Resource Framework, which is still under development and which is still missing vital parts. One of them is a Concept that allows Grid-Service Requests to securely traverse Firewalls, and its realization. This Thesis aims at the development and realization of a detailed Concept for an Application Level Gateway for Grid services, based on an existing rough concept. This approach should enable a strict division between a local network and the Internet. The internet is considered as a untrusted site and the local network is considered as a trusted site. Grid resources are placed in the internet as well as in the local network. This means that the possibility to communicate through a Firewall is essential. Some further protocols like Grid Resource Allocation and Management (GRAM) and the Grid File Transfer Protocol (GridFTP) must be able to traverse the network borders securely as well, while no further actions must be taken from the user side. The German Federal Oce for Information Security (BSI) proposes a Firewall - Application Level Gateway (ALG) - Firewall solution to the German Aerospace Center (DLR) where this Thesis is written, as a principle approach. In this approach, the local network is divided from the Internet with two rewalls. Between those rewalls is a demilitarized zone (DMZ), where computers may be placed, which can be accessed from the Internet and from the local network. An ALG which is placed in this DMZ should represent the local Grid nodes to the Internet and it should act as a client to the local nodes. All Grid service requests must be directed to the ALG instead of the protected Grid nodes. The ALG then checks and validates the requests on the application level (OSI layer 7). Requests that pose no security threat and fulll certain criteria will then be forwarded to the local Grid nodes. The responses from the local Grid nodes are checked and validated by the ALG as well.
Data management is a challenge in both scientific and technical environments. Therefore researchers have developed a special interest in this field. Modern approaches (i.e. Subversion, CVS) already offer authoring and versioning in distributed systems. However this might be insufficient in a vast number of scenarios, where not only the data resulting from a process, but also data which describes the process that generated those results is crucial.
Neural network based object detectors are able to automatize many difficult, tedious tasks. However, they are usually slow and/or require powerful hardware. One main reason is called Batch Normalization (BN) [1], which is an important method for building these detectors. Recent studies present a potential replacement called Self-normalizing Neural Network (SNN) [2], which at its core is a special activation function named Scaled Exponential Linear Unit (SELU). This replacement seems to have most of BNs benefits while requiring less computational power. Nonetheless, it is uncertain that SELU and neural network based detectors are compatible with one another. An evaluation of SELU incorporated networks would help clarify that uncertainty. Such evaluation is performed through series of tests on different neural networks. After the evaluation, it is concluded that, while indeed faster, SELU is still not as good as BN for building complex object detector networks.
Today publications are digitally available which enables researchers to search the text and often also the content of tables. On the contrary, images cannot be searched which is not a problem for most fields, but in chemistry most of the information are contained in images, especially structure diagrams. Next to the "normal" chemical structures, which represent exactly one molecule, there also exist generic structures, so called Markush structures. These contain variable parts and additional textual information which enable them to represent several molecules at once. This can vary between just a few and up to thousands or even millions. This ability lead to a spread of Markush structures in patents, because it enables patents to protect entire families of molecules at once. Next to the prevention of an enumeration of all structures it also has the advantage that, if a Markush structure is used in a patent, it is much harder to determine whether a specific structure is protected by it or not. To solve the question about the protection of a structure, it is necessary to search the patents. Appropriate databases for this task already do exist, but are filled manually. An automatic processing does not yet exist. In this project a Markush structure reconstruction prototype is developed which is able to reconstruct bitmaps including Markush structures (meaning a depiction of the structure and a text part describing the generic parts) into a digital format and save them in the newly developed context-free grammar based file format extSMILES. This format is searchable due to its context-free grammar based design. To be able to develop a Markush structure reconstruction prototype, an in depth analysis of the concept of Markush structures and their requirements for a reconstruction process was performed. Thereby it is stated, that the common connection table concept of the existing file formats is not able to store Markush structures. Especially challenging are conditions for most of the formats. Thus, a context-free grammar based file format is developed, which extends the SMILES format. This extSMILES called format assures the searchability of the results by its context-free grammar based concept, and is able to store all information contained in Markush structures. In addition it is generic, extendable and easily understandable. The developed prototype for the Markush structure reconstruction uses extSMILES as output format and is based on the chemical structure recognition tool chemoCR and the Unstructured Information Management Architecture UIMA. For chemoCR modules are developed which enable it to recognize and assemble Markush structures as well as to return the reconstruction result in extSMILES. For UIMA on the other hand, a pipeline is developed, which is able to analyse and translate the input text files to extSMILES. The results of both tools then are combined and presented in chemoCR. An evaluation of the prototype is performed on a representative set of twelve structures of interest and low image quality which contain all typical Markush elements. Trivial structures containing only one R-group are not evaluated. Due to the challenging nature of the images, no Markush structure could be correctly reconstructed. But by regarding the assumption, that R-group definitions which are described by natural language are excluded from the task, and under the condition that the core structure reconstruction is improved, the rate of success can be increased to 58.4%.
This work extends the affordance-inspired robot control architecture introduced in the MACS project [35] and especially its approach to integrate symbolic planning systems given in [24] by providing methods to automated abstraction of affordances to high-level operators. It discusses how symbolic planning instances can be generated automatically based on these operators and introduces an instantiation method to execute the resulting plans. Preconditions and effects of agent behaviour are learned and represented in Gärdenfors conceptual spaces framework. Its notion of similarity is used to group behaviours to abstract operators based on the affordance-inspired, function-centred view on the environment. Ways on how the capabilities of conceptual spaces to map subsymbolic to symbolic representations to generate PDDL planning domains including affordance-based operators are discussed. During plan execution, affordance-based operators are instantiated by agent behaviour based on the situation directly before its execution. The current situation is compared to past ones and the behaviour that has been most successful in the past is applied. Execution failures can be repaired by action substitution. The concept of using contexts to dynamically change dimension salience as introduced by Gärdenfors is realized by using techniques from the field of feature selection. The approach is evaluated using a 3D simulation environment and implementations of several object manipulation behaviours.
This master thesis describes a supervised approach to the detection and the identification of humans in TV-style video sequences. In still images and video sequences, humans appear in different poses and views, fully visible and partly occluded, with varying distances to the camera, at different places, under different illumination conditions, etc. This diversity in appearance makes the task of human detection and identification to a particularly challenging problem. A possible solution of this problem is interesting for a wide range of applications such as video surveillance and content-based image and video processing. In order to detect humans in views ranging from full to close-up view and in the presence of clutter and occlusion, they are modeled by an assembly of several upper body parts. For each body part, a detector is trained based on a Support Vector Machine and on densely sampled, SIFT-like feature points in a detection window. For a more robust human detection, localized body parts are assembled using a learned model for geometric relations based on Gaussians. For a flexible human identification, the outward appearance of humans is captured and learned using the Bag-of-Features approach and non-linear Support Vector Machines. Probabilistic votes for each body part are combined to improve classification results. The combined votes yield an identification accuracy of about 80% in our experiments on episodes of the TV series "Buffy the Vampire Slayer". The Bag-of-Features approach has been used in previous work mainly for object classification tasks. Our results show that this approach can also be applied to the identification of humans in video sequences. Despite the difficulty of the given problem, the overall results are good and encourage future work in this direction.
This thesis investigates the benefit of rubrics for grading short answers using an active learning mechanism. Automating short answer grading using Natural Language Processing (NLP) is one of the active research areas in the education domain. This could save time for the evaluator and invest more time in preparing for the lecture. Most of the research on short answer grading was treated as a similarity task between reference and student answers. However, grading based on reference answers does not account for partial grades and does not provide feedback. Also, the grading is automatic that tries to replace the evaluator. Hence, using rubrics for short answer grading with active learning eliminates the drawbacks mentioned earlier.
Initially, the proposed approach is evaluated on the Mohler dataset, popularly used to benchmark the methodology. This phase is used to determine the parameters for the proposed approach. Therefore, the approach with the selected parameter exceeds the performance of current State-Of-The-Art (SOTA) methods resulting in the Pearson correlation value of 0.63 and Root Mean Square Error (RMSE) of 0.85. The proposed approach has surpassed the SOTA methods by almost 4%.
Finally, the benchmarked approach is used to grade the short answer based on rubrics instead of reference answers. The proposed approach evaluates short answers from Autonomous Mobile Robot (AMR) dataset to provide scores and feedback (formative assessment) based on the rubrics. The average performance of the dataset results in the Pearson correlation value of 0.61 and RMSE of 0.83. Thus, this research has proven that rubrics-based grading achieves formative assessment without compromising performance. In addition, the rubrics have the advantage of generalizability to all answers.
Distributed computing environments allow collaborative problem solving across teams and organisations. A fundamental precondition for collaboration is the ability to find available participants and be able to exchange information. One way to approach this conceptual formulation are central directories or registry services. A major disadvantage of centralized components is, that they limit the flexibility to form ad hoc networks that are targeted to solve a specific problem. To facilitate flexible and dynamic collaborations, ideas from decentralized and self-organising networks can be combined with concepts of service oriented computing. This project aims to investigate potential solutions for dynamic discovery of network participants and outlines how to manage challenges associated with the development of a discovery protocol for distributed systems. During the course of this project a prototypical implementation was created that integrates into the open source distributed, collaborative problem solving environment RCE [9]. It is currently developed at the German Aerospace Center (DLR) but is planned to make the framework available to broader community.
Distributed systems comprise distributed computing systems, distributed information systems, and distributed pervasive systems. They are often very complex and their implementation is challenging. Intensive and continuous testing is indispensable to ensure reliability and high quality of a distributed system. The testing process should have a high degree of automation, not only on lower levels (i.e. unit and module testing), but also on higher testing levels (e.g. system, integration, and acceptance tests). To achieve automation on higher testing levels virtual infrastructure components (e.g. virtual machines, virtual networks) that are offered as a Service (IaaS) can be employed. The elasticity of on-demand computation resources fits well together with the varying resource demands of automated test execution.
A methodology for automated acceptance testing of distributed systems that uses virtual infrastructure is presented. It is founded on a task-oriented model that is used to abstract concurrency and asynchronous, remote communication in distributed systems. The model is used as groundwork for a domain-specific language that allows expressing tests for distributed systems in the form of scenarios. On the one hand, test scenarios are executable and, therefore, fully automated. On the other hand, test scenarios represent requirements to the system under test making an automated, example-based verification possible.
A prototypical implementation is used to apply the developed methodology in the context of two different case studies. The first case study uses RCE as an example of a distributed, workflow-driven integration environment for scientific computing. The second one uses MongoDB as an example of a document-oriented database system that offers distributed data storage through master-slave replication. The results of the experimental evaluation indicate that the developed acceptance testing methodology is a useful approach to design, build, and execute tests for distributed systems with high quality and a high degree of automation.
The objective of this thesis is to implement a computer game based motivation system for maximal strength testing on the Biodex System 3 Isokinetic Dynamometer. The prototype game has been designed to improve the peak torque produced in an isometric knee extensor strength test. An extensive analysis is performed on a torque data set from a previous study. The torque responses for five second long maximal voluntary contractions of the knee extensor are analyzed to understand torque response characteristics of different subjects. The parameters identifed in the data analysis are used in the implementation of the 'Shark and School of Fish' game. The behavior of the game for different torque responses is analyzed on a different torque data set from the previous study. The evaluation shows that the game rewards and motivates continuously over a repetition to reach the peak torque value. The evaluation also shows that the game rewards the user more if he overcomes a baseline torque value within the first second and then gradually increase the torque to reach peak torque.
In the field of autonomous robotics, sensors have played a major role in defining the scope of technology and to a great extent, limitations of it as well. This cycle of constant updates and hence technological advancement has made given birth to some serious industries which were once inconceivable. Industries like autonomous driving which has a serious impact on safety and security of people, also has an equally harsh implication on the dynamics and economics of the market. With sensors like LiDAR and RADAR delivering 3D measurements as point clouds, there is a necessity to process the raw measurements directly and many research groups are working on the same. A sizable research has gone in solving the task of object detection on 2D images. In this thesis we aim to develop a LiDAR based 3D object detection scheme. We combine the ideas of PointPillars and feature pyramid networks from 2D vision to propose Pillar-FPN. The proposed method directly takes 3D point clouds as input and outputs a 3D bounding box. Our pipeline consists of multiple variations of proposed Pillar-FPN at the feature fusion level that are described in the results section. We have trained our model on the KITTI train dataset and evaluated it on KITTI validation dataset.
The recent explosion of available audio-visual media is the new challenge for information retrieval research. Audio speech recognition systems translate spoken content to the text domain. There is a need for searching and indexing this data which possesses no logical structure. One possible way to structure it on a high level of abstraction is by finding topic boundaries. Two unsupervised topic segmentation methods were evaluated with real-world data in the course of this work. The first one, TSF, models topic shifts as fluctuations in the similarity function of the transcript. The second one, LCSeg, approaches topic changes as places with the least overlapping lexical chains. Only LCSeg performed close to a similar real-world corpus. Other reported results could not be outperformed. Topic analysis based on the repeated word usage models renders topic changes more ambiguous than expected. This issue has more impact on the segmentation quality than the state-of-the-art ASR word error rate. It could be concluded that it is advisable to develop topic segmentation algorithms with real-world data to avoid potential biases to artificial data. Unlike evaluated approaches based on word usage analysis, methods operating with local contexts can be expected to perform better through emulation of semantic dependencies.
In service robotics, tasks without the involvement of objects are barely applicable, like in searching, fetching or delivering tasks. Service robots are supposed to capture efficiently object related information in real world scenes while for instance considering clutter and noise, and also being flexible and scalable to memorize a large set of objects. Besides object perception tasks like object recognition where the object’s identity is analyzed, object categorization is an important visual object perception cue that associates unknown object instances based on their e.g. appearance or shape to a corresponding category. We present a pipeline from the detection of object candidates in a domestic scene over the description to the final shape categorization of detected candidates. In order to detect object related information in cluttered domestic environments an object detection method is proposed that copes with multiple plane and object occurrences like in cluttered scenes with shelves. Further a surface reconstruction method based on Growing Neural Gas (GNG) in combination with a shape distribution-based descriptor is proposed to reflect shape characteristics of object candidates. Beneficial properties provided by the GNG such as smoothing and denoising effects support a stable description of the object candidates which also leads towards a more stable learning of categories. Based on the presented descriptor a dictionary approach combined with a supervised shape learner is presented to learn prediction models of shape categories.
Experimental results, of different shapes related to domestically appearing object shape categories such as cup, can, box, bottle, bowl, plate and ball, are shown. A classification accuracy of about 90% and a sequential execution time of lesser than two seconds for the categorization of an unknown object is achieved which proves the reasonableness of the proposed system design. Additional results are shown towards object tracking and false positive handling to enhance the robustness of the categorization. Also an initial approach towards incremental shape category learning is proposed that learns a new category based on the set of previously learned shape categories.