Safety-critical applications like autonomous driving use Deep Neural Networks (DNNs) for object detection and segmentation. DNNs fail to predict correctly when they observe an Out-of-Distribution (OOD) input, which can lead to catastrophic consequences. Existing OOD detection methods have been studied extensively for image inputs but have not been explored much for LiDAR inputs. In this study, we therefore propose two datasets for benchmarking OOD detection in 3D semantic segmentation. We used Maximum Softmax Probability and Entropy scores generated using Deep Ensembles and Flipout versions of RandLA-Net as OOD scores. We observed that Deep Ensembles outperform the Flipout model in OOD detection, with greater AUROC scores on both datasets.
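As an illustration of the two OOD scores named in the abstract, the following is a minimal sketch, assuming the ensemble members' softmax outputs are simply averaged before scoring. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
import math

def mean_softmax(member_probs):
    """Average the per-class probabilities over all ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]

def msp_score(probs):
    """Maximum Softmax Probability turned into an OOD score (higher = more likely OOD)."""
    return 1.0 - max(probs)

def entropy_score(probs):
    """Shannon entropy (in nats) of the predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical softmax outputs of three ensemble members for a single point.
confident = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.85, 0.1, 0.05]]
uncertain = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]

for name, members in (("confident", confident), ("uncertain", uncertain)):
    probs = mean_softmax(members)
    print(name, round(msp_score(probs), 3), round(entropy_score(probs), 3))
```

Both scores grow as the averaged prediction becomes less peaked, so a threshold on either one could flag candidate OOD points; how the scores are aggregated per scene is left open here.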
Login Data Set for Risk-Based Authentication
Synthesized login feature data of >33M login attempts and >3.3M users on a large-scale online service in Norway. Original data collected between February 2020 and February 2021.
This data set aims to foster research and development for Risk-Based Authentication (RBA) systems (https://riskbasedauthentication.org). The data was synthesized from the real-world login behavior of more than 3.3M users at a large-scale single sign-on (SSO) online service in Norway.
Robust Identification and Segmentation of the Outer Skin Layers in Volumetric Fingerprint Data
(2022)
Despite the long history of fingerprint biometrics and its use to authenticate individuals, there are still some unsolved challenges with fingerprint acquisition and presentation attack detection (PAD). Currently available commercial fingerprint capture devices struggle with non-ideal skin conditions, including soft skin in infants. They are also susceptible to presentation attacks, which limits their applicability in unsupervised scenarios such as border control. Optical coherence tomography (OCT) could be a promising solution to these problems. In this work, we propose a digital signal processing chain for segmenting two complementary fingerprints from the same OCT fingertip scan: One fingerprint is captured as usual from the epidermis (“outer fingerprint”), whereas the other is taken from inside the skin, at the junction between the epidermis and the underlying dermis (“inner fingerprint”). The resulting 3D fingerprints are then converted to a conventional 2D grayscale representation from which minutiae points can be extracted using existing methods. Our approach is device-independent and has been proven to work with two different time domain OCT scanners. Using efficient GPGPU computing, it took less than a second to process an entire gigabyte of OCT data. To validate the results, we captured OCT fingerprints of 130 individual fingers and compared them with conventional 2D fingerprints of the same fingers. We found that both the outer and inner OCT fingerprints were backward compatible with conventional 2D fingerprints, with the inner fingerprint generally being less damaged and, therefore, more reliable.
The eighth edition of the scientific workshop “Usable Security and Privacy” at Mensch und Computer 2022 aims to present current contributions from research and practice and then discuss them with the participants. The workshop is intended to continue and further develop an established forum in which experts from different fields, e.g., usability and security engineering, can engage in transdisciplinary exchange.
In the seventh edition of the scientific workshop “Usable Security und Privacy” at Mensch und Computer 2021, current contributions from research and practice will again be presented and then discussed with all participants. This year, two contributions address privacy and two address security. The workshop continues and further develops an established forum in which experts from different domains, e.g., usability and security engineering, can engage in transdisciplinary exchange.
The visual and auditory quality of computer-mediated stimuli for virtual and extended reality (VR/XR) is rapidly improving. Still, it remains challenging to provide a fully embodied sensation and awareness of objects surrounding, approaching, or touching us in a 3D environment, though it can greatly aid task performance in a 3D user interface. For example, feedback can provide warning signals for potential collisions (e.g., bumping into an obstacle while navigating) or pinpoint areas where one’s attention should be directed to (e.g., points of interest or danger). These events inform our motor behaviour and are often associated with perception mechanisms tied to our so-called peripersonal and extrapersonal space models that relate our body to object distance, direction, and contact point/impact. We will discuss these reference spaces to explain the role of different cues in our motor action responses that underlie 3D interaction tasks. However, providing proximity and collision cues can be challenging. Various full-body vibration systems have been developed that stimulate body parts other than the hands, but they can have limitations in their applicability and feasibility due to their cost and effort to operate, as well as hygienic considerations associated with, e.g., Covid-19. Informed by results of a prior study using low frequencies for collision feedback, in this paper we look at an unobtrusive way to provide spatial, proximal and collision cues. Specifically, we assess the potential of foot sole stimulation to provide cues about object direction and relative distance, as well as collision direction and force of impact. Results indicate that vibration-based stimuli in particular could be useful within the frame of peripersonal and extrapersonal space perception that supports 3DUI tasks. Current results favor the feedback combination of continuous vibrotactor cues for proximity, and bass-shaker cues for body collision.
Results show that users could rather easily judge the different cues at a reasonably high granularity. This granularity may be sufficient to support common navigation tasks in a 3DUI.
The processing of employee personal data is dramatically increasing. To protect employees' fundamental right to privacy, the law provides for the implementation of privacy controls, including transparency and intervention. At present, however, the stakeholders responsible for putting these obligations into action, such as employers and software engineers, simply lack the fundamental knowledge needed to design and implement the necessary controls. Indeed, privacy research has so far focused mainly on consumer relations in the private context. In contrast, privacy in the employment context is less well studied. However, since privacy is highly context-dependent, existing knowledge and privacy controls from other contexts cannot simply be adopted to the employment context. In particular, privacy in employment is subject to different legal and social norms, which require a different conceptualization of the right to privacy than is usual in other contexts. To adequately address these aspects, there is broad consensus that privacy must be regarded as a socio-technical concept in which human factors must be considered alongside technical-legal factors. Today, however, there is a particular lack of knowledge about human factors in employee privacy. Disregarding the needs and concerns of individuals or lack of usability, though, are common reasons for the failure of privacy and security measures in practice. This dissertation addresses key knowledge gaps on human factors in employee privacy by presenting the results of a total of three in-depth studies with employees in Germany. The results provide insights into employees' perceptions of the right to privacy, as well as their perceptions and expectations regarding the processing of employee personal data. 
The insights gained provide a foundation for the human-centered design and implementation of employee-centric privacy controls, i.e., privacy controls that incorporate the views, expectations, and capabilities of employees. Specifically, this dissertation presents the first mental models of employees on the right to informational self-determination, the German equivalent of the right to privacy. The results provide insights into employees' (1) perceptions of categories of data, (2) familiarity and expectations of the right to privacy, and (3) perceptions of data processing, data flow, safeguards, and threat models. In addition, three major types of mental models are presented, each with a different conceptualization of the right to privacy and a different desire for control. Moreover, this dissertation provides multiple insights into employees' perceptions of data sensitivity and willingness to disclose personal data in employment. Specifically, it highlights the uniqueness of the employment context compared to other contexts and breaks down the multi-dimensionality of employees' perceptions of personal data. As a result, the dimensions in which employees perceive data are presented, and differences among employees are highlighted. This is complemented by identifying personal characteristics and attitudes toward employers, as well as toward the right to privacy, that influence these perceptions. Furthermore, this dissertation provides insights into practical aspects for the implementation of personal data management solutions to safeguard employee privacy. Specifically, it presents the results of a user-centered design study with employees who process personal data of other employees as part of their job. Based on the results obtained, a privacy pattern is presented that harmonizes privacy obligations with personal data processing activities. 
The pattern is useful for designing privacy controls that help these employees handle employee personal data in a privacy-compliant manner, taking into account their skills and knowledge, thus helping to protect employee privacy. The outcome of this dissertation benefits a wide range of stakeholders who are involved in the protection of employee privacy. For example, it highlights the challenges to be considered by employers and software engineers when conceptualizing and designing employee-centric privacy controls. Policymakers and researchers gain a better understanding of employees' perceptions of privacy and obtain fundamental knowledge for future research into theoretical and abstract concepts or practical issues of employee privacy. Employers, IT engineers, and researchers gain insights into ways to empower data processing employees to handle employee personal data in a privacy-compliant manner, enabling employers to improve and promote compliance. Since the basic principles underlying informational self-determination have been incorporated into European privacy legislation, we are confident that our results are also of relevance to stakeholders outside Germany.
The following work presents algorithms for semi-automatic validation, feature extraction and ranking of time series measurements acquired from MOX gas sensors. Semi-automatic measurement validation is accomplished by extending established curve similarity algorithms with a slope-based signature calculation. Furthermore, a feature-based ranking metric is introduced. It allows for individual prioritization of each feature and can be used to find the best performing sensors regarding multiple research questions. Finally, the functionality of the algorithms, as well as the developed software suite, are demonstrated with an exemplary scenario, illustrating how to find the most power-efficient MOX gas sensor in a data set collected during an extensive screening consisting of 16,320 measurements, all taken with different sensors at various temperatures and analytes.
Artificial intelligence (AI) has become an integral part of today's society. In sports, too, AI methods have increasingly gained ground in recent years. However, whether and to what extent the current potential of AI is actually being exploited has not yet been investigated. The benefit of AI methods in sports is undisputed, but serious problems arise in putting them into practice, concerning access to resources, the availability of experts, and the handling of the methods and data. The authors hypothesize that the slow adoption of AI methods in elite sports, compared with other application areas, is attributable to several mismatches between the application field and the AI methods. These mismatches are methodological, structural, and communicative in nature. This expert report derives proposals that can help resolve the mismatches and at the same time reveal new opportunities for transfer and synergy. In addition, three use cases on training control, performance diagnostics, and competition diagnostics were worked out as examples, in the form of corresponding project descriptions. The report shows how problems that still exist today in connecting AI and sports can be eliminated as far as possible. The training control use case was implemented empirically in cycling, which is why it is presented in more detail.
Computers can help us to trigger our intuition about how to solve a problem. But how does a computer take into account what a user wants and update these triggers? User preferences are hard to model as they are by nature vague, depend on the user’s background and are not always deterministic, changing depending on the context and process under which they were established. We posit that the process of preference discovery should be the object of interest in computer-aided design or ideation. The process should be transparent, informative, interactive and intuitive. We formulate Hyper-Pref, a cyclic co-creative process between human and computer, which triggers the user’s intuition about what is possible and is updated according to what the user wants based on their decisions. We combine quality diversity algorithms, a divergent optimization method that can produce many diverse solutions, with variational autoencoders to model both that diversity and the user’s preferences, discovering the preference hypervolume within large search spaces.
We describe a systematic approach for rendering time-varying simulation data produced by exa-scale simulations, using GPU workstations. The data sets we focus on use adaptive mesh refinement (AMR) to overcome memory bandwidth limitations by representing interesting regions in space with high detail. Particularly, our focus is on data sets where the AMR hierarchy is fixed and does not change over time. Our study is motivated by the NASA Exajet, a large computational fluid dynamics simulation of a civilian cargo aircraft that consists of 423 simulation time steps, each storing 2.5 GB of data per scalar field, amounting to a total of 4 TB. We present strategies for rendering this time series data set with smooth animation and at interactive rates using current generation GPUs. We start with an unoptimized baseline and step by step extend that to support fast streaming updates. Our approach demonstrates how to push current visualization workstations and modern visualization APIs to their limits to achieve interactive visualization of exa-scale time series data sets.
In robot-assisted therapy for individuals with Autism Spectrum Disorder, the workload of therapists during a therapeutic session is increased if they have to control the robot manually. To allow therapists to focus on the interaction with the person instead, the robot should be more autonomous, namely it should be able to interpret the person's state and continuously adapt its actions according to their behaviour. In this paper, we develop a personalised robot behaviour model that can be used in the robot decision-making process during an activity; this behaviour model is trained with the help of a user model that has been learned from real interaction data. We use Q-learning for this task; the results demonstrate that the policy requires about 10,000 iterations to converge. We thus investigate policy transfer for improving the convergence speed; we show that this is a feasible solution, but an inappropriate initial policy can lead to a suboptimal final return.
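The Q-learning update and the idea of warm-starting from a transferred policy can be sketched as follows. This is a minimal illustration on a hypothetical five-state chain environment, not the behaviour model or user model from the paper; the environment, the hyperparameters, and the `q_init` parameter are all assumptions made for the example.

```python
import random

def q_learning(n_states=5, episodes=2000, alpha=0.1, gamma=0.9,
               epsilon=0.1, q_init=None, seed=0):
    """Tabular Q-learning on a toy chain; the last state is the goal."""
    rng = random.Random(seed)
    # Warm-start from a transferred Q-table if one is given, else start at zero.
    q = [row[:] for row in q_init] if q_init else [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            # Toy dynamics: action 1 moves right, action 0 moves left.
            next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
            reward = 1.0 if next_state == goal else 0.0
            # Standard Q-learning temporal-difference update.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]

q_scratch = q_learning()
q_transfer = q_learning(episodes=50, q_init=q_scratch)
print(greedy_policy(q_scratch), greedy_policy(q_transfer))
```

Passing a previously learned table as `q_init` is one simple form of policy transfer; as the abstract notes, a poorly matched initial table can also bias learning toward a suboptimal policy.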
This paper explores the role of artificial intelligence (AI) in elite sports. We approach the topic from two perspectives. Firstly, we provide a literature-based overview of AI success stories in areas other than sports. We identified multiple approaches in the areas of Machine Perception, Machine Learning and Modeling, Planning and Optimization, as well as Interaction and Intervention, that hold potential for improving training and competition. Secondly, we examine the present status of AI use in elite sports. To this end, in addition to another literature review, we interviewed leading sports scientists who are closely connected to the main national service institute for elite sports in their countries. The analysis of this literature review and the interviews shows that most activity is carried out in the methodical categories of signal and image processing. However, projects in the field of modeling and planning have become increasingly popular in recent years. Based on these two perspectives, we extract deficits, issues and opportunities and summarize them in six key challenges faced by the sports analytics community. These challenges include data collection, controllability of an AI by the practitioners, and explainability of AI results.
Short summary
Accompanying dataset for our paper
A. Mitrevski, P. G. Plöger, and G. Lakemeyer, "Robot Action Diagnosis and Experience Correction by Falsifying Parameterised Execution Models," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2021.
Contents
The dataset includes a single zip archive, containing data from the experiment described in the paper (conducted with a Toyota HSR). The zip archive contains three subdirectories:
handle_grasping_failure_database: A dump of a MongoDB database containing data from the handle grasping experiment, including ground-truth grasping failure annotations
pre_arm_motion_images: Images collected from the robot's hand camera before moving the robot's hand towards the handle
pregrasp_images: Images collected from the robot's hand camera just before closing the gripper for grasping
The image names include the time stamp at which the images were taken; this allows matching each image with the execution data in the database.
Database usage
After unzipping the archive, the database can be restored with the command
mongorestore handle_grasping_failure_database
This will create a MongoDB database with the name drawer_handle_grasping_failures.
Code for processing the data and failure analysis can be found in our GitHub repository (https://github.com/alex-mitrevski/explainable-robot-execution-models).
The dataset contains the following data from successful and failed executions of the Toyota HSR robot placing a book on a shelf:
RGB images from the robot's head camera
Depth images from the robot's head camera
Rendered images of the robot's 3D model from the point of view of the robot's head camera
Force-torque readings from a wrist-mounted force-torque sensor
Joint efforts, velocities and positions
extrinsic and intrinsic camera calibration parameters
frame-level anomaly annotations
The anomalies that occur during execution include:
the manipulated book falling down
books on the shelf being disturbed significantly
camera occlusions
robot being disturbed by an external collision
The dataset is split into a train, validation and test set with the following number of trials:
Train: 48 successful trials
Validation: 6 successful trials
Test: 60 anomalous trials and 7 successful trials
Contents
There are two zip archives included (grasping.zip and throwing.zip), corresponding to two experiments (grasping objects and throwing them in a drawer), both performed with a Toyota HSR. Each archive contains two directories - learning and generalisation - with object-specific learning and generalisation data. For each object, we provide a dump of a MongoDB database, which contains data sufficient for learning the models used in our experiments.
Usage
After unzipping the archives, each database can be restored with the command
mongorestore [data_directory_name]
This will create a MongoDB database with the name of the directory. Code for processing the data and model learning can be found in our GitHub repository (https://github.com/alex-mitrevski/explainable-robot-execution-models).
Short summary
This dataset accompanies our paper
A. Mitrevski, P. G. Plöger, and G. Lakemeyer, "Representation and Experience-Based Learning of Explainable Models for Robot Action Execution," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.
Contents
There are three zip archives included, each of them a dump of a MongoDB database corresponding to one of the three experiments in the paper:
Grasping a drawer handle (handle_drawer_logs.zip)
Grasping a fridge handle (handle_fridge_logs.zip)
Pulling an object (pull_logs.zip)
All three experiments were performed with a Toyota HSR. Only the data necessary for learning the models used in our experiments are included here.
Usage
After unzipping the archives, each database can be restored with the command
mongorestore [directory_name]
This will create a MongoDB database with the name of the directory (handle_drawer_logs, handle_fridge_logs, and pull_logs).
Code for processing the data and model learning can be found in our GitHub repository (https://github.com/alex-mitrevski/explainable-robot-execution-models).
Modern Monte-Carlo-based rendering systems still suffer from the computational complexity involved in the generation of noise-free images, making it challenging to synthesize interactive previews. We present a framework suited for rendering such previews of static scenes using a caching technique that builds upon a linkless octree. Our approach allows for memory-efficient storage and constant-time lookup to cache diffuse illumination at multiple hitpoints along the traced paths. Non-diffuse surfaces are dealt with in a hybrid way in order to reconstruct view-dependent illumination while maintaining interactive frame rates. By evaluating the visual fidelity against ground truth sequences and by benchmarking, we show that our approach compares well to low-noise path traced results, but with a greatly reduced computational complexity allowing for interactive frame rates. This way, our caching technique provides a useful tool for global illumination previews and multi-view rendering.
Graph databases employ graph structures such as nodes, attributes and edges to model and store relationships among data. To access this data, graph query languages (GQL) such as Cypher are typically used, which might be difficult to master for end-users. In the context of relational databases, sequence to SQL models, which translate natural language questions to SQL queries, have been proposed. While these Neural Machine Translation (NMT) models increase the accessibility of relational databases, NMT models for graph databases are not yet available mainly due to the lack of suitable parallel training data. In this short paper we sketch an architecture which enables the generation of synthetic training data for the graph query language Cypher.
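One common way to obtain parallel question/query data is to fill paired templates from a schema. The sketch below illustrates that idea only; the abstract does not describe the architecture in detail, so the templates, the toy schema, and the function name are all hypothetical assumptions rather than the authors' design.

```python
# Paired templates: a natural language question and the matching Cypher query.
TEMPLATES = [
    ("Which {label} nodes have {prop} equal to {value}?",
     "MATCH (n:{label}) WHERE n.{prop} = {value!r} RETURN n"),
    ("How many {label} nodes are there?",
     "MATCH (n:{label}) RETURN count(n)"),
]

# Hypothetical toy schema: node label -> property -> example values.
SCHEMA = {
    "Person": {"name": ["Alice", "Bob"]},
    "Movie": {"title": ["Heat"]},
}

def generate_pairs(schema, templates):
    """Fill every template with every (label, property, value) combination."""
    pairs = []
    for label, props in schema.items():
        for prop, values in props.items():
            for value in values:
                for question, query in templates:
                    slots = {"label": label, "prop": prop, "value": value}
                    pairs.append((question.format(**slots), query.format(**slots)))
    # Templates that ignore some slots produce duplicates; keep the first of each.
    return list(dict.fromkeys(pairs))

for question, query in generate_pairs(SCHEMA, TEMPLATES):
    print(question, "->", query)
```

Pairs produced this way are syntactically regular, so in practice they would likely need paraphrasing or other augmentation before they resemble real user questions.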
Digitalization and the use of information and communication technologies (ICT) have led not only to higher productivity in working and private life but also to new forms of psychological stress. The stress experienced in connection with the use of ICT is also referred to in the literature as technostress. Research on this topic shows that the emergence of technostress depends on individual factors. The personality of ICT users not only determines the occurrence of technostress but also influences its health-related and performance-related consequences. This literature review systematically summarizes the state of research on the role of personality differences in the emergence of technostress and its consequences. The relevant research articles are evaluated with regard to the variables used, samples and study designs, statistical methods, theories, and frameworks. Finally, the current state of research is put into context and research gaps are identified.
The accurate forecasting of solar radiation plays an important role in predictive control applications for energy systems with a high share of photovoltaic (PV) energy. Off-grid microgrid applications using predictive control in particular can benefit from forecasts with a high temporal resolution to address sudden fluctuations of PV power. However, cloud formation processes and movements are subject to ongoing research. For nowcasting applications, all-sky imagers (ASI) are used to provide appropriate forecasts for the aforementioned applications. Recent research aims to achieve these forecasts via deep learning approaches, either as an image segmentation task to generate a DNI forecast through a cloud vectoring approach to translate the DNI to a GHI with ground-based measurement (Fabel et al., 2022; Nouri et al., 2021), or as an end-to-end regression task to generate a GHI forecast directly from the images (Paletta et al., 2021; Yang et al., 2021). While end-to-end regression might be the more attractive approach for off-grid scenarios, the literature reports increased performance compared to smart persistence but does not show satisfactory forecasting patterns (Paletta et al., 2021). This work takes a step back and investigates the possibility of translating ASI images to current GHI in order to deploy the neural network as a feature extractor. An ImageNet pre-trained deep learning model is used to achieve such translation on an openly available dataset by the University of California San Diego (Pedro et al., 2019). The images and measurements were collected in Folsom, California. Results show that the neural network can successfully translate ASI images to GHI for a variety of cloud situations without the need for any external variables. Extending the neural network to a forecasting task also shows promising forecasting patterns, which shows that the neural network extracts both temporal and momentary features within the images to generate GHI forecasts.
Collaborative industrial robots are becoming increasingly cost-efficient for manufacturing companies. While these systems can be a great help to human workers, they also pose a serious health risk if the mandatory safety measures are implemented inadequately. Conventional safety devices such as fences or light curtains offer good protection, but such static safeguards are problematic in new, highly dynamic work scenarios.
In the research project BeyondSPAI, a functional model of a multi-sensor system for safeguarding such dynamic work scenarios was designed, implemented, and tested in the field. The core of the system is a robust optical material classification that uses an intelligent InGaAs camera system to distinguish skin from other typical workpiece surfaces (e.g., wood, metals, or plastics). This unique capability is used to reliably detect human workers, so that a conventional robot can then operate as a person-aware cobot.
The system is modular and can easily be extended with additional sensors of various kinds. It can be adapted to different brands of industrial robots and can be quickly integrated into existing robot systems. Depending on which monitoring zone has been penetrated, the four safety outputs provided by the system can be used to issue a warning, to slow the robot's motion to a safe speed, or to stop the robot safely. As soon as all zones are again identified as “clearly free of persons”, the robot can accelerate again, resume its original motion, and continue working.
Risk-based authentication (RBA) aims to protect users against attacks involving stolen passwords. RBA monitors features during login and requests re-authentication when feature values differ widely from those previously observed. It is recommended by various national security organizations, and users perceive it as more usable than, and equally secure as, equivalent two-factor authentication. Despite that, RBA is still used by very few online services. Reasons for this include a lack of validated open resources on RBA properties, implementation, and configuration. This effectively hinders progress in RBA research, development, and adoption.
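The basic mechanism, comparing the feature values of a login attempt with those seen in the user's earlier logins, can be sketched as follows. This is a deliberately simplified novelty score, not the risk model evaluated in the paper; the feature names, the averaging, and the threshold are assumptions chosen for illustration.

```python
from collections import Counter

def risk_score(history, attempt):
    """Novelty score in [0, 1]: 0 if every feature value was seen in all past logins."""
    if not history:
        return 1.0  # no login history: treat the attempt as maximally unfamiliar
    scores = []
    for feature, value in attempt.items():
        seen = Counter(login.get(feature) for login in history)
        # Share of past logins in which this feature had a different value.
        scores.append(1.0 - seen[value] / len(history))
    return sum(scores) / len(scores)

def needs_reauthentication(history, attempt, threshold=0.5):
    """Request re-authentication when the attempt looks unfamiliar (hypothetical threshold)."""
    return risk_score(history, attempt) > threshold

# Hypothetical login history for one user.
history = [
    {"country": "NO", "browser": "Firefox"},
    {"country": "NO", "browser": "Firefox"},
    {"country": "NO", "browser": "Chrome"},
]
print(needs_reauthentication(history, {"country": "NO", "browser": "Firefox"}))
print(needs_reauthentication(history, {"country": "BR", "browser": "Safari"}))
```

Raising the threshold trades security for usability, since fewer logins trigger an extra authentication step; that trade-off is the kind of configuration choice the abstract refers to.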
To close this gap, we provide the first long-term RBA analysis on a real-world large-scale online service. We collected feature data of 3.3 million users and 31.3 million login attempts over more than 1 year. Based on the data, we provide (i) studies on RBA’s real-world characteristics plus its configurations and enhancements to balance usability, security, and privacy; (ii) a machine learning–based RBA parameter optimization method to support administrators finding an optimal configuration for their own use case scenario; (iii) an evaluation of the round-trip time feature’s potential to replace the IP address for enhanced user privacy; and (iv) a synthesized RBA dataset to reproduce this research and to foster future RBA research. Our results provide insights on selecting an optimized RBA configuration so that users profit from RBA after just a few logins. The open dataset enables researchers to study, test, and improve RBA for widespread deployment in the wild.
We introduce canonical weight normalization for convolutional neural networks. Inspired by the canonical tensor decomposition, we express the weight tensors in so-called canonical networks as scaled sums of outer vector products. In particular, we train network weights in the decomposed form, where scale weights are optimized separately for each mode. Additionally, similarly to weight normalization, we include a global scaling parameter. We study the initialization of the canonical form by running the power method and by drawing randomly from Gaussian or uniform distributions. Our results indicate that we can replace the power method with cheaper initializations drawn from standard distributions. The canonical re-parametrization leads to competitive normalization performance on the MNIST, CIFAR10, and SVHN data sets. Moreover, the formulation simplifies network compression. Once training has converged, the canonical form allows convenient model-compression by truncating the parameter sums.
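To make the decomposed form concrete, the sketch below rebuilds a small 3-way tensor as a scaled sum of outer vector products and shows compression by truncating that sum. It is an illustration under stated assumptions: a 3-way tensor stands in for a convolutional weight tensor, one scale per rank-one term is used, and `g` plays the role of the global scaling parameter. The paper's exact parametrization may differ.

```python
def canonical_tensor(factors, scales, g=1.0, rank=None):
    """Rebuild a 3-way tensor from rank-one terms.

    factors: list of (a, b, c) vector triples, one triple per term
    scales:  one scalar per term
    g:       global scaling parameter
    rank:    number of leading terms to keep (None keeps all)
    """
    rank = len(factors) if rank is None else rank
    a0, b0, c0 = factors[0]
    w = [[[0.0] * len(c0) for _ in b0] for _ in a0]
    for (a, b, c), s in list(zip(factors, scales))[:rank]:
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                for k, ck in enumerate(c):
                    # Add the scaled outer product a ⊗ b ⊗ c entry by entry.
                    w[i][j][k] += g * s * ai * bj * ck
    return w

# Two hypothetical rank-one terms for a 2x2x2 tensor.
factors = [([1.0, 0.0], [1.0, 0.0], [1.0, 0.0]),
           ([0.0, 1.0], [0.0, 1.0], [0.0, 1.0])]
scales = [2.0, 3.0]

full = canonical_tensor(factors, scales)
compressed = canonical_tensor(factors, scales, rank=1)  # drop the second term
print(full[1][1][1], compressed[1][1][1])
```

Truncation only changes how many terms are summed, which is why compression becomes convenient once the weights are stored in this form.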
TSEM: Temporally Weighted Spatiotemporal Explainable Neural Network for Multivariate Time Series
(2022)
Deep learning has become a one-size-fits-all solution for technical and business domains thanks to its flexibility and adaptability. It is implemented using opaque models, which unfortunately undermines the trustworthiness of the outcome. To better understand the behavior of a system, particularly one driven by time series, a look inside a deep learning model through so-called post hoc eXplainable Artificial Intelligence (XAI) approaches is important. There are two major types of XAI for time series data, namely model-agnostic and model-specific. The model-specific approach is considered in this work. While other approaches employ either Class Activation Mapping (CAM) or an Attention Mechanism, we merge the two strategies into a single system, simply called the Temporally Weighted Spatiotemporal Explainable Neural Network for Multivariate Time Series (TSEM). TSEM combines the capabilities of RNN and CNN models in such a way that RNN hidden units are employed as attention weights for the temporal axis of the CNN feature maps. The results show that TSEM outperforms XCM. It is similar to STAM in terms of accuracy, while also satisfying a number of interpretability criteria, including causality, fidelity, and spatiotemporality.
Comparative study of 3D object detection frameworks based on LiDAR data and sensor fusion techniques
(2022)
Estimating and understanding the surroundings of the vehicle precisely is the basic and crucial step for an autonomous vehicle. The perception system plays a significant role in providing an accurate interpretation of a vehicle's environment in real time. Generally, the perception system involves various subsystems such as localization, detection and avoidance of static and dynamic obstacles, mapping, and others. To perceive the environment, these vehicles are equipped with various exteroceptive sensors (both passive and active), in particular cameras, radars, LiDARs, and others. These systems use deep learning techniques that transform the huge amount of sensor data into semantic information on which the object detection and localization tasks are performed. For numerous driving tasks, accurate results require the location and depth information of a particular object. 3D object detection methods, by utilizing the additional pose data from sensors such as LiDARs and stereo cameras, provide information on the size and location of the object. Based on recent research, 3D object detection frameworks performing object detection and localization on LiDAR data and sensor fusion techniques show significant improvement in their performance. In this work, a comparative study is performed of the effect of using LiDAR data in object detection frameworks and of the performance improvement obtained with sensor fusion techniques. The study also discusses various state-of-the-art methods in both cases, performs an experimental analysis, and provides future research directions.
As cameras are ubiquitous in autonomous systems, object detection is a crucial task. Object detectors are widely used in applications such as autonomous driving, healthcare, and robotics. Given an image, an object detector outputs both the bounding box coordinates and the classification probabilities for each object detected. State-of-the-art detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use, particularly in safety-critical applications. It is therefore crucial to explain the reason behind each detector decision in order to gain user trust, enhance detector performance, and analyze detector failures.
Previous work fails to explain as well as evaluate both bounding box and classification decisions individually for various detectors. Moreover, no tools explain each detector decision, evaluate the explanations, and also identify the reasons for detector failures. This restricts the flexibility to analyze detectors. The main contribution presented here is an open-source Detector Explanation Toolkit (DExT). It is used to explain the detector decisions, evaluate the explanations, and analyze detector errors. The detector decisions are explained visually by highlighting the image pixels that most influence a particular decision. The toolkit implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. To the author’s knowledge, this is the first work to conduct extensive qualitative and novel quantitative evaluations of different explanation methods across various detectors. The qualitative evaluation incorporates a visual analysis of the explanations carried out by the author as well as a human-centric evaluation. The human-centric evaluation includes a user study to understand user trust in the explanations generated across various explanation methods for different detectors. Four multi-object visualization methods are provided to merge the explanations of multiple objects detected in an image as well as the corresponding detector outputs in a single image. Finally, DExT implements the procedure to analyze detector failures using the formulated approach.
The visual analysis illustrates that the ability to explain a model is more dependent on the model itself than the actual ability of the explanation method. In addition, the explanations are affected by the object explained, the decision explained, detector architecture, training data labels, and model parameters. The results of the quantitative evaluation show that the Single Shot MultiBox Detector (SSD) is more faithfully explained compared to other detectors regardless of the explanation methods. In addition, a single explanation method cannot generate more faithful explanations than other methods for both the bounding box and the classification decision across different detectors. Both the quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among selected methods across all detectors. Finally, a convex polygon-based multi-object visualization method provides more human-understandable visualization than other methods.
The author expects that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.
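To make the notion of a gradient-based explanation concrete, the following minimal sketch shows plain SmoothGrad on a toy scoring function: gradients are averaged over several noisy copies of the input. It is not DExT code and does not include Guided Backpropagation; the finite-difference gradient, the toy linear score, and all names and parameter values are assumptions for illustration.

```python
import random

def numerical_gradient(score_fn, x, eps=1e-5):
    """Central-difference gradient of a scalar score with respect to x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((score_fn(up) - score_fn(down)) / (2 * eps))
    return grad

def smoothgrad(score_fn, x, noise_std=0.1, samples=25, seed=0):
    """Average the gradient over Gaussian-perturbed copies of the input."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(samples):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        grad = numerical_gradient(score_fn, noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / samples for t in total]

# Toy stand-in for one detector output (e.g. a class score): a linear model.
toy_weights = [2.0, -1.0, 0.5]
toy_score = lambda x: sum(w * v for w, v in zip(toy_weights, x))
saliency = smoothgrad(toy_score, [0.3, 0.7, 0.1])
```

For a detector, the same procedure would be applied once per explained output, for example once for each bounding box coordinate and once for the class score, with the gradient taken with respect to the image pixels.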
Self-supervised learning has proved to be a powerful approach to learning image representations without the need for large labeled datasets. For underwater robotics, it is of great interest to design computer vision algorithms that improve perception capabilities such as sonar image classification. Due to the confidential nature of sonar imaging and the difficulty of interpreting sonar images, it is challenging to create large public labeled sonar datasets to train supervised learning algorithms. In this work, we investigate the potential of three self-supervised learning methods (RotNet, Denoising Autoencoders, and Jigsaw) to learn high-quality sonar image representations without the need for human labels. We present pre-training and transfer learning results on real-life sonar image datasets. Our results indicate that self-supervised pre-training yields classification performance comparable to supervised pre-training in a few-shot transfer learning setup across all three methods. Code and self-supervised pre-trained models are available at https://github.com/agrija9/ssl-sonar-images
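As an illustration of how a self-supervised pretext task produces labels without human annotation, the sketch below builds RotNet-style training samples: each image is rotated by 0, 90, 180, and 270 degrees, and the rotation index serves as the label. This is a generic sketch, not code from the linked repository; the list-of-lists image format and the function names are assumptions.

```python
def rotate90(image):
    """Rotate a 2D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotation_pretext_samples(image):
    """Return four (rotated_image, label) pairs, label k meaning k * 90 degrees."""
    samples = []
    current = [list(row) for row in image]
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples

# Hypothetical 2x2 "sonar patch" used only to show the data layout.
patch = [[1, 2],
         [3, 4]]
samples = rotation_pretext_samples(patch)
```

A classifier trained to predict the label from the rotated image has to learn something about image content, and its feature extractor can then be reused for the downstream classification task.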
ProtSTonKGs: A Sophisticated Transformer Trained on Protein Sequences, Text, and Knowledge Graphs
(2022)
While most approaches individually exploit unstructured data from the biomedical literature or structured data from biomedical knowledge graphs, their union can better exploit the advantages of such approaches, ultimately improving representations of biology. Using multimodal transformers for such purposes can improve performance on context-dependent classification tasks, as demonstrated by our previous model, the Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs (STonKGs). In this work, we introduce ProtSTonKGs, a transformer aimed at learning all-encompassing representations of protein-protein interactions. ProtSTonKGs presents an extension to our previous work by adding textual protein descriptions and amino acid sequences (i.e., structural information) to the text- and knowledge graph-based input sequence used in STonKGs. We benchmark ProtSTonKGs against STonKGs, resulting in improved F1 scores by up to 0.066 (i.e., from 0.204 to 0.270) in several tasks such as predicting protein interactions in several contexts. Our work demonstrates how multimodal transformers can be used to integrate heterogeneous sources of information, laying the foundation for future approaches that use multiple modalities for biomedical applications.
Vection underwater
(2022)
This edited volume on “Recent Advances in Renewable Energy” presents a selection of refereed papers presented at the 1st International Conference on Electrical Systems and Automation. The book provides rigorous discussions, the state of the art, and recent developments in the field of renewable energy sources supported by examples and case studies, making it an educational tool for relevant undergraduate and graduate courses. The book will be a valuable reference for beginners, researchers, and professionals interested in renewable energy.
This book, the second of two volumes on “Control of Electrical and Electronic Systems”, presents a compilation of selected contributions to the 1st International Conference on Electrical Systems & Automation. The book provides rigorous discussions, the state of the art, and recent developments in the modelling, simulation and control of power electronics, industrial systems, and embedded systems. The book will be a valuable reference for beginners, researchers, and professionals interested in control of electrical and electronic systems.
Contextual information is widely considered for NLP and knowledge discovery in life sciences since it highly influences the exact meaning of natural language. The scientific challenge is not only to extract such context data, but also to store this data for further query and discovery approaches. Classical approaches use RDF triple stores, which have serious limitations. Here, we propose a multi-step knowledge graph approach using labeled property graphs based on polyglot persistence systems to utilize context data for context mining, graph queries, knowledge discovery and extraction. We introduce the graph-theoretic foundation for a general context concept within semantic networks and show a proof of concept based on biomedical literature and text mining. Our test system contains a knowledge graph derived from the entirety of PubMed and SCAIView data and is enriched with text mining data and domain-specific language data using Biological Expression Language. Here, context is a more general concept than annotations. This dense graph has more than 71M nodes and 850M relationships. We discuss the impact of this novel approach with 27 real-world use cases represented by graph queries. Storing and querying a giant knowledge graph as a labeled property graph is still a technological challenge. Here, we demonstrate how our data model is able to support the understanding and interpretation of biomedical data. We present several real-world use cases that utilize our massive, generated knowledge graph derived from PubMed data and enriched with additional contextual data. Finally, we show a working example in the context of biologically relevant information using SCAIView.
Effective Neighborhood Feature Exploitation in Graph CNNs for Point Cloud Object-Part Segmentation
(2022)
Part segmentation is the task of semantic segmentation applied to objects and has a wide range of applications, from robotic manipulation to medical imaging. This work deals with the problem of part segmentation on raw, unordered point clouds of 3D objects. While pioneering works on deep learning for point clouds typically do not take advantage of the local geometric structure around individual points, subsequent methods that extract features by exploiting local geometry have not yielded significant improvements either. To investigate further, a graph convolutional network (GCN) is used in this work in an attempt to increase the effectiveness of such neighborhood feature exploitation approaches. Most previous works also focus only on segmenting complete point cloud data. Because complete point clouds are scarcely available in real-world scenarios, such approaches are impractical, and this work therefore proposes approaches to deal with partial point cloud segmentation.
In the attempt to better capture neighborhood features, this work proposes a novel method to learn regional part descriptors which guide and refine the segmentation predictions. The proposed approach helps the network achieve state-of-the-art performance of 86.4% mIoU on the ShapeNetPart dataset for methods which do not use any preprocessing techniques or voting strategies. In order to better deal with partial point clouds, this work also proposes new strategies to train and test on partial data. While achieving significant improvements compared to the baseline performance, the problem of partial point cloud segmentation is also viewed through an alternate lens of semantic shape completion.
Semantic shape completion networks not only help deal with partial point cloud segmentation but also enrich the information captured by the system by predicting complete point clouds with corresponding semantic labels for each point. To this end, a new network architecture for semantic shape completion is also proposed based on point completion network (PCN) which takes advantage of a graph convolution based hierarchical decoder for completion as well as segmentation. In addition to predicting complete point clouds, results indicate that the network is capable of reaching within a margin of 5% to the mIoU performance of dedicated segmentation networks for partial point cloud segmentation.
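The mIoU figures quoted above are averages of per-part intersection-over-union scores. A minimal sketch of that metric for a single shape follows; it is not the evaluation code used in the thesis. Counting a part that is absent from both prediction and ground truth as IoU 1.0 is a common convention for part segmentation benchmarks and is assumed here, as are the function names.

```python
def part_iou(pred, target, part):
    """IoU of one part label over the points of a single shape."""
    inter = sum(1 for p, t in zip(pred, target) if p == part and t == part)
    union = sum(1 for p, t in zip(pred, target) if p == part or t == part)
    # Assumed convention: a part missing from both labelings counts as perfect.
    return 1.0 if union == 0 else inter / union

def shape_miou(pred, target, parts):
    """Mean IoU over the part labels that belong to the shape's category."""
    return sum(part_iou(pred, target, part) for part in parts) / len(parts)

# Hypothetical 4-point shape with two parts (labels 0 and 1).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
miou = shape_miou(pred, target, parts=[0, 1])
```

In this toy case part 0 has IoU 1/2 and part 1 has IoU 2/3, so the shape-level mIoU is their mean; a dataset-level score would average such values over all shapes.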
The processing of employees’ personal data is dramatically increasing, yet there is a lack of tools that allow employees to manage their privacy. In order to develop these tools, one needs to understand what sensitive personal data are and what factors influence employees’ willingness to disclose. Current privacy research, however, lacks such insights, as it has focused on other contexts in recent decades. To fill this research gap, we conducted a cross-sectional survey with 553 employees from Germany. Our survey provides multiple insights into the relationships between perceived data sensitivity and willingness to disclose in the employment context. Among other things, we show that the perceived sensitivity of certain types of data differs substantially from existing studies in other contexts. Moreover, currently used legal and contextual distinctions between different types of data do not accurately reflect the subtleties of employees’ perceptions. Instead, using 62 different data elements, we identified four groups of personal data that better reflect the multi-dimensionality of perceptions. However, previously found common disclosure antecedents in the context of online privacy do not seem to affect them. We further identified three groups of employees that differ in their perceived data sensitivity and willingness to disclose, but neither in their privacy beliefs nor in their demographics. Our findings thus provide employers, policy makers, and researchers with a better understanding of employees’ privacy perceptions and serve as a basis for future targeted research on specific types of personal data and employees.