pub H-BRS | Search

Knowledge Base Question Answering by Transformer-Based Graph Pattern Scoring (2023)

Lamott, Marcel ; Hees, Jörn ; Ulges, Adrian

Question Answering (QA) has gained significant attention in recent years, with transformer-based models improving natural language processing. However, issues of explainability remain, as it is difficult to determine whether an answer is based on a true fact or a hallucination. Knowledge-based question answering (KBQA) methods can address this problem by retrieving answers from a knowledge graph. This paper proposes a hybrid approach to KBQA called FRED, which combines pattern-based entity retrieval with a transformer-based question encoder. The method uses an evolutionary approach to learn SPARQL patterns, which retrieve candidate entities from a knowledge base. The transformer-based regressor is then trained to estimate each pattern’s expected F1 score for answering the question, resulting in a ranking ofcandidate entities. Unlike other approaches, FRED can attribute results to learned SPARQL patterns, making them more interpretable. The method is evaluated on two datasets and yields MAP scores of up to 73 percent, with the transformer-based interpretation falling only 4 pp short of an oracle run. Additionally, the learned patterns successfully complement manually generated ones and generalize well to novel questions.

Developing OERs for Teaching Database Systems (2023)

Rakow, Thomas C. ; Kless, André ; Hasler, Charlotte ; Knolle, Harm ; Faeskorn-Woyke, Heide ; Saatz, Inga Marina ; Lambert, Jens ; Focken, Mareike

In the project EILD.nrw, Open Educational Resources (OER) have been developed for teaching databases. Lecturers can use the tools and courses in a variety of learning scenarios. Students of computer science and application subjects can learn the complete life cycle of databases. For this purpose, quizzes, interactive tools, instructional videos, and courses for learning management systems are developed and published under a Creative Commons license. We give an overview of the developed OERs according to subject, description, teaching form, and format. Following, we describe how licencing, sustainability, accessibility, contextualization, content description, and technical adaptability are implemented. The feedback of students in ongoing classes are evaluated.

Risk-Based Authentication for OpenStack: A Fully Functional Implementation and Guiding Example (2023)

Unsel, Vincent ; Wiefling, Stephan ; Gruschka, Nils ; Lo Iacono, Luigi

Online services have difficulties to replace passwords with more secure user authentication mechanisms, such as Two-Factor Authentication (2FA). This is partly due to the fact that users tend to reject such mechanisms in use cases outside of online banking. Relying on password authentication alone, however, is not an option in light of recent attack patterns such as credential stuffing. Risk-Based Authentication (RBA) can serve as an interim solution to increase password-based account security until better methods are in place. Unfortunately, RBA is currently used by only a few major online services, even though it is recommended by various standards and has been shown to be effective in scientific studies. This paper contributes to the hypothesis that the low adoption of RBA in practice can be due to the complexity of implementing it. We provide an RBA implementation for the open source cloud management software OpenStack, which is the first fully functional open source RBA implementation based on the Freeman et al. algorithm, along with initial reference tests that can serve as a guiding example and blueprint for developers.

Towards Detection of Malicious Software Packages Through Code Reuse by Malevolent Actors (2022)

Ohm, Marc ; Kempf, Lukas ; Boes, Felix ; Meier, Michael

Trojanized software packages used in software supply chain attacks constitute an emerging threat. Unfortunately, there is still a lack of scalable approaches that allow automated and timely detection of malicious software packages and thus most detections are based on manual labor and expertise. However, it has been observed that most attack campaigns comprise multiple packages that share the same or similar malicious code. We leverage that fact to automatically reproduce manually identified clusters of known malicious packages that have been used in real world attacks, thus, reducing the need for expert knowledge and manual inspection. Our approach, AST Clustering using MCL to mimic Expertise (ACME), yields promising results with a 𝐹1 score of 0.99. Signatures are automatically generated based on characteristic code fragments from clusters and are subsequently used to scan the whole npm registry for unreported malicious packages. We are able to identify and report six malicious packages that have been removed from npm consequentially. Therefore, our approach can support the detection by reducing manual labor and hence may be employed by maintainers of package repositories to detect possible software supply chain attacks through trojanized software packages.

Data Protection Officers' Perspectives on Privacy Challenges in Digital Ecosystems (2023)

Wiefling, Stephan ; Tolsdorf, Jan ; Lo Iacono, Luigi

Digital ecosystems are driving the digital transformation of business models. Meanwhile, the associated processing of personal data within these complex systems poses challenges to the protection of individual privacy. In this paper, we explore these challenges from the perspective of digital ecosystems' platform providers. To this end, we present the results of an interview study with seven data protection officers representing a total of 12 digital ecosystems in Germany. We identified current and future challenges for the implementation of data protection requirements, covering issues on legal obligations and data subject rights. Our results support stakeholders involved in the implementation of privacy protection measures in digital ecosystems, and form the foundation for future privacy-related studies tailored to the specifics of digital ecosystems.

A Deep Dive Into Exploring the Preference Hypervolume (2020)

Hagg, Alexander ; Asteroth, Alexander ; Bäck, Thomas

Computers can help us to trigger our intuition about how to solve a problem. But how does a computer take into account what a user wants and update these triggers? User preferences are hard to model as they are by nature vague, depend on the user’s background and are not always deterministic, changing depending on the context and process under which they were established. We pose that the process of preference discovery should be the object of interest in computer aided design or ideation. The process should be transparent, informative, interactive and intuitive. We formulate Hyper-Pref, a cyclic co-creative process between human and computer, which triggers the user’s intuition about what is possible and is updated according to what the user wants based on their decisions. We combine quality diversity algorithms, a divergent optimization method that can produce many, diverse solutions, with variational autoencoders to both model that diversity as well as the user’s preferences, discovering the preference hypervolume within large search spaces.

Design and Evaluation of a GPU Streaming Framework for Visualizing Time-Varying AMR Data (2022)

Zellmann, Stefan ; Wald, Ingo ; Sahistan, Alper ; Hellmann, Matthias ; Usher, Will

We describe a systematic approach for rendering time-varying simulation data produced by exa-scale simulations, using GPU workstations. The data sets we focus on use adaptive mesh refinement (AMR) to overcome memory bandwidth limitations by representing interesting regions in space with high detail. Particularly, our focus is on data sets where the AMR hierarchy is fixed and does not change over time. Our study is motivated by the NASA Exajet, a large computational fluid dynamics simulation of a civilian cargo aircraft that consists of 423 simulation time steps, each storing 2.5 GB of data per scalar field, amounting to a total of 4 TB. We present strategies for rendering this time series data set with smooth animation and at interactive rates using current generation GPUs. We start with an unoptimized baseline and step by step extend that to support fast streaming updates. Our approach demonstrates how to push current visualization workstations and modern visualization APIs to their limits to achieve interactive visualization of exa-scale time series data sets.

End to End Global Horizontal Irradiance Estimation Through Pre-trained Deep Learning Models Using All-Sky-Images (2022)

Chaaraoui, Samer ; Houben, Sebastian ; Meilinger, Stefanie

The accurate forecasting of solar radiation plays an important role for predictive control applications for energy systems with a high share of photovoltaic (PV) energy. Especially off-grid microgrid applications using predictive control applications can benefit from forecasts with a high temporal resolution to address sudden fluctuations of PV-power. However, cloud formation processes and movements are subject to ongoing research. For now-casting applications, all-sky-imagers (ASI) are used to offer an appropriate forecasting for aforementioned application. Recent research aims to achieve these forecasts via deep learning approaches, either as an image segmentation task to generate a DNI forecast through a cloud vectoring approach to translate the DNI to a GHI with ground-based measurement (Fabel et al., 2022; Nouri et al., 2021), or as an end-to-end regression task to generate a GHI forecast directly from the images (Paletta et al., 2021; Yang et al., 2021). While end-to-end regression might be the more attractive approach for off-grid scenarios, literature reports increased performance compared to smart-persistence but do not show satisfactory forecasting patterns (Paletta et al., 2021). This work takes a step back and investigates the possibility to translate ASI-images to current GHI to deploy the neural network as a feature extractor. An ImageNet pre-trained deep learning model is used to achieve such translation on an openly available dataset by the University of California San Diego (Pedro et al., 2019). The images and measurements were collected in Folsom, California. Results show that the neural network can successfully translate ASI-images to GHI for a variety of cloud situations without the need of any external variables. Extending the neural network to a forecasting task also shows promising forecasting patterns, which shows that the neural network extracts both temporal and momentarily features within the images to generate GHI forecasts.

ProtSTonKGs: A Sophisticated Transformer Trained on Protein Sequences, Text, and Knowledge Graphs (2022)

Balabin, Helena ; Hoyt, Charles Tapley ; Gyori, Benjamin M. ; Bachman, John ; Tom Kodamullil, Alpha ; Hofmann-Apitius, Martin ; Domingo-Fernández, Daniel

While most approaches individually exploit unstructured data from the biomedical literature or structured data from biomedical knowledge graphs, their union can better exploit the advantages of such approaches, ultimately improving representations of biology. Using multimodal transformers for such purposes can improve performance on context dependent classication tasks, as demonstrated by our previous model, the Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs (STonKGs). In this work, we introduce ProtSTonKGs, a transformer aimed at learning all-encompassing representations of protein-protein interactions. ProtSTonKGs presents an extension to our previous work by adding textual protein descriptions and amino acid sequences (i.e., structural information) to the text- and knowledge graph-based input sequence used in STonKGs. We benchmark ProtSTonKGs against STonKGs, resulting in improved F1 scores by up to 0.066 (i.e., from 0.204 to 0.270) in several tasks such as predicting protein interactions in several contexts. Our work demonstrates how multimodal transformers can be used to integrate heterogeneous sources of information, paving the foundation for future approaches that use multiple modalities for biomedical applications.

Maximum Likelihood Uncertainty Estimation: Robustness to Outliers (2022)

Nair, Deebul S. ; Hochgeschwender, Nico ; Olivares-Mendez, Miguel A.

We benchmark the robustness of maximum likelihood based uncertainty estimation methods to outliers in training data for regression tasks. Outliers or noisy labels in training data results in degraded performances as well as incorrect estimation of uncertainty. We propose the use of a heavy-tailed distribution (Laplace distribution) to improve the robustness to outliers. This property is evaluated using standard regression benchmarks and on a high-dimensional regression task of monocular depth estimation, both containing outliers. In particular, heavy-tailed distribution based maximum likelihood provides better uncertainty estimates, better separation in uncertainty for out-of-distribution data, as well as better detection of adversarial attacks in the presence of outliers.

Digitale Verbraucherteilhabe bei Blockchain-Anwendungen (2021)

Hünseler, Marco ; Lemke-Rust, Kerstin ; Pöll, Eva ; Stoppenbrink, Katja

Die Blockchain-Technologie ist einer der großen Innovationstreiber der letzten Jahre. Mit einer zugrundeliegenden Blockchain-Technologie ist auch der Betrieb von verteilten Anwendungen, sogenannter Decentralized Applications (DApps), bereits technisch umsetzbar. Dieser Beitrag verfolgt das Ziel, Gestaltungsmöglichkeiten der digitalen Verbraucherteilhabe an Blockchain-Anwendungen zu untersuchen. Hierzu enthält der Beitrag eine Einführung in die digitale Verbraucherteilhabe und die technischen Grundlagen und Eigenschaften der Blockchain-Technologie, einschließlich darauf basierender DApps. Abschließend werden technische, ethisch-organisatorische, rechtliche und sonstige Anforderungsbereiche für die Umsetzung von digitaler Verbraucherteilhabe in Blockchain-Anwendungen adressiert.

What's in Score for Website Users: A Data-Driven Long-Term Study on Risk-Based Authentication Characteristics (2021)

Wiefling, Stephan ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based authentication (RBA) aims to strengthen password-based authentication rather than replacing it. RBA does this by monitoring and recording additional features during the login process. If feature values at login time differ significantly from those observed before, RBA requests an additional proof of identification. Although RBA is recommended in the NIST digital identity guidelines, it has so far been used almost exclusively by major online services. This is partly due to a lack of open knowledge and implementations that would allow any service provider to roll out RBA protection to its users. To close this gap, we provide a first in-depth analysis of RBA characteristics in a practical deployment. We observed N=780 users with 247 unique features on a real-world online service for over 1.8 years. Based on our collected data set, we provide (i) a behavior analysis of two RBA implementations that were apparently used by major online services in the wild, (ii) a benchmark of the features to extract a subset that is most suitable for RBA use, (iii) a new feature that has not been used in RBA before, and (iv) factors which have a significant effect on RBA performance. Our results show that RBA needs to be carefully tailored to each online service, as even small configuration adjustments can greatly impact RBA's security and usability properties. We provide insights on the selection of features, their weightings, and the risk classification in order to benefit from RBA after a minimum number of login attempts.

Privacy Considerations for Risk-Based Authentication Systems (2021)

Wiefling, Stephan ; Tolsdorf, Jan ; Lo Iacono, Luigi

Risk-based authentication (RBA) extends authentication mechanisms to make them more robust against account takeover attacks, such as those using stolen passwords. RBA is recommended by NIST and NCSC to strengthen password-based authentication, and is already used by major online services. Also, users consider RBA to be more usable than two-factor authentication and just as secure. However, users currently obtain RBA's high security and usability benefits at the cost of exposing potentially sensitive personal data (e.g., IP address or browser information). This conflicts with user privacy and requires to consider user rights regarding the processing of personal data. We outline potential privacy challenges regarding different attacker models and propose improvements to balance privacy in RBA systems. To estimate the properties of the privacy-preserving RBA enhancements in practical environments, we evaluated a subset of them with long-term data from 780 users of a real-world online service. Our results show the potential to increase privacy in RBA solutions. However, it is limited to certain parameters that should guide RBA design to protect privacy. We outline research directions that need to be considered to achieve a widespread adoption of privacy preserving RBA with high user acceptance.

Grammar-Constrained Neural Semantic Parsing with LR Parsers (2021)

Baranowski, Artur ; Hochgeschwender, Nico

Target meaning representations for semantic parsing tasks are often based on programming or query languages, such as SQL, and can be formalized by a context-free grammar. Assuming a priori knowledge of the target domain, such grammars can be exploited to enforce syntactical constraints when predicting logical forms. To that end, we assess how syntactical parsers can be integrated into modern encoder-decoder frameworks. Specifically, we implement an attentional SEQ2SEQ model that uses an LR parser to maintain syntactically valid sequences throughout the decoding procedure. Compared to other approaches to grammar-guided decoding that modify the underlying neural network architecture or attempt to derive full parse trees, our approach is conceptually simpler, adds less computational overhead during inference and integrates seamlessly with current SEQ2SEQ frameworks. We present preliminary evaluation results against a recurrent SEQ2SEQ baseline on GEOQUERY and ATIS and demonstrate improved performance while enforcing grammatical constraints.

Less is Often More: Header Whitelisting as Semantic Gap Mitigation in HTTP-Based Software Systems (2021)

Büttner, Andre ; Nguyen, Hoai Viet ; Gruschka, Nils ; Lo Iacono, Luigi

The web is the most wide-spread digital system in the world and is used for many crucial applications. This makes web application security extremely important and, although there are already many security measures, new vulnerabilities are constantly being discovered. One reason for some of the recent discoveries lies in the presence of intermediate systems—e.g. caches, message routers, and load balancers—on the way between a client and a web application server. The implementations of such intermediaries may interpret HTTP messages differently, which leads to a semantically different understanding of the same message. This so-called semantic gap can cause weaknesses in the entire HTTP message processing chain. In this paper we introduce the header whitelisting (HWL) approach to address the semantic gap in HTTP message processing pipelines. The basic idea is to normalize and reduce an HTTP request header to the minimum required fields using a whitelist before processing it in an intermediary or on the server, and then restore the original request for the next hop. Our results show that HWL can avoid misinterpretations of HTTP messages in the different components and thus prevent many attacks rooted in a semantic gap including request smuggling, cache poisoning, and authentication bypass.

XML Signature Wrapping Still Considered Harmful: A Case Study on the Personal Health Record in Germany (2021)

Höller, Paul ; Krumeich, Alexander ; Lo Iacono, Luigi

XML Signature Wrapping (XSW) has been a relevant threat to web services for 15 years until today. Using the Personal Health Record (PHR), which is currently under development in Germany, we investigate a current SOAP-based web services system as a case study. In doing so, we highlight several deficiencies in defending against XSW. Using this real-world contemporary example as motivation, we introduce a guideline for more secure XML signature processing that provides practitioners with easier access to the effective countermeasures identified in the current state of research.

Evaluation of Account Recovery Strategies with FIDO2-based Passwordless Authentication (2021)

Kunke, Johannes ; Wiefling, Stephan ; Ullmann, Markus ; Lo Iacono, Luigi

Threats to passwords are still very relevant due to attacks like phishing or credential stuffing. One way to solve this problem is to remove passwords completely. User studies on passwordless FIDO2 authentication using security tokens demonstrated the potential to replace passwords. However, widespread acceptance of FIDO2 depends, among other things, on how user accounts can be recovered when the security token becomes permanently unavailable. For this reason, we provide a heuristic evaluation of 12 account recovery mechanisms regarding their properties for FIDO2 passwordless authentication. Our results show that the currently used methods have many drawbacks. Some even rely on passwords, taking passwordless authentication ad absurdum. Still, our evaluation identifies promising account recovery solutions and provides recommendations for further studies.

More Than Just Good Passwords? A Study on Usability and Security Perceptions of Risk-based Authentication (2020)

Wiefling, Stephan ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based Authentication (RBA) is an adaptive security measure to strengthen password-based authentication. RBA monitors additional features during login, and when observed feature values differ significantly from previously seen ones, users have to provide additional authentication factors such as a verification code. RBA has the potential to offer more usable authentication, but the usability and the security perceptions of RBA are not studied well. We present the results of a between-group lab study (n=65) to evaluate usability and security perceptions of two RBA variants, one 2FA variant, and password-only authentication. Our study shows with significant results that RBA is considered to be more usable than the studied 2FA variants, while it is perceived as more secure than password-only authentication in general and comparably secure to 2FA in a variety of application types. We also observed RBA usability problems and provide recommendations for mitigation. Our contribution provides a first deeper understanding of the users' perception of RBA and helps to improve RBA implementations for a broader user acceptance.

Evaluation of Risk-based Re-Authentication Methods (2020)

Wiefling, Stephan ; Patil, Tanvi ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based Authentication (RBA) is an adaptive security measure that improves the security of password-based authentication by protecting against credential stuffing, password guessing, or phishing attacks. RBA monitors extra features during login and requests for an additional authentication step if the observed feature values deviate from the usual ones in the login history. In state-of-the-art RBA re-authentication deployments, users receive an email with a numerical code in its body, which must be entered on the online service. Although this procedure has a major impact on RBA's time exposure and usability, these aspects were not studied so far. We introduce two RBA re-authentication variants supplementing the de facto standard with a link-based and another code-based approach. Then, we present the results of a between-group study (N=592) to evaluate these three approaches. Our observations show with significant results that there is potential to speed up the RBA re-authentication process without reducing neither its security properties nor its security perception. The link-based re-authentication via "magic links", however, makes users significantly more anxious than the code-based approaches when perceived for the first time. Our evaluations underline the fact that RBA re-authentication is not a uniform procedure. We summarize our findings and provide recommendations.

Analyzing power consumption of TLS ciphers on an ESP32 (2019)

Fischer, Tilo ; Linka, Hendrik ; Rademacher, Michael ; Jonas, Karl ; Loebenberger, Daniel

More and more devices will be connected to the internet [3]. Many devicesare part of the so-called Internet of Things (IoT) which contains many low-powerdevices often powered by a battery. These devices mainly communicate with the manufacturers back-end and deliver personal data and secrets like passwords.

Quantifying Interference in WiLD Networks using Topography Data and Realistic Antenna Patterns (2019)

Rademacher, Michael ; Kretschmer, Mathias ; Jonas, Karl

Avoiding possible interference is a key aspect to maximize the performance in Wi-Fi based Long Distance networks. In this paper we quantify self-induced interference based on data derived from our testbed and match the findings against simulations. By enhancing current simulation models with two key elements we significantly reduce the deviation between testbed and simulation: the usage of detailed antenna patterns compared to the cone model and propagation modeling enhanced by license-free topography data. Based on the gathered data we discuss several possible optimization approaches such as physical separation of local radios, tuning the sensitivity of the transmitter and using centralized compared to distributed channel assignment algorithms. While our testbed is based on 5 GHz Wi-Fi, we briefly discuss the possible impact of our results to other frequency bands.

Sicherheitsanalyse von Bluetooth-Low-Energy-Geräten in der Heimautomatisierung (2019)

Fröhlich, Kevin ; Rademacher, Michael ; Jonas, Karl

Verschiedene intelligente Heimautomatisierungsgeräte wie Lampen, Schlösser und Thermostate verbreiten sich rasant im privaten Umfeld. Ein typisches Kommunikationsprotokoll für diese Geräteklasse ist Bluetooth Low Energy (BLE). In dieser Arbeit wird eine strukturierte Sicherheitsanalyse für BLE vorgestellt. Die beschriebene Vorgehensweise kategorisiert bekannte Angriffsvektoren und beschreibt einen möglichen Aufbau für eine Analyse. Im Zuge dieser Arbeit wurden einige sicherheitsrelevante Probleme aufgedeckt, die es Angreifern ermöglichen die Geräte vollständig zu übernehmen. Es zeigte sich, dass im Standard vorgesehene Sicherheitsfunktionen wie Verschlüsselung und Integritätsprüfungen häufig gar nicht oder fehlerhaft implementiert sind.

Challenges in Using OSM for Robotic Applications (2018)

Huebel, Nico ; Blumenthal, Sebastian ; Naik, Lakshadeep ; Bruyninckx, Herman

This work discusses how to use OSM for robotic applications and aims at starting a discussion between the OSM and the robotics community. OSM contains much topological and semantic information that can be directly used in robotics and offers various advantages: 1) Standardized format with existing tooling. 2) The graph structure allows to compose the OSM models with domain-specific semantics by adding custom nodes, relations, and key-value pairs. 3) Information about many places is already available and can be used by robots since it is driven by a community effort.

A Random Number Generator Based on Electronic Noise and the Xorshift Algorithm (2018)

Ewert, Mandana

This paper introduces a random number generator (RNG) based on the avalanche noise of two diodes. A true random number generator (TRNG) generates true random numbers with the use of the electronic noise produced by two avalanche diodes. The amplified outputs of the diodes are sampled and digitized. The difference between the two concurrently sampled and digitized outputs is calculated and used to select a seed and to drive a pseudo-random number generator (PRNG). The PRNG is an xorshift generator that generates 1024 bits in each cycle. Every sequence of 1024 bits is moderately modified and output. The TRNG delivers the next seed and the next cycle begins. The statistical behavior of the generator is analyzed and presented.

Quantifying the spectrum occupancy in an outdoor 5 GHz WiFi network with directional antennas (2018)

Rademacher, Michael ; Jonas, Karl ; Kretschmer, Mathias

WiFi-based Long Distance networks are seen as a promising alternative for bringing broadband connectivity to rural areas. A key factor for the profitability of these networks is using license free bands. This work quantifies the current spectrum occupancy in our testbed, which covers rural and urban areas alike. The data mining is conducted on the same WiFi card and in parallel with an operational network. The presented evaluations reveal tendencies for various aspects: occupancy compared to population density, occupancy fluctuations, (joint)-vacant channels, the mean channel vacant duration, different approaches to model/forecast occupancy, and correlations among related interfaces.

Open Access

Refine

H-BRS Bibliography

Departments, institutes and facilities

Document Type

Year of publication

Language

Has Fulltext

Keywords

32 search hits