pub H-BRS | 005 Computer programming, programs, data

„Im Wohnzimmer kriegt die schon alles mit“ – Sprachassistentendaten im Alltag (2021)

Voice assistants such as Alexa or Google Assistant have become an integral part of many consumers' everyday lives. They appeal above all through voice-based and therefore hands-free control and, at times, through their entertaining character. As the centers of domestic life, the living room and the kitchen are the most common places to set them up, since household members spend most of their time there and everyday life takes place there. However, this also means that a large amount of data not intended for the voice assistant can potentially be captured and collected in these places. Consequently, it cannot be ruled out that the voice assistant is activated, even if accidentally, by conversations or noises and stores recordings, even when the activation is triggered unknowingly by people present or by other devices (e.g., a television) or comes from other rooms. As part of a research project, we surveyed users about how they use and where they place their voice assistants, and we also tested a prototype that makes the stored interactions with the voice assistant visible. Based on the findings from the interviews and on guidelines derived from the subsequent usage tests of the prototype, this contribution presents an application for requesting and visualizing the interaction data held by the voice assistant. The application makes it possible to display interactions and the situation surrounding them by showing the time, the device used, and the command for each interaction, and by making unexpected behavior such as accidental or false activation visible. In this way, we want to raise consumers' awareness of how error-prone these devices are and enable more self-determined and more secure use.

XML Signature Wrapping Still Considered Harmful: A Case Study on the Personal Health Record in Germany (2021)

Höller, Paul ; Krumeich, Alexander ; Lo Iacono, Luigi

XML Signature Wrapping (XSW) has been a relevant threat to web services for 15 years until today. Using the Personal Health Record (PHR), which is currently under development in Germany, we investigate a current SOAP-based web services system as a case study. In doing so, we highlight several deficiencies in defending against XSW. Using this real-world contemporary example as motivation, we introduce a guideline for more secure XML signature processing that provides practitioners with easier access to the effective countermeasures identified in the current state of research.

What's in Score for Website Users: A Data-Driven Long-Term Study on Risk-Based Authentication Characteristics (2021)

Wiefling, Stephan ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based authentication (RBA) aims to strengthen password-based authentication rather than replacing it. RBA does this by monitoring and recording additional features during the login process. If feature values at login time differ significantly from those observed before, RBA requests an additional proof of identification. Although RBA is recommended in the NIST digital identity guidelines, it has so far been used almost exclusively by major online services. This is partly due to a lack of open knowledge and implementations that would allow any service provider to roll out RBA protection to its users. To close this gap, we provide a first in-depth analysis of RBA characteristics in a practical deployment. We observed N=780 users with 247 unique features on a real-world online service for over 1.8 years. Based on our collected data set, we provide (i) a behavior analysis of two RBA implementations that were apparently used by major online services in the wild, (ii) a benchmark of the features to extract a subset that is most suitable for RBA use, (iii) a new feature that has not been used in RBA before, and (iv) factors which have a significant effect on RBA performance. Our results show that RBA needs to be carefully tailored to each online service, as even small configuration adjustments can greatly impact RBA's security and usability properties. We provide insights on the selection of features, their weightings, and the risk classification in order to benefit from RBA after a minimum number of login attempts.
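
The mechanism described in this abstract (record features at login, compare them with earlier values, ask for additional proof when they differ significantly) can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the class `LoginHistory`, the feature names, the weights, and the threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical feature weights; the abstract does not specify any values.
WEIGHTS = {"ip": 0.6, "user_agent": 0.3, "language": 0.1}
THRESHOLD = 0.5  # illustrative cut-off, not taken from the paper


class LoginHistory:
    """Stores how often each feature value was seen for each user."""

    def __init__(self):
        self._seen = defaultdict(lambda: defaultdict(Counter))
        self._logins = Counter()

    def record(self, user, features):
        """Remember the feature values of one successful login."""
        self._logins[user] += 1
        for name, value in features.items():
            self._seen[user][name][value] += 1

    def risk_score(self, user, features):
        """Weighted share of feature values that are unusual for this user."""
        total = self._logins[user]
        if total == 0:
            return 1.0  # no history yet: treat the login as unfamiliar
        score = 0.0
        for name, weight in WEIGHTS.items():
            familiarity = self._seen[user][name][features.get(name)] / total
            score += weight * (1.0 - familiarity)
        return score

    def needs_extra_proof(self, user, features):
        """True if the login should trigger an additional identity check."""
        return self.risk_score(user, features) >= THRESHOLD


history = LoginHistory()
usual = {"ip": "203.0.113.7", "user_agent": "Firefox", "language": "de"}
for _ in range(4):
    history.record("alice", usual)

print(history.needs_extra_proof("alice", usual))
print(history.needs_extra_proof("alice", {**usual, "ip": "198.51.100.9"}))
```

Even in this toy version, changing a weight or the threshold changes which logins are challenged, which is in line with the abstract's remark that small configuration adjustments can have a large effect.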

Warum wir parteiische Datentreuhänder brauchen (2022)

Stevens, Gunnar ; Boden, Alexander

Technical progress in the collection, storage, and processing of data makes it necessary to raise new questions about socially acceptable data markets. There is both a tendency toward simplified data sharing and a demand for better protection of informational self-determination. The idea of data trustees operates within this field of tension. The aim of this contribution is to show that different forms of data trusteeship should be distinguished in order to do justice to the complexity of the topic. In particular, alongside multi-sided trusteeship, in which the trustee acts as a neutral party, there is also a need for one-sided trusteeship, in which the trustee acts as an advocate of consumer interests. From this perspective, the model of data trusteeship is systematically developed as a representative interpretation of the interests of individual and collective identities.

Von Personal Information Management zu Vendor Management Software (2022)

Dethier, Erik ; Pakusch, Christina ; Boden, Alexander

Personal information management systems (PIMS) are regarded as an opportunity to strengthen consumers' data sovereignty. Data protection questions are relevant to consumers wherever they enter into contracts and terms of use with service providers. Against this background, this contribution discusses the potential of VRM systems, which support not only data management but consumers' entire contract management. In doing so, we examine whether these systems are better suited to empowering consumers to act in a sovereign manner.

Verify It's You: How Users Perceive Risk-based Authentication (2021)

Wiefling, Stephan ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based authentication (RBA) is an adaptive security measure to strengthen password-based authentication against account takeover attacks. Our study on 65 participants shows that users find RBA more usable than two-factor authentication equivalents and more secure than password-only authentication. We identify pitfalls and provide guidelines for putting RBA into practice.

Vergleich von Open Source MLOps Tools zur Unterstützung von Machine Learning basierten Zeitreihenanalysen (2024)

Autenrieth, Erik

Machine learning (ML) projects, particularly in the area of time series analysis, are becoming increasingly important. Deploying such projects in a production environment with the same degree of automation as classical software projects is a complex undertaking. Deployment in production environments requires, in addition to classical DevOps, Machine Learning Operations (MLOps) technologies and tools. The aim of this study is to provide a comprehensive overview of available MLOps tools and to develop a specific tech stack for time series ML projects. Current trends and tools in the MLOps field are examined and analyzed by means of a multivocal literature review (MLR). The study identifies suitable MLOps tools and methods for time series analysis and presents a specific implementation of an MLOps pipeline for stock price forecasting of the S&P 500. MLOps and DevOps tools play an essential role in the effective construction and management of ML pipelines. Selecting suitable tools always requires specific adaptation to the requirements of the respective project. A detailed account of the current MLOps tool landscape proves to be a valuable resource here, enabling developers to optimize the efficiency and effectiveness of their ML projects.

Verbraucherdatenschutz – Technik und Regulation zur Unterstützung des Individuums (2021)

The first Verbraucherforum für Verbraucherinformatik (consumer forum for consumer informatics) took place at Hochschule Bonn-Rhein-Sieg on Thursday, 23 September 2021. During the one-day online event, more than 30 participants discussed topics and ideas relating to consumer data protection. Contributions came from computer science, the consumer and social sciences, and the regulatory perspective. The following contribution presents the background of the event and reports on the content of the talks as well as starting points for the further establishment of consumer informatics. The event was organized by the Institut für Verbraucherinformatik at H-BRS in cooperation with the Chair of IT Security at the University of Siegen and the Kompetenzzentrum Verbraucherforschung NRW of Verbraucherzentrale NRW e. V., with funding from the Federal Ministry of Justice and Consumer Protection.

Verbraucherdatenschutz – Hintergrund und Einführung (2021)

Boden, Alexander ; Jakobi, Timo ; Stevens, Gunnar ; Bala, Christian

The first Verbraucherforum für Verbraucherinformatik (consumer forum for consumer informatics) took place at Hochschule Bonn-Rhein-Sieg on Thursday, 23 September 2021. During the one-day online event, more than 30 participants discussed topics and ideas relating to consumer data protection. Contributions came from computer science, the consumer and social sciences, and the regulatory perspective. The following contribution presents the background of the event and reports on the content of the talks as well as starting points for the further establishment of consumer informatics. The event was organized by the Institut für Verbraucherinformatik at H-BRS in cooperation with the Chair of IT Security at the University of Siegen and the Kompetenzzentrum Verbraucherforschung NRW of Verbraucherzentrale NRW e. V., with funding from the Federal Ministry of Justice and Consumer Protection.

Trust your guts: fostering embodied knowledge and sustainable practices through voice interaction (2023)

Esau, Margarita ; Lawo, Dennis ; Neifer, Thomas ; Stevens, Gunnar ; Boden, Alexander

Despite various attempts to prevent food waste and motivate conscious food handling, household members find it difficult to correctly assess the edibility of food. With the rise of ambient voice assistants, we did a design case study to support households’ in situ decision-making process in collaboration with our voice agent prototype, Fischer Fritz. Therefore, we conducted 15 contextual inquiries to understand food practices at home. Furthermore, we interviewed six fish experts to inform the design of our voice agent on how to guide consumers and teach food literacy. Finally, we created a prototype and discussed with 15 consumers its impact and capability to convey embodied knowledge to the human that is engaged as sensor. Our design research goes beyond current Human-Food Interaction automation approaches by emphasizing the human-food relationship in technology design and demonstrating future complementary human-agent collaboration with the aim to increase humans’ competence to sense, think, and act.

TreeSatAI Benchmark Archive : a multi-sensor, multi-label dataset for tree species classification in remote sensing (2023)

Ahlswede, Steve ; Schulz, Christian ; Gava, Christiano ; Helber, Patrick ; Bischke, Benjamin ; Förster, Michael ; Arias, Florencia ; Hees, Jörn ; Demir, Begüm ; Kleinschmit, Birgit

Airborne and spaceborne platforms are the primary data sources for large-scale forest mapping, but visual interpretation for individual species determination is labor-intensive. Hence, various studies focusing on forests have investigated the benefits of multiple sensors for automated tree species classification. However, transferable deep learning approaches for large-scale applications are still lacking. This gap motivated us to create a novel dataset for tree species classification in central Europe based on multi-sensor data from aerial, Sentinel-1 and Sentinel-2 imagery. In this paper, we introduce the TreeSatAI Benchmark Archive, which contains labels of 20 European tree species (i.e., 15 tree genera) derived from forest administration data of the federal state of Lower Saxony, Germany. We propose models and guidelines for the application of the latest machine learning techniques for the task of tree species classification with multi-label data. Finally, we provide various benchmark experiments showcasing the information which can be derived from the different sensors including artificial neural networks and tree-based machine learning methods. We found that residual neural networks (ResNet) perform sufficiently well with weighted precision scores up to 79 % only by using the RGB bands of aerial imagery. This result indicates that the spatial content present within the 0.2 m resolution data is very informative for tree species classification. With the incorporation of Sentinel-1 and Sentinel-2 imagery, performance improved marginally. However, the sole use of Sentinel-2 still allows for weighted precision scores of up to 74 % using either multi-layer perceptron (MLP) or Light Gradient Boosting Machine (LightGBM) models. Since the dataset is derived from real-world reference data, it contains high class imbalances. We found that this dataset attribute negatively affects the models' performances for many of the underrepresented classes (i.e., scarce tree species). However, the class-wise precision of the best-performing late fusion model still reached values ranging from 54 % (Acer) to 88 % (Pinus). Based on our results, we conclude that deep learning techniques using aerial imagery could considerably support forestry administration in the provision of large-scale tree species maps at a very high resolution to plan for challenges driven by global environmental change. The original dataset used in this paper is shared via Zenodo (https://doi.org/10.5281/zenodo.6598390, Schulz et al., 2022). For citation of the dataset, we refer to this article.

Sicherheit der Verbraucher in vernetzten Fahrzeugen (2016)

Lemke-Rust, Kerstin

This contribution examines the state of development in vehicle networking from an IT security perspective. Established communication systems and traffic telematics applications in the automobile are presented and discussed, as are the future communication technologies Car-2-Car and Car-2-X. IT security in the automobile is a difficult field, because it involves integrating new, innovative applications into a highly complex existing vehicle architecture without creating new hazards for the vehicle occupants. In addition, the way these applications work, and their effects on the right to informational self-determination, often remains opaque. The concluding discussion gives recommendations for action from the consumer's point of view.

Science Track FrOSCon 2018 (2021)

Science Track FrOSCon 2016 (2018)

In 2015, the Free and Open Source Software Conference celebrated its 10th anniversary. Born from an idea of students, research staff, and professors of the Department of Computer Science, it has developed into one of the most important conferences on free and open source software in Germany.

Risk-Based Authentication for OpenStack: A Fully Functional Implementation and Guiding Example (2023)

Unsel, Vincent ; Wiefling, Stephan ; Gruschka, Nils ; Lo Iacono, Luigi

Online services have difficulty replacing passwords with more secure user authentication mechanisms, such as Two-Factor Authentication (2FA). This is partly due to the fact that users tend to reject such mechanisms in use cases outside of online banking. Relying on password authentication alone, however, is not an option in light of recent attack patterns such as credential stuffing. Risk-Based Authentication (RBA) can serve as an interim solution to increase password-based account security until better methods are in place. Unfortunately, RBA is currently used by only a few major online services, even though it is recommended by various standards and has been shown to be effective in scientific studies. This paper contributes to the hypothesis that the low adoption of RBA in practice can be due to the complexity of implementing it. We provide an RBA implementation for the open source cloud management software OpenStack, which is the first fully functional open source RBA implementation based on the Freeman et al. algorithm, along with initial reference tests that can serve as a guiding example and blueprint for developers.

Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service (2023)

Wiefling, Stephan ; Jørgensen, Paul René ; Thunem, Sigurd ; Lo Iacono, Luigi

Risk-based authentication (RBA) aims to protect users against attacks involving stolen passwords. RBA monitors features during login, and requests re-authentication when feature values widely differ from those previously observed. It is recommended by various national security organizations, and users perceive it more usable than and equally secure to equivalent two-factor authentication. Despite that, RBA is still used by very few online services. Reasons for this include a lack of validated open resources on RBA properties, implementation, and configuration. This effectively hinders the RBA research, development, and adoption progress. To close this gap, we provide the first long-term RBA analysis on a real-world large-scale online service. We collected feature data of 3.3 million users and 31.3 million login attempts over more than 1 year. Based on the data, we provide (i) studies on RBA’s real-world characteristics plus its configurations and enhancements to balance usability, security, and privacy; (ii) a machine learning–based RBA parameter optimization method to support administrators finding an optimal configuration for their own use case scenario; (iii) an evaluation of the round-trip time feature’s potential to replace the IP address for enhanced user privacy; and (iv) a synthesized RBA dataset to reproduce this research and to foster future RBA research. Our results provide insights on selecting an optimized RBA configuration so that users profit from RBA after just a few logins. The open dataset enables researchers to study, test, and improve RBA for widespread deployment in the wild.

Probing Integrated Household Information Systems for Integrated Food Practices (2021)

Lawo, Dennis ; Esau, Margarita ; Neifer, Thomas ; Stevens, Gunnar

Recent publications propose concepts of systems that integrate the various services and data sources of everyday food practices. However, this research does not go beyond the conceptualization of such systems. Therefore, there is a deficit in understanding how to combine different services and data sources and which design challenges arise from building integrated Household Information Systems. In this paper, we probed the design of an Integrated Household Information System with 13 participants. The results point towards more personalization, automatization of storage administration and enabling flexible artifact ecologies. Our paper contributes to understanding the design and usage of Integrated Household Information Systems, as a new class of information systems for HCI research.

Privacy Considerations for Risk-Based Authentication Systems (2021)

Wiefling, Stephan ; Tolsdorf, Jan ; Lo Iacono, Luigi

Risk-based authentication (RBA) extends authentication mechanisms to make them more robust against account takeover attacks, such as those using stolen passwords. RBA is recommended by NIST and NCSC to strengthen password-based authentication, and is already used by major online services. Also, users consider RBA to be more usable than two-factor authentication and just as secure. However, users currently obtain RBA's high security and usability benefits at the cost of exposing potentially sensitive personal data (e.g., IP address or browser information). This conflicts with user privacy and requires consideration of user rights regarding the processing of personal data. We outline potential privacy challenges regarding different attacker models and propose improvements to balance privacy in RBA systems. To estimate the properties of the privacy-preserving RBA enhancements in practical environments, we evaluated a subset of them with long-term data from 780 users of a real-world online service. Our results show the potential to increase privacy in RBA solutions. However, it is limited to certain parameters that should guide RBA design to protect privacy. We outline research directions that need to be considered to achieve a widespread adoption of privacy-preserving RBA with high user acceptance.

Morning Routines between Calm and Engaging: Designing a Smart Mirror (2021)

Esau, Margarita ; Lawo, Dennis ; Castelli, Nico ; Jakobi, Timo ; Stevens, Gunnar

Frequently the main purpose of domestic artifacts equipped with smart sensors is to hide technology, like previous examples of a Smart Mirror show. However, current Smart Homes often fail to provide meaningful IoT applications for all residents’ needs. To design beyond efficiency and productivity, we propose to realize the potential of the traditional artifact for calm and engaging experiences. Therefore, we followed a design case study approach with 22 participants in total. After an initial focus group, we conducted a diary study to examine home routines and developed a conceptual design. The evaluation of our mid-fidelity prototype shows that we need to carefully study the practices of the residents to leverage the physical material of the artifact to fit the routines. Our Smart Mirror, enhanced by digital qualities, supports meaningful activities and makes the bathroom more appealing. Thereby, we discuss domestic technology design beyond automation.

More Than Just Good Passwords? A Study on Usability and Security Perceptions of Risk-based Authentication (2020)

Wiefling, Stephan ; Dürmuth, Markus ; Lo Iacono, Luigi

Risk-based Authentication (RBA) is an adaptive security measure to strengthen password-based authentication. RBA monitors additional features during login, and when observed feature values differ significantly from previously seen ones, users have to provide additional authentication factors such as a verification code. RBA has the potential to offer more usable authentication, but the usability and the security perceptions of RBA are not studied well. We present the results of a between-group lab study (n=65) to evaluate usability and security perceptions of two RBA variants, one 2FA variant, and password-only authentication. Our study shows with significant results that RBA is considered to be more usable than the studied 2FA variants, while it is perceived as more secure than password-only authentication in general and comparably secure to 2FA in a variety of application types. We also observed RBA usability problems and provide recommendations for mitigation. Our contribution provides a first deeper understanding of the users' perception of RBA and helps to improve RBA implementations for a broader user acceptance.

Making Order in Household Accounting - Digital Invoices as Domestic Work Artifacts (2024)

Dethier, Erik ; Kern, Dean-Robin ; Stevens, Gunnar ; Boden, Alexander

The digitization of financial activities in consumers' lives is increasing, and the digitalization of invoicing processes is expected to play a significant role, although this area is not well understood regarding the private sector. Human-Computer Interaction (HCI) and Computer Supported Cooperative Work (CSCW) research have a long history of analyzing the socio-material and temporal aspects of work practices that are relevant for the domestic domain. The socio-material structuring of invoicing work and the working styles of consumers must be considered when designing effective consumer support systems. In this ethnomethodologically-informed, design-oriented interview study, we followed 17 consumers in their daily practices of dealing with invoices to make the invisible administrative work involved in this process visible. We identified and described the meaningful artifacts that were used in a spatial-temporal process within various storage locations such as input, reminding, intermediate (for postponing cases) buffers, and archive systems. Furthermore, we identified three different working styles that consumers exhibited: direct completion, at the next opportunity, and postpone as far as possible. This study contributes to our understanding of household economics and domestic workplace studies in the tradition of CSCW and has implications for the design of electronic invoicing systems.

Less is Often More: Header Whitelisting as Semantic Gap Mitigation in HTTP-Based Software Systems (2021)

Büttner, Andre ; Nguyen, Hoai Viet ; Gruschka, Nils ; Lo Iacono, Luigi

The web is the most wide-spread digital system in the world and is used for many crucial applications. This makes web application security extremely important and, although there are already many security measures, new vulnerabilities are constantly being discovered. One reason for some of the recent discoveries lies in the presence of intermediate systems—e.g. caches, message routers, and load balancers—on the way between a client and a web application server. The implementations of such intermediaries may interpret HTTP messages differently, which leads to a semantically different understanding of the same message. This so-called semantic gap can cause weaknesses in the entire HTTP message processing chain. In this paper we introduce the header whitelisting (HWL) approach to address the semantic gap in HTTP message processing pipelines. The basic idea is to normalize and reduce an HTTP request header to the minimum required fields using a whitelist before processing it in an intermediary or on the server, and then restore the original request for the next hop. Our results show that HWL can avoid misinterpretations of HTTP messages in the different components and thus prevent many attacks rooted in a semantic gap including request smuggling, cache poisoning, and authentication bypass.
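
The basic idea stated in this abstract (reduce the request header to whitelisted fields before processing, then restore the original request for the next hop) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the whitelist contents, the function names `reduce_headers` and `process_at_hop`, and the rejection of duplicate fields are hypothetical choices made for the example.

```python
# Hypothetical whitelist; the abstract does not list the fields HWL keeps.
ALLOWED_HEADERS = ("host", "content-length", "content-type")


def reduce_headers(raw_headers):
    """Return (reduced, original) for a list of (name, value) header pairs.

    Names are normalized to lower case with surrounding whitespace removed.
    Rejecting a repeated whitelisted field is an extra check added here for
    illustration; the abstract does not say how duplicates are handled.
    """
    original = list(raw_headers)  # kept untouched for the next hop
    reduced = {}
    for name, value in original:
        key = name.strip().lower()
        if key not in ALLOWED_HEADERS:
            continue  # drop every field that is not on the whitelist
        if key in reduced:
            raise ValueError(f"ambiguous duplicate header: {key}")
        reduced[key] = value.strip()
    return reduced, original


def process_at_hop(raw_headers, handler):
    """Run handler on the reduced view, then pass the original request on."""
    reduced, original = reduce_headers(raw_headers)
    handler(reduced)
    return original


incoming = [
    ("Host", "example.org"),
    ("X-Forwarded-Host", "attacker.example"),
    ("Content-Length ", "5"),
]
seen_by_handler = []
forwarded = process_at_hop(incoming, seen_by_handler.append)
print(seen_by_handler[0])
print(forwarded == incoming)
```

The handler only ever sees the normalized, whitelisted fields, while the next hop still receives the request unchanged.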

Knowledge Base Question Answering by Transformer-Based Graph Pattern Scoring (2023)

Lamott, Marcel ; Hees, Jörn ; Ulges, Adrian

Question Answering (QA) has gained significant attention in recent years, with transformer-based models improving natural language processing. However, issues of explainability remain, as it is difficult to determine whether an answer is based on a true fact or a hallucination. Knowledge-based question answering (KBQA) methods can address this problem by retrieving answers from a knowledge graph. This paper proposes a hybrid approach to KBQA called FRED, which combines pattern-based entity retrieval with a transformer-based question encoder. The method uses an evolutionary approach to learn SPARQL patterns, which retrieve candidate entities from a knowledge base. The transformer-based regressor is then trained to estimate each pattern’s expected F1 score for answering the question, resulting in a ranking of candidate entities. Unlike other approaches, FRED can attribute results to learned SPARQL patterns, making them more interpretable. The method is evaluated on two datasets and yields MAP scores of up to 73 percent, with the transformer-based interpretation falling only 4 pp short of an oracle run. Additionally, the learned patterns successfully complement manually generated ones and generalize well to novel questions.
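
How per-pattern score estimates can turn into a ranking of candidate entities is sketched below. The abstract does not say how FRED aggregates scores, so the rule used here (an entity inherits the best score among the patterns that retrieved it) is an assumption, and the function name `rank_entities`, the pattern names, and the score values are hypothetical placeholders for the regressor's output.

```python
def rank_entities(pattern_scores, pattern_results):
    """Rank entities by the best predicted score of any pattern retrieving them.

    pattern_scores:  pattern name -> predicted F1 (placeholder values here)
    pattern_results: pattern name -> entities retrieved by that pattern
    """
    best = {}
    for pattern, score in pattern_scores.items():
        for entity in pattern_results.get(pattern, ()):
            if entity not in best or score > best[entity]:
                best[entity] = score
    # Highest score first; ties are broken by entity name for stable output.
    return sorted(best, key=lambda entity: (-best[entity], entity))


scores = {"pattern_a": 0.8, "pattern_b": 0.3}
results = {"pattern_a": ["Q64"], "pattern_b": ["Q64", "Q183"]}
ranking = rank_entities(scores, results)
print(ranking)
```

Because every ranked entity can be traced back to the pattern that produced its score, a scheme of this kind keeps results attributable to patterns, which is the interpretability property the abstract highlights.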

Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety (2022)

Deployment of modern data-driven machine learning methods, most often realized by deep neural networks (DNNs), in safety-critical applications such as health care, industrial plant control, or autonomous driving is highly challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability and implausible predictions to directed attacks by means of malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from so-called safety concerns, properties that preclude their deployment as no argument or experimental setup can help to assess the remaining risk. In recent years, an abundance of state-of-the-art techniques aiming to address these safety concerns has emerged. This chapter provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our work addresses machine learning experts and safety engineers alike: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern machine learning methods. We hope that this contribution fuels discussions on desiderata for machine learning systems and strategies on how to help to advance existing approaches accordingly.
