005 Computerprogrammierung, Programme, Daten
Refine
Departments, institutes and facilities
- Fachbereich Informatik (27)
- Institut für Cyber Security & Privacy (ICSP) (17)
- Fachbereich Wirtschaftswissenschaften (15)
- Institut für Verbraucherinformatik (IVI) (14)
- Institut für Technik, Ressourcenschonung und Energieeffizienz (TREE) (5)
- Fachbereich Ingenieurwissenschaften und Kommunikation (1)
- Institut für funktionale Gen-Analytik (IFGA) (1)
- Institute of Visual Computing (IVC) (1)
Document Type
- Conference Object (22)
- Article (18)
- Part of a Book (3)
- Conference Proceedings (3)
- Working Paper (3)
- Study Thesis (2)
- Preprint (1)
Year of publication
Has Fulltext
- yes (52) (remove)
Keywords
- Big Data Analysis (4)
- Risk-based Authentication (4)
- Usable Security (4)
- GDPR (3)
- Authentication features (2)
- Consumer Informatics (2)
- Machine Learning (2)
- Password (2)
- Risk-based Authentication (RBA) (2)
- API Documentation (1)
The digitization of financial activities in consumers' lives is increasing, and the digitalization of invoicing processes is expected to play a significant role, although this area is not well understood regarding the private sector. Human-Computer Interaction (HCI) and Computer Supported Cooperative Work (CSCW) research have a long history of analyzing the socio-material and temporal aspects of work practices that are relevant for the domestic domain. The socio-material structuring of invoicing work and the working styles of consumers must be considered when designing effective consumer support systems. In this ethnomethodologically-informed, design-oriented interview study, we followed 17 consumers in their daily practices of dealing with invoices to make the invisible administrative work involved in this process visible. We identified and described the meaningful artifacts that were used in a spatial-temporal process within various storage locations such as input, reminding, intermediate (for postponing cases) buffers, and archive systems. Furthermore, we identified three different working styles that consumers exhibited: direct completion, at the next opportunity, and postpone as far as possible. This study contributes to our understanding of household economics and domestic workplace studies in the tradition of CSCW and has implications for the design of electronic invoicing systems.
The European General Data Protection Regulation requires the implementation of Technical and Organizational Measures (TOMs) to reduce the risk of illegitimate processing of personal data. For these measures to be effective, they must be applied correctly by employees who process personal data under the authority of their organization. However, even data processing employees often have limited knowledge of data protection policies and regulations, which increases the likelihood of misconduct and privacy breaches. To lower the likelihood of unintentional privacy breaches, TOMs must be developed with employees’ needs, capabilities, and usability requirements in mind. To reduce implementation costs and help organizations and IT engineers with the implementation, privacy patterns have proven to be effective for this purpose. In this chapter, we introduce the privacy pattern Data Cart, which specifically helps to develop TOMs for data processing employees. Based on a user-centered design approach with employees from two public organizations in Germany, we present a concept that illustrates how Privacy by Design can be effectively implemented. Organizations, IT engineers, and researchers will gain insight on how to improve the usability of privacy-compliant tools for managing personal data.
Projekte des maschinellen Lernens (ML), insbesondere im Bereich der Zeitreihenanalyse, gewinnen heute zunehmend an Bedeutung. Die Bereitstellung solcher Projekte in einer Produktionsumgebung mit dem gleichen Automatisierungsgrad wie bei klassischen Softwareprojekten ist ein komplexes Unterfangen. Die Umsetzung in Produktionsumgebungen erfordert neben klassischen DevOps auch Machine Learning Operation (MLOps) Technologien und Werkzeuge. Ziel dieser Studie ist es, einen umfassenden Überblick über verfügbare MLOps Tools zu bieten und einen spezifischen Techstack für Zeitreihen ML Projekte zu entwickeln. Es werden aktuelle Trends und Werkzeuge im Bereich MLOps durch eine multivokale Literaturrecherche (MLR) untersucht und analysiert. Die Studie identifiziert passende MLOps Werkzeuge und Methoden für die Zeitreihenanalyse und präsentiert eine spezifische Implementierung einer MLOps Pipeline für die Aktienkursprognose des S&P 500. MLOps und DevOps Tools nehmen eine essenzielle Rolle bei der effektiven Konstruktion und Verwaltung von ML Pipelines ein. Bei der Auswahl geeigneter Werkzeuge ist stets eine spezifische Anpassung an die jeweiligen Projektanforderungen erforderlich. Die Bereitstellung einer detaillierten Darstellung der aktuellen MLOps Tool Landschaft erweist sich hierbei als wertvolle Ressource, die es Entwicklern ermöglicht, die Effizienz und Effektivität ihrer ML Projekte zu optimieren.
Angesichts der raschen Entwicklungen und der Besonderheiten von Softwaresystemen, welche Künstliche Intelligenz (KI) nutzen, ist ein angepasstes Requirements Engineering (RE) erforderlich. Die spezifischen Anforderungen von KI-Projekten müssen dabei erkannt und angegangen werden. Hierfür wird eine systematische Überprufung bestehender Herausforderungen des RE in KI-Projekten durchgeführt. Darauf aufbauend werden neue RE-Ansätze und Empfehlungen präsentiert, die auf die Datensicht von KI-Projekten abzielen. Mithilfe der Analyse bestehender Lösungsansatze, Methoden, Frameworks und Tools soll aufgezeigt werden, inwiefern die Herausforderungen im RE bewältigt werden können. Noch bestehende Lücken im Forschungsstand werden identifiziert und aufgezeigt.
Integrating physical simulation data into data ecosystems challenges the compatibility and interoperability of data management tools. Semantic web technologies and relational databases mostly use other data types, such as measurement or manufacturing design data. Standardizing simulation data storage and harmonizing the data structures with other domains is still a challenge, as current standards such as the ISO standard STEP (ISO 10303 ”Standard for the Exchange of Product model data”) fail to bridge the gap between design and simulation data. This challenge requires new methods, such as ontologies, to rethink simulation results integration. This research describes a new software architecture and application methodology based on the industrial standard ”Virtual Material Modelling in Manufacturing” (VMAP). The architecture integrates large quantities of structured simulation data and their analyses into a semantic data structure. It is capable of providing data permeability from the global digital twin level to the detailed numerical values of data entries and even new key indicators in a three-step approach: It represents a file as an instance in a knowledge graph, queries the file’s metadata, and finds a semantically represented process that enables new metadata to be created and instantiated.
Question Answering (QA) has gained significant attention in recent years, with transformer-based models improving natural language processing. However, issues of explainability remain, as it is difficult to determine whether an answer is based on a true fact or a hallucination. Knowledge-based question answering (KBQA) methods can address this problem by retrieving answers from a knowledge graph. This paper proposes a hybrid approach to KBQA called FRED, which combines pattern-based entity retrieval with a transformer-based question encoder. The method uses an evolutionary approach to learn SPARQL patterns, which retrieve candidate entities from a knowledge base. The transformer-based regressor is then trained to estimate each pattern’s expected F1 score for answering the question, resulting in a ranking ofcandidate entities. Unlike other approaches, FRED can attribute results to learned SPARQL patterns, making them more interpretable. The method is evaluated on two datasets and yields MAP scores of up to 73 percent, with the transformer-based interpretation falling only 4 pp short of an oracle run. Additionally, the learned patterns successfully complement manually generated ones and generalize well to novel questions.
In the project EILD.nrw, Open Educational Resources (OER) have been developed for teaching databases. Lecturers can use the tools and courses in a variety of learning scenarios. Students of computer science and application subjects can learn the complete life cycle of databases. For this purpose, quizzes, interactive tools, instructional videos, and courses for learning management systems are developed and published under a Creative Commons license. We give an overview of the developed OERs according to subject, description, teaching form, and format. Following, we describe how licencing, sustainability, accessibility, contextualization, content description, and technical adaptability are implemented. The feedback of students in ongoing classes are evaluated.
Background
Consumers rely heavily on online user reviews when shopping online and cybercriminals produce fake reviews to manipulate consumer opinion. Much prior research focuses on the automated detection of these fake reviews, which are far from perfect. Therefore, consumers must be able to detect fake reviews on their own. In this study we survey the research examining how consumers detect fake reviews online.
Methods
We conducted a systematic literature review over the research on fake review detection from the consumer-perspective. We included academic literature giving new empirical data. We provide a narrative synthesis comparing the theories, methods and outcomes used across studies to identify how consumers detect fake reviews online.
Results
We found only 15 articles that met our inclusion criteria. We classify the most often used cues identified into five categories which were (1) review characteristics (2) textual characteristics (3) reviewer characteristics (4) seller characteristics and (5) characteristics of the platform where the review is displayed.
Discussion
We find that theory is applied inconsistently across studies and that cues to deception are often identified in isolation without any unifying theoretical framework. Consequently, we discuss how such a theoretical framework could be developed.
Risk-Based Authentication for OpenStack: A Fully Functional Implementation and Guiding Example
(2023)
Online services have difficulties to replace passwords with more secure user authentication mechanisms, such as Two-Factor Authentication (2FA). This is partly due to the fact that users tend to reject such mechanisms in use cases outside of online banking. Relying on password authentication alone, however, is not an option in light of recent attack patterns such as credential stuffing.
Risk-Based Authentication (RBA) can serve as an interim solution to increase password-based account security until better methods are in place. Unfortunately, RBA is currently used by only a few major online services, even though it is recommended by various standards and has been shown to be effective in scientific studies. This paper contributes to the hypothesis that the low adoption of RBA in practice can be due to the complexity of implementing it. We provide an RBA implementation for the open source cloud management software OpenStack, which is the first fully functional open source RBA implementation based on the Freeman et al. algorithm, along with initial reference tests that can serve as a guiding example and blueprint for developers.
Airborne and spaceborne platforms are the primary data sources for large-scale forest mapping, but visual interpretation for individual species determination is labor-intensive. Hence, various studies focusing on forests have investigated the benefits of multiple sensors for automated tree species classification. However, transferable deep learning approaches for large-scale applications are still lacking. This gap motivated us to create a novel dataset for tree species classification in central Europe based on multi-sensor data from aerial, Sentinel-1 and Sentinel-2 imagery. In this paper, we introduce the TreeSatAI Benchmark Archive, which contains labels of 20 European tree species (i.e., 15 tree genera) derived from forest administration data of the federal state of Lower Saxony, Germany. We propose models and guidelines for the application of the latest machine learning techniques for the task of tree species classification with multi-label data. Finally, we provide various benchmark experiments showcasing the information which can be derived from the different sensors including artificial neural networks and tree-based machine learning methods. We found that residual neural networks (ResNet) perform sufficiently well with weighted precision scores up to 79 % only by using the RGB bands of aerial imagery. This result indicates that the spatial content present within the 0.2 m resolution data is very informative for tree species classification. With the incorporation of Sentinel-1 and Sentinel-2 imagery, performance improved marginally. However, the sole use of Sentinel-2 still allows for weighted precision scores of up to 74 % using either multi-layer perceptron (MLP) or Light Gradient Boosting Machine (LightGBM) models. Since the dataset is derived from real-world reference data, it contains high class imbalances. We found that this dataset attribute negatively affects the models' performances for many of the underrepresented classes (i.e., scarce tree species). However, the class-wise precision of the best-performing late fusion model still reached values ranging from 54 % (Acer) to 88 % (Pinus). Based on our results, we conclude that deep learning techniques using aerial imagery could considerably support forestry administration in the provision of large-scale tree species maps at a very high resolution to plan for challenges driven by global environmental change. The original dataset used in this paper is shared via Zenodo (https://doi.org/10.5281/zenodo.6598390, Schulz et al., 2022). For citation of the dataset, we refer to this article.
Digital ecosystems are driving the digital transformation of business models. Meanwhile, the associated processing of personal data within these complex systems poses challenges to the protection of individual privacy. In this paper, we explore these challenges from the perspective of digital ecosystems' platform providers. To this end, we present the results of an interview study with seven data protection officers representing a total of 12 digital ecosystems in Germany. We identified current and future challenges for the implementation of data protection requirements, covering issues on legal obligations and data subject rights. Our results support stakeholders involved in the implementation of privacy protection measures in digital ecosystems, and form the foundation for future privacy-related studies tailored to the specifics of digital ecosystems.
A company's financial documents use tables along with text to organize the data containing key performance indicators (KPIs) (such as profit and loss) and a financial quantity linked to them. The KPI’s linked quantity in a table might not be equal to the similarly described KPI's quantity in a text. Auditors take substantial time to manually audit these financial mistakes and this process is called consistency checking. As compared to existing work, this paper attempts to automate this task with the help of transformer-based models. Furthermore, for consistency checking it is essential for the table's KPIs embeddings to encode the semantic knowledge of the KPIs and the structural knowledge of the table. Therefore, this paper proposes a pipeline that uses a tabular model to get the table's KPIs embeddings. The pipeline takes input table and text KPIs, generates their embeddings, and then checks whether these KPIs are identical. The pipeline is evaluated on the financial documents in the German language and a comparative analysis of the cell embeddings' quality from the three tabular models is also presented. From the evaluation results, the experiment that used the English-translated text and table KPIs and Tabbie model to generate table KPIs’ embeddings achieved an accuracy of 72.81% on the consistency checking task, outperforming the benchmark, and other tabular models.
Deployment of modern data-driven machine learning methods, most often realized by deep neural networks (DNNs), in safety-critical applications such as health care, industrial plant control, or autonomous driving is highly challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability and implausible predictions to directed attacks by means of malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from so-called safety concerns, properties that preclude their deployment as no argument or experimental setup can help to assess the remaining risk. In recent years, an abundance of state-of-the-art techniques aiming to address these safety concerns has emerged. This chapter provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our work addresses machine learning experts and safety engineers alike: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern machine learning methods. We hope that this contribution fuels discussions on desiderata for machine learning systems and strategies on how to help to advance existing approaches accordingly.
Trust your guts: fostering embodied knowledge and sustainable practices through voice interaction
(2023)
Despite various attempts to prevent food waste and motivate conscious food handling, household members find it difficult to correctly assess the edibility of food. With the rise of ambient voice assistants, we did a design case study to support households’ in situ decision-making process in collaboration with our voice agent prototype, Fischer Fritz. Therefore, we conducted 15 contextual inquiries to understand food practices at home. Furthermore, we interviewed six fish experts to inform the design of our voice agent on how to guide consumers and teach food literacy. Finally, we created a prototype and discussed with 15 consumers its impact and capability to convey embodied knowledge to the human that is engaged as sensor. Our design research goes beyond current Human-Food Interaction automation approaches by emphasizing the human-food relationship in technology design and demonstrating future complementary human-agent collaboration with the aim to increase humans’ competence to sense, think, and act.
While the recent discussion on Art. 25 GDPR often considers the approach of data protection by design as an innovative idea, the notion of making data protection law more effective through requiring the data controller to implement the legal norms into the processing design is almost as old as the data protection debate. However, there is another, more recent shift in establishing the data protection by design approach through law, which is not yet understood to its fullest extent in the debate. Art. 25 GDPR requires the controller to not only implement the legal norms into the processing design but to do so in an effective manner. By explicitly declaring the effectiveness of the protection measures to be the legally required result, the legislator inevitably raises the question of which methods can be used to test and assure such efficacy. In our opinion, extending the legal compatibility assessment to the real effects of the required measures opens this approach to interdisciplinary methodologies. In this paper, we first summarise the current state of research on the methodology established in Art. 25 sect. 1 GDPR, and pinpoint some of the challenges of incorporating interdisciplinary research methodologies. On this premise, we present an empirical research methodology and first findings which offer one approach to answering the question on how to specify processing purposes effectively. Lastly, we discuss the implications of these findings for the legal interpretation of Art. 25 GDPR and related provisions, especially with respect to a more effective implementation of transparency and consent, and provide an outlook on possible next research steps.
Risk-based authentication (RBA) aims to protect users against attacks involving stolen passwords. RBA monitors features during login, and requests re-authentication when feature values widely differ from those previously observed. It is recommended by various national security organizations, and users perceive it more usable than and equally secure to equivalent two-factor authentication. Despite that, RBA is still used by very few online services. Reasons for this include a lack of validated open resources on RBA properties, implementation, and configuration. This effectively hinders the RBA research, development, and adoption progress.
To close this gap, we provide the first long-term RBA analysis on a real-world large-scale online service. We collected feature data of 3.3 million users and 31.3 million login attempts over more than 1 year. Based on the data, we provide (i) studies on RBA’s real-world characteristics plus its configurations and enhancements to balance usability, security, and privacy; (ii) a machine learning–based RBA parameter optimization method to support administrators finding an optimal configuration for their own use case scenario; (iii) an evaluation of the round-trip time feature’s potential to replace the IP address for enhanced user privacy; and (iv) a synthesized RBA dataset to reproduce this research and to foster future RBA research. Our results provide insights on selecting an optimized RBA configuration so that users profit from RBA after just a few logins. The open dataset enables researchers to study, test, and improve RBA for widespread deployment in the wild.
The processing of employees’ personal data is dramatically increasing, yet there is a lack of tools that allow employees to manage their privacy. In order to develop these tools, one needs to understand what sensitive personal data are and what factors influence employees’ willingness to disclose. Current privacy research, however, lacks such insights, as it has focused on other contexts in recent decades. To fill this research gap, we conducted a cross-sectional survey with 553 employees from Germany. Our survey provides multiple insights into the relationships between perceived data sensitivity and willingness to disclose in the employment context. Among other things, we show that the perceived sensitivity of certain types of data differs substantially from existing studies in other contexts. Moreover, currently used legal and contextual distinctions between different types of data do not accurately reflect the subtleties of employees’ perceptions. Instead, using 62 different data elements, we identified four groups of personal data that better reflect the multi-dimensionality of perceptions. However, previously found common disclosure antecedents in the context of online privacy do not seem to affect them. We further identified three groups of employees that differ in their perceived data sensitivity and willingness to disclose, but neither in their privacy beliefs nor in their demographics. Our findings thus provide employers, policy makers, and researchers with a better understanding of employees’ privacy perceptions and serve as a basis for future targeted research
on specific types of personal data and employees.
Personal-Information-Management-Systeme (PIMS) gelten als Chance, um die Datensouveränität der Verbraucher zu stärken. Datenschutzbezogene Fragen sind für Verbraucher immer dort relevant, wo sie Verträge und Nutzungsbedingungen mit Diensteanbietern eingehen. Vor diesem Hintergrund diskutiert dieser Beitrag die Potenziale von VRM-Systemen, die nicht nur das Datenmanagement, sondern das gesamte Vertragsmanagement von Verbrauchern unterstützen. Dabei gehen wir der Frage nach, ob diese besser geeignet sind, um Verbraucher zu souveränem Handeln zu befähigen.
Der technische Fortschritt im Bereich der Erhebung, Speicherung und Verarbeitung von Daten macht es erforderlich, neue Fragen zu sozialverträglichen Datenmärkten aufzuwerfen. So gibt es sowohl eine Tendenz zur vereinfachten Datenteilung als auch die Forderung, die informationelle Selbstbestimmung besser zu schützen. Innerhalb dieses Spannungsfeldes bewegt sich die Idee von Datentreuhändern. Ziel des Beitrags ist darzulegen, dass zwischen verschiedenen Formen der Datentreuhänderschaft unterschieden werden sollte, um der Komplexität des Themas gerecht zu werden. Insbesondere bedarf es neben der mehrseitigen Treuhänderschaft, mit dem Treuhänder als neutraler Instanz, auch der einseitigen Treuhänderschaft, bei dem der Treuhänder als Anwalt der Verbraucherinteressen fungiert. Aus dieser Perspektive wird das Modell der Datentreuhänderschaft als stellvertretende Deutung der Interessen individueller und kollektiver Identitäten systematisch entwickelt.
An der Hochschule Bonn-Rhein-Sieg fand am Donnerstag, den 23.9.21 das erste Verbraucherforum für Verbraucherinformatik statt. Im Rahmen der Online-Tagesveranstaltung diskutierten mehr als 30 Teilnehmer:innen über Themen und Ideen rund um den Bereich Verbraucherdatenschutz. Dabei kamen sowohl Beiträge aus der Informatik, den Verbraucher- und Sozialwissenschaften sowie auch der regulatorischen Perspektive zur Sprache. Der folgende Beitrag stellt den Hintergrund der Veranstaltung dar und berichtet über Inhalte der Vorträge sowie Anknüpfungspunkte für die weitere Konstituierung der Verbraucherinformatik. Veranstalter waren das Institut für Verbraucherinformatik an der H-BRS in Zusammenarbeit mit dem Lehrstuhl IT-Sicherheit der Universität Siegen sowie dem Kompetenzzentrum Verbraucherforschung NRW der Verbraucherzentrale NRW e. V. mit Förderung des Bundesministeriums der Justiz und für Verbraucherschutz.
An der Hochschule Bonn-Rhein-Sieg fand am Donnerstag, den 23.9.21 das erste Verbraucherforum für Verbraucherinformatik statt. Im Rahmen der Online-Tagesveranstaltung diskutierten mehr als 30 Teilnehmer:innen über Themen und Ideen rund um den Bereich Verbraucherdatenschutz. Dabei kamen sowohl Beiträge aus der Informatik, den Verbraucher- und Sozialwissenschaften sowie auch der regulatorischen Perspektive zur Sprache. Der folgende Beitrag stellt den Hintergrund der Veranstaltung dar und berichtet über Inhalte der Vorträge sowie Anknüpfungspunkte für die weitere Konstituierung der Verbraucherinformatik. Veranstalter waren das Institut für Verbraucherinformatik an der H-BRS in Zusammenarbeit mit dem Lehrstuhl IT-Sicherheit der Universität Siegen sowie dem Kompetenzzentrum Verbraucherforschung NRW der Verbraucherzentrale NRW e. V. mit Förderung des Bundesministeriums der Justiz und für Verbraucherschutz.
Die Blockchain-Technologie ist einer der großen Innovationstreiber der letzten Jahre. Mit einer zugrundeliegenden Blockchain-Technologie ist auch der Betrieb von verteilten Anwendungen, sogenannter Decentralized Applications (DApps), bereits technisch umsetzbar. Dieser Beitrag verfolgt das Ziel, Gestaltungsmöglichkeiten der digitalen Verbraucherteilhabe an Blockchain-Anwendungen zu untersuchen. Hierzu enthält der Beitrag eine Einführung in die digitale Verbraucherteilhabe und die technischen Grundlagen und Eigenschaften der Blockchain-Technologie, einschließlich darauf basierender DApps. Abschließend werden technische, ethisch-organisatorische, rechtliche und sonstige Anforderungsbereiche für die Umsetzung von digitaler Verbraucherteilhabe in Blockchain-Anwendungen adressiert.
Hinreichende Datensouveränität gestaltet sich für Verbraucher:innen in der Praxis als äußerst schwierig. Die Europäische Datenschutzgrundverordnung garantiert umfassende Betroffenenrechte, die von verwantwortlichen Stellen durch technisch-organisatorische Maßnahmen umzusetzen sind. Traditionelle Vorgehensweisen wie die Bereitstellung länglicher Datenschutzerklärungen oder der ohne weitere Hilfestellungen angebotene Download von personenbezogenen Rohdaten werden dem Anspruch der informationellen Selbstbestimmung nicht gerecht. Die im Folgenden aufgezeigten neuen technischen Ansätze insbesondere KI-basierter Transparenz- und Auskunftsmodalitäten zeigen die Praktikabilität wirksamer und vielseitiger Mechanismen. Hierzu werden die relevanten Transparenzangaben teilautomatisiert extrahiert, maschinenlesbar repräsentiert und anschließend über diverse Kanäle wie virtuelle Assistenten oder die Anreicherung von Suchergebnissen ausgespielt. Ergänzt werden außerdem automatisierte und leicht zugängliche Methoden für Auskunftsersuchen und deren Aufbereitung nach Art. 15 DSGVO. Abschließend werden konkrete Regulierungsimplikationen diskutiert.
Künstliche Intelligenz im autonomen Fahrzeug verarbeitet enorme Mengen an Daten. Beim Betrieb eines solchen Fahrzeugs basiert jede Bewegung auf einer datenbasierten, automatisierten und adaptiven Entscheidungsfindung. Aber auch, um Regeln zur Erkennung und Entscheidung in komplexen Situationen wie den hochindividuellen Verkehrsszenarien entwickeln zu können (KI-Training), sind bereits beachtliche Datenmengen von Fahrzeugen im Realverkehr erforderlich – zum Beispiel Videosequenzen aus Kamerafahrten. Für das Training Künstlicher Intelligenz ist es aus Sicht der Fahrzeugentwicklung attraktiv, auf den Datenschatz zuzugreifen, den die Gesamtheit der Fahrzeuge im realen Anwendungskontext erzeugen kann. Als Nutzer:innen und Insassen sind Verbraucher:innen so Teil einer groß angelegten Testdatenerhebung durch Fahrzeughersteller und Anbieter. Das wirft Datenschutzfragen auf. Ziel des vorliegenden Beitrags ist es herauszuarbeiten, inwiefern sich hierdurch Implikationen für die Rechte und Freiheiten von Verbraucher:innen ergeben und welche Mechanismen das geltende Recht sowie aktuelle legislative Entwicklungen bereithalten, den „Datenhunger“ der KI mit den Interessen an Datensouveränität und informationeller Selbstbestimmung in Einklang und Ausgleich zu bringen. Im Fokus steht dabei insbesondere, wie Anforderungen schon im Produktdesign „mitgedacht“ werden und damit für Verbraucher:innen rechts- und vertrauensfördernd wirken können.
Datenschutz und informationelle Selbstbestimmung sind Bestandteile aktueller Leitbilder einer Digitalen Bildung in der Schule. Im Kontext der Schulschließungen und der vorrangigen Nutzung digitaler Medien zeigte sich jedoch, dass Datenschutz weder als Thema noch als Gestaltungsprinzip digitaler Lernumgebungen in der bildungsadministrativen und pädagogisch-praktischen Schulwirklichkeit systematisch verankert ist. Die Diskrepanz zwischen aktuellen Leitbildern einer digitalen Bildung und der sichtbar problematischen Praxis des digitalen Notfalldistanzunterrichts markiert den Ausgangspunkt des Beitrages, der sich der übergeordneten Frage widmet, welche Herausforderungen sich bei der Realisierung von Datenschutz in der Schul- und Unterrichtswirklichkeit in einer digital geprägten Welt stellen. Im Sinne einer Problemfeldanalyse werden prototypische Handlungsprobleme der Schule herausgearbeitet. Fokussiert betrachtet werden exemplarische Herausforderungen und Anforderungen an Technologien und Akteur:innen der inneren und äußeren Schulentwicklung auf den Ebenen der Unterrichtsentwicklung, der Personalentwicklung, der Technologieentwicklung und der Organisationsentwicklung.
Sprachassistenten wie Alexa oder Google Assistant sind aus dem Alltag vieler VerbraucherInnen nicht mehr wegzudenken. Sie überzeugen insbesondere durch die sprachbasierte und somit freihändige Steuerung und mitunter auch den unterhaltsamen Charakter. Als häuslicher Lebensmittelpunkt sind die häufigsten Aufstellungsorte das Wohnzimmer und die Küche, da sich Haushaltsmitglieder dort die meiste Zeit aufhalten und das alltägliche Leben abspielt. Dies bedeutet allerdings ebenso, dass an diesen Orten potenziell viele Daten erfasst und gesammelt werden können, die nicht für den Sprachassistenten bestimmt sind. Demzufolge ist nicht auszuschließen, dass der Sprachassistent – wenn auch versehentlich – durch Gespräche oder Geräusche aktiviert wird und Aufnahmen speichert, selbst wenn eine Aktivierung unbewusst von Anwesenden bzw. von anderen Geräten (z. B. Fernseher) erfolgt oder aus anderen Räumen kommt. Im Rahmen eines Forschungsprojekts haben wir dazu NutzerInnen über Ihre Nutzungs- und Aufstellungspraktiken der Sprachassistenten befragt und zudem einen Prototyp getestet, der die gespeicherten Interaktionen mit dem Sprachassistenten sichtbar macht. Dieser Beitrag präsentiert basierend auf den Erkenntnissen aus den Interviews und abgeleiteten Leitfäden aus den darauffolgenden Nutzungstests des Prototyps eine Anwendung zur Beantragung und Visualisierung der Interaktionsdaten mit dem Sprachassistenten. Diese ermöglicht es, Interaktionen und die damit zusammenhängende Situation darzustellen, indem sie zu jeder Interaktion die Zeit, das verwendete Gerät sowie den Befehl wiedergibt und unerwartete Verhaltensweisen wie die versehentliche oder falsche Aktivierung sichtbar macht. Dadurch möchten wir VerbraucherInnen für die Fehleranfälligkeit dieser Geräte sensibilisieren und einen selbstbestimmteren und sichereren Umgang ermöglichen.
Most people use disaster apps infrequently, primarily only in situations of turmoil, when they are physically or emotionally vulnerable. Personal data may be necessary to help them, data protections may be waived. In some circumstances, free movement and liberties may be curtailed for public protection, as was seen in the current COVID pandemic. Consuming and producing disaster data can deepen problems arising at the confluence of surveillance and disaster capitalism, where data has become a tool for solutionist instrumentarian power (Zuboff 2019, Klein 2008) and part of a destructive mode of one world worlding (Law 2015, Escobar 2020). The special use of disaster apps prompts us to ask what role consumer protection could play in safeguarding democratic liberties. Within this work, a set of current approaches are briefly reviewed and two case studies are presented of what we call appropriation or design against datafication. These combine document analysis and literature research with several months of online and field ethnographic observation. The first case study examines disaster app use in response to the 2010 Haiti earthquake, the second explores COVID Contact Tracing in Taiwan in 2020/21. Against this backdrop we ask, ‘how could and how should consumer protection respond to problems of surveillance disaster capitalism?’ Drawing on our work with the is IT ethical? Exchange, a co-designed community platform and knowledge exchange for disaster information sharing, and a Societal Readiness Assessment Framework that we are developing alongside it, we explore how co-design methodologies could help define answers.
Unsere interdisziplinäre Forschungsarbeit „Die Gestaltung wirksamer Bildsymbole für Verarbeitungszwecke und ihre Folgen für Betroffene“ („Designing Effective Privacy Icons through an Interdisciplinary Research Methodology“) baut auf dem „Data Protection by Design“-Ansatz (Art. 25(1) DSGVO) auf und zielt auf folgende Forschungsfragen ab: Wie müssen das Transparenzprinzip (Art. 5(1)(a) DSGVO) und die Informationspflichten (Art. 12-14 DSGVO) insbesondere im Hinblick auf die Festlegung der Verarbeitungszwecke (Art. 5(1)(b) DSGVO) umgesetzt werden, damit sie die Nutzer:innen effektiv vor Risiken der Datenverarbeitung schützen? Mit welchen Methoden lässt sich die Wirksamkeit der Umsetzung ermitteln und diese auch durchsetzen?1 Im vorliegenden Projekt erweitern wir juristische Methoden um solche aus der HCI-Forschung (Human Computer Interaction) und der Visuellen Gestaltung. In einer ersten Phase haben wir mit empirischen Methoden der HCI-Forschung untersucht, welche Datennutzungstypen Nutzer:innen technologieübergreifend als relevant empfinden. Diese Erkenntnisse können als Ausgangspunkt für eine neue Zweckbestimmung dienen, die bestimmte Datennutzungstypen deutlicher ein- oder ausschließt. Erste Umformulierungen von Zweckbestimmungen haben wir in zwei Praxisworkshops mit Verantwortlichen der Datenverarbeitung getestet. In einer darauffolgenden qualitativen Studie untersuchten wir dann die Einstellungen und Erwartungen von Internetnutzerinnen und -nutzern am Beispiel der Personalisierung von Internetinhalten, um die entsprechenden Zwecke anhand eines konkreten Beispiels, in unserem Fall der personalisierten Werbung, neu zu formulieren. Auf dieser Basis haben wir nun die zweite Forschungsphase begonnen, in der wir Designs für Datenschutzhinweise und Kontrollmöglichkeiten unter besonderer Berücksichtigung des Verarbeitungszwecks entwickeln. Da der Einsatz von Cookies eine wichtige Rolle bei der Personalisierung von Werbung spielt, ist eine zentrale Aufgaben die Neugestaltung des sogenannten „Cookie-Banners“.
Frequently the main purpose of domestic artifacts equipped with smart sensors is to hide technology, like previous examples of a Smart Mirror show. However, current Smart Homes often fail to provide meaningful IoT applications for all residents’ needs. To design beyond efficiency and productivity, we propose to realize the potential of the traditional artifact for calm and engaging experiences. Therefore, we followed a design case study approach with 22 participants in total. After an initial focus group, we conducted a diary study to examine home routines and developed a conceptual design. The evaluation of our mid-fidelity prototype shows, that we need to study carefully the practices of the residents to leverage the physical material of the artifact to fit the routines. Our Smart Mirror, enhanced by digital qualities, supports meaningful activities and makes the bathroom more appealing. Thereby, we discuss domestic technology design beyond automation.
With the debates on climate change and sustainability, a reduction of the share of cars in the modal split has become increasingly prevalent in both public and academic discourse. Besides some motivational approaches, there is a lack of ICT artifacts that successfully raise the ability of consumers to adopt sustainable mobility patterns. To further understand the requirements and the design of these artifacts within everyday mobility adopted a practice-lens. This lens is helpful to get a broader perspective on the use of ICT artifacts along consumers’ transformational journey towards sustainable mobility practices. Based on 12 retrospective interviews with car-free mobility consumers, we argue that artifacts should not be viewed as ’magic-bullet’ solutions but should accompany the complex transformation of practices in multifaceted ways. Moreover, we highlight in particular the difficulties of appropriating shared infrastructures and aligning own practices with them. This opens up a design space to provide more support for these kinds of material-interactions, to provide access to consumption infrastructures and make them usable, rather than leaving consumers alone with increased motivation.
Recent publications propose concepts of systems that integrate the various services and data sources of everyday food practices. However, this research does not go beyond the conceptualization of such systems. Therefore, there is a deficit in understanding how to combine different services and data sources and which design challenges arise from building integrated Household Information Systems. In this paper, we probed the design of an Integrated Household Information System with 13 participants. The results point towards more personalization, automatization of storage administration and enabling flexible artifact ecologies. Our paper contributes to understanding the design and usage of Integrated Household Information Systems, as a new class of information systems for HCI research.
Risk-based authentication (RBA) aims to strengthen password-based authentication rather than replacing it. RBA does this by monitoring and recording additional features during the login process. If feature values at login time differ significantly from those observed before, RBA requests an additional proof of identification. Although RBA is recommended in the NIST digital identity guidelines, it has so far been used almost exclusively by major online services. This is partly due to a lack of open knowledge and implementations that would allow any service provider to roll out RBA protection to its users. To close this gap, we provide a first in-depth analysis of RBA characteristics in a practical deployment. We observed N=780 users with 247 unique features on a real-world online service for over 1.8 years. Based on our collected data set, we provide (i) a behavior analysis of two RBA implementations that were apparently used by major online services in the wild, (ii) a benchmark of the features to extract a subset that is most suitable for RBA use, (iii) a new feature that has not been used in RBA before, and (iv) factors which have a significant effect on RBA performance. Our results show that RBA needs to be carefully tailored to each online service, as even small configuration adjustments can greatly impact RBA's security and usability properties. We provide insights on the selection of features, their weightings, and the risk classification in order to benefit from RBA after a minimum number of login attempts.
Risk-based authentication (RBA) extends authentication mechanisms to make them more robust against account takeover attacks, such as those using stolen passwords. RBA is recommended by NIST and NCSC to strengthen password-based authentication, and is already used by major online services. Also, users consider RBA to be more usable than two-factor authentication and just as secure. However, users currently obtain RBA's high security and usability benefits at the cost of exposing potentially sensitive personal data (e.g., IP address or browser information). This conflicts with user privacy and requires to consider user rights regarding the processing of personal data. We outline potential privacy challenges regarding different attacker models and propose improvements to balance privacy in RBA systems. To estimate the properties of the privacy-preserving RBA enhancements in practical environments, we evaluated a subset of them with long-term data from 780 users of a real-world online service. Our results show the potential to increase privacy in RBA solutions. However, it is limited to certain parameters that should guide RBA design to protect privacy. We outline research directions that need to be considered to achieve a widespread adoption of privacy preserving RBA with high user acceptance.
Software developers build complex systems using plenty of third-party libraries. Documentation is key to understand and use the functionality provided via the libraries’ APIs. Therefore, functionality is the main focus of contemporary API documentation, while cross-cutting concerns such as security are almost never considered at all, especially when the API itself does not provide security features. Documentations of JavaScript libraries for use in web applications, e.g., do not specify how to add or adapt a Content Security Policy (CSP) to mitigate content injection attacks like Cross-Site Scripting (XSS). This is unfortunate, as security-relevant API documentation might have an influence on secure coding practices and prevailing major vulnerabilities such as XSS. For the first time, we study the effects of integrating security-relevant information in non-security API documentation. For this purpose, we took CSP as an exemplary study object and extended the official Google Maps JavaScript API documentation with security-relevant CSP information in three distinct manners. Then, we evaluated the usage of these variations in a between-group eye-tracking lab study involving N=49 participants. Our observations suggest: (1) Developers are focused on elements with code examples. They mostly skim the documentation while searching for a quick solution to their programming task. This finding gives further evidence to results of related studies. (2) The location where CSP-related code examples are placed in non-security API documentation significantly impacts the time it takes to find this security-relevant information. In particular, the study results showed that the proximity to functional-related code examples in documentation is a decisive factor. (3) Examples significantly help to produce secure CSP solutions. (4) Developers have additional information needs that our approach cannot meet.
Overall, our study contributes to a first understanding of the impact of security-relevant information in non-security API documentation on CSP implementation. Although further research is required, our findings emphasize that API producers should take responsibility for adequately documenting security aspects and thus supporting the sensibility and training of developers to implement secure systems. This responsibility also holds in seemingly non-security relevant contexts.
Science Track FrOSCon 2018
(2021)
Less is Often More: Header Whitelisting as Semantic Gap Mitigation in HTTP-Based Software Systems
(2021)
The web is the most wide-spread digital system in the world and is used for many crucial applications. This makes web application security extremely important and, although there are already many security measures, new vulnerabilities are constantly being discovered. One reason for some of the recent discoveries lies in the presence of intermediate systems—e.g. caches, message routers, and load balancers—on the way between a client and a web application server. The implementations of such intermediaries may interpret HTTP messages differently, which leads to a semantically different understanding of the same message. This so-called semantic gap can cause weaknesses in the entire HTTP message processing chain.
In this paper we introduce the header whitelisting (HWL) approach to address the semantic gap in HTTP message processing pipelines. The basic idea is to normalize and reduce an HTTP request header to the minimum required fields using a whitelist before processing it in an intermediary or on the server, and then restore the original request for the next hop. Our results show that HWL can avoid misinterpretations of HTTP messages in the different components and thus prevent many attacks rooted in a semantic gap including request smuggling, cache poisoning, and authentication bypass.
XML Signature Wrapping (XSW) has been a relevant threat to web services for 15 years until today. Using the Personal Health Record (PHR), which is currently under development in Germany, we investigate a current SOAP-based web services system as a case study. In doing so, we highlight several deficiencies in defending against XSW. Using this real-world contemporary example as motivation, we introduce a guideline for more secure XML signature processing that provides practitioners with easier access to the effective countermeasures identified in the current state of research.
Threats to passwords are still very relevant due to attacks like phishing or credential stuffing. One way to solve this problem is to remove passwords completely. User studies on passwordless FIDO2 authentication using security tokens demonstrated the potential to replace passwords. However, widespread acceptance of FIDO2 depends, among other things, on how user accounts can be recovered when the security token becomes permanently unavailable. For this reason, we provide a heuristic evaluation of 12 account recovery mechanisms regarding their properties for FIDO2 passwordless authentication. Our results show that the currently used methods have many drawbacks. Some even rely on passwords, taking passwordless authentication ad absurdum. Still, our evaluation identifies promising account recovery solutions and provides recommendations for further studies.
Risk-based authentication (RBA) is an adaptive security measure to strengthen password-based authentication against account takeover attacks. Our study on 65 participants shows that users find RBA more usable than two-factor authentication equivalents and more secure than password-only authentication. We identify pitfalls and provide guidelines for putting RBA into practice.
Applied privacy research has so far focused mainly on consumer relations in private life. Privacy in the context of employment relationships is less well studied, although it is subject to the same legal privacy framework in Europe. The European General Data Protection Regulation (GDPR) has strengthened employees’ right to privacy by obliging that employers provide transparency and intervention mechanisms. For such mechanisms to be effective, employees must have a sound understanding of their functions and value. We explored possible boundaries by conducting a semistructured interview study with 27 office workers in Germany and elicited mental models of the right to informational self-determination, which is the European proxy for the right to privacy. We provide insights into (1) perceptions of different categories of data, (2) familiarity with the legal framework regarding expectations for privacy controls, and (3) awareness of data processing, data flow, safeguards, and threat models. We found that legal terms often used in privacy policies used to describe categories of data are misleading. We further identified three groups of mental models that differ in their privacy control requirements and willingness to accept restrictions on their privacy rights. We also found ignorance about actual data flow, processing, and safeguard implementation. Participants’ mindsets were shaped by their faith in organizational and technical measures to protect privacy. Employers and developers may benefit from our contributions by understanding the types of privacy controls desired by office workers and the challenges to be considered when conceptualizing and designing usable privacy protections in the workplace.
Kompetenzen auf dem Gebiet der Datenbanken gehören zum Pflichtbereich der Informatik. Das Angebot an Lehrbüchern, Vorlesungsformaten und Tools lässt sich jedoch für Lehrende oft nur eingeschränkt in die eigene Lehre integrieren. In diesem Aufsatz schildern wir unsere Erfahrungen in der Nutzung (frei) verfügbarer und der Entwicklung eigener digitaler Inhalte für grundlegende Datenbankveranstaltungen. Die Präferenzen der Studierenden werden mittels Nutzungsanalysen und Befragungen ermittelt. Wir stellen die Anforderungen auf, wie die nicht selten aufwendig herzustellenden digitalen Materialien von Lehrenden in ihre Lehr- und Lernumgebungen integriert werden können. Als konstruktive Antwort auf diese Herausforderung wird das Konzept EILD zur Entwicklung von Inhalten für die Lehre im Fach Datenbanken vorgestellt. Die Inhalte sollen in vielfältigen Lernszenarien eingesetzt werden können und mit einer Creative Commons (CC) Lizenzierung als OER (open educational resources) frei zur Verfügung stehen.
Risk-based authentication (RBA) aims to strengthen password-based authentication rather than replacing it. RBA does this by monitoring and recording additional features during the login process. If feature values at login time differ significantly from those observed before, RBA requests an additional proof of identification. Although RBA is recommended in the NIST digital identity guidelines, it has so far been used almost exclusively by major online services. This is partly due to a lack of open knowledge and implementations that would allow any service provider to roll out RBA protection to its users.
To close this gap, we provide a first in-depth analysis of RBA characteristics in a practical deployment. We observed N=780 users with 247 unique features on a real-world online service for over 1.8 years. Based on our collected data set, we provide (i) a behavior analysis of two RBA implementations that were apparently used by major online services in the wild, (ii) a benchmark of the features to extract a subset that is most suitable for RBA use, (iii) a new feature that has not been used in RBA before, and (iv) factors which have a significant effect on RBA performance. Our results show that RBA needs to be carefully tailored to each online service, as even small configuration adjustments can greatly impact RBA's security and usability properties. We provide insights on the selection of features, their weightings, and the risk classification in order to benefit from RBA after a minimum number of login attempts.
Data emerged as a central success factor for companies to benefit from digitization. However, the skills in successfully creating value from data – especially at the management level – are not always profound. To address this problem, several canvas models have already been designed. Canvas models are usually created to write down an idea in a structured way to promote transparency and traceability. However, some existing data science canvas models mainly address developers and are thus unsuitable for decision-makers and communication within interdisciplinary teams. Based on a literature review, we identified influencing factors that are essential for the success of data science projects. With the information gained, the Data Science Canvas was developed in an expert workshop and finally evaluated by practitioners to find out whether such an instrument could support data-driven value creation.
Bedingt durch die fortlaufende Digitalisierung und den Big Data-Trend stehen immer mehr Daten zur Verfügung. Daraus resultieren viele Potenziale – gerade für Unternehmen. Die Fähigkeit zur Bewältigung und Auswertung dieser Daten schlägt sich in der Rolle des Data Scientist nieder, welcher aktuell einer der gefragtesten Berufe ist. Allerdings ist die Integration von Daten in Unternehmensstrategie und -kultur eine große Herausforderung. So müssen komplexe Daten und Analyseergebnisse auch nicht datenaffinen Stakeholdern kommuniziert werden. Hier kommt dem Data Storytelling eine entscheidende Rolle zu, denn um mit Daten eine Veränderung hervorrufen zu können, müssen vorerst Verständnis und Motivation für den Sachverhalt zielgruppenspezifisch geschaffen werden. Allerdings handelt es sich bei Data Storytelling noch um ein Nischenthema. Diese Arbeit leitet mithilfe einer systematischen Literaturanalyse die Erfolgsfaktoren von Data Storytelling für eine effektive und effiziente Kommunikation von Daten her, um Data Scientists in Forschung und Praxis bei der Kommunikation der Daten und Ergebnisse zu unterstützen.
Risk-based Authentication (RBA) is an adaptive security measure to strengthen password-based authentication. RBA monitors additional features during login, and when observed feature values differ significantly from previously seen ones, users have to provide additional authentication factors such as a verification code. RBA has the potential to offer more usable authentication, but the usability and the security perceptions of RBA are not studied well.
We present the results of a between-group lab study (n=65) to evaluate usability and security perceptions of two RBA variants, one 2FA variant, and password-only authentication. Our study shows with significant results that RBA is considered to be more usable than the studied 2FA variants, while it is perceived as more secure than password-only authentication in general and comparably secure to 2FA in a variety of application types. We also observed RBA usability problems and provide recommendations for mitigation. Our contribution provides a first deeper understanding of the users' perception of RBA and helps to improve RBA implementations for a broader user acceptance.
Die Motive für die Einführung von Public Cloud Services liegen oft im Bereich der Kosteneinsparung und Qualitätsverbesserung. Vielfach werden bei der erstmaligen Einführung vermeidbare Fehler gemacht, die im Nachhinein den Erfolg des Vorhabens schmälern. Der Beitrag beschreibt ein aus Sicht der Beratungspraxis bewährtes Vorgehensmodell für die Einführung und Nutzung von Public Cloud Services unter besonderer Berücksichtigung von Microsoft Cloud Services.
Quantum mechanical theories are used to search and optimized the conformations of proposed small molecule candidates for treatment of SARS-CoV-2. These candidate compounds are taken from what is reported in the news and in other pre-peer-reviewed literature (e.g. ChemRxiv, bioRxiv). The goal herein is to provided predicted structures and relative conformational stabilities for selected drug and ligand candidates, in the hopes that other research groups can make use of them for developing a treatment.
Lower back pain is one of the most prevalent diseases in Western societies. A large percentage of European and American populations suffer from back pain at some point in their lives. One successful approach to address lower back pain is postural training, which can be supported by wearable devices, providing real-time feedback about the user’s posture. In this work, we analyze the changes in posture induced by postural training. To this end, we compare snapshots before and after training, as measured by the Gokhale SpineTracker™. Considering pairs of before and after snapshots in different positions (standing, sitting, and bending), we introduce a feature space, that allows for unsupervised clustering. We show that resulting clusters represent certain groups of postural changes, which are meaningful to professional posture trainers.
Science Track FrOSCon 2016
(2018)
Im Jahre 2015 feierte die Free and Open Source Software Conference ihr 10 Jähriges Bestehen. Entstanden aus einer Idee von Studierenden, wissenschaftlichen Mitarbeitern und Professoren des Fachbereichs Informatik entwickelte sich eine der wichtigsten Konferenzen im Bereich der freien und quelloffenen Software in Deutschland.