pub H-BRS | 005 Computerprogrammierung, Programme, Daten

Vergleich von Open Source MLOps Tools zur Unterstützung von Machine Learning basierten Zeitreihenanalysen (2024)

Projekte des maschinellen Lernens (ML), insbesondere im Bereich der Zeitreihenanalyse, gewinnen heute zunehmend an Bedeutung. Die Bereitstellung solcher Projekte in einer Produktionsumgebung mit dem gleichen Automatisierungsgrad wie bei klassischen Softwareprojekten ist ein komplexes Unterfangen. Die Umsetzung in Produktionsumgebungen erfordert neben klassischen DevOps auch Machine Learning Operation (MLOps) Technologien und Werkzeuge. Ziel dieser Studie ist es, einen umfassenden Überblick über verfügbare MLOps Tools zu bieten und einen spezifischen Techstack für Zeitreihen ML Projekte zu entwickeln. Es werden aktuelle Trends und Werkzeuge im Bereich MLOps durch eine multivokale Literaturrecherche (MLR) untersucht und analysiert. Die Studie identifiziert passende MLOps Werkzeuge und Methoden für die Zeitreihenanalyse und präsentiert eine spezifische Implementierung einer MLOps Pipeline für die Aktienkursprognose des S&P 500. MLOps und DevOps Tools nehmen eine essenzielle Rolle bei der effektiven Konstruktion und Verwaltung von ML Pipelines ein. Bei der Auswahl geeigneter Werkzeuge ist stets eine spezifische Anpassung an die jeweiligen Projektanforderungen erforderlich. Die Bereitstellung einer detaillierten Darstellung der aktuellen MLOps Tool Landschaft erweist sich hierbei als wertvolle Ressource, die es Entwicklern ermöglicht, die Effizienz und Effektivität ihrer ML Projekte zu optimieren.

A Methodology for Integrating Hierarchical VMAP-Data Structures into an Ontology Using Semantically Represented Analyses (2024)

Spelten, Philipp ; Meyer, Morten-Christian ; Wagner, Anna ; Wolf, Klaus ; Reith, Dirk

Integrating physical simulation data into data ecosystems challenges the compatibility and interoperability of data management tools. Semantic web technologies and relational databases mostly use other data types, such as measurement or manufacturing design data. Standardizing simulation data storage and harmonizing the data structures with other domains is still a challenge, as current standards such as the ISO standard STEP (ISO 10303 ”Standard for the Exchange of Product model data”) fail to bridge the gap between design and simulation data. This challenge requires new methods, such as ontologies, to rethink simulation results integration. This research describes a new software architecture and application methodology based on the industrial standard ”Virtual Material Modelling in Manufacturing” (VMAP). The architecture integrates large quantities of structured simulation data and their analyses into a semantic data structure. It is capable of providing data permeability from the global digital twin level to the detailed numerical values of data entries and even new key indicators in a three-step approach: It represents a file as an instance in a knowledge graph, queries the file’s metadata, and finds a semantically represented process that enables new metadata to be created and instantiated.

Datengetriebenes Requirements Engineering für die Integration von KI in Softwaresysteme (2024)

Tewes, Alexander

Angesichts der raschen Entwicklungen und der Besonderheiten von Softwaresystemen, welche Künstliche Intelligenz (KI) nutzen, ist ein angepasstes Requirements Engineering (RE) erforderlich. Die spezifischen Anforderungen von KI-Projekten müssen dabei erkannt und angegangen werden. Hierfür wird eine systematische Überprufung bestehender Herausforderungen des RE in KI-Projekten durchgeführt. Darauf aufbauend werden neue RE-Ansätze und Empfehlungen präsentiert, die auf die Datensicht von KI-Projekten abzielen. Mithilfe der Analyse bestehender Lösungsansatze, Methoden, Frameworks und Tools soll aufgezeigt werden, inwiefern die Herausforderungen im RE bewältigt werden können. Noch bestehende Lücken im Forschungsstand werden identifiziert und aufgezeigt.

Making Order in Household Accounting - Digital Invoices as Domestic Work Artifacts (2024)

Dethier, Erik ; Kern, Dean-Robin ; Stevens, Gunnar ; Boden, Alexander

The digitization of financial activities in consumers' lives is increasing, and the digitalization of invoicing processes is expected to play a significant role, although this area is not well understood regarding the private sector. Human-Computer Interaction (HCI) and Computer Supported Cooperative Work (CSCW) research have a long history of analyzing the socio-material and temporal aspects of work practices that are relevant for the domestic domain. The socio-material structuring of invoicing work and the working styles of consumers must be considered when designing effective consumer support systems. In this ethnomethodologically-informed, design-oriented interview study, we followed 17 consumers in their daily practices of dealing with invoices to make the invisible administrative work involved in this process visible. We identified and described the meaningful artifacts that were used in a spatial-temporal process within various storage locations such as input, reminding, intermediate (for postponing cases) buffers, and archive systems. Furthermore, we identified three different working styles that consumers exhibited: direct completion, at the next opportunity, and postpone as far as possible. This study contributes to our understanding of household economics and domestic workplace studies in the tradition of CSCW and has implications for the design of electronic invoicing systems.

Developing OERs for Teaching Database Systems (2023)

Rakow, Thomas C. ; Kless, André ; Hasler, Charlotte ; Knolle, Harm ; Faeskorn-Woyke, Heide ; Saatz, Inga Marina ; Lambert, Jens ; Focken, Mareike

In the project EILD.nrw, Open Educational Resources (OER) have been developed for teaching databases. Lecturers can use the tools and courses in a variety of learning scenarios. Students of computer science and application subjects can learn the complete life cycle of databases. For this purpose, quizzes, interactive tools, instructional videos, and courses for learning management systems are developed and published under a Creative Commons license. We give an overview of the developed OERs according to subject, description, teaching form, and format. Following, we describe how licencing, sustainability, accessibility, contextualization, content description, and technical adaptability are implemented. The feedback of students in ongoing classes are evaluated.

Trust your guts: fostering embodied knowledge and sustainable practices through voice interaction (2023)

Esau, Margarita ; Lawo, Dennis ; Neifer, Thomas ; Stevens, Gunnar ; Boden, Alexander

Despite various attempts to prevent food waste and motivate conscious food handling, household members find it difficult to correctly assess the edibility of food. With the rise of ambient voice assistants, we did a design case study to support households’ in situ decision-making process in collaboration with our voice agent prototype, Fischer Fritz. Therefore, we conducted 15 contextual inquiries to understand food practices at home. Furthermore, we interviewed six fish experts to inform the design of our voice agent on how to guide consumers and teach food literacy. Finally, we created a prototype and discussed with 15 consumers its impact and capability to convey embodied knowledge to the human that is engaged as sensor. Our design research goes beyond current Human-Food Interaction automation approaches by emphasizing the human-food relationship in technology design and demonstrating future complementary human-agent collaboration with the aim to increase humans’ competence to sense, think, and act.

A systematic literature review about the consumers’ side of fake review detection – Which cues do consumers use to determine the veracity of online user reviews? (2023)

Walther, Michelle ; Jakobi, Timo ; Watson, Steven James ; Stevens, Gunnar

Background Consumers rely heavily on online user reviews when shopping online and cybercriminals produce fake reviews to manipulate consumer opinion. Much prior research focuses on the automated detection of these fake reviews, which are far from perfect. Therefore, consumers must be able to detect fake reviews on their own. In this study we survey the research examining how consumers detect fake reviews online. Methods We conducted a systematic literature review over the research on fake review detection from the consumer-perspective. We included academic literature giving new empirical data. We provide a narrative synthesis comparing the theories, methods and outcomes used across studies to identify how consumers detect fake reviews online. Results We found only 15 articles that met our inclusion criteria. We classify the most often used cues identified into five categories which were (1) review characteristics (2) textual characteristics (3) reviewer characteristics (4) seller characteristics and (5) characteristics of the platform where the review is displayed. Discussion We find that theory is applied inconsistently across studies and that cues to deception are often identified in isolation without any unifying theoretical framework. Consequently, we discuss how such a theoretical framework could be developed.

Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service (2023)

Wiefling, Stephan ; Jørgensen, Paul René ; Thunem, Sigurd ; Lo Iacono, Luigi

Risk-based authentication (RBA) aims to protect users against attacks involving stolen passwords. RBA monitors features during login, and requests re-authentication when feature values widely differ from those previously observed. It is recommended by various national security organizations, and users perceive it more usable than and equally secure to equivalent two-factor authentication. Despite that, RBA is still used by very few online services. Reasons for this include a lack of validated open resources on RBA properties, implementation, and configuration. This effectively hinders the RBA research, development, and adoption progress. To close this gap, we provide the first long-term RBA analysis on a real-world large-scale online service. We collected feature data of 3.3 million users and 31.3 million login attempts over more than 1 year. Based on the data, we provide (i) studies on RBA’s real-world characteristics plus its configurations and enhancements to balance usability, security, and privacy; (ii) a machine learning–based RBA parameter optimization method to support administrators finding an optimal configuration for their own use case scenario; (iii) an evaluation of the round-trip time feature’s potential to replace the IP address for enhanced user privacy; and (iv) a synthesized RBA dataset to reproduce this research and to foster future RBA research. Our results provide insights on selecting an optimized RBA configuration so that users profit from RBA after just a few logins. The open dataset enables researchers to study, test, and improve RBA for widespread deployment in the wild.

Risk-Based Authentication for OpenStack: A Fully Functional Implementation and Guiding Example (2023)

Unsel, Vincent ; Wiefling, Stephan ; Gruschka, Nils ; Lo Iacono, Luigi

Online services have difficulties to replace passwords with more secure user authentication mechanisms, such as Two-Factor Authentication (2FA). This is partly due to the fact that users tend to reject such mechanisms in use cases outside of online banking. Relying on password authentication alone, however, is not an option in light of recent attack patterns such as credential stuffing. Risk-Based Authentication (RBA) can serve as an interim solution to increase password-based account security until better methods are in place. Unfortunately, RBA is currently used by only a few major online services, even though it is recommended by various standards and has been shown to be effective in scientific studies. This paper contributes to the hypothesis that the low adoption of RBA in practice can be due to the complexity of implementing it. We provide an RBA implementation for the open source cloud management software OpenStack, which is the first fully functional open source RBA implementation based on the Freeman et al. algorithm, along with initial reference tests that can serve as a guiding example and blueprint for developers.

Data Cart: A Privacy Pattern for Personal Data Management in Organizations (2023)

Tolsdorf, Jan ; Lo Iacono, Luigi

The European General Data Protection Regulation requires the implementation of Technical and Organizational Measures (TOMs) to reduce the risk of illegitimate processing of personal data. For these measures to be effective, they must be applied correctly by employees who process personal data under the authority of their organization. However, even data processing employees often have limited knowledge of data protection policies and regulations, which increases the likelihood of misconduct and privacy breaches. To lower the likelihood of unintentional privacy breaches, TOMs must be developed with employees’ needs, capabilities, and usability requirements in mind. To reduce implementation costs and help organizations and IT engineers with the implementation, privacy patterns have proven to be effective for this purpose. In this chapter, we introduce the privacy pattern Data Cart, which specifically helps to develop TOMs for data processing employees. Based on a user-centered design approach with employees from two public organizations in Germany, we present a concept that illustrates how Privacy by Design can be effectively implemented. Organizations, IT engineers, and researchers will gain insight on how to improve the usability of privacy-compliant tools for managing personal data.

Data Protection Officers' Perspectives on Privacy Challenges in Digital Ecosystems (2023)

Wiefling, Stephan ; Tolsdorf, Jan ; Lo Iacono, Luigi

Digital ecosystems are driving the digital transformation of business models. Meanwhile, the associated processing of personal data within these complex systems poses challenges to the protection of individual privacy. In this paper, we explore these challenges from the perspective of digital ecosystems' platform providers. To this end, we present the results of an interview study with seven data protection officers representing a total of 12 digital ecosystems in Germany. We identified current and future challenges for the implementation of data protection requirements, covering issues on legal obligations and data subject rights. Our results support stakeholders involved in the implementation of privacy protection measures in digital ecosystems, and form the foundation for future privacy-related studies tailored to the specifics of digital ecosystems.

Automatic Consistency Checking of Table and Text in Financial Documents (2023)

Ali, Syed Musharraf ; Deußer, Tobias ; Houben, Sebastian ; Hillebrand, Lars ; Metzler, Tim ; Sifa, Rafet

A company's financial documents use tables along with text to organize the data containing key performance indicators (KPIs) (such as profit and loss) and a financial quantity linked to them. The KPI’s linked quantity in a table might not be equal to the similarly described KPI's quantity in a text. Auditors take substantial time to manually audit these financial mistakes and this process is called consistency checking. As compared to existing work, this paper attempts to automate this task with the help of transformer-based models. Furthermore, for consistency checking it is essential for the table's KPIs embeddings to encode the semantic knowledge of the KPIs and the structural knowledge of the table. Therefore, this paper proposes a pipeline that uses a tabular model to get the table's KPIs embeddings. The pipeline takes input table and text KPIs, generates their embeddings, and then checks whether these KPIs are identical. The pipeline is evaluated on the financial documents in the German language and a comparative analysis of the cell embeddings' quality from the three tabular models is also presented. From the evaluation results, the experiment that used the English-translated text and table KPIs and Tabbie model to generate table KPIs’ embeddings achieved an accuracy of 72.81% on the consistency checking task, outperforming the benchmark, and other tabular models.

Knowledge Base Question Answering by Transformer-Based Graph Pattern Scoring (2023)

Lamott, Marcel ; Hees, Jörn ; Ulges, Adrian

Question Answering (QA) has gained significant attention in recent years, with transformer-based models improving natural language processing. However, issues of explainability remain, as it is difficult to determine whether an answer is based on a true fact or a hallucination. Knowledge-based question answering (KBQA) methods can address this problem by retrieving answers from a knowledge graph. This paper proposes a hybrid approach to KBQA called FRED, which combines pattern-based entity retrieval with a transformer-based question encoder. The method uses an evolutionary approach to learn SPARQL patterns, which retrieve candidate entities from a knowledge base. The transformer-based regressor is then trained to estimate each pattern’s expected F1 score for answering the question, resulting in a ranking ofcandidate entities. Unlike other approaches, FRED can attribute results to learned SPARQL patterns, making them more interpretable. The method is evaluated on two datasets and yields MAP scores of up to 73 percent, with the transformer-based interpretation falling only 4 pp short of an oracle run. Additionally, the learned patterns successfully complement manually generated ones and generalize well to novel questions.

TreeSatAI Benchmark Archive : a multi-sensor, multi-label dataset for tree species classification in remote sensing (2023)

Ahlswede, Steve ; Schulz, Christian ; Gava, Christiano ; Helber, Patrick ; Bischke, Benjamin ; Förster, Michael ; Arias, Florencia ; Hees, Jörn ; Demir, Begüm ; Kleinschmit, Birgit

Airborne and spaceborne platforms are the primary data sources for large-scale forest mapping, but visual interpretation for individual species determination is labor-intensive. Hence, various studies focusing on forests have investigated the benefits of multiple sensors for automated tree species classification. However, transferable deep learning approaches for large-scale applications are still lacking. This gap motivated us to create a novel dataset for tree species classification in central Europe based on multi-sensor data from aerial, Sentinel-1 and Sentinel-2 imagery. In this paper, we introduce the TreeSatAI Benchmark Archive, which contains labels of 20 European tree species (i.e., 15 tree genera) derived from forest administration data of the federal state of Lower Saxony, Germany. We propose models and guidelines for the application of the latest machine learning techniques for the task of tree species classification with multi-label data. Finally, we provide various benchmark experiments showcasing the information which can be derived from the different sensors including artificial neural networks and tree-based machine learning methods. We found that residual neural networks (ResNet) perform sufficiently well with weighted precision scores up to 79 % only by using the RGB bands of aerial imagery. This result indicates that the spatial content present within the 0.2 m resolution data is very informative for tree species classification. With the incorporation of Sentinel-1 and Sentinel-2 imagery, performance improved marginally. However, the sole use of Sentinel-2 still allows for weighted precision scores of up to 74 % using either multi-layer perceptron (MLP) or Light Gradient Boosting Machine (LightGBM) models. Since the dataset is derived from real-world reference data, it contains high class imbalances. We found that this dataset attribute negatively affects the models' performances for many of the underrepresented classes (i.e., scarce tree species). However, the class-wise precision of the best-performing late fusion model still reached values ranging from 54 % (Acer) to 88 % (Pinus). Based on our results, we conclude that deep learning techniques using aerial imagery could considerably support forestry administration in the provision of large-scale tree species maps at a very high resolution to plan for challenges driven by global environmental change. The original dataset used in this paper is shared via Zenodo (https://doi.org/10.5281/zenodo.6598390, Schulz et al., 2022). For citation of the dataset, we refer to this article.

Effective data protection by design through interdisciplinary research methods: The example of effective purpose specification by applying user-Centred UX-design methods (2022)

Grafenstein, Max von ; Jakobi, Timo ; Stevens, Gunnar

While the recent discussion on Art. 25 GDPR often considers the approach of data protection by design as an innovative idea, the notion of making data protection law more effective through requiring the data controller to implement the legal norms into the processing design is almost as old as the data protection debate. However, there is another, more recent shift in establishing the data protection by design approach through law, which is not yet understood to its fullest extent in the debate. Art. 25 GDPR requires the controller to not only implement the legal norms into the processing design but to do so in an effective manner. By explicitly declaring the effectiveness of the protection measures to be the legally required result, the legislator inevitably raises the question of which methods can be used to test and assure such efficacy. In our opinion, extending the legal compatibility assessment to the real effects of the required measures opens this approach to interdisciplinary methodologies. In this paper, we first summarise the current state of research on the methodology established in Art. 25 sect. 1 GDPR, and pinpoint some of the challenges of incorporating interdisciplinary research methodologies. On this premise, we present an empirical research methodology and first findings which offer one approach to answering the question on how to specify processing purposes effectively. Lastly, we discuss the implications of these findings for the legal interpretation of Art. 25 GDPR and related provisions, especially with respect to a more effective implementation of transparency and consent, and provide an outlook on possible next research steps.

Warum wir parteiische Datentreuhänder brauchen (2022)

Stevens, Gunnar ; Boden, Alexander

Der technische Fortschritt im Bereich der Erhebung, Speicherung und Verarbeitung von Daten macht es erforderlich, neue Fragen zu sozialverträglichen Datenmärkten aufzuwerfen. So gibt es sowohl eine Tendenz zur vereinfachten Datenteilung als auch die Forderung, die informationelle Selbstbestimmung besser zu schützen. Innerhalb dieses Spannungsfeldes bewegt sich die Idee von Datentreuhändern. Ziel des Beitrags ist darzulegen, dass zwischen verschiedenen Formen der Datentreuhänderschaft unterschieden werden sollte, um der Komplexität des Themas gerecht zu werden. Insbesondere bedarf es neben der mehrseitigen Treuhänderschaft, mit dem Treuhänder als neutraler Instanz, auch der einseitigen Treuhänderschaft, bei dem der Treuhänder als Anwalt der Verbraucherinteressen fungiert. Aus dieser Perspektive wird das Modell der Datentreuhänderschaft als stellvertretende Deutung der Interessen individueller und kollektiver Identitäten systematisch entwickelt.

Eight Lightweight Usable Security Principles for Developers (2022)

Gorski, Peter Leo ; Lo Iacono, Luigi ; Smith, Matthew

We propose eight usable security principles that provide software developers with a lightweight framework to help them integrate security in a user-friendly way. These principles should help developers who must weigh usability and security tradeoffs to facilitate adoption.

Employees’ privacy perceptions: exploring the dimensionality and antecedents of personal data sensitivity and willingness to disclose (2022)

Tolsdorf, Jan ; Reinhardt, Delphine ; Lo Iacono, Luigi

The processing of employees’ personal data is dramatically increasing, yet there is a lack of tools that allow employees to manage their privacy. In order to develop these tools, one needs to understand what sensitive personal data are and what factors influence employees’ willingness to disclose. Current privacy research, however, lacks such insights, as it has focused on other contexts in recent decades. To fill this research gap, we conducted a cross-sectional survey with 553 employees from Germany. Our survey provides multiple insights into the relationships between perceived data sensitivity and willingness to disclose in the employment context. Among other things, we show that the perceived sensitivity of certain types of data differs substantially from existing studies in other contexts. Moreover, currently used legal and contextual distinctions between different types of data do not accurately reflect the subtleties of employees’ perceptions. Instead, using 62 different data elements, we identified four groups of personal data that better reflect the multi-dimensionality of perceptions. However, previously found common disclosure antecedents in the context of online privacy do not seem to affect them. We further identified three groups of employees that differ in their perceived data sensitivity and willingness to disclose, but neither in their privacy beliefs nor in their demographics. Our findings thus provide employers, policy makers, and researchers with a better understanding of employees’ privacy perceptions and serve as a basis for future targeted research on specific types of personal data and employees.

Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety (2022)

Deployment of modern data-driven machine learning methods, most often realized by deep neural networks (DNNs), in safety-critical applications such as health care, industrial plant control, or autonomous driving is highly challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability and implausible predictions to directed attacks by means of malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from so-called safety concerns, properties that preclude their deployment as no argument or experimental setup can help to assess the remaining risk. In recent years, an abundance of state-of-the-art techniques aiming to address these safety concerns has emerged. This chapter provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our work addresses machine learning experts and safety engineers alike: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern machine learning methods. We hope that this contribution fuels discussions on desiderata for machine learning systems and strategies on how to help to advance existing approaches accordingly.

Von Personal Information Management zu Vendor Management Software (2022)

Dethier, Erik ; Pakusch, Christina ; Boden, Alexander

Personal-Information-Management-Systeme (PIMS) gelten als Chance, um die Datensouveränität der Verbraucher zu stärken. Datenschutzbezogene Fragen sind für Verbraucher immer dort relevant, wo sie Verträge und Nutzungsbedingungen mit Diensteanbietern eingehen. Vor diesem Hintergrund diskutiert dieser Beitrag die Potenziale von VRM-Systemen, die nicht nur das Datenmanagement, sondern das gesamte Vertragsmanagement von Verbrauchern unterstützen. Dabei gehen wir der Frage nach, ob diese besser geeignet sind, um Verbraucher zu souveränem Handeln zu befähigen.

Datenschutz und informationelle Selbstbestimmung in der Schule (2021)

Rau, Franco ; Galanamatis, Britta ; Gerber, Lars ; Grell, Petra ; Konert, Johannes ; Rheinländer, Kathrin ; Scholl, Daniel

Datenschutz und informationelle Selbstbestimmung sind Bestandteile aktueller Leitbilder einer Digitalen Bildung in der Schule. Im Kontext der Schulschließungen und der vorrangigen Nutzung digitaler Medien zeigte sich jedoch, dass Datenschutz weder als Thema noch als Gestaltungsprinzip digitaler Lernumgebungen in der bildungsadministrativen und pädagogisch-praktischen Schulwirklichkeit systematisch verankert ist. Die Diskrepanz zwischen aktuellen Leitbildern einer digitalen Bildung und der sichtbar problematischen Praxis des digitalen Notfalldistanzunterrichts markiert den Ausgangspunkt des Beitrages, der sich der übergeordneten Frage widmet, welche Herausforderungen sich bei der Realisierung von Datenschutz in der Schul- und Unterrichtswirklichkeit in einer digital geprägten Welt stellen. Im Sinne einer Problemfeldanalyse werden prototypische Handlungsprobleme der Schule herausgearbeitet. Fokussiert betrachtet werden exemplarische Herausforderungen und Anforderungen an Technologien und Akteur:innen der inneren und äußeren Schulentwicklung auf den Ebenen der Unterrichtsentwicklung, der Personalentwicklung, der Technologieentwicklung und der Organisationsentwicklung.

Datensouveränität für Verbraucher:innen: Technische Ansätze durch KI-basierte Transparenz und Auskunft im Kontext der DSGVO (2021)

Grünewald, Elias ; Pallas, Frank

Hinreichende Datensouveränität gestaltet sich für Verbraucher:innen in der Praxis als äußerst schwierig. Die Europäische Datenschutzgrundverordnung garantiert umfassende Betroffenenrechte, die von verwantwortlichen Stellen durch technisch-organisatorische Maßnahmen umzusetzen sind. Traditionelle Vorgehensweisen wie die Bereitstellung länglicher Datenschutzerklärungen oder der ohne weitere Hilfestellungen angebotene Download von personenbezogenen Rohdaten werden dem Anspruch der informationellen Selbstbestimmung nicht gerecht. Die im Folgenden aufgezeigten neuen technischen Ansätze insbesondere KI-basierter Transparenz- und Auskunftsmodalitäten zeigen die Praktikabilität wirksamer und vielseitiger Mechanismen. Hierzu werden die relevanten Transparenzangaben teilautomatisiert extrahiert, maschinenlesbar repräsentiert und anschließend über diverse Kanäle wie virtuelle Assistenten oder die Anreicherung von Suchergebnissen ausgespielt. Ergänzt werden außerdem automatisierte und leicht zugängliche Methoden für Auskunftsersuchen und deren Aufbereitung nach Art. 15 DSGVO. Abschließend werden konkrete Regulierungsimplikationen diskutiert.

Consuming disaster data: is IT ethical? (2021)

Perng, Sung-Yueh ; Büscher, Monika ; Moffat, Luke

Most people use disaster apps infrequently, primarily only in situations of turmoil, when they are physically or emotionally vulnerable. Personal data may be necessary to help them, data protections may be waived. In some circumstances, free movement and liberties may be curtailed for public protection, as was seen in the current COVID pandemic. Consuming and producing disaster data can deepen problems arising at the confluence of surveillance and disaster capitalism, where data has become a tool for solutionist instrumentarian power (Zuboff 2019, Klein 2008) and part of a destructive mode of one world worlding (Law 2015, Escobar 2020). The special use of disaster apps prompts us to ask what role consumer protection could play in safeguarding democratic liberties. Within this work, a set of current approaches are briefly reviewed and two case studies are presented of what we call appropriation or design against datafication. These combine document analysis and literature research with several months of online and field ethnographic observation. The first case study examines disaster app use in response to the 2010 Haiti earthquake, the second explores COVID Contact Tracing in Taiwan in 2020/21. Against this backdrop we ask, ‘how could and how should consumer protection respond to problems of surveillance disaster capitalism?’ Drawing on our work with the is IT ethical? Exchange, a co-designed community platform and knowledge exchange for disaster information sharing, and a Societal Readiness Assessment Framework that we are developing alongside it, we explore how co-design methodologies could help define answers.

Datenschutz und -souveränität für Verbraucher:innen im künstlich-intelligenten Fahrzeug (2021)

Kroschwald, Steffen

Künstliche Intelligenz im autonomen Fahrzeug verarbeitet enorme Mengen an Daten. Beim Betrieb eines solchen Fahrzeugs basiert jede Bewegung auf einer datenbasierten, automatisierten und adaptiven Entscheidungsfindung. Aber auch, um Regeln zur Erkennung und Entscheidung in komplexen Situationen wie den hochindividuellen Verkehrsszenarien entwickeln zu können (KI-Training), sind bereits beachtliche Datenmengen von Fahrzeugen im Realverkehr erforderlich – zum Beispiel Videosequenzen aus Kamerafahrten. Für das Training Künstlicher Intelligenz ist es aus Sicht der Fahrzeugentwicklung attraktiv, auf den Datenschatz zuzugreifen, den die Gesamtheit der Fahrzeuge im realen Anwendungskontext erzeugen kann. Als Nutzer:innen und Insassen sind Verbraucher:innen so Teil einer groß angelegten Testdatenerhebung durch Fahrzeughersteller und Anbieter. Das wirft Datenschutzfragen auf. Ziel des vorliegenden Beitrags ist es herauszuarbeiten, inwiefern sich hierdurch Implikationen für die Rechte und Freiheiten von Verbraucher:innen ergeben und welche Mechanismen das geltende Recht sowie aktuelle legislative Entwicklungen bereithalten, den „Datenhunger“ der KI mit den Interessen an Datensouveränität und informationeller Selbstbestimmung in Einklang und Ausgleich zu bringen. Im Fokus steht dabei insbesondere, wie Anforderungen schon im Produktdesign „mitgedacht“ werden und damit für Verbraucher:innen rechts- und vertrauensfördernd wirken können.

„Im Wohnzimmer kriegt die schon alles mit“ – Sprachassistentendaten im Alltag (2021)

Pins, Dominik ; Alizadeh, Fatemeh

Sprachassistenten wie Alexa oder Google Assistant sind aus dem Alltag vieler VerbraucherInnen nicht mehr wegzudenken. Sie überzeugen insbesondere durch die sprachbasierte und somit freihändige Steuerung und mitunter auch den unterhaltsamen Charakter. Als häuslicher Lebensmittelpunkt sind die häufigsten Aufstellungsorte das Wohnzimmer und die Küche, da sich Haushaltsmitglieder dort die meiste Zeit aufhalten und das alltägliche Leben abspielt. Dies bedeutet allerdings ebenso, dass an diesen Orten potenziell viele Daten erfasst und gesammelt werden können, die nicht für den Sprachassistenten bestimmt sind. Demzufolge ist nicht auszuschließen, dass der Sprachassistent – wenn auch versehentlich – durch Gespräche oder Geräusche aktiviert wird und Aufnahmen speichert, selbst wenn eine Aktivierung unbewusst von Anwesenden bzw. von anderen Geräten (z. B. Fernseher) erfolgt oder aus anderen Räumen kommt. Im Rahmen eines Forschungsprojekts haben wir dazu NutzerInnen über Ihre Nutzungs- und Aufstellungspraktiken der Sprachassistenten befragt und zudem einen Prototyp getestet, der die gespeicherten Interaktionen mit dem Sprachassistenten sichtbar macht. Dieser Beitrag präsentiert basierend auf den Erkenntnissen aus den Interviews und abgeleiteten Leitfäden aus den darauffolgenden Nutzungstests des Prototyps eine Anwendung zur Beantragung und Visualisierung der Interaktionsdaten mit dem Sprachassistenten. Diese ermöglicht es, Interaktionen und die damit zusammenhängende Situation darzustellen, indem sie zu jeder Interaktion die Zeit, das verwendete Gerät sowie den Befehl wiedergibt und unerwartete Verhaltensweisen wie die versehentliche oder falsche Aktivierung sichtbar macht. Dadurch möchten wir VerbraucherInnen für die Fehleranfälligkeit dieser Geräte sensibilisieren und einen selbstbestimmteren und sichereren Umgang ermöglichen.

Open Access

005 Computerprogrammierung, Programme, Daten

Refine

H-BRS Bibliography

Departments, institutes and facilities

Document Type

Year of publication

Language

Has Fulltext

Keywords

52 search hits