Refine
Departments, institutes and facilities
Document Type
- Conference Object (9)
- Article (3)
- Report (2)
- Study Thesis (2)
- Preprint (1)
Keywords
- Machine Learning (17) (remove)
Projekte des maschinellen Lernens (ML), insbesondere im Bereich der Zeitreihenanalyse, gewinnen heute zunehmend an Bedeutung. Die Bereitstellung solcher Projekte in einer Produktionsumgebung mit dem gleichen Automatisierungsgrad wie bei klassischen Softwareprojekten ist ein komplexes Unterfangen. Die Umsetzung in Produktionsumgebungen erfordert neben klassischen DevOps auch Machine Learning Operation (MLOps) Technologien und Werkzeuge. Ziel dieser Studie ist es, einen umfassenden Überblick über verfügbare MLOps Tools zu bieten und einen spezifischen Techstack für Zeitreihen ML Projekte zu entwickeln. Es werden aktuelle Trends und Werkzeuge im Bereich MLOps durch eine multivokale Literaturrecherche (MLR) untersucht und analysiert. Die Studie identifiziert passende MLOps Werkzeuge und Methoden für die Zeitreihenanalyse und präsentiert eine spezifische Implementierung einer MLOps Pipeline für die Aktienkursprognose des S&P 500. MLOps und DevOps Tools nehmen eine essenzielle Rolle bei der effektiven Konstruktion und Verwaltung von ML Pipelines ein. Bei der Auswahl geeigneter Werkzeuge ist stets eine spezifische Anpassung an die jeweiligen Projektanforderungen erforderlich. Die Bereitstellung einer detaillierten Darstellung der aktuellen MLOps Tool Landschaft erweist sich hierbei als wertvolle Ressource, die es Entwicklern ermöglicht, die Effizienz und Effektivität ihrer ML Projekte zu optimieren.
Angesichts der raschen Entwicklungen und der Besonderheiten von Softwaresystemen, welche Künstliche Intelligenz (KI) nutzen, ist ein angepasstes Requirements Engineering (RE) erforderlich. Die spezifischen Anforderungen von KI-Projekten müssen dabei erkannt und angegangen werden. Hierfür wird eine systematische Überprufung bestehender Herausforderungen des RE in KI-Projekten durchgeführt. Darauf aufbauend werden neue RE-Ansätze und Empfehlungen präsentiert, die auf die Datensicht von KI-Projekten abzielen. Mithilfe der Analyse bestehender Lösungsansatze, Methoden, Frameworks und Tools soll aufgezeigt werden, inwiefern die Herausforderungen im RE bewältigt werden können. Noch bestehende Lücken im Forschungsstand werden identifiziert und aufgezeigt.
Trends of environmental awareness, combined with a focus on personal fitness and health, motivate many people to switch from cars and public transport to micromobility solutions, namely bicycles, electric bicycles, cargo bikes, or scooters. To accommodate urban planning for these changes, cities and communities need to know how many micromobility vehicles are on the road. In a previous work, we proposed a concept for a compact, mobile, and energy-efficient system to classify and count micromobility vehicles utilizing uncooled long-wave infrared (LWIR) image sensors and a neuromorphic co-processor. In this work, we elaborate on this concept by focusing on the feature extraction process with the goal to increase the classification accuracy. We demonstrate that even with a reduced feature list compared with our early concept, we manage to increase the detection precision to more than 90%. This is achieved by reducing the images of 160 × 120 pixels to only 12 × 18 pixels and combining them with contour moments to a feature vector of only 247 bytes.
In the field of automatic music generation, one of the greatest challenges is the consistent generation of pieces continuously perceived positively by the majority of the audience since there is no objective method to determine the quality of a musical composition. However, composing principles, which have been refined for millennia, have shaped the core characteristics of today's music. A hybrid music generation system, mlmusic, that incorporates various static, music-theory-based methods, as well as data-driven, subsystems, is implemented to automatically generate pieces considered acceptable by the average listener. Initially, a MIDI dataset, consisting of over 100 hand-picked pieces of various styles and complexities, is analysed using basic music theory principles, and the abstracted information is fed into explicitly constrained LSTM networks. For chord progressions, each individual network is specifically trained on a given sequence length, while phrases are created by consecutively predicting the notes' offset, pitch and duration. Using these outputs as a composition's foundation, additional musical elements, along with constrained recurrent rhythmic and tonal patterns, are statically generated. Although no survey regarding the pieces' reception could be carried out, the successful generation of numerous compositions of varying complexities suggests that the integration of these fundamentally distinctive approaches might lead to success in other branches.
Focus on what matters: improved feature selection techniques for personal thermal comfort modelling
(2022)
Occupants' personal thermal comfort (PTC) is indispensable for their well-being, physical and mental health, and work efficiency. Predicting PTC preferences in a smart home can be a prerequisite to adjusting the indoor temperature for providing a comfortable environment. In this research, we focus on identifying relevant features for predicting PTC preferences. We propose a machine learning-based predictive framework by employing supervised feature selection techniques. We apply two feature selection techniques to select the optimal sets of features to improve the thermal preference prediction performance. The experimental results on a public PTC dataset demonstrated the efficiency of the feature selection techniques that we have applied. In turn, our PTC prediction framework with feature selection techniques achieved state-of-the-art performance in terms of accuracy, Cohen's kappa, and area under the curve (AUC), outperforming conventional methods.
Graph databases employ graph structures such as nodes, attributes and edges to model and store relationships among data. To access this data, graph query languages (GQL) such as Cypher are typically used, which might be difficult to master for end-users. In the context of relational databases, sequence to SQL models, which translate natural language questions to SQL queries, have been proposed. While these Neural Machine Translation (NMT) models increase the accessibility of relational databases, NMT models for graph databases are not yet available mainly due to the lack of suitable parallel training data. In this short paper we sketch an architecture which enables the generation of synthetic training data for the graph query language Cypher.
ProtSTonKGs: A Sophisticated Transformer Trained on Protein Sequences, Text, and Knowledge Graphs
(2022)
While most approaches individually exploit unstructured data from the biomedical literature or structured data from biomedical knowledge graphs, their union can better exploit the advantages of such approaches, ultimately improving representations of biology. Using multimodal transformers for such purposes can improve performance on context dependent classication tasks, as demonstrated by our previous model, the Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs (STonKGs). In this work, we introduce ProtSTonKGs, a transformer aimed at learning all-encompassing representations of protein-protein interactions. ProtSTonKGs presents an extension to our previous work by adding textual protein descriptions and amino acid sequences (i.e., structural information) to the text- and knowledge graph-based input sequence used in STonKGs. We benchmark ProtSTonKGs against STonKGs, resulting in improved F1 scores by up to 0.066 (i.e., from 0.204 to 0.270) in several tasks such as predicting protein interactions in several contexts. Our work demonstrates how multimodal transformers can be used to integrate heterogeneous sources of information, paving the foundation for future approaches that use multiple modalities for biomedical applications.