Refine
H-BRS Bibliography
- yes (9) (remove)
Departments, institutes and facilities
Document Type
- Conference Object (7)
- Article (1)
- Preprint (1)
Language
- English (9)
Has Fulltext
- no (9) (remove)
Keywords
- Machine Learning (9) (remove)
Focus on what matters: improved feature selection techniques for personal thermal comfort modelling
(2022)
Occupants' personal thermal comfort (PTC) is indispensable for their well-being, physical and mental health, and work efficiency. Predicting PTC preferences in a smart home can be a prerequisite to adjusting the indoor temperature for providing a comfortable environment. In this research, we focus on identifying relevant features for predicting PTC preferences. We propose a machine learning-based predictive framework by employing supervised feature selection techniques. We apply two feature selection techniques to select the optimal sets of features to improve the thermal preference prediction performance. The experimental results on a public PTC dataset demonstrated the efficiency of the feature selection techniques that we have applied. In turn, our PTC prediction framework with feature selection techniques achieved state-of-the-art performance in terms of accuracy, Cohen's kappa, and area under the curve (AUC), outperforming conventional methods.
Graph databases employ graph structures such as nodes, attributes and edges to model and store relationships among data. To access this data, graph query languages (GQL) such as Cypher are typically used, which might be difficult to master for end-users. In the context of relational databases, sequence to SQL models, which translate natural language questions to SQL queries, have been proposed. While these Neural Machine Translation (NMT) models increase the accessibility of relational databases, NMT models for graph databases are not yet available mainly due to the lack of suitable parallel training data. In this short paper we sketch an architecture which enables the generation of synthetic training data for the graph query language Cypher.
The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs. This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler (INDRA) consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against two baseline models trained on either one of the modalities (i.e., text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.083. Additionally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. Finally, the source code and pre-trained STonKGs models are available at https://github.com/stonkgs/stonkgs and https://huggingface.co/stonkgs/stonkgs-150k.