View recommendation for multi-camera demonstration-based training
While humans can effortlessly pick a view from multiple streams, automatically choosing the best view is challenging. Selecting the best view from multi-camera streams raises the question of which objective metrics to consider, and existing work on view selection lacks consensus on this point: the literature describes diverse candidate metrics, and strategies such as information-theoretic, instructional-design, or aesthetics-motivated approaches each fail to incorporate the others. In this work, we propose a strategy that combines information-theoretic and instructional-design-based objective metrics to select the best view from a set of views. Information-theoretic measures have traditionally been used to assess the goodness of a view, for example in 3D rendering. We adapted one such measure, viewpoint entropy, to real-world 2D images, and additionally incorporated a similarity penalization to obtain a more accurate estimate of a view's entropy, which serves as one of the metrics for best-view selection. Since the choice of the best view is domain-dependent, we chose demonstration-based training scenarios as our use case; a limitation of these scenarios is that they feature a single trainer and do not include collaborative training. To incorporate instructional-design considerations, we included the visibility of the trainer's body pose, face, face while instructing, and hands as metrics; to incorporate domain knowledge, we included the visibility of predetermined regions as a further metric. Together, these metrics yield a parameterized view recommendation approach for demonstration-based training. An online study using recorded multi-camera video streams from a simulation environment was used to validate the metrics.
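For orientation, the viewpoint entropy referred to above is commonly defined (following its use in 3D rendering) as the Shannon entropy of the relative projected areas of the visible regions in a view. The sketch below shows that baseline formulation only; the paper's 2D adaptation and the similarity penalization it adds are not reproduced here, and the function and variable names are illustrative.

```python
import math

def viewpoint_entropy(region_areas, total_area):
    """Shannon-style viewpoint entropy over projected region areas.

    region_areas: projected area of each visible region in the view.
    total_area: total area of the view (image), so that empty or
                occluded space lowers the per-region probabilities.
    """
    h = 0.0
    for area in region_areas:
        p = area / total_area
        if p > 0:  # skip zero-area regions; 0 * log(0) is taken as 0
            h -= p * math.log2(p)
    return h

# A view where four regions cover the image equally maximizes entropy:
# viewpoint_entropy([1, 1, 1, 1], 4) == log2(4) == 2.0
```

Under this formulation, views that show many regions at balanced sizes score higher than views dominated by a single region, which is the intuition the paper builds on.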
Furthermore, the responses from the online study were used to optimize the view recommendation performance, reaching a normalized discounted cumulative gain (NDCG) of 0.912, which indicates good agreement with user choices.
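The NDCG figure reported above is a standard ranking metric: the discounted cumulative gain of the recommended ordering divided by that of the ideal ordering. A minimal sketch of the standard computation follows; the relevance values in the example are illustrative, not taken from the study.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: later ranks are log-discounted."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances_in_ranked_order):
    """NDCG = DCG of the produced ranking / DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances_in_ranked_order, reverse=True))
    return dcg(relevances_in_ranked_order) / ideal if ideal > 0 else 0.0

# A ranking that already lists views by decreasing relevance scores 1.0;
# any deviation from that order scores below 1.0.
```

An NDCG of 0.912 thus means the recommender's view ordering was close to, but not identical with, the ordering implied by participants' choices.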
| Document Type | Article |
|---|---|
| Language | English |
| Author | Saugata Biswas, Ernst Kruijff, Eduardo Veas |
| Parent Title (English) | Multimedia Tools and Applications |
| Volume | 83 |
| Issue | 7 |
| First Page | 21765 |
| Last Page | 21800 |
| ISSN | 1380-7501 |
| URN | urn:nbn:de:hbz:1044-opus-74816 |
| DOI | https://doi.org/10.1007/s11042-023-16169-0 |
| Publisher | Springer |
| Publishing Institution | Hochschule Bonn-Rhein-Sieg |
| Date of first publication | 2023/08/03 |
| Copyright | © The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License. |
| Funding | This work was partly funded through the Campus to World project (Innovative Hochschule, BMBF, FKZ: 03IHS092A). We would like to thank all participants who took part in the user study. |
| Keywords | Camera selection; Camera view analysis; Demonstration-based training; Entropy; Instruction design; Multi-camera; Recommender systems; View selection |
| Departments, institutes and facilities | Fachbereich Informatik; Institute of Visual Computing (IVC) |
| Projects | Campus to world - Eine Innovation Mall für das Wissen, Teilvorhaben HS Bonn-Rhein-Sieg (DE/BMBF/13IHS092A) |
| Dewey Decimal Classification (DDC) | 0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 006 Spezielle Computerverfahren |
| Entry in this database | 2023/08/11 |
| Licence (German) | Creative Commons - CC BY - Namensnennung 4.0 International |