H-BRS Bibliography
Semantic Image Segmentation Combining Visible and Near-Infrared Channels with Depth Information
(2015)
Image understanding is a vital task in computer vision with many applications in areas such as robotics, surveillance and the automobile industry. An important precondition for image understanding is semantic image segmentation, i.e. the correct labeling of every image pixel with its corresponding object name or class. This thesis proposes a machine learning approach for semantic image segmentation that uses images from a multi-modal camera rig. It demonstrates that semantic segmentation can be improved by combining different image types as inputs to a convolutional neural network (CNN), compared to a single-image approach. In this work, a multi-channel near-infrared (NIR) image, an RGB image and a depth map are used. The detection of people is further improved by using a skin image that indicates the presence of human skin in the scene and is computed from NIR information. It is also shown that segmentation accuracy can be enhanced by a class voting method based on a superpixel pre-segmentation. Models are trained for 10-class, 3-class and binary classification tasks using an original dataset. Compared to the NIR-only approach, average class accuracy is increased by 7% for 10-class and by 22% for 3-class classification, reaching a total of 48% and 70% accuracy, respectively. The binary classification task, which focuses on the detection of people, achieves a classification accuracy of 95% and a true positive rate of 66%. The report at hand describes the proposed approach and the encountered challenges and shows that a CNN can successfully learn and combine features from multi-modal image sets and use them to predict scene labeling.
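The superpixel class-voting step mentioned in the abstract can be sketched as follows: after the CNN produces a per-pixel label map, each superpixel is reassigned the majority label of its member pixels. This is a minimal illustration of the general technique; the function name and array layout are assumptions, not code from the thesis.

```python
import numpy as np

def superpixel_vote(pixel_labels, superpixels):
    """Refine a per-pixel label map by majority vote within superpixels.

    pixel_labels: 2-D int array of predicted class IDs per pixel.
    superpixels:  2-D int array of the same shape assigning each pixel
                  to a superpixel from a pre-segmentation.
    Returns a label map where every pixel of a superpixel carries
    that superpixel's most frequent class.
    """
    refined = np.empty_like(pixel_labels)
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        classes, counts = np.unique(pixel_labels[mask], return_counts=True)
        refined[mask] = classes[np.argmax(counts)]  # majority class wins
    return refined
```

The vote suppresses isolated mislabeled pixels inside an otherwise consistently labeled region, which is one way such a pre-segmentation can raise overall accuracy.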
Persons entering the working range of industrial robots are exposed to a high risk of collision with moving parts of the system, potentially causing severe injuries. Conventional systems that restrict access to this area range from walls and fences to light barriers and other vision-based protective devices (VBPD). None of these systems can distinguish between humans and workpieces in a safe and reliable manner. In this work, a new approach is investigated that uses an active near-infrared (NIR) camera system with advanced skin detection capabilities to distinguish humans from workpieces based on characteristic spectral signatures. This approach makes it possible to implement more intelligent muting processes and at the same time increases the safety of persons working close to the robots. The conceptual integration of such a camera system into a VBPD and the enhancement of person detection methods through skin detection are described and evaluated in this paper. Building on this work, next steps could be the development of multimodal sensor systems that safeguard the working ranges of collaborating robots using the described camera system.
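The spectral-signature idea behind the skin detection can be sketched with a common two-band approach: human skin reflectance drops much more sharply between the shorter and longer NIR bands than most workpiece materials, so a normalized difference of the two bands separates candidate skin pixels. The band choice, variable names, and threshold below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def skin_mask(nir_short, nir_long, threshold=0.3):
    """Flag candidate skin pixels from two co-registered NIR band images.

    nir_short, nir_long: float arrays of per-pixel reflectance in a
    shorter and a longer NIR band (hypothetical bands; the actual
    system's wavelengths are not specified here).
    Returns a boolean mask where the normalized band difference
    exceeds the (assumed) threshold.
    """
    # Normalized difference, with a small epsilon to avoid division by zero
    nd = (nir_short - nir_long) / (nir_short + nir_long + 1e-6)
    return nd > threshold
```

In a safety system, such a mask would only be one cue; it would be combined with a person detector before any muting decision, as the paper's integration into a VBPD suggests.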