Object detectors have improved considerably in recent years through advanced Convolutional Neural Network (CNN) architectures. However, many detector hyper-parameters are rarely tuned and are instead used with the values set by the detector authors. Blackbox optimization methods have gained attention in recent years because of their ability to optimize the hyper-parameters of various machine learning algorithms and deep learning models. However, these methods have not been explored for tuning the hyper-parameters of CNN-based object detectors. In this research work, we propose the use of blackbox optimization methods such as Gaussian Process based Bayesian Optimization (BOGP), Sequential Model-based Algorithm Configuration (SMAC), and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to tune the hyper-parameters of Faster R-CNN and the Single Shot MultiBox Detector (SSD). In Faster R-CNN, tuning the input image size and the prior box anchor scales and ratios using BOGP, SMAC, and CMA-ES increased performance by around 1.5% in terms of Mean Average Precision (mAP) on PASCAL VOC. Tuning the anchor scales of SSD increased the mAP by 3% on the PASCAL VOC and marine debris datasets. On the COCO dataset with SSD, mAP improves for medium and large objects but decreases by 1% for small objects. The experimental results show that blackbox optimization methods increase mAP by optimizing the hyper-parameters of object detectors, and that they achieve better results than the hand-tuned configurations in most cases.
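To make the idea of blackbox hyper-parameter tuning concrete, the following is a minimal sketch only. It uses plain random search as a stand-in, not BOGP, SMAC, or CMA-ES, and the objective `evaluate_detector`, the parameter names, and the bounds are all hypothetical: a toy function replaces the expensive step of training a detector and measuring its mAP.

```python
import random


def evaluate_detector(anchor_scale, aspect_ratio):
    """Hypothetical stand-in for 'train the detector, return its mAP'.

    This is a toy smooth function with a peak at (0.3, 1.5); it does not
    model the behaviour of any real detector.
    """
    return 0.75 - (anchor_scale - 0.3) ** 2 - 0.05 * (aspect_ratio - 1.5) ** 2


def random_search(objective, bounds, n_trials=200, seed=0):
    """Blackbox tuning loop: sample a configuration, score it, keep the best.

    The optimizer only sees (configuration, score) pairs and never looks
    inside the objective, which is what makes the setting 'blackbox'.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw each hyper-parameter uniformly from its allowed range.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Assumed search ranges, chosen only for illustration.
bounds = {"anchor_scale": (0.1, 0.9), "aspect_ratio": (0.5, 3.0)}
best_params, best_score = random_search(evaluate_detector, bounds)
print(best_params, best_score)
```

Model-based methods such as those named in the abstract replace the uniform sampling step with a proposal informed by earlier evaluations, which matters when each evaluation means training a detector.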
Interactive Object Detection
(2019)
The success of state-of-the-art object detection methods depends heavily on the availability of a large amount of annotated image data. Raw image data from various sources is abundant but not annotated. Annotating image data is often costly and time-consuming, or requires expert help. In this work, a new paradigm of learning called Active Learning is explored, which uses user interaction to obtain annotations for a subset of the dataset. The goal of active learning is to achieve superior object detection performance with images that are annotated on demand. To realize the active learning method, the effort to annotate unlabeled data (annotation cost) is traded off against the performance of the object detection model.
A Random Forest based method called Hough Forest is chosen as the object detection model, and the annotation cost is calculated as the predicted false positive and false negative rate. The framework is successfully evaluated on two Computer Vision benchmark datasets and two custom Carl Zeiss datasets. An evaluation of RGB, HoG and Deep features for the task is also presented.
Experimental results show that using Deep features with Hough Forest achieves the best performance. By employing Active Learning, it is demonstrated that performance comparable to the fully supervised setting can be achieved by annotating just 2.5% of the images. To this end, an annotation tool is developed for user interaction during Active Learning.
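The selection step of such an active learning loop could be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the abstract says only that the annotation cost is calculated from the predicted false positive and false negative rate, so combining them as a plain sum, selecting the highest-cost images first, and the names `annotation_cost` and `select_for_annotation` are all assumptions, and the image names and rates are made-up values.

```python
def annotation_cost(pred_fp_rate, pred_fn_rate):
    """Assumed cost: the sum of the predicted false positive and
    false negative rates for one image."""
    return pred_fp_rate + pred_fn_rate


def select_for_annotation(predictions, budget):
    """Return the `budget` image names with the highest assumed cost.

    `predictions` maps an image name to a (false positive rate,
    false negative rate) pair predicted for the current model.
    """
    ranked = sorted(
        predictions,
        key=lambda name: annotation_cost(*predictions[name]),
        reverse=True,
    )
    return ranked[:budget]


# Made-up predicted error rates for four unlabeled images.
predictions = {
    "img_001.png": (0.05, 0.10),
    "img_002.png": (0.40, 0.30),
    "img_003.png": (0.20, 0.05),
    "img_004.png": (0.10, 0.45),
}

# Ask the user to annotate only the two images the model is expected
# to get most wrong; the rest stay unlabeled for this round.
selected = select_for_annotation(predictions, budget=2)
print(selected)
```

In a full loop, the selected images would be annotated by the user, added to the training set, and the model retrained before the next selection round.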