Learning to Discriminate Text from Synthetic Data

Service robots could use textual information to perform important tasks such as product identification. However, natural scene text, such as that found in household environments, can vary arbitrarily in size, color, font, layout, symbol repertoire, language, etc. This large variability makes robust text information extraction extremely difficult. Our work on textual information extraction from gray-scale still images uses adaptive binarization, connected component classification with a support vector machine, and filtering based on the proximity of the connected components to their neighbours. The contribution of our approach is the use of a partially synthetic dataset for training, which reduces the burden of ground truth labelling at the connected component level. Our experiments show that a classifier trained on synthetic data can generalize to real instances. We present our results on the ICDAR dataset.
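
The stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the local-mean threshold, the 4-connectivity, the centroid-distance rule, and every function and parameter name (`adaptive_binarize`, `radius`, `offset`, `max_dist`) are assumptions made for illustration. The support vector machine classification step is omitted, since the abstract does not state its features or kernel.

```python
from collections import deque


def adaptive_binarize(img, radius=1, offset=0):
    """Mark a pixel as foreground (1) if it is darker than its local mean minus offset.

    Assumption: a simple local-mean rule stands in for the paper's binarization.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out


def connected_components(binary):
    """Return the 4-connected foreground components, each as a list of (y, x)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, comp = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                comps.append(comp)
    return comps


def centroid(comp):
    """Mean (y, x) position of a component's pixels."""
    n = len(comp)
    return (sum(p[0] for p in comp) / n, sum(p[1] for p in comp) / n)


def proximity_filter(comps, max_dist):
    """Keep components whose centroid is within max_dist of another centroid.

    Assumption: isolated components are treated as unlikely to be text.
    """
    cents = [centroid(c) for c in comps]
    kept = []
    for i, (cy, cx) in enumerate(cents):
        for j, (oy, ox) in enumerate(cents):
            if i != j and ((cy - oy) ** 2 + (cx - ox) ** 2) ** 0.5 <= max_dist:
                kept.append(comps[i])
                break
    return kept
```

In a complete pipeline, a trained classifier would score each component between the labelling and proximity steps; here the components pass straight from `connected_components` to `proximity_filter`.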

Metadata
Document Type:Conference Object
Language:English
Author:Jose Antonio Ruiz
Parent Title (English):Röfer, Mayer et al. (Eds.): RoboCup 2011: Robot Soccer World Cup XV
First Page:270
Last Page:281
ISBN:978-3-642-32059-0
DOI:https://doi.org/10.1007/978-3-642-32060-6_23
Publication year:2012
Tag:connection part; image database; eigenfunction; principal component analysis; household management; product finding; service robot; Support Vector Machine; connecting part; character recognition; optical character recognition
Departments, institutes and facilities:Fachbereich Informatik
Dewey Decimal Classification (DDC):0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science
Entry in this database:2015/04/02