Visual semantic re-ranker for text spotting

Conference Article


Iberoamerican Congress on Pattern Recognition (CIARP)






Many current state-of-the-art methods for text recognition rely on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach that improves the accuracy of text spotting by exploiting the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network that provides a series of text hypotheses for each input image. These text hypotheses are then re-ranked according to their semantic relatedness with the objects in the image. As a result of this combination, the performance of the original network is boosted at a very low computational cost. The proposed framework can be used as a drop-in complement for any text-spotting algorithm that outputs a ranking of word hypotheses. We validate our approach on the ICDAR'17 shared task dataset.
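The re-ranking idea described above can be sketched in a few lines: each word hypothesis from the spotting network carries a recognition score, which is combined with a semantic-relatedness score between the hypothesis and the object detected in the scene. The relatedness table, the weighting scheme, and the example words below are illustrative assumptions for this sketch, not the values or model used in the paper.

```python
def rerank(hypotheses, scene_object, relatedness, alpha=0.7):
    """Re-rank (word, recognition_score) pairs by a weighted combination
    of the original recognition score and the semantic relatedness
    between the word and the detected scene object.

    alpha is an assumed interpolation weight, not the paper's setting.
    """
    def combined(item):
        word, rec_score = item
        sem = relatedness.get((word, scene_object), 0.0)
        return alpha * rec_score + (1 - alpha) * sem
    return sorted(hypotheses, key=combined, reverse=True)


# Toy example: the spotter slightly prefers "cold" over "gold" on a
# storefront, but visual context (a jewellery shop) favours "gold".
hyps = [("cold", 0.55), ("gold", 0.50)]
rel = {("gold", "jewellery"): 0.9, ("cold", "jewellery"): 0.1}
print(rerank(hyps, "jewellery", rel))  # "gold" is promoted to the top
```

In practice the relatedness scores would come from a semantic similarity measure (e.g. over word embeddings) rather than a hand-built table; the drop-in nature of the method means only the final ranking step changes.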


Keywords: computer vision.

Scientific reference

A. Sabir, F. Moreno-Noguer and L. Padró. Visual semantic re-ranker for text spotting. 23rd Iberoamerican Congress on Pattern Recognition (CIARP), Madrid, 2018. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, Vol. 11401 of Lecture Notes in Computer Science, pp. 884-892, 2019.