Publication
Combining geometric, textual and visual features for predicting prepositions in image descriptions
Conference Article
Conference
Conference on Empirical Methods in Natural Language Processing (EMNLP)
Edition
10th
Pages
214-220
Doc link
http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP022.pdf
Authors
- Ramisa Ayats, Arnau
- Wang, Josiah
- Lu, Ying
- Dellandrea, Emmanuel
- Moreno Noguer, Francesc
- Gaizauskas, Robert
Abstract
We investigate the role that geometric, textual and visual features play in the task of predicting a preposition that links two visual entities depicted in an image. The task is an important part of the subsequent process of generating image descriptions. We explore the prediction of prepositions for a pair of entities, both in the case when the labels of such entities are known and unknown. In all situations we found clear evidence that all three features contribute to the prediction task.
Categories
computer vision
Author keywords
computer vision, natural language processing
Scientific reference
A. Ramisa, J. Wang, Y. Lu, E. Dellandrea, F. Moreno-Noguer and R. Gaizauskas. Combining geometric, textual and visual features for predicting prepositions in image descriptions, 10th Conference on Empirical Methods in Natural Language Processing, 2015, Lisbon, pp. 214-220.