Publication

Combining geometric, textual and visual features for predicting prepositions in image descriptions

Conference Article

Conference

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Edition

10th

Pages

214-220

Doc link

http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP022.pdf

Abstract

We investigate the role that geometric, textual and visual features play in the task of predicting a preposition that links two visual entities depicted in an image. The task is an important part of the subsequent process of generating image descriptions. We explore the prediction of prepositions for a pair of entities, both when the labels of the entities are known and when they are unknown. In all settings we find clear evidence that all three types of features contribute to the prediction task.
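
The following is a minimal illustrative sketch, not the authors' implementation: it shows one way the task described in the abstract could be set up, by concatenating geometric features of the two entity bounding boxes with textual and visual feature vectors and training a multi-class classifier over a small preposition vocabulary. The preposition set, feature extractors, dimensions and data below are placeholder assumptions for illustration only.

# Hypothetical sketch of preposition prediction from combined features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

PREPOSITIONS = ["on", "in", "under", "next to", "behind"]  # illustrative label set

def geometric_features(box_a, box_b):
    """Simple geometry of two bounding boxes (x, y, w, h): relative offset,
    relative size and overlap. A stand-in for geometric features."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    dx = (bx - ax) / max(aw, 1e-6)
    dy = (by - ay) / max(ah, 1e-6)
    area_ratio = (bw * bh) / max(aw * ah, 1e-6)
    ow = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    oh = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    iou = (ow * oh) / max(aw * ah + bw * bh - ow * oh, 1e-6)
    return np.array([dx, dy, area_ratio, iou])

def textual_features(label_a, label_b, dim=50):
    # Placeholder for e.g. word embeddings of the two entity labels.
    return rng.normal(size=2 * dim)

def visual_features(crop_a, crop_b, dim=128):
    # Placeholder for e.g. CNN activations of the two entity crops.
    return rng.normal(size=2 * dim)

# Toy training set of entity pairs with preposition labels (random data).
X, y = [], []
for _ in range(200):
    box_a = rng.uniform(0, 1, size=4)
    box_b = rng.uniform(0, 1, size=4)
    feats = np.concatenate([
        geometric_features(box_a, box_b),
        textual_features("person", "chair"),
        visual_features(None, None),
    ])
    X.append(feats)
    y.append(rng.choice(PREPOSITIONS))

clf = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))
print(clf.predict(np.array(X[:3])))

The same pipeline can be run with the textual features dropped or replaced when entity labels are unknown, which mirrors the two settings compared in the abstract.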

Categories

computer vision

Author keywords

computer vision, natural language processing

Scientific reference

A. Ramisa, J. Wang, Y. Lu, E. Dellandrea, F. Moreno-Noguer and R. Gaizauskas. Combining geometric, textual and visual features for predicting prepositions in image descriptions. 10th Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, 2015, pp. 214-220.