Publication
Visual semantic relatedness dataset for image captioning
Conference Article
Conference
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Edition
2023
Pages
5598-5606
Doc link
http://dx.doi.org/10.1109/CVPRW59228.2023.00592
Authors
- Sabir, Ahmed
- Moreno Noguer, Francesc
- Padró, Lluís
Abstract
Modern image captioning systems rely heavily on extracting knowledge from images to capture the concept of a static story. In this paper, we propose a textual visual context dataset for captioning, in which the publicly available COCO Captions dataset [30] has been extended with information about the scene (such as objects in the image). Since this information is textual, any NLP technique, such as text similarity or semantic relation methods, can be applied to it within a captioning system, either as part of an end-to-end training strategy or as a post-processing step.
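As a rough illustration of the post-processing idea mentioned in the abstract, the sketch below re-ranks candidate captions by their textual similarity to a list of visual context words. It is not the authors' method: the bag-of-words cosine similarity, the function names (`bag_of_words`, `cosine`, `rerank`), and the example captions and context words are all assumptions chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(candidates, visual_context):
    """Sort candidate captions by similarity to the visual context words."""
    context = bag_of_words(" ".join(visual_context))
    return sorted(
        candidates,
        key=lambda caption: cosine(bag_of_words(caption), context),
        reverse=True,
    )


# Hypothetical data for illustration only, not taken from the dataset.
visual_context = ["pizza", "plate", "table"]
candidates = [
    "a man riding a wave on a surfboard",
    "a pizza sitting on a plate on a table",
    "a plate of food",
]
ranked = rerank(candidates, visual_context)
print(ranked[0])
```

A learned sentence-embedding similarity could replace the word-overlap score here without changing the re-ranking step.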
Categories
Computer vision
Author keywords
Training, Visualization, Computer vision, Conferences, Semantics, Pattern recognition, Task analysis
Scientific reference
A. Sabir, F. Moreno-Noguer and L. Padró. Visual semantic relatedness dataset for image captioning, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023, Vancouver, Canada, pp. 5598-5606.