IRI - ViSen: Visual Sense, Tagging visual data with semantic descriptions

Research Project

ViSen: Visual Sense, Tagging visual data with semantic descriptions

Type: European Project

Start Date: 01/07/2013

End Date: 30/06/2016

Project Code: PCIN-2013-047

Staff

Moreno, Francesc (Principal Investigator)

Agudo, Antonio (Researcher)

Carreras, Xavier (Member)

Villamizar, Michael Alejandro (Member)

Trulls, Eduard (Member)

Simo, Edgar (Member)

Ramisa, Arnau (Member)

Quattoni, Ariadna (Member)

Project Description

Today, a typical Web document contains a mix of visual and textual content. Most traditional search and retrieval tools handle textual content well, but are not prepared to handle heterogeneous documents. This new type of content demands new, efficient tools for search and retrieval.

The Visual Sense (ViSen) project aims to automatically mine the semantic content of visual data to enable “machine reading” of images. In recent years, we have witnessed significant advances in automatic visual concept recognition (VCR). These advances have allowed for the creation of systems that automatically generate keyword-based image annotations. The goal of this project is to move a step forward and predict semantic image representations that can be used to generate more informative sentence-based image annotations, thus facilitating search and browsing of large multimodal collections. More specifically, the project targets three case studies: image annotation, re-ranking for image search, and automatic image illustration of articles.

Project Publications

Journal Publications

A. Ramisa, F. Yan, F. Moreno-Noguer and K. Mikolajczyk. BreakingNews: Article annotation by image and text processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(5): 1072-1085, 2018.

A. Agudo, J.M. Martínez, L. Agapito and B. Calvo. Modal space: A physics-based model for sequential estimation of time-varying shape from monocular video. Journal of Mathematical Imaging and Vision, 57(1): 75–98, 2017.

M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Random clustering ferns for multimodal object recognition. Neural Computing and Applications, 28(9): 2445-2460, 2017.

A. Agudo and F. Moreno-Noguer. Combining local-physical and global-statistical models for sequential deformable shape from motion. International Journal of Computer Vision, 122(2): 371-387, 2017.

G. Sanromà, A. Penate-Sanchez, R. Alquézar Mancho, F. Serratosa, F. Moreno-Noguer, J. Andrade-Cetto and M.A. González. MSClique: Multiple structure discovery through the maximum weighted clique problem. PLOS One, 11(1): e0145846, 2016.

A. Agudo, F. Moreno-Noguer, B. Calvo and J.M. Martínez. Real-time 3D reconstruction of non-rigid shapes with a single moving camera. Computer Vision and Image Understanding, 153(12): 37–54, 2016.

A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. A 3D descriptor to detect task-oriented grasping points in clothing. Pattern Recognition, 60: 936-948, 2016.

M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Interactive multiple object learning with scanty human supervision. Computer Vision and Image Understanding, 149: 51-64, 2016.

A. Agudo, F. Moreno-Noguer, B. Calvo and J.M. Martínez. Sequential non-rigid structure from motion using physical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(5): 979-994, 2016.

E. Serradell, M.A. Pinheiro, R. Sznitman, J. Kybic, F. Moreno-Noguer and P. Fua. Non-rigid graph registration using active testing search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3): 625-638, 2015.

E. Simo-Serra, C. Torras and F. Moreno-Noguer. DaLI: Deformation and light invariant descriptor. International Journal of Computer Vision, 115(2): 136–154, 2015.

A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. Learning RGB-D descriptors of garment parts for informed robot grasping. Engineering Applications of Artificial Intelligence, 35: 246-258, 2014.

Conference Publications

A. Ramisa. Multimodal news article analysis, 26th International Joint Conference on Artificial Intelligence, 2017, Melbourne, Australia, pp. 5136-5140.

A. Agudo, J.M. Martínez, B. Calvo and F. Moreno-Noguer. Mode-shape interpretation: Re-thinking modal space for recovering deformable shapes, 16th IEEE Winter Conference on Applications of Computer Vision, 2016, Lake Placid, NY, USA, pp. 1-8, IEEE.

A. Agudo and F. Moreno-Noguer. Recovering pose and 3D deformable shape from multi-instance image ensembles, 13th Asian Conference on Computer Vision, 2016, Taipei, Taiwan, in Computer Vision – ACCV 2016, Vol 10114 of Lecture Notes in Computer Science, pp. 291-307, 2017, Springer.

A. Quattoni, A. Ramisa, P. Swaroop, E. Simo-Serra and F. Moreno-Noguer. Structured prediction with output embeddings for semantic image annotation, 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, San Diego, pp. 552-557.

L.D. Ellebracht, A. Ramisa, P. Swaroop, J.A. Cordero, F. Moreno-Noguer and A. Quattoni. Semantic tuples for evaluation of image sentence generation, 4th Workshop on Vision and Language, 2015, Lisbon.

A. Ramisa, J. Wang, Y. Lu, E. Dellandrea, F. Moreno-Noguer and R. Gaizauskas. Combining geometric, textual and visual features for predicting prepositions in image descriptions, 10th Conference on Empirical Methods in Natural Language Processing, 2015, Lisbon, pp. 214-220.

E. Simo-Serra, E. Trulls Fortuny, L. Ferraz, I. Kokkinos, P. Fua and F. Moreno-Noguer. Discriminative learning of deep convolutional feature point descriptors, 2015 International Conference on Computer Vision, 2015, Santiago de Chile, pp. 118-126, IEEE.

A. Penate-Sanchez, L. Porzi and F. Moreno-Noguer. Matchability prediction for full-search template matching algorithms, 2015 International Conference on 3D Vision, 2015, Lyon, pp. 353-361.

A. Rubio, M. Villamizar, L. Ferraz, A. Penate-Sanchez, A. Ramisa, E. Simo-Serra, A. Sanfeliu and F. Moreno-Noguer. Efficient monocular pose estimation for complex 3D models, 2015 IEEE International Conference on Robotics and Automation, 2015, Seattle, WA, USA, pp. 1397-1402.

M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Modeling robot's world with minimal effort, 2015 IEEE International Conference on Robotics and Automation, 2015, Seattle, WA, USA, pp. 4890-4896.

M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Multimodal object recognition using random clustering trees, 7th Iberian Conference on Pattern Recognition and Image Analysis, 2015, Santiago de Compostela, in Pattern Recognition and Image Analysis, Vol 9117 of Lecture Notes in Computer Science, pp. 496-504, 2015, Springer.

E. Simo-Serra, S. Fidler, F. Moreno-Noguer and R. Urtasun. Neuroaesthetics in fashion: Modeling the perception of fashionability, 2015 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2015, Boston, MA, USA.

E. Simo-Serra, C. Torras and F. Moreno-Noguer. Lie algebra-based kinematic prior for 3D human pose tracking, 14th IAPR International Conference on Machine Vision Applications, 2015, Tokyo, Japan, pp. 394-397.

A. Agudo and F. Moreno-Noguer. Simultaneous pose and non-rigid shape with particle dynamics, 2015 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2015, Boston, MA, USA, pp. 2179-2187.

A. Agudo and F. Moreno-Noguer. Learning shape, motion and elastic models in force space, 2015 International Conference on Computer Vision, 2015, Santiago de Chile, pp. 756-764, IEEE.

L. Ferraz, X. Binefa and F. Moreno-Noguer. Very fast solution to the PnP problem with algebraic outlier rejection, 2014 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2014, Columbus, OH, USA, pp. 501-508.

M. Villamizar, A. Sanfeliu and F. Moreno-Noguer. Fast online learning and detection of natural landmarks for autonomous aerial robots, 2014 IEEE International Conference on Robotics and Automation, 2014, Hong Kong, China, pp. 4996-5003.

A. Rubio, M. Villamizar, L. Ferraz, A. Penate-Sanchez, A. Sanfeliu and F. Moreno-Noguer. Estimación monocular y eficiente de la pose usando modelos 3D complejos, XXXV Jornadas de Automática, 2014, Valencia, Spain.

L. Ferraz, X. Binefa and F. Moreno-Noguer. Leveraging feature uncertainty in the PnP problem, 2014 British Machine Vision Conference, 2014, Nottingham, UK.

E. Simo-Serra, S. Fidler, F. Moreno-Noguer and R. Urtasun. A high performance CRF model for clothes parsing, 12th Asian Conference on Computer Vision, 2014, Singapore, in Computer Vision - ACCV 2014, Vol 9005 of Lecture Notes in Computer Science, pp. 64-81, 2015, Springer.

A. Penate-Sanchez, F. Moreno-Noguer, J. Andrade-Cetto and F. Fleuret. LETHA: Learning from high quality inputs for 3D pose estimation in low quality images, 2nd International Conference on 3D Vision, 2014, Tokyo, pp. 517-524.

E. Trulls Fortuny, S. Tsogkas, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Segmentation-aware deformable part models, 2014 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2014, Columbus, OH, USA, pp. 168-175.

E. Simo-Serra, C. Torras and F. Moreno-Noguer. Geodesic finite mixture models, 2014 British Machine Vision Conference, 2014, Nottingham, UK.

X. Solé, A. Ramisa and C. Torras. Evaluation of random forests on large-scale classification problems using a bag-of-visual-words representation, 17th Catalan Conference on Artificial Intelligence, 2014, Barcelona, in Artificial Intelligence Research and Development, Vol 269 of Frontiers in Artificial Intelligence and Applications, pp. 273-276, 2014, IOS Press.

A. Penate-Sanchez, E. Serradell, J. Andrade-Cetto and F. Moreno-Noguer. Simultaneous pose, focal length and 2D-to-3D correspondences from noisy observations, 2013 British Machine Vision Conference, 2013, Bristol, UK, pp. 82.1-82.11.

A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. FINDDD: A fast 3D descriptor to characterize textiles for robot manipulation, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, Tokyo, Japan, pp. 824-830.

E. Trulls Fortuny, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Dense segmentation-aware descriptors, 2013 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2013, Portland, OR, USA, pp. 2890-2897.

E. Simo-Serra, A. Quattoni, C. Torras and F. Moreno-Noguer. A joint model for 2D and 3D pose estimation from a single image, 2013 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2013, Portland, OR, USA, pp. 3634-3641.

Other Publications

E. Trulls Fortuny, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Dense segmentation-aware descriptors. In Dense Image Correspondences for Computer Vision, 83-107. Springer, 2016.

Institut de Robòtica i Informàtica Industrial, CSIC-UPC
C/ Llorens i Artigas 4-6, 08028, Barcelona, Spain
