Research Project

ViSen: Visual Sense, Tagging visual data with semantic descriptions

Type

European Project

Start Date

01/07/2013

End Date

30/06/2016

Project Code

PCIN-2013-047

Project Description

Today, a typical Web document contains a mix of visual and textual content. Most traditional search and retrieval tools handle textual content well but are not prepared for heterogeneous documents. This new type of content demands new, efficient tools for search and retrieval.

The Visual Sense project aims to automatically mine the semantic content of visual data to enable “machine reading” of images. Recent years have seen significant advances in automatic visual concept recognition (VCR), which have enabled systems that automatically generate keyword-based image annotations. The goal of this project is to move a step further and predict semantic image representations that can be used to generate more informative sentence-based image annotations, thereby facilitating search and browsing of large multimodal collections. More specifically, the project targets three case studies: image annotation, re-ranking for image search, and automatic image illustration of articles.

Project Publications

Journal Publications

  • A. Ramisa, F. Yan, F. Moreno-Noguer and K. Mikolajczyk. BreakingNews: Article annotation by image and text processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(5): 1072-1085, 2018.

  • A. Agudo, J.M. Martínez, L. Agapito and B. Calvo. Modal space: A physics-based model for sequential estimation of time-varying shape from monocular video. Journal of Mathematical Imaging and Vision, 57(1): 75-98, 2017.

  • M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Random clustering ferns for multimodal object recognition. Neural Computing and Applications, 28(9): 2445-2460, 2017.

  • A. Agudo and F. Moreno-Noguer. Combining local-physical and global-statistical models for sequential deformable shape from motion. International Journal of Computer Vision, 122(2): 371-387, 2017.

  • A. Agudo, F. Moreno-Noguer, B. Calvo and J.M. Martínez. Sequential non-rigid structure from motion using physical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(5): 979-994, 2016.

  • G. Sanromà, A. Penate-Sanchez, R. Alquézar Mancho, F. Serratosa, F. Moreno-Noguer, J. Andrade-Cetto and M.A. González. MSClique: Multiple structure discovery through the maximum weighted clique problem. PLOS One, 11(1): e0145846, 2016.

  • A. Agudo, F. Moreno-Noguer, B. Calvo and J.M. Martínez. Real-time 3D reconstruction of non-rigid shapes with a single moving camera. Computer Vision and Image Understanding, 153(12): 37-54, 2016.

  • A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. A 3D descriptor to detect task-oriented grasping points in clothing. Pattern Recognition, 60: 936-948, 2016.

  • M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Interactive multiple object learning with scanty human supervision. Computer Vision and Image Understanding, 149: 51-64, 2016.

  • E. Serradell, M.A. Pinheiro, R. Sznitman, J. Kybic, F. Moreno-Noguer and P. Fua. Non-rigid graph registration using active testing search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3): 625-638, 2015.

  • E. Simo-Serra, C. Torras and F. Moreno-Noguer. DaLI: Deformation and light invariant descriptor. International Journal of Computer Vision, 115(2): 136-154, 2015.

  • A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. Learning RGB-D descriptors of garment parts for informed robot grasping. Engineering Applications of Artificial Intelligence, 35: 246-258, 2014.


Conference Publications

  • A. Ramisa. Multimodal news article analysis, 26th International Joint Conference on Artificial Intelligence, 2017, Melbourne, Australia, pp. 5136-5140.

  • A. Quattoni, A. Ramisa, P. Swaroop, E. Simo-Serra and F. Moreno-Noguer. Structured prediction with output embeddings for semantic image annotation, 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, San Diego, pp. 552-557.

  • A. Agudo, J.M. Martínez, B. Calvo and F. Moreno-Noguer. Mode-shape interpretation: Re-thinking modal space for recovering deformable shapes, 16th IEEE Winter Conference on Applications of Computer Vision, 2016, Lake Placid, NY, USA, pp. 1-8, IEEE.

  • A. Agudo and F. Moreno-Noguer. Recovering pose and 3D deformable shape from multi-instance image ensembles, 13th Asian Conference on Computer Vision, 2016, Taipei, Taiwan, in Computer Vision – ACCV 2016, Vol 10114 of Lecture Notes in Computer Science, pp. 291-307, 2017, Springer.

  • A. Agudo and F. Moreno-Noguer. Learning shape, motion and elastic models in force space, 2015 International Conference on Computer Vision, 2015, Santiago de Chile, pp. 756-764, IEEE.

  • L.D. Ellebracht, A. Ramisa, P. Swaroop, J.A. Cordero, F. Moreno-Noguer and A. Quattoni. Semantic tuples for evaluation of image sentence generation, 4th Workshop on Vision and Language, 2015, Lisbon.

  • A. Ramisa, J. Wang, Y. Lu, E. Dellandrea, F. Moreno-Noguer and R. Gaizauskas. Combining geometric, textual and visual features for predicting prepositions in image descriptions, 10th Conference on Empirical Methods in Natural Language Processing, 2015, Lisbon, pp. 214-220.

  • E. Simo-Serra, E. Trulls Fortuny, L. Ferraz, I. Kokkinos, P. Fua and F. Moreno-Noguer. Discriminative learning of deep convolutional feature point descriptors, 2015 International Conference on Computer Vision, 2015, Santiago de Chile, pp. 118-126, IEEE.

  • A. Penate-Sanchez, L. Porzi and F. Moreno-Noguer. Matchability prediction for full-search template matching algorithms, 2015 International Conference on 3D Vision, 2015, Lyon, pp. 353-361.

  • A. Rubio, M. Villamizar, L. Ferraz, A. Penate-Sanchez, A. Ramisa, E. Simo-Serra, A. Sanfeliu and F. Moreno-Noguer. Efficient monocular pose estimation for complex 3D models, 2015 IEEE International Conference on Robotics and Automation, 2015, Seattle, WA, USA, pp. 1397-1402.

  • M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Modeling robot's world with minimal effort, 2015 IEEE International Conference on Robotics and Automation, 2015, Seattle, WA, USA, pp. 4890-4896.

  • M. Villamizar, A. Garrell Zulueta, A. Sanfeliu and F. Moreno-Noguer. Multimodal object recognition using random clustering trees, 7th Iberian Conference on Pattern Recognition and Image Analysis, 2015, Santiago de Compostela, in Pattern Recognition and Image Analysis, Vol 9117 of Lecture Notes in Computer Science, pp. 496-504, 2015, Springer.

  • E. Simo-Serra, S. Fidler, F. Moreno-Noguer and R. Urtasun. Neuroaesthetics in fashion: Modeling the perception of fashionability, 2015 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2015, Boston, MA, USA.

  • E. Simo-Serra, C. Torras and F. Moreno-Noguer. Lie algebra-based kinematic prior for 3D human pose tracking, 14th IAPR International Conference on Machine Vision Applications, 2015, Tokyo, Japan, pp. 394-397.

  • A. Agudo and F. Moreno-Noguer. Simultaneous pose and non-rigid shape with particle dynamics, 2015 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2015, Boston, MA, USA, pp. 2179-2187.

  • E. Simo-Serra, C. Torras and F. Moreno-Noguer. Geodesic finite mixture models, 2014 British Machine Vision Conference, 2014, Nottingham, UK.

  • X. Solé, A. Ramisa and C. Torras. Evaluation of random forests on large-scale classification problems using a bag-of-visual-words representation, 17th Catalan Conference on Artificial Intelligence, 2014, Barcelona, in Artificial Intelligence Research and Development, Vol 269 of Frontiers in Artificial Intelligence and Applications, pp. 273-276, 2014, IOS Press.

  • L. Ferraz, X. Binefa and F. Moreno-Noguer. Very fast solution to the PnP problem with algebraic outlier rejection, 2014 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2014, Columbus, OH, USA, pp. 501-508.

  • M. Villamizar, A. Sanfeliu and F. Moreno-Noguer. Fast online learning and detection of natural landmarks for autonomous aerial robots, 2014 IEEE International Conference on Robotics and Automation, 2014, Hong Kong, China, pp. 4996-5003.

  • A. Rubio, M. Villamizar, L. Ferraz, A. Penate-Sanchez, A. Sanfeliu and F. Moreno-Noguer. Estimación monocular y eficiente de la pose usando modelos 3D complejos, XXXV Jornadas de Automática, 2014, Valencia, Spain.

  • L. Ferraz, X. Binefa and F. Moreno-Noguer. Leveraging feature uncertainty in the PnP problem, 2014 British Machine Vision Conference, 2014, Nottingham, UK.

  • E. Simo-Serra, S. Fidler, F. Moreno-Noguer and R. Urtasun. A high performance CRF model for clothes parsing, 12th Asian Conference on Computer Vision, 2014, Singapore, in Computer Vision - ACCV 2014, Vol 9005 of Lecture Notes in Computer Science, pp. 64-81, 2015, Springer.

  • A. Penate-Sanchez, F. Moreno-Noguer, J. Andrade-Cetto and F. Fleuret. LETHA: Learning from high quality inputs for 3D pose estimation in low quality images, 2nd International Conference on 3D Vision, 2014, Tokyo, pp. 517-524.

  • E. Trulls Fortuny, S. Tsogkas, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Segmentation-aware deformable part models, 2014 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2014, Columbus, OH, USA, pp. 168-175.

  • A. Penate-Sanchez, E. Serradell, J. Andrade-Cetto and F. Moreno-Noguer. Simultaneous pose, focal length and 2D-to-3D correspondences from noisy observations, 2013 British Machine Vision Conference, 2013, Bristol, UK, pp. 82.1-82.11.

  • A. Ramisa, G. Alenyà, F. Moreno-Noguer and C. Torras. FINDDD: A fast 3D descriptor to characterize textiles for robot manipulation, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, Tokyo, Japan, pp. 824-830.

  • E. Trulls Fortuny, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Dense segmentation-aware descriptors, 2013 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2013, Portland, OR, USA, pp. 2890-2897.

  • E. Simo-Serra, A. Quattoni, C. Torras and F. Moreno-Noguer. A joint model for 2D and 3D pose estimation from a single image, 2013 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2013, Portland, OR, USA, pp. 3634-3641.


Other Publications

  • E. Trulls Fortuny, I. Kokkinos, A. Sanfeliu and F. Moreno-Noguer. Dense segmentation-aware descriptors. In Dense Image Correspondences for Computer Vision, 83-107. Springer, 2016.
