Research Project

IPALM: Interactive Perception-Action-Learning for Modelling Objects


European Project

Project Description

In IPALM, we will develop methods for the automatic digitization of objects and their physical properties through exploratory manipulation. These methods will be used to build a large collection of object models required for realistic grasping and manipulation experiments in robotics. Household objects such as tools, kitchenware, clothes, and food items are the focus of many potential robotics applications; however, they still pose great challenges for robot object perception and manipulation in realistic scenarios. We therefore propose to advance the state of the art by considering household objects that can be deformable, articulated, interactive, specular or transparent, or without a strict geometry, such as cloth and food items.

Our methods will learn the physical properties essential for perception and grasping simultaneously from different modalities: vision, touch, and audio, as well as text documents. These properties will include the 3D model, surface properties such as friction and texture, elasticity, weight, and size, together with grasping techniques for the intended use. At the core of our approach will be two-level modelling, in which a category-level model provides priors for capturing the instance-level attributes of specific objects. We will build the category-level prior models by exploiting resources available online. A perception-action-learning loop will then use the robot's vision, audio, and tactile senses to model instance-level object properties, guided by the more general category-level model. In return, knowledge acquired from a new instance will be used to improve the category-level knowledge. Our approach will allow us to efficiently create a large database of models for objects of diverse types, suitable, for example, for training neural-network-based methods or enhancing existing simulators.
We will also propose a benchmark and evaluation metrics for object manipulation, enabling comparison of results generated with various robotics platforms on our database.

Imperial College London, United Kingdom (Coordinator)
University of Bordeaux, France
Institut de Robòtica i Informàtica Industrial (IRI), CSIC-UPC, Spain
Aalto University, Finland
Czech Technical University, Czech Republic

Call: European CHIST-ERA 2017
Funds are provided by national research funding organisations. IRI is funded by the Spanish Ministry of Science, Innovation and Universities (MICIU).

Project Publications

Journal Publications

  • A. Agudo, V. Lepetit and F. Moreno-Noguer. Simultaneous completion and spatiotemporal grouping of corrupted motion tracks. The Visual Computer, 2021, to appear.


Conference Publications

  • A. Agudo. Piecewise Bézier space: Recovering 3D dynamic motion from video, 2021 IEEE International Conference on Image Processing, 2021, Anchorage (Alaska, USA), pp. 3268-3272.

  • E. Corona, A. Pumarola, G. Alenyà and F. Moreno-Noguer. Context-aware human motion prediction, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, Seattle, WA, USA (Virtual), pp. 6990-6999.

  • E. Corona, A. Pumarola, G. Alenyà, F. Moreno-Noguer and G. Rogez. GanHand: Predicting human grasp affordances in multi-object scenes, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, Seattle, WA, USA (Virtual), pp. 5030-5040.
