Exploiting RGB-D data by means of Convolutional Neural Networks (CNNs) is at the core of a number of robotics applications, including object detection, scene semantic segmentation and grasping. Most existing approaches, however, use RGB-D data by simply treating depth as an additional input channel for the network. In this paper, we show that the performance of deep architectures can be boosted by introducing DaConv, a novel, general-purpose CNN block which exploits depth to learn scale-aware feature representations. We demonstrate the benefits of DaConv on a variety of robotics-oriented tasks, including affordance detection, object coordinate regression and contour detection in RGB-D images. In each of these experiments, we show the potential of the proposed block and how it can be readily integrated into existing CNN architectures.


computer vision.

Author keywords

RGB-D Perception; Visual Learning

Scientific reference

L. Porzi, S. Rota-Bulò, A. Penate-Sanchez, E. Ricci and F. Moreno-Noguer. Learning depth-aware deep representations for robotic perception. IEEE Robotics and Automation Letters, 2(2): 468-475, 2017.