PhD Thesis

Visual understanding of human behaviour: 3D pose, motion, actions and context

Information

  • Started: 01/02/2016
  • Finished: 30/05/2023

Description

In robotics, the goal of allowing an inexperienced end user to program a robot with a new desired behavior has been pursued through Learning from Demonstration, where the human teaches the robot simply by showing it how to perform the task. A typical setup consists of a manipulator arm that the user teleoperates through a haptic device to, for instance, pour water into a glass. The demonstration is usually repeated several times so that the robot can learn the individual steps, the essence, and the variability of the task.

Yet this procedure is not as natural as we would like. Ideally, we would dispense with any specialized device and teach the robot as we would teach another person: by showing it how a human performs the task, or simply by narrating the steps involved. The robot should perceive the meaningful actions, identify the tools being used, and extract the essential knowledge of the process so that it can then perform the entire task itself.

The main objective of this thesis is thus to move a step beyond existing approaches to robot learning and develop the technology to instruct a general-purpose robot in a natural, human-like manner. To tackle this problem, we combine tools from computer vision, machine learning, natural language processing, and robotics.
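To make the Learning from Demonstration idea above concrete, the sketch below shows one minimal way a robot could extract the "essence" and "variability" of a task from repeated demonstrations: treat each demonstration as a time-aligned trajectory and compute the per-timestep mean and spread. The data, array shapes, and the pouring scenario are hypothetical stand-ins, not the method developed in the thesis.

```python
import numpy as np

# Hypothetical toy data: 5 teleoperated pouring demonstrations, each a
# time-aligned trajectory of 100 steps over 3 Cartesian dimensions (x, y, z).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
nominal = np.stack([np.sin(np.pi * t), t, 0.5 * t**2], axis=1)   # (100, 3)
demos = np.stack([nominal + 0.02 * rng.standard_normal(nominal.shape)
                  for _ in range(5)])                            # (5, 100, 3)

# The "essence" of the task: the mean trajectory across demonstrations.
mean_traj = demos.mean(axis=0)   # (100, 3)

# The "variability": per-timestep standard deviation. Low spread marks
# phases that must be reproduced precisely (e.g. tilting over the glass);
# high spread marks phases where the robot is free to deviate.
std_traj = demos.std(axis=0)     # (100, 3)

# A simple tolerance envelope the robot could track: mean +/- 2 sigma.
upper = mean_traj + 2.0 * std_traj
lower = mean_traj - 2.0 * std_traj
print("most constrained timestep:", std_traj.sum(axis=1).argmin())
```

In practice, Learning from Demonstration systems fit richer models (e.g. Gaussian mixtures or dynamic movement primitives) over such demonstrations, but the same principle applies: repeated demonstrations reveal both what the task requires and where it tolerates variation.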