Master Thesis

From EgoPose estimation to conversational group detection

  • If you are interested in the proposal, please contact the supervisors.


Detecting groups of interacting people is a useful task in many modern technologies, with applications ranging from video surveillance to social robotics. Conversational groups, that is, small gatherings of people engaged in free conversation, naturally organize themselves into proxemic patterns. In this context, robust estimates of position, head pose and body pose can facilitate a higher-level description of the ongoing interplay and hence the detection of such groups. While head orientation can change frequently during social interactions, body pose is an inherently more stable cue. Although this has been acknowledged in several previous works, body pose has been used effectively as a primary cue of interaction only very recently, in third-person videos [1].

During this internship, the student will join current efforts at the IRI on the development of EgoPose, a robust estimation algorithm for egocentric videos. EgoPose estimates the body pose of the person wearing the camera, who is not visible in the video they recorded. This egocentric body pose estimate will then be exploited, for the first time, for conversational group detection from wearable cameras.

[1] J. Varadarajan, R. Subramanian, S. Rota-Bulò, N. Ahuja, O. Lanz, E. Ricci, Joint Estimation of Human Pose and Conversational Groups from Social Scenes, International Journal of Computer Vision, 2019