Research Project

AWESOME: Connecting past, present and future: Acquiring and leveraging prior knowledge to unveil the temporal structure of untrimmed videos

Type

National Project

Start Date

01/09/2024

End Date

31/08/2027

Project Code

PID2023-151351NB-I00

Staff

Project Description

Project PID2023-151351NB-I00, funded by MCIN/AEI/10.13039/501100011033 and by ERDF, EU

Temporal sequence encoding (TSE), the task of representing temporal sequences, is crucial for perception, action and learning, and is considered a hallmark of human cognitive abilities. A fundamental aspect of TSE is that we segment continuous temporal observations into discrete units, called events, and use the temporal organization of events in memory, accumulated over a lifetime, as prior knowledge to understand the present in the context of the past and to make inferences about the future. Consequently, performing TSE correctly requires a very high level of temporal abstraction, that is, the ability to represent and reason about events, actions and changes at different levels of granularity and duration. This currently represents a major challenge for AI-based systems.

The proposed methodological developments will be put to the test on real-world data in the challenging computer vision tasks of temporal action segmentation, localization, and anticipation in untrimmed videos. These tasks require a high level of temporal abstraction and would benefit greatly from prior knowledge about how events unfold. Given that language, together with vision, is a fundamental modality through which human beings acquire knowledge about the world to guide their future actions, we will leverage both modalities to acquire this prior knowledge.

Project Publications

Conference Publications

  • E.B. Bueno Benito and M. Dimiccoli. CLOT: Closed Loop Optimal Transport for unsupervised action segmentation, 2025 International Conference on Computer Vision, 2025, Honolulu, Hawai'i, pp. 10719-10729.

  • E.B. Bueno Benito and M. Dimiccoli. 2by2: weakly-supervised learning for global action segmentation, 27th International Conference on Pattern Recognition, 2024, Kolkata, in Pattern Recognition, Vol 15315 of Lecture Notes in Computer Science, pp. 380-395, 2024, Cham.
