Research Project

GREAT: Beyond Graph Neural Networks: Joint graph topology learning and graph-based inference for computer vision


National Project


Project Description

Graphs are a ubiquitous data structure, employed extensively within computer science and related fields, including pattern recognition,
social networks, transportation networks, and biological systems, to name but a few. Recently, Graph Neural Networks (GNNs) have
emerged as a deep learning framework able to operate on graph domains to perform inference tasks in an end-to-end fashion. When the
graph is explicit, GNNs have proven to be an extremely powerful modeling paradigm. However, in a variety of data domains, including
point clouds, text corpora and untrimmed video, the graph structure underlying the data is unknown, and must be assumed or inferred.
Generally speaking, inferring graph topologies from observations is an ill-posed problem, and there are many ways of associating a
topology with the observed data samples. Modern approaches for graph topology inference adopt a Graph Signal Processing (GSP)
perspective, which explicitly models certain properties of the graph signals (e.g., smoothness, sparsity). While this emphasizes the relation
between graph topology and the associated graph signals, the approach is tied to strong a priori assumptions on the signals.
The goal of the current project is the theoretical and computational investigation of models, methods and algorithms for defining a novel
framework that makes it possible to learn the graph topology jointly with an inference task on the graph in a data-driven, end-to-end formulation, hence
combining the strengths of GNNs and GSP.
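
To make the kind of signal prior used in GSP-based topology inference concrete, the following minimal sketch builds a graph from raw samples and measures how smooth the samples are on it, using the Laplacian quadratic form tr(X^T L X) = 1/2 * sum_ij W_ij ||x_i - x_j||^2. It is an illustration only, not the method developed in this project: the function names knn_graph and smoothness, the Gaussian kernel, and the parameters k and sigma are assumptions chosen for the example.

```python
import numpy as np

def knn_graph(X, k=2, sigma=1.0):
    """Symmetric k-nearest-neighbour graph with Gaussian edge weights.

    Hypothetical helper for illustration; X has one sample per row.
    """
    # Pairwise squared Euclidean distances between the rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # For each node, drop everything except its k strongest edges.
    weakest = np.argsort(W, axis=1)[:, :-k]
    mask = np.ones_like(W, dtype=bool)
    np.put_along_axis(mask, weakest, False, axis=1)
    W = np.where(mask, W, 0.0)
    # Symmetrise: keep an edge if either endpoint selected it.
    return np.maximum(W, W.T)

def smoothness(W, X):
    """Laplacian quadratic form tr(X^T L X), with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(X.T @ L @ X))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # 8 toy samples with 3 features each
W = knn_graph(X, k=2)
print(smoothness(W, X))       # small values mean X varies little across edges
```

Here the topology is fixed by a hand-chosen rule (k nearest neighbours) and the smoothness prior is only evaluated afterwards. The framework targeted by the project would instead learn the topology together with the downstream inference task.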

The proposed methodological developments will be put to test on real-world data in the challenging computer vision tasks of temporal
event/action segmentation and event/action localization. This is motivated by recent neuroscientific findings showing that neural event
representations in humans arise from temporal community structures akin to graphs. The proposed solutions are, however, not limited to
this context and may contribute to many other areas of application that share similar challenges, e.g., anomaly detection, change
detection, or image and motion segmentation.
The work plan of the project is structured around the following specific objectives:
1. Exploring the use of different techniques for graph learning regularization, with special emphasis on nonlocal methods to reveal complex
long-range data inter-dependencies.
2. Modeling the joint learning of graph topology and graph embedding in an unsupervised fashion.
3. Modeling end-to-end graph topology learning and clustering by absorbing application-driven priors in the learning problem.
4. Modeling end-to-end graph topology learning and weakly supervised node classification, leveraging dynamic GNN formulations.

The research pursued by this project will constitute a significant theoretical advance in the understanding of graph neural networks in
unstructured contexts, so far a totally unexplored field. It will provide a novel set of advanced methodological and operational tools for
hierarchical representation, segmentation, clustering and classification that will open the door to a new generation of applications with
high social impact in various fields.

Project Publications

Journal Publications

  • M. Dimiccoli and H. Wendt. Learning event representations for temporal segmentation of image sequences by dynamic graph embedding. IEEE Transactions on Image Processing, 30: 1476-1486, 2020.
