Spiking neural network for event camera ego-motion estimation

Conference Article


nanoGe MAT-SUS Symposium on Neuromorphic Sensory-Processing-Learning Systems inspired in Computational Neuroscience (NFM)




Event cameras mimic the biological visual pathway by transmitting pulses that encode image intensity changes to the neural system. They are a promising alternative to conventional frame-based cameras for detecting ultra-fast motion, offering low latency, robustness to changing illumination conditions, and low power consumption. These characteristics make them ideal for mobile robotic tasks. However, efficiently exploiting their unconventional, sparse, and asynchronous spatio-temporal data stream to its full capacity remains a challenge for the computer vision community.

Deep Artificial Neural Networks (ANNs), especially the recent vision transformer (ViT) architecture, have achieved state-of-the-art performance on various visual tasks. However, the straightforward use of ANNs on event input data requires a preprocessing step that constrains its sparse and asynchronous nature. Inspired by computational neuroscience, Spiking Neural Networks (SNNs) turn out to be a natural match for event cameras due to their sparse, event-driven, and temporal processing characteristics. SNNs have mostly been applied to classification tasks. Other works address regression tasks such as optical flow estimation, depth estimation, angular velocity estimation, and video reconstruction. However, limited work has been done to incorporate SNNs for full 3D ego-motion estimation.
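To illustrate the event-driven, temporal processing that makes SNNs a natural match for event cameras, the sketch below simulates a layer of leaky integrate-and-fire (LIF) neurons on a sparse binary spike stream. This is a generic textbook LIF model, not the specific neuron model or network used in this work; the decay constant `tau` and threshold `v_th` are illustrative assumptions.

```python
import numpy as np

def lif_forward(event_spikes, tau=0.9, v_th=1.0):
    """Simulate a layer of leaky integrate-and-fire (LIF) neurons.

    event_spikes: (T, N) array of input currents/spikes per timestep.
    tau: membrane leak factor (illustrative value).
    v_th: firing threshold (illustrative value).
    Returns a (T, N) binary array of output spikes.
    """
    T, N = event_spikes.shape
    v = np.zeros(N)                        # membrane potentials
    out = np.zeros((T, N))
    for t in range(T):
        v = tau * v + event_spikes[t]      # leaky integration of input
        spikes = (v >= v_th).astype(float) # fire where threshold is crossed
        v = v * (1.0 - spikes)             # hard reset after a spike
        out[t] = spikes
    return out
```

Note that the neuron only does work when input spikes arrive, which is what lets SNN hardware exploit the sparsity of event-camera data.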

We first present an optimization-based ego-motion estimation framework that exploits the event-based optical flow output of a trained SNN model. Our method successfully estimates pure rotation and pure translation motion from input events only, and shows the potential of SNNs for continuous ego-motion estimation tasks. Second, we present our hybrid RNN-ViT architecture for optical flow estimation, which uses a ViT to learn global context, yielding better results than the state of the art. We further present its SNN counterpart, which uses spiking neurons to process the event data directly.
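As a rough illustration of how ego-motion can be recovered from optical flow by optimization, the sketch below solves the pure-rotation case: under a calibrated pinhole model, the flow induced by an angular velocity is linear in that velocity, so a least-squares fit over per-pixel flow vectors recovers it. This is a minimal textbook formulation, not the framework presented in the talk, and the function names are hypothetical.

```python
import numpy as np

def rotational_flow_matrix(x, y):
    # Flow induced at calibrated image point (x, y) by angular
    # velocity w = (wx, wy, wz) under a pinhole model:
    #   u = x*y*wx - (1 + x^2)*wy + y*wz
    #   v = (1 + y^2)*wx - x*y*wy - x*wz
    return np.array([[x * y, -(1 + x**2),  y],
                     [1 + y**2, -x * y, -x]])

def estimate_rotation(points, flows):
    """Least-squares angular velocity from per-pixel flow vectors.

    points: list of (x, y) calibrated coordinates.
    flows:  list of (u, v) flow vectors at those points.
    """
    A = np.vstack([rotational_flow_matrix(x, y) for x, y in points])
    b = np.concatenate([np.asarray(f, dtype=float) for f in flows])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w
```

In practice the flow vectors would come from the trained SNN's optical flow output rather than being synthesized, and the full framework also handles translation.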


Topics

computer vision

Author keywords

spiking neural network, ego-motion, event camera, transformer

Scientific reference

Y. Tian and J. Andrade-Cetto. Spiking neural network for event camera ego-motion estimation, 2022 nanoGe MAT-SUS Symposium on Neuromorphic Sensory-Processing-Learning Systems inspired in Computational Neuroscience, 2022, Barcelona.