Publication
VQ-HPS: Human pose and shape estimation in a vector-quantized latent space
Conference Article
Conference
European Conference on Computer Vision (ECCV)
Edition
18th
Pages
471-490
Doc link
https://doi.org/10.1007/978-3-031-72943-0_27
Authors
- Fiche, Guénolé
- Leglaive, Simon
- Alameda Pineda, Xavier
- Agudo Martínez, Antonio
- Moreno Noguer, Francesc
Abstract
Previous works on Human Pose and Shape Estimation (HPSE) from RGB images can be broadly categorized into two main groups: parametric and non-parametric approaches. Parametric techniques leverage a low-dimensional statistical body model for realistic results, whereas recent non-parametric methods achieve higher precision by directly regressing the 3D coordinates of the human body mesh. This work introduces a novel paradigm to address the HPSE problem, involving a low-dimensional discrete latent representation of the human mesh and framing HPSE as a classification task. Instead of predicting body model parameters or 3D vertex coordinates, we focus on predicting the proposed discrete latent representation, which can be decoded into a registered human mesh. This innovative paradigm offers two key advantages. Firstly, predicting a low-dimensional discrete representation confines our predictions to the space of anthropomorphic poses and shapes even when little training data is available. Secondly, by framing the problem as a classification task, we can harness the discriminative power inherent in neural networks. The proposed model, VQ-HPS, predicts the discrete latent representation of the mesh. The experimental results demonstrate that VQ-HPS outperforms the current state-of-the-art non-parametric approaches while yielding results as realistic as those produced by parametric methods when trained with little data. VQ-HPS also shows promising results when training on large-scale datasets, highlighting the significant potential of the classification approach for HPSE.
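As a rough illustration of the idea described in the abstract, the sketch below shows what a vector-quantized latent space and a classification-style prediction can look like in general. It is not the VQ-HPS implementation: the toy 2-D codebook, the example latents and logits, and the helper names `quantize`, `lookup`, and `predict_indices` are all assumptions made for this example.

```python
import math

# Illustrative sketch only: a toy codebook of 2-D vectors (hypothetical values).
# A real model would learn the codebook and use much higher dimensions.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def quantize(latents, codebook):
    """Map each continuous latent vector to the index of its nearest codebook entry."""
    indices = []
    for z in latents:
        dists = [math.dist(z, c) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def lookup(indices, codebook):
    """Recover the quantized vectors from their discrete indices."""
    return [codebook[i] for i in indices]


def predict_indices(logits):
    """Classification view: pick the highest-scoring codebook index per token."""
    return [row.index(max(row)) for row in logits]


# Continuous latents (hypothetical encoder outputs) and their discrete codes.
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]
indices = quantize(latents, CODEBOOK)
quantized = lookup(indices, CODEBOOK)

# Hypothetical per-token class scores over the four codebook entries.
logits = [
    [2.0, 0.1, 0.3, 0.0],
    [0.0, 0.2, 0.1, 3.0],
    [0.1, 0.0, 1.5, 0.4],
]
predicted = predict_indices(logits)

print(indices, predicted, quantized)
```

In this toy setting, a predictor only has to choose one index per token from a finite set, so every prediction decodes to a vector that already exists in the codebook. This is the general property the abstract relies on when it says predictions stay confined to a learned space.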
Categories
computer vision.
Author keywords
Human pose and shape estimation, human mesh recovery, vector quantized autoencoder, transformers.
Scientific reference
G. Fiche, S. Leglaive, X. Alameda, A. Agudo and F. Moreno-Noguer. VQ-HPS: Human pose and shape estimation in a vector-quantized latent space, 18th European Conference on Computer Vision, 2024, Milano, Italy, in Computer Vision – ECCV 2024, Vol 15110 of Lecture Notes in Computer Science, pp. 471-490, 2024.