Being able to understand and model fashion can have a great impact on everyday life. In this work we focus on building models that are able to discover and understand fashion. For this purpose we have created the Fashion144k dataset, consisting of 144,169 user posts with images and their associated metadata. We propose the challenging task of identifying the fashionability of the posts and present a Conditional Random Field model that is able not only to predict fashionability but also to give fashion advice to users.
- Neuroaesthetics in Fashion: Modeling the Perception of Fashionability
- Conference on Computer Vision and Pattern Recognition (CVPR), 2015
There are many cases in which data is distributed on a Riemannian manifold. In these cases, Euclidean metrics are not applicable and one needs to resort to geodesic distances consistent with the manifold geometry. For this purpose, we draw inspiration from a variant of the expectation-maximization algorithm that uses a minimum message length criterion to automatically estimate the optimal number of components from multivariate data lying in a Euclidean space. To use this approach on Riemannian manifolds, we propose a formulation in which each component is defined on a different tangent space, thus avoiding the loss of accuracy produced when linearizing the manifold with a single tangent space. Our approach can be applied to any type of manifold for which the tangent space can be estimated.
- Lie Algebra-Based Kinematic Prior for 3D Human Pose Tracking
- International Conference on Machine Vision Applications (MVA) [best paper], 2015
- Geodesic Finite Mixture Models
- British Machine Vision Conference (BMVC), 2014
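As a rough illustration of the tangent-space idea described above (not the mixture model from the papers), the sketch below uses the unit sphere as an example manifold. It shows the geodesic distance, the logarithmic map that sends a point to the tangent space at a base point, and the exponential map that sends it back. The function names and the choice of the sphere are assumptions made for this example only.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def geodesic_distance(p, q):
    """Great-circle distance between unit vectors p and q."""
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot(p, q))))


def log_map(p, q):
    """Map q (on the sphere) to the tangent space at p.

    Undefined when q is antipodal to p; that case is not handled here.
    """
    theta = geodesic_distance(p, q)
    if theta < 1e-12:
        return [0.0] * len(p)
    scale = theta / math.sin(theta)
    return [scale * (qi - math.cos(theta) * pi) for pi, qi in zip(p, q)]


def exp_map(p, v):
    """Map a tangent vector v at p back onto the sphere."""
    norm = math.sqrt(dot(v, v))
    if norm < 1e-12:
        return list(p)
    return [math.cos(norm) * pi + math.sin(norm) * vi / norm
            for pi, vi in zip(p, v)]


# Example points (hypothetical data): the north pole and a point on the equator.
north = [0.0, 0.0, 1.0]
east = [1.0, 0.0, 0.0]

v = log_map(north, east)   # tangent vector at `north` pointing towards `east`
back = exp_map(north, v)   # should recover `east` up to rounding
```

The length of `v` equals the geodesic distance between the two points, which is larger than the straight-line (chord) distance. The tangent space at `north` preserves distances measured from `north` itself, but distortion grows for points far from the base point, which is the motivation for giving each component its own tangent space.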
DaLI descriptors are local image patch representations that have been shown to be robust to deformation and strong illumination changes. These descriptors are constructed by treating the image patch as a 3D surface and then simulating the diffusion of heat along the surface for different intervals of time. Small time intervals represent local deformation properties while large time intervals represent global deformation properties. Additionally, by performing a logarithmic sampling and then a Fast Fourier Transform, it is possible to obtain robustness against non-linear illumination changes. To perform the evaluation, we have created the first feature point dataset that focuses on deformation and illumination changes of real-world objects, on which we show that the DaLI descriptors outperform all the widely used descriptors.
- DaLI: Deformation and Light Invariant Descriptor
- International Journal of Computer Vision (IJCV), 2015
- Deformation and Illumination Invariant Feature Point Descriptor
- Conference on Computer Vision and Pattern Recognition (CVPR), 2011
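The combination of logarithmic sampling and a Fourier transform mentioned above can be motivated with a small toy example. This is a sketch of the general principle under our own assumptions, not the DaLI pipeline: with log-spaced time samples, rescaling the time axis of a signal becomes an index shift, and the magnitude of a discrete Fourier transform is unchanged by a circular shift. The toy signature, its `eigenvalues` and `weights`, and the naive `dft_magnitude` helper are all hypothetical.

```python
import cmath
import math


def heat_signature(t, eigenvalues, weights):
    """Toy heat-style signature: a weighted sum of decaying exponentials."""
    return sum(w * math.exp(-lam * t) for lam, w in zip(eigenvalues, weights))


def dft_magnitude(x):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * f * k / n)
                    for k in range(n)))
            for f in range(n)]


# Hypothetical spectrum, chosen only for illustration.
eigenvalues = [0.5, 1.0, 3.0]
weights = [0.2, 0.5, 0.3]

# Logarithmic sampling: each time is twice the previous one.
times = [2.0 ** i for i in range(-4, 5)]

original = [heat_signature(t, eigenvalues, weights) for t in times]
# Rescaling time by 2 is the same as moving one step along the log-sampled axis.
scaled = [heat_signature(2.0 * t, eigenvalues, weights) for t in times]

# A circular shift leaves the DFT magnitudes unchanged.
rolled = original[1:] + original[:1]
mag_original = dft_magnitude(original)
mag_rolled = dft_magnitude(rolled)
```

Here `scaled[:-1]` coincides with `original[1:]`, and `mag_original` matches `mag_rolled` up to rounding. On real signals the shift is not circular, since samples enter and leave at the ends of the sampled range, so the resulting invariance is only approximate.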
In this research we focus on the semantic segmentation of clothing in still images. This is a very complex task due to the large number of classes, where intra-class variability can be larger than inter-class variability. We propose a Conditional Random Field (CRF) model that is able to leverage many different image features to obtain state-of-the-art performance on the challenging Fashionista dataset.
- A High Performance CRF Model for Clothes Parsing
- Asian Conference on Computer Vision (ACCV), 2014
Kinematic synthesis consists of the theoretical design of robots to comply with a given task. In this project we focus on finite point kinematic synthesis, that is, given a specific robotic topology and a task defined by spatial positions, we design a robot with that topology that complies with the task.
Tree topologies consist of loop-free structures that can have many end-effectors. A characteristic of these topologies is that many joints are shared between end-effectors. As a result, some structures that seem redundant are in fact not redundant once all the end-effectors are considered at once. The main focus of this work is the design of grippers that have topologies similar to that of the human hand, which can be seen as a tree topology.
- Kinematic Synthesis using Tree Topologies
- Mechanism and Machine Theory (MAMT) 72:94-113, 2014
- Design of Multi-fingered Robotic Hands for Finite and Infinitesimal Tasks using Kinematic Synthesis
- Advances in Robot Kinematics (ARK), 2012
- Design of Non-Anthropomorphic Robotic Hands for Anthropomorphic Tasks
- ASME International Design Engineering Technical Conferences (IDETC), 2011
- Kinematic Model of the Hand using Computer Vision
- Degree Thesis, 2011
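The effect of shared joints in a tree topology can be sketched with a planar toy hand. This is only an illustration under our own assumptions, not the synthesis method from the papers. Two shared revolute joints feed two single-joint fingers, and a naive count compares joints against fingertip position constraints. The helper names, link lengths and angles are hypothetical.

```python
import math


def forward_kinematics(angles, lengths):
    """Tip position of a planar serial chain of revolute joints."""
    x = y = heading = 0.0
    for angle, length in zip(angles, lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return (x, y)


def hand_tips(shared, fingers, shared_lengths, finger_lengths):
    """Fingertip positions of a tree: shared joints followed by each finger."""
    return [forward_kinematics(shared + finger, shared_lengths + finger_lengths)
            for finger in fingers]


# Hypothetical planar hand: two shared joints, two fingers with one joint each.
shared = [0.0, 0.0]
fingers = [[math.pi / 2], [-math.pi / 2]]
shared_lengths = [1.0, 1.0]
finger_lengths = [1.0]

tips = hand_tips(shared, fingers, shared_lengths, finger_lengths)

# Naive counting argument: a planar fingertip position gives 2 constraints.
constraints_per_tip = 2
joints_per_branch = len(shared) + len(fingers[0])         # 3 > 2: looks redundant
total_joints = len(shared) + sum(len(f) for f in fingers)  # 4
total_constraints = constraints_per_tip * len(fingers)     # 4: no spare joints
```

Each branch taken alone has three joints for a two-dimensional positioning task, so it appears redundant. Taken together, the tree has four joints for four constraints, because the shared joints move both fingertips at once.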
This line of research focuses on the estimation of the 3D pose of humans from single monocular images. This is an extremely difficult problem due to the large number of ambiguities that arise from the projection of 3D objects onto the image plane. We consider image evidence derived from different detectors for the different parts of the body, which results in noisy 2D estimations whose uncertainty must be compensated for. To deal with these issues, we propose different approaches using discriminative and generative models to enforce learnt anthropomorphic constraints. We show that by exploiting prior knowledge of human kinematics it is possible to overcome these ambiguities and obtain good pose estimation performance.
- A Joint Model for 2D and 3D Pose Estimation from a Single Image
- Conference on Computer Vision and Pattern Recognition (CVPR), 2013
- Single Image 3D Human Pose Estimation from Noisy Observations
- Conference on Computer Vision and Pattern Recognition (CVPR), 2012
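The projection ambiguity described above, and the way a kinematic prior can reduce it, can be shown with a minimal pinhole-camera sketch. This is an illustration under simplifying assumptions (an ideal camera, noise-free joints, and only the global scale ambiguity), not the models from the papers. The joint coordinates and helper names are hypothetical.

```python
import math


def project(point, focal_length=1.0):
    """Ideal pinhole projection of a 3D point onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / z, focal_length * y / z)


def scale(point, s):
    """Move a point along its viewing ray by scaling it about the camera centre."""
    return tuple(s * c for c in point)


# Hypothetical elbow and wrist positions in camera coordinates.
elbow = (0.2, -0.1, 2.0)
wrist = (0.4, 0.1, 2.2)
known_length = math.dist(elbow, wrist)  # forearm length assumed known a priori

# Every scaled copy of the pose projects to the same 2D joints ...
candidate_scales = [0.5, 1.0, 2.5]
projections = [(project(scale(elbow, s)), project(scale(wrist, s)))
               for s in candidate_scales]

# ... but only one candidate agrees with the known bone length.
consistent = [s for s in candidate_scales
              if math.isclose(math.dist(scale(elbow, s), scale(wrist, s)),
                              known_length)]
```

All entries of `projections` agree up to rounding, so the image alone cannot distinguish the candidates, while `consistent` keeps only the scale that matches the known bone length. Real poses have further ambiguities, such as the forward or backward orientation of each limb, which this toy example does not address.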