A scalable metric learning based voting method for expression recognition
We propose a facial expression classification method based on k-nearest neighbor (kNN) voting under a learned distance metric. To classify a facial expression from a frontal face image, we first learn a distance metric from training data that captures the structure of the feature space, then use this metric to retrieve the nearest neighbors of a query from the training dataset, and finally output the class decided by a vote among those neighbors. An expression is represented as a fusion of face shape and texture: a face image is registered with a landmark-based shape model, and Gabor features are extracted from local patches around the landmarks. This representation gains robustness and effectiveness by combining an ensemble of local patch feature detectors at a global shape level. A naive implementation of metric learning-based (ML-based) kNN incurs a per-query time complexity proportional to the size of the training dataset, which precludes its use with very large datasets. To scale to potentially larger databases, we follow an approach similar to prior work to obtain an approximate yet efficient ML-based kNN voting scheme based on Locality Sensitive Hashing (LSH). A query example is hashed directly to a bucket of a pre-computed hash table in which candidate nearest neighbors can be found, so there is no need to search the entire database. Experimental results on the Cohn-Kanade database and the Moving Faces and People database show that both ML-based kNN voting and its LSH approximation outperform the state of the art, demonstrating the superiority and scalability of our method.
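The retrieval-and-vote pipeline summarized above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work, and several pieces are assumptions made for illustration only: the learned metric is taken to be available as a linear map `L` (so that the learned distance equals Euclidean distance after projecting by `L`), the hash family is the standard p-stable scheme `floor((a·z + b) / w)` for Euclidean distance, and the names `knn_vote`, `LSHKnnVoter`, `n_hashes`, and `width` are hypothetical. Feature extraction (shape registration and Gabor patches) and the metric learning step itself are not shown.

```python
from collections import Counter, defaultdict

import numpy as np


def knn_vote(query, feats, labels, k):
    """Exact kNN voting: majority label among the k rows of feats nearest to query."""
    dists = np.linalg.norm(feats - query, axis=1)  # one distance per training row
    nearest = np.argsort(dists)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


class LSHKnnVoter:
    """Approximate kNN voting: only the query's hash bucket is searched."""

    def __init__(self, L, n_hashes=4, width=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.L = L  # assumed output of a metric learner (hypothetical input)
        # p-stable LSH parameters: Gaussian directions and uniform offsets.
        self.A = rng.normal(size=(n_hashes, L.shape[0]))
        self.b = rng.uniform(0.0, width, size=n_hashes)
        self.width = width
        self.buckets = defaultdict(list)

    def _key(self, z):
        # Concatenate n_hashes quantized projections into one bucket key.
        return tuple(np.floor((self.A @ z + self.b) / self.width).astype(int))

    def fit(self, X, y):
        self.Z = X @ self.L.T  # project training features into the learned space
        self.y = list(y)
        for i, z in enumerate(self.Z):
            self.buckets[self._key(z)].append(i)  # pre-compute the hash table
        return self

    def predict(self, x, k=3):
        z = self.L @ x
        idx = list(self.buckets.get(self._key(z), []))
        if len(idx) < k:
            # Too few candidates in the bucket: fall back to the full scan.
            idx = list(range(len(self.Z)))
        return knn_vote(z, self.Z[idx], [self.y[i] for i in idx], k)
```

In this sketch the exact voter scans every training row for each query, whereas the hashed voter scans only the rows sharing the query's bucket and reverts to the full scan when the bucket holds fewer than `k` candidates. A single hash table is used for brevity; practical LSH schemes commonly use several tables to reduce the chance of missing true neighbors.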