Facial expression recognition with temporal modeling of shapes
Conditional Random Fields (CRFs) are a discriminative, supervised approach to simultaneous sequence segmentation and frame labeling. Latent-Dynamic Conditional Random Fields (LDCRFs) incorporate hidden state variables into CRFs to model sub-structure motion patterns and the dynamics between labels. Motivated by the success of LDCRFs in gesture recognition, we propose a framework for automatic facial expression recognition from continuous video sequences that models temporal variations within shapes using LDCRFs. We show that the proposed approach outperforms CRFs in recognizing facial expressions. Using Principal Component Analysis (PCA), we study the separability of the expression classes in lower-dimensional projected spaces. By comparing the performance of CRFs and LDCRFs against that of Support Vector Machines (SVMs) and a template-based approach, we demonstrate that temporal variations within shapes are crucial for classifying expressions, especially those with small facial motion such as anger and sadness. We also show empirically that using only changes in facial appearance over time, without shape variations, fails to achieve high performance in facial expression recognition. This reflects the importance of geometric deformations of the face for recognizing expressions.
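
For readers unfamiliar with the model, the way an LDCRF adds hidden states to a CRF can be sketched as follows. This is the commonly used LDCRF formulation, given here only as an illustration. The notation is an assumption and does not come from the text above: x is the sequence of per-frame shape observations, y the per-frame expression labels, h the per-frame hidden states, H_y the set of hidden states reserved for label y (sets for different labels are assumed disjoint), F_k the feature functions, and theta the model parameters.

```latex
% Illustrative notation (assumed, not taken from the abstract):
%   x = (x_1, ..., x_m)  per-frame shape observations
%   y = (y_1, ..., y_m)  per-frame expression labels
%   h = (h_1, ..., h_m)  per-frame hidden states, with h_j restricted to H_{y_j}
\begin{align}
P(\mathbf{y} \mid \mathbf{x}; \theta)
  &= \sum_{\mathbf{h} \,:\, h_j \in \mathcal{H}_{y_j} \ \forall j}
     P(\mathbf{h} \mid \mathbf{x}; \theta), \\
P(\mathbf{h} \mid \mathbf{x}; \theta)
  &= \frac{1}{Z(\mathbf{x}; \theta)}
     \exp\!\Big( \sum_{k} \theta_k \, F_k(\mathbf{h}, \mathbf{x}) \Big), \\
Z(\mathbf{x}; \theta)
  &= \sum_{\mathbf{h}'} \exp\!\Big( \sum_{k} \theta_k \, F_k(\mathbf{h}', \mathbf{x}) \Big).
\end{align}
```

Under this reading, a plain CRF corresponds to the special case in which each label has exactly one hidden state. Allowing several hidden states per label is what lets the model capture sub-structure within an expression, in addition to transitions between expressions.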