The relative impact of within-class model misspecifications on enumeration accuracy in latent profile and factor mixture models
Abstract
Latent profile analysis (LPA) and factor mixture modeling (FMM) are frequently used approaches to detecting latent classes, in which several models that differ in the number of classes estimated are compared. Numerous enumeration criteria are used to aid in this decision, but research on their performance has focused on their ability to select the true model when all estimated models being compared have a correctly specified model structure (Henson et al., 2007; Lubke & Muthén, 2007; Nylund et al., 2007; Peugh & Fan, 2013). Previous research (Bauer & Curran, 2004; Lubke & Neale, 2006) has demonstrated that the number of classes is likely to be overestimated when data that are partly explained by a continuous latent factor in the population (e.g., FMM) are fit only to LPA models. However, no research has determined whether such an enumeration bias is present when LPA data are fit to models that include a continuous latent factor in addition to the categorical class factor (i.e., FMM). This simulation study was designed to assess the impact of misspecifications of the within-class factor structure on the accuracy of FMM and LPA in identifying the true number of classes present in the data. Enumeration accuracy under FMM and LPA estimation was compared across several enumeration indices, including information criteria (AIC, AICc, BIC, and nBIC), classification-based indices (entropy and ICL-BIC), and enumeration-specific likelihood ratio tests (LRTs: LMR, aLMR, and BLRT). Data were generated according to each mixture model structure, with two levels each of true class number (K), class separation (MD), class mixing proportions (π), and sample size (N). Each dataset was fit to both FMM and LPA models that had a correct within-class model specification, as well as to FMM and LPA models with a misspecified within-class factor structure that either mistakenly omitted or mistakenly included a continuous latent factor.
The results of this study suggest that enumeration is more strongly affected in LPA than in FMM when the within-class model is misspecified. LPA estimation produced lower enumeration accuracy rates than FMM estimation, regardless of whether the data’s true structure was FMM or LPA. Information criteria, particularly the nBIC, were the most likely to correctly identify K classes, though FMM estimation was generally necessary to achieve high enumeration accuracy (i.e., > 95%). The LRTs were sensitive to within-class model misspecifications, and the classification-based indices had low enumeration accuracy across conditions.
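For readers unfamiliar with the information criteria and classification-based indices named in this abstract, the following is a minimal sketch of their commonly used textbook definitions, each computed from a fitted model's log-likelihood, number of free parameters, sample size, and posterior class probabilities. The function names and the fit values in the usage lines are hypothetical and purely illustrative; they are not taken from the study, and the software used in the study may define some indices slightly differently.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Common definitions of AIC, AICc, BIC, and sample-size-adjusted BIC (nBIC)."""
    dev = -2.0 * loglik  # deviance term shared by all four criteria
    aic = dev + 2 * n_params
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = dev + n_params * math.log(n_obs)
    # nBIC replaces N with (N + 2) / 24 in the penalty term
    nbic = dev + n_params * math.log((n_obs + 2) / 24)
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "nBIC": nbic}


def classification_entropy(posteriors):
    """Sum over cases and classes of -p * ln(p), treating 0 * ln(0) as 0."""
    return -sum(p * math.log(p) for row in posteriors for p in row if p > 0)


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 indicates perfectly separated classes."""
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    return 1.0 - classification_entropy(posteriors) / (n_obs * math.log(n_classes))


def icl_bic(loglik, n_params, n_obs, posteriors):
    """BIC plus twice the classification entropy (penalizes fuzzy classification)."""
    bic = information_criteria(loglik, n_params, n_obs)["BIC"]
    return bic + 2.0 * classification_entropy(posteriors)


# Hypothetical fits for K = 1..3 classes (illustrative values only, N = 200).
fits = {1: (-1050.0, 8), 2: (-1000.0, 13), 3: (-995.0, 18)}
bic_by_k = {k: information_criteria(ll, p, 200)["BIC"] for k, (ll, p) in fits.items()}
best_k = min(bic_by_k, key=bic_by_k.get)  # lower values indicate better fit
```

Under each information criterion and the ICL-BIC, the model with the lowest value is preferred, so enumeration amounts to taking the minimum across the candidate class counts, as in the last line of the sketch.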