Browsing by Subject "Psychometrics"

A comparison of computer-based classification testing approaches using mixed-format tests with the generalized partial credit model
(2010-08) Kim, Jiseon; Dodd, Barbara Glenzing; Beretvas, Susan N.; Whittaker, Tiffany A.; Vaughn, Brandon K.; Davis, Laurie L.
Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of computer-based classification testing, including: 1) Computerized adaptive test (CAT); 2) Multistage test (MST); 3) Sequential probability ratio test (SPRT), among others. The purpose of this study was to systematically compare the differences in classification decision precision among several testing approaches (i.e., CAT, MST, and SPRT) given three test lengths and three cutoff scores using mixed-format tests based on the generalized partial credit model. The progressive-restricted exposure control procedure and constrained CAT content balancing procedure with test unit types were also incorporated as part of this study. All conditions were evaluated in terms of the classification decision precision and the exposure control property. Overall, this study’s results indicated that all three approaches performed well in terms of classifying people into two categories. The CAT and SPRT approaches produced, on average, comparable results with both performing relatively better than the MST approach in the precision of their classification decision. As the test length increased, the classification decision accuracy generally increased for all approaches; however, the CAT and SPRT approaches yielded more accuracy with the shorter test length. In terms of cutoff scores, predicting classification decision differed according to the location of cutoff scores based on the normal distribution of examinees. In terms of exposure control properties, the progressive-restricted exposure control procedure with the pre-set maximum test unit exposure rate was implemented effectively into the CAT and SPRT approaches. 
The CAT approach had, on average, a higher proportion of test units with low test unit exposure rates and produced better results in pool utilization rates than the SPRT approach. Finally, the MST approach administered all test units constructed for the panels for each condition. However, it had, on average, a higher proportion of test units with high test unit exposure rates because computations were based only on the proportion of the whole test unit pool used for constructing the MST panels.
A comparison of latent growth models for constructs measured by multiple indicators
(2005) Leite, Walter Lana; Stapleton, Laura M.
Latent growth modeling (LGM) of composites of multiple items (for example, means or sums of items) has been frequently used to analyze the growth of latent constructs. However, composites are only equivalent to latent constructs if the items’ factor loadings are equal to one and there is no measurement error (Bollen & Lennox, 1991). In this study, the adequacy of using univariate LGM to model composites of multiple items, as well as that of three alternative methods, was evaluated through a Monte Carlo simulation study. The four methods evaluated in this study were the univariate LGM, the univariate LGM with fixed error variances, the univariate LGM with the correction for attenuation, and the curve-of-factors model (McArdle, 1988; Tisak & Meredith, 1990). This simulation study manipulated the number of items per construct, the number of measurement times, the sample size, the reliability of the composites, the invariance of item parameters, and whether the items were essentially tau-equivalent or essentially congeneric. One thousand datasets were simulated for each of the conditions. The results indicate that using univariate LGM with composites of multiple items only produces unbiased parameter estimates and standard errors if the items are essentially tau-equivalent. The univariate LGM with fixed error variances performed identically to the univariate LGM. The univariate LGM with the correction for attenuation produced unbiased parameter estimates when the items were essentially tau-equivalent, but produced negatively biased estimates of standard errors. The curve-of-factors model was found to be the most appropriate method to analyze the growth of latent constructs measured by multiple items. The curve-of-factors model was able to provide unbiased parameter estimates and standard errors under all conditions evaluated in this study. 
However, with sample sizes of 100 or 200, a large percentage of chi-square statistics were positively biased and the fit indices indicated inadequate model fit. This study’s recommendation is that the curve-of-factors model should be preferred to analyze the growth of latent variables measured by multiple items, but the use of sample sizes larger than 200 is strongly recommended to help ensure that adequate fit statistics and fit indices are obtained for appropriate models.
Estimating the latent trait from Likert-type data : a comparison of factor analysis, item response theory, and multidimensional scaling
(1991) Chan, Chihyu, 1959-; Koch, William R.
Seven statistical procedures were compared with one another in terms of the ability to recover a unidimensional latent trait from Likert-type data. They are factor analysis based on either Pearson correlations (FA-PR) or polychoric correlations (FA-PL), the graded response model in item response theory (IRT-GRM), internal unfolding (IMDU), external unfolding (EMDU), weighted unfolding (WMDU), and the common procedure of summing up successive integers assigned to response categories (SSI). Sample size, test length, and skewness of item response distributions were manipulated in this simulation study. Generally speaking, IRT-GRM performed the best and was most robust against skewness. FA-PR and FA-PL performed equally well across almost all conditions but were competitive with IRT-GRM only when item responses were normally distributed. SSI practice might be slightly worse than the two FA procedures when item responses were normally distributed, but it was better than them when item responses were highly skewed. WMDU performed as well as SSI only when item responses were normally distributed or moderately skewed and sample size was large for MDS models (e.g., N=100). IMDU and EMDU were even worse than WMDU and did not appear appropriate for Likert-type data.
Examining the invariance of item and person parameters estimated from multilevel measurement models when distribution of person abilities are non-normal
(2013-05) Moyer, Eric; Pituch, Keenan A.
Multilevel measurement models (MMM), an application of hierarchical generalized linear models (HGLM), model the relationship between ability level estimates and item difficulty parameters, based on examinee responses to items. A benefit of using MMM is the ability to include additional levels in the model to represent a nested data structure, which is common in educational contexts, by using the multilevel framework. Previous research has demonstrated the ability of the one-parameter MMM to accurately recover both item difficulty parameters and examinee ability levels, when using both 2- and 3-level models, under various sample size and test length conditions (Kamata, 1999; Brune, 2011). Parameter invariance of measurement models, meaning that parameter estimates are equivalent regardless of the distribution of the ability levels, is important when the typical assumption of a normal distribution of ability levels in the population may not be correct. An assumption of MMM is that the distribution of examinee abilities, which is represented by the level-2 residuals in the HGLM, is normal. If the distribution of abilities in the population is not normal, as suggested by Micceri (1989), this assumption of MMM is violated, which has been shown to affect the estimation of the level-2 residuals. The current study investigated the parameter invariance of the 2-level 1P-MMM, by examining the accuracy of item difficulty parameter estimates and examinee ability level estimates. Study conditions included the standard normal distribution, as a baseline, and three non-normal distributions having various degrees of skew, in addition to various test lengths and sample sizes, to simulate various testing conditions. The study's results provide evidence for overall parameter invariance of the 2-level 1P-MMM, when accounting for scale indeterminacy from the estimation process, for the study conditions included. 
Although the errors in the item difficulty parameter and examinee ability level estimates in the study were not of practical importance, there was some evidence that ability distributions may affect the accuracy of parameter estimates for items with difficulties greater than those represented in this study. Also, the accuracy of ability estimates for non-normal distributions seemed lower for conditions with greater test lengths and sample sizes, indicating possible increased difficulty in estimating abilities from non-normal distributions.
Influence of the home environment on diet quality and weight status of adolescents : a social ecological framework
(2015-12-02) Tabbakh, Tamara; Freeland-Graves, Jeanne H.; Finnell, Richard; Jolly, Christopher; Lewis, Karron; Steinhardt, Mary
The home environment is a critical setting for the development of weight status in adolescence. At present a limited number of valid and reliable tools are available to evaluate the weight-related comprehensive home environment of this population. Aim 1a was to develop and validate the Multidimensional Home Environment Scale (MHES), which measures multiple components of the home. This scale includes psychological, social, and environmental domains from the perspective of adolescents and their mothers. After establishing content validity via an expert panel in nutrition, a validation sample of 218 mother-adolescent dyads completed a demographics survey and the original version of the MHES. A focus group with the target population of adolescents (n=7) was conducted and feedback regarding item difficulty, content, bias, and relevance was incorporated. Principal components analysis yielded a 12-factor structure for adolescents and a 14-factor structure for mothers. Internal consistency reliability was achieved for the majority of subscales, with α=0.5-0.9 for adolescents and α=0.7-0.9 for mothers. In addition, the MHES showed test-retest reliability for both adolescents (r=0.90) and mothers (r=0.91). Aim 1b was to develop and validate a Nutrition Knowledge scale using the same sample as Aim 1a. Nutrition knowledge was assessed in this sample of 114 dyads. A 20-item scale was modified from a previous version developed by the author. This instrument was composed of multiple-choice questions classified into four categories of knowledge: macronutrient, micronutrient, healthy eating and physical activity recommendations, and fast-food nutrition. Content validity of the scale was established using feedback from an expert panel in nutrition (n=10) and a focus group of the sample population tested (n=7). The scale demonstrated high internal consistency reliability (adolescents: α=0.70, mothers: α=0.78) and test-retest reliability (adolescents: r=0.47, p=0.01, mothers: r=0.77, p=0.00). 
Aim 2 was to examine the impact of the comprehensive home environment on diet quality and weight status of adolescents using the MHES. A sample of 206 mothers and adolescents was recruited from local middle schools in the Austin area and completed a demographics survey, the final version of the MHES, a Food Frequency Questionnaire, and a Nutrition Knowledge scale online. Weight and height of adolescents were measured by the author using standard protocols. Body Mass Index (BMI)-for-age percentiles were determined using the Centers for Disease Control and Prevention growth charts. Diet quality was estimated using the Healthy Eating Index-2010. Two models were created and reported in this dissertation. The first univariate model included each of the home environment factors as independent variables, and diet quality and BMI as dependent variables. The second model was developed using significant variables only from the initial model. Availability of healthy foods (p=0.00), healthy eating attitude (p=0.01), and accessibility to unhealthy foods (p=0.04) in the home were the strongest predictors of diet quality. Self-efficacy (p=0.02) and availability of healthy foods (p=0.02) emerged as significant predictors of BMI. Aim 3 of this dissertation research was to determine the effect of nutrition knowledge on the home environment and diet quality using the Healthy Eating Index-2010. This aim was accomplished using the same sample as Aim 2. It was hypothesized that the comprehensive home, with its psychological, social, and environmental features, would mediate the relationship between maternal nutrition knowledge and diet quality. A non-linear relationship between nutrition knowledge of the mother and diet quality of the adolescent was observed. Inclusion of the mediator in the model yielded significant estimates of the indirect effect (β=0.61, 95% CI: 0.3-1.0), with a 65.2% reduction in the model. 
This suggests that the home environment functioned as a partial mediator of the influence of nutrition knowledge on diet quality. Then, mediation analysis with the combination of psychological, social, and environmental factors was conducted in three separate regressions. Psychological (β=0.46), social (β=0.23), and environmental (β=0.65) variables were all significant mediators of nutrition knowledge on diet quality. Collectively, these results suggest that the MHES is an appropriate tool for measurement of the nutritional home environment of adolescents. The home environment appeared to significantly modulate diet quality and BMI of adolescents, particularly with respect to availability of healthy foods, healthy eating attitudes, and self-efficacy.
An investigation of stratification exposure control procedures in CATs using the generalized partial credit model
(2006) Johnson, Marc Anthony; Dodd, Barbara Glenzing
The a-stratification procedure of item exposure control was designed to stratify items by item discrimination to ensure that an adaptive test would administer items from the entire range of items, not just the most informative ones. An improvement to the a-stratification method, the a-stratification with b-blocking procedure, added stratification according to item difficulty in order to take into account any correlation that might exist within the item pool between item discrimination and item difficulty. These procedures have been shown to work well using dichotomous items. This dissertation explored both stratification procedures using polytomous item pools to investigate whether or not an optimum number of strata could be implemented when administering polytomous computerized adaptive tests. In addition to the stratification procedures, two other exposure control conditions were studied. The randomesque procedure was used in one condition while a no-exposure-control condition served as a baseline condition. Items calibrated according to the generalized partial credit model were used to construct two item pools. Since the items covered three areas of science, content balancing procedures were incorporated to ensure that each adaptive test provided the appropriate balance of content. Maximum likelihood estimation was used to estimate ability levels from simulated CATs. The number of strata used with both stratification procedures ranged from two to five, to ensure enough items per stratum. Along with descriptive statistics and correlations, bias and root mean squared error helped portray the accuracy of the simulated tests. Item exposure and item pool usage rates were used to show how much of each item pool was being used across administrations of the tests. Finally, item overlap rates were calculated to show how many of the same items were being used among simulated examinees of similar and different abilities. 
The results of this study did not reveal an optimum number of strata for the stratification procedures with either item pool. Furthermore, the randomesque procedure outperformed the stratification procedures in terms of item exposure and item overlap rates for both item pools. This surprising result was not affected by the number of strata used within the stratification procedures.
Optimal Waist-to-Hip Ratios in Women Activate Neural Reward Centers in Men
(Public Library of Science, 2010-02-05) Platek, Steven M.; Singh, Devendra
Secondary sexual characteristics convey information about reproductive potential. In the same way that facial symmetry and masculinity, and shoulder-to-hip ratio convey information about reproductive/genetic quality in males, waist-to-hip-ratio (WHR) is a phenotypic cue to fertility, fecundity, neurodevelopmental resources in offspring, and overall health, and is indicative of “good genes” in women. Here, using fMRI, we found that males show activation in brain reward centers in response to naked female bodies when surgically altered to express an optimal (~0.7) WHR with redistributed body fat, but relatively unaffected body mass index (BMI). Relative to presurgical bodies, brain activation to postsurgical bodies was observed in bilateral orbital frontal cortex. While changes in BMI only revealed activation in visual brain substrates, changes in WHR revealed activation in the anterior cingulate cortex, an area associated with reward processing and decision-making. When regressing ratings of attractiveness on brain activation, we observed activation in forebrain substrates, notably the nucleus accumbens, a forebrain nucleus highly involved in reward processes. These findings suggest that an hourglass figure (i.e., an optimal WHR) activates brain centers that drive appetitive sociality/attention toward females that represent the highest-quality reproductive partners. This is the first description of a neural correlate implicating WHR as a putative honest biological signal of female reproductive viability and its effects on men's neurological processing.
Personality Consistency in Dogs: A Meta-Analysis
(Public Library of Science, 2013-01-23) Fratkin, Jamie L.; Sinn, David L.; Patall, Erika A.; Gosling, Samuel D.
Personality, or consistent individual differences in behavior, is well established in studies of dogs. Such consistency implies predictability of behavior, but some recent research suggests that predictability cannot be assumed. In addition, anecdotally, many dog experts believe that ‘puppy tests’ measuring behavior during the first year of a dog's life are not accurate indicators of subsequent adult behavior. Personality consistency in dogs is an important aspect of human-dog relationships (e.g., when selecting dogs suitable for substance-detection work or placement in a family). Here we perform the first comprehensive meta-analysis of studies reporting estimates of temporal consistency of dog personality. A thorough literature search identified 31 studies suitable for inclusion in our meta-analysis. Overall, we found evidence to suggest substantial consistency (r = 0.43). Furthermore, personality consistency was higher in older dogs, when behavioral assessment intervals were shorter, and when the measurement tool was exactly the same in both assessments. In puppies, aggression and submissiveness were the most consistent dimensions, while responsiveness to training, fearfulness, and sociability were the least consistent dimensions. In adult dogs, there were no dimension-based differences in consistency. There was no difference in personality consistency in dogs tested first as puppies and later as adults (e.g., ‘puppy tests’) versus dogs tested first as puppies and later again as puppies. Finally, there were no differences in consistency between working versus non-working dogs, between behavioral codings versus behavioral ratings, and between aggregate versus single measures. Implications for theory, practice, and future research are discussed.