Operational characteristics of mixed-format multistage tests using the 3PL testlet response theory model
Multistage tests (MSTs) have received renewed interest in recent years as an effective compromise between fixed-length linear tests and computerized adaptive tests. Most MST studies have scored the assessments using item response theory (IRT) methods. Many assessments are currently being developed as mixed-format assessments that administer both standalone items and testlets, which are clusters of items associated with a common stimulus. By its nature, a testlet creates a dependency among the items within it that violates the local independence of items, a fundamental assumption of IRT models. Using dichotomous IRT methods on a mixed-format testlet-based assessment therefore knowingly violates local independence. By combining the score points within a testlet, researchers have successfully applied polytomous IRT models. However, such models lose information because they do not use the unique response pattern provided by each item within a testlet. The three-parameter logistic testlet response theory (3PL-TRT) model is a measurement model developed to retain the uniqueness in response patterns of each item while accounting for the local dependency exhibited by a testlet, known as the testlet effect. Because few studies have examined mixed-format MST administration under the 3PL-TRT model, this dissertation used a simulation study to investigate the administration of mixed-format testlet-based MSTs under the 3PL-TRT model. Simulee responses were generated from the 3PL-TRT calibrated item parameters of a real large-scale passage-based standardized assessment. The manipulated testing conditions comprised four panel designs, two test lengths, three routing procedures, and three conditions of local item dependence. The study found functionally no bias across testing conditions. All conditions showed adequate measurement properties, but a few differences did occur between some of the testing conditions.
Measurement precision was affected by panel design, test length, and the magnitude of local item dependence. The three-stage MSTs consistently showed slightly lower measurement precision than the two-stage MSTs. As expected, the longer test length conditions had better measurement precision than the shorter test length conditions. Conditions with the largest magnitude of local item dependence showed the worst measurement precision. The routing procedure had little impact on measurement effectiveness.
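For readers unfamiliar with the 3PL-TRT model, the following minimal sketch shows how a single simulee response could be generated under it. It assumes the form commonly given in the testlet response theory literature, in which a person-specific testlet effect (here `gamma`) is subtracted inside the logistic term and is zero for standalone items. The abstract does not state the equation, the scaling, or any parameter values, so the function names, the omission of a scaling constant, and the numbers below are illustrative assumptions rather than the dissertation's actual procedure.

```python
import math
import random


def p_correct_3pl_trt(theta, a, b, c, gamma=0.0):
    """Probability of a correct response under an assumed 3PL-TRT form.

    theta: examinee ability
    a, b, c: item discrimination, difficulty, and pseudo-guessing parameters
    gamma: person-specific testlet effect (0.0 for a standalone item)
    """
    z = a * (theta - b - gamma)
    return c + (1.0 - c) / (1.0 + math.exp(-z))


def simulate_response(theta, a, b, c, gamma=0.0, rng=random):
    """Draw one dichotomous (0/1) response for a simulee."""
    return int(rng.random() < p_correct_3pl_trt(theta, a, b, c, gamma))


# Hypothetical item parameters, chosen only for illustration.
p_standalone = p_correct_3pl_trt(theta=0.0, a=1.0, b=0.0, c=0.2)
p_in_testlet = p_correct_3pl_trt(theta=0.0, a=1.0, b=0.0, c=0.2, gamma=0.5)
print(p_standalone, p_in_testlet)
```

With `gamma` set to zero the expression reduces to the ordinary 3PL model, which is how standalone items and testlet items can share one scoring function in a mixed-format test. In this sketch a larger spread of `gamma` across simulees would correspond to a stronger local item dependence condition.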