Enhancing generalizability and feasibility in sample selection: a methodological study of cluster analysis for stratifying populations






This dissertation addresses the need for research findings that generalize to diverse populations when they are used to inform policy-making and funding allocation. Biases favoring majority groups often emerge when variation within subgroups is overlooked and sampling strategies are inadequate. The objective of this study is to help address this issue by providing accessible and effective methods for selecting representative samples, with the ultimate goal of promoting the inclusion of diverse populations and ensuring unbiased estimation of main effects. In educational intervention research, randomized controlled trials (RCTs) have played a pivotal role in demonstrating efficacy. However, reliance on convenience sampling restricts the generalizability of findings beyond the study sample. Recent research has highlighted the lack of representative sampling in federally funded efficacy studies, which motivates the development of design-based approaches to enhance generalizability. The present study focuses specifically on stratified sampling using cluster analysis as a promising method for achieving representative samples. In this context, cluster analysis serves as a dimension reduction technique: it stratifies the population on covariates associated with treatment effect heterogeneity. Samples drawn from the resulting strata support population-level inference, addressing the limitations of convenience sampling. The primary aim of this study is to investigate how decisions made during the cluster analysis process influence the generalizability and feasibility of stratified sampling. Using Monte Carlo simulation and real-world data, the study sheds light on the optimal number of high-quality strata, that is, the number that enhances generalizability without imposing significant recruitment challenges.
These findings offer valuable guidance to researchers in effectively allocating resources and devising sampling strategies that maximize the impact of their study designs. Additionally, this study introduces a novel simulation design framework that can be extended for future methodological research. The framework offers flexibility in designing and testing recruitment strategies and accommodates various algorithms for modeling participation bias. By developing rigorous research designs that promote the inclusion of diverse populations, this study informs effective policy-making and funding allocation, ensuring that research findings are applicable to a broad range of demographic groups.
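As an illustration of the general idea of stratifying a population by cluster analysis and then sampling within strata, the following is a minimal sketch and not the dissertation's own procedure. It assumes k-means as the clustering algorithm, proportional allocation across strata, and a synthetic population with two already-standardized covariates; the function names, the choice of three strata, and the sample size of 30 are all hypothetical.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a stratum label for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct population units
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each unit to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def proportional_allocation(labels, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    sizes = Counter(labels)
    total = len(labels)
    quotas = {s: n * size / total for s, size in sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    # Largest-remainder rounding so the allocations sum exactly to n.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


def stratified_sample(labels, alloc, seed=0):
    """Draw a simple random sample of unit indices within each stratum."""
    rng = random.Random(seed)
    sample = []
    for s, n_s in alloc.items():
        idx = [i for i, lab in enumerate(labels) if lab == s]
        sample.extend(rng.sample(idx, n_s))
    return sorted(sample)


# Synthetic population (assumption): 180 units, two standardized covariates,
# generated around three hypothetical subgroup centers.
rng = random.Random(42)
true_centers = [(0.0, 0.0), (4.0, 4.0), (0.0, 5.0)]
population = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in true_centers
    for _ in range(60)
]

labels = kmeans(population, k=3)
alloc = proportional_allocation(labels, n=30)
sample = stratified_sample(labels, alloc)
```

In this sketch the number of strata `k` is fixed in advance. The trade-off the abstract describes would correspond to varying `k` and comparing how well the resulting samples represent the population against how many units must be recruited from each stratum.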

