Browsing by Subject "Sampling (Statistics)"

The impact of the inappropriate modeling of cross-classified data structures
(2004) Meyers, Jason Leon; Beretvas, Susan Natasha
Hierarchical linear modeling (HLM) is typically used in the social sciences to model data from clustered settings, such as students nested within classrooms. However, not all multilevel data are purely hierarchical in nature. For example, students can be nested within the neighborhoods in which they live and within the schools they attend. But, most likely, students from a given neighborhood do not all attend the same school and students from a given school do not all reside within the same neighborhood. Because neighborhoods are not nested within schools nor vice versa, the two are said to be cross-classified. Cross-classified random effects modeling (CCREM) is used to model data from these non-hierarchical contexts. While use of CCREM has increased in various disciplines such as medicine, it is seldom used in educational research. CCREM is mentioned in most multilevel modeling textbooks (for example, Raudenbush & Bryk, 2002; Hox, 2002; Snijders & Bosker, 1999). However, it remains infrequently used, most likely because the models are technically sophisticated and can be somewhat difficult to interpret. Little research has been conducted assessing when it is necessary to use CCREM, so this dissertation involved several studies. A Monte Carlo simulation study was conducted in order to investigate potential factors affecting the need to use CCREM as well as the impact of ignoring cross-classification. As a follow-up study, CCREM was applied to a large-scale national data set in order to provide insight into the potential effects of ignoring the cross-classified data structure. Results of both studies indicated that when using HLM instead of CCREM, the fixed effect estimates were unaffected but the standard error estimates associated with the variables modeled incorrectly were biased. In addition, the estimates of the variance components displayed bias.
The observed bias was related to the proportion of the total variance that was between each cross-classified factor, the sample size, and the similarity of the cross-classified factors. Implications and limitations are discussed and suggestions for future research are presented.
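For orientation, the kind of model the abstract above refers to can be written in its simplest textbook form. This is only an illustrative sketch of an unconditional two-way cross-classified model, not necessarily the specification used in the dissertation; the symbols (student i, school j, neighborhood k, and the variance components) are assumptions introduced here.

```latex
% Unconditional two-way cross-classified random effects model
% (generic textbook form; all symbols are illustrative assumptions)
Y_{i(jk)} = \gamma_{0} + u_{j} + v_{k} + e_{i(jk)},
\qquad
u_{j} \sim N(0, \tau_{u}),\quad
v_{k} \sim N(0, \tau_{v}),\quad
e_{i(jk)} \sim N(0, \sigma^{2})

% Proportion of the total variance lying between the units of each
% cross-classified factor (schools j, neighborhoods k)
\rho_{u} = \frac{\tau_{u}}{\tau_{u} + \tau_{v} + \sigma^{2}},
\qquad
\rho_{v} = \frac{\tau_{v}}{\tau_{u} + \tau_{v} + \sigma^{2}}
```

Fitting a purely hierarchical model to such data amounts to dropping one of the two random effects, u or v.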
Improving sampled microprocessor simulation
(2005) Luo, Yue; John, Lizy Kurian
Microprocessor evaluation using detailed cycle-accurate simulation is prohibitively time-consuming. Sampling is the most widely used simulation time reduction technique. In this dissertation, new sampling designs that utilize the characteristics of the workload, the microarchitecture being simulated, and the user’s specific objective are proposed. They improve accuracy and reduce simulation time and storage cost. Statistical sampling theory is employed to study the choice of sampling unit size for simple random sampling with perfect warm-up. More importantly, the inherent characteristic of the benchmarks that affects the choice of sampling unit size is discerned. Previous research has focused on the accuracy of Cycles Per Instruction (CPI). However, most simulations are used to measure the speedup due to some microarchitectural enhancements. A new sampling scheme that employs a ratio estimator from statistical theory is proposed to measure speedup and to quantify its error. In the experiment, 9X fewer instructions are simulated as compared to estimating CPI for the same relative error limit. This dissertation extends sampling techniques to the simulation of commercial workloads such as On-Line Transaction Processing (OLTP) used by banks, airlines, etc. The applicability of simple random sampling and representative sampling for OLTP workloads is investigated. A dynamic stopping rule is proposed for sampling OLTP workloads, which requires only one simulation and thus eliminates the second simulation in previous random sampling methods. In order to achieve accurate sampling results, microarchitectural structures must be adequately warmed up before each measurement. Previous warm-up techniques have not considered the cache configuration being simulated, an important factor in the warm-up length.
This dissertation presents a new cache warm-up technique for sampled microprocessor simulation, which allows the warm-up length to be adaptive to cache configurations and benchmark variability characteristics. As a result, warm-up length has been greatly reduced, especially for small caches, without losing accuracy. For trace-driven simulation, the sampled traces have to be stored. Another contribution of the dissertation is the Locality Based Trace Compression (LBTC) technique, which employs both spatial locality and temporal locality in program memory references. It efficiently compresses not only the address but also other attributes associated with each memory reference.
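The ratio-estimator idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's scheme: it assumes each sampled unit executes the same instructions under both configurations (so speedup is the ratio of total cycles), uses the standard linearized standard error of a ratio estimator, and ignores the finite population correction. The function name and the sample numbers are hypothetical.

```python
import math


def estimate_speedup(base_cycles, enhanced_cycles):
    """Ratio estimate of speedup from paired per-unit cycle counts.

    Assumption: unit i covers the same instructions in both lists, so
    speedup = (total baseline cycles) / (total enhanced cycles).
    Returns (speedup, approximate standard error).
    """
    n = len(base_cycles)
    if n != len(enhanced_cycles) or n < 2:
        raise ValueError("need at least two paired sampling units")

    # Ratio estimator: ratio of sample totals, not the mean of per-unit ratios.
    speedup = sum(base_cycles) / sum(enhanced_cycles)

    # Linearization: residuals of baseline cycles around speedup * enhanced.
    residuals = [b - speedup * e for b, e in zip(base_cycles, enhanced_cycles)]
    residual_var = sum(r * r for r in residuals) / (n - 1)

    mean_enhanced = sum(enhanced_cycles) / n
    std_error = math.sqrt(residual_var / n) / mean_enhanced
    return speedup, std_error


# Hypothetical cycle counts for three sampled units.
speedup, std_error = estimate_speedup([12, 18, 30], [10, 10, 20])
print(f"speedup ~ {speedup:.3f} +/- {std_error:.3f}")
```

Because the baseline and enhanced cycle counts of a unit tend to be strongly correlated, the residuals are small relative to the raw counts, which is one intuition for why a ratio estimate can need fewer sampled units than estimating each CPI separately.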
Qualitative and quantitative sequential sampling
(2006) Rai, Rahul; Campbell, Matthew I.
Sequential sampling refers to a set of design of experiments (DOE) methods in which the next sample point is determined by information from previous experiments. This dissertation introduces the qualitative and quantitative sequential sampling (Q2S2) technique, in which optimization and user knowledge are used to guide the efficient choice of sample points. This method combines information from multiple fidelity sources, including computer simulation models of the product, first principles involved in design, and the designer's qualitative intuitions about the design. Both quantitative and qualitative information from varying fidelity sources are merged to arrive at a new sampling strategy. This is accomplished by introducing the concept of a confidence function, C, which is represented as a field that is a function of the decision variables, x, and the performance parameter, f. The advantages of the approach are demonstrated using various function example cases. The examples include design of a bi-stable Micro Electro Mechanical System (MEMS) relay, a complex and relevant mechanical system. In each case, the performance of Q2S2 is highly encouraging.
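The general idea of sequential sampling described above, choosing each new point from what earlier samples revealed, can be sketched in one dimension. This is not Q2S2 and does not use a confidence function or qualitative knowledge; it swaps in a much simpler, hypothetical rule (bisect the interval with the largest width times response change), and all names are illustrative.

```python
def next_point(xs, fs):
    """Pick the midpoint of the interval with the largest
    (width * change in observed response) among sorted samples."""
    pairs = sorted(zip(xs, fs))
    best = max(
        range(len(pairs) - 1),
        key=lambda i: (pairs[i + 1][0] - pairs[i][0])
        * abs(pairs[i + 1][1] - pairs[i][1]),
    )
    return 0.5 * (pairs[best][0] + pairs[best + 1][0])


def sequential_sample(f, lo, hi, budget):
    """Evaluate f at `budget` points, starting from the two endpoints and
    adding one point at a time based on the samples collected so far."""
    xs = [lo, hi]
    fs = [f(lo), f(hi)]
    while len(xs) < budget:
        x = next_point(xs, fs)
        xs.append(x)
        fs.append(f(x))
    return sorted(zip(xs, fs))


# Hypothetical response with a sharp transition at x = 0.7.
def step(x):
    return 0.0 if x < 0.7 else 1.0


samples = sequential_sample(step, 0.0, 1.0, budget=6)
print([x for x, _ in samples])
```

With this rule the points cluster around the transition rather than spreading evenly, which is the behavior sequential designs aim for. If the response is flat everywhere, all scores tie at zero and the rule simply keeps bisecting the first interval, so a practical method would add a space-filling term.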