Browsing by Subject "Evaluation"

An analysis of the duties performed by men physical education teachers in Texas high schools
(1942) McCarver, Clarence Edward, 1909-; Not available
Austin Logistics Inc : assessing defect density
(2010-12) Nanchari, Nithin Krishna; Perry, Dewayne E.; Krasner, Herbert
Austin Logistics Inc. Solutions provides tools that help centralize resource management, optimize and maintain compliance of calling schedules for consumer financial service organizations (banks, financial institutions). With the increasing number of customers, the amount of rework and availability of resources had been notably decreasing over time, thereby negatively affecting the overall cost and quality of the software being delivered. The improvement objectives of the company and its departments were broadly stated but lacking a goal-driven nature. The software measurement Goal-Question-Metric (GQM) approach was chosen and used for this research initiative to better support business driven quality improvement. Software defect density data was collected and analyzed to identify significant deviations in the software development life cycle. The results of the initial analysis on the transformed defect-tracking data helped identify the negatively affected areas within the software development life cycle. The data showed significant variations in the requirements, design and implementation phases of the product life cycle, thus helping identify various process improvement opportunities. On quantifying the change in defect density, the effectiveness of using GQM has also provided valuable insights for process improvement. Based on these results, we were able to identify some of the weaknesses and shortcomings in our application development process.
A collaborative approach to IR evaluation
(2014-05) Sheshadri, Aashish; Grauman, Kristen Lorraine, 1979-; Lease, Matthew A.
In this thesis we investigate two main problems: 1) inferring consensus from disparate inputs to improve quality of crowd contributed data; and 2) developing a reliable crowd-aided IR evaluation framework. With regard to the first contribution, while many statistical label aggregation methods have been proposed, little comparative benchmarking has occurred in the community making it difficult to determine the state-of-the-art in consensus or to quantify novelty and progress, leaving modern systems to adopt simple control strategies. To aid the progress of statistical consensus and make state-of-the-art methods accessible, we develop a benchmarking framework in SQUARE, an open source shared task framework including benchmark datasets, defined tasks, standard metrics, and reference implementations with empirical results for several popular methods. Through the development of SQUARE we propose a crowd simulation model that emulates real crowd environments to enable rapid and reliable experimentation of collaborative methods with different crowd contributions. We apply the findings of the benchmark to develop reliable crowd contributed test collections for IR evaluation. As our second contribution, we describe a collaborative model for distributing relevance judging tasks between trusted assessors and crowd judges. Based on prior work's hypothesis of judging disagreements on borderline documents, we train a logistic regression model to predict assessor disagreement, prioritizing judging tasks by expected disagreement. Judgments are generated from different crowd models and intelligently aggregated. Given a priority queue, a judging budget, and a ratio for expert vs. crowd judging costs, critical judging tasks are assigned to trusted assessors with the crowd supplying remaining judgments. Results on two TREC datasets show significant judging burden can be confidently shifted to the crowd, achieving high rank correlation and often at lower cost vs. exclusive use of trusted assessors.
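The budgeted split between trusted assessors and the crowd described in this abstract can be illustrated with a small sketch. It is not the thesis's actual algorithm: the greedy rule, the `JudgingTask` type, the `assign_judges` function, and all cost values are assumptions introduced here for illustration, and the disagreement scores stand in for the output of a predictor such as the logistic regression model the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class JudgingTask:
    doc_id: str
    expected_disagreement: float  # hypothetical score from a disagreement predictor


def assign_judges(tasks, budget, expert_cost, crowd_cost):
    """Greedy sketch: send the most contentious tasks to experts, the rest to the crowd.

    An expert is used only if enough budget would remain for the crowd
    to judge every task that is still waiting in the queue.
    """
    # Highest expected disagreement first (acts as the priority queue).
    ranked = sorted(tasks, key=lambda t: t.expected_disagreement, reverse=True)
    assignment = {}
    remaining = budget
    for i, task in enumerate(ranked):
        tasks_after = len(ranked) - i - 1
        reserve = tasks_after * crowd_cost  # budget kept back for later crowd judgments
        if remaining - expert_cost >= reserve:
            assignment[task.doc_id] = "expert"
            remaining -= expert_cost
        elif remaining >= crowd_cost:
            assignment[task.doc_id] = "crowd"
            remaining -= crowd_cost
        else:
            assignment[task.doc_id] = "unjudged"
    return assignment, remaining


if __name__ == "__main__":
    demo = [
        JudgingTask("a", 0.9),
        JudgingTask("b", 0.5),
        JudgingTask("c", 0.1),
        JudgingTask("d", 0.7),
    ]
    # Made-up costs: an expert judgment costs three times a crowd judgment.
    print(assign_judges(demo, budget=7.0, expert_cost=3.0, crowd_cost=1.0))
```

Changing the expert-to-crowd cost ratio or the budget in the demo shows how many of the highest-disagreement tasks can be kept with trusted assessors before the remainder is shifted to the crowd.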
A critical analysis of the problems in providing vocational-technical education in existing institutions, especially junior colleges
(1948) Brooking, Walter J.; Colvert, Clyde C. (Clyde Cornelius)
Developing a set of holistic signal retiming performance metrics : an evaluation of signal retiming efforts in Austin
(2020-03-27) Cheng, Christine Wang; Machemehl, Randy B.
Evaluating signal retiming efforts is a crucial next step for transportation agencies after retiming their corridors. By analyzing the quality of their retiming efforts with respect to their initial goals for the retiming process, agencies are able to quantitatively assess the effects of signal timing changes and allocate their future resources appropriately. The wide variety of transportation data sources available also facilitates the possibility for agencies to develop a set of performance measures with which to evaluate signal retiming efforts. This thesis presents a set of developed signal retiming performance metrics for use in before-and-after comparison studies that together consider the holistic effect of signal timing changes. These performance metrics are applied to several retimed corridors in the City of Austin to evaluate the effect of signal timing efforts. While more data is needed to supplement several of these performance metrics, these measures significantly improve the current one-dimensional evaluation process used by the City of Austin that relies solely on travel time changes obtained by floating car runs. Finally, this work also includes an initial investigation of seasonal variation on a sample corridor in Austin which could be used to enhance the flexibility of the City of Austin’s retiming schedule and strengthen its understanding of travel time data throughout the year.
Development and application of a framework for observing problem solving by teachers and students in music
(2013-08) Roesler, Rebecca Ann; Duke, Robert A.
The development of problem solving capabilities is an essential part of intellectual independence, yet the nature of problem solving in music instruction has not been investigated systematically. The purposes of the current study were to describe the process of problem solving in the context of music learning and to elucidate the relationship between teacher behavior and learners' active participation in solving musical and technical problems. I analyzed approximately 43 hours of private and small-group lessons taught by five internationally-renowned artist-teachers in music. I also analyzed in greater detail 161 rehearsal frames (intervals of instructional time devoted to definable proximal goals) excerpted from recorded lessons by describing the behaviors of teachers and students that led to productive learning outcomes. The process of problem solving was found to comprise five components: establish goals, evaluate performance, conceive and consider options, generalize and apply principles, and decide and act. In assessing the extent of teachers' and students' involvement in problem-solving, I found that teachers promoted change-effecting behaviors in learners by instigating the pursuit of a goal, and then prompting learners to assume responsibility for one or more of the subsequent problem-solving components. In this way these teachers not only brought about change in learners' performance, but also structured ways for learners to practice bringing about change in their own performance.
Educational efficiency : a study of the work of the Austin public white schools
(1924) Jennings, Elzy Dee; Not available
Emergent practices in the use of online assessment and measurement to evaluate learning
(2010-12) Dutt Majumder, Hemangini; Resta, Paul E.; Liu, Min
This report provides an overview of some of the emergent current practices in using technology to evaluate learning. It starts by examining terminology associated with learning evaluation in terms of literature related to the subject. Several innovative models and tools in practice are discussed in terms of their application, situations they are best suited to, advantages or disadvantages they might have and theories they are based on. Some of these are easy to apply and more practically implementable; others are indicative of advanced technologies that are likely to come into use in the future. The report concludes with a few possible scenarios regarding the context in which these technologies and methods are to be used and the real world considerations that would concern the stakeholders.
Evaluating an energy efficiency project for an existing commercial building
(2011-12) Krasner, William Paul; Nichols, Steven Parks, 1950-; Duvic, Robert Conrad, 1947-
In this thesis I provide general guidelines for a commercial building owner’s decision making process for heating, ventilation, and air-conditioning (HVAC) system energy efficiency projects, discuss an example HVAC project at an existing building, and recommend the most energy-efficient, cost-effective project option. First, a building’s HVAC system’s inefficiencies are identified. The systems and the components can be investigated to understand the nature of the operations. In the building owner’s interests, possible alternatives can be developed to address the systems with improvements. Consulting engineers, contractors, and other building professionals can assist in this process. There are necessary engineering and construction considerations for defining realistic project alternatives. With the alternatives, there are costs, benefits, and trade-offs. The costs, which mainly include the investment and the operational costs, and the benefits, which mainly include the available financial incentives, defined in dollars, are identified for the alternatives. The alternatives can be evaluated with Building Life Cycle Cost (BLCC) software. In this evaluation the net present-value (NPV) method is used to rank the alternatives. Then, the highest-ranking, lowest life-cycle cost, alternative is recommended for the owner. In the example, an existing commercial building’s HVAC systems are considered. The construction plans, the facilities records, and the existing field conditions were investigated and analyzed. A few operational inefficiencies were identified. To address two of these existing inefficiencies, there were alternatives considered to replace the standard-efficiency air handling unit motors with premium-efficiency motors and to renovate the ventilation system with an energy recovery wheel. 
The investment costs, the available rebates, the net annual energy savings, and the energy and other operational costs were estimated, over a 30-year study period, for each of these alternatives, and compared to the costs of the existing system. The BLCC evaluations were performed across a range of discount rates in the present-value calculations. Based on the lowest present-value life-cycle cost reports, the premium-efficiency motor replacement project only is recommended.
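The present-value ranking described in this abstract can be sketched in a few lines. This is not the BLCC software and does not reproduce the thesis's figures: the function names, the cost categories, and every dollar amount below are made-up assumptions, and the formula is the standard discounting of a constant annual cost over the study period.

```python
def present_value_lcc(investment, rebate, annual_cost, discount_rate, years):
    """Present-value life-cycle cost: net investment plus discounted annual costs."""
    pv_operating = sum(
        annual_cost / (1 + discount_rate) ** t for t in range(1, years + 1)
    )
    return investment - rebate + pv_operating


def rank_alternatives(alternatives, discount_rate, years=30):
    """Return (name, life-cycle cost) pairs, lowest present-value cost first."""
    costs = {
        name: present_value_lcc(discount_rate=discount_rate, years=years, **params)
        for name, params in alternatives.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical inputs, in dollars; not taken from the thesis.
    options = {
        "existing system": {"investment": 0, "rebate": 0, "annual_cost": 1000},
        "premium motors": {"investment": 5000, "rebate": 1000, "annual_cost": 800},
    }
    # Repeat the ranking across a range of discount rates.
    for rate in (0.0, 0.03, 0.07):
        print(rate, rank_alternatives(options, discount_rate=rate))
```

Running the ranking at several discount rates mirrors the sensitivity check mentioned in the abstract: an alternative with a high up-front investment tends to look less attractive as the discount rate rises, because its future savings are worth less in present-value terms.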
Evaluation of a land surface solar radiation partitioning scheme using remote sensing and site level FPAR datasets
(2013-08) Wang, Kai, active 2013; Dickinson, Robert E. (Robert Earl), 1940-
Land surface covers only 30% of the global surface, but contributes largely to the intricacy of the climate system by exchanging water and energy with the overlying atmosphere. The partitioning of incident solar radiation among various components at the land surface, especially vegetation and underlying soil, determines the energy absorbed by vegetation, evapotranspiration, partitioning between surface sensible and latent heat fluxes, and the energy and water exchange between the land surface and the atmosphere. Because of its significance in climate models, the solar radiation partitioning scheme of a land surface model should be evaluated in order to ensure its accuracy in reproducing these naturally complicated processes. However, few studies have evaluated this part of climate models. This study examines a land surface solar radiation partitioning scheme, i.e., that of the Community Land Model version 4 (CLM4) with coupled carbon and nitrogen cycles. Taking advantage of multiple remote sensing fraction of absorbed photosynthetically active radiation (FPAR) datasets, ground observations and a unique 28-year FPAR dataset derived from the Global Inventory Modeling and Mapping Studies (GIMMS) Normalized Difference Vegetation Index (NDVI) dataset, we evaluated the CLM4 FPAR’s seasonal cycle, diurnal cycle, long-term trends and spatial patterns. Our findings show the model roughly agrees with observations in the seasonal cycle, long-term trend and spatial patterns but does not reproduce the diurnal cycle. Discrepancies also exist in seasonality magnitudes, peak value months and spatial heterogeneity. We identified the discrepancy in the diurnal cycle as due to the absence of dependence on sun angle in the model. Implementation of sun angle dependence in a one-dimensional (1-D) model is proposed. The need for better relating vegetation to climate in the model indicated by long-term trends is also noted.
Evaluation of the CLM4 land surface solar radiation partitioning scheme using remote sensing and site level FPAR datasets provides targets for future development in its representation of this naturally complicated process.
An evaluation of rural sanitation in India
(2015-08) Mauro, Benjamin Matthew; Eaton, David J.; Weaver, Catherine
One billion people practice open defecation globally resulting in approximately 900,000 deaths via contaminated water and contact with human excreta. India is home to 600 million of the individuals engaging in open defecation, and poor sanitation is estimated to cause over 400,000 deaths annually. The Swachh Bharat Mission, the Indian government's scheme to increase sanitation coverage across India, promotes toilet construction by subsidizing the costs. The program has produced limited uptake in hygienic behavior change, and toilet construction goals are not being met. This study evaluates the effectiveness of sanitation interventions in 22 studies in the rural setting. The review identifies successful sanitation interventions and highlights gaps in the existing literature. Three types of studies were evaluated: infrastructure interventions, education interventions, and interventions that employed a combination of the two methods; and the review of the studies found that interventions utilizing community mobilization and subsidies as a part of their outreach were more likely to increase toilet coverage in the rural environment. The review also provides recommendations for future interventions, research, and implementing organizations operating in the rural sanitation environment. The report was written to inform the work of Humanure Power, an NGO working to end open defecation in rural Bihar, India. The potential for conditional cash transfers and pit latrine volumes were explored as solutions to inducing behavioral change, and the report outlined an evaluation framework for the rural environment. This report provides a framework that tracks multiple indicators and incorporates local help to build a sustainable sanitation tracking system to account for the difficulties of program monitoring and evaluation in a resource-limited environment.
An evaluation of student health service programs in institutions of higher learning in Texas
(1947) Williams, Rhea Hughston, 1911-; Not available
Evaluation of Superpave Fine Aggregate Angularity Specification
(2001-05) Chowdhury, Arif; Button, Joe W.; Kohale, Vipin; Jahn, David W.
The validity of the Superpave fine aggregate angularity (FAA) requirement is questioned by both the owner agencies and the paving and aggregate industries. The FAA test is based on the assumption that more fractured faces will result in higher void content in the loosely compacted sample; however, this assumption is not always true. Some agencies have found that cubical shaped particles, even with 100 percent fractured faces, may not meet the FAA requirement for high-volume traffic. State agencies are concerned that local materials, previously considered acceptable and which have provided good field performance, cannot meet the Superpave requirements. Researchers evaluated angularity of 23 fine aggregates representing most types of paving aggregates used in the USA using seven different procedures: FAA test, direct shear test, compacted aggregate resistance (CAR) test, three different image analyses, and visual inspection. The three image analysis techniques included Hough Transform at University of Arkansas at Little Rock (UALR), unified image analysis at Washington State University (WSU), and VDG-40 videograder at Virginia Transportation Research Council (VTRC). A small study was performed to evaluate relative rutting resistance of HMA containing fines with different particle shape parameters using the Asphalt Pavement Analyzer (APA). The FAA test method does not consistently identify angular, cubical aggregates as high quality materials. There is a fair correlation between the CAR stability value and angle of internal friction (AIF) from the direct shear test. No correlation was found between FAA and CAR stability or between FAA and AIF. Fairly good correlations were found between FAA and all three image analysis methods. Some cubical crushed aggregates with FAA values less than 45 gave very high values of CAR stability, AIF, and ‘angularity’ from imaging techniques. Moreover, the three image analysis methods exhibited good correlation among themselves. A statistical analysis of the SHRP-LTPP (Strategic Highway Research Program-Long-Term Pavement Performance) database revealed no significant evidence of a relationship between FAA and rutting. This lack of relationship is not surprising since many uncontrolled factors contribute to pavement rutting. The APA study revealed that FAA is not sensitive to rut resistance of HMA mixtures. Image analysis methods appear promising for measuring fine aggregate angularity. Until a suitable replacement method(s) for FAA can be identified, the authors recommend that the FAA criteria be lowered from 45 to 43 for 100 percent crushed aggregate. Analysis of the FAA versus rutting data should be examined later as the amount of data in the SHRP-LTPP database is expanded.
'How do we evaluate this?' : Perspectives on evaluation criteria for digital scholarship from the digital humanities community
(2013-05) Pfannenschmidt, Sarah Lynn; Clement, Tanya Elizabeth; Galloway, Patricia K
Since the advent of the World Wide Web, there has been an increasing influx of digital scholarship. Such scholarship is not always recognized as legitimate, in part because digital work is still in its 'incunabula phase' and also because the staggering variety in tools, user communities, etc. engenders a host of potentially competing evaluation priorities. These concerns have created a pressing need for appropriate evaluation criteria to fairly assess digital projects. Though this topic has received substantial attention in the scholarly literature, discrete solutions and the establishment of firm yet flexible evaluation criteria remain elusive. This paper presents a pilot study that sought to clarify the following: what criteria participants use to evaluate digital scholarship, the place of digital tools in the evaluation of scholarship, who should evaluate digital projects, the role of stated intentions in the formation of evaluation criteria, what role the TEI might play in evaluation of text encoding, and finally how this role would be practically implemented. The study indicated that despite the complex nature of the topic, a number of practical solutions may aid in the legitimization of digital scholarship. In particular, including a statement of intent that explains the methodology of the project goes a long way in establishing the relationship between the content and the tools and the criteria to evaluate both components. Two potential roles for the TEI community also emerged: (1) to provide counsel and formative assistance with ongoing projects in a manner targeted towards project evaluations and (2) to consider including a dedicated reviews section in the Journal of the Text Encoding Initiative to feature project evaluations and accept submissions for review. This publication is an ideal online platform for the discussion of review guidelines and may help to clarify what evaluation criteria are necessary to promote fair and accurate assessments of digital projects. Determining what to evaluate and how to do so are perennially relevant questions, and as digital scholarship continues to develop, it will become more important than ever to develop a better understanding of what we value and why we value it.
Place me : location based mobile app for Android platform
(2010-12) Singhal, Aman; Aziz, Adnan; Khurshid, Sarfaraz
This report describes PlaceMe, a client side, mobile application built on the Android platform that provides personal location-based services such as location reminders, bookmarking, mapping and search nearby. The reminder system allows creating location based reminders, and alerts the user to what he needs to do when he is in the right place to do it. Bookmarking allows the user to virtually “save” places of interest while he is on the move and obtain driving directions. Mapping enables the user to visualize his relative geographic location in real time, and map the location reminders and bookmarks. Finally, search nearby exploits Google’s powerful local search engine to allow finding and bookmarking nearby places such as gas stations, restaurants, etc., and retrieving map-based directions. We first discuss the requirements and use-cases for PlaceMe, followed by an introduction to the Android software stack. Next, we describe our design architecture, implementation model, test strategy and key performance enhancements. Then, we evaluate and compare the performance of the Android platform across a set of standard micro-benchmarks. Finally, we conclude with a discussion of future development ideas and present our thoughts on prospects of app-based mobile computing.
Promising practices in superintendent evaluation : a case study of Texas School districts in Education Service Center Region 4
(2012-12) Sandoval, Monica Martinez; Olivárez, Rubén; Cantu, Norma V; de los Santos, Miguel; Garza, Elizabeth P; Sharpe, Edwin R
The primary purpose of this study was to examine the current practice of the superintendent’s evaluation process in three public school districts in Texas. This study collected information about current criteria used, the process as described by superintendents and school board presidents, and their perceptions regarding the effectiveness of the instrument used to measure the performance of the superintendent. A qualitative case study research approach was used to provide the researcher with rich, in-depth, relevant data. The researcher conducted multiple interviews of three superintendents and school board presidents in public school districts in Education Service Region IV of Texas. Additional data was gathered through documents and a reflective journal. There were six themes that emerged from data collected regarding superintendent evaluation: timing, rating, alignment, relationships, performance-based evaluation, and local control. The participating districts modified and adjusted criteria and the process to align with the district context to more closely measure the school district's goals and priorities. The perspectives of superintendents and school board members offer insight into the process and struggles that each has with the overwhelming nature of the job of measuring the performance of the superintendent.
The relationship of the State Board of Control to the state-supported institutions of higher education in Texas
(1930) Moore, Lawrence Henry; Pittenger, Benjamin Floyd, 1883-1969
Reliable and low-cost test collections construction using machine learning
(2021-08-12) Rahman, Md Mustafizur (Ph. D. in information studies); Lease, Matthew A.; Kutlu, Mucahid; Ding, Ying; Howison, James
The development of new search algorithms requires an evaluation framework in which A/B testing of new vs. existing algorithms can be reliably performed. While today's search evaluation methodology is reliable, it relies heavily upon people manually annotating the relevance of many search results, which is slow and expensive. Moreover, this practice has become increasingly infeasible as digital collections have grown ever-larger. Consequently, there is an urgent need today for better IR evaluation methods that are both cost-effective and reliable. My doctoral research focuses on developing low-cost yet reliable IR evaluation methods by integrating state-of-the-art machine learning (ML) techniques with traditional human annotation. More specifically, in this dissertation, I focus on improving system-based IR evaluation methods that rely on constructing test collections. I present my work in four directions: i) understanding the effects of the participating systems on the qualities of a test collection, ii) modeling a machine learning system to reduce the human annotation efforts for a given search topic, iii) allocating annotation budget across search topics via a dynamic feedback loop between a reinforcement learning method and an active learning algorithm, and iv) developing a dataset for hate speech by adapting methods for constructing test collections in IR. In the first direction, I investigate how the number of participating systems impacts the qualities of a test collection. Then I propose a robust prediction model that can be utilized to predict the qualities of test collection even before collecting relevance judgments. As for the second direction, I seek to reduce the human annotation effort needed to evaluate IR systems by using active learning. Specifically, rather than relying entirely on human annotators to judge search results, I propose an amalgam of human annotation and machine intelligence. 
In the third direction, I aim at predicting how human judging effort can be intelligently allocated across different search topics. Whereas traditional approaches allocate the same human judging effort across different search topics, I utilize reinforcement learning, which, in combination with the active learning algorithm, enables us to allocate budget dynamically for each search topic. Finally, I develop a dataset for hate speech by exploring the ideas of developing test collections in IR. My hate speech dataset has a broader coverage of hate speech than prior hate speech datasets.
Remediation of soiled masonry in historic structures contaminated by the Gulf Coast oil spill of 2010
(2011-08) Vora, Payal Rashmikant; Holleran, Michael; Gale, Frances R.
The objective of this thesis was to understand the factors that affect the selection of remedial treatments for the complex staining of masonry materials on cultural resources located in environmentally sensitive sites such as Fort Livingston, Louisiana, on the Gulf Coast of the United States and other locations impacted by pollutants including crude oil. After the Deepwater Horizon oil spill in April 2010, the brick-and-tabby Fort was stained by crude oil. The EPA recommends SWA for removal of oil from solid surfaces such as masonry; however, limited research has been conducted into SWA effective for removal of crude oil from masonry, particularly in remote and environmentally sensitive locations. Research was conducted collaboratively at NCPTT and UT-Austin to identify a series of suitable SWA and to develop methods for evaluating SWA effectiveness in the laboratory. Products were selected for laboratory evaluation that do not require long dwell times, are easy to transport to the site, can be applied with portable equipment, produce effluent that can be collected for off-site disposal, and are listed on the EPA-published NCP Product Schedule. Two sets of 36 brick samples each were soiled with crude oil from the Fort. One set of samples was artificially weathered and one set was unweathered prior to being cleaned with six selected SWA. Laboratory evaluation shows that the primary factor affecting cleaner selection for remediation of brick masonry stained by light crude oil is the extent of weathering of oil on the masonry. For light crude oil, such as that spilled in the Gulf, organic solvent-based cleaners may be most effective if cleaning is possible soon after the staining occurs. Aqueous surfactant cleaners are most effective for removing weathered light crude oil from masonry. The following SWA listed in order of performance are recommended for field trials at Fort Livingston: 1. Cytosol; 2. SC-1000; 3. De-Solv-It APC; 4. De-Solv-It Industrial followed by De-Solv-It APC; 5. De-Solv-It Industrial followed by SC-1000.
The traceable lifecycle model
(2010-12) Nadon, Robert Gerard; Barber, K. Suzanne; Graser, Thomas
Software systems today face many challenges that were not even imagined decades prior. Challenges include the need to evolve at a very high rate, lifecycle phase drift or erosion, inability to prevent the butterfly effect where the slightest change causes unimaginable side effects throughout the system, lack of discipline to define metrics and use measurement to drive operations, and no "silver bullet" or single solution to solve all the problems of every domain, just to name a few. This is not to say that the issues stated above are the only problems. In fact, it would be impossible to list all possible problems--software itself is infinitely flexible bounded only by the human imagination. These are just a portion of the primary challenges today's software engineer faces. There have been attempts throughout the history of software to resolve each one of these challenges. There have been those who tried to tackle them individually, simultaneously, as well as various combinations of them at one time. One such method was to define and encapsulate the various phases within software, which has come to be called a software lifecycle or lifecycle model. Another area of recent research has led to the hypothesis that many of these challenges can be resolved or at least facilitated through proper traceability methods. Virtually none of today's software components are completely derived from scratch. Rather, code reuse and software evolution become a large portion of the software engineer's duties. As Vance Hilderman at HighRely puts it, "Research has shown that proper traceability is vital. For high quality and safety-critical engineering development efforts however, traceability is a cornerstone not just for achieving success, but to proving it as well." So if software is not derived from scratch, having the traceability to know about its origination is invaluable. Given today's struggles, what is in store for the future software engineer?
This paper is an attempt to quantify and answer (or at least project a possibility) that involves a new mindset and a new lifecycle model or structure change that may assist in tackling some of the above referenced issues.