Browsing by Subject "Fault localization"
Now showing 1 - 7 of 7
Item Applying sequence-to-sequence RNN models to IR-based bug localization (2016-08)
Lemons, Clayton Lindsay; Khurshid, Sarfraz; Saha, Ripon K
Bug localization is the resource-intensive process of finding bugs. A considerable amount of time, effort, and money could be saved if this process were automated. Bug localization based on information retrieval (IR) is a static approach to automation that represents source code files as documents in a database and bug reports as queries. The bug localization approach described in this report is centered around the mental model that evolves in the minds of software developers as they work with a codebase. Using a sequence-to-sequence recurrent neural network (RNN), it may be possible to approximate this mental model by mapping the comments in source code (written in a natural language) to the source code itself (written in a programming language). The model can then be used to convert bug reports (also written in a natural language) to source token keywords for use in IR-based bug localization. The results of experimenting with several approaches to defining the mapping are presented. Although not up to par with the current state-of-the-art, the results show that there is potential in using a sequence-to-sequence RNN for IR-based bug localization.

Item Automated synthesis and debugging of declarative models in Alloy (2018-08)
Wang, Kaiyuan; Khurshid, Sarfraz; Garg, Vijay; Gligoric, Milos; Julien, Christine; Marinov, Darko
In theory, formal specifications offer numerous benefits in developing more reliable software. In practice, however, the use of specifications is rather limited, and practitioners often consider them more trouble than they are worth. Indeed, manually writing detailed specifications using notations that have unfamiliar syntax and semantics can be a daunting task -- even for experienced programmers.
We introduce a new automated approach for synthesis of desired specifications and debugging of faulty specifications using given examples that capture the essence of desired properties and serve as test cases. Our focus is specifications written in the declarative language Alloy -- a first-order logic based on relations with transitive closure, and its SAT-based analysis engine. Our key insight is that a test-driven foundation enables modern approaches to synthesis and debugging of imperative code to serve as a basis for developing novel analogous techniques for declarative specifications. For synthesis, we build on equivalence in relational algebra and introduce techniques for generating candidate Alloy expressions. We also introduce a technique to complete a partial Alloy model with holes using constraint solving. For locating faults in buggy specifications, we build on mutation-based fault localization and introduce techniques for locating likely faulty nodes in the abstract syntax tree of the faulty specification. Moreover, we integrate our expression generation and fault localization techniques to introduce a technique for automated specification repair. We experimentally evaluate our techniques using several Alloy models as subjects, including those with real faults. The results show that our techniques are effective at synthesis and debugging of the subjects. We believe our techniques provide an important step towards increasing the role of formal specifications in developing more reliable software and realizing their promise.

Item Control flow graph visualization and its application to coverage and fault localization in Python (2015-05)
Salling, Jackson Lee; Khurshid, Sarfraz; Julien, Christine
This report presents a software testing tool that creates visualizations of the Control Flow Graph (CFG) from Python source code. The CFG is a representation of a program that shows execution paths that may be taken by the machine.
Similar techniques to the ones here could be applied to many other languages, but the CFGs in this tool are tailored to the Python language. As computers get faster, tools to help programmers be effective at work can become more complex and still give quick feedback, without causing an undue performance burden. This tool explores several approaches to giving feedback to developers through a visualization of the CFG. First, just the viewing of a CFG gives a different perspective on the code. A programmer could choose to juxtapose the CFG with complexity metrics during development, seeing increased complexity as graphs grow larger. Second, the tool implements a mechanism to provide code coverage to Python modules. This feature extends the visualization to show code coverage as a highlighted CFG. Test coverage requirements are calculated to check node, edge, edge-pair, and prime path coverage. From studying existing testing tools, it appears no existing tool for Python provides all these test coverage levels. Third, the tool provides an interface for adding custom highlighting of the CFG, used here to visualize fault localization. Seeing the most suspicious locations from fault localization techniques could be used to reduce debugging time. The results of running the tool on several popular Python packages, and on itself, show its performance is competitive with the most popular coverage tool when measuring branch coverage. It is slightly slower on statement coverage alone, but much faster against an unoptimized version and a logic coverage tool. This report also presents ideas for extensions to the tool. Among them is to incorporate program repair using fault localization and mutation operators.
Visualizing code as a CFG provides interesting ways to look at many software testing metrics.

Item Fault detection and precedent-free localization in thermal-fluid systems (2010-12)
Carpenter, Katherine Patricia; Djurdjanovic, Dragan; Da Silva, Alexandre K., 1975-
This thesis presents a method for fault detection and precedent-free isolation for two types of channel flow systems, which were modeled with the finite element method. Unlike previous fault detection methods, this method requires no a priori knowledge or training pertaining to any particular fault. The basis for anomaly detection was the model of normal behavior obtained using the recently introduced Growing Structure Multiple Model System (GSMMS). Anomalous behavior is then detected as statistically significant departures of the current modeling residuals away from the modeling residuals corresponding to the normal system behavior. Distributed anomaly detection facilitated by multiple anomaly detectors monitoring various parts of the thermal-fluid system enabled localization of anomalous partitions of the system without the need to train classifiers to recognize an underlying fault.

Item A mixed approach to spectrum-based fault localization using information theoretic foundations (2013-12)
Roychowdhury, Shounak; Khurshid, Sarfraz
Fault localization, i.e., locating faults in code, such as faulty statements or expressions, which are responsible for observed failures, is traditionally a manual, laborious, and tedious task. Recent years have seen much progress in automated techniques for fault localization. A particularly promising approach is to utilize program execution spectra to analyze passing and failing runs and compute how likely each statement is to be faulty. Techniques based on this approach have so far largely focused on either using statistical analysis or similarity-based measures, which have a natural application in evaluating such runs.
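As a rough illustration of the similarity-based measures mentioned above, the sketch below ranks statements with the Ochiai coefficient, a widely used spectrum-based formula. It is not the technique proposed in this dissertation; the function name `ochiai_scores`, the input layout, and the sample data are all assumptions made for the example.

```python
from math import sqrt


def ochiai_scores(coverage, outcomes):
    """Rank statements by Ochiai suspiciousness (illustrative sketch).

    coverage: dict mapping test name -> set of executed statement ids
    outcomes: dict mapping test name -> True if the test passed
    Both structures are hypothetical inputs chosen for this example.
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        # ef / ep: failing / passing tests that executed this statement
        ef = sum(1 for t, cov in coverage.items() if stmt in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if stmt in cov and outcomes[t])
        denom = sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return scores


# Made-up spectra: three tests over statements 1..3, two of them failing.
coverage = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 2}}
outcomes = {"t1": False, "t2": True, "t3": False}
scores = ochiai_scores(coverage, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # statement 2 ranks first: only failing tests execute it
```

Statements executed mostly by failing runs score close to 1, while statements that passing runs also execute are pushed down the ranking.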
However, in spite of some initial success, the current techniques lack the effectiveness of localizing the faults with a high degree of confidence in real applications. Our thesis is that information theoretic feature selection can provide a basis for novel techniques that mix coverage of different program elements for improving the effectiveness of fault localization using program spectra. Our basic insight is that each additional failing or passing run can increase the information diversity with respect to the program elements, which can help localize faults in code. For example, the statements with maximum feature diversity information can point to the most suspicious lines of code. This dissertation presents a new fault localization approach that embodies our insight and introduces Bernoulli divergence for feature selection and uses it as the foundation for two novel techniques: (1) mixing of branch and statement coverage information; and (2) varying of feature granularity from function-level to statement-level. An experimental evaluation using a suite of subject programs commonly used in evaluation of fault localization techniques shows that our approach provides an effective basis for fault localization.

Item Systematic techniques for more effective fault localization and program repair (2015-12)
Gopinath, Divya; Khurshid, Sarfraz; Perry, Dewayne; Pingali, Keshav; Julien, Christine; Bias, Randolph
Debugging faulty code is a tedious process that is often quite expensive and can require much manual effort. Developers typically perform debugging in two key steps: (1) fault localization, i.e., identifying the location of faulty line(s) of code; and (2) program repair, i.e., modifying the code to remove the fault(s). Automating debugging to reduce its cost has been the focus of a number of research projects during the last decade, which have introduced a variety of techniques. However, existing techniques suffer from two basic limitations.
One, they lack accuracy to handle real programs. Two, they focus on automating only one of the two key steps, thereby leaving the other key step to the developer. Our thesis is that an approach that integrates systematic search based on state-of-the-art constraint solvers with techniques to analyze artifacts that describe application-specific properties and behaviors provides the basis for developing more effective debugging techniques. We focus on faults in programs that operate on structurally complex inputs, such as heap-allocated data or relational databases. Our approach lays the foundation for a unified framework for localization and repair of faults in programs. We embody our thesis in a suite of integrated techniques based on propositional satisfiability solving, correctness specifications analysis, test-spectra analysis, and rule-learning algorithms from machine learning, implement them as a prototype tool-set, and evaluate them using several subject programs.

Item Unifying regression testing with mutation testing (2014-05)
Zhang, Lingming; Khurshid, Sarfraz
Software testing is the most commonly used methodology for validating quality of software systems. Conceptually, testing is simple, but in practice, given the huge (practically infinite) space of inputs to test against, it requires solving a number of challenging problems, including evaluating and reusing tests efficiently and effectively as software evolves. While software testing research has seen much progress in recent years, many crucial bugs still evade state-of-the-art approaches and cause significant monetary losses and sometimes are responsible for loss of life. My thesis is that a unified, bi-dimensional, change-driven methodology can form the basis of novel techniques and tools that can make testing significantly more effective and efficient, and allow us to find more bugs at a reduced cost.
We propose a novel unification of the following two dimensions of change: (1) real manual changes made by programmers, e.g., as commonly used to support more effective and efficient regression testing techniques; and (2) mechanically introduced changes to code or specifications, e.g., as originally conceived in mutation testing for evaluating quality of test suites. We believe such unification can lay the foundation of a scalable and highly effective methodology for testing and maintaining real software systems. The primary contribution of my thesis is two-fold. One, it introduces new techniques to address central problems in both regression testing (e.g., test prioritization) and mutation testing (e.g., selective mutation testing). Two, it introduces a new methodology that uses the foundations of regression testing to speed up mutation testing, and also uses the foundations of mutation testing to help with the fault localization problem raised in regression testing. The central ideas are embodied in a suite of prototype tools. Rigorous experimental evaluation is used to validate the efficacy of the proposed techniques using a variety of real-world Java programs.