Automating program transformations based on examples of systematic edits
Programmers make systematic edits (similar, but not identical, changes to multiple places) during software development and maintenance in order to add features and fix bugs. Finding all the correct locations and making the edits correctly is a tedious and error-prone process. Existing tools for automating systematic edits are limited because they do not create general-purpose edit scripts or suggest edit locations, except for specialized or trivial edits. Since many similar changes occur in similar contexts (in code with similar surrounding dependence relations and syntactic structures), there is an opportunity to automate program transformations based on examples of systematic edits. By inferring systematic edits and relevant context from one or more exemplar changes, automated approaches can (1) apply similar changes to other locations, (2) locate code that requires similar changes, and (3) refactor code which undergoes systematic edits. This thesis seeks to improve programmer productivity and software correctness by automating parts of systematic editing and refactoring. Applying similar, but not identical, code changes to multiple locations with similar contexts requires (1) understanding and relating the common program context relevant to the edits (a program’s syntactic structure, control flow, and data flow) in order to propagate code changes from one location to others, and (2) recognizing differences between locations in order to customize code changes for each location. Prior approaches for propagating nontrivial, general-purpose code changes from one location to another either do not observe the program context when placing edits, or do not handle the differences between locations when customizing edits, producing syntactically invalid or incorrectly modified programs. We design a novel technique and implement it in a tool called Sydit. 
Our approach first creates an abstract, context-aware edit script, which contains a syntax subtree enclosing the exemplar edit, with all concrete identifiers abstracted, and a sequence of edit operations. It then applies the edit script to user-selected locations by establishing both context matching and identifier matching to correctly place and customize the edit. Although Sydit is effective in helping developers correctly apply edits to multiple locations, programmers must still identify all the appropriate locations themselves. When developers omit some of the locations, the edit script inferred from a single code location is not always well suited to help them find the missing ones. One way to infer the edit script is to encode the concrete context. However, the resulting edit script is too specific to the source location, and can therefore only identify locations whose syntax trees are identical to that of the source location, which leads to false negatives. Another way is to encode the context with all identifiers abstracted, but the resulting edit script may match too many locations, which leads to false positives. To suggest edit locations, we use multiple examples to create a partially abstract, context-aware edit script, and use this edit script both to find edit locations and to transform the code. Our experiments show that edit scripts from multiple examples have high precision and recall in finding edit locations and high accuracy when applying systematic edits, because the extracted common context, together with the common concrete identifiers identified from multiple examples, improves the location search without sacrificing edit application accuracy. For systematic edits which insert or update duplicated code, our systematic editing approaches may encourage developers in the bad practice of creating or evolving duplicated code. 
We investigate and evaluate an approach that automatically refactors cloned code based on the extent of systematic edits by factoring out common code and parameterizing any differences between the clones. Our investigation finds that refactoring systematically edited code is not always feasible or desirable. When refactoring is desirable, systematic edits offer a better way to scope the refactoring than whole-method refactoring does. Automatic clone-removal refactoring cannot obviate the need for systematic editing. Developers need tool support for both automatic refactoring and systematic editing. Based on the systematic changes already made by developers for a subset of change locations, our automated approaches facilitate propagating general-purpose systematic changes across large programs, identifying locations requiring systematic changes that developers missed, and refactoring code undergoing systematic edits to reduce code duplication and future repetitive code changes. The combination of these techniques opens a new way of helping developers automate tedious and error-prone tasks when they add features, fix bugs, and maintain software. These techniques also have the potential to guide automated software development and maintenance activities based on existing code changes mined from version histories for bug fixes, feature additions, refactoring, and software migration.
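The trade-off described above between concrete, fully abstract, and partially abstract context can be illustrated with a small sketch. This is not the thesis's algorithm: the tools described here work on syntax trees with control and data dependences, whereas this toy works on flat token sequences. The function names (`tokenize`, `abstract_all`, `abstract_partial`, `matches`), the `$vN` placeholder notation, and the Java-like example snippets are all assumptions made for illustration.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
# Hypothetical keyword set; keywords are never abstracted.
KEYWORDS = {"if", "else", "return", "null", "new", "for", "while"}


def tokenize(code):
    """Split code into identifiers, numbers, and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def abstract_all(tokens):
    """Fully abstract context: every distinct identifier becomes $v0, $v1, ..."""
    names = {}
    out = []
    for tok in tokens:
        if IDENT.fullmatch(tok) and tok not in KEYWORDS:
            names.setdefault(tok, f"$v{len(names)}")
            out.append(names[tok])
        else:
            out.append(tok)
    return out


def abstract_partial(examples):
    """Partially abstract context from several examples: tokens shared by all
    examples stay concrete, tokens that differ become placeholders.
    Toy restriction: the examples must tokenize to the same length."""
    token_lists = [tokenize(e) for e in examples]
    if len({len(t) for t in token_lists}) != 1:
        raise ValueError("toy sketch: examples must have the same token count")
    placeholders = {}
    pattern = []
    for column in zip(*token_lists):
        if len(set(column)) == 1:
            pattern.append(column[0])  # common to all examples: keep concrete
        else:
            placeholders.setdefault(column, f"$v{len(placeholders)}")
            pattern.append(placeholders[column])
    return pattern


def matches(pattern, code):
    """Return the placeholder-to-identifier binding if code fits the pattern,
    or None if it does not. Each placeholder must bind one identifier."""
    tokens = tokenize(code)
    if len(tokens) != len(pattern):
        return None
    binding = {}
    for p, t in zip(pattern, tokens):
        if p.startswith("$v"):
            if not IDENT.fullmatch(t) or t in KEYWORDS:
                return None
            if binding.setdefault(p, t) != t:
                return None  # same placeholder bound to two different names
        elif p != t:
            return None
    return binding


EX1 = "if (log != null) log.close();"
EX2 = "if (out != null) out.close();"

if __name__ == "__main__":
    concrete = tokenize(EX1)
    full = abstract_all(tokenize(EX1))
    partial = abstract_partial([EX1, EX2])
    for candidate in ("if (conn != null) conn.close();",
                      "if (conn != null) conn.flush();"):
        print(candidate,
              matches(concrete, candidate) is not None,
              matches(full, candidate) is not None,
              matches(partial, candidate) is not None)
```

In this toy setting, the concrete pattern rejects both candidates (a false negative for the `close` call), the fully abstract pattern accepts both (a false positive for the `flush` call), and the partially abstract pattern built from two examples accepts only the `close` call, because `close` is common to both examples and therefore stays concrete.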