Show simple item record

dc.contributor.advisorKim, Miryung
dc.creatorRay, Baishakhien
dc.date.accessioned2013-11-11T18:45:15Zen
dc.date.issued2013-08en
dc.date.submittedAugust 2013en
dc.identifier.urihttp://hdl.handle.net/2152/22103en
dc.descriptiontexten
dc.description.abstractSoftware forking---creating a variant product by copying and modifying an existing project---is often considered an ad hoc, low cost alternative to principled product line development. To maintain forked projects, developers need to manually port existing features or bug-fixes from one project to another. Such manual porting is not only tedious but also error-prone. When the contexts of the ported code vary, developers often have to adapt the ported code to fit its surroundings. Faulty adaptations or inconsistent updates of the ported code could potentially introduce subtle inconsistencies in the codebase. To build a deeper understanding to cross-system porting and porting related errors, this dissertation investigates: (1) How can we identify ported code from software version histories? (2) What is the overhead of cross-system porting required to maintain forked projects? (3) What is the extent and characteristics of porting errors that occur in practice? and (4) How can we detect and characterize potential porting errors? As a first step towards assessing the overhead of cross-system porting, we implement REPERTOIRE, a tool to analyze repeated work of cross-system porting across peer projects. REPERTOIRE can detect ported edits between program patches with high accuracy of 94% precision and 84% recall. Using REPERTOIRE, we study the temporal, spatial, and developer dimensions of cross-system porting using 18 years of parallel evolution history of the BSD product family. Our study finds that cross-system porting happens periodically and the porting rate does not necessarily decrease over time. The upkeep work of porting changes from peer projects is significant and currently, porting practice seems to heavily depend on developers doing their porting job on time. Analyzing version histories of Linux and FreeBSD, we derive five categories of porting errors, including incorrect control- and data-flow, code redundancy, and inconsistent identifier and token renamings. Leveraging this categorization, we design a static control- and data-dependence analysis technique, SPA, to detect and characterize porting inconsistencies. SPA detects porting inconsistencies with 65% to 73% precision and 90% recall, and identify inconsistency types with 58% to 63% precision and 92% recall on average. In a comparison with two existing error detection tools, SPA outperforms them with 14% to 17% better precision.en
dc.format.mimetypeapplication/pdfen
dc.language.isoen_USen
dc.subjectSoftware evolutionen
dc.subjectForkingen
dc.subjectPortingen
dc.subjectRepetitive changesen
dc.subjectCode clonesen
dc.subjectStatic analysisen
dc.subjectSubgraph isomorphismen
dc.subjectBugen
dc.subjectError detectionen
dc.subjectCopy-paste erroren
dc.titleAnalysis of cross-system porting and porting errors in software projectsen
dc.date.updated2013-11-11T18:45:16Zen
dc.description.departmentElectrical and Computer Engineeringen
thesis.degree.departmentElectrical and Computer Engineeringen
thesis.degree.disciplineElectrical and Computer Engineeringen
thesis.degree.grantorThe University of Texas at Austinen
thesis.degree.levelDoctoralen
thesis.degree.nameDoctor of Philosophyen


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record