# Proof engineering for large-scale verification projects

## Access full-text files

## Date

## Authors

## Journal Title

## Journal ISSN

## Volume Title

## Publisher

## Abstract

Software controls many aspects of our daily lives, thus, software correctness is of utmost importance. One way to develop correct-by-construction software is by using proof assistants, i.e., writing machine-checked proofs of correctness at the level of executable code. Although the obtained guarantees via such development are highly desirable, proof assistants are not currently well adapted to large-scale software development, and are expensive to use in terms of both time and expertise. In particular, the productivity of proof engineers is lowered by inadequate interfaces, processes, and tool support, which lone expert users may not be hindered by, but become serious problems in large-scale projects with many contributors.

This dissertation shows that research in traditional software engineering can improve productivity considerably in large-scale verification projects that use proof assistants and facilitate proliferation of formally verified software. Specifically, this dissertation, inspired by research in the software engineering area on regression testing, software evolution, and mutation testing, presents three main bodies of research with the goal to speed up proof checking and help proof engineers to evaluate the quality of their verification projects.

First, this dissertation introduces regression proof selection, a technique that tracks fine-grained dependencies between Coq definitions, propositions, and proofs, and only checks those proofs affected by changes between two revisions. We instantiate the technique in a tool dubbed iCoq. We applied iCoq to track dependencies across many revisions in several large Coq projects and measured the time savings compared to proof checking from scratch and when using Coq’s timestamp-based toolchain for incremental proof checking. Our results show that proof checking with iCoq is up to 10 times faster than the former and up to 3 times faster than the latter.

Second, this dissertation describes the design and implementation of piCoq, a set of techniques that blend the power of parallel proof checking and proof selection. piCoq can track dependencies between files, definitions, and lemmas and perform parallel checking of only those files or proofs affected by changes between two project revisions. We applied piCoq to perform regression proving over many revisions of several large open source projects. Our results indicate that proof-level parallelism and proof selection is consistently much faster than both sequential checking from scratch and sequential checking with proof selection. In particular, 4-way parallelization is up to 28.6 times faster than the former, and up to 2.8 times faster than the latter.

Third, this dissertation introduces mutation proving, a technique for analyzing quality of verification projects that use proof assistants. We implemented our technique for the Coq proof assistant in a tool dubbed mCoq. mCoq applies a set of mutation operators to Coq functions and datatypes, and then checks proofs of lemmas affected by operator application. We applied mCoq to several medium and large scale Coq projects, and recorded whether proofs passed or failed when applying different mutation operators. We then qualitatively analyzed the failed proofs, and found several examples of weak and incomplete specifications.