Integrating programming languages and databases via program analysis and language design
MetadataShow full item record
Researchers and practitioners alike have long sought to integrate programming languages and databases. Today's integration solutions focus on the data-types of the two domains, but today's programs lack transparency. A transparently persistent program operates over all objects in a uniform manner, regardless of whether those objects reside in memory or in a database. Transparency increases modularity and lowers the barrier of adoption in industry. Unfortunately, fully transparent programs perform so poorly that no one writes them. The goal of this dissertation is to increase the performance of these programs to make transparent persistence a viable programming paradigm. This dissertation contributes two novel techniques that integrate programming languages and databases. Our first contribution--called query extraction--is based purely on program analysis. Query extraction analyzes a transparent, object-oriented program that retrieves and filters collections of objects. Some of these objects may be persistent, in which case the program contains implicit queries of persistent data. Our interprocedural program analysis extracts these queries from the program, translates them to explicit queries, and transforms the transparent program into an equivalent one that contains the explicit queries. Query extraction enables programmers to write programs in a familiar, modular style and to rely on the compiler to transform their program into one that performs well. Our second contribution--called RBI-DB+--is an extension of a new programming language construct called a batch block. A batch block provides a syntactic barrier around transparent code. It also provides a latency guarantee: If the batch block compiles, then the code that appears in it requires only one client-server communication trip. Researchers previously have proposed batch blocks for databases. However, batch blocks cannot be modularized or composed, and database batch blocks do not permit programmers to modify persistent data. We extend database batch blocks to address these concerns and formalize the results. Today's technologies integrate the data-types of programming languages and databases, but they discourage programmers from using procedural abstraction. Our contributions restore procedural abstraction's use in enterprise applications, without sacrificing performance. We argue that industry should combine our contributions with data-type integration. The result would be a robust, practical integration of programming languages and databases.