Ph.D. in Computer Science
Dissertation: Reimagining the Programming Experience
Liberating the Programmer with Prorogued Programming — SPLASH Onward! '12
Prorogued programming is a new programming paradigm closely aligned with the programmer's thought process by providing the ability to prorogue, to defer a concern, to focus on and finish the current concern, run and experiment with an incomplete program, and gradually and iteratively reify the missing parts while catching design and implementation mistakes early.
Compiler Validation via Equivalence Modulo Inputs — PLDI'14 Distinguished Paper
[July 2016 — UPDATE]: Thus far,
Orion, our initial realization of EMI, and its derivatives by my colleagues
have collectively uncovered more than a thousand bugs in GCC and LLVM (Clang) projects,
most of which have been confirmed and fixed by the developers of the respective compilers.
Toward Rapid Transformation of Ideas into Software — arXiv:1602.01075
A key mission of computer science is to enable people realize their
creative ideas as naturally and painlessly as possible. Software
engineering is at the center of this mission — software technologies
enable reification of ideas into working systems. As computers become
ubiquitous, both in availability and the aspects of human lives they
touch, the quantity and diversity of ideas also rapidly grow. Our
programming systems and technologies need to evolve to make this
reification process — transforming ideas to software — as
quick and accessible as possible.
The goal of this paper is twofold. First, it advocates and highlights
the “transforming ideas to software” mission as a moonshot for
software engineering research. This is a long-term direction for the
community, and there is no silver bullet that can get us there. To
make this mission a reality, as a community, we need to improve the
status quo across many dimensions. Thus, the second goal is to outline
a number of directions to modernize our contemporary programming
technologies for decades to come, describe work that has been
undertaken along those vectors, and pinpoint critical challenges.
Building
White-Box Abstractions by Program Refinement — SPLASH Onward! '16
Abstractions make building complex systems possible. Many facilities provided by
a modern programming language are directly designed to build a certain style of
abstraction. Abstractions also aim to enhance code reusability, thus enhancing programmer productivity and effectiveness. Real-world software systems can
grow to have a complicated hierarchy of abstractions. Often, the hierarchy
grows unnecessarily deep, because the programmers have envisioned the most
generic use cases for a piece of code to make it reusable. Sometimes, the
abstractions used in the program are not the appropriate ones, and it would be
simpler for the higher level client to circumvent such abstractions. Another
problem is the impedance mismatch between different pieces of code or libraries
coming from different projects that are not designed to work together.
Interoperability between such libraries are often hindered by abstractions,
by design, in the name of hiding implementation details and encapsulation.
These problems necessitate forms of abstraction that are easy to manipulate if
needed. In this paper, we describe a powerful mechanism to create white-box
abstractions, that encourage flatter hierarchies of abstraction and ease of
manipulation and customization when necessary: program refinement. In so
doing, we rely on the basic principle that writing directly in the host
programming language is as least restrictive as one can get in terms of
expressiveness, and allow the programmer to reuse and customize existing code
snippets to address their specific needs.
Mehrdad Afshari, Earl Barr, Zhendong Su
Vu Le, Mehrdad Afshari, Zhendong Su
Equivalence modulo inputs (EMI) is a simple, widely
applicable methodology for validating optimizing compilers. Our
key insight is to exploit the close interplay between (1) dynamically
executing a program on some test inputs and (2) statically compiling
the program to work on all possible inputs. Indeed, the test inputs
induce a natural collection of the original program's EMI variants,
which can help differentially test any compiler and specifically target
the difficult-to-find miscompilations.
To create a practical implementation of EMI for validating C
compilers, we profile a program's test executions and stochastically
prune its unexecuted code. Our extensive testing in eleven months
has led to 147 confirmed, unique bug reports for GCC and LLVM
alone. The majority of those bugs are miscompilations, and more
than 100 have already been fixed.
Mehrdad Afshari, Zhendong Su
Mehrdad Afshari, Zhendong Su