Generating Program Generators
by Arne J. Glenstrup, Henning Makholm and Jens Peter Secher
Research in semantics based program manipulation has been carried
out in Copenhagen for more than a decade. The DART project (a
member of ERCIM partner DANIT) is putting theory into practice
by deriving program manipulation tools from programming language
theory. An important instance is that program generators (and
generator generators, etc.), can be produced automatically and
correctly from executable problem specifications.
One of the main goals is to produce tools that are automatic in
the sense that once instantiated by a user, they can be repeatedly
applied to widely varying problem instances without further user
intervention. This semantic approach is applicable to a wide range
of areas:
- program transformation can be safely performed, eg for instrumentation,
simplification, or translation
- optimisation by specialisation of generic programs to specific
tasks can significantly reduce computational overhead
- generation of compilers from interpreters can be achieved automatically.
Partial Evaluation
Partial evaluation is an automated source-to-source transformation
technique that can be used to optimise generic software: a program
together with some known portion of its input, is transformed
into a specialised version obtained by pre-computing the parts
of the program that only depend on this input.
Usually when writing programs there is a tradeoff between, on
the one hand, generic and easily maintainable programs and, on
the other hand, fast programs. Partial evaluation bridges this
gap, letting you have your cake and eat it too: generic programs
are (automatically!) turned into fast programs for specific problem
instances.
In a software development setting, a generic, well-tested module
can be taken off the shelf and then specialised automatically
to a specific usage pattern, thereby speeding it up. Since the
specialisation preserves the semantics of the module, it is possible
to produce both reliable and efficient software.
Specialisation can be done quickly, and this means that rapid
prototyping and production can be unified. As a side-effect, some
of the optimisation usually done by hand can be done in an automatic,
application-independent way. Furthermore, the original program
can be left untouched, so its structure and readability can be
retained.
Theoretical foundation: the Futamura Projections describe the
generation of compilers from interpreters, avoiding the error-prone
task of writing a compiler by hand. DART researchers were the
first in the world to transform these theoretical ideas into practice
by producing an automatic compiler generator using these principles.
Compiler generation has continued to be a focus point of DART
work in partial evaluation.
Partial Evaluators
The most sophisticated partial evaluators today work with semantically
well-defined languages like Scheme and SML (an example is Similix;
see web address below). The languages that are more widely used
in the software industry tend to be less rigorously defined, but
nevertheless the understanding gained by semantics-based techniques
have proved valuable in the development of practically oriented
partial evaluators like DARTs C-Mix for the C language; FSPEC
for Fortran; and a partial evaluator for Java that is under way
in the project.
There is an interaction and exchange of ideas with INRIA. In Rennes,
the COMPOSE group develops the Tempo Specialiser for C which complements
C-Mix by focusing on a very fast specialisation process that makes
it possible to produce specialised object code at run time.
Successful experiments have been carried out in a number of settings,
including ray tracing, model simulations, implementations of domain-specific
languages, numerical algorithms, and compiling by specialising
interpreters.
C-Mix: Partial Evaluation of C
C-Mix is an automatic partial evaluator that specialises C programs
that conform to the ISO C standard. It takes a generic program
together with a classification of the input and produces a specialised-program
generator. When this generator is provided the portion of the
input that is known in advance, it will produce a specialised
program, as depicted in the figure. When the specialised program
is run on the remaining input it produces the same output as the
original program, only faster. Often, the same specialised program
is reused, resulting in significant speedups in the running times.
The specialised-program generator is a stand-alone program that
can be used without any knowledge about partial evaluation.
C-Mix does a number of complex analyses on the subject program
- eg, alias, liveness and binding-time analysis - to calculate
which constructs depend on known input only. Using this information,
the specialised-program generator can be produced.
C-Mix is available free of charge, and is implemented with standard
tools (GNU C/C++ compiler and tools), which means that it is highly
portable (it runs on Unix-like systems, Windows, DOS, etc.). It
has an HTML-based inspection toolkit that provides the developer
with the results of the analyses, enabling fine-tuning of the
specialisation process. URL http://www.diku.dk/topps/Research.html
contains links to C-mix, Similix, and various other DART activities.
The COMPOSE group in Rennes also work on program specialisation
and provide a C program specialiser, see http://www.irisa.fr/compose/.
Please contact:
Jens Peter Secher - DART
Tel: +45 35 32 14 08
E-mail: jpsecher@diku.dk