Software Renovation
by Arie van Deursen
In 1976, Belady and Lehman formulated their Laws of Program Evolution
Dynamics. First, a software system that is used will undergo
continuous modification. Second, the unstructuredness (entropy)
of a system increases with time, unless specific work is done
to improve the system's structure. This activity of improving
legacy software systems is called system renovation. It aims at
making existing systems more comprehensible, extensible, robust
and reusable.
Because a typical industrial or governmental organization
has millions of lines of legacy code in continuous maintenance,
well-applied software renovation can lead to significant information
technology budget savings. For that reason, in 1996 Dutch bank
ABN AMRO and Dutch software house Roccade commissioned a renovation
research project. The research was carried out by CWI, the University
of Amsterdam, and ID Research. The goals of the project included
the development of a generic renovation architecture, as well
as application of this architecture to actual renovation problems.
Of the various facets of software renovation - such as visualization,
database analysis, domain knowledge, and so on - an enabling factor
is the analysis and transformation of legacy sources. Since such
source code analysis has much in common with compilation (in which
sources are analyzed with the purpose of translating them into
assembly code), many results from the area of programming language
technology could be reused. Of great significance for software
renovation are, for example, lexical source code analysis, parsing,
dataflow analysis, and type inference.
Program Transformations
Software renovation at the source code level includes automated
program transformations for the purpose of step-by-step code improvement.
In this project, we successfully applied transformations to COBOL
programs, dealing with goto elimination, dialect migration (between
COBOL-85 and COBOL-74) and modifications in the conventions for
calling library utilities.
To make this possible, we developed a COBOL grammar, instantiated
the ASF+SDF Meta-Environment with this grammar to obtain a COBOL
parser and pretty printer, and designed term rewriting rules describing
the desired transformations. The resulting system is capable of
automatically performing the desired transformations on hundreds
of thousands of lines of code, yielding a fully automatic transformation
factory.
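To give a feel for what a rewriting-based transformation looks like, here
is a minimal sketch in Python rather than ASF+SDF. It is not the project's
actual rule set. Programs are modelled as nested tuples instead of real
COBOL parse trees, and the two rules (renaming a library utility call and
removing a double negation) as well as the names OLDUTIL, NEWUTIL, REC-A,
REC-B and STATUS are hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple, Union

# A term is either a leaf (string) or a node: a tuple whose first
# element is the node label. This toy model is an assumption; it
# stands in for a real COBOL parse tree.
Term = Union[str, Tuple["Term", ...]]
Rule = Callable[[Term], Optional[Term]]


def rename_utility_call(term: Term) -> Optional[Term]:
    """Hypothetical rule: call OLDUTIL args  ->  call NEWUTIL args."""
    if (isinstance(term, tuple) and len(term) >= 2
            and term[0] == "call" and term[1] == "OLDUTIL"):
        return ("call", "NEWUTIL") + term[2:]
    return None  # rule does not apply


def drop_double_negation(term: Term) -> Optional[Term]:
    """Hypothetical rule: not(not(x))  ->  x."""
    if isinstance(term, tuple) and len(term) == 2 and term[0] == "not":
        inner = term[1]
        if isinstance(inner, tuple) and len(inner) == 2 and inner[0] == "not":
            return inner[1]
    return None


def rewrite(term: Term, rules: List[Rule]) -> Term:
    """Innermost rewriting: normalize the children, then the node itself."""
    if isinstance(term, tuple):
        term = tuple(rewrite(child, rules) for child in term)
    for rule in rules:
        result = rule(term)
        if result is not None:
            # The result may enable further rules, so normalize it again.
            return rewrite(result, rules)
    return term


program = (
    "seq",
    ("if", ("not", ("not", "EOF")), ("call", "OLDUTIL", "REC-A")),
    ("call", "OLDUTIL", "REC-B", "STATUS"),
)
transformed = rewrite(program, [rename_utility_call, drop_double_negation])
```

Each rule inspects a single node and either returns a replacement or
declines, so new transformations can be added without touching the
traversal. In a real setting the tuples would come from a parser and the
result would be passed to a pretty printer.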
Object Identification
At a higher level of abstraction, software renovation includes
the migration of legacy code to architectures better capable of
meeting today's requirements. A typical example is the migration
of procedural COBOL code to object technology. This is a process
that cannot be fully automated. Instead, renovation tools will
have to focus on helping the human reengineer understand
the legacy system.
Thus, such renovation tools will have to extract as much meaningful
information as possible from a legacy system, showing it to the
human reengineer in a concise and usable manner. For the purpose
of object identification, this information consists of the business
data items (candidate object attributes), the programs or procedures
performing the key tasks (candidate methods), and an overview of
the combined use of these (candidate classes). We were able to
develop heuristic techniques based on cluster and concept analysis
to extract such class proposals automatically.
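The concept analysis idea can be sketched in a few lines of Python. This
is only an illustration under simplifying assumptions, not the heuristics
developed in the project. The input is a table recording which procedure
uses which data item. A concept is a maximal group of procedures together
with all data items they have in common, and each concept with a non-empty
group and a non-empty item set is reported as a candidate class. The
procedure and data item names are invented.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (procedures, data items)

# Hypothetical usage table: procedure -> data items it reads or writes.
usage: Dict[str, Set[str]] = {
    "OPEN-ACCOUNT": {"ACCT-NO", "ACCT-BALANCE"},
    "DEPOSIT": {"ACCT-NO", "ACCT-BALANCE"},
    "ADD-CUSTOMER": {"CUST-NAME", "CUST-ADDRESS"},
    "MOVE-CUSTOMER": {"CUST-ADDRESS"},
}


def concepts(table: Dict[str, Set[str]]) -> List[Concept]:
    """Enumerate all formal concepts of a small procedure/item table."""
    all_items = frozenset().union(*table.values())
    # The item sets of concepts are exactly the intersections of the
    # procedures' item sets, plus the set of all items.
    item_sets = {all_items}
    changed = True
    while changed:
        changed = False
        for used in table.values():
            for known in list(item_sets):
                candidate = known & frozenset(used)
                if candidate not in item_sets:
                    item_sets.add(candidate)
                    changed = True
    result = []
    for items in item_sets:
        # All procedures that use every item in this set.
        procs = frozenset(p for p, used in table.items() if items <= used)
        result.append((procs, items))
    return result


def class_candidates(table: Dict[str, Set[str]]) -> List[Concept]:
    """Keep concepts that have both methods and attributes."""
    return [(procs, items) for procs, items in concepts(table)
            if procs and items]


candidates = class_candidates(usage)
for procs, items in sorted(candidates, key=lambda c: sorted(c[1])):
    print(sorted(items), "<-", sorted(procs))
```

On this toy table the sketch proposes an account-like class (ACCT-NO and
ACCT-BALANCE with OPEN-ACCOUNT and DEPOSIT) and two overlapping
customer-like groupings. The overlap shows why the output is a proposal
for the human reengineer to judge rather than a final design.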
Conclusion
The overall result of the project is a generic renovation architecture
aimed at program transformation and system understanding. A specific
instantiation for COBOL has been developed, which has been applied
to various real-life case studies.
Pointers to publications with more information can be found at:
http://www.cwi.nl/~arie/resolver/
Please contact:
Arie van Deursen - CWI
Tel: +31 20 592 4075
E-mail: arie@cwi.nl