Dynamic Translator Development: Modelica in the Python TRAP

PROGRAMMING LANGUAGE TECHNOLOGIES
ERCIM News No.36 - January 1999

by Thilo Ernst

Modelica is the new unified, object-oriented description language for dynamical models of physical systems, developed in an international effort in which GMD co-operates. GMD is developing a Modelica translator for integration in the Smile dynamic simulation environment. The Python language, which is used as an integration platform in Smile, together with associated tool components, has also proved to be a powerful basis for translator development. Through its combination of a very high level of abstraction, interpreted execution, and ease of extensibility, Python enables a new development methodology for language processors as well; for generic simulation environments, it provides a unique foundation for R&D in dynamic model evolution.

Modelica (http://www.modelica.org) is a unified language for dynamic models of complex physical systems. It is being developed in an international effort (formally, a combined EUROSIM Technical Committee/SCS Technical Chapter) in which GMD participates (see also ERCIM News Number 32). Smile (http://www.first.gmd.de/smile) is an object-oriented dynamic simulation environment developed by Technische Universität Berlin and GMD. In its latest revision, it builds heavily on Python as an integration platform for both external and internal software components. GMD is developing a prototypical Modelica compiler component for integration into Smile. The Modelica processor is intended to be open, extensible, and reusable.

Python (http://www.python.org) is an interpreted, object-oriented language often referred to as a ‘scripting’, ‘extension’ or ‘glue’ language, as it is well-suited and popular as a framework for integrating software components across diverse implementation languages and programming paradigms. Beyond that, however, Python is a full-fledged, platform-independent, modern programming language. It offers powerful features such as classes, modules, exceptions, dynamic typing and very high level collection data types in a concise, regular and very readable syntax. The language’s high level of abstraction and (byte-code-)interpreted execution provide outstanding development efficiency. Performance bottlenecks can easily be identified and effectively attacked by ‘extensions’, ie optimised low-level (C/C++) re-implementations of the (typically small) code parts in question. This enables a very efficient rapid prototyping/rapid application development (RP/RAD) methodology. The Python language and software package are free and not subject to legal restrictions that would hamper any kind of application. Python is mature and reliable, and (thanks to a large and active user community) a huge collection of library components written in or interfaced with Python exists, for the most part free like Python itself.

Using Python for translator implementation was a rather obvious idea in the project setting described above, as a Python interface between the Modelica translator and the Smile-internal equation system data structure was planned anyway. This option was therefore evaluated in more detail. It turned out that the language has a set of features that can be exploited very effectively to approach common data structure and algorithm patterns found in compilers. For instance:

Python offers sequence types (lists and tuples) and their standard manipulation methods as built-ins. List manipulation is sufficient to implement algorithms based on incremental set manipulations, which are ubiquitous in compilers.
Python’s object model and extension concept make it easy to work with sets or graphs of objects (that represent any semantically relevant information) using a class library, and later on transparently migrate to a more efficient bitset implementation.
Python’s dictionary datatype can be used to represent arbitrary mappings between Python objects. Mappings occur frequently in compilers as well: name space and symbol table both refer to mappings (on different levels of abstraction).
Python’s object model can be used effectively for object-oriented compiler techniques, eg representing abstract syntax tree (AST) node sorts by a class hierarchy in which standard functionality (eg tree traversal according to the visitor design pattern) is packaged. Python is not statically typed, but has a dynamic type system which can easily be used to enforce compiler-specific constraints, eg local well-formedness constraints for AST nodes can be checked in the node constructors.
Python, by integrating concepts from both worlds in one language, also allows the user to choose the right mixture of functional and imperative programming adapted to the task at hand.
Python uses a reference-counting based automatic memory management scheme hidden from the user: objects can simply be created without caring about the memory allocation that happens automatically, and they are silently reclaimed as soon as they are no longer referenced.
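
The patterns listed above can be illustrated with a minimal sketch. It is not taken from the Modelica translator: the node classes (Node, IntConst, Name, BinOp) and the two visitors are hypothetical names chosen for this example. It shows a class hierarchy for AST node sorts, well-formedness checks in the constructors, visitor-style traversal, a dictionary used as a symbol table, and a list used as an incrementally growing set.

```python
class Node:
    """Base class for AST node sorts; packages the generic visitor dispatch."""

    def accept(self, visitor):
        # Look up visit_<ClassName> on the visitor, e.g. visit_BinOp.
        return getattr(visitor, "visit_" + type(self).__name__)(self)


class IntConst(Node):
    def __init__(self, value):
        # Local well-formedness constraint, checked at construction time.
        if not isinstance(value, int):
            raise TypeError("IntConst expects an int, got %r" % (value,))
        self.value = value


class Name(Node):
    def __init__(self, ident):
        if not isinstance(ident, str):
            raise TypeError("Name expects a str, got %r" % (ident,))
        self.ident = ident


class BinOp(Node):
    def __init__(self, op, left, right):
        if op not in ("+", "*"):
            raise ValueError("unsupported operator %r" % (op,))
        if not (isinstance(left, Node) and isinstance(right, Node)):
            raise TypeError("BinOp operands must be AST nodes")
        self.op, self.left, self.right = op, left, right


class Evaluator:
    """Visitor that evaluates a tree against a dictionary symbol table."""

    def __init__(self, symbols):
        self.symbols = symbols  # mapping: identifier -> value

    def visit_IntConst(self, node):
        return node.value

    def visit_Name(self, node):
        return self.symbols[node.ident]

    def visit_BinOp(self, node):
        left = node.left.accept(self)
        right = node.right.accept(self)
        return left + right if node.op == "+" else left * right


class NameCollector:
    """Visitor that grows a list incrementally, using it as a set."""

    def __init__(self):
        self.names = []

    def visit_IntConst(self, node):
        pass

    def visit_Name(self, node):
        if node.ident not in self.names:  # keep entries unique
            self.names.append(node.ident)

    def visit_BinOp(self, node):
        node.left.accept(self)
        node.right.accept(self)


# The expression 2 + x * 3, with x bound to 4 in the symbol table.
tree = BinOp("+", IntConst(2), BinOp("*", Name("x"), IntConst(3)))
result = tree.accept(Evaluator({"x": 4}))

collector = NameCollector()
tree.accept(collector)
```

Because the list in NameCollector is only reached through the visitor, it could later be replaced by a more efficient set representation without touching the node classes.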

The transformation phase of a translator is where these advantages can be best exploited; by then, lexical and syntactical analysis of the source text must already have been completed. Fortunately, front-end tool components (scanner and parser generators) were already available as Python modules, so only a thin layer of tooling needed to be added to obtain a small but sufficiently powerful Python-based development environment, called TRAP (Translator RApid Prototyping), for building a Modelica translator. TRAP takes a compiler description consisting of an EBNF-style grammar specification (enriched with semantic actions expressed by pieces of Python code attached to the production rules) and a concise hierarchical description of AST node sorts. In addition, type constraints for non-terminals and fields can be specified. From this, a (Python) compiler frame module is automatically generated, providing scanner, parser, the set of node class definitions with standard method instrumentation (printing, dynamic typecheck, traversal, pattern matching) and some auxiliary code. That way, source code of the language to be processed is easily converted into Python data structures; subsequent transformations are implemented directly in Python. The figure presents sample snippets from a compiler description.
Example constructs of the TRAP description.
compiler SimpleMod

# comment syntax of language processed
comment r'//.*$'
comment r'/\*.*\*/'

# lexical tokens
token IDENT '[A-Za-z][A-Za-z0-9_]*' # default semantics: matched text

token INT_CONST '[0-9]+':
    string.atoi(str) # explicit semantics: convert to IntType

# grammar: nonterminals with rules & semantics
nterm primary
    # default: pass through constituent's semantics
    <- INT_CONST
    <- "time"
    <- "false"
    <- "true"
    <- component_reference
    <- "(" expression=E ")":
        E # explicitly pass through semantics of E

nterm name::[] # type constraint: Python list
    <- ["." IDENT+] # non-optional repetition with separator "."
    # automatic semantics: list of strings

nterm class_definition::Mclass # type constraint: a node class
    <- class_key=C IDENT=i1 Mcomment=K (component*)=T "end" IDENT=i2 ";":
        if i1 != i2: # a simple semantics check
            ERR("Name mismatch:", i1, "/", i2)
        Mclass(C, i1, K, T) # construct AST node as semantics value

# abstract syntax: a node type definition
Mclass (
    key, # default field type: StringType
    name,
    Mcomment,
    components::(component) # type: Python tuple of 'component' nodes
)
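
To give an impression of what the Mclass description and the class_definition rule might correspond to in plain Python, here is a hand-written sketch. It is not the code TRAP generates: the Component placeholder class, the children method, the errors list and the class_definition function are assumptions made for this illustration. Only the field names, the tuple constraint on components and the name mismatch check come from the figure.

```python
class Component:
    """Hypothetical placeholder for the 'component' node sort."""

    def __init__(self, name):
        self.name = name


class Mclass:
    """Hand-written stand-in for a node class derived from the description."""

    def __init__(self, key, name, Mcomment, components):
        # Fields without an explicit constraint default to strings.
        for field, value in (("key", key), ("name", name)):
            if not isinstance(value, str):
                raise TypeError("%s must be a string, got %r" % (field, value))
        # components::(component) means a tuple of 'component' nodes.
        if not isinstance(components, tuple):
            raise TypeError("components must be a tuple")
        for item in components:
            if not isinstance(item, Component):
                raise TypeError("unexpected component %r" % (item,))
        self.key = key
        self.name = name
        # The type of Mcomment is not spelled out in the figure,
        # so this sketch stores it unchecked.
        self.Mcomment = Mcomment
        self.components = components

    def children(self):
        """Child nodes, as a generic traversal would need them."""
        return self.components


def class_definition(C, i1, K, T, i2, errors):
    """Semantic action of the class_definition rule as an ordinary function."""
    if i1 != i2:  # a simple semantics check, as in the figure
        errors.append("Name mismatch: %s / %s" % (i1, i2))
    # The parser is assumed to deliver T as a list; the field wants a tuple.
    return Mclass(C, i1, K, tuple(T))


errors = []
node = class_definition("model", "Tank", "", [Component("level")], "Tank", errors)
bad = class_definition("model", "Tank", "", [], "Pump", errors)
```

As in the figure, a name mismatch is reported but the node is still constructed, so later phases can continue and collect further errors.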

Modelica can be categorised as a special-purpose language of medium complexity. TRAP should, of course, be generally useful for building translators for such languages. Indeed, it has already been used for bootstrapping itself. However, the emphasis here is not on building yet another compiler tool - TRAP mainly integrates relevant concepts from existing toolkits such as Cocktail, Gentle, and PCCTS. The important point is that this integration was done in the open, dynamic framework provided by Python.

In the context of processing Modelica, expressing semantics transformations in Python is more than merely convenient - this approach indeed provides a unique foundation for further R&D in generic modelling and simulation. Currently, most generic simulation systems have a rigid separation of compilation and simulation phases. However, for certain application classes, it would be desirable to change the structure and/or details of the model being worked on during simulation. With a Python-based, dynamic Modelica translator, (re-)doing semantically complex model transformations at simulation time poses no technical problems. A fully dynamic modelling and simulation system architecture in which concepts like dynamic model evolution can be conveniently investigated is an important area of future work.
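
The idea of re-translating a model while the simulation is running can be sketched in a few lines. This is a toy under stated assumptions, not Smile or the Modelica translator: the functions translate and simulate, the switch parameter and the explicit Euler scheme are all hypothetical, and Python's built-in compile and eval merely stand in for a real translation step.

```python
def translate(rhs_source):
    """Turn the right-hand side of dx/dt = f(t, x) into a Python function.

    compile/eval stands in for a real translator here (assumption).
    """
    code = compile(rhs_source, "<model>", "eval")

    def derivative(t, x):
        # Evaluate the compiled expression with only t and x visible.
        return eval(code, {"__builtins__": {}}, {"t": t, "x": x})

    return derivative


def simulate(rhs_source, x0, t_end, dt, switch=None):
    """Explicit Euler loop; `switch` maps a step index to new model source."""
    switch = switch or {}
    f = translate(rhs_source)
    x = x0
    steps = int(round(t_end / dt))
    for step in range(steps):
        if step in switch:
            # The model changes mid-run: translate again, then carry on.
            f = translate(switch[step])
        x = x + dt * f(step * dt, x)
    return x


# Constant growth for the whole run, then a run that flips sign at step 1.
steady = simulate("1.0", 0.0, 1.0, 0.5)
switched = simulate("1.0", 0.0, 1.0, 0.5, switch={1: "-1.0"})
```

Since translation is an ordinary function call in the same interpreter as the simulation loop, there is no separate compilation phase that would have to be re-entered when the model changes.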
Please contact:
Thilo Ernst - GMD
Tel: +49 30 6392 1919
E-mail: Thilo.Ernst@gmd.de
