Performance Prediction Studies and Adaptive Refinements for Local Area
Weather Models
by Kläre Cassirer, Reinhold Hess, Wolfgang Joppich and Hermann
Mierendorff
The new generation of numerical weather prediction models is designed
for parallel machines. Since the development of such a code can take several
years, the increase in computer power during this time has to be considered
in the design. Beyond growing computational power, research on new and
faster algorithms also aims to predict the weather more accurately and
reliably in the future. Current activities within the theme cluster METEO
of the GMD Institute for Algorithms and Scientific Computing concentrate on
performance prediction studies for meteorological models and algorithmic
research on local adaptive refinements.
Numerical weather forecasting and climate prediction require enormous
computing power to achieve reliable results. Six of the twenty-six most
powerful computer systems in the world are dedicated to weather research
(November 1997). All of them are parallel architectures with the number
of nodes ranging between 64 and 840.
To exploit this computing power, existing codes have to be adapted to
parallel computers, and new parallel algorithms have to be developed.
After the year 2001 the operational Local Model (LM) of the Deutscher
Wetterdienst (DWD) will run at a horizontal resolution of approximately 3
km mesh size with 50 vertical layers and time steps of only 10 seconds.
Since a one-day prediction on about 800x800x50 grid points has to run within
half an hour of real time in operational mode, enormous computing power
is demanded.
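
The size of this task can be made concrete with a little arithmetic. The
following sketch uses only the figures quoted above (grid size, time step,
forecast length and time limit); the variable names are illustrative.

```python
# Back-of-the-envelope size of the operational LM forecast described above.
nx, ny, nz = 800, 800, 50        # grid points (from the text)
dt_seconds = 10                  # model time step (from the text)
forecast_seconds = 24 * 3600     # one-day prediction
wallclock_seconds = 30 * 60      # half an hour of real time

grid_points = nx * ny * nz
time_steps = forecast_seconds // dt_seconds
point_updates = grid_points * time_steps
updates_per_second = point_updates / wallclock_seconds
seconds_per_step = wallclock_seconds / time_steps

print(grid_points)         # 32000000
print(time_steps)          # 8640
print(point_updates)       # 276480000000
print(updates_per_second)  # 153600000.0
```

Each of the 8640 time steps over 32 million grid points therefore has to
complete in roughly a fifth of a second of real time.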
Currently no available machine in the world can fulfill these computational
requirements, and direct run-time measurements for the LM with the operational
resolution are not possible. Therefore, a study was initiated to predict
the performance of the LM and to define specifications of adequate parallel
systems.
The computational complexity was modelled with a series of LM runs on
an IBM SP2 with up to 32 nodes, using different spatial and temporal
resolutions and different numbers of processors. The coefficients of the
approximation functions were determined by a least-squares fit. With
calibration runs of the LM at lower resolution, the model could be scaled
to various existing machines. These run-time measurements are presented in
Figure 1. Note that the different parts of the LM (computation and physics)
scale differently; therefore a separate model was set up for each part.
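
The fitting step can be illustrated with a minimal sketch. It assumes a
deliberately simple model, time per step = a * (grid points per processor)
+ b, which is not necessarily the form of the approximation functions used
in the study. The "measurements" below are synthetic, generated from
made-up coefficients, and are not data from the LM runs.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ a*x + b using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical calibration runs: (grid points, processors).
runs = [
    (100 * 100 * 20, 4),
    (100 * 100 * 20, 8),
    (200 * 200 * 20, 8),
    (200 * 200 * 20, 16),
    (200 * 200 * 20, 32),
]

# Synthetic timings in seconds per step (NOT measured values).
TRUE_A, TRUE_B = 2.0e-5, 0.05
work = [points / procs for points, procs in runs]
times = [TRUE_A * w + TRUE_B for w in work]

a, b = fit_line(work, times)

def predict_step_time(grid_points, processors):
    """Extrapolate the fitted model to an unmeasured configuration."""
    return a * grid_points / processors + b

# Extrapolation to the operational grid on a hypothetical 1024-node machine.
print(predict_step_time(800 * 800 * 50, 1024))
```

In practice one such fit would be made per program part, since the parts
scale differently.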
The communication requirements were determined by a structural analysis
of the LM. The communications, that is, the number and sizes of the
messages to be exchanged, were parameterized by the size of the global grid
and its partitioning. For existing machines the communication model is
based on communication benchmarks. However, standard benchmarks such as
ping-pong measurements were not fully sufficient, and a special
communication benchmark measuring the specific data exchange of the LM had
to be implemented. Hypothetical architectures can be modelled by assumed
characteristics for computation and communication, such as sustained
flop rate, latency and bandwidth.
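
A common way to express such a communication model is time = latency +
message size / bandwidth, with message sizes derived from the grid
partitioning. The sketch below assumes a two-dimensional partitioning with
a boundary (halo) exchange to four neighbours; the halo width, the number
of fields, the latency value and the assumption that messages are sent one
after another are all illustrative and not taken from the study.

```python
def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple linear cost model for one message."""
    return latency_s + n_bytes / bandwidth_bytes_per_s

def halo_exchange_time(nx, ny, nz, px, py, latency_s, bandwidth_bytes_per_s,
                       halo_width=2, n_fields=1, bytes_per_value=8):
    """Estimated time for one boundary exchange of an interior subdomain.

    The global nx x ny x nz grid is split into px x py subdomains; each
    interior subdomain sends one message to each of its four neighbours.
    """
    local_nx = -(-nx // px)  # ceiling division
    local_ny = -(-ny // py)
    per_point = nz * halo_width * n_fields * bytes_per_value
    east_west_bytes = local_ny * per_point
    north_south_bytes = local_nx * per_point
    # Assumption: the four messages are not overlapped with each other.
    return (2 * message_time(east_west_bytes, latency_s, bandwidth_bytes_per_s)
            + 2 * message_time(north_south_bytes, latency_s, bandwidth_bytes_per_s))

# Hypothetical machine: 32 x 32 = 1024 nodes, 250 MB/s, 20 microseconds latency.
t = halo_exchange_time(800, 800, 50, 32, 32,
                       latency_s=20e-6, bandwidth_bytes_per_s=250e6)
print(t)
```

With these assumed values each message carries 20000 bytes, so bandwidth
rather than latency dominates the estimate.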
The study shows that the predicted computational demands are tremendous.
Indeed, no currently available machine meets the requirements mentioned
above. A hypothetical parallel computer would require at least 1024
nodes with 8-10 GFlops each and a network with about 250 MB/s bandwidth.
However, it can be assumed that machines of this class will be available
on the market in the year 2001.
These requirements for an operational system make it obvious that
algorithmic improvement of meteorological models is very important.
Since new numerical models also have to run on parallel systems, parallelism
becomes a very important factor for modern algorithms, besides numerical
efficiency.
A promising idea is to apply dynamically adaptive local refinements
to numerical weather simulations. The computational costs could be
substantially reduced if high resolution is provided only where it is
necessary (e.g. weather fronts, strong low pressure areas). Calm regions
could be calculated on a coarser mesh. However, since the weather situation
changes during the simulation, the refinement areas have to be adapted
over time.
For the Shallow Water Equations, which form the dynamical core of numerical
weather prediction, a parallel local model with dynamically adaptive local
refinements has been developed and implemented. On a structured global
grid, refinement areas are composed of adjacent rectangular patches, which
are aligned to the global grid with a refinement ratio of 1:2. Using a
suitable mathematical criterion, the refinement areas are dynamically
adapted to the calculated solution.
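
The patch selection can be sketched as follows. The refinement criterion
is not specified here, so the sketch assumes a simple one: a patch is
refined when the height field differs between neighbouring cells by more
than a threshold. The patch size, threshold and test field are
hypothetical.

```python
def flag_patches(h, patch, threshold):
    """Return the (row, column) indices of patches that need refinement.

    h is a 2D list of heights on the global grid; patches are patch x patch
    blocks of cells aligned to that grid.
    """
    ny, nx = len(h), len(h[0])
    flagged = set()
    for j in range(ny):
        for i in range(nx):
            # Largest difference to the east and north neighbours.
            d = 0.0
            if i + 1 < nx:
                d = max(d, abs(h[j][i + 1] - h[j][i]))
            if j + 1 < ny:
                d = max(d, abs(h[j + 1][i] - h[j][i]))
            if d > threshold:
                flagged.add((j // patch, i // patch))
    return flagged

def fine_cell_count(flagged, patch, ratio=2):
    """Number of fine cells when flagged patches are refined by 1:ratio."""
    return len(flagged) * (patch * ratio) ** 2

# Test field on an 8 x 8 grid: a sharp step (a "front") between columns 4 and 5.
field = [[1.0 if i < 5 else 2.0 for i in range(8)] for j in range(8)]

flagged = flag_patches(field, patch=4, threshold=0.5)
print(sorted(flagged))              # [(0, 1), (1, 1)]
print(fine_cell_count(flagged, 4))  # 128
```

Refining the whole 8 x 8 grid at ratio 1:2 would need 256 fine cells; here
only the two adjacent patches along the front are refined. Re-running the
flagging after each time step, or every few steps, adapts the refinement
area to the moving solution.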
Major problems for parallel computers with distributed memory are the
organization of data and the dynamic load distribution. In this approach,
asynchronous, non-blocking communication is used to combine adaptivity
and parallelism as effectively as possible.
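
The load distribution problem can be illustrated with a small sketch. It
uses a common greedy heuristic (assign the most expensive patch to the
least loaded processor), which is an assumption for illustration and not
necessarily the strategy of the implementation described above. The patch
names and costs are hypothetical, and communication is not modelled.

```python
import heapq

def distribute(patch_costs, n_procs):
    """Greedily assign patches to processors, most expensive patch first.

    patch_costs maps a patch id to its cost (e.g. number of grid cells).
    Returns (owner, loads): the processor of each patch and the total
    cost per processor.
    """
    heap = [(0, proc) for proc in range(n_procs)]  # (current load, processor)
    heapq.heapify(heap)
    owner = {}
    ordered = sorted(patch_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for patch_id, cost in ordered:
        load, proc = heapq.heappop(heap)  # least loaded processor
        owner[patch_id] = proc
        heapq.heappush(heap, (load + cost, proc))
    loads = [0] * n_procs
    for patch_id, proc in owner.items():
        loads[proc] += patch_costs[patch_id]
    return owner, loads

# Hypothetical patches after a refinement step, with cell counts as costs.
costs = {"A": 64, "B": 64, "C": 32, "D": 32, "E": 16, "F": 16}
owner, loads = distribute(costs, n_procs=2)
print(loads)  # [112, 112]
```

Because the refinement areas change during the simulation, such a
distribution has to be recomputed whenever patches are added or removed,
and the affected patch data moved between processors.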
Please contact:
Reinhold Hess - GMD
Tel: +49 2241 14 2331
E-mail: Reinhold.Hess@gmd.de