TABLE OF CONTENTS
PROJECT SUMMARY
I. Introduction
II. Resources and Background
A. The UCLA AGCM
B. The GFDL Modular Ocean Model (MOM)
C. The UCLA Coupled GCM
D. The UCLA Advanced CATM
1. Atmospheric Photochemistry Model (APM)
2. Polar Stratospheric Cloud Model (PSCM)
3. Ames Tracer Model (ATM)
4. UCLA Chemical/Aerosol Tracer Model (CATM)
E. The LLNL Atmospheric Chemistry Model (LACM)
F. The UCLA Coupled GCM in a Distributed Computer Environment
G. Sequoia 2000
H. Recipe Management for Scientific Programming
III. Proposed Research
A. Computation Challenges
1. Revision and improvement of the parameterization of sub-grid scale processes
2. Parallelization of the calculations in the different components of the ESM
3. Algorithm parallelization and load balancing
4. The ESM and Data Base Management System (DBMS)
5. Development of a new recipe management architecture
B. Earth Science Challenges
1. Seasonal cycle and interannual variability of the atmosphere-ocean system
2. Global Distribution of Greenhouse Gases
3. Ozone Perturbations
IV. Industrial Support to this Project
V. Project Personnel
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH
BUDGET
CURRENT AND PENDING SUPPORT
APPENDIX A – F
SUBCONTRACTS/COLLABORATIONS
Project Summary
We will develop a coupled model of atmospheric and oceanic circulations and
chemical tracers. The coupled model, formulated for parallel computer environments,
will be applied to study problems in climate, climate change, and climate/chemistry
interactions, including the general circulation of the coupled atmosphere/ocean
system, global distributions of greenhouse gases, and global ozone perturbations.
The proposed model development utilizes the following
major components: 1) The UCLA general circulation model of the atmosphere
(AGCM), 2) the GFDL/Princeton University general circulation model of the
ocean (OGCM), 3) the NASA Ames/UCLA chemical/aerosol tracer model (CATM),
and 4) the Lawrence Livermore National Laboratory (LLNL) atmospheric chemistry
model (LACM). Versions of these component models are currently operational,
and preliminary coupling of the components has been carried out.
We propose to assemble a fully coupled model with
a highly modular structure, in which components can be interchanged or added
with a minimum of computational difficulty. The code will be highly optimized,
including local parallelization within each module as well as between modular
components. Finally, the coupled model will also be configured to run in a
distributed (heterogeneous) computer environment linked by high-speed networks.
Concerning the computational challenges, we will focus on issues related to
distributed computation and algorithm optimization. The AGCM, OGCM, CATM and
LACM are all grid-point models, which is an advantage in parallel architectures.
Other numerical problems arise, however. For example, the atmospheric flow
in polar regions is difficult to simulate; we are developing specific solutions
to this and related problems of resolution and numerical stability. The issue
of load balancing among processors may become more critical as the parameterizations
of physical and chemical processes grow in sophistication and the computational
load between model regions becomes more variable.
We also propose to apply the coupled model to study
the climate system, climate change, and climate/chemistry interactions. We
will divide this work into studies of the fundamental coupled dynamics of
the global atmosphere-ocean system, and the global chemistry of the atmosphere.
We will simulate the interannual variability of the coupled atmosphere and
oceans, including the seasonal cycle and transients up to decadal time scales.
We plan to couple the AGCM and CATM to study global ozone depletion, including
the effects of atmospheric aerosols on the chemistry of the upper atmosphere.
These latter simulations will be based on work already completed in studies
of the ozone hole chemistry, which provides a degree of validation of the
coupled model. The operational atmospheric chemistry model at LLNL (LACM)
will be used to develop and test photochemistry and heterogeneous chemistry
algorithms to be used in the coupled AGCM/CATM. The LACM will also be employed
for tropospheric chemistry simulations using dynamical fields from the coupled
AGCM/OGCM.
The model development and applications proposed here
will lead to important algorithm development for future large-scale Earth
System Science simulation, and will provide unique scientific insights into
coupled dynamical/chemical/microphysical processes that contribute to global
climate change and ozone depletion.
I. Introduction
This proposal responds to the NASA Research Announcement in connection with the
High Performance Computing and Communications (HPCC) Program call for Grand
Challenge science applications involving massively parallel computations.
Computer simulations of the global climate (using general circulation models,
GCMs) and of global chemistry, including ozone depletion, address fundamental
issues that affect the environment and epitomize the challenge of Earth sciences
to computer technology.
GCMs that describe atmospheric and oceanic dynamics
(AGCMs and OGCMs, respectively) play a key role in the study of climate. These
models explicitly solve the equations governing fluid motion on a rotating
sphere, including parameterizations of physical processes at sub-grid scales
(e.g., cloud convection, turbulent diffusion), and thus can be used to study
nonlinear interactions and feedbacks between different components of atmosphere
and ocean circulations. Moreover, when atmosphere and ocean general circulation
models are coupled, one of the most important interconnections in the climate
system can be studied. Examples of outstanding problems that may be addressed
with coupled GCMs are El Niño-Southern Oscillation (ENSO) events and
the role of the oceans in moderating the greenhouse warming effect of carbon
dioxide and other gases.
Chemical tracer models describe the detailed composition
of the atmosphere, couple chemistry and climate processes, and simulate the
behavior of the ozone layer. An advanced tracer model couples three-dimensional
dynamics, multi-species photochemistry, and multi-component particulate microphysics
in a general way. Such a chemical/aerosol tracer model (CATM) can be driven
by an AGCM to investigate a wide range of problems such as the formation of
the Antarctic "ozone hole" and the impact of volcanic eruptions
on the ozone layer and on climate. A CATM connected to a coupled AGCM-OGCM
can be used to analyze the chemistry of the marine atmosphere, including the
marine sulfur cycle, long-range transport and transformation of nitrogen and
other species, and the impacts of marine chemistry and microphysics on the
climate system. The development of efficient and accurate algorithms for chemical
and microphysical processes in multi-dimensional models requires considerable
preliminary analysis with models of lower dimension.
The goal of the research proposed here is to develop
a state-of-the-art coupled Earth Systems Model (ESM) that can be applied to
problems of global climate change and atmospheric chemistry. The ESM will
be constructed using operational versions of the UCLA AGCM, the GFDL OGCM,
and an advanced version of a CATM. The Earth Systems Model will simulate coupled
global atmosphere-ocean processes, including the chemistry of tracers in the
system. Development of the ESM will involve the formulation and application
of new algorithms for use on massively parallel computers.
The ESM will be constructed using three principal
component models: 1) The UCLA AGCM; 2) the GFDL/Princeton University OGCM;
and 3) the NASA Ames/UCLA CATM. Existing research versions of these models
have been developed over a number of years under funding by NASA, NSF and
other federal agencies. In addition, the LLNL atmospheric chemistry model
will be used for algorithm design and testing, and for validating simulations.
By combining the model components listed above, we
will produce a model with the following desirable characteristics:
a) A highly modular structure,
so that modules can be interchanged or added with a minimum of computational
restructuring.
For example, a unified radiative transfer scheme coupling the atmosphere and
oceans, accounting for predicted
clouds and trace gases and aerosol distributions, will be added.
b) Optimized for high-speed
computation, including parallelization within each module as well as between
modular components.
The component codes (UCLA AGCM, GFDL OGCM, and Ames/UCLA CATM) are all based
on finite difference
methods, which are highly adaptable for distribution in massively parallel
computer systems.
c) Tested capability for displaying
and analyzing output in real-time.
d) Collateral implementation
on the Thinking Machines Corporation (TMC) CM-5 in a fashion that allows the
models to be ported
to other MIMD machines, such as the CRAY shared-memory multiprocessors and
emerging massively parallel distributed-memory systems.
e) Designed for distributed
computations in heterogeneous computer environments.
The proposed model development will include revision of physical parameterizations
and computational techniques, and consideration of issues related to distributed
computation and algorithm optimization. For example, grid-point models require
careful treatment of the atmospheric flow in polar regions of the Earth. Current
solutions to this problem, which involve Fourier decomposition and truncation,
are highly non-local. Hence, new algorithms are being designed to solve this
problem for applications in a distributed computer environment. There is also
the issue of load balancing among processors, which will become more critical
as the parameterizations of physical and chemical processes become more sophisticated
and, therefore, the computational burden is distributed more heterogeneously
between model domains.
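The non-locality of Fourier truncation can be illustrated with a minimal sketch. It is not the AGCM's filter: the function name `zonal_truncate`, the cutoff parameter `max_wavenumber`, and the naive discrete Fourier transform are assumptions chosen for clarity. The point it shows is that every Fourier coefficient is a sum over all longitudes on a latitude circle, so a circle split across processors forces each of them to obtain data from all the others.

```python
import cmath
import math


def zonal_truncate(values, max_wavenumber):
    """Remove zonal wavenumbers above max_wavenumber from one latitude circle.

    Hypothetical helper: uses a naive O(n^2) discrete Fourier transform.
    """
    n = len(values)
    # Forward transform: each coefficient is a sum over ALL longitudes,
    # which is what makes the operation non-local.
    coeffs = [
        sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]
    # Truncation: index k and index n - k both represent wavenumber min(k, n - k).
    kept = [
        c if min(k, n - k) <= max_wavenumber else 0.0
        for k, c in enumerate(coeffs)
    ]
    # Inverse transform back to grid-point values.
    return [
        (sum(kept[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n).real
        for j in range(n)
    ]
```

Applied to a circle carrying a wavenumber-1 signal plus a wavenumber-6 perturbation, a call with `max_wavenumber=2` returns the wavenumber-1 part only.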
The ESM resulting from this work will be applied to
study current problems in climate, climate change, and climate/chemistry interactions,
including the general circulation of the coupled atmosphere/ocean system,
and the global distributions of greenhouse gases. We also propose to build
into the ESM the capability for simulating global atmospheric chemistry, including
tropospheric and stratospheric photochemistry and heterogeneous chemical processes.
Such a model could be applied to a diverse set of problems, including the
depletion of stratospheric ozone, particularly at high latitudes in both hemispheres
associated with particulates, and the coupling of ozone to atmospheric dynamics
and climate forcing.
In the proposed work, the well-established atmospheric
chemistry model of the Lawrence Livermore National Laboratory (the LACM) will
be employed as a test-bed for numerical algorithms, and as a validating code
for three-dimensional simulations. The LACM has a complete photochemistry
and transport scheme in two dimensions, and its use complements the applications
of the CATM. For example, initial simulations of tropospheric chemical cycles
can be carried out using the LACM with appropriately averaged dynamical fields.
II. Resources and Background
The proposed research is built on a foundation of existing sophisticated numerical
model components and extensive research and applications in the area of large-scale
computing, including parallel processing elements. In this section, we discuss
the computational tools and research projects of the investigators that will
support and enhance the proposed work. Our resources include well-established
models of the atmosphere, oceans and chemical/aerosol tracers, experience
with parallel architectures and algorithms, model applications on massively
parallel machines, and broad expertise in the necessary scientific disciplines.
The principal model components and other computational assets consolidated
under this proposal are described below.
The essential ingredients for this project comprise
the long-term research tools developed by the participants under funding from
several sources. These assets and activities include:
a) Versions of a Coupled GCM suitable
for vector computer architectures are being used to study the dynamics and
predictability
of the coupled atmosphere-ocean system under funding from DOE, NASA, NOAA,
NSF, and ONR.
b) An effort to revise and parallelize
the UCLA AGCM code for massively parallel computer environments is being carried
out with support
from DOE (CHAMMP).
c) A study on distribution of the Coupled
GCM in both homogeneous and heterogeneous computer environments is being performed
under funding from NSF and DARPA through the Corporation for National Research
Initiatives Gigabit Network Project/CASA
Testbed.
d) The physics and chemistry algorithms
in an advanced CATM for chemical tracer transport and transformation are under
continuing development
with funding from DOE, EPA, NASA, and NSF.
e) The model codes to be used in this
project are running, or will soon be running, on CRAY Y-MP and Intel Touchstone
Delta computers.
f) Members of the Research Team are participating
in the University of California/Digital Equipment Corporation Sequoia 2000
Project, which addresses
major issues in storage, management and visualization for Global Change research.
The key project elements are discussed in greater
detail below.
A. The UCLA AGCM
The atmospheric component of the ESM is the UCLA AGCM. This model has been developed
under the direction of A. Arakawa (Arakawa and Lamb, 1977). The current version
of the AGCM has been used since the early 1980s at UCLA and Colorado State
University (Randall et al., 1985). The distinctive feature of this latter
version is the treatment of the planetary boundary layer (PBL). This is considered
as well-mixed and represented by the model's bottom layer, whose variable
depth is predicted (Suarez et al., 1983). A more detailed description of the
UCLA AGCM is given in Appendix A.
A major effort is being carried out at UCLA –
in close collaboration with computational physicists and computer scientists
at the Lawrence Livermore National Laboratory (LLNL) – to develop, document,
and optimize a new version of the UCLA AGCM designed for execution in massively
parallel computing environments, and to reconfigure it for efficient coupling
with other major components comprising an ESM. The AGCM code under development
incorporates fully three-dimensional data structures. Additionally, the code
is highly modular with identifiable portions of the finite difference algorithm
broken up into their own subroutines. Recent LLNL work in parallel geophysical
fluid dynamics modeling suggested the feasibility of multi-dimensional horizontal
domain decomposition of such a code. To run the AGCM in parallel using the
two-dimensional domain decomposition paradigm, a message-passing shell code
was developed at LLNL. For the purposes of parallelization, the domain is
partitioned in latitude-longitude using a rectangular decomposition, with
each sub-domain corresponding to a process. Variables associated with a given
sub-domain are local to that process, and the various processes communicate
information by sending messages. Variables are allocated dynamically, and
provision is made to dynamically vary the decomposition to achieve load-balance
among the processors. The AGCM presently runs under this message-passing driver
on the 126-processor BBN TC2000 system at LLNL. Because only distributed-memory
constructs are explicitly used in this implementation, and because message-passing
on the BBN is based on a portable communications library, porting to other
message-passing parallel processors is expected to be relatively straightforward.
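The rectangular latitude-longitude decomposition described above can be sketched as follows. This is an illustration only, not the LLNL shell code: the function names, the rank numbering, the south-to-north row ordering, and the 46 x 72 grid used in the usage note are all assumptions. Each process owns one contiguous block and exchanges boundary messages with its neighbors, with longitude treated as periodic.

```python
def split_range(n_points, n_parts):
    """Split range(n_points) into n_parts contiguous, near-equal (start, stop) pairs."""
    base, extra = divmod(n_points, n_parts)
    bounds = []
    start = 0
    for part in range(n_parts):
        # The first `extra` parts take one additional point each.
        stop = start + base + (1 if part < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def decompose(n_lat, n_lon, procs_lat, procs_lon):
    """Map each process rank to its (lat_start, lat_stop, lon_start, lon_stop) block."""
    lat_bounds = split_range(n_lat, procs_lat)
    lon_bounds = split_range(n_lon, procs_lon)
    blocks = {}
    for i, (lat0, lat1) in enumerate(lat_bounds):
        for j, (lon0, lon1) in enumerate(lon_bounds):
            blocks[i * procs_lon + j] = (lat0, lat1, lon0, lon1)
    return blocks


def neighbors(rank, procs_lat, procs_lon):
    """Ranks that exchange boundary messages with `rank` (periodic in longitude)."""
    i, j = divmod(rank, procs_lon)
    result = {
        "west": i * procs_lon + (j - 1) % procs_lon,
        "east": i * procs_lon + (j + 1) % procs_lon,
    }
    # Rows are assumed ordered south to north; polar rows lack one neighbor.
    if i > 0:
        result["south"] = (i - 1) * procs_lon + j
    if i < procs_lat - 1:
        result["north"] = (i + 1) * procs_lon + j
    return result
```

For example, `decompose(46, 72, 2, 3)` assigns six blocks of 23 x 24 points. Changing the tuples returned by `split_range` is one way the decomposition could be varied to rebalance the load.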
B. The GFDL Modular Ocean Model (MOM)
The oceanic component of the ESM is the GFDL Modular Ocean Model (MOM) developed
at Princeton University GFDL (Geophysical Fluid Dynamics Laboratory) by R.
Pacanowski, K. Dixon, and A. Rosati. It is the successor to the code written
by M. D. Cox (Cox, 1984) based on the work by K. Bryan (Bryan, 1969). An outline
of the Bryan-Cox model is given in Appendix B.
Except for the use of namelists, MOM is written in
standard Fortran 77. C-preprocessor directives are used to enable/disable
different options for parameterizations of physical processes (such as turbulent
mixing in the horizontal and the vertical), boundary conditions, numerical
schemes used, and optimizations. The MOM code, therefore, is highly portable
and easy to configure. Its modular design and the use of C-preprocessor directives
make it easy to accommodate alternative physical parameterizations and improvements
as new modules or options.
A parallel version of the Bryan-Cox code has been developed at Los Alamos National
Laboratory (LANL) for the massively parallel CM-2 Connection Machine. This
version incorporates the efforts by Chervin and Semtner (1988) to utilize
multiple processors on CRAY computers. Further improvements made at LANL include
solving for surface pressure instead of the barotropic streamfunction
for efficiency on parallel machine architectures, and a different data structure
that is more suitable for a parallel environment.
C. The UCLA Coupled GCM
The UCLA AGCM and the GFDL OGCM are the components of the UCLA Coupled GCM. The AGCM provides the wind stress, heat and fresh water fluxes to the OGCM, and the OGCM returns sea surface temperature (SST) to the AGCM (Mechoso et al., 1991a). At the initialization stage both models pass their grid systems to coupling routines. Throughout the integration, the models exchange updated boundary conditions through the coupling routines, which perform the interpolations required by the difference in model grids. Currently, the OGCM has a Global version, and a Tropical Pacific version with enhanced resolution in the equatorial region (Philander and Pacanowski, 1980). The Tropical Pacific version of the Coupled GCM produces a realistic simulation of the seasonal cycle (Mechoso et al., 1991a, 1991b). Figure 1 shows the simulated time series of equatorial SST. There is no evidence of significant climate drift, which is a major concern in modeling of the coupled atmosphere-ocean system without flux corrections (Neelin et al., 1991).
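The exchange through the coupling routines can be sketched in one dimension. This is a simplified illustration rather than the coupler itself: the names `interpolate` and `couple_step`, the use of linear interpolation, and the restriction to latitude only are assumptions. A production coupler works on two-dimensional grids and would typically also need to conserve the exchanged fluxes.

```python
from bisect import bisect_right


def interpolate(src_coords, src_values, dst_coords):
    """Linearly interpolate src_values (on ascending src_coords) to dst_coords.

    Points outside the source grid take the nearest end value.
    """
    result = []
    last = len(src_coords) - 1
    for x in dst_coords:
        if x <= src_coords[0]:
            result.append(src_values[0])
        elif x >= src_coords[last]:
            result.append(src_values[last])
        else:
            # src_coords[lo] <= x < src_coords[hi]
            hi = bisect_right(src_coords, x)
            lo = hi - 1
            weight = (x - src_coords[lo]) / (src_coords[hi] - src_coords[lo])
            result.append((1 - weight) * src_values[lo] + weight * src_values[hi])
    return result


def couple_step(agcm_lats, ogcm_lats, wind_stress, sst):
    """One exchange: AGCM flux to the ocean grid, ocean SST to the AGCM grid."""
    stress_on_ocean = interpolate(agcm_lats, wind_stress, ogcm_lats)
    sst_on_atmos = interpolate(ogcm_lats, sst, agcm_lats)
    return stress_on_ocean, sst_on_atmos
```

In this sketch the grid coordinates are passed on every call. In the scheme described above, the models hand their grids to the coupling routines once, at initialization.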
D. The UCLA Advanced CATM
An advanced version of a chemical/aerosol tracer model (CATM), originally developed at the NASA Ames Research Center in collaboration with UCLA scientists, is currently being tested and validated at UCLA. The model consists of several components that can be linked together in various combinations to study the transport and transformations of chemically active gas and aerosol tracers in the atmosphere or oceans. The key components of the CATM are described below.
1. Atmospheric Photochemistry Model (APM)
An accurate and efficient photochemical model has been developed at UCLA for
applications in multidimensional models such as the CATM (Elliott et al.,
1991a, 1992). The solution scheme utilizes a well-developed "family"
technique (Turco and Whitten, 1974, 1977, 1978) with a new implicit projection
scheme for species concentrations. The technique proves to be quite accurate
with a one hour time step when compared to highly-accurate fine-time-step
solutions. The APM has been tested under the wide range of conditions that
would be encountered over a three-dimensional global grid (e.g., Cicerone
et al., 1991).
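A short sketch shows why an implicit update permits a one-hour time step for stiff chemistry. It uses plain backward Euler on a single species obeying dn/dt = P - L n, which is a standard textbook scheme and not the family technique or implicit projection scheme of the APM. The function names and the numerical values in the usage note are assumptions.

```python
def implicit_step(n, production, loss_rate, dt):
    """Backward Euler update for dn/dt = production - loss_rate * n.

    Solving n_new = n + dt * (production - loss_rate * n_new) for n_new
    gives a result that stays positive and bounded for any dt > 0.
    """
    return (n + production * dt) / (1.0 + loss_rate * dt)


def explicit_step(n, production, loss_rate, dt):
    """Forward Euler update, shown only for comparison.

    It becomes unstable once loss_rate * dt exceeds 2.
    """
    return n + dt * (production - loss_rate * n)
```

With a hypothetical production of 1e6, a loss rate of 1e-2 per second, and a 3600-second step, repeated calls to `implicit_step` relax toward the steady state P/L = 1e8, while `explicit_step` produces a negative concentration on the second step.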
The APM is specifically designed for multi-dimensional
applications, and incorporates a new efficient algorithm to calculate photodissociation
rates (X. Zhao and Turco, 1991, 1992), using simple empirical functions of
the solar-path-integrated column concentrations of O2 and O3 and a wavelength-dependent
albedo factor (e.g., Luther and Gelinas, 1976). For tropospheric photochemical
calculations, the details of surface reflection, Rayleigh scattering, and
aerosol and cloud scattering and absorption are explicitly treated using an
efficient two-stream code (Toon et al., 1989b). The radiative code is already
incorporated into the Ames Tracer Model (ATM) described later, which provides
the framework for integrating transport, chemistry and microphysics algorithms
in the CATM. The APM will serve as one of the photochemical modules for the
CATM, and thus can utilize the powerful radiative treatment in the Ames tracer
model.
2. Polar Stratospheric Cloud Model (PSCM)
The UCLA model for polar stratospheric clouds (PSCs) is based on a series of earlier
models designed to study aerosols in planetary atmospheres (e.g., Turco et
al., 1979a,b; Toon et al., 1979a,b). The PSC physics treated in the model
is described in a number of papers (Hamill et al., 1988; Toon et al., 1989a;
Drdla and Turco, 1991). The resulting PSCM treats a multicomponent aerosol
consisting of sulfate aerosols, nitric acid trihydrate crystals (type-I PSCs;
Toon et al., 1986), and ice particles (type-II PSCs). The type-I PSCs are
nucleated on sulfate aerosols, and type-II
PSCs on type-I particles (Hamill et al., 1990). The subsequent microphysics
is dominated by condensation/evaporation processes and particle sedimentation
(Turco et al., 1989; Toon et al., 1990). The PSCM predicts the detailed behavior
and properties of the clouds, including subtle aspects of simultaneous denitrification/dehydration
processes, which in turn influence the formation of the ozone hole. Model
predictions have been validated against observational data (Drdla and Turco,
1991; Drdla et al., 1992a).
Heterogeneous chemistry is included in the PSCM in the form of measured sticking
coefficients (Drdla et al., 1991, 1992a). The PSCM provides a detailed representation
of the particle surface areas available for chemical reaction, and hence the
heterogeneous chemical reaction rates. The PSCM heterogeneous chemistry model
has recently been coupled to the APM. Trajectory calculations have been carried
out with coupled microphysical, heterogeneous chemical and photochemical processes.
The trajectories were determined by back-tracing from an aircraft sampling
track during the Airborne Arctic Stratosphere Expedition II (AASE-II) mission
in the winter of 1991/92. The simulated chemical transformations corresponded
closely to those measured by the aircraft (Drdla et al., 1991, 1992b).
3. Ames Tracer Model (ATM)
An advanced tracer advection
model has been developed at NASA Ames through an ongoing collaboration with
UCLA and San Jose State University (Hamill et al., 1977; Turco et al., 1979a,b;
Toon et al., 1979a,b). The Ames tracer model (ATM) simulates the distributions
of atmospheric gas and aerosol trace constituents under the influence of fluid
motions in one, two or three dimensions (Turco et al., 1989; Toon et al.,
1988). The principal features of the ATM are: transport algorithms that are
non-diffusive and mass conservative; a complete treatment of aerosol microphysics;
an automated package for photochemical calculations; and an accurate and fast
radiative transfer scheme for photodissociation and atmospheric heating rate
calculations. Details of model physics and numerical techniques are discussed
by Toon et al. (1988, 1989a,b) and Turco et al. (1979a, 1989). The ATM acts
as the framework for assembling the CATM from the chemistry and microphysics
modules discussed earlier.
The ATM tracer transport scheme has been successfully coupled to atmospheric
general circulation models and mesoscale models (Malone et al., 1986; Toon
et al., 1988; Westphal et al., 1988; Kao et al., 1990; Westphal and Toon,
1991; Young et al., 1992). The algorithms for microphysical and chemical processes
have been extensively tested in one-dimensional models to verify their accuracy
and efficiency. The tracer model is unique in that it can be used to couple
three-dimensional dynamics, multi-species photochemistry, and multi-component
particulate microphysics in a general way. The tracer code is highly vectorized
and optimized for coupled chemical/microphysical calculations. The model is
also modular and can accommodate a variety of photochemical and microphysical
packages that are designed to study polar stratospheric clouds, air pollution,
the marine sulfur cycle, volcanic eruption clouds, noctilucent clouds, and
cirrus and stratus clouds.
4. UCLA Chemical/Aerosol Tracer Model (CATM)
In the version of the CATM that has been assembled at UCLA (based on the ATM),
the following processes are treated: gas-phase photochemistry, homogeneous
and heterogeneous binary vapor nucleation, multi-component aerosol coagulation,
aerosol growth and evaporation by vapor transfer, thermodynamic chemical equilibrium
of multiple components in aqueous solutions, including vapors over solid condensates,
chemical transformations of aqueous components in solution, aerosol and gas
dry and wet deposition, atmospheric dynamics including advection by winds
and diffusion by turbulence, boundary layer dynamics, convection, humidity
and energy balance, horizontal and vertical transport of all chemical tracers
and aerosols, and solar and infrared radiation transfer, including spectral
intensities, heating rates, photorates, and visibility.
Gas-phase inorganic and organic chemistry is treated using a new formulation
of a matrix inversion-based family technique (Jacobson et al., 1991, 1992)
that derives from earlier work on the APM. The chemistry module is generalized
to accept any photochemical mechanism. The aerosol algorithms treat any number
of pure aerosols, and two-component aerosols, and one generalized mixed aerosol
with any number of components. The size distribution of each aerosol type
can be divided into an arbitrary number of discrete size-bins, covering any
specific size range. Homogeneous homomolecular and heteromolecular nucleation
processes are explicitly treated using the standard classical theory (Hamill
et al., 1982; J.-X. Zhao and Turco, 1992). Heterogeneous homomolecular and
heteromolecular nucleation are also included (Hamill et al., 1982; J.-X. Zhao
and Turco, 1992). Condensational growth and evaporation rates are calculated
using a numerical scheme that suppresses artificial "numerical diffusion"
across size bins (analogous to numerical diffusion in advection calculations)
(Toon et al., 1988; Turco et al., 1979a,b). Coagulation is solved using the
unique semi-implicit numerical technique of Turco et al. (1979a,b) (see also
Toon et al., 1988). Self-coagulation rates of similar aerosol types and coagulation
rates between different aerosol types are calculated. Sedimentation velocities
are calculated every time step at each model grid point corresponding to the
local environmental conditions, and aerosol properties (i.e., mean density).
Aerosol dry deposition velocities are calculated using an algorithm derived
from the work of Giorgi et al. (1986). All of the numerical algorithms conserve
aerosol number and mass exactly, and are always numerically stable. In addition,
the equilibrium thermodynamics of multicomponent aerosol solutions is determined
in the manner of Pilinis and Seinfeld (1987) using an algorithm due to Villars
(1959) and gas/liquid/ion/solid equilibrium relations determined by the ZSR
method (Robinson and Stokes, 1965) or the MK method (Kusik and Meissner, 1978),
including simultaneous specification of the multicomponent activity coefficients
(Bromley, 1973) through binary activity coefficients (e.g., Pitzer and Mayorga,
1973). Aqueous chemical reactions are determined through a set of coupled
rate equations for the reactant and product species in solution in each aerosol
size-bin.
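The division of an aerosol size distribution into discrete size-bins can be sketched as below. The geometric spacing of bin edges is an assumption made for illustration (the text states only that the number of bins and the size range are arbitrary), and the function names are hypothetical.

```python
from bisect import bisect_right


def make_bin_edges(r_min, r_max, n_bins):
    """Return n_bins + 1 radius edges spaced geometrically from r_min to r_max."""
    ratio = (r_max / r_min) ** (1.0 / n_bins)
    return [r_min * ratio ** i for i in range(n_bins + 1)]


def bin_index(radius, edges):
    """Index of the bin containing radius, or None if it is out of range."""
    if radius < edges[0] or radius >= edges[-1]:
        return None
    # edges[index] <= radius < edges[index + 1]
    return bisect_right(edges, radius) - 1
```

Geometric spacing is a common choice because particle radii span several orders of magnitude. Rates that depend on local conditions, such as a per-bin sedimentation velocity, would then be stored in arrays indexed by the value `bin_index` returns.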
E. The LLNL Atmospheric Chemistry Model (LACM)
The atmospheric chemistry models developed at LLNL have been applied in a
wide range of studies of tropospheric and stratospheric processes, and the
impacts of human activities. Many of these studies have been related to global
ozone perturbations and the effects of chemistry on climate. The ozone work
has included assessments of chlorofluorocarbon release (e.g., Hammitt et al.,
1987; Kinnison et al., 1988; Wuebbles, 1990), solar flux variations (e.g.,
Wuebbles et al., 1991), and aircraft emissions (e.g., Johnston et al., 1989;
Wuebbles and Kinnison, 1990). Other studies have focused on the recent trends
in ozone concentrations (e.g., Reinsel et al., 1987, 1988; DeLuisi et al.,
1989; Wuebbles et al., 1991). Livermore scientists introduced the concept
of Ozone Depletion Potentials (ODPs) (Wuebbles, 1988; Connell and Wuebbles,
1989), which is now used throughout the world in ozone assessments. The Livermore
group has also carried out extensive simulation of the impacts of chemistry
on climate (Wang et al., 1986; Wuebbles and Edmonds, 1988, 1991; Penner et
al., 1990; Wuebbles et al., 1989; Penner, 1990; Lacis et al., 1990). The chemistry
modeling has been extended to three dimensions, including studies of the tropospheric
nitrogen cycle (Atherton and Penner, 1988, 1990; Penner et al., 1991), and
the role of aerosols in climate change (Walton et al., 1988; Ghan et al.,
1988; Kreidenweis et al., 1990; Ghan et al., 1990; Erickson et al., 1990;
Penner and Mulholland, 1990; Ghan and Penner, 1990; Penner, 1990).
The LLNL zonally averaged two-dimensional chemical-radiative-transport model
currently calculates the atmospheric distributions of 54 chemically-active
species in the troposphere and stratosphere. The model domain extends from
pole to pole, and from the surface to 60 km altitude. The vertical resolution
is 1.5 km in the troposphere and 3 km in the stratosphere. The photochemical
package includes all of the relevant processes for the oxygen, nitrogen, hydrogen,
chlorine and bromine systems, as well as methane and products. For photodissociation
calculations, a two-stream radiative transfer code with 126 spectral intervals
is used to compute local irradiances. The LLNL atmospheric chemistry model
has been ported to the MIMD NCUBE and iPSC/860 machines. Hence, parallelization
of the LACM algorithms has been carried out. Moreover, Livermore personnel
are working to port a three-dimensional chemical tracer model to massively
parallel computers under the DOE CHAMMP project.
During the first year of the project, the two-dimensional LACM will be used
as a test bed for various numerical schemes and algorithms that are being
designed at LLNL and UCLA to handle stiff photochemical rate equations accurately
and efficiently. Additional testing of radiative transfer and aerosol microphysics
modules will be conducted. In subsequent years, the applications will expand
to include the generation of initialization data for three-dimensional simulations,
initial sensitivity studies of the ozone depletion and greenhouse gas distribution
problems, and validation runs for the global chemistry model.
F. The UCLA Coupled GCM in a Distributed Computer Environment
Several characteristics of the UCLA Coupled GCM make it both an ideal and
a difficult application for a distributed computer environment. The model is
computationally intensive and generates large amounts of data that must be stored
for analyses. There are both vector and scalar codes in different parts of
the model. Ideally, one would like to have massively parallel processors with
vector capabilities, and large volume, high-speed data archiving systems and
visualization hardware working seamlessly together.
A pilot program that explores the distribution of the Coupled GCM across high-speed
(gigabit per second) networks is underway at UCLA. This study is an integral
part of the Corporation for National Research Initiatives (CNRI) Gigabit Testbed
Initiative, which is funded by NSF and DARPA (see Appendix C). In running
the GCM in a distributed fashion, we have the following objectives:
i) Explore the possibility of superlinear speedup of concurrent computation
through the use of heterogeneous computer architecture.
ii) Enhance graphics capabilities by allowing for remote real-time animation
of model results.
iii) Increase available resources by allowing for the utilization of geographically
separated computers, and guarantee continuous availability of resources
even when a site is temporarily unavailable.
iv) Facilitate closer collaboration among researchers specializing in different
parts of the model, by providing a system in which modules under development
at different institutions can be easily exchanged for the purpose of performance
evaluation.
The principal issues being addressed in this research include hiding network
latency with computation, and the mechanisms for exchanging data among processes.
Depending on the level of parallelism one wishes to achieve, the coupled GCM
can be decomposed at different levels. The first (coarsest) level of decomposition
is based on the difference in tasks each component of the model carries out
(task decomposition). The AGCM and OGCM are two well-separated entities interconnected
by coupling routines. Within the AGCM itself, there are also two relatively
well-defined components. One is the AGCM/Physics, which computes the effect
of subgrid-scale processes on grid-scale motions. The output of the AGCM/Physics
is supplied to the other component – AGCM/Dynamics – as forcing
terms in the primitive equations.
Based on these considerations, we have decomposed the Coupled GCM into three
tasks: AGCM/Physics, AGCM/Dynamics, and OGCM. This decomposition requires
a Master Control Program (MCP) to provide the user interface and supervise
communications between different processes. The large datasets produced by
the coupled GCM require a Dataset Manager to collect model output from different
locations and either send it to Mass Storage Subsystems (MSS) or process
it for real-time visualization. The resulting distributed coupled GCM application
is shown in Fig. 2. So far, the interprocess communication is done on a message-passing
basis, which is carried out by Berkeley sockets or similar utilities provided
by EXPRESS or PVM. With the Coupled GCM decomposed in this manner, the AGCM/Dynamics
and the OGCM can be run concurrently. This is because all the boundary conditions
required by the OGCM are available after AGCM/Physics is completed. In this
way, a substantial reduction in wall-clock time can be achieved when running
the Coupled GCM.
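The wall-clock saving from this task decomposition can be illustrated with
a small sketch. The per-step costs below are hypothetical numbers chosen only
for illustration, not measurements of the Coupled GCM, and the function names
are our own. Communication cost is ignored.

```python
def serial_step_time(physics, dynamics, ocean):
    """Wall-clock time for one coupled step when the three tasks run in sequence."""
    return physics + dynamics + ocean


def concurrent_step_time(physics, dynamics, ocean):
    """Wall-clock time when AGCM/Dynamics and OGCM start together once
    AGCM/Physics has finished (communication cost ignored)."""
    return physics + max(dynamics, ocean)


# Hypothetical per-step costs in seconds (illustrative only).
costs = {"physics": 6.0, "dynamics": 3.0, "ocean": 2.0}
t_serial = serial_step_time(**costs)
t_concurrent = concurrent_step_time(**costs)
saving = t_serial - t_concurrent
```

Under these assumed costs the saving equals the cost of the shorter of the
two concurrent tasks. AGCM/Physics remains on the critical path.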
The decomposition in Fig. 2 allows us to explore the possibility of superlinear
speedup by running different modules in computers with architectures that
are most efficient for each of them. The nature of the dynamics and physics
codes in the AGCM is very different. The dynamics code is highly vectorizable,
while the physics code can be easily distributed because most calculations
are made in atmospheric columns. In particular, we expect that the wall-clock
time required to run the AGCM/Physics will be greatly reduced by using a massively
parallel architecture.
A higher level of decomposition, which enables the overlapping of communication
with computation in a parallel environment consisting of either a single or
multiple computers, is discussed in the Proposed Research section of this
proposal.
G. Sequoia 2000
Achieving the goals of this project will depend not only on improved codes,
but also on improved data systems for manipulation of large-scale model output.
The synergistic interactions between observations and model-based simulations
require massive amounts of diverse information to be stored, organized, accessed,
distributed, visualized, and analyzed. Refinements in computing - specifically
involving storage, networking, file systems, extensible data base management,
and visualization - are needed. The University of California/Digital Equipment
Corporation Sequoia 2000 Project seeks to develop large capacity object servers
and visualization techniques for Global Change research (see Appendix D).
The University of California Berkeley (UCB) and UCLA are principal participants
in Sequoia 2000.
There are considerable shortcomings in current information systems. The SEQUOIA
2000 research project is executing a coordinated attack on these issues:
i) Current storage management system technology is inadequate to store and
access the massive amounts of data required for Earth System Science research.
ii) Current I/O and networking technologies do not support the data transfer
rates required for browsing and visualization of satellite data or output
from models of Earth processes.
iii) Current visualization software is too primitive to allow for useful interactive
viewing of data on scientific workstations.
iv) Current database systems are inadequate to store the diverse types of
information required, such as point-data for specific geographic points, vector,
raster, and text data.
v) It is extremely difficult to share the objects noted above with other interested
researchers.
Sequoia 2000 plans to extend the next-generation data base management system
(DBMS) POSTGRES (Stonebraker et al., 1990) to manage Global Change data
effectively. The project is also addressing visualization issues, and plans to produce
a seamless interface between POSTGRES and a variety of visualization packages.
H. Recipe Management for Scientific Programming
Scientific programming has been traditionally performed by coding directly
in Fortran 77. Recently, several scientific visualization systems have been
developed to try to move scientific programming to a higher, more productive
level. Examples of such programming environments are AVS, built by Stardent
and licensed to many vendors, Explorer, marketed by Silicon Graphics, and
Khoros, from the University of New Mexico. It is expected that GCMs and other
remote sensing applications will be moved to such programming systems to take
advantage of their module sharability and reusability and their built-in output
visualization software. We call such scientific visualization systems recipe
managers because they describe a recipe by which a collection of inputs
read from files can be cooked to produce a desired visualization output.
There are, however, several serious disadvantages of current recipe managers.
First, recipe managers are file-oriented. The only way to get data into a
recipe is to read it from a file. There is no integration between the recipe
manager and a modern DBMS. Second, current recipe managers are main-memory
oriented. The output of each recipe step is passed to the input of its successors
through shared main memory. In extensions to distributed recipe managers,
information flows between recipe steps through the interprocess communication
(IPC) system supported by the operating system of the vendor involved. For
recipes that require a very large amount of data to be passed between steps,
this use of shared memory or IPC will prove inefficient. Third, there is only
a limited notion of time in current recipe managers. Because they support
rendering only a single data set, a user cannot "flicker" between
two data sets. Put differently, there is only a limited notion of animation
in any of the packages. Fourth, there is no notion of version control. In
current systems, when a recipe is modified, the older version is discarded.
There is no notion of a time sequence of versions or the ability to return
to the recipe of a previous time. Named alternatives, popular in source code
control systems such as SCCS and RCS (Tichy, 1982), are similarly missing.
III. Proposed Research
The proposed research can be logically separated into two specific categories.
The first category refers to computational challenges of the proposed research
in the context of the HPCC. These challenges in turn may be subdivided into
tasks focusing on revision and improvement of the parameterizations of sub-grid
scale processes, both from the physical and computational points of view,
the parallelization of calculations within different components of the coupled
Earth Systems Model, and communications between the components in a parallel
computing environment.
The second category of research refers to the scientific challenges to be
addressed with the coupled Earth Systems Model. The three exemplary Earth
science issues to be studied under this proposal are: the seasonal cycle and
interannual variability of the coupled atmosphere/ocean system; the distribution
of greenhouse gases in the atmosphere; and depletion of the stratospheric
ozone layer.
A. Computation Challenges
1. Revision and improvement of the parameterization of sub-grid scale processes
We propose to revise and further develop the AGCM, OGCM, and CTM codes both
from the physical and computational points of view. Our development efforts
will include a version of the Arakawa-Schubert cumulus parameterization that
makes use of a prognostic cumulus kinetic energy. This version permits more
realistic coupling between convective and stratiform clouds, drastically simplifies
the computational algorithm, and significantly improves the computational
speed of the model. Moreover, the parameterization is highly amenable to parallelization,
and is considerably more portable than the original version. Our revisions
will also include an improved planetary boundary layer (PBL) parameterization,
and an improved parameterization of land-surface processes, including a simple
but explicit model of photosynthetic carbon exchange. This work is being performed
under EOS sponsorship (P. Sellers, P.I.; D. Randall, Co-I.).
In addition, we plan to couple the ESM to spatially distributed watershed
models over rugged terrain to simulate the spatial distribution of snowmelt
and snow chemistry processes within the snowpack. These models are driven
by large-scale analyses of the Earth radiation budget calculated from satellite data
(Dozier, 1989; Dozier and Frew, 1990; Shi et al., 1991), and by precipitation,
surface wind stress and sensible heat flux produced by the ESM. The satellite
data sets used by the watershed models are large: for example, a single Landsat
frame (185 x 185 km) is 266 megabytes, and an AVIRIS (Airborne Visible and
Infrared Imaging Spectrometer) image is 140 megabytes. They also have high
dimensionality: Landsat images have 7 spectral bands, AVIRIS images have 224;
AIRSAR (synthetic aperture radars on aircraft) images have 3 frequencies with
4 polarizations each. Work with these datasets will motivate development of
fast inverse methods for estimating geophysical properties on parallel machines,
distributed management of large data bases, integration of inverse methods
into a next-generation data base system (Postgres), and coupling of visualization
tools (IDL, AVS, and Khoros) to a data base management system.
We also propose to implement the coupling of the chemistry and physics of
trace species to the UCLA Coupled GCM. This coupling will provide a new dimension
to global modeling capabilities, which is needed to study in sufficient detail
the coupled atmosphere, ocean and chemical tracer interactions controlling
the Earth's climate system and the chemical processes of the ozone layer.
The CATM can be run in two configurations with the dynamical models: 1) off
line, driven by dynamical history tapes; 2) coupled and in parallel with the
dynamics model. For the latter configuration, the tracer model grid will be
adapted to the AGCM and OGCM grids and boundary conditions.
The atmospheric tracer fields will be initialized using data that are available
for specific tracers, or model predictions that have been produced for lower
dimensionality (e.g., 2-D model simulations of atmospheric composition). The
chemical tracer species can be divided into a number of categories that determine
their mode of initialization: long-lived source gases with mainly vertical
variations; gases that vary significantly in space and time and that have
both chemical and dynamical influences, such as ozone; and photochemical-equilibrium
species that can be derived by simple photochemical analysis from the first
two types. Using rough initial conditions, model simulations will be bootstrapped
to test for stability and accuracy and to obtain a library of initial states
for later simulations. The velocity and temperature fields, and any other
parameters, required to drive the tracer model will be obtained from the GCMs.
The dynamical and tracer models will be fully coupled and run in parallel.
In this configuration, the tracer model will provide detailed gas and particle
distributions and radiation fields to the dynamics models, from which heating
(and cooling) rates can be calculated. For example, the CATM will predict
variation in radiatively-active gases such as ozone and methane. The CATM
aerosols can likewise be coupled to the AGCM radiation algorithm, and the
microphysical properties of the clouds predicted by the AGCM inferred from
the aerosol properties. In turn, the cloud fields predicted by the AGCM can
be used to determine the convective pumping and heterogeneous chemical processing
of the tracers (Turco et al., 1989).
2. Parallelization of the calculations in the different components of the
ESM
We propose to investigate the parallelization of the Coupled GCM. Our goal
is to reduce the wall-clock time required to run the Coupled GCM to that required
to run the AGCM/Physics only. Such a parallelization involves communications
between model components. Overlapping of communication with computation, therefore,
becomes an important issue.
A higher level of decomposition than that discussed in Section II.F of this
proposal is based on physical domains (spatial or domain decomposition). In
this method, the region of model simulation is divided into sectors (domains),
and calculations for the sectors are carried out concurrently. This decomposition
is straightforward for our codes since both the UCLA AGCM and GFDL OGCM are
grid-point models.
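A minimal sketch of such a spatial decomposition follows. The grid size and
the number of sectors are hypothetical, and the helper names are ours; the
sketch only partitions index ranges and does not address the exchange of
boundary values between neighboring sectors.

```python
def split_range(n, parts):
    """Split indices 0..n-1 into `parts` contiguous chunks whose sizes
    differ by at most one. Returns half-open (start, end) pairs."""
    base, extra = divmod(n, parts)
    bounds = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def decompose(n_lat, n_lon, lat_parts, lon_parts):
    """Return the sectors as ((lat_start, lat_end), (lon_start, lon_end))."""
    return [(lat, lon)
            for lat in split_range(n_lat, lat_parts)
            for lon in split_range(n_lon, lon_parts)]


# Hypothetical grid: 45 latitudes x 72 longitudes split into 4 x 8 sectors.
sectors = decompose(45, 72, 4, 8)
points = sum((la[1] - la[0]) * (lo[1] - lo[0]) for la, lo in sectors)
```

Each sector can then be assigned to one processor, with every grid point
belonging to exactly one sector.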
The exchange of data between different components of the coupled GCM can also
be carried out in I/O subdomains smaller than the entire globe. Namely, instead
of sending all the boundary data after one component (AGCM/Physics, AGCM/Dynamics,
or OGCM) completes calculation for the entire model domain, the data for each
I/O subdomain is sent to the other components as soon as calculation in that
subdomain is completed. In this way, the transmission of this data can be
masked by the computation for the next I/O subdomain in the corresponding
component or task.
The I/O decomposition can be used to run in parallel the three components
of the Coupled GCM. A possible scheme is shown schematically in Fig. 3, which
illustrates the case of four I/O subdomains. In such a scheme, the data produced
by the AGCM/Physics for an I/O subdomain is transferred to the AGCM/Dynamics
and OGCM. These components advance the models' prognostic variables one time
step as AGCM/Physics is computing the next I/O subdomain for the previous
time step. The AGCM/Physics advances one time step when AGCM/Dynamics and
OGCM return updated data. Spatial decomposition is used for calculation inside
each I/O subdomain. Further considerations on this decomposition are given
in Appendix E.
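The benefit of sending data one I/O subdomain at a time can be estimated
with a simple two-stage pipeline model. This is a sketch under stated
assumptions: every subdomain has the same compute and transfer cost, one
transfer is in flight at a time, and the numbers are hypothetical.

```python
def blocking_time(n_sub, compute, transfer):
    """All subdomains are computed first, then all data is sent."""
    return n_sub * compute + n_sub * transfer


def overlapped_time(n_sub, compute, transfer):
    """Closed form for a two-stage pipeline: the transfer of one subdomain
    proceeds while the next subdomain is being computed."""
    return compute + (n_sub - 1) * max(compute, transfer) + transfer


def simulate_overlap(n_sub, compute, transfer):
    """Step-by-step check of the closed form above."""
    compute_done = 0.0   # time at which the current subdomain is computed
    link_free = 0.0      # time at which the network link becomes idle
    for _ in range(n_sub):
        compute_done += compute
        link_free = max(link_free, compute_done) + transfer
    return link_free


# Hypothetical case: four I/O subdomains, 2.0 s compute and 0.5 s transfer each.
t_block = blocking_time(4, 2.0, 0.5)
t_overlap = overlapped_time(4, 2.0, 0.5)
```

When the transfer of a subdomain takes no longer than its computation, only
the last transfer is exposed; otherwise the network becomes the bottleneck.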
We also propose to investigate the parallelization of the CATM. This parallelization
will involve a reconfiguration of the model and re-ordering of the solution
steps. The code will be considered in terms of its distinct photochemical
and aerosol microphysics components. Initialization of the gaseous
species and reaction processes in the photochemical component demands only
a very small fraction of the total run time, and will not be affected. The
solution of the individual species continuity equations requires a number
of specific computational steps, which are time-split in the model: i) horizontal
advection; ii) vertical transport; iii) radiative transfer/photodissociation
coefficients; iv) photochemical kinetics. The horizontal advection and vertical
transport, including any diffusion, will be parallelized in the same manner
as for the solution of the continuity equations in the dynamical model. Indeed,
the solutions of the continuity equations for air, and the tracers in air,
require information only for the local grid points.
The photochemical kinetics calculation involves a number of steps that must
be carried out at each grid point individually, and thus can be done in parallel
at all grid points simultaneously. The steps within this calculation include:
retrieval of meteorological data from the dynamics simulation, such as air
temperature, density and humidity; calculation of chemical reaction rate
coefficients (which are functions of the above parameters); calculation of
the individual chemical reaction rates as needed (i.e., a rate coefficient
multiplied by the appropriate species concentrations); determination of the
photodissociation coefficients from a calculation for the radiation field
or using information on the distribution of absorbers along optical ray paths;
assembly of all the photochemical processes into total production and loss
rates for the individual species or “family” rate equations; application
of an efficient and stable numerical solver to the chemical rate equations;
determination of the final species concentrations by partitioning of the families.
In the existing CATM, these steps have been extensively optimized for vector
operations. Accordingly, the use of vector processors at the nodes within
a parallel architecture is ideal for the CATM. In addition, porting of the
CATM to a vectorized processor will require minimum algorithm redesign.
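To illustrate why the kinetics step is local to each grid point, the sketch
below advances a single production/loss equation dn/dt = P - L n with a
backward-Euler update. This is a standard textbook scheme assumed here for
illustration; it is not the solver used in the CATM, and all values are
hypothetical.

```python
def backward_euler_step(n, production, loss_freq, dt):
    """One implicit step of dn/dt = P - L*n with P and L held fixed over
    the step. The update stays bounded and non-negative for any dt > 0."""
    return (n + production * dt) / (1.0 + loss_freq * dt)


def step_all_points(conc, production, loss_freq, dt):
    """Apply the update independently at each grid point; no data from
    neighboring points is needed."""
    return [backward_euler_step(n, p, l, dt)
            for n, p, l in zip(conc, production, loss_freq)]


# Hypothetical values at three grid points (arbitrary units).
conc = [1.0, 0.0, 5.0]
production = [2.0, 2.0, 0.0]
loss_freq = [1.0, 1000.0, 0.5]   # the second point is stiff (fast loss)
new_conc = step_all_points(conc, production, loss_freq, dt=10.0)
```

Because each point is updated from its own values only, the loop over grid
points can be distributed or vectorized without communication.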
For the simulation of aerosol microphysics, the problem, again
for the non-transport terms, is a local calculation. Particle
sedimentation is included in the vertical transport (diffusion and convection)
algorithm. The aerosol microphysical processes of primary interest are (see
the section on the advanced version of the CATM): nucleation, condensation,
evaporation, coagulation, thermochemical equilibrium, and nonequilibrium chemical
transformation. The aerosols are also incorporated into the radiative transfer
calculations. To represent these aerosol processes accurately, the size distribution
of the particles is divided into a set of discrete size bins (the ratio of
the particle volumes in adjacent bins is fixed). Inclusion of different
materials in the aerosols, and the possibility
of several distinct types of particles, increases the number of individual
aerosol tracers that must be treated in the model.
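A size grid with a fixed volume ratio between adjacent bins can be sketched
as follows. The ratio of 2, the smallest volume, and the number of bins are
assumptions made only for this example; the values used in the CATM are not
stated here.

```python
import math


def bin_volumes(v_min, ratio, n_bins):
    """Particle volumes of the bins: each bin is `ratio` times the previous one."""
    return [v_min * ratio ** i for i in range(n_bins)]


def bin_radii(volumes):
    """Radii of spheres having the given volumes."""
    return [(3.0 * v / (4.0 * math.pi)) ** (1.0 / 3.0) for v in volumes]


# Hypothetical grid: unit smallest volume, volume doubling, 10 bins.
volumes = bin_volumes(1.0, 2.0, 10)
radii = bin_radii(volumes)
```

With a geometric grid of this kind, a modest number of bins spans several
orders of magnitude in particle volume.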
The microphysical processes typically cause the aerosols to move across the
size grid from smaller to larger sizes (with the exception of evaporation,
which reverses this direction). Hence, these processes are analogous to advection
across a spatial grid. In the case of the CATM, special techniques have been
developed to suppress numerical diffusion for particle growth, as must be
done with spatial advection. The mathematical structure of the aerosol physics
allows a unique linearized semi-implicit solution to be formulated (Turco
et al., 1979a,b). This solution, which combines the nucleation, growth and
coagulation processes, is accurate and stable for all conditions of interest.
Moreover, the structure of the numerical algorithms allows efficient solutions
through a vectorized tridiagonal solver. Thus, again, the individual grid-point
calculations can be optimized for aerosol physics by employing vectorizable
processors.
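For reference, a tridiagonal system of the kind mentioned above can be solved
with the Thomas algorithm. The sketch below is the standard scalar form,
assumed here for illustration; it is not the CATM's vectorized solver, which
would apply the same recurrence to many grid points at once.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] (lower[0] is unused), diag[i] multiplies
    x[i], and upper[i] multiplies x[i+1] (upper[-1] is unused). No pivoting
    is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n   # modified upper-diagonal coefficients
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small test system whose exact solution is x = [1, 2, 3].
x = solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                      [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
```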
3. Algorithm parallelization and load balancing
We propose to investigate issues related to algorithm parallelization and
load balancing. These issues are important because realistic ESM simulations
will only be possible if we can map the enormous amounts of parallelism available
in the model components to the parallelism becoming available on massively
parallel computers. The experience gathered by thoroughly understanding this
important application can then be used to design general purpose computing
tools for use in other applications.
Code development in this project will be performed in a hierarchy of computer
environments. Basic development and code debugging will be carried out in
a network of scientific workstations, which will represent Digital Equipment
Corporation's direct contribution to this effort. Development and test runs
of a massively parallel MIMD version of the code will be carried out on the
CM-5 at UC Berkeley. This 64-node CM-5 will have a maximum performance of
8 Gflops and 2 Gbytes of main memory; the developed code will be executable
on a 1024-node machine with 128 Gflops peak performance and 32 Gbytes of main
memory.
In addition to having one of the first three CM-5 installations in the country,
members of the Computer Science Division at UCB are engaged in close cooperation
with TMC to develop programming and mathematical software tools for the CM-5.
There are also several related parallel software development activities at
UCB that will contribute to the research in this proposal, and we have had
experience in parallelizing several large applications. We have also done
a preliminary study of the UCLA AGCM Physics code. We will discuss this background
work briefly, and then discuss implications for UCLA AGCM-Physics.
LAPACK, a project headed by Demmel, Kahan and others, recently released the
most complete, portable and optimized linear algebra library available for
shared memory vector and parallel machines (Anderson et al., 1992). We are
currently extending this work to distributed memory machines such as the CM-5,
under funding from NSF and DARPA. The first version of LAPACK only targeted
dense and band matrices, whereas the new project is targeting sparse matrices
as well. We expect the LAPACK experience to be quite useful since much computing
time in the ESM is spent in solving tridiagonal linear systems, as well as
block structured linear systems within the stiff ODE solver. The experience
with parallelizing the ESM will likely influence the future LAPACK design
and the next TMC math software library CMSSL as well.
Another project at Berkeley, led by Graham and Yelick with DARPA funding, aims
to enable realistic computationally intensive programs to be run on massively
parallel computers. The focus, therefore, is on making it easy to express
the parallelism inherent in such applications, and on compiling the applications
to deliver high sustained performance without requiring excessive work on
the part of the programmer. The methodology of the group is to parallelize
real, large-scale scientific codes in close collaboration with researchers
in other fields.
Applications are sped up through a combination of automatic techniques provided
by compilers and tools; manual restructuring of the code; and changes in the
numerics identified as sources of large-scale improvements and implemented
by the collaborators. In each study, the collaborating scientists have benefited
from the development of greatly sped-up versions of their code; the Berkeley
computer scientists have developed many new ideas in parallel languages, compilation,
and run-time techniques.
In the area of language design, these projects led to the development of a
"coordination language" called Delirium. In this language, the synchronization
and communication patterns of an application can be described (Lucco and Sharp,
1990). The computation itself is expressed in a conventional language such
as FORTRAN, making it easy to convert existing sequential programs to run
in parallel. In the area of run-time support for scientific computations,
the work led to the design of scheduling algorithms, a memory-object layer,
and a new parallelization technique.
The scheduling algorithms are applied when the load balance of the iteration
space of a program is not known at compile-time: the iteration space is sampled
and work is redistributed dynamically to ensure even load. This allows effective
parallelization of a much broader class of applications than methods that
rely on static or random scheduling (Lucco, 1992). The memory object layer
was developed to address the problem of managing memory objects required by
dynamic parallel applications. Tarmac is a mobile memory-object layer that
allows memory objects to be moved and addressed in a location-independent
manner. Because all communication is expressed in terms of these memory objects,
the scheduler can redistribute them at will without changing any of the communication
code. Tarmac has been implemented on the CM-5, for which we developed the
fastest bulk-memory transfer protocol available on the CM-5 (Bacon and Lucco,
1992; Lucco and Anderson, 1990). Finally, a run-time technique called "optimistic
parallelization" was developed for those cases when future states of
a computation can be guessed with high probability; in this case the future
computation can be performed without waiting for the current computation to
complete, thereby allowing parallelization even when data- and control-dependencies
exist (Bacon and Strom, 1991).
Preliminary work on the UCLA AGCM-Physics code has shown that our run-time
scheduling techniques will offer substantially better performance than other
scheduling methods. This is because the Physics portion of the computation
has widely varying computational load: a grid element with cumulus clouds
in the tropics requires a great deal more computation than a grid point with
a clear sky. The result is that an optimal implementation cannot simply assign
the same number of grid points to each processor; the assignment must be varying
and dynamic. Our Tarmac run-time system for the CM-5 supports exactly this
type of time-varying decomposition; Tarmac is also being ported to run on
networks of workstations such as the DECstations.
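The effect of uneven load can be seen in a small comparison of a static,
equal-count assignment with a greedy dynamic assignment. This sketch is not
the Tarmac scheduler or the sampling-based algorithm cited above; it is a
simple list-scheduling model, and the per-point costs are hypothetical.

```python
import heapq


def static_makespan(costs, n_proc):
    """Contiguous blocks with equal numbers of grid points per processor;
    the slowest processor sets the wall-clock time."""
    block = -(-len(costs) // n_proc)   # ceiling division
    return max(sum(costs[i:i + block]) for i in range(0, len(costs), block))


def dynamic_makespan(costs, n_proc):
    """Greedy self-scheduling: the next grid point goes to whichever
    processor becomes free first."""
    finish = [0.0] * n_proc
    heapq.heapify(finish)
    for c in costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + c)
    return max(finish)


# Hypothetical load: 8 expensive cumulus points clustered together,
# followed by 24 cheap clear-sky points, on 4 processors.
costs = [10.0] * 8 + [1.0] * 24
t_static = static_makespan(costs, 4)
t_dynamic = dynamic_makespan(costs, 4)
```

In this assumed case the static assignment places all expensive points on
one processor, while the dynamic assignment reaches the ideal time of total
work divided by the number of processors.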
4. The ESM and Data Base Management System (DBMS)
We propose to explore the advantages of a close coupling between the ESM and
a Data Base Management System (DBMS). The purpose of this coupling is twofold:
i) Storing model output in a DBMS will allow users to query output from previously
run models looking for trends of interest. Complex queries to model output
are anticipated, as users will browse through model output, and compare model
output with observational datasets (e.g., satellite imagery).
ii) Storing model output in a DBMS as it is being generated will allow users
to examine model output as the model is running. As such, if the model is
not producing desired effects, the user can end the simulation run and save
valuable computer resources.
To satisfy these two needs, we propose to explore three different research
thrusts. First, each ESM output variable can be considered as a four-dimensional
array indexed by longitude, latitude, elevation, and time. Since
GCMs have vectors of model outputs, they generate a collection of 4-D arrays
or a single 5-D array, where the last dimension is an index for the model
variable. Providing sophisticated query capabilities amounts to storing this
5-D array in a way that ad-hoc queries can be run against it with good response
time.
We plan to examine the advantages of organizing model output into tiles. Each
tile would store a range of values in each of the five dimensions contiguously.
For example, 10 longitude, 10 latitude, 3 elevation, and 5 time values could
be put in a tile for all model variables. With multiple tiles, response to
a mix of queries can be tuned by adjusting both the total number of tiles
and the number of values from each dimension placed into a tile. We propose
to investigate maintaining statistics on the mix of queries being run and
then dynamically adjusting the tiling parameters to achieve best possible
average performance. Furthermore, it is possible to implement two different
tiling systems simultaneously if redundant secondary storage is allowable.
With multiple tilings, further optimization of response time can be performed.
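The trade-off in choosing tile dimensions can be sketched by counting the
tiles that a query touches. The first tile shape below is the example given
above; the second shape and both queries are hypothetical, and the function
names are ours. Each tile is assumed to hold all model variables.

```python
from itertools import product

# Tile extent in (longitude, latitude, elevation, time) index units.
TILE_SHAPE = (10, 10, 3, 5)


def tile_of(index, tile_shape=TILE_SHAPE):
    """Tile coordinates of the tile holding a given grid index."""
    return tuple(i // t for i, t in zip(index, tile_shape))


def tiles_touched(lo, hi, tile_shape=TILE_SHAPE):
    """Tiles overlapped by the half-open index box [lo, hi)."""
    ranges = [range(l // t, (h - 1) // t + 1)
              for l, h, t in zip(lo, hi, tile_shape)]
    return list(product(*ranges))


# Query 1: a 100-step time series at a single grid point.
series = ((12, 34, 0, 0), (13, 35, 1, 100))
# Query 2: a 70 x 40 horizontal map at one level and one time.
one_map = ((0, 0, 0, 7), (70, 40, 1, 8))

ALT_SHAPE = (2, 2, 1, 100)   # hypothetical tiling that favors time series
counts = {
    "series_default": len(tiles_touched(*series)),
    "series_alt": len(tiles_touched(*series, tile_shape=ALT_SHAPE)),
    "map_default": len(tiles_touched(*one_map)),
    "map_alt": len(tiles_touched(*one_map, tile_shape=ALT_SHAPE)),
}
```

A tiling that is good for one kind of query can be poor for another, which
is the motivation for tuning the tiling to the observed query mix or keeping
two tilings at once.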
A second thrust stems from the realization that model output will typically
reside on tertiary memory. Initial thoughts on tertiary memory optimization,
expanded bookkeeping, and optimization of expensive functions are presented
in Stonebraker (1991). We propose to continue with this work and to expand
the resulting optimizer with knowledge of our tiling approach to array storage
discussed above. We expect to develop and implement these techniques within
the context of the POSTGRES next-generation DBMS (Stonebraker, 1990; Stonebraker
et al., 1991; Mosher et al., 1991), under construction at the University of
California, Berkeley.
A third thrust is to support the high rate of data insertion associated with
collecting data while an ESM model is running. As a result, we propose to
explore high throughput insertion schemes that could be added to a DBMS to
accommodate the rate of output generation associated with model execution
on next generation massively parallel systems. Specifically, we propose to
explore lightweight protocols that could support data entry at very low CPU
cost. Moreover, we propose to explore solutions to the synchronization and
consistency problems that arise with parallel data entry into a DBMS.
5. Development of a new recipe management architecture
We propose to explore a different architecture for recipe management that
couples it closely to data base systems. This alternate architecture is discussed
below.
Many of the objects visualized by the scientific community are the values
of regular arrays of cells. Such objects abound in GCMs as well as in remote
sensing applications. Such large objects are best supported in DBMSs. If put
in a DBMS, then standard data base services are automatically available such
as the query language, automatic query optimization, alternate views of data,
a sophisticated rules system, etc. Such capabilities are valuable in building
recipe management systems, and as a result, we believe that a recipe management
architecture should be DBMS-centric.
Considerable effort has been spent by the DBMS research community in constructing
next generation DBMSs that are extendible, i.e. that support user-defined
types, functions and access methods. Example data managers in this class are
POSTGRES (Stonebraker et al., 1990), IRIS (Wilkinson et al., 1990), Starburst
(Haas et al., 1990), and Orion (Kim et al., 1990). The second cornerstone
of our proposal is that such type extension facilities can be used to advantage
to define a very sophisticated recipe management system.
Based on these observations, we propose to explore the following methodology:
i) Store all data on which recipes operate in a next-generation DBMS.
ii) Register with the DBMS all functions that implement the recipe steps
in the visualization system.
iii) As a result of ii), any user recipe can be compiled into one or more query
language commands. This collection of commands is then optimized by the DBMS
query optimizer and run to produce the desired result.
We also propose to build a prototype of this architecture using the next generation
DBMS, POSTGRES. Our specific research goals are the following.
First, since a recipe is a data base object and represents one or more queries
to a DBMS, it has points in common with the traditional notions of views (Stonebraker,
1975) and stored query plans. Hence, we propose to explore both the similarities
and differences between recipes, views and query plans.
A common operation in recipe management is to run a recipe, browsing the output
as desired, and then change a run-time parameter somewhere in the recipe.
Continued browsing of the altered recipe is then expected. A reasonable optimization
is to place recipe execution into a state where all data that flows along
each arc is captured by the DBMS and retained as temporary data. Then, if
the user alters the recipe, the whole recipe does not need to be rerun. Instead,
the saved data that is input to the first recipe step which has been changed
is re-inputted to that recipe step. Any previous recipe steps do not need
to be repeated.
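The idea of rerunning a recipe only from the first changed step can be
sketched as follows. This is a minimal illustration in which a recipe is
assumed to be a linear chain of steps and an in-memory list stands in for
the temporary data retained by the DBMS; the class and the example steps
are hypothetical.

```python
class Recipe:
    """A linear chain of steps; each step is a (function, parameter) pair."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.cache = []   # cache[i] holds the output of step i from the last run
        self.calls = 0    # number of step executions, kept for illustration

    def run(self, source, start=0):
        """Run steps from `start` onward, reusing the cached input to `start`."""
        data = source if start == 0 else self.cache[start - 1]
        del self.cache[start:]   # discard results that are now stale
        for func, param in self.steps[start:]:
            data = func(data, param)
            self.cache.append(data)
            self.calls += 1
        return data

    def change_param(self, i, param, source):
        """Change the parameter of step i and rerun from that step only."""
        func, _ = self.steps[i]
        self.steps[i] = (func, param)
        return self.run(source, start=i)


def scale(data, p):
    return [x * p for x in data]


def offset(data, p):
    return [x + p for x in data]


def threshold(data, p):
    return [x for x in data if x > p]


recipe = Recipe([(scale, 2), (offset, 1), (threshold, 4)])
first = recipe.run([1, 2, 3, 4])
calls_after_first = recipe.calls
second = recipe.change_param(2, 6, [1, 2, 3, 4])
calls_after_second = recipe.calls
```

Changing the last step here costs one step execution instead of three,
because the outputs of the two earlier steps are reused.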
This caching of intermediate results has been advocated in Sellis (1986);
however, that work concerns the optimization of multiple queries in a query
stream, where a previous result may be useful as part of a subsequent
query. In our environment, when a recipe is changed, we can avoid recreating
the whole recipe by using this caching technique. We propose to explore the
utilization of this technique in a recipe context.
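A minimal sketch of this caching idea follows, assuming a linear recipe (a simple chain of steps) and unchanged source data between runs. The class `CachedRecipe` and the step functions are hypothetical; in the proposed architecture the DBMS, not an in-memory dictionary, would retain the data flowing along each arc.

```python
class CachedRecipe:
    """Linear recipe whose intermediate results are retained between runs."""

    def __init__(self, steps):
        # steps: list of (function, params); each function takes (data, **params)
        self.steps = [(func, dict(params)) for func, params in steps]
        self.cache = {}   # step index -> saved output of that step
        self.calls = 0    # number of step executions, kept for illustration

    def set_param(self, index, **params):
        """Change a run-time parameter and discard only the stale results."""
        self.steps[index][1].update(params)
        for i in list(self.cache):
            if i >= index:
                del self.cache[i]

    def run(self, source):
        data = source
        for i, (func, params) in enumerate(self.steps):
            if i not in self.cache:
                self.cache[i] = func(data, **params)
                self.calls += 1
            data = self.cache[i]
        return data


def scale(data, factor):
    return [x * factor for x in data]

def offset(data, amount):
    return [x + amount for x in data]

def clip(data, upper):
    return [min(x, upper) for x in data]

recipe = CachedRecipe([(scale, {"factor": 2}),
                       (offset, {"amount": 1}),
                       (clip, {"upper": 6})])
first = recipe.run([1, 2, 3])
calls_after_first = recipe.calls
recipe.set_param(2, upper=4)      # alter only the last step
second = recipe.run([1, 2, 3])
calls_after_second = recipe.calls
```

In this sketch the second run executes only the altered step; the saved output of the preceding step is re-input to it, as described above.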
Furthermore, optimization of the query or queries that result from the compilation
process described above is an issue to be studied. It is expected that such
queries will have a large number of cascaded functions performing recipe steps.
One of the key operations in optimizing such queries will be moving data base
selections through such functions to restrict the amount of data on which
they operate. We propose to explore these and other optimization tactics that
a recipe compiler can use. Our basic approach will be to extend the pioneering
work of Selinger (1979) in this direction.
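The effect of moving a selection through a function can be sketched as follows. The example assumes that the selection predicate reads only fields the recipe step leaves unchanged, which is the condition under which the two orderings are equivalent. The function names and the sample rows are hypothetical.

```python
def run_naive(rows, step, predicate):
    """Apply the recipe step to every row, then select."""
    out = [step(r) for r in rows]
    return [r for r in out if predicate(r)], len(rows)

def run_pushed(rows, step, predicate):
    """Select first, then apply the recipe step to the surviving rows only."""
    kept = [r for r in rows if predicate(r)]
    return [step(r) for r in kept], len(kept)

def to_celsius(row):
    # Recipe step: adds a derived field and leaves "lat" untouched.
    return {**row, "temp_c": row["temp_k"] - 273.15}

def in_tropics(row):
    # Selection: depends only on "lat", so it commutes with to_celsius.
    return abs(row["lat"]) <= 10.0

rows = [
    {"lat": -60.0, "temp_k": 250.0},
    {"lat": 0.0, "temp_k": 300.0},
    {"lat": 5.0, "temp_k": 299.0},
    {"lat": 70.0, "temp_k": 240.0},
]

naive_result, naive_calls = run_naive(rows, to_celsius, in_tropics)
pushed_result, pushed_calls = run_pushed(rows, to_celsius, in_tropics)
```

Both orderings return the same rows, but the pushed-down form invokes the step on fewer rows. With cascaded functions, the saving compounds at every step the selection can be moved through.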
One capability required for recipes has been termed data lineage. Users wish
to focus on a point or a region in a visualized object. Generally, the indicated
data is obviously incorrect. As a result, the user wishes to trace backward
through the recipe looking for defective input data or classification errors.
In effect, the user wishes a "debugger" that can move backward through
recipe execution. This capability has been termed data lineage by the scientific
community.
There are several ways to construct the required lineage. First, if a recipe
step is invertible, and the person who registered the function involved in
the step also provided the inverse function, then, the recipe manager can
simply pass the defective data to the inverse function, thereby generating
the defective input data to the function. Iteratively performing this step
would allow a user to trace any given data back to the beginning of the recipe.
Unfortunately, recipe steps are rarely invertible. As a result, we propose
to explore other capabilities. The first is a forward marking system which
we call dye. The user would be allowed to mark any subset of the data in a
recipe step with dye of a color of his choosing. When recipe execution is
resumed, the recipe manager must propagate the dye to output data elements
that are computed from dyed input data elements. A user can thereby guess
offending input values, dye them, watch the dye appear in the output and prove
or disprove his hypothesis about the lineage of offending data. We propose
to explore efficient and general techniques for supporting this capability.
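As a rough illustration of dye propagation, the sketch below runs one recipe step and colors every output element that is computed from at least one dyed input element. It assumes the recipe manager knows which inputs each output reads (the hypothetical `inputs_of` argument); how that dependency information would be obtained efficiently and in general is the open question stated above.

```python
def run_step_with_dye(values, dye, func, inputs_of):
    """Run one recipe step and propagate dye from inputs to outputs.

    values    : list of input data elements
    dye       : dict mapping input index -> set of dye colors
    func      : function computing the list of outputs from the inputs
    inputs_of : function giving, for output index j, the input indices it reads
    """
    outputs = func(values)
    out_dye = {}
    for j in range(len(outputs)):
        colors = set()
        for i in inputs_of(j):
            colors |= dye.get(i, set())
        if colors:
            out_dye[j] = colors
    return outputs, out_dye


def pairwise_mean(values):
    # Example step: average each pair of neighboring elements.
    return [(values[j] + values[j + 1]) / 2 for j in range(len(values) - 1)]

def pairwise_inputs(j):
    return (j, j + 1)

values = [1.0, 2.0, 99.0, 4.0, 5.0]
dye = {2: {"red"}}   # the user suspects the value 99.0
outputs, out_dye = run_step_with_dye(values, dye, pairwise_mean, pairwise_inputs)
```

Here the red dye appears on the two outputs that read the suspect input, which supports the user's hypothesis that the anomalous outputs originate from that element.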
The other capability we plan to explore for supporting lineage is a lineage
function. When a user registers a function, he can optionally provide a lineage
function. In this case, the user does not need to back up to the input to
a recipe step and guess appropriate input to dye to check his lineage hypotheses.
Instead, he dyes the defective data directly. The recipe manager then passes
the dyed data to the lineage function and visualizes the output of this function.
Of course, the dye is propagated to the output as described above. If the
lineage function provides an approximation to the inverse of the recipe function,
then it will provide significant information about the actual data lineage.
Hence, our second approach to data lineage utilizes a lineage function that
a user can write to provide as much information as possible about the real
lineage situation.
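A lineage function can be sketched for a step that is clearly not invertible, such as a block average. The registry, the `register` and `trace_back` helpers, and the signature of the lineage function are all hypothetical choices made for this illustration.

```python
registry = {}

def register(name, func, lineage=None):
    """Register a recipe function, optionally with a lineage function."""
    registry[name] = {"func": func, "lineage": lineage}

def block_mean(values, width=2):
    # Non-invertible step: each output is the mean of `width` inputs.
    blocks = [values[i:i + width] for i in range(0, len(values), width)]
    return [sum(b) / len(b) for b in blocks]

def block_mean_lineage(dyed_outputs, n_inputs, width=2):
    # Maps dyed output indices back to the input indices that produced them.
    inputs = set()
    for j in dyed_outputs:
        inputs.update(range(j * width, min((j + 1) * width, n_inputs)))
    return inputs

def trace_back(name, dyed_outputs, n_inputs):
    lineage = registry[name]["lineage"]
    if lineage is None:
        raise ValueError(f"no lineage function registered for {name!r}")
    return lineage(dyed_outputs, n_inputs)

register("block_mean", block_mean, block_mean_lineage)

values = [1.0, 1.0, 1.0, 50.0, 1.0, 1.0]
outputs = registry["block_mean"]["func"](values)
suspects = trace_back("block_mean", {1}, len(values))   # user dyes output 1
```

The lineage function cannot say which of the two contributing inputs is defective, but it narrows the search to them without requiring the user to guess which inputs to dye.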
Lastly, we propose to build a prototype visualization system using these ideas.
We expect to utilize the user interface code from Khoros, interfacing it to
POSTGRES (Mosher, 1991) as the basic building block to explore provision of
the above capabilities.
B. Earth Science Challenges
The Earth Systems Model described in the previous sections will be applied
to investigate important problems related to coupled climate dynamics and
chemistry. Three specific problems that will be addressed are outlined below.
The applications phase of this proposal serves at least two purposes. First,
it provides a means of validating the model accuracy and performance in relation
to realistic environmental problems. Second, it offers a unique analysis of
these critical problems using a new powerful coupled predictive model.
1. Seasonal cycle and interannual variability of the atmosphere-ocean system
We propose to analyze the simulated seasonal cycle and interannual variability
of the coupled atmosphere-ocean system. The development of an ESM that can
produce a realistic simulation of the seasonal cycle of the coupled atmosphere-ocean
system is one of the major goals of this project. Achieving this goal implies
an interdisciplinary effort that brings together observational and theoretical
investigations of a broad spectrum of processes, in the atmosphere, the oceans,
on land, and in polar regions.
Improved predictions of El Niño/Southern Oscillation (ENSO) –
the major feature in the interannual variability of the Tropical Pacific –
depend upon better understanding of the seasonal cycle, and on the ability
to simulate it with GCMs. The current models have considerable predictive
skill if forecasts start in June or July but do poorly if the forecasts start
in February. It appears, therefore, that the seasonal cycle modulates the
interannual variability. At this time we have a poor understanding of the
relation between the annual and interannual variations, and are unable to
explain many aspects of the seasonal cycle. Why, for example, is an annual
cycle dominant in some equatorial zones (the eastern Pacific) whereas a semi-annual
cycle is dominant in others (the central Indian Ocean)? How do land and oceanic
conditions compete for the location of major convective zones in the Indian,
Atlantic and Pacific sectors?
The seasonal cycle is a very large and accessible periodic global climate
change. It is therefore troublesome that the models that predict future climate
changes (in response to higher CO2 levels) have difficulty in coping with
the seasonal cycle. (At present the models have to resort to "flux"
and other "corrections" to simulate the seasonal cycle). The development
of a climate model capable of an accurate simulation of the seasonal cycle
is of critical importance.
A study of the seasonal cycle is important both for "short-term"
purposes (predicting El Niño and, more generally, interannual variability)
and for "long-term" purposes (predicting climate changes during
the next century). To assess the simulated seasonal cycle of the coupled atmosphere-ocean
system requires multi-year simulations with the ESM; the seasonal cycle in
any one year is always anomalous, so that analyses of the simulated seasonal
cycle will automatically provide data about simulated interannual variations.
We will explore the hypothesis that models able to simulate the seasonal cycle
will automatically be capable of simulating interannual variability. The two
phenomena, the seasonal cycle and interannual variations, are inextricably
linked even though they are distinct: the one is a forced response, the other
involves natural variability of the seasonal climate system.
Interannual and interdecadal oscillations in the climate system are important
both in their own right and because they can mask detection of anthropogenic
climate change. Coupled modeling of these phenomena may eventually lead to
the possibility of predicting short-term climate fluctuations, as is now experimentally
being done for ENSO. We emphasize the modeling of interannual and interdecadal
variability for two reasons. First, there are important insights into the
climate system that can be obtained from such a study. Second, there are a
number of processes in current climate models that are poorly known, and which
limit the confidence which can be placed, for instance, in greenhouse warming
simulations. In attempting to simulate coupled interdecadal variability, we
confront the models with phenomena which are at longer time scales than they
have previously been used for. This provides an additional level of testing
for the models, which minimizes the possibility of accidental tuning of the
model during its design. At some future stage, a more thoroughly tested version
of the model might then be considered a more reliable vehicle for greenhouse
warming studies.
2. Global Distribution of Greenhouse Gases
We propose to study the global distributions of greenhouse gases, including
the chlorofluorocarbons, methane and ozone, including sources and sinks for
these gases, photochemical transformations, and removal processes, using the
CATM as a means of integrating these various processes; the simulations will
provide a calibration of the dynamical predictions by comparing observational
data on global distributions against model predictions. This project will
result in a generalized atmospheric dynamics/tracer model that can be applied
to a number of problems involving transport, chemistry and radiation. The
basic algorithms for treating chemical processes (Turco and Whitten, 1974,
1977, 1978), microphysical processes (Turco et al., 1979a,b; Toon et al.,
1988, 1989a), and radiative processes (Toon et al., 1989b) have been developed
by the project participants over two decades of research. Moreover, these
algorithms have been extensively used to study the causes of global ozone
depletion and to develop models of the underlying physical/chemical mechanisms
(e.g., Hamill et al., 1977, 1982, 1988, 1990; Toon et al., 1986, 1987, 1990;
Turco and Hamill, 1992; Turco et al., 1982, 1989).
The construction of a three-dimensional chemical tracer model, and its coupling
with a global climate model such as that described earlier, represents a major
advancement toward a practical, accurate prognostic climate simulation. With
such a model, forecasts can be used to forestall environmental degradation
and to design effective approaches for preserving the global environment.
The development of three-dimensional atmospheric tracer models has been slow.
The most recent attempts incorporate limited sets of chemical species or employ
low spatial resolution (e.g., Kaye et al., 1989; Cariolle et al., 1990; Kao
et al., 1990; Rood et al., 1990). No three-dimensional models yet include
aerosols and their effects (Charlson et al., 1987), involving radiative and
chemical processes (Drdla et al., 1992a). It is recognized, for instance,
that the generation of tropospheric sulfate aerosols from anthropogenic sulfur
emissions is the likely reason for the pause in the northern hemisphere warming
trend during the 1950's and 60's (Charlson et al., 1990). Impediments to the
development of practical 3-D tracer chemistry/microphysics models include
the lack of efficient numerical algorithms for complex mechanisms, the enormous
computational burden imposed by 3-D chemical/aerosol tracer simulations, and
the limited motivation to construct highly sophisticated codes for narrow
disciplinary studies. However, recognition of the Earth's climate system as
a fully coupled atmosphere/ocean/land/biosphere system that is under stress
has created an urgent need for coupled predictive simulations of the climate
system.
The present research team includes specialists at UCLA and LLNL who have produced
efficient and accurate numerical treatments for atmospheric chemical and microphysical
processes. The team also has access to the latest generation of supercomputers,
including previously experimental parallel architecture machines that will
soon provide sufficient computational speed for the problem. Moreover, the
initial steps toward achieving a fully coupled tracer-dynamics model have
been successfully taken by the research team (e.g., Erickson et al., 1990;
Jacobson et al., 1992a; Lu and Turco, 1991; Penner et al., 1991; Zhao et al.,
1992a). These studies, although preliminary to the work proposed here, represent
essential experiments that validate the approach adopted for modeling complex
dynamical/ chemical/microphysical systems.
The most straightforward application of the coupled dynamical/tracer Earth
Systems Model is the simulation of the distributions of global greenhouse
gases, particularly the chlorofluorocarbons (CFCs). These gases have well-defined
sources, and their concentrations have been monitored worldwide for over a
decade. Accordingly, detailed comparisons between model predictions and observations
provide an excellent validation scheme for the simulated atmospheric transport.
To date, models used to calculate the rates of global tracer transport have
used simplified atmosphere and ocean dynamical models. The coupled AGCM and
OGCM proposed here represent one of the most advanced coupled dynamical simulations
available. On the other hand, the details of the coupled simulations will
offer a rigorous test of the coupled model dynamics, and will provide insight
into the feasibility and fidelity of coupled models.
In the case of methane and ozone, photochemical processes are a primary influence.
The atmospheric lifetimes of methane and other greenhouse compounds are determined
by the concentrations of hydroxyl radicals, which in turn are controlled by
a complex sequence of chemical reactions. The proposed coupled Earth systems
model will be capable of predicting OH concentrations as a function of environmental
conditions, and therefore will provide a flexible analytical tool for evaluating
the impact of methane emissions.
The Lawrence Livermore National Laboratory atmospheric chemistry model (LACM)
includes most of the chemical reactions of interest to this problem. Accordingly,
the LACM will be a test bed for the inclusion of complex photochemistry in
the CATM. The distribution of trace gases can be simulated with the LACM prior
to a full treatment in the CATM. This will allow an initial evaluation of
the key processes and expected sensitivities to physical and chemical parameters.
3. Ozone Perturbations
We propose to investigate the global depletion of the stratospheric ozone
layer, particularly the high latitude ozone depletions associated with aerosols
in both hemispheres, and the coupling of ozone to atmospheric dynamics and
climate change. The global depletion of the ozone layer is well established
through extensive satellite and in situ observations (Stolarski et al., 1991;
Anderson et al., 1991). The massive ozone losses associated with the "ozone
hole" over Antarctica (Farman et al., 1985) are attributable to chemical
processes that are catalyzed by the presence of ice particles in the stratosphere
(Solomon et al., 1986; Crutzen and Arnold, 1986; McElroy et al., 1986). These
ice particles - polar stratospheric clouds (PSCs) - are composed of nitric
acid and water ices (Toon et al., 1986). However, the stratosphere also holds
a ubiquitous layer of sulfuric acid aerosols (e.g., Turco et al., 1982) that
may also cause ozone destruction (Hofmann and Solomon, 1989; Turco and Hamill,
1992).
The issue of ozone depletion is significant for several reasons. First, it
is a paradigm for global environmental problems, involving many aspects of
atmospheric dynamics, chemistry and physics that interact to produce the final
effect. Second, the ozone problem requires an accurate predictive capability,
to project future potential ozone depletions before they occur so that effective
policy actions and control measures can be designed. Third, the magnitude
of the effect is highly nonlinear in the basic parameters. For example, the
effects of heterogeneous (ice) chemistry in causing large ozone decreases
do not occur unless the temperature of the atmosphere falls below a certain
threshold value; once below this value, rapid ozone depletion can set in.
Hence, temperature changes of only a few degrees in the stratosphere may have
important consequences for ozone not predicted by uncoupled dynamical/chemical
models.
In most current models, the homogeneous photochemistry of the atmosphere is
calculated independently of heterogeneous processes. This is practical because
adequate methods to treat heterogeneous chemistry have not been devised and
thus are not generally available, and the computer resources required for
a full chemical treatment remain prohibitive. We have studied polar ozone
depletions with a coupled polar stratospheric cloud microphysics, heterogeneous
chemistry, and photochemistry simulation driven by the UCLA version of the
U. K. Met Office Stratosphere/Mesosphere Model (Drdla et al., 1991, 1992a).
In one case (Drdla et al., 1992a,b), back trajectories for air masses sampled
during the Airborne Arctic Stratosphere Expedition-II (AASE-II) were obtained,
and the complex microphysical evolution of the PSCs was simulated along these
trajectories. The corresponding heterogeneous chemical processing rates were
calculated, and inserted into the homogeneous chemical mechanism of the coupled
APM. The resulting species concentrations predicted at the aircraft track
were compared against ER-2 aircraft data. The good agreement demonstrates
the feasibility and usefulness of coupling microphysical and chemical simulations
(also see Jones et al., 1990). Another example is offered by recent simulations
of volcanic eruption clouds (Turco et al., 1991; J.-X. Zhao et al., 1992a).
In this case, sulfur dioxide photochemistry and sulfate aerosol microphysics
have been coupled as subroutines in the CATM (Toon et al., 1987), and the
dynamical fields were obtained from a stratospheric general circulation model,
which uses a version of the CATM to advect passive tracers (Young et al.,
1992).
IV. Industrial Support to this Project
In addition to the indirect support to this project provided by Digital Equipment
Corporation through Sequoia 2000, DEC has committed direct support (see Appendix
F). DEC is interested in the use of networked workstations as a platform for
developing and testing models such as those motivating this proposal. The
Sequoia workstation network will provide a prototype configuration for testing
a distributed ESM, and will provide valuable evaluation, debugging, and optimization
of the algorithms that are distributed.
V. Project Personnel
The Principal Investigators on this project form a multidisciplinary team
consisting of atmospheric and oceanic dynamicists and atmospheric chemists,
as well as computer scientists. The development of the ESM ––
which combines the UCLA AGCM, GFDL OGCM, and Ames/UCLA CATM ––
involves the participation of Earth scientists, while the implementation of
a high-performance model in an MIMD computer environment requires the participation
of computer scientists. The Earth scientists provide technical information
regarding the numerical algorithms, data structures, validation tests, and
benchmark datasets. Computer scientists provide parallelization strategies
and methods for parallel performance assessment, as well as computational
resources in the form of parallel developmental platforms and software, and
access to vector machines. Communication between the project team members
will be greatly enhanced by a Picturetel Videoteleconferencing system that
will link the University of California campuses participating in the Sequoia
2000 Project (UCLA, UCB, UCSD, and UCSB).
The leaders of the proposed research effort will be Professors Carlos R. Mechoso
and Richard P. Turco. Professor Mechoso has been working with the UCLA coupled
atmosphere/ocean GCM for a number of years; he will direct the effort to develop
the parallelized version of the coupled GCM, as well as its application to
the Earth sciences challenges. Professor Turco has participated in the design
and application of the CATM; he will head the effort to develop a parallelized
chemistry/microphysics model. Mechoso and Turco will both work toward the
development of a fully coupled dynamics/tracer model. The other team members
and their research specialties are:
Akio Arakawa - Atmospheric numerical modeling, dynamics, and parameterization;
James W. Demmel - Numerical analysis and parallel scientific computing;
Jeff Dozier - Hydrologic models and satellite and aircraft data analysis;
David Halpern - Physical oceanography and satellite data analysis;
George S. H. Philander - Ocean-atmosphere dynamical systems and modeling;
Michael E. Stonebraker - Data base management systems and communications;
Donald J. Wuebbles - Atmospheric chemistry modeling and analysis.
The Co-Investigators on the project are:
William P. Dannevik - Computational hydrodynamics and turbulence;
Susan L. Graham - Software tools to aid high performance parallel computing;
Joyce L. Penner - Atmospheric chemistry modeling and analysis;
David A. Randall - Atmospheric physics and dynamics modeling;
Douglas Rotman - Atmospheric chemistry modeling and analysis.
The institutions involved in this project are:
Colorado State University - Department of Atmospheric Sciences (Randall);
Jet Propulsion Laboratory - Earth and Space Sciences Group (Halpern);
Lawrence Livermore National Laboratory - Atmospheric and Geophysical Sciences
Division (Wuebbles, Dannevik);
Princeton University - Program in Atmospheric and Oceanic Sciences (Philander);
University of California Berkeley - Computer Sciences (Demmel, Graham, Stonebraker);
University of California Los Angeles - Department of Atmospheric Sciences
(Arakawa, Mechoso, Neelin, Turco);
University of California Santa Barbara - Center for Remote Sensing and Environmental
Optics (Dozier).
The budget includes support for two postdoctoral researchers, two students
and 50% of a programmer at UCLA. Travel funds are also allocated to support
one trip per year to UCLA by each non-UCLA PI and one trip per year
to a scientific conference for each PI. At UCLA, one postdoc will be assigned
to perform parallelization of and simulations with the dynamical models (AGCM
and OGCM) under the direction of Mechoso, Arakawa and Neelin. A second postdoc
will work between UCLA and LLNL on the atmospheric photochemistry algorithms,
parallelization of the chemistry codes, and atmospheric chemistry simulations,
under guidance from Turco and Wuebbles. One UCLA student will focus on the
dynamical modeling of the seasonal cycle, as formulated by Mechoso, Arakawa
and Neelin. The second student will work on the problem of aerosol microphysical
and chemical simulation using the CATM, with Turco as an advisor. The third
student will work on the hydrological models with Dozier as an advisor.
Subcontracts (attached) provide detailed budgetary information concerning
the collaborative research between UCLA and UC Berkeley, Princeton University,
Colorado State University and Lawrence Livermore Laboratory. The UC Berkeley
subcontract supports the computer science elements of the proposal, while
the Princeton and CSU subcontracts support both the computational and science
elements, as described in the body of the proposal.
APPENDIX A
The UCLA Atmospheric GCM (AGCM)
The UCLA AGCM predicts the values of horizontal velocity, potential temperature,
water vapor and ozone mixing ratios, surface pressure, and ground temperature.
In an approach unique to the UCLA model, the planetary boundary layer (PBL)
is treated as well-mixed and represented by the variable-depth bottom layer
of the model. The depth of this layer is also predicted by the model.
The AGCM includes parameterizations of PBL processes using bulk assumptions
for the description of turbulence (Suarez et al., 1983). Surface fluxes of
sensible heat, moisture and momentum are modeled using the bulk parameterization
proposed by Deardorff (1972). The model also includes parameterizations of
cumulus convection and its interaction with the PBL (Arakawa and Schubert,
1974), stratus clouds, and solar and infrared radiative heating (Katayama,
1972 and Harshvardhan et al., 1987, respectively). The cloudiness used in
the radiation calculation is predicted. A parameterization of orographic gravity
wave drag similar to that developed by Palmer et al. (1986) is included in
the model. Efforts are under way to develop improved cloud formation parameterizations,
especially for convective and cirrus clouds, based in part on the use of explicit
ice and liquid water variables.
In the vertical, the model is based on a coordinate system for which the lower
boundary, the PBL top, and isobaric surfaces above a prescribed pressure level
(100 mb) are coordinate surfaces (Arakawa and Suarez, 1983). The top of the
model atmosphere is assumed to be a material surface. The vertical finite-differencing
used above the PBL guarantees conservation of the global mass integrals of
potential temperature and total potential plus kinetic energy under frictionless
adiabatic processes.
The equations are horizontally discretized using a staggered latitude-longitude
“C” grid (Arakawa and Lamb, 1977). The scheme for the horizontal
advection terms in the momentum equation conserves potential enstrophy and
gives fourth-order accuracy for the advection of potential vorticity. The
horizontal advection scheme used for the potential temperature is also fourth-order
and conserves the global mass integral of its square. The scheme for the horizontal
advection of water vapor and ozone does not allow the occurrence of negative
values. In all other terms, including the continuity equation, the pressure
gradient force and the definition of absolute vorticity, the differencing
is of second-order accuracy.
The geographical distributions of surface albedo and ground wetness are interpolated
from prescribed monthly means based on the observed climatology.
At present, the UCLA AGCM has a tropospheric version with the top at 50 mb
and a tropospheric-stratospheric version with the top at 1 mb. These versions
can be configured to run at low and high vertical resolutions: 9- and 17-layer
for the tropospheric version, 15- and 29-layer for the tropospheric-stratospheric
version. For each vertical resolution, there are two horizontal resolutions.
The coarse (standard) horizontal resolution has a grid of 5° longitude
by 4° latitude, and the fine horizontal resolution version has a 2.5°
by 2° grid.
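The relative cost of these configurations can be gauged with simple arithmetic. The sketch below counts grid cells as (360° / longitude spacing) × (180° / latitude spacing) × number of layers. This is an approximation: the actual number of latitude rows in the model depends on how the poles are treated, which is not specified here, and the function names are hypothetical.

```python
def grid_columns(dlon_deg, dlat_deg):
    """Number of longitude and latitude intervals on a regular global grid."""
    n_lon = round(360 / dlon_deg)
    n_lat = round(180 / dlat_deg)
    return n_lon, n_lat

def grid_cells(dlon_deg, dlat_deg, n_layers):
    """Approximate number of grid cells for a given resolution."""
    n_lon, n_lat = grid_columns(dlon_deg, dlat_deg)
    return n_lon * n_lat * n_layers

# Coarse tropospheric version: 5 x 4 degrees, 9 layers.
coarse_9 = grid_cells(5.0, 4.0, 9)
# Fine tropospheric-stratospheric version: 2.5 x 2 degrees, 29 layers.
fine_29 = grid_cells(2.5, 2.0, 29)
ratio = fine_29 / coarse_9
```

Under these assumptions, the finest configuration has roughly thirteen times as many cells as the coarsest one, before accounting for the shorter time step that a finer grid normally requires.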
The GCM has been evaluated in a variety of studies including long-term simulations
of monthly mean fields (Suarez et al., 1983; Randall et al., 1985), experimental
medium-range (10-day) predictions (Mechoso et al., 1985; Mechoso et al., 1986),
and assessments of the impact of SST anomalies on the atmospheric circulation
(Mechoso et al., 1990).
APPENDIX B
The GFDL Ocean GCM (OGCM)
The OGCM is based on that developed at the NOAA Geophysical Fluid Dynamics
Laboratory (GFDL)/Princeton University by K. Bryan and M. D. Cox (Bryan, 1969;
Cox, 1984). The OGCM predicts the horizontal velocity, temperature, and salinity.
Density is determined from the temperature and salinity using Knudsen's equation
of state. The model uses depth as the vertical coordinate. The top of the
model is assumed to be a rigid lid. In the horizontal, the equations are discretized
using a staggered longitude-latitude “B” grid (Arakawa and Lamb,
1977).
At present, we are using two versions of the OGCM: a) the Global-OGCM (G-OGCM),
which covers the ocean in the latitude belt from 60°S to 60°N; and
b) the Tropical Pacific-OGCM (TP-OGCM), which covers the Pacific Ocean in
the latitude belt from 28°S to 50°N. The northernmost and southernmost
parts of the domains are relaxed towards the observed climatology in both
salinity and temperature fields. Incorporation of a sea-ice module is also
planned. When this is complete, the southern boundary of the G-OGCM will be
extended to the periphery of the Antarctic continent. Also, we are incorporating
into the G-OGCM a module that simulates the microphysics and photochemistry
of constituents under the influence of fluid motions in a three-dimensional
field (Turco et al., 1989; Toon et al., 1989).
The G-OGCM incorporates realistic bottom topography. The horizontal resolution
is 1° longitude by 1° latitude, and there are 15 unevenly spaced levels
in the vertical. The TP-OGCM has 27 levels in the vertical, with 10 levels
equally-spaced over the upper 100 m. The ocean depth is assumed to be constant
at approximately 4,150 m. In longitude, the resolution is 1°; in latitude,
the mesh size is 1/3° between 10°S and 10°N and increases gradually
toward the poles. Table 1 gives a summary of OGCM timings on a CRAY Y-MP using
one processor.
A crucial part of the OGCM is the parameterization of vertical transports
by sub-grid turbulence. These transports play a major role in distributing
heat and momentum from the surface to the deep ocean. In the original configuration
of the TP-OGCM, representation of turbulence terms in the governing equations
is based on first-order turbulence closure (K-theory), in which the vertical
mixing coefficients are taken to be a function of the local Richardson number
(Pacanowski and Philander, 1981). We have implemented alternative formulations
of the mixing processes, including the Mellor-Yamada level (2–1/2)
second-order turbulence closure scheme (Mellor and Yamada, 1974, 1982). This
scheme adds to the model two additional prognostic equations for turbulence-related
quantities.
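For the first-order closure, the Richardson-number dependence has a simple functional form, sketched below. The expressions follow the form usually attributed to Pacanowski and Philander (1981). The default constants are values commonly quoted for that scheme and should be read as assumptions, since the values used in the TP-OGCM are not stated here. The function name is hypothetical.

```python
def pp81_mixing(ri, nu0=50.0, alpha=5.0, n=2, nu_b=1.0, kappa_b=0.1):
    """Vertical eddy viscosity and diffusivity (cm^2/s) for Ri >= 0.

    nu    = nu0 / (1 + alpha * Ri)**n + nu_b
    kappa = nu  / (1 + alpha * Ri)    + kappa_b
    """
    if ri < 0:
        raise ValueError("this sketch covers only stable stratification (Ri >= 0)")
    nu = nu0 / (1.0 + alpha * ri) ** n + nu_b
    kappa = nu / (1.0 + alpha * ri) + kappa_b
    return nu, kappa

# Mixing is strong where shear dominates (small Ri) and weak where
# stratification dominates (large Ri).
nu_weak, kappa_weak = pp81_mixing(0.0)
nu_strong, kappa_strong = pp81_mixing(1.0)
```

The rapid decay of both coefficients with increasing Richardson number is what confines strong mixing to regions of large vertical shear.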
We have performed a series of multi-year simulations with the uncoupled TP-OGCM
using both the first- and second-order turbulence closure schemes described
above (Ma et al., 1991). These simulations generally produce realistic structure
for the ocean currents and temperature field. The second-order scheme produces
in general deeper mixed layers and sharper thermoclines than the first-order
scheme, particularly in the eastern equatorial Pacific.
APPENDIX C
The CNRI Gigabit Testbed Initiative
The CNRI Gigabit Testbed Initiative is a three-year project for research on
very high speed (gigabit per second) communication networks. The major goals
of this research are to develop architectural alternatives for consideration
in determining the possible structure of a wide-area gigabit network serving
the research and education communities, and to understand the utility of gigabit
networks to the end user.
CNRI's role is to lead a testbed-based research effort consisting of collaborators
from universities, national laboratories, supercomputer centers, and major
industrial organizations. The major activities revolve around a set of five
testbeds: AURORA, BLANCA, CASA, NECTAR, and VISTANET.
The principal research organizations in the CASA wide-area testbed are the
Los Alamos National Laboratory (LANL) in Los Alamos, New Mexico; the California
Institute of Technology (Caltech) and the Jet Propulsion Laboratory (JPL)
in Pasadena, California; and the San Diego Supercomputer Center (SDSC) in
conjunction with UCLA. The carriers collaborating in the CASA testbed are
MCI, Pacific Bell, and U S West. The testbed will connect JPL, Caltech, SDSC
and LANL.
The CASA testbed investigates whether distributed supercomputing over wide-area
high-speed networks can provide new levels of computational resources for
leading-edge scientific problems. In distributing the GCM code we explore
the methodology and performance issues for decomposing scientific simulations
to run concurrently on computers of different architectures.
APPENDIX E
Network Bandwidth Requirements
The organization of computation and communication shown in Fig. 3 is valid
only when the time steps of all three modules are equal. The time steps of
the AGCM/Physics and OGCM, however, are usually much larger than that of AGCM/Dynamics.
Let us focus on the AGCM. For this model, the AGCM/Dynamics goes through time
steps before the next time step of AGCM/Physics starts. The finite-differencing
in the AGCM/Dynamics implies that to advance a domain time steps requires
information from neighboring domains for the same time steps. The scheme in
Fig. 3, however, does not advance neighboring domains more than one time step.
To account for different time steps of the AGCM components, we divide the
AGCM/Dynamics in ( - 1) I/O domains, and latitude bands for each I/O domain.
The resulting organization of the calculation is schematically shown in Fig.
E1, in which we have taken = 4. Arabic characters in Fig. E1 indicate the
time steps in AGCM/Dynamics being executed for the corresponding latitude
band. Here, we have also assumed that the additional data required by the
finite-differencing in a domain of the AGCM/Dynamics is one extra latitude
band, and that there are no communication delays. If is also the ratio between
the time required to compute one AGCM/Physics time step and one AGCM/Dynamics
time step, then there will be no idle time for the processors assigned to
those model components. A similar procedure can be applied to the OGCM. This
method, therefore, allows for the parallelization of the three components
of the Coupled GCM, so that the only wall clock time required to run the model
is that corresponding to the AGCM/Physics. The method can be extended to a
geographically distributed computer environment. In this case, the number
of I/O subdomains has to be increased to account for network delays.
The efficiency of an application distributed across a network is affected
by network delays. If the network covers a wide geographical area, the time
required for communication between remote locations, in particular the time
required for a message to traverse long distances, can become a major issue.
When designing algorithms intended for applications using a wide-area network,
it is imperative to overlap data transmission with computation so that the
cost of communication can be minimized.
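
The benefit of overlapping transmission with computation can be illustrated with a small timing model. This is a minimal sketch under stated assumptions, not the scheduling used in the distributed GCM: it assumes a single processor and a single network channel, and the function names and the per-subdomain times are hypothetical.

```python
def serial_time(compute, transfer):
    """Wall-clock time when each transfer blocks the next computation."""
    return sum(compute) + sum(transfer)


def overlapped_time(compute, transfer):
    """Wall-clock time when the transfer for subdomain i proceeds while
    subdomain i+1 is being computed.

    Assumption: one processor and one network channel; a transfer starts
    as soon as its data exist and the channel is free.
    """
    compute_done = 0.0  # time at which the processor finishes the current subdomain
    link_free = 0.0     # time at which the network channel becomes free
    for c, t in zip(compute, transfer):
        compute_done += c
        link_free = max(link_free, compute_done) + t
    return link_free


# Hypothetical example: four subdomains, 2 s of computation and 1 s of transfer each.
compute = [2.0] * 4
transfer = [1.0] * 4
print(serial_time(compute, transfer), overlapped_time(compute, transfer))
```

In this model, as long as each transfer is shorter than the following computation, only the last transfer adds to the wall clock time. Once transfers take longer than the computation they are meant to hide behind, the network becomes the limiting resource.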
Because large amounts of data need to be exchanged over the network within
a limited time, there is a minimum requirement on the bandwidth of the network.
To study the bandwidth requirement for the distributed GCM, we have to estimate
the flow of data, the network latencies and the time within which data transmission
has to be carried out. Here we obtain an estimate by considering a simple
model of the network and a scenario based on two assumptions:
1. The AGCM/Physics is considered as the slowest component of all three modules
and
2. the time to communicate boundary data for one I/O subdomain is masked by
the AGCM/Physics computation for the next I/O subdomain.
We will also assume that the data produced by AGCM/Dynamics and OGCM for a
particular subdomain are received by AGCM/Physics before the computation of
the next time step for that subdomain begins. This can usually be satisfied by
making , the number of I/O subdomains, large enough. Therefore, this assumption
implies the existence of a limit on the minimal that can be used.
Assumptions (1) and (2) can be expressed as an inequality
(E1)
where is the time required to integrate AGCM/Physics one time step for the
whole globe. The other terms in Eq. (E1) are defined by:
(transmittal time)
(round-trip latency)
(error correction)
(contention delay)
(CPU overhead)
where is the total size of boundary data of the climate model, and is the
bandwidth. The network related parameters and their estimated values for the
projected gigabit network are listed in Table E.1 (Moore 1991). Equation (E1)
determines the minimum bandwidth required for a particular model resolution
and network configuration.
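
The way a masking condition of this kind leads to a minimum bandwidth can be sketched numerically. The sketch below is illustrative only and does not reproduce Eq. (E1): it assumes that the boundary data are split evenly among the I/O subdomains, that all latency, error-correction, contention, and overhead terms can be lumped into one fixed delay per transfer, and that each transfer must finish within the physics time available for one subdomain. The function name, parameter names, and numerical values are hypothetical.

```python
def min_bandwidth(data_bits, t_physics, n_subdomains, fixed_delay):
    """Smallest bandwidth (bits/s) for which one subdomain's transfer is
    hidden behind the physics computation of the next subdomain.

    Assumed masking condition (illustrative, not the original Eq. E1):
        (data_bits / n_subdomains) / bandwidth + fixed_delay
            <= t_physics / n_subdomains
    """
    slack = t_physics / n_subdomains - fixed_delay
    if slack <= 0:
        # No finite bandwidth can satisfy the condition.
        raise ValueError("fixed delays exceed the time available per subdomain")
    return (data_bits / n_subdomains) / slack


# Hypothetical values: 8 Mbit of boundary data, 10 s per physics step,
# 10 subdomains, 0.5 s of fixed delay per transfer.
print(min_bandwidth(8e6, 10.0, 10, 0.5))
```

Under these assumptions a bandwidth exists only when the fixed delays are shorter than the time available per subdomain, and the required bandwidth grows as the fixed delays take up a larger share of that time.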
Table E.1. Parameters for the gigabit/second network.
variable   definition                     units            value
d          distance                       m
           speed in fiber
           packet size
           packet error rate              errors/packet
           routing delays
           contention delays
           fractional protocol overhead
           system overhead
Solving Eq. (E1) for , one obtains the minimum bandwidth required
(E2)
if
(E3)
And the bandwidth efficiency in this case is given by
(E4)
For the current coarse-resolution AGCM coupled to the global OGCM, . Using
one processor on the CRAY Y-MP, is approximately . If we take the number of
I/O subdomains to be 10, and neglect the error rate, the estimated minimum
bandwidth is 7 Mbits/s. This is less than the bandwidth of a T3 network.
As the computing power increases or as the model code is further parallelized,
the minimum bandwidth will increase. To consider the dependence of required
bandwidth on the computing power, we introduce the relationship
(E5)
where is the number of floating point operations per AGCM/Physics step, and
is the execution rate. Examination of Eq. (E4) reveals that the bandwidth
efficiency for a given network configuration ( fixed) is approximately constant
when remains constant. Therefore we consider the case for which the model
resolution will be increased in such a way that will always remain constant.
With Eq. (E5) this implies that increases linearly with . The size of data
transmitted has a fixed relationship with since both are functions of model
resolution. The number of operations does not have a simple linear dependence
on model resolution because vectorization tends to make the increase less
than linear, whereas some operations (matrix inversion, subgrid-scale physics)
tend to increase more than linearly with model resolution. We have assumed
their relationship to be
(E6)
We also assume that the CPU overhead decreases like
(E7)
since the system overhead is expected to decrease with faster computers. Using
Eqs. (E5), (E6), (E7) and assuming to be constant, we can obtain and as functions
of execution rate (Fig. E2). In the figure we also indicated the estimate
for running the high resolution (2.5° longitude by 2° latitude) GCM.
It is clear that with this configuration not only is it necessary to have
a gigabit network but also that the climate model can use it at an efficiency
of 90%.
We have conducted several experiments to analyze the performance of the
distributed climate model using existing networks. In particular we attempted
to quantify the following types of latencies:
• round-trip delay,
• I/O access delays – time spent waiting for I/O requests to be
processed through the CRAY I/O Subsystem,
• CPU access delays – time spent waiting for the UNICOS CPU scheduler
to connect a process already in memory to a CPU, and
• memory access delays – time spent waiting for the UNICOS memory
scheduler to swap the distributed task into memory from disk.
A series of experiments was made to quantify the latencies associated with
each level of access. The task-decomposed GCM was first run on the SDSC CRAY Y-MP
during dedicated time. The same run was then duplicated during production
use of the SDSC Y-MP, illustrating the impact of memory access delays. Finally,
the distributed application was run using both the SDSC and NCAR CRAY Y-MPs
across a T1 link. The dedicated run inside the CRAY used a total CPU time
of 210.22 s for a one-day simulation. The total wall clock time used was only
167.80 s. Therefore 42.42 s was saved for the one-day run by running the
AGCM/Dynamics and OGCM in parallel.
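
The saving quoted for the dedicated run can be checked with a few lines of arithmetic. The sketch below uses only the two times reported above. Treating the total CPU time as the wall clock time of a fully serial run is an assumption made here for illustration.

```python
cpu_time_s = 210.22   # total CPU time for the one-day simulation (reported)
wall_time_s = 167.80  # total wall clock time for the same run (reported)

saved_s = cpu_time_s - wall_time_s         # time hidden by running tasks concurrently
speedup = cpu_time_s / wall_time_s         # relative to a serial run (assumption)
overlap_fraction = saved_s / cpu_time_s    # share of the CPU time that was overlapped

print(f"saved {saved_s:.2f} s, speedup {speedup:.2f}, overlap {overlap_fraction:.1%}")
```

Under that assumption, roughly one fifth of the CPU time was overlapped, which corresponds to a speedup of about 1.25 for the task-decomposed run.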
Interpretation of the results required additional tests. The seemingly low
speed of internal interprocess communication on the CRAY Y-MP was analyzed
by comparing the achievable bandwidth as a function of message size, kernel
buffer size for the TCP/IP protocol, and the number of messages in flight
allowed between acknowledgements (window scaling factor). Table E.2 gives a
series of analyses for both dedicated use of the CRAY Y-MP and for production
use on a fully utilized system.
Table E.2a
Bandwidths as a function of message size and buffer size on a dedicated system
(MBytes/sec).

Message size                  buffer size
    (kB)        16 kB    32 kB    64 kB    128 kB    256 kB
     64         24.8     42.5     53.8      69.3      69.4
     32         24.6     42.4     47.3      46.6      54.8
     16         24.5     37.8     43.0
      8         17.9     30.0     31.0

Table E.2b
Bandwidths as a function of message size and buffer size on a heavily loaded
system (MBytes/sec).

Message size                  buffer size
    (kB)        16 kB    32 kB    64 kB    128 kB    256 kB
     64          6.3     12.5     16.0      14.6      17.9
     32          6.3     14.5     13.3      12.6      14.6
     16          7.1      8.8      8.0
      8          4.4      6.3      6.6

Table E.2c
Bandwidths as a function of window size and buffer size for a 64 kB message
on a dedicated system (MBytes/sec).

window scaling               buffer size
    factor      64 kB    128 kB    256 kB    512 kB
      4         48.9      67.4      83.4      76.5
      2         48.9      67.1      82.1      80.3
      1         48.9      66.9      66.8      66.4
      0         53.8      69.3      69.4

Table E.2d
Bandwidths as a function of window size and buffer size for a 64 kB message
on a heavily loaded system (MBytes/sec).

window scaling               buffer size
    factor      64 kB    128 kB    256 kB    512 kB
      4         10.6      15.4      15.8      15.0
      2         11.1      15.9      15.5      17.3
      1         13.1      16.9      13.4      15.6
      0         16.0      14.6      17.9
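
The two quantities varied in Table E.2, the socket buffer size and the amount of data allowed in flight between acknowledgements, can be illustrated as follows. This is a minimal sketch using the standard Python socket interface, not the UNICOS tools used in the experiments. The function names are hypothetical, and the operating system may grant a buffer size different from the one requested.

```python
import socket


def open_socket_with_buffers(buffer_bytes):
    """Create a TCP socket and request send/receive buffers of the given size.

    Returns the socket and the sizes actually granted by the operating
    system, which may differ from the request.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted_send, granted_recv


def window_limited_throughput(window_bytes, round_trip_s):
    """Upper bound on throughput (bytes/s) when at most window_bytes may be
    unacknowledged at any time."""
    return window_bytes / round_trip_s


sock, snd, rcv = open_socket_with_buffers(64 * 1024)
print(snd, rcv)
sock.close()
```

The second function states the usual bound that a sender limited to a fixed window cannot exceed one window per round trip. This is why larger buffers and window scaling matter most on links with long round-trip delays.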
The bandwidth achieved in the GCM experiments also showed dependencies on the
message size. An average bandwidth of 13.2 MBytes/sec was obtained for a message
of size 5 kB, and 40 MBytes/sec for a message of 270 kB for the dedicated run.
These are consistent with findings in the simple experiments for the default
buffer size of 32 kB.
For the production runs, the communication time has increased by a factor of
1725, which is much larger than the expected increase of a factor of 3-5. The
magnitude of the deterioration in performance is dependent on the size of the
program being executed. On a heavily loaded system, UNICOS will swap jobs out
of memory that are waiting for I/O to complete. The length of time needed to
do the swap is proportional to job size. The time the job remains on disk
is also dependent on the availability of space in memory for swapping the jobs
back in. The test program was much smaller in size than the GCM. Thus the GCM job
waited on disk longer for the I/O to complete, producing a drastically lowered
effective bandwidth. Of interest is the fact that window size scaling provided
no effective enhancement in bandwidth on a heavily loaded system. At SDSC the
I/O load to disk averages 17 MBytes/sec. This contention for I/O resources completely
dominates any time saved by keeping multiple messages in flight between acknowledgements.
The time spent swapped out of memory is greatly increased during the production
run. This additional time represents the contention for access to memory when
time sharing is done.
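
The slowdown seen by the small test program can be read directly from Tables E.2a and E.2b. The sketch below compares the 64 kB message rows of the two tables. Relating these ratios to the expected increase quoted above is our reading of the tables, not a statement from the measurements themselves.

```python
# 64 kB message rows of Table E.2a (dedicated) and Table E.2b (heavily loaded).
buffer_kb = [16, 32, 64, 128, 256]
dedicated = [24.8, 42.5, 53.8, 69.3, 69.4]  # MBytes/sec
loaded = [6.3, 12.5, 16.0, 14.6, 17.9]      # MBytes/sec

# Ratio of dedicated to loaded bandwidth for each buffer size.
slowdown = {kb: d / l for kb, d, l in zip(buffer_kb, dedicated, loaded)}

for kb, ratio in slowdown.items():
    print(f"{kb:>3} kB buffer: {ratio:.1f}x slower under load")
```

For this row the ratios lie between roughly 3.4 and 4.7, far smaller than the deterioration observed for the much larger GCM job.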
The wall clock time of the distributed run is dominated by the time required
to transmit the data over a heavily loaded T1 link. Transmission rates of 10-15
kBytes/sec are observed even for very large files sent across the link. Simple experiments
showed that store-and-forward delays account for a large percentage of the
transmission time.
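
To put the observed rates in perspective, they can be compared with the nominal T1 line rate of 1.544 Mbits/sec. The sketch below assumes that the observed rates are per second and that 1 kByte is 1000 bytes. The 270 kB message size is the one quoted for the GCM experiments, used here only as an example.

```python
T1_BITS_PER_S = 1.544e6            # nominal T1 line rate
observed_bytes_per_s = (10e3, 15e3)  # observed range (assumed per second)

# Fraction of the nominal line rate actually achieved.
utilization = [8 * rate / T1_BITS_PER_S for rate in observed_bytes_per_s]

# Time to move one 270 kB message at the observed rates.
message_bytes = 270e3
transfer_s = [message_bytes / rate for rate in observed_bytes_per_s]

print([f"{u:.1%}" for u in utilization], transfer_s)
```

Under these assumptions the application saw only about 5 to 8 percent of the nominal T1 capacity, so a single 270 kB message would take between 18 and 27 s to cross the link.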
In order to achieve high effective bandwidth utilization across a gigabit/second
link with the present CRAY supercomputer technology, the distributed application
will have to be run on dedicated systems. The development of sophisticated job
scheduling algorithms will be needed to avoid memory access delays and I/O resource
contention delays when such applications are run in competition with production
job mixes.
Several questions can be raised about the ultimate viability of our distributed
application in a realistic environment of geographically distributed computers.
The distributed application will have to compete with other processes on each
individual machine across which it is distributed. The initialization of processes
and the synchronization between them require sufficiently sophisticated interprocess
communication tools, as well as special consideration from system administrators.
These problems will have to be addressed if distributed computing across wide-area
networks is to become an established way of doing scientific research.