Link to a courtesy copy of Dr. Venkatramani Balaji’s PPt. See below for a transcript of the 3 slides of ecosonic interest: “How to Get to Exascale”, “Hardware & Software Challenges”, and “Climate Sci – HPC Challenges”.
(pointer to) the UofT hosted copy of the PPt (???) expected here

Dr. Balaji expects a textbook he has been working on to be published in a couple of years. He also teaches courses in his area at Princeton.

The talk was hosted by Paul Kushner, formerly of GFDL, currently Associate Prof, UT, and W. Richard Peltier, Professor, UT, both with the Atmospheric Physics Group.

When: Tue June 8, 2010; 4:10pm Room TBA [Rm 408, 60 St George St]
Presenter: V. Balaji
Title: Climate Computing: Computational, Data, and Scientific Scalability

Climate modeling, in particular the tantalizing possibility of making projections of climate risks that have predictive skill on timescales of many years, is a principal science driver for high-end computing. It will stretch the boundaries of computing along various axes:

– resolution, where computing costs scale with the 4th power of problem size along each dimension
– complexity, as new subsystems are added to comprehensive earth system models with feedbacks
– capacity, as we build ensembles of simulations to sample uncertainty: both the uncertainty in our knowledge and representation, and that inherent in the chaotic system. In particular, we are interested in characterizing the “tail” of the probability density function (extreme weather), where much of the climate risk resides.
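A back-of-envelope sketch of the resolution bullet above, under one common reading of the 4th-power claim (my interpretation, not spelled out in the abstract): refining the grid spacing by a factor r in each of three spatial dimensions multiplies the point count by r³, and the CFL stability condition shrinks the allowable timestep by roughly the same factor r, so the total cost scales as r⁴.

```python
# Illustrative sketch (my reading of the 4th-power scaling, not from the talk):
# refining the grid by a factor `r` in each spatial dimension gives r**3 more
# points, and the CFL condition forces roughly r times more timesteps.

def relative_cost(r):
    """Compute-cost multiplier for an r-fold refinement of grid spacing."""
    spatial = r ** 3    # more points in x, y, z
    temporal = r        # smaller timestep -> more steps per simulated year
    return spatial * temporal

print(relative_cost(2))  # doubling resolution costs 16x the compute
```

So a model at twice the resolution costs roughly sixteen times as much to run, which is why resolution alone can consume an entire generation of hardware improvement.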

The challenge probes the limits of current computing in many ways. First, there is the problem of computational scalability, where the community is adapting to an era where computational power increases are dependent on concurrency of computing and no longer on raw clock speed. Second, we increasingly depend on experiments coordinated across many modeling centres which result in petabyte-scale distributed archives. The analysis of results from distributed archives poses the problem of data scalability.

Finally, while climate research is still performed by dedicated research teams, its potential customers are many: energy policy, insurance and re-insurance, and most importantly the study of climate change impacts — on agriculture, migration, international security, public health, air quality, water resources, travel and trade — are all domains where climate models are increasingly seen as tools that could be routinely applied in various contexts. The results of climate research have engendered entire fields of “downstream” science as societies try to grapple with the consequences of climate change. This poses the problem of scientific scalability: how to enable the legions of non-climate scientists, vastly outnumbering the climate research community, to benefit from climate data.

The talk surveys some aspects of current computational climate research as it rises to meet the simultaneous challenges of computational, data and scientific scalability.

Slide 31. How to get to exascale
If individual arithmetic processors are going to remain at ~1 GHz (10⁹), how do we get to exascale (10¹⁸)? We need billion-way concurrency!

  • Components of a coupled system will execute on O(10⁵) processors (driver-kernel programming model)
  • There will be O(10) concurrent components coupled by a framework (FMS, ESMF, PRISM)
  • We will reduce uncertainty by running O(10-100) ensemble members.
  • We will use a task-parallel workflow of O(10-100) to execute, process and analyze these experiments (FRE).
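The four levels of parallelism in the bullets above multiply out to the billion-way concurrency the slide calls for. A quick sanity check (the specific values I pick within each O() range are my own assumption):

```python
# Back-of-envelope multiplication of the slide's concurrency levels.
# The exact values chosen within each O() range are assumptions for illustration.

procs_per_component = 10**5  # O(10^5) processors per coupled-system component
components = 10              # O(10) concurrent components under the framework
ensemble_members = 100       # upper end of O(10-100) ensemble members
workflow_tasks = 10          # lower end of the O(10-100) task-parallel workflow

total_concurrency = procs_per_component * components * ensemble_members * workflow_tasks
print(f"{total_concurrency:.1e}")  # 1.0e+09 -> billion-way concurrency
```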

Exascale software and programming models are expected by 2013, hardware by 2018.


Slide 32. Hardware and software challenges

  • We still haven’t solved the I/O problem. (Useful data point: our IPCC-class climate models have a data rate of 0.08 GB/cp-h.)
  • Integrated systems assembled from multiple manufacturers: chips, compilers, network, file systems, storage, might all come from different vendors. Many points of failure.
  • Multi-core chips: many processing units on a single board. Since our codes are already memory-bound, we do not expect to scale out well on multi-core.
  • New programming models may be needed, but are immature: Co-Array Fortran and other PGAS languages, OpenCL.
  • Reproducibility as we now understand it is increasingly at risk: GPU for instance does not appear to have a formal execution consistency model for threads.
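To give the I/O bullet some scale, here is a hypothetical worked example using the quoted 0.08 GB/cp-h data rate. It assumes “cp-h” means core-hours, and borrows the O(10⁵)-processor component size from slide 31; both readings are mine, not the speaker’s.

```python
# Hypothetical I/O volume at the quoted 0.08 GB/cp-h data rate, assuming
# "cp-h" = core-hours and a component running on 10^5 cores (slide 31 figure).

rate_gb_per_core_hour = 0.08
cores = 10**5
hours = 24  # one wall-clock day of simulation

gb_per_day = rate_gb_per_core_hour * cores * hours
print(f"{gb_per_day:,.0f} GB/day")  # 192,000 GB, i.e. ~190 TB per day
```

Even under these rough assumptions, a single large run generates on the order of hundreds of terabytes per day, which is why the I/O and petabyte-archive problems dominate.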

Slide 33. Climate science: HPC challenges

  • Adopt high-level programming models (frameworks) to take advantage of new approaches to parallelism should they become operational.
  • Component-level parallelism via framework superstructure.
  • Approach models as experimental biological systems: single organism or “cell line” not exactly reproducible; only the ensemble is.
  • There is more “downstream” science than there are climate scientists: a scientific scalability challenge.
  • Use curator technology to produce “canned” model configurations that can be run as services on a cloud.