You are currently browsing the daily archive for June 9, 2010.

Last updated: Tue, June 15, 2010

link to courtesy copy of Dr. Venkatramani Balaji’s PPt. See Talk Announcement post for transcription of 3 slides of ecosonic interest – “How to Get to Exascale”, “Hardware & Software Challenges”, “Climate Sci – HPC Challenges”
(Pointer to) UofT-hosted copy (???) of PPt expected on talk + abstract page

Not having the requisite sci background, I’ll skip the technical core (a huuuuge pity!) and mention a few points of ecosonic interest that Dr Venkatramani Balaji brought up. I’ll also add my related search findings concerning the epistemic flows among actors of various import on the climate science and climate change stage.

Re scientists and research units of any size “talking” to each other
VB asked, How do you make climate models usable by a large number of people? (for the purposes of this section, I’m assuming |people=climate scientists|; re non-climatologists and non-scientists, see further below, VB’s “people” likely included those as well – per post-talk exchange; see note 1 re disciplinary labels)

The efforts of the Modelling Systems Group at GFDL, which he heads, focus, among other things, on developing metadata “in view of facilitating data management of large national and international modeling campaigns such as the IPCC”. In principle, they would be facing the challenges to the standardization of (semantic web) ontologies, which, e.g., University at Buffalo philosopher Barry Smith analyzes in detail. (cf. a pres I gave at UT a couple of years ago)

On the eco-consonant side, climate model metadata standardization has advanced thanks to the Climate and Forecast Metadata Convention, which is adopted or “encouraged” by a number of research institutions in North America and Europe. Among the 33 institutions and projects listed are, e.g., the SeaDataNet partnership, dubbed “Pan-European infrastructure for ocean and marine data management”, which currently has 49 partners and also plans to provide university-level training; Humboldt (an EU project based in Germany); the University Corporation for Atmospheric Research’s (UCAR) North American Regional Climate Change Assessment Program; and the NERC RAPID THCMIP (Thermohaline Circulation Model Intercomparison Project) at the Natural Environment Research Council, UK. GFDL also uses the Convention, though it’s not on the posted list.
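To make the idea of shared metadata a little more concrete, here is a minimal sketch in Python. It assumes only that a CF-style description attaches attributes such as standard_name and units to each variable; the helper missing_attributes and the sample dictionaries are hypothetical illustrations, not part of the Convention or of any GFDL tool, and the real Convention has many more rules than this toy check.

```python
# Hypothetical, simplified check: it only looks for two attribute names
# that CF-style metadata commonly attaches to a variable.
CHECKED_ATTRIBUTES = ("standard_name", "units")

def missing_attributes(variable_attrs):
    """Return the checked attribute names absent from a variable's metadata."""
    return [name for name in CHECKED_ATTRIBUTES if name not in variable_attrs]

# Illustrative metadata for two model output variables (values are examples).
described = {"standard_name": "air_temperature", "units": "K"}
undescribed = {"long_name": "temp at 2m"}

print(missing_attributes(described))
print(missing_attributes(undescribed))
```

The point of the sketch is only that a second research group can test mechanically whether a dataset says what it contains and in which units, instead of asking the group that produced it.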

Re scientists and technologies they rely on
On the eco-dissonant side, VB mentioned among the challenges the fact that, e.g., integrated systems are assembled from multiple manufacturers, and enumerating the components used, he pointed out that this makes for numerous potential “points of failure” (Slide 32). Also, that “new programming models may be needed but are immature” (ibid.).

Re the computational complexity of models
A re-visit to my “chronic” query – computational complexity as regards manageability of 1) current (scientific and technological) knowledge 2) incoming (climate) data and new (scientific and technological) knowledge (cf. Hansen’s concern over current epistemic gaps). E.g., re 1) VB pointed out that “[e]xascale [meaning in the range of 10^18 and up] software and programming models are expected by 2013, hardware by 2018”, though at present exceeding the 1 GHz limit is impossible for individual arithmetic processors and the components of a coupled system would stay at 10^5 processors; so I wonder if fulfilling the 2013/2018 projections would be a prerogative of quantum computing under development. (see expectations of Canada’s Perimeter Institute) Re 2), my question is, How easy is it for the design of models to open up options for incorporating a new variable if/when necessary – without breaking stride? Would it be a piece of cake, adding/changing a few lines of code, or would reworking the program, a new software package edition, be in order?

After all, global models failed to predict the recent rapid Arctic sea ice loss, according to leading US climate scientist Jim Hansen, who’s been dealing with modelling for over 30 years. (It seems this could easily be a case of 1) or 2) posited above, or both (?)) He notes in his 2009 book Storms of My Grandchildren that “[e]ven as our understanding of some feedbacks improves [basically he is saying global models are good for known feedbacks], we don’t know what we don’t know – there may be other feedbacks. Climate sensitivity will never be defined accurately by models.” (p.44, emphasis mine) Thus, he places models as a heuristic below paleoclimate studies and ongoing observations, even as he acknowledges that they have their uses. [see Note on Climate Models (still to do)]

Surely, [June 28 update: in view of what models can handle within existing human climate knowledge, and the (at least theoretical) possibility to prompt human discovery, not intended by the design] Hansen’s “never” would depend on how advanced the technology that handles (in most general terms) variables is? Would it necessarily be an oxymoron to be programming for “what we don’t know”? After all, isn’t the “unknown” (un)consciously implicated in assessing the “probability” of something happening, which some models do? And more interestingly for the process of scientific discovery, [substitution June 28: following a deductive thought process,] would what is/may be part intuition/intuitive expectation help climatologists to put their finger on previously unknown/unacknowledged factors that contribute to the picture, [update June 28: whether the eye-opener comes independently of or through the modelling technology]? (cf. the experience of discovery of cytogeneticist and Nobel prize winner Barbara McClintock, e.g., per bio by E. Fox Keller, 1983 – see note 2 below)

[June 26, 2010 update: I cannot believe I did not put this down – How close is modelling to being able to program for what a human can get out of ongoing observations and paleoclimate data. Are the connections humans can make way too loose to be formalized? Really, how up-to-date is what gets into a model – if according to Hansen “ongoing observations”, (alongside paleo data) would make a difference in weighing climate sensitivity?
There ARE paleoclimate models, currently, but are they conversing (well) with programs for current predictions?]

Re the human actors’ mindset
It seems, then, that a big part of a good climate scientist’s mindset would be to handle equally well emergent and existing knowledge. This state of affairs would clash with “computer logic” to the extent that software models operate with closed sets of options – the knowledge they host would be part of what their designers know/have access to. Also, it could be that a climate scientist’s attention is trained on identifying patterns of behaviour of feedbacks (from observations and historically) and how well a model recreates history and anticipates future behaviour, whereas that of a programmer or a software engineer would be targeting what’s wrong with how a computer executes a program rather than with the truth value of the science fed into the program.

Oops, essentializing and dichotomizing? [take a look at update June 18 – June 23, 2010, below] To the extent that it would help design a workable ecosonic model of Human-Human and Technology-Human Relatedness, yes. Noam Chomsky called this operation (unavoidable for him, fallacious for others) “idealizing the data” (in formal linguistics, I have to add).

So, How can the two communities find a common “language” (used metaphorically, I do not mean Python, etc.) to build climate models that work, and do it well, at the same time spurring on technology to match ongoing developments in science? Added to that is the possibility that the options presented by a technology can inspire (serendipitously or otherwise) innovative ways to approach the “science” itself. Form influencing content, in most general semiotics terms.

Re grist in the mill of disseminating and passing on scientific knowledge
Lots of wisdom and communication mastery are needed to achieve eco-consonance in the case of communicating science to the public (which, I’d imagine, would include non-climate scientists, who’d want the “results”, not the “process”), ditto passing on this knowledge – in addition to knowledge of climate science per se – to future generations of scientists.

In all www evidence, Princeton University’s Cooperative Institute for Climate Science (CICS) is already scoping out the former, while doing an excellent job of the latter. (see post A Whole Prof All to Yourself) In a quick post-lecture exchange, VB mentioned that GFDL (or CICS?) has included in the past seminars (incl. for grad students?) on how to communicate with the public.

These skills would be mandatory for the purposes of providing decision support, as Stanford’s Stephen Schneider amply illustrates in his 2009 book Science as a Contact Sport, including in the context of negotiations for IPCC4’s WGII (check Vocab post) report between scientists (himself, a.o.) and government reps (he “converted” Kenya’s, if I remember correctly, delegate). Schneider records, with well timed anecdotal relief, the draining mind+rhetorics battles over, e.g., what scientists vs. policymakers mean by “confidence” vs. “high confidence”, “likely” vs. “very likely” as applied to CC, which “inspired” him to propose percentage quantification. (in a similar vein, see the Vocab post entry re IPCC’s euphemistic/diplomatic use of “climate change” in lieu of “global warming”)

Intermediate conclusions
A wide-range epistemic transfer, exchange rather, is needed within and between climate science and the software engineering/technological domain for quality knowledge production, with special attention to exchanges between scientists and future scientists.

Since climate knowledge production depends on modelling technologies, it would seem, as much as on data observation and paleoclimate studies, and since, conversely, climate modelling depends as much on climate knowledge as on technological competence, it would be extremely beneficial to prioritize close collaboration between climatologists and software engineers. In addition to the technical, programming side, model-building will crucially benefit from a legacy of requirements engineering and standardization methodology.

To the observation that what SE has developed over and above programming may be/is largely irrelevant to climatologists, I’d advocate considering “translating”, not “transferring” that legacy, thus:

* * * * * * * * * * A Sub-branch of SE – perhaps? * * * * * * * * * *

As regards decision support, it requires accurate and adequately selected knowledge, and accessibly and diplomatically executed knowledge exchange among climate scientists, and between them and non-climate scientists, humanitarians, policymakers, the general public… Think of what is involved in preparing the Assessment Reports of IPCC’s WGI (scientists), and WGII and III (interfaces with economists, policymakers et al.) and, ultimately, in the Synthesis Reports, based on the work of all three working groups.

If any representatives of (any of) the non-climatologist demographic groups above are to be made privy to “wassup with climate”, how are they not a “client” whose needs and background should be taken into account? Closer to the core of science, if the output of models were to count on a par with – if not census data – then official (scientific) publications, in the public domain and with proper credit and responsibility assigned, then any climate-savvy external scientist not privy to the workflow of a research community of any size, would also be a client, highly demanding, at that. Plus, even if a climatologist is designing the program for her-/himself, they are, after all, following some tacit requirements, as their own “client”. [check if Dr Balaji/s.o. else has a graphic representation of the varying scope of climate models req’s engineering, which I am assuming cannot equal zero, even when “doing it for oneself”.]

Educating for climate-science-and-software-engineering hybridity
In the education section, I’d like to mention that Dr Balaji is expecting the publication of a textbook he’s been working on, 1-2 years from now. He also develops courses he teaches at Princeton.

What caught my attention was his emphasis on there being much more “downstream science” than there are scientists prepared to meet the climate analysis demand. He identifies this as a scientific scalability challenge. (Slide 33)

Currently, at GFDL, which includes Dr Balaji’s Modelling Systems Group, I do not recall coming across a profile which explicitly features formal training in software engineering.

Dr Balaji liked my idea of “hybrid” education, organically interfacing climate science and software engineering. In my terminological taxonomy, “hybridity” would mean superseding multidisciplinarity, and moving from the interdisciplinarity on to the transdisciplinarity stage. That is, going beyond “merely” juxtaposing self-contained disciplines (multi = Lat. “many”), and proceeding with epistemic exchanges (inter = Lat. “between”), and even with disciplinary merger (trans = Lat. “across”, “through”). [should link to my presentation on ***-disciplinarity]

However, the “linguistic” (metaphorically speaking) correspondences between the two communities and their epistemologies regarding modelling are far from obvious.

In principle, to the extent that the “language barrier” between an A and a B is overarchingly disciplinary, setting aside individual psychological specificity, “translation” between two distinct disciplinary mindsets may pose a problem, and “climate science” itself IS a crossroads of multiple disciplines, which multiplies the potential barriers to eco-consonant relationships. What happens with the addition of yet another player, software engineering (SE)? One mindset predicated on an open – and perhaps undefinable – set of options (the “not-knowing-the-unknown” problem per Hansen above), and another on an explicitly defined set of options (remember the colloquial idiom “engineering solution”?). Request: Pls keep in mind that this is only an abstract ecosonic model, in this particular case, also playing with stereotypes, certainly not meant to reflect the various degrees and shades, especially as related to specific (not excluding exceptional!) individuals 🙂

If “translation lossy-ness” is what has so far prevented (it would seem, desirable, and urgent!) closer collaboration between climatologists and software engineers, then perhaps a good motivator for more extensive epistemic exchanges could be the opportunity to slim down each other’s Zones of Proximal Development? (see note 3)

Once the leadership by already accomplished engineers and climate scientists is in place, carefully thought-out hybrid university-level (grad?) curricula/programs may be the path to developing the requisite “linguistic” skills of the future generations in the making. After all, in view of current IPCC estimates, well prepared climate-tackling scientists, including meta-science communication talent, would be needed for at least a century from now, moreover in top-priority mode.

The catch in this uplifting scenario? See …From Contact to In-tact Sport post.

NOTE 1: In this post I am using “climate science”/”climatology” as a shorthand for a variety of disciplines involved in the study of climate, as represented, e.g., in the research profiles at GFDL – geophysics, atmospheric physics and chemistry, oceanography.

NOTE 2: Because this definitely is a book worth reading for those interested in scientific discovery, the exact bibliographic info:
Evelyn Fox Keller. (esp. “Chapter Three: Becoming a Scientist.” In) A Feeling for the Organism: the life and work of Barbara McClintock. San Francisco: W.H. Freeman. (1983)

For the other references, pls see post Researched CC Material (Ongoing)

NOTE 3: Widely recognised psychologist and pedagogue Lev Vygotsky postulated that what he termed “Zone of Proximal Development” constitutes the difference between a student’s capacity to arrive at a solution on his/her own and the capacity to do so with the help of a more experienced teacher/adult… He also recommended giving students assignments in the ZPD, to encourage cognitive development. I couldn’t agree more, and would emphatically extend his recommendation to Any Learning Context at Any Stage in Life.

update June 18 – June 23, 2010
It is important to stress that technology is not devoid of scientific “texture”, and that science and technology are team mates, rather than rivals. Hence, the term “technoscience” (see Vocab entry). In establishing patterns and physical dependencies between abiotic, moreover artefactual, units, software engineering, like e.g. cybernetics, is very much the counterpart of biology, medicine, whose units are biotic, which until not that long ago also entailed “non-artefactual”. But with cyborgs on the rise… (see Vocab entry)

It must be the “applied” part of it that gave engineering the meaning in the idiom “engineering solution” (“mechanistic”, “operational”), e.g., looking to “fix” rather than “explain”, and (historically) kept it from entering university curricula until late in the 19th century, in North America, at least.

Last updated: Tue, June 15, 2010

It turns out grad students’ paradise exists. A higher than 1 to 1 prof – student ratio at the Cooperative Institute for Climate Science (CICS), founded in 2003 as the outcome of a 40-year long collaboration between Princeton University’s Atmospheric and Oceanic Sciences Program and NOAA’s Geophysical Fluid Dynamics Laboratory (GFDL).

I paste below part of the CICS mission statement – it reads like a spell-out of ecological thinking:

…the co-evolution of society and the environment – integrating physical, chemical, biological, technological, economic, social and ethical dimensions of climate change – and … educating the next generations to deal with the increasing complexity of these issues. (emphasis mine)

As of Jan 29, 2010, the CICS People page lists 32 GFDL-based and 12 Princeton-based members, of whom 26 research, 12 GSs, 1 visitor, 5 admin.

Of note are several seminar series (STEP has some PPts online):

  • GFDL seminars – upcoming
  • Geosciences series
  • STEP (= Science, Technology and Enviro Policy; within the Program headed by Prof. Michael Oppenheimer), whose speakers include academics working in areas related to climate and the environment, policy makers, science writers – going back to 1999.
  • Biochemistry

The research data above were gathered in prep for and subsequent to a talk by Prof. V. Balaji, head of GFDL’s Modelling Systems Group, hosted by UofT’s Physics Dept. (Atmospheric Physics group) on June 8, 2010 [see ES post]

link to courtesy copy of Dr. Venkatramani Balaji’s PPt. see below for transcript of 3 slides of ecosonic interest – “How to Get to Exascale”, “Hardware & Software Challenges”, “Climate Sci – HPC Challenges”.
(pointer to) the UofT hosted copy of the PPt (???) expected here

Dr Balaji is expecting the publication of a textbook he’s been working on, in a couple of years’ time. He also teaches courses in his area at Princeton.

The talk was hosted by Paul Kushner, formerly of GFDL, currently Associate Prof, UT, and W. Richard Peltier, Professor, UT, both with the Atmospheric Physics Group.

When: Tue June 8, 2010; 4:10pm Room TBA [Rm 408, 60 St George St]
Presenter: V. Balaji
Title: Climate Computing: Computational, Data, and Scientific Scalability

Climate modeling, in particular the tantalizing possibility of making projections of climate risks that have predictive skill on timescales of many years, is a principal science driver for high-end computing. It will stretch the boundaries of computing along various axes:

– resolution, where computing costs scale with the 4th power of problem size along each dimension
– complexity, as new subsystems are added to comprehensive earth system models with feedbacks
– capacity, as we build ensembles of simulations to sample uncertainty, both in our knowledge and representation, and of that inherent in the chaotic system. In particular, we are interested in characterizing the “tail” of the pdf (extreme weather) where a lot of climate risk resides.

The challenge probes the limits of current computing in many ways. First, there is the problem of computational scalability, where the community is adapting to an era where computational power increases are dependent on concurrency of computing and no longer on raw clock speed. Second, we increasingly depend on experiments coordinated across many modeling centres which result in petabyte-scale distributed archives. The analysis of results from distributed archives poses the problem of data scalability.

Finally, while climate research is still performed by dedicated research teams, its potential customers are many: energy policy, insurance and re-insurance, and most importantly the study of climate change impacts — on agriculture, migration, international security, public health, air quality, water resources, travel and trade — are all domains where climate models are increasingly seen as tools that could be routinely applied in various contexts. The results of climate research have engendered entire fields of “downstream” science as societies try to grapple with the consequences of climate change. This poses the problem of scientific scalability: how to enable the legions of non-climate scientists, vastly outnumbering the climate research community, to benefit from climate data.

The talk surveys some aspects of current computational climate research as it rises to meet the simultaneous challenges of computational, data and scientific scalability.
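A quick worked example of the resolution axis in the abstract above. It takes the stated 4th-power scaling at face value; the function name cost_factor is a hypothetical label for this illustration, and reading the four powers as three space dimensions plus the time step is an assumption, since the abstract does not spell it out.

```python
def cost_factor(refinement, power=4):
    """Relative computing cost when the grid is made `refinement` times finer."""
    return refinement ** power

# How much more computing a finer grid would demand under 4th-power scaling.
for refinement in (2, 4, 10):
    print(f"{refinement}x finer grid -> {cost_factor(refinement)}x the computing cost")
```

Under this scaling, merely halving the grid spacing calls for sixteen times the computing, which is why resolution alone can absorb several hardware generations.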

Slide 31. How to get to exascale
If individual arithmetic processors are going to remain at ~1 GHz (10^9), how do we get to exascale (10^18)? We need billion-way concurrency!

  • Components of a coupled system will execute on O(10^5) processors (driver-kernel programming model)
  • There will be O(10) concurrent components coupled by a framework (FMS, ESMF, PRISM)
  • We will reduce uncertainty by running O(10-100) ensemble members.
  • We will use a task-parallel workflow of O(10-100) to execute, process and analyze these experiments (FRE).

Exascale software and programming models are expected by 2013, hardware by 2018.

[What got pasted here from the PDF file as O, to my best surmise 🙂 😦 would be notation for processor speed/frequency (if Hz) – but what is the character? Lynne]
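The arithmetic on Slide 31 can be checked in a few lines. This is a sketch under two assumptions that are not on the slide: that O(...) is read as "on the order of", and that each processor contributes roughly one operation per clock cycle, so that the four layers of parallelism simply multiply. The dictionary keys are descriptive labels chosen for this illustration.

```python
CLOCK_HZ = 10**9    # ~1 GHz per arithmetic processor (from the slide)
EXASCALE = 10**18   # target operations per second (from the slide)

# Concurrency needed if each processor does about one operation per cycle.
needed = EXASCALE // CLOCK_HZ

# (low, high) order-of-magnitude estimates for each layer listed on the slide.
layers = {
    "processors per component": (10**5, 10**5),
    "coupled components": (10, 10),
    "ensemble members": (10, 100),
    "workflow tasks": (10, 100),
}

low = high = 1
for layer_low, layer_high in layers.values():
    low *= layer_low
    high *= layer_high

print("concurrency needed:", needed)
print("concurrency offered: between", low, "and", high)
```

On this reading the slide's four layers multiply out to somewhere between a hundred million and ten billion concurrent units, which brackets the "billion-way concurrency" the slide asks for.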

Slide 32. Hardware and software challenges

  • We still haven’t solved the I/O problem. (Useful data point: our IPCC-class climate models have a data rate of 0.08GB/cp-h).
  • Integrated systems assembled from multiple manufacturers: chips, compilers, network, file systems, storage, might all come from different vendors. Many points of failure.
  • Multi-core chips: many processing units on a single board. Since our codes are already memory-bound, we do not expect to scale out well on multi-core.
  • New programming models may be needed, but are immature: Co-Array Fortran and other PGAS languages, OpenCL.
  • Reproducibility as we now understand it is increasingly at risk: GPU for instance does not appear to have a formal execution consistency model for threads.

Slide 33. Climate science: HPC challenges

  • Adopt high-level programming models (frameworks) to take advantage of new approaches to parallelism should they become operational.
  • Component-level parallelism via framework superstructure.
  • Approach models as experimental biological systems: single organism or “cell line” not exactly reproducible; only the ensemble is.
  • There is more “downstream” science than there are climate scientists: a scientific scalability challenge.
  • Use curator technology to produce “canned” model configurations that can be run as services on a cloud.
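The "cell line" point on Slide 33 (a single run is not exactly reproducible, only the ensemble is) can be illustrated with a toy chaotic system. The sketch below uses the logistic map purely as a stand-in; it is not a climate model and not anything taken from the talk, and the function names, seeds and ensemble size are hypothetical choices for this illustration.

```python
import random

def trajectory(x0, steps=100):
    """Iterate the chaotic logistic map x -> 4x(1-x), a toy stand-in for a model run."""
    values = []
    x = x0
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        values.append(x)
    return values

def ensemble_mean(seed, members=2000, steps=100):
    """Average the final state over many runs started from random initial states."""
    rng = random.Random(seed)  # fixed seed so each ensemble can be regenerated
    finals = [trajectory(rng.uniform(0.1, 0.9), steps)[-1] for _ in range(members)]
    return sum(finals) / members

# Two single runs whose starting points differ by one part in ten billion.
run_a = trajectory(0.3)
run_b = trajectory(0.3 + 1e-10)
largest_gap = max(abs(a - b) for a, b in zip(run_a[50:], run_b[50:]))

# Two independent ensembles, built from unrelated random starting points.
mean_1 = ensemble_mean(seed=1)
mean_2 = ensemble_mean(seed=2)

print("largest gap between the two single runs:", largest_gap)
print("ensemble means:", mean_1, mean_2)
```

The two single runs drift far apart despite nearly identical starting points, while the two ensemble averages land close to each other, which is the sense in which only the ensemble is reproducible.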




© CreativeCommonsLicense


accurate quoting proper attribution by/on ES & of ES