Science (the mag, not the concept) sez:
Science is driven by data. New technologies… blah… publishers, including Science, have increasingly assumed more responsibility for ensuring that data are archived and available after publication… blah… Science’s policy for some time has been that “all data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science” (see www.sciencemag.org/site/feature/contribinfo/)… blah… Science is extending our data access requirement listed above to include computer codes involved in the creation or analysis of data
Well, jolly good. I look forward to them insisting the full code for HadCM3 / HadGEM / whatever is published before accepting any GCM papers using them (which, amusingly, will now include all the papers doing the increasingly fashionable “multi-model” studies using the widely available AR4 data archives).
Come to think of it, it would also prevent S+C (but not RSS?) ever publishing in Science.
[Update: meanwhile, Werner Kraus, not content with being a tosser, has decided that he is an idiot -W]
* One of James / Jules’s posts pushing the appropriate model journal – Geoscientific Model Development.
* Eli comments on Nature’s policy, which is more nuanced.
* “Devil in the details”, Nature 470, 305-306 (17 February 2011), doi:10.1038/470305b: “To ensure their results are reproducible, analysts should show their workings”. Nice Nature article on Genomics trubbles, h/t NB.
A post about “Engineering the Software for Understanding Climate Change” by Steve M. Easterbrook and Timbo “Not the Dark Lord” Johns (thanks Eli). For the sake of a pic to make things more interesting, here is one:
It is their fig 2, except I’ve annotated it a bit. Can you tell where? Yes, that’s right, I added the red bits. I’ve circled vn4.5, as that was the version I mostly used (a big step up from vn4.0, which was horrible. Anecdote: it was portablised Cray Fortran, which had automatic arrays, but real Fortran didn’t. So there was an auto-generated C wrapper around each subroutine passed such things, which did the malloc required. Ugh). vn4.5 was, sort of, HadCM3, though the versioning didn’t really work like that. Although that pic dates vn4.5 to 1999, that is misleading: it was widely used both within and without the Met Office until, well, outside it was still being used when I left in 2007, partly because HadGEM (which as I recall was vn6.0/1, though I could be wrong) was much harder to use. Also the “new dynamics” of vn5.0, although in theory deeply desirable, took a long time to bed in.
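For anyone who hasn’t met the malloc-wrapper trick from the vn4.0 anecdote, here is a minimal sketch of the general idea. It is emphatically not the actual UM code: the routine names (`running_mean`, `running_mean_body`) and the calculation are made up for illustration, and the “Fortran” body is written in C here so the example stands alone. The assumption is that the original routine declared a scratch array sized by an argument (an automatic array, which standard Fortran 77 doesn’t have), so the wrapper allocates that scratch space, passes it in as an extra argument, and frees it afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Fortran routine. In the Cray dialect,
 * WORK(N) would have been an automatic array declared inside the routine;
 * here the scratch space is an explicit extra argument instead.
 * Fortran passes arguments by reference, hence the pointer to n. */
static void running_mean_body(const int *n, const double *in,
                              double *out, double *work)
{
    int i;

    /* work[i] holds the prefix sum in[0] + ... + in[i]. */
    for (i = 0; i < *n; i++)
        work[i] = in[i] + (i > 0 ? work[i - 1] : 0.0);

    /* out[i] is the mean of the first i+1 inputs. */
    for (i = 0; i < *n; i++)
        out[i] = work[i] / (double)(i + 1);
}

/* The kind of wrapper a generator might emit: it keeps the interface the
 * callers expect (no work array), does the malloc, calls the body, and
 * frees the scratch space. Returns 0 on success, -1 if malloc fails. */
int running_mean(const int *n, const double *in, double *out)
{
    double *work;

    assert(n != NULL && *n > 0);

    work = malloc((size_t)*n * sizeof *work);
    if (work == NULL)
        return -1;

    running_mean_body(n, in, out, work);

    free(work);
    return 0;
}
```

Callers only ever see `running_mean`, so the rest of the code can pretend automatic arrays exist. The price is a malloc and free on every call, plus a layer of generated code between you and the routine you are trying to debug.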
Note: you should also read “Amateurish Supercomputing Codes?” and the interesting comments therein.
Via mt I find
too much of our scientific code base lacks solid numerical software engineering foundations. That potential weakness puts the correctness and performance of code at risk when major renovation of the code is required, such as the disruptive effect of multicore nodes, or very large degrees of parallelism on upcoming supercomputers 
The only code I knew even vaguely well was HadCM3. It wasn’t amateurish, though it was written largely by “software amateurs”. In the present state of the world, this is inevitable and bad (I’m sure I’ve said this before). However, the quote above is wrong: the numerical analysis foundations of the code were OK, as far as I could tell. It was the software engineering that was lacking. From my new perspective this is painfully obvious.
[Update: thanks to Eli for pointing to http://www.cs.toronto.edu/~sme/papers/2008/Easterbrook-Johns-2008.pdf. While interesting, it does contain some glaring errors (to my eye) which I’ll comment on -W]