The more naive approach under discussion is to just regress temperature on the set of proxies in a multiple regression and look at the regression F statistic. It is entirely possible that the reason why many of the samples did not track regional or global trends is that at any one time local climates may be moving in the opposite direction from either regional or global trends.
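The regression approach mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `regression_f` and `demo`, the 70-year calibration length, the five synthetic proxies and the coefficient of 2.0 are all hypothetical and come from no paper discussed here. It regresses temperature on all proxies at once and returns the overall regression F statistic with its degrees of freedom.

```python
import numpy as np

def regression_f(temperature, proxies):
    """Overall F statistic for an OLS regression of temperature on all proxies.

    temperature: shape (n,); proxies: shape (n, k). An intercept is added.
    Returns (F, df_model, df_resid).
    """
    y = np.asarray(temperature, dtype=float)
    X = np.asarray(proxies, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    sse = float(resid @ resid)                     # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())       # total sum of squares
    ssr = sst - sse                                # explained sum of squares
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return f_stat, k, n - k - 1

def demo(seed=0):
    """Synthetic example (hypothetical data): only the first proxy carries signal."""
    rng = np.random.default_rng(seed)
    proxies = rng.standard_normal((70, 5))
    temperature = 2.0 * proxies[:, 0] + rng.standard_normal(70)
    return regression_f(temperature, proxies)
```

Note that the classical F test assumes independent residuals. With autocorrelated series, which several commenters raise, the nominal critical values would be too lenient, so this sketch shows only the mechanics.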

There is a wide range of trends in both groups of proxies with the Gergis proxy trends, as expected, being on average higher than the not-Gergis proxy trends. The argument could be made that the defect rate would decay as more 10% ranges of the instruments were tested. Nick: However, the belief that a larger ring width is due to a difference in temperature alone is only an assumption. Isn’t he hitting right at the heart of the matter by questioning the lack of a physics/biology based ex ante method of tree selection? I haven’t seen anyone do it yet.
Steve: there are a number of counter-examples.

How often would a paper be published that showed that while the relationship looked good in the training period it failed in verification?
The most famous is the graphic in the “trick to hide the decline” memo. Hypothesize one or more common signals across all proxies, and time-series for each proxy type.” But unstated was that they had been selected from the top 100, out of 10000 calculated, by HS index. Under the same calculations 3 of the 18 not-Gergis proxies would have met the Gergis criteria. And you’re not peeking at output data. Converting tree ring records preceding the instrumental record is the same as converting Fahrenheit to centigrade; it is not an experiment. Yule [1926] did a study on Church of England Marriages. In other words, the uniformity principle does not apply to either or . You don’t assume correlation in the training period – you look for it. And I also hope you acknowledge the risk that the current available body of tree ring proxies may well be biased by what the investigators thought should be the right answer (leading them not to include some that didn’t conform to that preconceived notion). Any procedure that allows you to peek at the output data before selection introduces spurious outcomes. Nor was it one of the criticisms made by Wegman. By performing an initial analysis to identify the most informative features using the entire data set – if feature selection or model tuning is required by the modeling procedure, this must be repeated on every training set. I’ve seen this gotcha often repeated, but never with a counterexample of people who have done it. more or less guarantees a “flat” shaft. The smaller the difference in means, the larger the n required. Gergis is a long way from tree level. Yet I am not finding any other than M&M, which itself isn’t linked directly. From Gergis’ (now hidden) blog…, “The challenge is to help people visualise our future climate and, in turn, provoke a strong emotional response… But the question is, can we process the science fast enough for us to see it in our imaginations?”.
I guess it comes down to which is the cart and which is the horse: 1) Develop an hypothesis for how physical attribute XYZ may be a proxy for temp. One has to recognise that the proper temp reconstruction (hockey stick) consists of the C20/21 instrumental combined with (spliced to, if you like) the proxy shaft. Please tell us how your perspective is not Model B? I never associated any of my own published research with a “strong emotional response” from my putative readers.
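The claim made in this thread, that peeking at the output data before selection introduces spurious outcomes, can be checked with a small simulation. What follows is a minimal sketch, not anyone's published method. The AR(1) coefficient, the correlation cutoff of 0.3, the series counts and the names `ar1_series` and `screening_demo` are all assumptions chosen for illustration. The "proxies" are pure red noise with no temperature signal, yet the composite of the screened series still trends upward in the calibration window.

```python
import numpy as np

def ar1_series(rng, n_series, n_time, phi):
    """Generate AR(1) 'red noise' series that contain no temperature signal."""
    x = np.zeros((n_series, n_time))
    eps = rng.standard_normal((n_series, n_time))
    for t in range(1, n_time):
        x[:, t] = phi * x[:, t - 1] + eps[:, t]
    return x

def screening_demo(seed=0, n_series=1000, n_time=200, n_cal=50,
                   phi=0.5, r_min=0.3):
    """Screen noise series by calibration-period correlation with a rising target.

    Returns (number kept, calibration slope of the screened composite,
    calibration slope of the unscreened composite).
    """
    rng = np.random.default_rng(seed)
    proxies = ar1_series(rng, n_series, n_time, phi)
    target = np.arange(n_cal, dtype=float)      # hypothetical rising "instrumental" record
    cal = proxies[:, -n_cal:]                   # last n_cal steps act as the calibration period
    cal_c = cal - cal.mean(axis=1, keepdims=True)
    tgt_c = target - target.mean()
    # Pearson correlation of each series with the target over the calibration period
    r = (cal_c @ tgt_c) / (np.linalg.norm(cal_c, axis=1) * np.linalg.norm(tgt_c))
    keep = r > r_min                            # the screening step: look at the output first

    def cal_slope(series):
        return float(np.polyfit(target, series[-n_cal:], 1)[0])

    screened = proxies[keep].mean(axis=0)
    unscreened = proxies.mean(axis=0)
    return int(keep.sum()), cal_slope(screened), cal_slope(unscreened)
```

Calling `screening_demo()` returns the count of series that passed the screen together with the two slopes. The screened composite rises in the calibration window by construction, while the unscreened average of the same noise stays close to flat.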

And I’ve asked for a clear specification and never got one. This process could be justified, and it could fairly be called screening by **empirical situation**. Another example of the problem: years or whatever long period they can proxy. Steve: MBH suffered from many problems but did not use selection fallacy screening. 8 papers which deal with tropical Pacific, southern hemisphere, and/or AUS although there are various other papers so the issue is not devoted to SH; still, it could be they were trying to have as large a group of papers as possible for that issue. Again I’m speaking there of the simple NRC model. In fact, it is about the only reason any significant difference exists in the impression given by your figure and that given by MM’s. My observation of the individual proxies is that (1) the proxies do not appear to have that much synchronous response which provides a wide range of possible selections for those playing that game, (2) there is not a line of demarcation between the Gergis and not-Gergis groups of proxies but rather the groups blend one into another and (3) closer examination of those proxies in close proximity show differences in response in parts of the series. 6) Then they analyse ALL the data. Also for the sake of exposition let’s ignore the other proxies that don’t give us direct hindcasts at their respective locations back to this time. I mean, why should you discuss the fact your claimed “properly representative sample” is grossly distorted in relation to what you compare it to due to your methodological changes? And I am having a hard time in understanding the biological rationale for proxy selection. Steve: Nick, this is as simple as upside-down Tiljander. I had a concern about using the rcs function in R in that it asked for the pith offset. And the thread is about selecting proxies from those 62. The authors noted in the article that Cook was planning to incorporate the adjustment into ARSTAN.
AFAICT, the error is in the use of the keyword “calibrate”; that is a bridge too far. “Had they done this in the first place, if it had later come to my attention, I would have objected that they were committing a screening fallacy (as I had originally done), but no one on the Team or in the community would have cared. He said a key step in the study was to establish the relationship between temperature variation and the response of natural systems, such as tree rings and ice-cores, by looking at the period (1920-1990) when yearly temperature records were available. The first thing I saw was the original Mann hockey stick together with the assertion that it was independently replicated by his former student. Haven’t you missed a step? The effect of this is to greatly increase the visual disparity between the MM approach and what you claim is the right approach. If you exclude the instances that don’t calibrate, then all you’ve done is abandon your selection criteria and used anything that happens to calibrate. And yes, she got a meaningless shaft, by design. Roman,

No circularity there. “Gergis is a long way from tree level. b) What is the uncertainty range for the correlation? Because the parameters I would be estimating would be the (unknown, but assumed to exist) temperatures in the past. you are using a probabilistic model as part of your selection procedure). ‘John Quiggin, a seemingly unlikely ally in criticism of methods’. I am an engineer by profession and find this whole issue utterly incomprehensible. This might mean adopting a 99% critical value, a 99.9% or a 99.999% critical value depending on the degree of autocorrelation. In reality, even if one completely dismissed the image in question, both the Wegman Report and your paper would still show MBH was fatally flawed. The latest comment there is here, at this writing. And when she extended the training period to include a test period post-1900 random proxies tracked Hadcrut there too. Even collecting more data can be problematic! Now there do seem to be other faults. As best I can tell (I welcome correction on this point), “infilling via RegEM” simply generates numbers that plausibly fit the pattern of the time series, past the point where actual data do not exist. To make the story go full circle (a condition sought for tree ring work), Japanese scientists did indeed study the growth of children fed gibberellins extracted from rice, ref ferd berple Posted Jun 11, 2012 at 2:37 AM. If the data is conspicuously non-Gaussian, then instead of trimming the data to fit the model, a statistical method that takes this non-Gaussianity into account should be used instead. To seek knowledge of a proxy without dealing with the species not correlating with a ‘predefined’ signal, well it’s not seeking knowledge. If our hypothesis is that they are a valid measure, they should correlate nicely to one another. HAS, Then there’s a whole protocol and lore for finding sites within those areas. So what?
On a tangential issue, it is worth comparing Nick Stokes’ comment “The splicing isn’t a bug, it’s a feature” with RealClimate’s 2004 assertion: “No researchers in this field have ever, to our knowledge, “grafted the thermometer record onto” any reconstruction.” I would like to know where you are coming from so that I would know how to better explain what I am trying to get across to you. However, once we started to check if the observations were consistent with each other or whether our discoveries were artefacts of a statistical bias (e.g. if you look at enough distributions you’ll see a bump in some of them through chance alone) the evidence was weakened and now nobody takes pentaquark evidence particularly seriously. We actually have lots of data with which we can test the degree to which the “uniformity principle” holds for specific types of proxies. It’s true that, e.g., CPS does some joint regression and it gets more complicated.

