Today, I’m going to discuss a couple of points arising out of Fleitmann et al 2007, a discussion of speleothems in western Asia. Most notably, I’m going to discuss a “statistical” calculation made in this article, where the authors relied on a “visual” interpretation of noisy time series, which readily permit an alternative interpretation. (The reference to Fleitmann et al 2007 arises out of prior discussion of speleothem dO18 series; the comment today is not a direct response to that discussion, it’s a standalone comment prompted by my reading of the article.)

The abstract to Fleitmann et al says (the bolding of the *statistical* term anti-correlation is mine):

For the late Holocene there is an **anti-correlation** between ISM precipitation in Oman and inter-monsoon (spring/autumn) precipitation on Socotra, revealing a possible long-term change in the duration of the summer monsoon season since at least 4.5 ka BP. Together with the progressive shortening of the ISM season, gradual southward retreat of the mean summer ITCZ and weakening of the ISM, the total amount of precipitation decreased in those areas located at the northern fringe of the Indian and Asian monsoon domains, but increased in areas closer to the equator.

One of their conclusions states (my bold):

During the late-Holocene, the **observed anti-correlation** between monsoon precipitation in Southern Oman and inter-monsoon (spring/autumn) precipitation on Socotra possibly reveals that the mean seasonal cycle of the ISM has also changed in response to insolation forcing.

This anti-correlation is discussed and illustrated in the running text as follows:

In order to better reveal the anti-correlation between precipitation in Southern Oman and Socotra in greater detail, we compared the overlapping d18O profiles of stalagmites Q5 and D1 (Fig. 9). This comparison shows a visible anti-correlation, with higher monsoon precipitation in Southern Oman coinciding with lower intermonsoon (spring/autumn) precipitation on Socotra.

Their Figure 9 is shown below. Observe that the term “strong” is added to anti-correlation in the caption comment: “Note the **strong** visible anti-correlation between D1 and Q5.”

Elsewhere the authors say of the relationship between Q5 and D1:

“In contrast to the gradual decrease in ISM precipitation recorded in stalagmite Q5 from Southern Oman, the D1 stalagmite d18O record from Socotra shows a long-term increase in inter-monsoon (spring and autumn) precipitation, as indicated by a long-term decrease in d18O since ~4.4 ka BP (Figs. 4 and 7). This anti-phase behavior between Southern Oman and Socotra seems to be apparent not only on millennial but also on multi-decadal timescales.”

They posit the following theoretical explanation for the “strong anti-correlation” between Q5 and D1:

Because the onset and termination of the summer monsoon on Socotra determines the end of the spring and start of the autumn precipitation season respectively (as it is in Eastern Africa; Camberlin and Okoola, 2003), it is **conceivable** that a gradual shortening in the length of the summer monsoon season resulted in an extension of pre- and post-monsoon rainy season precipitation on Socotra.

and suggest that their results may affect many interpretations:

A later advance and earlier retreat of the ITCZ in spring and autumn, respectively would lead to an on average shorter monsoon season over Southern Oman. If the movement of the ITCZ took place more slowly, a second effect would be a longer spring/autumn rainfall season over the northern Indian Ocean and Socotra, respectively. Provided that our interpretation is correct, our results should have far-reaching consequences for the interpretation of many monsoon proxy records in the ISM and other monsoon domains, as most paleoclimate time series are commonly interpreted to reflect overall monsoon intensity, but not the duration of the monsoon season.

**The “Strong Anti-Correlation”**

By now, most CA readers have seen a lot of noisy climate time series side by side and learned not to place a whole lot of weight on non-quantitative characterizations of statistical data. And Fleitmann et al provide no quantitative analysis to support their claims of “strong anti-correlation”; the edifice is built on the visual impression of their Figure 9, which, quite frankly, conveys an impression of noise to me, rather than “strong anti-correlation”.

Establishing correlations between two time series with irregularly spaced intervals and with uncertainties in the time calibration of both series is non-trivial. And establishing that an anti-correlation between two such series is “strong” or significant is also not all that easy.

The first step is to collect the data.

The Qunf Q5 data is archived at WDCP here ftp://ftp.ncdc.noaa.gov/pub/data/paleo/speleothem/asia/oman/qunf2007.txt.

The Dimarshim D1 data is not archived at WDCP. In early September, I sent two polite requests for data to Fleitmann coauthor, Stephen Burns, neither of which received a reply. (Burns is in Bradley’s department at U Mass and I guess that responding to such an email would be as bad as saying the name of He Who Must Not Be Named).

But the Dimarshim D1 (as plotted in Fleitmann Figure 7 #13) matches Mann’s Socotra dO18 series, where it has been interpolated to annual data. As a temperature proxy, Mann uses the Socotra dO18 series with positive dO18 up, while Fleitmann 2007, using the series as a monsoon proxy, uses it with negative dO18 up – the same orientation as Dongge, Wanxiang and Heshang as monsoon proxies. (As a note to a CA reader, Fleitmann accordingly does not *rationalize* the basis for Mann using Socotra/Dimarshim D1 oriented oppositely to Dongge; that issue remains outstanding.)

Between Mann’s version of Dimarshim D1/Socotra dO18 and the WDCP version of Qunf Q5, we have some tools for beginning a more quantitative approach to evaluating the “strong anti-correlation”. First, here is a plot of the two data sets (as I’ve used them), which can be compared to Fleitmann Figure 7 to confirm that we’re talking apples-and-apples.

A first obvious impression from this picture is that the Qunf Q5 speleothem has suffered a substantial hiatus. I plotted up a graph doing age-depth comparisons for the two speleothems, which have quite different patterns. So the comparison between the two speleothems is far from ideal on the above basis, and it seems to me that one would like to compare two speleothems with more comparable growth histories before investing too much energy in elaborate conclusions. However, that’s not the main point here.

Second, given that there are uncertainties in the dating of both series, it doesn’t take a lot of imagination to wonder whether the wiggles might match a bit better with dating alternatives within the acknowledged error margins of the dating. In order to test this data, I examined the subset of overlapping data (from about 2700-4700 BP), interpolating the Qunf Q5 data to annual series (as Mann had done with the Dimarshim D1 series) and calculated the correlation between the two series (r= -0.08). This is a pretty rough-and-ready methodology, but I think that it is less bad than a “visual” comparison. The residuals from the linear fit (corresponding to the correlation calculation) were highly autocorrelated.
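For readers who want to reproduce this sort of rough-and-ready calculation, here is a minimal sketch of the interpolate-to-annual-then-correlate step. The series below are synthetic stand-ins; the real calculation would load the archived Q5 data and the Mann version of D1.

```python
# Sketch of the rough-and-ready approach described above: interpolate two
# irregularly sampled d18O series onto a common annual grid, then compute
# the Pearson correlation on the overlap. The series here are synthetic
# stand-ins for the real Q5/D1 data.
import numpy as np

def annual_correlation(t1, y1, t2, y2):
    """Interpolate both series to annual steps over their overlap and correlate."""
    lo = max(t1.min(), t2.min())
    hi = min(t1.max(), t2.max())
    grid = np.arange(np.ceil(lo), np.floor(hi) + 1)  # annual steps (years BP)
    a = np.interp(grid, t1, y1)                      # linear interpolation
    b = np.interp(grid, t2, y2)
    return np.corrcoef(a, b)[0, 1]

# Synthetic illustration: two noisy records of a shared signal,
# sampled at irregular ages within the 2700-4700 BP overlap window
rng = np.random.default_rng(0)
t1 = np.sort(rng.uniform(2700, 4700, 150))
t2 = np.sort(rng.uniform(2700, 4700, 120))
signal = lambda t: np.sin(2 * np.pi * t / 500)
y1 = signal(t1) + 0.5 * rng.normal(size=t1.size)
y2 = signal(t2) + 0.5 * rng.normal(size=t2.size)

r = annual_correlation(t1, y1, t2, y2)
print(f"r = {r:.3f}")
```

As noted in the post, this ignores the strong autocorrelation in the residuals, so the raw r should not be fed naively into a significance test.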

The average reported error bar for Dimarshim D1 dating in the 2700-4700BP range was 72 years and for Qunf Q5 was 113 years, yielding an error bar for relative dating of 134 years (Pythagorean).

“Visually”, it looked to me like the Qunf Q5 series might fit with the D1 series a bit better if its dates were brought a little closer to the present. So I displaced the Q5 series year by year towards the present and for each displacement calculated a correlation, yielding the graphic below (this sort of method is used in dendro dating as a check). By bringing the Q5 series about 85 years closer to the present, instead of a negative correlation of -0.078, we get a positive correlation exceeding 0.2 (probably with improved autocorrelation properties in the residuals, though I didn’t check this). An 85-year displacement is well within the error bounds of the speleo dating method.
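The displacement test can be sketched the same way: shift one series toward the present one year at a time, up to roughly the combined 134-year dating error computed above, and recompute the correlation at each shift. The data below are again synthetic, constructed with a known 85-year offset so the scan has something to find.

```python
# Sketch of the displacement (lag-scan) test described above, as used in
# dendro cross-dating: shift series 2 toward the present in one-year steps
# and recompute the annual-grid correlation at each shift.
import numpy as np

def lag_scan(t1, y1, t2, y2, max_shift=134):
    """Correlation of series 1 vs series 2 shifted by 0..max_shift years."""
    results = []
    for shift in range(max_shift + 1):
        lo = max(t1.min(), t2.min() - shift)
        hi = min(t1.max(), t2.max() - shift)
        grid = np.arange(np.ceil(lo), np.floor(hi) + 1)
        a = np.interp(grid, t1, y1)
        b = np.interp(grid, t2 - shift, y2)  # dates moved toward the present
        results.append((shift, np.corrcoef(a, b)[0, 1]))
    return results

# Synthetic check: series 2 carries the same signal offset by 85 years,
# so the scan should peak near a shift of 85
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(2700, 4700, 400))
f = lambda t: np.sin(2 * np.pi * t / 300)
t1, y1 = t, f(t) + 0.2 * rng.normal(size=t.size)
t2, y2 = t, f(t - 85) + 0.2 * rng.normal(size=t.size)

scan = lag_scan(t1, y1, t2, y2)
best_shift, best_r = max(scan, key=lambda p: p[1])
print(best_shift, round(best_r, 3))
```

The default `max_shift=134` here is an assumption matching the Pythagorean relative-dating error bar quoted in the post.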

As noted above, this analysis has nothing to do with the interpretation of dO18 records as monsoon proxies or as temperature proxies. It is restricted to the issue of whether there is an “observed” anti-correlation between Q5 and D1 dO18 records.

Am I claiming the opposite: that there is a statistically significant positive correlation between D1 and Q5 dO18 records? Nope. I don’t know enough about these records to make such claims and there are some intricate issues in estimating correlations between irregular time series that would need to be canvassed. But, on this data, it’s quite possible that there is no actual anti-correlation and that the “observed” anti-correlation is simply an artifact of relative dating errors.

[Update] Long-Term Trends

Dr Fleitmann points out in a comment below that, regardless of the above possible issue, the long-term trends are different between Oman and Socotra, a point illustrated in their Figure 4, excerpted below:

**Update – Dec 15**

In my calculations, as noted above, because Fleitmann et al failed to archive their Socotra data and because Burns refused to provide the data when requested, I compared the Qunf Q5 data to Socotra as used in Mann et al 2008. However, given that I used *Mann* data, I obviously should not have assumed that it would correspond to other versions of the data. Below I’ve excerpted Figure 9 of Fleitmann et al (which shows Dimarshim D1) and a replot of the Mann Dimarshim D1 version (identified there as “burns_2003_socotrad18o”). The versions are similar, but there are notable differences that prohibit use of the Mann version for analysis without reconciliation. Note the downspike to nearly -5 in the Fleitmann version (grey series) at about 3570 BP. The corresponding downspike in the Mann version is at about 3800 BP, about 230 years earlier. The preceding upspike to about -1.5 occurs just before 3900 BP in Fleitmann and about 4100 BP in Mann. The Fleitmann version ends just before 4500 BP, while the Mann version goes to about 2700 BP. The Mann version seems to be about 200 years earlier than the Fleitmann version.

Fleitmann et al 2007 Figure 9.

Replot of Mann et al 2008 Version of Socotra O18.

**Reference:**

Fleitmann, Dominik, Stephen J. Burns, Augusto Mangini, Manfred Mudelsee, Jan Kramers, Igor Villa, Ulrich Neff, Abdulkarim A. Al-Subbary, Annett Buettner, Dorothea Hippler, and Albert Matter, 2007. Holocene ITCZ and Indian monsoon dynamics recorded in stalagmites from Oman and Yemen (Socotra). Quaternary Science Reviews Vol. 26, No. 1-2, pp. 170-188, January 2007.

online at http://www.manfredmudelsee.com/publ/pdf/Holocene_ITCZ_and_Indian_monsoon_dynamics_recorded_in_stalagmites_from_Oman_and_Yemen_(Socotra).pdf

## 141 Comments

Hi Steve,

That’s an interesting analysis, worthy of further study.

Have you tried wiggling 300 years the other way i.e. putting Q5 back instead of forward?

Steve, you say

I’m sure they would’ve liked to have a continuous record through the whole time period as well. Anyone working in speleothems wants a continuous record. But one cannot tell if there is a hiatus in growth by just walking into a cave and looking at the formations. Well, I admit I can’t. Maybe you can. I have to cut it open and date it to tell. However, given that the formations are a precious natural resource, one cannot go into a cave and take every formation to find the perfect specimen. So you work with what you have.

Regarding their “elaborate” conclusions, one point was about the trend from 4.4 ka to present, which you did not comment on. Socotra gets more negative in d18O and Oman gets more positive (Figure 4). That is a clear anti-correlation on millennial timescales. Why did you leave that part out?

Also, in Figure 9 the y-axes are reversed between the two samples. It is very clear to me, at least, that there are regions where the two records match up. That would indicate anti-correlation on multi-decadal timescales. The difficulty, which you point out well, is quantifying it. Two series with unevenly spaced data due to variable growth rates as well as dating errors are hard to do quantitative statistics on. I’m open to any suggestions you have about working with data such as this.

Re: Jud Partin (#2),

“But one cannot tell if there is a hiatus in growth by just walking into cave and looking at the formations. Well, I admit I can’t.”

I wholeheartedly agree. Admittedly my experience is with cave sediments and the fossils they may contain rather than speleothems sensu stricto, but deposition in karst caves is an incredibly complex process which can vary strongly over a distance of just a few meters. Sometimes a hiatus is obvious from the stratigraphy, but often it is completely invisible. If you can’t date the deposits directly you can never tell for sure.

The most amazing fact to come out of Fleitmann et al’s paper is that it made it through peer-review.

There again, maybe this sort of thing is not amazing. After all it is “climate science”

Don Keiller

Would not that -0.08 be closer to no correlation than an anti-correlation? These speleothem analyses are almost as much fun as the ones for NATL TCs and probably more complicated. I have really enjoyed all the background information and comments made on the subject in these threads and appreciate the efforts made by the science contributors and others.

Sounds like a joint speleothem and statistics project.

I would suggest that for considering the effect of dating error, the data is more like a slinky. Each estimated date in the age model pins the slinky of measured data down. Some sections are more stretched than others. The error at each control point gives not just a possible shift, but earlier/later shifting of all control points which shifts the shape of the age model. This leads to huge degrees of freedom for wiggle matching. This points to a general problem with such data: in contrast to controlled experiments where a statistical frame can be defined rigorously which leads to an appropriate stat test, with this type of problem there are many “reasonable” analyses that can be performed, but not (yet) a “correct” method of analysis. However, any reasonable attempt at quantification is better than eyeballing it, IMO.
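The “slinky” picture above can be sketched numerically: perturb each dated tie point within its reported error, rebuild the age model by piecewise-linear interpolation, and see how much the assigned ages of the measurements move. All tie points, depths, and errors below are illustrative, not the actual Q5 or D1 chronology.

```python
# Monte Carlo sketch of the "slinky" idea: jitter the age-model tie points
# within their dating errors and watch the measured samples stretch and
# compress between them. Numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(2)

def perturb_ages(depths, ages, age_sd):
    """Jitter tie-point ages, keeping the age model monotonically increasing."""
    while True:
        trial = ages + rng.normal(0, age_sd, size=ages.size)
        if np.all(np.diff(trial) > 0):
            return trial

# Hypothetical tie points: depth (mm) -> age (yr BP), with 1-sigma errors
depths = np.array([0.0, 100.0, 200.0, 300.0])
ages   = np.array([2700.0, 3400.0, 4000.0, 4700.0])
age_sd = np.array([70.0, 90.0, 110.0, 130.0])

sample_depths = np.linspace(0, 300, 200)            # measurement depths
base_ages = np.interp(sample_depths, depths, ages)  # unperturbed age model

# Each draw stretches/compresses the record between the pinned tie points
draws = [np.interp(sample_depths, depths, perturb_ages(depths, ages, age_sd))
         for _ in range(100)]
spread = np.ptp(np.array(draws), axis=0)            # per-sample age range
print(f"median age spread across draws: {np.median(spread):.0f} yr")
```

The spread at each depth gives a feel for how much wiggle-matching freedom the dating errors alone allow, which is the commenter’s point about degrees of freedom.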

Jud Partin,

It is difficult, in fact I would pose that it is impossible, to reach a conclusion on the trend from 4.5 ka to present when there is no data available for one of the series for 40% of that time frame. My suggestion would be to acknowledge the incompleteness of the Qunf series and let it rest. Not so great in feeding the need to publish, I know. That, or restrict the analysis to the intervals where the data are continuous and overlap.

You mention millennial-scale anti-correlation and take Steve to task for not addressing it. It would greatly further discussion of the topic if you could specify what millennial time frame you are looking at, and perhaps provide a quantitative analysis that distinguishes the decadal from the millennial? In looking at the recent Qunf data (circa 1550 ybp to 500 ybp) there appears to be a centennial-scale correlation. I’d like to think you take this into account when chastising our host, but given the brevity of your post I am left to wonder.

The data show some periods of decadal anti-correlation, yet also show periods of decadal correlation. The recent Qunf data support (by visual analysis) a “strong” correlation at the centennial scale. You seem to be reaching a conclusion of millennial-scale anti-correlation by mentally infilling the time period of no Qunf data. Is this the case?

Jud Partin,

Reading between the lines you say that there are considerable difficulties in collecting data because the record is not continuous etc and you have to make do with what you have. That’s fine if you acknowledge the facts in your papers (and as I don’t know your work I assume you do).

But Mann, et al, have never acknowledged any problems with using the BCPs and they continue to use them to this day. Don’t you find that at least a bit odd?

Visual analysis strongly supports that in 1983, David Copperfield made the Statue of Liberty vanish and reappear.

The reason we have statistics is that “visual analysis” is misleading except in the most extreme cases.

#2. Jud, I didn’t **blame** the authors for not picking a continuous speleothem nor in any way suggest that their judgement was flawed by selecting Q5 for analysis. So there’s no need to be chippy on this account. I’m very familiar with the vagaries of exploration and appreciate the problems. But when one is seeking to interpret the results, it’s fair enough to observe that a comparison between records with a long hiatus is far from ideal. That is not a criticism of the authors, merely a practical observation on the quality of the record.

I was commenting on a statistical issue in the article. I wasn’t trying to summarize everything in the article – merely something that caught my eye. To say that I “left out” an analysis of some other aspect of the article is a bit invidious. Doubtless there are many other interesting aspects of the article that I didn’t discuss in my post. That doesn’t mean that the points in my post were incorrect.

You observe that one record has become more positive over the millennia and the other more negative. This is an interesting point and is doubtless worth reflecting on, but I don’t think it is something that I would term an “anti-correlation” – a correlation (or anti-correlation) implying a number of measurement points to me, not simply two trends. Also, while Fleitmann et al explain their “observed” anti-correlation in terms of the length of the monsoon season and such, they elsewhere appear to attribute at least part of the long-term shift in relative dO18 to changes in source moisture separate from the “amount effect”:

You observe:

and say that you are “open” to suggestions. It’s not a problem that I’ve thought about particularly; I’m sure that there must be some statistical literature on it somewhere. As a first pass, my practice is to see what the authors themselves do and, in this case, it seems to me that there are some obvious defects in their methods. However, developing a better statistical procedure is something that would take longer than a quick blog post, though my first suggestion would be to canvass the statistical literature for analogue situations.

Re: Steve McIntyre (#11), to be accurate, an identified hiatus is not a bad quality in this record or any other. It is a bad quality in the sample. That the authors identified it and quantified it is a good quality of the record. However, because the sample was not depositing calcite during this period, it is non-ideal (actually impossible) to use during said period, as there is no information.

You put a quote from the article in your post which mentioned the millennial-scale trend, but didn’t address it. That was my point. Nothing nefarious. I simply asked why you didn’t. It is a pretty glaring feature of the records in Figure 4. I wonder what the trends would look like if one did a linear regression from 4.4 ka to present.

Are there opposite trends between Oman and Socotra? If so, how would one term said opposite trends? Obviously you do not like “anti-correlated”…

Re: Jud Partin (#14),

In the arcane science of statistics, one usually talks about positive and negative correlation (as in the sign of the slope or the correlation coefficient), so that two records with opposite trends would be said to be negatively correlated. On occasion, descriptors such as inversely correlated are also used, but anti-correlation is a term that is rarely seen. By the way, what is the opposite of anti-correlation? Pro-correlation?

Re: RomanM (#16),

I have to agree about moving on. Clockwise and anti-clockwise are an antonym pair that you do not see so much in N. America, but not incorrect either (also in answer to what the opposite of anti-correlation would be!).

Any strangeness is probably coming from the Socotra side, given its exotic landscape! See photos at http://www.darkroastedblend.com/2008/09/most-alien-looking-place-on-earth.html.

“Visual interpretation?” I don’t think I’ve ever seen that “method” used to make a conclusion in a scientific paper! How’d that get through peer-review?

Hu:

Great site and really cool pictures. Now the tree rings from some of those trees might yield some interesting data!

I am not sure I would worry quite as much about carefully dissecting one of those speleothems, as there seem to be quite a few.

Roman:

I too found the term odd. However, there are over 100,000 hits on Google with a number of the references associated with physical sciences – so perhaps the term got established there. In any event, I assume that anti-correlation and negative correlation are statistically the same thing. If so, we should move on to something more substantive.

Re: bernie (#17),

Interestingly enough, the statistical terminology used by people often indicates information about where they acquired whatever statistical skills they have and how well they might understand them or how limited those skills might be.

For me, anti-correlation is one such term. Another is a statement like “this is significant at the 95% level” – a mixture of the terminology of statistical testing and estimation. Statisticians are usually more careful with how they state their results.

A question to Jud Partin: Are there any direct comparisons between speleothems – tree rings? Depending on tree species temperature and precipitation are in variable degree important for the growth. To compare speleothems and tree rings the data series should be available within a reasonable distance from each other. Are such data available?

What interests me about the Qunf speleothem is that it does have a hiatus. This is telling us something but I’m not sure what. Does the hiatus represent a long and severe drought period, a change in the local hydrogeology and underground plumbing system with migration of dripwater, or something else? Whatever the case, it underlines the fact that we don’t fully understand many of the local controls on cave hydrology. Given this, how can one understand a millennial drift of somewhat less than 0.5 per mille in the Qunf specimen between 2500 and 500 BCE?

Re: Paul Dennis (#21),

The presence of a hiatus suggests that hydrology altered from flowing to non-flowing. If it did this gradually, then there would likely be changes to the chemical and isotopic equilibria and rates that would make interpretation very difficult in the transition period, which would probably not allow use of the calibration period, when it seems there was flow. This would make estimation of the start and stop of the hiatus quite difficult.

My preference would be to say that the trends are “opposite” or “diverging”.

It was not the stylistic use of “anti-” that put me off, but the use of “correlation” when the millennial trends are reduced to individual numbers and there is no actual correlation calculation between the trends.

But the main thing is that people understand what’s at issue; I always try to look past the form of words to the calculations themselves. I’ll think a bit about the “opposite”/“diverging”/“anticorrelated” trends as between Socotra and Qunf.

I guess most of you have already downloaded the Fleitmann et al paper. But the link is broken. This one should work, however.

There is also a complete list of pubs at Mudelsee’s website

A quick observation: Your graph shows d18O in the same direction for both speleothems, while the authors’ Fig. 9 shows opposite orientations. I assume that the authors have followed the convention of “wetter” up, which as we have seen can be correlated negatively (Q5 – Oman) or positively (D1 – Socotra) with depletion of O18 depending on the site.

Therefore it seems that the authors’ “anti-correlation” is based on the opposite assumption to yours. For example Fig. 9 shows the clearest divergence from 2700 through 3500 years b.p., whereas yours shows the most agreement there (-700 to -1500 on your graph). (Of course, besides Q5 being flipped in the opposite direction in your figure, the time arrow is also flipped, making comparison of the two charts very difficult indeed.)

Anyway, I do agree that the main point should be comparing the trends. Assuming the authors’ assumption of opposite signs of correlation between d180 and precipitation at the two sites, it does seem that there are opposite precipitation trends over much of the period 4500-2000 bp.

Of course, how all this relates to temperature trends is a whole other discussion.

Steve: There is no statistical convention that you flip series opportunistically. It’s much better to start with the native data and see what it looks like, before you start imposing an order on it. That’s elementary statistical practice regardless of what is “conventional” for speleos.

Your statement that the authors’ “anticorrelation” is based on the “opposite assumption” to mine shows no understanding whatever of the point in the post, which made no “assumption” of the type that you allege. In the post, I showed that a positive correlation was possible for the Figure 9 period within the dating uncertainties of the speleothems. The authors’ claim that Figure 9 showed a “strong anti-correlation” was therefore bogus. I clearly stated that I had no personal position on whether the data was positively or negatively correlated – only that the authors had not proved their claim of a significant anti-correlation (due to their inadequate statistics). Neither you nor Jud nor anyone else has provided the slightest argument against this point.

That doesn’t mean that every observation in the article is incorrect. You may well think that the long-term trend is “more important”. However, the point that I discussed in the post was deemed sufficiently important by the authors to discuss in the running text, the conclusions and the abstract.

Since people seem to be interested, I’ll discuss the divergence between long-term trends – another divergence problem, it seems – on another occasion, but I can’t do everything at the same time.

Re: Deep Climate (#24), Steve says

I just searched the article. The phrase “significant anti-correlation” is not in it. What they note is a “strong visible anti-correlation”.

Now people have jumped all over that. They want hard #’s. But to those who brought that up, Steve didn’t know how to compute the correlation between two series with unevenly spaced data due to variable growth rates as well as dating errors. He told me to go search the statistics literature. The difficulty of this calculation is what I got out of this post. And told him he did a good job of pointing it out.

In this low-resolution, two-colour image there is a dog. Sam Urbinto (#9) and jae (#13) look upon visual interpretation with scorn. I can see the dog; can statistics?

Visual misinterpretation is possible (seeing things not actually there). BUT surely the brain/eye system is very good at noticing patterns?

Mike

Steve: I obviously believe in looking at data visually. I do it all the time. But there’s also a reason to quantify things. Let’s stick to narrower issues. Please debate this sort of epistemological issue somewhere else.

Re: thefordprefect (#25),

I don’t agree with your (intimated but unstated) point. Your “photo” is crazy and irrelevant. I was looking at the term “anticorrelation” from a statistical standpoint, not just as a general “layman’s” observation. But if it was just a general observation, it should have been called that. The word “correlation” should not have been used there, because it confuses many readers and connotes more than just a casual “visual observation” (which Steve notes is mentioned in all important parts of the paper). It’s misleading, at best, snip And I stick by my guns that the peer-reviewers should have picked up on this issue. And I would add that peer-reviewers in climate science seem to be very, very, very forgiving.

I hope everyone will take my main point, but rereading this I see I don’t have it quite right.

I meant, of course, to say that:

I’ll let the experts correct me on the specific characterization of the relationship, but the main point stands I believe.

My point is this: The authors postulate an opposite direction for d18O-precipitation correlation at the two sites, presumably based on previous studies. They then go on to show that under that **assumption** there is an “anti-correlation” between the two precipitation records (not the two d18O records per se).

I do not believe you are justified in claiming that this is somehow opportunistic. The authors have scaled and oriented the d18O curves in a manner consistent with previous studies’ characterization of the d18O-precipitation relationship at these two sites. On my current reading and understanding, your calculation of the correlation between the two raw data sets is not particularly meaningful or enlightening. That’s my opinion at this point, but of course I’d like our two visiting experts to weigh in.

You misunderstand my point about trends. I’m not referring to overall trends, but rather to the changing trends within the record. If the records are interpreted in the manner established in the speleothem literature, it is very easy to see that much of the time peaks in one record correspond to valleys in the other. That is, there are opposing trends at most points in time, but that is not a statement about overall trends over the entire period.

Then blame that on Fleitmann et al who claimed that their Figure 9 illustrated a “strong anti-correlation”.

Deep Climate:

Some kind of non-parametric sign test will establish this. The point is that a strong conclusion needs more rigorous support. The authors should have done something beyond the hairy eyeball test.
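The kind of sign test suggested here could be sketched as follows; this is my own construction, not anything from the paper. Take first differences of two annually interpolated series, count how often the signs disagree, and compare that count to a fair coin with an exact binomial test. It ignores autocorrelation and dating error, so it is at best a first screen.

```python
# Sketch of a non-parametric sign test for anti-correlation: under no
# relationship, the first differences of two series should move in opposite
# directions about half the time. Exact binomial test, no SciPy needed.
from math import comb

def sign_test_anticorrelation(a, b):
    """Return (n, k, p): n usable pairs, k opposite-sign pairs, two-sided p."""
    da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    pairs = [(x, y) for x, y in zip(da, db) if x != 0 and y != 0]
    n = len(pairs)
    k = sum(1 for x, y in pairs if x * y < 0)   # opposite-sign moves
    # two-sided exact binomial p-value under p = 0.5
    tail = sum(comb(n, j) for j in range(min(k, n - k) + 1)) / 2 ** n
    p = min(1.0, 2 * tail)
    return n, k, p

# Toy example: b mostly mirrors a, so nearly every paired move is opposite
a = [0, 1, 2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, 10, 9, 11]
b = [5, 4, 3, 4, 2, 3, 1, 2, 0, 1, -1, 0, -2, -1, -3, -2, -4, -3, -5, -4, -6]
n, k, p = sign_test_anticorrelation(a, b)
print(n, k, p)
```

Because the differences of autocorrelated, interpolated series are not independent coin flips, a real application would need to thin the series or adjust the effective sample size.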

Steve, you say,

To falsify that particular statement of Fleitmann et al you would have to calculate the correlation between the two data sets, **centred, scaled and oriented** as they are in Fig. 9, and not the correlation between the raw data sets. Or you can argue that said centring, scaling and orientation is not justified. But few will believe that until you publish that finding in a recognized science journal.

Re: Deep Climate (#29),

Um, calculating a correlation removes the center point (mean), divides by the standard deviation (scaling), and it can be done for any lag (orientation?), so this statement is meaningless. Unless of course you have a different definition of correlation than what I typically use?

Mark
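Mark’s point is easy to verify numerically. A minimal check with synthetic data: centering and positive rescaling leave the Pearson correlation unchanged, while flipping the sign of one series just flips the sign of the correlation.

```python
# Numerical check that Pearson correlation is invariant to centering and
# positive rescaling of either series, and that negating one series only
# negates the correlation. Data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = 0.6 * x + 0.8 * rng.normal(size=200)

r = np.corrcoef(x, y)[0, 1]
r_scaled  = np.corrcoef(3.2 * (x - x.mean()) + 7.0, 0.5 * y - 2.0)[0, 1]
r_flipped = np.corrcoef(x, -y)[0, 1]

print(round(r, 6), round(r_scaled, 6), round(r_flipped, 6))
```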

#29 Deep Climate:

But – would such a calculation be worth the effort when mere eyeballing (which Fleitmann apparently proposes as a valid “method”) already DOESN’T show the claimed anti-correlation? If I had received that paper for proofreading, I’d have put a large red query against that Fig. 9 and its legend, assuming either the illustration or the text had been misplaced at some point, or one of the curves was accidentally upside-down. To these (layman) eyes there’s a rather good (positive) correlation between the two as shown – both series have the big dent at around 3.6 with a subsequent rise. One can also eyeball what Steve showed in his analysis, namely that moving the black line a bit to the left, “synching” the minima between the two curves, would improve the agreement at least in the left half of the curve. Reading the caption “strong anti-correlation”, I’d expect something like what you get when you switch one of these curves upside down, turning the most obvious common feature – the dent at circa 3.6 followed by a rise to a slightly higher average level than at the start – into a spike with a subsequent fall below the former average.

Logically, phase-inverting a “strong anti-correlation” should produce a “strong correlation” and vice versa, so if the (eyeballed) correlation gets distinctly worse by doing so, it can’t be such a strong anti-correlation to begin with, can it?

#30 (and others).

Certainly I accept the point that a statistical analysis with an actual calculated correlation would be preferable. If and when I have time I’ll try it on the above basis. And I’m not sure what number should be ascribed to “strong.” But clearly an absolute value of 0.08 (whether negative or positive) would not suffice.

That would be true if the data sets were standardized (i.e. centred and scaled) the same way the authors have done, with inversion of one data set the only difference. That’s not what was done, however.

I can’t comment on 3.6K bp, because I can’t see the chart right now (some kind of update going on?)

Centering and scaling does not in any way impact the correlation between two series, DC. I suggest you revise your position on this.

Mark

Mark,

On reflection, you may well be right. However, if so, it means that Steve is showing the opposite correlation to that of the paper, while his shifting is supporting the claim (albeit weakly).

There may be an explanation for this. When I look at Fig. 9 (and it’s pretty small), I think I see clearly opposing trends only since 3.5k bp.

You see, there are two somewhat related, but not identical, claims in the paper. One is about the long-term trends of precipitation at the two sites, which are held to be opposing, while the other is about negative correlation between precipitation levels at any given time.

Now, I haven’t read the paper in detail, but perhaps the problem is that the authors have not clearly stated the time interval over which this second claim operates. It doesn’t appear in the quoted section.

It’s not “may be right,” I am right. Appealing to the definition of a correlation:

E{(X-mx)(Y-my)}/(sx*sy) where mx and my are the means (centers) of X and Y respectively and sx and sy are their respective standard deviations (scales).

Lagging simply shifts the series with respect to each other and is useful when there is a time dependency between them, e.g., effect is delayed from the cause.

I’m not sure what your gripe is when you take this point since he clearly stated that there is a weak correlation of +0.2 when lagged 85 years (although, from what I’m reading, it is a constant lag, and the “error margin” of the dating would likely be more random). The simple fact remains that the “strong anti-correlation” claim is bogus.

Methinks thou doth protest too much. You’re looking too hard for “disinformation” when in fact there is none (well, not in what Steve posted).

Mark
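Mark’s definition above is easy to check numerically. A short Python sketch (synthetic data, not the Q5/D1 series): centring or scaling either series leaves the Pearson correlation untouched, while inverting one series only flips its sign.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)   # a pair with genuine positive correlation

def corr(a, b):
    # Pearson correlation: centre, scale, then average the product
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

r = corr(x, y)

# Centring and scaling either series leaves the correlation unchanged...
assert abs(corr(3 * x + 7, y) - r) < 1e-12
# ...while inverting one series only flips the sign of the correlation.
assert abs(corr(-x, y) + r) < 1e-12
```

So no amount of centring, scaling or flipping in a figure can turn a near-zero correlation into a strong one; the only thing “orientation” changes is the sign.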

Mike, not to debate the issue, and I agree with Steve, we do look at data visually all the time. But the same pattern-recognition functions in our brains that let us see this, that, or the other (such as dogs in lo-res B&W images) also have us seeing elephants in clouds. Or positive or negative correlation where none exists, or none where it does exist.

The point is, as Bernie mentioned, that strong conclusions need rigorous support. What we think we see usually isn’t there.

And my observations are significant at the 95% level.

:)

The problem with ‘eyeballing’ something is that people more often than not see something that isn’t there (we’re very good at pattern recognition and evolved to make Type I errors because jumping at shadows is better than being eaten by tigers). That is why people use statistics so that one can rigorously define what is actually there and what isn’t — otherwise we’re just staring at ink blots.

We all know what “the correlation between the two series is 0.2” means; we don’t know what “that cloud looks a little bit like a dog to me” means. Science is about rigour; leave the ink blots and cloud pictures to the artists.

Steve, Have I understood correctly:

You took the WHOLE of the Q5 series and displaced it. Surely this is not the correct thing to do.

I assume (please correct me if I’m wrong) that the dating is done at specific ring depths in the speleothem and then rings counted between 2 dated rings (or perhaps a depth-to-date linear interpolation is made between 2 dates?). This would give an age at the measured ring +- the error – this would be a constant offset until the next dated ring (using my first assumption). At each dated ring the error will not be offset in the same direction, as it is an absolute measurement error.

So each speleothem must have data shifted +- the sum of each error band at each date – the average cannot be used as it is meaningless.

I assume your statistical techniques are applied at each date (some of which are fictional), not looking for max correlation around a band of dates? If so then my engineering mind says it is invalid. A visual interpretation does not look at precise dates but at approximate ones, so perhaps visual is better?

Mike

Error plot for the Q5 data

Re: thefordprefect (#38),

I don’t see anything to suggest that Steve has offset the Qunf data. Notice in Steve’s plot that it ranges from about -1.7 to -0.5 in the time period of circa 4.6ka to 2.5ka. Now look at the Figure 9 from Fleitmann 2007. The Qunf data ranges from around -0.5 to -1.7 for the same time period (sign of Y axis is reversed). The data plotted by Steve and the data plotted by Fleitmann appear to be identical. Nothing is evident here to support the conclusion you’ve reached. On what basis do you make such a claim?

To create the Q5/D1 correlation graph Steve states:

Let’s assume some date calibrations with errors in each speleothem. Then this illustrates the amount of date wiggle that should be allowed for each region. There should be no averaging.

Mike

Re: thefordprefect (#40), This is what I was getting at with my slinky analogy, or something close to it.

snip – OT

#38. I don’t know how the errors are calculated and why they vary so much from measurement to measurement. What I wanted to see was whether a displacement of 80 years or so was even conceivable relative to the dating error structure. And it clearly was. Does an “average” error make any sense? I don’t see why it’s not a reasonable indication.

I agree that a displacement of the sort shown here requires that the errors be highly autocorrelated, i.e. this in turn would probably require some sort of systematic error. I’ve spent a bit of time looking today at how the dates are calculated. I suspect that this is possible, but I can’t say for sure on my present knowledge.

There’s another possibility in this. Given the noisiness, I suspect that there are non-autocorrelated error sequences that would yield positive correlations. I don’t know what the statistical analysis of this would look like on a likelihood basis – but it’s an interesting statistical question that the authors simply didn’t comprehend or consider.

Re: Steve McIntyre (#44)

I would have thought that before a mathematical correlation check was made between 2 REAL WORLD data sets it was essential to know how the figures for the data were derived, the environments in which the data was created and especially the noise inputs.

The sample times would be critical – are these derived from yearly growth rings? Are they derived from the linear depth from one dated depth to another? If from yearly results why are there not results for each year? Perhaps if your minions have not annoyed Jud too much he could say?

Not understanding what your Correlation mathematics achieves and how it gets there (please feel free to call me an idiot if you like!) let me question you with a couple of scenarios.

1 Assume a large low frequency signal is added into one of 2 identical low level signals. What would the correlation show?

2 Assume 2 identical low frequency signals but one has a high level of high frequency noise added. What would the correlation show?

3 & 4 Take scenarios 1 and 2 and add sample jitter of, say, 3 samples deviation

5 Assume 2 identical signals whose amplitudes differ over time.

6 Assume 2 identical signals with 3 sample jitter added.

I genuinely would like to know the answers – there are no tricks honest!

Also I would have thought comparing interpolated data was a big NO (I don’t care if Mann did it – it does not make it valid in your analysis!)

My guess – and it is only an (mathematically) uneducated guess:

Very poor correlation on 1 and 2 unless you filtered for the data you knew was there.

Very poor on 3 and 4 even if filtered

5 Poor correlation unless the signals are levelled

6 very poor unless sample times are allowed to wiggle

A visual approach could give better results in all cases?

Perhaps fuzzy correlation mathematics is required for this sort of thing?!!

Mike

http://en.wikipedia.org/wiki/Correlation

I’d have to think about it a bit but my approach would probably be to first come from an analytical viewpoint. Create two series X and Y that are sampled at intervals of n+del_X and n+del_Y respectively where the del_X/Y is a random error in the sampling. This assumes their initial sampling is done at some desired, fixed and equal increment (given by n). I’d model the errors as either uniform or Gaussian, the former being the easiest to deal with for a first cut. Means and standard deviations should not be impacted by uneven sampling. The problem is then to convert the X(n+del_X) and Y(n+del_Y) to some function {X(n)Y(n)+F(X)G(Y)} where F and G are the portions resulting from the random sampling errors. Since we have a model for these portions, we can put bounds on their contributions to the final correlation and arrive at a “better” estimate.

Just a thought about how I’d approach the problem. I looked at the IEEE for some references but found naught.

Mark
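The effect of sampling jitter that Mark describes can also be explored by brute-force simulation rather than analytically. A Python sketch (synthetic sinusoid and made-up jitter levels, purely illustrative): read the same underlying signal at independently perturbed times for each series and watch the correlation decay as the timing error grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = np.arange(500.0)                            # nominal sample times
signal = lambda t: np.sin(2 * np.pi * t / 50)   # period-50 test signal

def jittered_corr(jitter_sd):
    # Same underlying signal, read off at independently perturbed times.
    x = signal(n + rng.normal(0.0, jitter_sd, n.size))
    y = signal(n + rng.normal(0.0, jitter_sd, n.size))
    return np.corrcoef(x, y)[0, 1]

def avg_corr(jitter_sd, trials=200):
    # Average over many independent draws of the timing errors.
    return np.mean([jittered_corr(jitter_sd) for _ in range(trials)])
```

With zero jitter the two series are identical (r = 1); once the timing error becomes a sizeable fraction of the signal period, the average correlation collapses even though the underlying signal is the same in both series. That is the trap for proxy comparisons: dating error alone can wreck a correlation that really exists.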

Btw, dealing with uneven sampling is a common problem in signal processing systems called jitter (in fact, I actually had to delete the word jitter and insert “error” above since that’s what I call it). Here’s a link that explains how to deal with the impact on a sine wave. There’s no detail on how to deal with arbitrary signals (receiver designers simply use this as a worst-case approximation of the increase in noise-power) nor detail on dealing with the correlation between two arbitrary signals with independent jitter.

Mark

#48

This might be useful to find relevant references:

Irregularly Sampled Signals: Theories and Techniques for Analysis (1998), by Richard James Martin

http://www.itr.unisa.edu.au/~steven/thesis/rjm.ps.gz

Thanks UC. I’ll check it out tomorrow. Time for bed now (2:00 a.m. here). :)

Part of the problem with finding references is that there are a variety of ways to refer to the problem. There may be IEEE literature, but I simply entered the wrong search terms.

Mark

Jud (#46): this is what is frustrating [snip...] When someone uses quotation marks, they are quoting someone; when there are no quotation marks, they have used their own words. Definition of “significant” from Merriam-Webster: 1: having meaning; especially: suggestive. 2 a: having or likely to have influence or effect: important; also: of a noticeably or measurably large amount. b: probably caused by something other than mere chance.

Maybe the authors should have chosen “significant” rather than strong, who knows.

Stop ignoring the points of this post and try to learn something.

Regards

Re: Gaelan Clark (#52), No need to be rude Gaelan, Jud has been very constructive.

Gaelan:

Jud was right to point out that the authors did not use the word significant. I, with you, make a similar inference as to the meaning of strong visible anti-correlation. But a more constructive approach I think is in order.

Gaelan- From what I can tell, Jud Partin is an active investigator of speleothems and for me at least, any insights that he can provide are helpful and instructive, whether or not I agree or disagree with all of his conclusions. Therefore, I find your invective against him unacceptable. Steve – I would suggest snipping the offensive part of Gaelan’s post. You certainly snip other posts.

(Initial snip complete; I’ll leave it to Steve beyond that.)

Would it be of any use to plot one series against the other, but at each X from the first series plot all the Y values from the other series that may correspond, within the appropriate dating error interval? Or maybe, for every X, show mean Y and errorbars calculated from the Y values in the interval (with some gaussian weighting or somesuch).

Warning — Layman Suggestion — Warning

Perhaps it would be useful to calculate the autocorrelation of each series to detect their periodicity. If the periodicities are not the same then the claim of anti-correlation would seem to be spurious.
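That suggestion is at least easy to try on evenly spaced data. A minimal Python sketch (synthetic sinusoid, not the speleothem data): estimate the autocorrelation function and read the dominant period off the lag of its first major peak.

```python
import numpy as np

def autocorr(x, max_lag):
    # Sample autocorrelation at lags 0..max_lag, normalised so ac[0] = 1.
    x = np.asarray(x, float) - np.mean(x)
    var = np.dot(x, x)
    return np.array([np.dot(x[:x.size - k], x[k:]) / var
                     for k in range(max_lag + 1)])

# A sinusoid with period 40 shows an autocorrelation peak near lag 40.
t = np.arange(400)
ac = autocorr(np.sin(2 * np.pi * t / 40), 100)
period = 20 + int(np.argmax(ac[20:80]))   # search beyond the short-lag decay
```

The caveat for the speleothem records is that this assumes even spacing; on irregularly sampled series the autocorrelation itself would need one of the irregular-sampling methods discussed above.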

The problem is more complicated than mere irregular spaced samples, because of the dating uncertainty, but irregular spaced samples is a start.

The sorts of things that you can do are still constrained by the data that you’ve got and there is always some merit in doing obvious things first.

I interpolated the Qunf series to an annual basis – and anyone who wants to cavil at that is going to have to object to Mann’s systemic interpolation of series, including Socotra.

What does this correlation mean? Well, anyone who cavils at interpreting such a statistic is also going to have to object to Mann’s correlation of interpolated Socotra to infilled temperature – or worse, correlations of smoothed interpolated Dongge to smoothed interpolated infilled gridcell temperature.

I do not accept the hairsplitting argument that Fleitmann can claim to have observed a “strong visible” anticorrelation without also claiming that the correlation was “significant”. If the anti-correlation is not “significant”, it cannot be “strong visible”.
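For anyone wanting to reproduce the mechanics of the interpolate-then-correlate step described above, here is a Python sketch (the ages and values below are invented for illustration, not the actual Qunf or Socotra measurements):

```python
import numpy as np

# Hypothetical irregularly dated (age BP, d18O) pairs for two records --
# invented numbers, not the actual Qunf or Socotra data.
age_a = np.array([100.0, 130.0, 145.0, 190.0, 250.0])
val_a = np.array([-1.2, -0.8, -1.0, -1.5, -0.9])
age_b = np.array([105.0, 160.0, 220.0, 260.0])
val_b = np.array([0.3, 0.9, 0.5, 0.2])

# Interpolate both records onto a common annual grid covering their overlap.
lo = max(age_a.min(), age_b.min())
hi = min(age_a.max(), age_b.max())
years = np.arange(np.ceil(lo), np.floor(hi) + 1.0)
a = np.interp(years, age_a, val_a)
b = np.interp(years, age_b, val_b)

r = np.corrcoef(a, b)[0, 1]   # correlation on the common annual grid
```

One caveat worth flagging: interpolation to an annual grid induces strong autocorrelation in the resulting series, so any significance test on the annualized correlation would overstate the effective sample size.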

Unfortunately, the entire discussion is about the results shown in Figure 9 of our paper. I recommend all users to check figure 4. It is quite clear (at least to me) that the records have an opposing trend on long timescales. Furthermore, I would like to point out that our paper aims to stimulate further discussions on Holocene climate and monsoon variability in this area. That is why we are using words like “possibly” and “conceivable” quite often in our paper. We are well aware that our data have uncertainties. Climate model simulations suggest that the duration of the summer monsoon decreased over the course of the Holocene, as

possibly revealed by our data shown in figure 4.

Over the last years I talked to many statisticians to find a way to compare unevenly spaced time series with chronological uncertainties. None of them was able to find a satisfying solution. Therefore, I would be more than happy if some of you could provide solutions.

Steve: Dear Dr Fleitmann, thank you for your comments. I contacted Burns because the data IDs in Mann et al 2008 connected to Burns, leading to a couple of Burns et al papers, where he was the senior author. My reference to the paper in which you were the lead author came via Jud Partin some time after I’d been looking at the data. Given your commitment to sharing data, you might convey this view to Burns, so that he can govern himself accordingly should a similar situation arise in the future.

The statistical problems are interesting ones. At present, I have no references that provide a satisfactory answer to the problem (that’s not to say that there aren’t any; it’s not a topic I’ve examined in depth), but, in the absence of proper statistical authority, one should be cautious in how claims are expressed.

Re: Dominik Fleitmann (#61),

Though I have not read it yet, I’d suggest you start with UC’s link above. If you’re really interested, you should also focus on the signal processing literature in general for your searches, since we folk have to deal with irregularly spaced data regularly (uh, on a regular basis…). As I noted previously, the sampling error is usually modeled as a random, time-varying component in the function argument (a phase offset, almost like frequency modulation) which can be pulled out of the function argument using trigonometric identities. When the signals have known analytical representations, this concept is very useful. Perhaps not so much here, I can’t say for sure. You might at least be able to come up with an analytical model that provides bounds.

For the record, I’m not focusing on any figure, I’m focusing on the -0.08 correlation between Q5 and D1. If that’s the same data used in Figure 4, then I’m sorry, but there is no long term correlation, even if your eyes are fooling you into believing so. Unfortunately, I cannot access the paper (I’m getting a 404 not found error in Steve’s link). If that’s NOT the data used in Figure 4, then I’ll reserve judgment.

Mark

I never would have considered a -0.08 to be significant, strong or even weak. I would say that within sampling error, the two series are orthogonal, i.e., uncorrelated.

Interpolation to get annually spaced data is not an uncommon technique, either, and is probably not a lot different than smoothing anyway.

Re: thefordprefect (#59),

Uh, no. We don’t know what either the noise or the signal is so this comment is baseless to begin with, but even if we did, the point of a correlation is simply to find out how “alike” two series are, regardless of a priori knowledge of the underlying signals. Yes, on REAL WORLD signals. That’s sort of the point, actually.

As to your numbered points, you’ve got them all either partially or completely wrong:

1. I think you mean two small signals and one has a large component added? If so, then there will be some correlation, but it will be scaled by the contribution of the larger component that was added to the original signal. Saying “poor” is not possible unless you provide relative levels.

2. Again, you need to provide relative levels to make a guess. The fact that the noise is high frequency is largely irrelevant, btw. It merely needs to be different to be uncorrelated and therefore contribute only to the length (L2-norm) of the original signal. If the noise is small relative to the original signal, then the correlation will be high and vice versa if the noise is large relative to the original signal.

3/4. What do you mean by “3 sample deviation?” Do you mean a 3-sample lag between the two? Artificially created or not, i.e., a lagged correlation? In either case, it depends upon the frequencies of the underlying signals. A sine and cosine wave exhibit a sinusoidally oscillating correlation with lag, and are uncorrelated at zero lag (original 90 degree phase offset).

5. If their amplitudes vary with time they aren’t identical, are they? Either way, if their frequencies are the same, and there are no phase reversals, i.e., the “amplitude” never changes sign w.r.t. each other, then their correlations will be rather high. Here’s a check I just performed in MATLAB:

t = 0:1:999; % time index

a = sin(2*pi*t/100); % signal 1

b = sin(2*pi*t/1000) + 2; % modulation

c = a.*b; % signal 2 with amplitude modulation applied

% correlation is the dot product of two vectors divided by the product of their lengths.

corrac = a*c.’/(norm(a)*norm(c))

corrac =

0.9428

But again, it depends upon specifics. As I noted above, correlation automatically “levels” the signals in question.

6. Now you’re using jitter instead of deviation. I’m confused as to what you mean by “3-sample jitter.” If you mean the samples can be out of place by up to 3 samples each, then that’s not jitter, that’s incorrect ordering (jitter is always less than one sample period else it is a different sample rate or incorrect ordering). If you mean lag, then again, it depends upon the frequencies involved. If 3 samples is large relative to the largest frequency (assuming a composite signal), then yes, it will probably be low with or without the amplitude variation. If it is small relative to the largest frequency, then no, the correlation will still be large.

The correlation formula, btw, is also the cosine of the angle between the two vectors. Something we all probably learned in geometry at one point or another. The difference here is that the vectors have many more dimensions than the three spatial ones typically used in geometry. ;)

Mark

Thank you Professor. With all the data available for both series, perhaps Steve and others can find a satisfying solution to quantifying unevenly spaced chronologically uncertain time series. There’s probably a way somehow to get it all together, with enough insight on the actual questions in play. But then again, perhaps not.

It would be nice to see a joint project between geologists, statisticians, climate scientists, physical chemists, lawyers, speleothemologists, botanists, dendrochronologists and mathematicians to solve some of these issues.

BTW, for those looking, the paper has a new link different from the initial post:

Fleitmann et al QSR 2007

And here’s figure 4

#64. Sam, be realistic here. Unless someone is aware of a reference to an existing treatment of the topic – which incorporates all the difficulties of the topic that Dr Fleitmann faces, then even people as able as Jean S and UC are not going to be able to whip out a blog post solving a non-trivial problem.

On the other hand, this is exactly the sort of the problem that needs to be presented as a problem to stats grad students. There are many other similarly interesting statistical problems arising out of paleoclimate that would be more interesting IMO than some of the things that stats grad students are probably working on.

IMO this was a huge failure of the PR Challenge conference. As it was organized, it was the same old, same old. If I’d been asked for suggestions, I’d have structured the thing so that there was an effort to recruit young stats students to go to the conference to try to interest them in some of the problems.

Now there’s a fair bit of work in even framing the issues in statistical terms and as statistical problems. Merely framing the speleo question in statistical terms changes how you approach it.

Re: Steve McIntyre (#65),

All I’m saying is that we’re in a stage here where we are canvassing the intricate issues involved with estimating correlations between irregular time series.

So far, at least given the conversation here, there seems to be no existing treatment of the topic incorporating all the difficulties. So I’m not suggesting anyone is going to find a solution to this here or anywhere, much less whip it out. I’m simply expressing optimism that, given full or near-full availability of all related data, and the cooperation and interest of the knowledgeable and capable people here and those associated with them, perhaps a satisfying solution can be found in time. Or at least we would learn that those most likely to solve it were unable to, telling us that the best we can expect is some approximation that something possibly shows something; that all we get is a visual eyeballing and not some experiment-grade analysis.

#35

I went away and read the paper …

I have some ideas that may be relevant, but first I need to acknowledge Mark T, that you were right, of course (“may” was ill-advised to say the least).

But the other point I was trying to make, though, is that a negative correlation for the entire data series translates into a

positive one for precipitation. In other words I’m actually saying that Steve’s result of -0.08 correlation (before any lagging) is itself opposite to the authors’ statement (because they posited a negative correlation with the “flipped” orientation).

I perceive a “visible” negative correlation at the “multi-decadal” level only from 3.4 k bp on. I have also calculated the correlation for that period and it is in the direction claimed, but not strongly (except for the most recent few decades). In fact, when I look at Fig. 9, I see a visible positive correlation on the RHS (i.e. before the changes at 4.4k bp described in the article), visible negative correlation on the LHS, and not much correlation in between.

I think the authors would have been on more solid ground to point at this part of the graph as demonstrating “visible anti-correlation”.

However, there are certainly details that need to be worked out in terms of how the data should be prepared and derived for such calculations, given the spacing, dating errors and noisiness of the data. And I still wonder whether the “fine-grained” correlation over the entire length of the series tells us very much about the trends at longer scales.

#61

Having said all that, I do agree the claim of opposing millennial trends is strong (and is probably the main point of the paper).

Note/request to Steve:

Have you posted a file with the interpolated Oman Q5 data set and Socotra data set aligned together by year? I followed the procedure as you outlined, but my numbers are close but not identical. This might be due to mis-alignment because of the “present” convention of 1950 in Oman, or differing methods of interpolation, or some other error on my part.

Re: Deep Climate (#67),

Thank you for your acknowledgment, but this is still a stretch. A value this small is hardly a correlation useful for conclusions; it is small enough that the addition of one extra sample may be sufficient to switch the result to +0.08 with series this short. It could be significant (I don’t recall that Steve posted the p-value for our consumption, and I have not run the numbers myself, no time to parse them out), but hardly worth consideration. Oddly, however, it does at least

appear as if there is LHS or RHS correlation in Figure 9 (as you state, and none in the middle), which suggests to me a non-linearity (with a maximum/minimum of some sort, local or extreme) which makes correlation calculations dubious to begin with.

I should also mention that I can see Dominik’s assertion regarding the other data in Figure 4. I still don’t buy visual correlations as even remotely rigorous, other than as a good indicator of whether or not more investigation is justified. Rather than saying there was some quantifiable statistical relationship (if you use the statistical terms, that’s what is implied), perhaps it would have been more prudent to say there seems to be a visual relationship that deserves further investigation, and that algorithms may need to be developed to quantify it?

Mark

To quickly put into words the issue I am talking about. Can’t talk more b/c time is ticking (hopefully no errors in this post)…

1) U-Th dates have analytical errors.

2) The d18O data between dates are linearly interpolated based on sampling depth.

3) What if a sample has non-linear growth rates during an interval between dates? This is evident in an age model by a change in slope. But if there are no dates between, one cannot know – so you assume linear growth. …this phenomenon and the U-Th dating errors are the “slinky” aspect to a record. Actually it is a slinky between each set of dates. Not just the whole record. But to be fair when plotting the data, the average age for a U-Th measurement is used and linear growth is assumed. No wiggle-matching… Obviously the ideal case has tons of dates, but that’s very costly.

4) #3 is very likely. As such, when you sample a stal at a fixed linear distance – say you mill out 150 microns of powder – this will integrate different amounts of time in the sample depending upon growth rate. Therefore your record will have some pretty crazy averaging going on. And I seriously doubt that two samples have exactly the same growth rates. (For comparison, for the experimentalists wondering, the drill bit diameter for drilling a date is generally 1-2mm and integrates all of that powder. At 150 um sampling/d18O measurement, this would be 6-13 d18O measurements represented by a date.)

5) What about unidentified hiatuses in growth? What if a drip stops dripping for a couple of years and doesn’t deposit calcite, then starts again? This is very possible in karst where dissolution of the host rock is constantly occurring and rainfall input is changing. Think of Plinko from The Price is Right, where the pegs are slowly moving over time. Without tons of dates, you could not say with certainty whether or not this happened.

All of these issues present difficulty in directly comparing two records as Steve has done in this post. And moving an entire record by a fixed interval and calculating the correlation is probably not the way to go, as Steve has shown. Kudos to Dominik for consulting statisticians on this difficult problem.

Therefore, I agree with Dominik that there is visible anti-correlation between the records (Fig. 9) – which is evident despite all of the issues listed above. These changes are 1.5 permil on the multi-decadal timescale, which is much larger than the 0.06 permil error associated with the measurement. Also, I would argue these changes are not noise… but that’s a whole nuther discussion.
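Jud’s points 2-4 can be made concrete with a toy age model in Python (the depths and U-Th ages below are invented for illustration): ages between dated depths come from linear interpolation, so a growth-rate change between anchors is invisible, and a fixed sampling step in depth integrates a different number of years in each segment.

```python
import numpy as np

# Hypothetical U-Th anchors: depth down the stalagmite (mm) vs age (yr BP).
dated_depth = np.array([0.0, 50.0, 120.0, 200.0])
dated_age = np.array([500.0, 1500.0, 2200.0, 4000.0])

# Linear growth is assumed between dates: every sampled depth gets an
# interpolated age, and any growth-rate change between anchors is invisible.
sample_depth = np.arange(0.0, 200.1, 10.0)
sample_age = np.interp(sample_depth, dated_depth, dated_age)

# Growth rate differs from segment to segment, so a fixed 10 mm sampling
# step spans a different number of years in each segment.
rate = np.diff(dated_depth) / np.diff(dated_age)   # mm per year
years_per_step = 10.0 / rate                        # 200, 100, 225 yr here
```

In this made-up example the same 10 mm of calcite represents 200 years in one segment and 100 in another, which is exactly the uneven time-averaging Jud warns about.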

Naively, it seems to me that the big issue here relates to matching dates from different data sets (Dr. Fleitmann’s “chronological uncertainties”), and the gaps per se are simply incomplete data which you can either in-fill or not. If you have two sets of data and you are not sure how to match them up, then you are dead in the water. If you have a good guess as to how to pair the data (or some subset of the data) then you would go through an exercise similar to the one Steve undertook – a kind of sensitivity analysis.

It is a bit like getting a message in what might or might not be a sophisticated code. If you are not sure whether or not the message is actually in code as opposed to a random sequence of symbols (i.e., high chronological uncertainties), it may be smarter to find an other message, one where you can definitively say that it is in code (i.e., minimal chronological uncertainties).

#61

I should also mention that I found quite compelling the manner in which the paper drew together several threads of research to weave an emerging climate history in the entire monsoon region.

It seems to me that, while the Fleitmann et al. paper may have used a poor choice of words (“strong visible anti-correlation”), the point of Figure 9 was not to show a point-by-point correlation but rather that overall there were opposing trends in dO18 on the multi-decadal and centennial scale. In other words, over the long term when one goes up the other goes down. I think this is the point Jud is trying to make and I agree with him that, overall, there appear to be opposing effects.

Given this thought process, maybe the way to statistically compare the two is to look at the correlation of trends over a given time period instead of a date-by-date comparison of dO18 values. This might minimize the effects of irregular intervals as well as the effects of the uncertainty in dating.
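One way to implement that trend-level comparison, sketched in Python on synthetic data (not the actual records): fit a least-squares slope to each series within successive time windows, then correlate the two sequences of slopes.

```python
import numpy as np

def window_slopes(t, y, edges):
    # Least-squares slope of y against t inside each [edges[i], edges[i+1]).
    slopes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (t >= lo) & (t < hi)
        slopes.append(np.polyfit(t[m], y[m], 1)[0])
    return np.array(slopes)

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 1000.0, 300))   # irregular sample times
base = np.sin(2 * np.pi * t / 400)           # shared low-frequency component
y1 = base + rng.normal(0.0, 0.2, t.size)     # record 1: signal plus noise
y2 = -base + rng.normal(0.0, 0.2, t.size)    # record 2: inverted signal plus noise

edges = np.arange(0.0, 1001.0, 100.0)        # century-scale windows
s1 = window_slopes(t, y1, edges)
s2 = window_slopes(t, y2, edges)
r = np.corrcoef(s1, s2)[0, 1]                # trend-level (anti)correlation
```

Because the slope fits average out the point-to-point noise and modest dating jitter within each window, this kind of comparison is less sensitive to the irregular spacing and dating error than a sample-by-sample correlation.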

Climate archives such as speleothems, tree rings… deliver climate proxy data and not instrumental-like time series. All climate proxy archives have uncertainties and we are working hard to reduce them. I would also like to emphasize that climate archives reflect local (site-specific) and large-scale climate variability. I have the feeling that most of you are not really aware of this. So be careful if you are using the word “noise”, because this “noise” could also be a real local climate signal. Therefore, we cannot expect a perfect correlation between our sites in Yemen and Oman. However, I do not need a statistical approach to say that the long-term trends in precipitation are very different between Oman and Yemen (see figure 4 in our paper). I would appreciate it if Steve could show this figure in his main text.

Last but not least, I am not very happy that our data were used for temperature reconstructions in a recently published PNAS paper, despite the fact that our stalagmite isotope records from Oman and Yemen reflect changes in the amount of rainfall and effective moisture.

“Climate is what you expect, weather is what you get” (Mark Twain)

Re: Dominik Fleitmann (#73), I’m overall doubtful, but I can’t expect you to have read this entire site to get enough information on the general understanding of this concept. Developing algorithms to deal with this type of data is difficult, as the difference between “noise” and “signal” is impossible to discern in a chaotic system such as the climate. Signal processing methods are often developed under extremely strict assumptions regarding signal and noise characteristics. This is why otherwise simple topics such as smoothing (a few or more threads on the subject are around) get so much attention.

I agree fully. However, one of the contentions in here is that you couched your explanations using statistical terminology in spite of using an actual statistical approach. It’s not personal, you simply opened yourself up to statistical critique by doing so (there are a ton of statistics and signal processing experts running around in here).

Mark

Re: Mark T. (#74), That should be “in spite of not using an actual statistical approach.”

Mark

Re: Mark T. (#74), so are the PDO, NAO, AMO (and other multi-decadal oscillations) just noise? BTW, this is a serious question – not me being short.

Re: Jud Partin (#80),

Jud – I would say it depends. I used to work with seismic data a long time ago. We used to say that one man’s noise is another man’s signal.

I would say that any climate phenomenon could be defined as signal on some temporal or spatial scale. But all climate proxies, even temperature records, have noise. The real challenge is that some kinds of noise can look a lot like signal, especially if the signal you’re looking for is undersampled.

Re: Dominik Fleitmann (#73), You say:

“Last but not least, I am not very happy that our data were used for temperature reconstructions in a recently published PNAS paper, despite the fact that our stalagmite isotope records from Oman and Yemen reflect changes in the amount of rainfall and effective moisture.”

I think you will find that almost everyone here is unhappy about this also.

Re: Craig Loehle (#76), I didn’t even catch that part at first, Craig. Kudos for your candor Dominik.

Mark

Re: Dominik Fleitmann (#73),

Dr Fleitmann, accepting the uncertainties you refer to, can I assume you have sent a complaint to PNAS? It’s your work; if you think it’s been abused, then the onus is on you to raise this with the journals that publish papers based on your work.

Regards

Ian

Re: Ian (#78), No, I didn’t send a complaint to PNAS. I saw the paper last week and had no time to respond, as I had just come back from a field trip in Turkey. I would like to read the entire paper very carefully before I take the next step.

Re: Dominik Fleitmann (#73),

Well, as a layman who has been reading this site pretty much daily for over a year, I can say that if there is one thing I have learned, it’s that in many cases things such as ice cores, sediment cores, speleothems, and tree rings have not been shown to reflect regional or global temps (not that they can’t, but that they have not been shown to). Steve has on more than one occasion remarked on data that was collected for the purpose of moisture proxies being used as temperature proxies.

Thanks Jud for answering my questions.

As an example of a further error, I have attached the following graphs. The first shows the frequency of thickness of layers in the Beijing 2650-year speleothem (there are some interesting missing values – no idea why). The second shows the variation in adjacent layer thickness, i.e. the variation on a yearly basis. The most common thickness is 2 um, but there are significant results at 50 um (equivalent to 25 years of low growth). To linearly interpolate age with distance between 100-year references is therefore subject to further errors (above the dating errors of the reference years). All in all, I would suggest that mathematical correlation on a year-by-year basis is “tricky” to say the least.

data source

ftp://ftp.ncdc.noaa.gov/pub/data/paleo/speleothem/china/shihua_tan2003.txt

Re: thefordprefect (#77),

The missing values are every fifth value. The only reason I can think of off-hand is that the measurements were made in a microscope with a scale marked every 5 units and where the scale covered the line it was assigned the lower value. You’ll note that the value before the missing one is generally a good deal higher than the one after it.

I would say no. Since these are on-going drivers (or indicators?) of the climate, they seem to me to clearly fall into the “signal” category. Of course, “what is signal” when speaking with regard to the climate is a difficult question to answer in general. Is it temperature, precipitation, humidity, plant life, etc.? Perhaps the only real “noise” is something like sampling error or, in this case, dating errors.

I can think of something that I would call noise, even if it may be caused by something specific: any extreme external and otherwise anomalous influence, such as the Tunguska explosion in 1908. Of course, once the impact of the initial anomaly turns into long-term influence, then I guess it joins the “signal” fray as well.

Confusing indeed. Never have I envied anyone that undertakes this avenue as a career. Radar and communications are much better behaved. :)

Mark

Re: Mark T. (#83), thanks for the honesty. One of the major goals of paleoclimate reconstructions is to identify multi-decadal timescale oscillations (and longer). The instrumental record is only 150 years or so, at best. How many cycles of an oscillation with a 50-year period are in it? Only 3. And how good is the early data??? People call these oscillations with only 3 realizations. Why, then, are these paleoclimate time series (and others) just a bunch of noise? Not to say that I don’t believe in red noise/brownian motion, mind you.

What is tricky is when the dating errors approach the timescale of the oscillation you wish to study. Fast-growing samples are very much sought after. In fact, I will be hoping for some luck in doing this next week. But it’s hard to tell by just looking at a stalagmite in a cave…

Re: Jud Partin (#84),

Don’t let the word “noise” have its emotional impact on you. To a statistician, noise is variation in the data caused by anything which is not a factor of interest in whatever you are measuring or looking at. Noise can be nonlinear effects when a linear model is used. That is why statistical models are an important part of analyzing a dataset – they give you an opportunity to take into account (and understand the effects of) all of the various factors which can make your data behave in a fashion different from what you expected. The behaviour of the “noise” is often not appreciated by people who don’t have the background or experience of handling it – nor is the importance of actually calculating quantities such as correlations (or anti-correlations), no matter how difficult, instead of eye-balling the graphs when stating conclusions they would like to be true. Geesh, I sound like a missionary… :)

Re: Jud Partin (#84), No problem. This is good stuff, constructive I hope.

This is also difficult data to analyze, and even more difficult when, as you note, only 3 relevant cycles can be compared to any legitimate record, be it precipitation, temperature, or whatever.

Re: RomanM (#87), Yes, though we engineering folk tend to separate those other things out as “interference” if they don’t have some random process driving them. Interference is more difficult to deal with because it usually has structure, which can cause a multitude of problems. There are “true” noises, however, that probably don’t show up much other than through measurement errors in the climate realm. Maybe non-climatic things are also present, which could be called noise, though I don’t know enough about what causes sediment layers to grow to say. Thermal noise is ever present in my world, but at only a few femtowatts per MHz of bandwidth (−114 dBm) you’d incorrectly think it wouldn’t be an issue. :)

Mark

Is it just me, or is some progress being made with this thread?

#77. Nice spotting. This series is Mann’s #1050, where it is said to have a “low frequency” correlation to temperature of 0.46 from 1850-1995.

The authors of this series also perform a “trend adjustment” that would be worth evaluating some time.

Dr Fleitmann, a quick question on Q5: the dates and some of the measurements for Q5 values in the SI to Fleitmann et al 2003 are different than in Fleitmann et al 2007 Table 1 (not all – the MC-ICPMS values are unchanged.) What accounted for the differences?

Also, what software do you use for solving the nonlinear equations? I’m getting somewhat different answers for the ages in my attempt to replicate the standard equations (this is new territory for me and I’d like to ensure that I’m not missing something obvious). Thanks, Steve

Re: Steve McIntyre (#88), All information is provided on page 173 (Chapter 3: Materials and methods) of our 2007 QSR paper. Most of the ages were measured by Augusto Mangini’s group in Heidelberg and I don’t really know what type of program they used to calculate the ages. However, I have used two other software packages to verify their calculations; the results were almost identical (+/- 10 years).

For those of you who would like to learn more about speleothems: a few months ago we published a PAGES newsletter on “advances in speleothem research”. This newsletter can be downloaded at http://www.pages.unibe.ch/ (full address: http://www.pages.unibe.ch/cgi-bin/WebObjects/products.woa/wa/product?id=303).

#90. I’m not disputing the calculations; I just like to replicate things step by step. The Materials and Methods of the 2007 paper don’t do this – I’m not suggesting that this is a defect in the paper, which is reporting empirical results, but it does not provide a step-by-step exposition of the calculation, and references like Neff et al 2001 and Fleitmann et al 2003 don’t either. I can replicate calculations in, say, Edwards 1987 up to the nonlinear equation, and then I get different answers, which I’m trying to understand. I’m not suggesting that this is an issue with the software as opposed to how I’m trying to implement it; that’s why I want to look at the exact software yielding a known result.

Re: Steve McIntyre (#93), Steve, I recommend asking people like Augusto Mangini, Larry Edwards, Hai Cheng, Gideon Henderson, Jess Adkins…. I am only a user of this method (I can do sample preparation and measurements). I am rather a stable isotope person.

Re: Dominik Fleitmann (#95), “I am rather a stable isotope person.” and you are rather more stable than some posters here!

Is there a visible correlation between these two series?

Code to generate the figure:

randn('state',0); % reset RANDN to its default initial state
n=512; % number of samples
MI=tril(ones(n,n)); % integrator matrix
MhI=sqrtm(MI); % half integrator (principal matrix square root)
R1=MhI*randn(n,1); % half-integrate white noise
R1=(R1-mean(R1))/std(R1); % center and scale to unit std
R2=MhI*randn(n,1);
R2=(R2-mean(R2))/std(R2);
close all
plot(R1)
hold on
plot(R2,'r')
C=corrcoef(R1,R2);
C(2,1) % sample correlation, 0.2
% http://signals.auditblogs.com/files/2008/12/ftest1.txt

Re: UC (#94), I hear ya. I never said I didn’t believe in red noise. But my job is to understand the climate system. I’m not giving up and calling for Miller Time b/c of red noise. If there are low frequency climate oscillations, I want to understand them.

Out of curiosity, what happens if you add additional series using your random number generator to that plot?

Re: Mark T. (#108), I’m on my way to the West Pacific. I have a long layover at LAX, so I thought I’d see what was said here. …I’d like to see someone try out the Starbucks Hypothesis on this data!

out

Re: Jud Partin (#115),

Here’s the next in line, plotted as −R3, as the correlation was −0.13.

Here’s a sample correlation histogram for 100 series (values of lower diagonal of correlation coef. matrix, n=4950 ) :

#95. I’ll do a post on my experiments with the nonlinear equations. Replication issues with nonlinear solvers were the starting point of McCullough and Vinod, an influential article on replication in econometrics – McCullough cited our work on MBH98 in a later article as an interesting replication effort.

One anthropological observation on your recommendation to ask someone else, which I realize was presented with good will, but nonetheless is done in a way that is quite different than I would have done it in similar circumstances. I’ve spent many years in business and, if I were presented with a question that one of my partners was better placed to answer, I would personally forward the inquiry to my partner and ask him to answer. And if there were some problem in getting the answer, I would view that as being a concern of mine and take responsibility in ensuring that the inquiry got answered. In my own personal experience with paleoclimate academics, I’ve had relatively little success whenever I’ve been passed along the food chain and in no case in which there was no response has the original referrer taken any responsibility or ownership of the problem. In your shoes, I would have forwarded the inquiry to the responsible coauthor and asked for an answer and then forwarded the answer back to me. Cheers, Steve

Re: Steve McIntyre (#97), The only reason why I made this recommendation is that I will attend the AGU meeting next week. You will get the answers much faster if you contact the experts directly.

Re: Steve McIntyre (#97),

Steve, I’ve had a chat with my colleagues and they are using ISOPLOT version 3. This is public-domain software written by Kenneth Ludwig of the Berkeley Geochronology Center.

I’ve just had a quick look at the web site and the package looks impressive to me.

I hope this helps.

Re UC #94,

I don’t know R, and so don’t know what “tril” does, but if MI is an “integrator matrix,” it must be a lower triangular matrix with ones on and below the main diagonal and zeroes elsewhere. But then MhI would simply be MI back again.

What is a “half integrated” process? One that is fractionally differenced with d = .5? That doesn’t look like what you are generating.

But to answer your question, yes it does look like these are correlated, but no, there isn’t any true correlation, since these are just series somehow generated by uncorrelated random numbers.

Re: Hu McCulloch (#98), From MATLAB’s help:

You are correct about MI, but MhI is the “principal square root” of MI (the sqrtm function is a matrix square root, not an element-wise sqrt). As a result, MhI is not equal to MI. If you change UC’s last line to:

[C, p] = corrcoef(R1, R2);

you’ll get a p-value of 4.8e-6. :)

Mark

Re: Hu McCulloch (#98),

Mark already responded, but just to clarify

MhI*MhI=MI

and hence the term half-integration.
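For anyone who wants to verify this numerically, here is a small check, with Python/NumPy and scipy.linalg.sqrtm standing in for the MATLAB above (the choice of n is arbitrary):

```python
import numpy as np
from scipy.linalg import sqrtm  # principal matrix square root, like MATLAB's sqrtm

n = 16
MI = np.tril(np.ones((n, n)))      # integrator matrix, as in the posted script
MhI = np.real(sqrtm(MI))           # "half integrator"; sqrtm can return a tiny
                                   # imaginary part from round-off, so take the real part

print(np.allclose(MhI @ MhI, MI))  # MhI squared recovers MI
print(np.allclose(MhI, MI))        # but MhI itself is NOT MI
```

This confirms Mark's point: for a lower-triangular matrix of ones, the principal square root is a genuinely different matrix whose square is MI, hence the "half-integration" terminology.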

Re Steve #97,

That would be Bruce D. McCullough of Drexel U, not to be confused with myself. See http://www.lebow.drexel.edu/Faculty/BruceMcCullough.html. His paper with HD Vinod was in the American Economic Review for 6/03.

On the terms “signal” and “noise”, I don’t disagree with this point. Coming from a business-economics background, I initially found the terms “signal” and “noise” highly off-putting when applied to data. You’d never use these terms for economic series. They are what they are. The danger with such terms is that they get reified and, IMO, this is a real issue in the multiproxy world, where practitioners tend to indiscriminately call things “proxies” whether or not any statistical relationship has been established.

I’ve obviously spent too much time in this particular swamp if I’m now falling prey to the same terminology, and will be more attentive to this in the future.

Re: Steve McIntyre (#104),

Or equally, if not more importantly, any theoretical relationship has been established.

Mark

Various: “Noise” is fluctuations in a signal (stream of target information) as well as any added external factors to that signal, as it is received at a detector. Noise can be random, patterned or both. So the noise for one detector can be the signal for a different one and vice versa. Sometimes the non-target information can be detected and accounted for, other times it can’t. And of course, in some cases, it can’t even be identified.

#74, 80, 83: “PDO, NAO, AMO (and other multi-decadal oscillations)” are considered (long-term) weather, correct? In which case, is there even a signal to be processed? The question if they’re noise or not may not even be applicable, at least in a climate sense.

Jud #84 “The instrumental record is only 150 years or so, at best. How many cycles of an oscillation with a 50-year period is in it? Only 3. And how good is the early data???”

Ah, one of the issues so many ignore when speaking of the anomaly trend. There are a number of years and places where temperature sampling methods and/or equipment changed for either or both land and sea.

150 years? There are people who live almost that long. Hardly enough time for a nice beefy number of realizations.

Steve #97 “if I were presented with a question that one of my partners was better placed to answer”

It appears in the case of these papers and other matters, many times the coauthors are simply supplying data and/or data processing rather than active collaboration. Papers that build off other papers so to speak. That would also tend to explain a number of other things, including why some pre-provide or are willing to provide data and others aren’t. And why some answer and others don’t. A little, um, standoffish way to do things perhaps, but it appears that’s how it works. A business partnership analogy won’t fit very well, it doesn’t seem.

Don’t hate the player, hate the game. :)

Re: Sam Urbinto (#106), Sam, supplying crucial data such as U-Th dates is an act of very active collaboration. Do you have any idea how much work and time are needed to do the dating? Do you know how difficult it is to get these samples? Nobody is capable of doing all these things himself.

Last but not least, these days you should be careful with your “business partnership analogy”. I think we climatologists did a better job than most businessmen, otherwise the US wouldn’t have these problems ;-).

Steve: Let’s not get into discussing the beauty contest of who’s to blame for the present situation – I try to restrict such discussion to cordoned-off threads. In my own opinion, there is much blame to go around, but there are important peculiarities of American public policy and policies/practices of state trading and investment agencies that are quite separate from the performance of the businesses. Many of the mortgage problems that prompted the US situation were not part of the fabric in Canada, which leads me to want to examine policy. My own business experience, combined with reflection on prior smaller collapses such as Enron and Bre-X, has led me to believe in the importance of disclosure and due diligence, themes that occur here frequently. Be that as it may, we are not going to get into trying to judge a beauty contest between climate scientists and businessmen on this thread, and I’m not going to permit any further discussion along this line.

Well, if we’re talking about tree-ring cores, refer to the Starbucks hypothesis that has been tested in here at a location not far from where I am currently sitting (though I cannot see Almagre peak from my office). :)

Sorry, but that was too good an opportunity to pass up. I would venture obtaining cave samples are a tad higher on the list of things that I would call difficult to do.

Btw, I think business suffers for the same reason climatology suffers these days: politics. ;)

Mark

Dominik, my comment was a generalization. I doubt anyone would suggest that field geology is “easy”. I certainly wouldn’t! Also, I wasn’t attempting to suggest the U-Th dating or sample gathering was some simple, nonchalant process. Actually, that was more my point: it’s difficult and far-reaching, and no one person can do every piece of it. So it might be more appropriate to go to the source – the expert in that area. Steve was simply pointing out that in the past, even if there was contact with the primary source, there hasn’t been much active response from secondary or tertiary sources. And I was simply pointing out that a partner-like responsibility/ownership analogy probably didn’t really fit here, not making such an analogy myself.

snip

Dominik’s comment sheds some more light on why some climate researchers may be reluctant to share data. I still don’t agree with them, but now find myself a little more sympathetic.

Re #94, 98, 103, 110,

Thanks, Mark, for the clarification on sqrtm.

UC, I gather that your “half-integrated” process necessarily starts from 0, whereas a fractionally integrated process of order d in the sense of Hosking’s 1984 paper starts with a draw from the unconditional distribution of the process. When d rises to .5, the unconditional FI variance reaches infinity, so that it cannot be simulated, whereas your process always has a finite conditional variance (conditional on starting from 0), so this is not a problem.

Is this correct?

Re: Hu McCulloch (#112),

Yes, my process is still a blog-level stats process ;) Hosking’s paper is very interesting; recursively make a prediction (with prediction interval) and then draw a sample from a distribution that obeys the prediction. I need to think about that with d=0.5 a bit, though.

Other interesting lines from the paper (OT, sorry ) :

nice example of a natural property ;)

and

Yup. Sitting in front of an open MATLAB console trying hard to find something else to do, so I might as well run help commands. :) Actually, I have something else to do but I need the software used to implement the original (sort of broken) solution. Ok, so that’s a cop-out for 4:00 p.m. on a Friday.

Mark

I have considerable experience in the mineral exploration business and have a very precise understanding of how much work is involved in doing the samples and measurements. Unlike climate, the sampling and assaying labor is divided in geology – the geologist collects the samples and the assaying is done in an independent lab.

All the work seems to me to be in the stages up to and including the isotope analysis. Once the isotope measurements are done, the dating looks like merely applying a very straightforward algorithm. I’ve made a pretty good replication on another thread – and it looks to me like how Th adjustments are done may be material to the correlation/anti-correlation of Q5 and D1.

#115. Jud, no need to be chippy about the Starbucks Hypothesis. The canonical bristlecone pine collections are in relatively convenient locations. The failure to do elementary updating of these conveniently located sites, to ensure that they actually recorded the warmth of the 1990s and 2000s, appalled me when I first became aware of the problem and appals me still.

It is precisely because I am familiar with what’s involved with mineral exploration in remote areas that this appalled me. I like geologists and field people and have never deprecated field activity or discouraged the collection of data. Quite the opposite. I encourage the collection of data.

That doesn’t mean that Mann and the Team should continue to regurgitate Graybill’s bristlecone pine chronologies. Or that the broader climate science community should acquiesce in such regurgitation without registering any objection.

Re #94, 98, 103, 110, 112,

On further thought, if you’re going to subtract the sample mean anyway, it doesn’t matter what the initial value of the process is, so UC’s method is just as good as Hosking’s in this context, and in fact is better, since it can simulate d = .5, whereas Hosking cannot. Although the unconditional variance is infinite with d = .5, the variance conditional on the initial value is quite finite, as is the variance conditional on the sample mean value, just as it is for an “infinite variance” random walk with d = 1.

Note that if we define MqI = sqrtm(MhI) to be the generator of a “quarter-integrated process,” then MqI*MhI generates a fractionally integrated process starting at 0 at t = 0, but with d taking on the supposedly verboten value 0.75!

This procedure cannot generate arbitrary real values of d, but it can arbitrarily closely approximate any real value with values of d that have finite binary representations.
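Hu's d = 0.75 construction can be sketched as follows (Python/NumPy, with scipy's sqrtm standing in for MATLAB's; n is arbitrary). The consistency check relies on the fact that powers of the same matrix commute:

```python
import numpy as np
from scipy.linalg import sqrtm  # principal matrix square root

n = 16
MI = np.tril(np.ones((n, n)))   # full integrator: d = 1
MhI = np.real(sqrtm(MI))        # half integrator: d = 1/2
MqI = np.real(sqrtm(MhI))       # quarter integrator: d = 1/4
M34 = MqI @ MhI                 # powers of one matrix commute, so this has d = 3/4

# consistency check: (MI^(3/4))^2 = MI^(3/2) = MI * MI^(1/2)
print(np.allclose(M34 @ M34, MI @ MhI))
```

Repeating the sqrtm step gives d = 1/8, 1/16, …, so products of these factors realize any d with a finite binary representation, exactly as described above.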

Re UC #116, how many exceed the .05 critical value that assumes independence? (about .088 with the large sample approximation.) It looks like a lot more than 5%!

Re: Hu McCulloch (#118),

red = white noise, blue = half-integrated

The .05 critical value seems to be 0.28 (n=512) for this kind of noise. ‘Effective n’ drops from 512 to some 50.
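The ~0.28 figure can be checked by Monte Carlo. Below is a Python re-rendering of the MATLAB setup from the thread (seed and trial count are arbitrary choices of mine); it estimates the two-sided 5% critical value of the sample correlation between two independent half-integrated series:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(42)
n, trials = 512, 200
MhI = np.real(sqrtm(np.tril(np.ones((n, n)))))  # half-integrator, as in UC's script

corrs = []
for _ in range(trials):
    r1 = MhI @ rng.standard_normal(n)   # two independent half-integrated series
    r2 = MhI @ rng.standard_normal(n)
    corrs.append(np.corrcoef(r1, r2)[0, 1])

crit = np.quantile(np.abs(corrs), 0.95)  # empirical two-sided 5% critical value
print(round(crit, 2))
```

The result lands in the neighborhood of the 0.28 quoted above, far above the ~0.088 value one would use under an independence assumption; that gap is the "effective n drops to some 50" point.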

There’s an even simpler problem here.

In my calculations, as noted above, because Fleitmann et al failed to archive their Socotra data and because Burns refused to provide the data when requested, I compared the Qunf Q5 data to Socotra as used in Mann et al 2008. However, given that I used Mann data, I obviously should not have assumed that it would correspond to other versions of the data. Below I’ve excerpted Figure 9 of Fleitmann et al (which shows Dimarshim D1) and a replot of the Mann Dimarshim D1 version (identified there as “burns_2003_socotrad18o”). The versions are similar, but there are notable differences that prohibit use of the Mann version for analysis without reconciliation. Note the downspike to nearly -5 in the Fleitmann version (grey series) at about 3570 BP. The corresponding downspike in the Mann version is at about 3800 BP, about 230 years earlier. The preceding upspike to about -1.5 occurs just before 3900 BP in Fleitmann and about 4100 BP in Mann. The Fleitmann version ends just before 4500 BP, while the Mann version goes to about 2700 BP. The Mann version seems to be about 200 years earlier than the Fleitmann version.

Fleitmann et al 2007 Figure 9.

Replot of Mann et al 2008 Version of Socotra O18.

Obviously, it was not safe practice to use Mann’s data version for analysis purposes – whatever was I thinking. Having said that, Mann must have got his data from somewhere. I presume that, at one time, there was a different dating scheme for Dimarshim D1 that was floating around in a grey version and, as in MBH98, Mann latched onto a grey version.

I guess that there’s not much that can be done until I have a better version of the Dimarshim data. In passing, I note that the WDCP archive for Qunf Q5 lacks depth measurements – this is poor practice in case dating assumptions change.

#120, Steve:

Exactly. I was going to post on the same problem, which I found when I tried to superpose the two proxy data sets, scaled and oriented like Fleitmann fig. 9. The offset of Socotra D1 is immediately obvious (see below). I would caution against any conclusions until we can see the actual data set used, possibly along with some explanation of intermediate versions (Mann’s version of the proxy is dated 2003). I would say the Mann version does seem to match up to Fleitmann in the period covered in the Mann study (a peak around 1920 bp seems to be present in both). Also, fig. 4 appears to show D1 only back to 4.4k bp, but the fig. 9 D1 gray curve seems to go beyond 4.5k bp, as do the “signpost” dates in Table 1. Besides possible dating changes in the earlier part of the record, there may also be some slight plotting errors.

Obviously, it’s difficult to comment on the “anti-correlation” until this issue is resolved.

D1 vs Q5, Fleitmann et al style, but using available data sets:

#121:

Not sure the graph came out right – here’s the link:

#121

Here’s the same graph, but this time with D1 Socotra (Mann version) shifted 270 years towards the present. The match with fig. 9 is pretty good, but not perfect (some “stretching” would be required to get D1 to line up properly – I think I’ll wait for the real data instead of trying to do that).

But just for the heck of it, I tried a preliminary correlation on the overlapping data sets; shifting D1 forward results in a correlation of -0.29 (instead of -0.08).
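The shift-then-correlate step can be sketched generically. The series below are synthetic stand-ins for Q5 and the Mann D1 version (a 54-sample offset, i.e. 270 "years" at 5-year spacing, is built in by construction purely to illustrate the scan; none of this is the actual data):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(0, 2000, 5)                   # hypothetical common time axis, 5-yr steps
base = np.cumsum(rng.standard_normal(t.size))
q5 = base + 0.5 * rng.standard_normal(t.size)   # stand-in for Q5
shift = 54                                  # stand-in D1: -q5's driver, offset 54 steps
d1 = np.empty_like(base)
d1[shift:] = -base[:-shift]
d1[:shift] = rng.standard_normal(shift)

def corr_at_shift(a, b, k):
    """Correlation after sliding b forward by k samples (overlap only)."""
    if k == 0:
        return np.corrcoef(a, b)[0, 1]
    return np.corrcoef(a[:-k], b[k:])[0, 1]

shifts = range(0, 100)
corrs = [corr_at_shift(q5, d1, k) for k in shifts]
best = min(shifts, key=lambda k: corrs[k])  # shift giving the most negative correlation
print(best, round(corrs[best], 2))
```

As Deep Climate found with the real series, the correlation at the right shift can look very different from the unshifted one, which is exactly why the dating uncertainty matters so much here.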

Here’s the same data (Mann D1 shifted forward by 270 years), displayed “McIntyre” style, but with trend lines superposed. Note opposing trends in the two data sets.

Re: Deep Climate (#124),

I think it is time to put some CI limits on the trend correlations with all things considered, i.e. autocorrelation. Are the trends shown in the graph in this post statistically different? What would two synthetic series with no correlation produce by pushing one ahead or back? R = -0.08 gives R^2 = 0.006 and R = -0.29 gives R^2 = 0.08. Are we really supposed to get excited about this?

#124. I noted in my post that I was going to discuss trends on another occasion. I would have liked to do so by now, but, after spending time on this, I got annoyed at the crappy data situation and got frustrated by my inability to replicate any dating results and I decided to leave speleothems alone for a while until better data/code turned up.

However, I’m glad that you’ve stayed on the case. Fleitmann interpreted the speleothem data as evidence of N-S movements of the ITCZ – this is a topic that actually interests me considerably and which I’ve discussed from time to time. Newton et al 2006, reviewed a while ago, contained an interesting analysis of proxies, also interpreting them as ITCZ movements. In another context, I suggested that Kim Cobb’s idea of a “cool medieval Pacific” was equally consistent with slight N-S movements of the ITCZ. So this is something that we’ve kept an eye on, and hopefully we can extract some information from the Oman speleothems in this respect.

My earlier point was merely a statistical comment on the implausibility of Fleitmann’s claim to have observed a “strong visible anti-correlation” on a multidecadal scale – a point that simply cannot be established on the available information. The invalidity of this point does not preclude trends diverging on a millennial scale. I have some concerns about the effect of the Q5 hiatus and what that might entail for the rest of the record were it to be examined closely, but for now I have no reason to contest the idea that there has been a divergent trend in O18 values over a couple of millennia, despite southern Oman and Yemen being nearby. What this means is a different issue. In addition, if O18 has different millennial trends between Oman and Yemen, it poses some knotty problems for using this sort of information in Mann-style data-mining algorithms – an issue that we can discuss on another occasion.

#125 – Steve

Yes, I figure this is the best that can be done for the moment, but will need to be updated of course. And it’s important to note that the focus of the Fleitmann paper seems to be more on the period up to 2000 b.p.

So … the relationship of the proxies to precipitation in the last two millenia (and the fact that both D1 and Q5 have trended up in the last 1500 years), as well as the imputed relationship between precipitation and temperature, are all discussions for another time. I have a feeling that Dr. Fleitmann may weigh in on this and that would be interesting too.

I agree that “strong” should not have been used. And it does seem there is confusion in terminology between “anti-correlation” and “opposing trends” in this paper. I think there was a lack of clarity in the description of fig. 9 as well – after all, the axes were reversed from fig. 4 and fig. 7. Does that mean we should see a “strong” visible positive correlation with D1 reversed, which therefore implies a “visible” anti-correlation in the trends?

I see that way back I suggested trying to shift 300 years the other way. Little did I know …

#126 – Kenneth

CI limits would be good – if there is an interest in posting the related trend statistics, maybe I could dig them out. I guess they should also be adjusted to account for the true number of data points (one fifth of the interpolated data sets, right?). I’m not particularly “excited”, by the way (see above). But it is interesting to note that the original data sets (with D1 unshifted) had a very slight negative correlation but slightly positive trends (pretty much flat for D1) for the overlapping period. So things are making a little more sense this way, aren’t they? Anyway … maybe this horse is well and truly flogged now … I see we’re the last nerds standing.

Original Q5-D1 comparison (McIntyre style, unshifted D1 from Mann, with trend lines)
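Kenneth’s point about adjusting for the true number of data points can be sketched with the Fisher z-transform. A rough Python illustration (my own, with illustrative sample sizes, not the actual Q5/D1 counts):

```python
import math

def critical_r(n_eff, z=1.96):
    """Approximate two-sided 95% critical |r| under the null of zero
    correlation, via the Fisher z-transform with n_eff effective points."""
    return math.tanh(z / math.sqrt(n_eff - 3))

# Interpolation inflates the apparent sample size; if only one fifth of
# the interpolated points are independent, the significance bar rises:
print(critical_r(500))  # bar using the inflated count
print(critical_r(100))  # stricter bar using the true count
```

With one fifth as many effective points, the critical |r| roughly doubles, which is why a correlation that looks significant against the interpolated count may not survive the adjustment.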

Re UC #94, #128, and my #118,

By all means, we need a general formula for the covariance matrix of y-ybar, where y is a fractionally integrated process of order d, for any real d in the interval [.5, 1] – a range in which a stationary FI process supposedly cannot exist. Hosking (1984) gives the formula for the unconditional covariance matrix of y for d < .5, but the unconditional variance explodes at d = .5, unless the mean ybar is first subtracted out.
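For d < .5, Hosking’s autocovariances follow from a simple recursion, and the covariance matrix of y-ybar is obtained by sandwiching the Toeplitz matrix with the demeaning matrix M = I - 11'/n. A numpy sketch (my own translation, not code from any of the papers discussed; near d = .5 the individual entries blow up while the demeaned combination stays finite in principle, so numerically this is only safe for d comfortably below .5):

```python
import math
import numpy as np

def fi_autocov(n, d):
    """Autocovariances gamma(0..n-1) of an ARFIMA(0,d,0) process with unit
    innovation variance, d < 0.5 (Hosking's formula via its recursion)."""
    g = np.empty(n)
    g[0] = math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2
    for k in range(1, n):
        g[k] = g[k - 1] * (k - 1 + d) / (k - d)
    return g

def demeaned_cov(n, d):
    """Covariance matrix of y - ybar: M C M' with M = I - 11'/n.
    Singular (the constant vector is annihilated) but finite."""
    idx = np.arange(n)
    C = fi_autocov(n, d)[np.abs(idx[:, None] - idx[None, :])]
    M = np.eye(n) - np.ones((n, n)) / n
    return M @ C @ M.T

V = demeaned_cov(100, 0.45)
# Rows of V sum to zero, so V is singular -- exactly the case a
# singularity-tolerant Cholesky is needed for.
```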

When d = 1 and the sample is large, y behaves like a standard Brownian Motion or Wiener process W(r) for r in [0, 1], apart from vertical and horizontal scaling. The residual y-ybar then behaves, apart from scaling, like what might be called the “Brownian residual” R(r) = W(r)-Wbar, where Wbar is the average of W(r) over [0, 1]. It can easily be shown that Var(R(r)) = 1/3 - r(1-r), which takes its largest value of 1/3 at r = 0 and 1, and its smallest value of 1/12 at r = 1/2. This tendency of the variance to be largest at the ends helps account for the spurious appearance of an up or down trend (or even a quadratic shape) when in fact there is no uptrend or downtrend. A similar effect would hold for d < 1, but this has never been quantified. The method of my #118 could be used to construct it, but only very awkwardly, and only for d equal to a finite binary fraction.
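The Var(R(r)) = 1/3 - r(1-r) formula is easy to check by Monte Carlo; a quick numpy sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 400, 5000

# Brownian motion on a grid of n steps over [0, 1], then subtract each
# path's mean to get the "Brownian residual" R(r) = W(r) - Wbar.
W = np.cumsum(rng.standard_normal((reps, n)) / np.sqrt(n), axis=1)
R = W - W.mean(axis=1, keepdims=True)

var = R.var(axis=0)
r = np.arange(1, n + 1) / n
theory = 1.0 / 3.0 - r * (1.0 - r)

# var[-1] should land near 1/3 and var[n // 2] near 1/12, tracing the
# U-shape that makes spurious end-point trends look dramatic.
```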

Once the covariance matrix of y-ybar is determined, the process can easily be simulated once its Cholesky decomposition is known. Unfortunately, the Matlab function chol will not ordinarily do it, since it requires its argument to be non-singular. A variant [R, p] = chol(X) permits singular X matrices, but then R'*R does not equal the full matrix X.

The following Matlab script for a function cholesky(X) yields an upper triangular matrix R such that R'*R = X for any symmetric X. If u is a vector of standard normals, then R'*u has the covariance matrix X, even if X is singular:

function [r] = cholesky(x)
% function [r] = cholesky(x)
% Written 12/26/08 by J. Huston McCulloch, Ohio State Univ.
% Computes upper triangular matrix r such that r'*r = x for
% symmetric matrix x.
% x may be singular, with singularities in any order.
% r contains a row of zeros for each zero eigenvalue of x.
% If x is not positive semidefinite, r will be complex.
% Values less than 1.e-14 are treated as 0.
% Answers are not more accurate than 1.e-7.
n = size(x,1);
% error traps:
if n ~= size(x,2)
  disp('x is not a square matrix in function cholesky')
  disp('size(x) = ')
  disp(size(x))
  r = NaN*ones(size(x));
  return
end % if
if sum(sum(x ~= x'))
  disp('x is not a symmetric matrix in function cholesky')
  disp('x = ')
  disp(x)
  r = NaN*ones(n,n);
  return
end % error traps
r = zeros(n,n);
% solve r(1,1)^2 = x(1,1) for r(1,1):
r(1,1) = sqrt(x(1,1));
% if the argument of sqrt is below 1.e-14, its root is below 1.e-7; treat as 0:
if abs(r(1,1)) < 1.e-7
  r(1,1) = 0;
end % if
if n == 1
  return
end % if
if r(1,1) ~= 0
  % solve r(1,1)*r(1,2:n) = x(1,2:n) for the remainder of row 1:
  r(1,2:n) = x(1,2:n) / r(1,1);
end % if
for i = 2:n
  % solve r(1:i,i)'*r(1:i,i) = x(i,i) for r(i,i):
  r(i,i) = sqrt(x(i,i) - r(1:i-1,i)'*r(1:i-1,i));
  if abs(r(i,i)) < 1.e-7
    r(i,i) = 0;
  end % if
  if r(i,i) ~= 0 && i ~= n
    % solve r(1:i,i)'*r(1:i,i+1:n) = x(i,i+1:n) for the remainder of row i:
    r(i,i+1:n) = (x(i,i+1:n) - r(1:i-1,i)'*r(1:i-1,i+1:n)) / r(i,i);
  % else r(i,i+1:n) remains 0
  end % if
end % i
return
% end of function cholesky

If X is known to be invertible, the built-in Matlab function chol is about 6 times faster, so the above should only be used on singular matrices.
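For what it’s worth, the same skip-the-zero-pivot idea can be written in numpy. This is my own sketch of the technique, not a port of the exact script above (one difference: it clamps tiny negative pivots to zero, where the Matlab version would go complex for an indefinite x):

```python
import numpy as np

def cholesky_psd(x, tol=1e-7):
    """Upper-triangular r with r.T @ r == x for symmetric positive
    semidefinite x, tolerating singular x: each (near-)zero pivot
    leaves a row of zeros, as in the Matlab cholesky() above."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if x.shape != (n, n) or not np.allclose(x, x.T):
        raise ValueError("x must be a symmetric square matrix")
    r = np.zeros((n, n))
    for i in range(n):
        d = x[i, i] - r[:i, i] @ r[:i, i]
        piv = np.sqrt(max(d, 0.0))
        if piv < tol:
            continue  # row i stays zero for a (near-)zero eigenvalue
        r[i, i] = piv
        if i + 1 < n:
            r[i, i + 1:] = (x[i, i + 1:] - r[:i, i] @ r[:i, i + 1:]) / piv
    return r

# A singular example: covariance of demeaned coordinates (rank 2 of 3).
M = np.eye(3) - np.ones((3, 3)) / 3
X = M @ M.T
R = cholesky_psd(X)
assert np.allclose(R.T @ R, X, atol=1e-6)
```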

Barnes and Allan (1966) describe one method; I haven’t yet tried their version to see how it works. One reference they had to use was

Rand Corp., A Million Random Digits with 100,000 Normal Deviates, Glencoe, Illinois: Free Press, 1955.

for us this should be easier ;)

—

Barnes and Allan : A Statistical Model of Flicker Noise, Proceedings of the IEEE Vol. 54, No. 2 February, 1966

Re UC #130,

I like your method of #94 better than Barnes and Allan’s for simulating long-memory half-integrated processes that look rather similar to Fleitmann’s series.

For one thing, your method is exact (albeit a lot more computation-intensive) for a finite sample.

For another thing, Barnes and Allan seem to have forgotten to reciprocate their coefficients. Their (10) effectively defines MBA = sqrt(tril(toeplitz(1:n))), and then pre-multiplies a column vector of N(0,1) RVs by this matrix. In order to approximate your matrix MhI, the non-zero coefficients would all have to be inverted. Their (15) (with an ad-hoc 2/3 power instead of 1/2) does give a better approximation to your MhI, but again only when the non-zero coefficients are all inverted.

Did you invent the sqrtm(MI) method, or has this been around? If it’s new you should definitely publish it somewhere other than on CA!

This half-integrated “1/f” noise seems to be known as “pink” noise. See e.g. the Wikipedia article on pink noise, and the related article on the “colors of noise”. This literature reserves “red noise” for a pure I(1) process, rather than the merely serially correlated process the term has sometimes denoted in the climate discussions here.

Wiki doesn’t cite Barnes & Allan, but does trace the idea of half-integrated noise back to Schottky in 1918.

My home computer has an old version 4 of Matlab, on which sqrtm(MI) is very slow, not to mention inaccurate, for n = 512. Version 5+ is much faster and more accurate, so the computational practicality of your sqrtm(MI) method is a relatively recent thing.
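If MI is the lower-triangular matrix of ones (the cumulative-sum operator, i.e. (1-L)^-1) – my assumption about the #94 construction – then its square root is itself lower-triangular Toeplitz, with first column equal to the MA coefficients of (1-L)^-1/2. That gives the same matrix as sqrtm(MI), up to the choice of root, with no expensive matrix-function call. A numpy sketch:

```python
import numpy as np

n = 256
# Cumulative-summation operator: MI @ u integrates u once, i.e. (1-L)^(-1).
MI = np.tril(np.ones((n, n)))

# The MA coefficients of (1-L)^(-1/2): psi_0 = 1, psi_k = psi_{k-1}(k-1/2)/k
# (equivalently C(2k,k)/4^k). Stack them as a lower-triangular Toeplitz matrix.
psi = np.empty(n)
psi[0] = 1.0
for k in range(1, n):
    psi[k] = psi[k - 1] * (k - 0.5) / k
Mh = np.zeros((n, n))
for j in range(n):
    Mh[j:, j] = psi[: n - j]

assert np.allclose(Mh @ Mh, MI)  # Mh is a matrix square root of MI

# Half-integrated ("pink", 1/f) noise: apply Mh to a white-noise vector.
rng = np.random.default_rng(1)
y = Mh @ rng.standard_normal(n)
```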

Re: Hu McCulloch (#131),

I did invent it. Then I mentioned the method to one math professor, who said it’s trivial. That happens all the time ;) Still, I have something to publish from CA work, and it won’t go to a climate science journal.

I wrote a more efficient, recursive way to compute that matrix. I’ll clean it up and email it.

While trying to learn how uncertainties in sea level changes are computed, I found this paper :

http://sealevel.colorado.edu/MG_Leuliette2004.pdf

wherein it is said that

The paper cited is:

and I even found the Matlab code :

http://www.ldeo.columbia.edu/~kja/access/code/ebisuzaki.m

Then I tried the code with data in #94

and got the result

Critical R is quite close to the value I computed in #119 , where I knew how the process was made. Not bad!
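The idea behind that test is phase randomization: build surrogate series with the same power spectrum as the data but random Fourier phases, and take the 95th percentile of the resulting null correlations as the critical r. A Python sketch of my own (not a port of the ldeo ebisuzaki.m script):

```python
import numpy as np

def phase_randomize(x, rng):
    """Surrogate with the same power spectrum as x but random phases."""
    n = len(x)
    amp = np.abs(np.fft.rfft(x))
    ph = rng.uniform(0.0, 2.0 * np.pi, len(amp))
    ph[0] = 0.0              # keep the mean component untouched
    if n % 2 == 0:
        ph[-1] = 0.0         # the Nyquist coefficient must stay real
    return np.fft.irfft(amp * np.exp(1j * ph), n)

def critical_r(x, y, nsim=500, level=0.95, seed=0):
    """Quantile of |corr| between surrogates of x and the fixed series y."""
    rng = np.random.default_rng(seed)
    rs = [abs(np.corrcoef(phase_randomize(x, rng), y)[0, 1])
          for _ in range(nsim)]
    return float(np.quantile(rs, level))

# Two independent random walks: the spectrum-preserving null gives a far
# larger critical r than the roughly 2/sqrt(n) white-noise rule of thumb.
rng = np.random.default_rng(2)
x = np.cumsum(rng.standard_normal(200))
y = np.cumsum(rng.standard_normal(200))
print(critical_r(x, y))
```

The key design point is that the surrogates inherit the full autocorrelation structure of the data, so no AR(1) (or any other parametric) noise assumption is needed.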

Possibly related posts:

http://www.climateaudit.org/?p=5341

http://wattsupwiththat.com/2009/04/06/sea-level-graphs-from-uc-and-some-perspectives/

http://www.climateaudit.org/?p=3720

Re: UC (#133), You may also want to look at the presentation by Leuliette here (Pdf file).

Thanks Geoff, I’ll take a look.

But before going to sea level business, here’s the interesting thing:

This method is published in a high-quality journal and used by sea level experts. Why wouldn’t Mann use it for testing significant correlations in proxy vs. temperature series? Because it gives critical r values that are in the range of 0.2 – 0.3? I’ll check how many of the Mann08 proxies survive this test …

Re: UC (#135),

It seems that only

1070 tornetrask

1061 tiljander_2003_darksum

survive this test in the early steps. At the AD1400 step, the following proxies get through:

271 ca630

272 ca631

287 cana106

314 cana175

362 co556

397 fisher_1996_cgreenland

424 gisp2o18

628 mo037

654 mt110

778 nm560

796 norw010

820 nv516

908 schweingruber_mxdabd_grid1

909 schweingruber_mxdabd_grid10

910 schweingruber_mxdabd_grid100

920 schweingruber_mxdabd_grid11

927 schweingruber_mxdabd_grid12

933 schweingruber_mxdabd_grid18

934 schweingruber_mxdabd_grid19

935 schweingruber_mxdabd_grid2

936 schweingruber_mxdabd_grid20

937 schweingruber_mxdabd_grid21

938 schweingruber_mxdabd_grid22

959 schweingruber_mxdabd_grid42

960 schweingruber_mxdabd_grid44

961 schweingruber_mxdabd_grid45

977 schweingruber_mxdabd_grid6

988 schweingruber_mxdabd_grid70

1002 schweingruber_mxdabd_grid89

1070 tornetrask

1061 tiljander_2003_darksum

1104 ut509

1122 vinther_2004_scgreenland

427 haase_2003_srca

330 chuine_2004_burgundyharvest

Leave Tiljander out, calibrate with iHAD_NH_reform: http://signals.auditblogs.com/files/2009/04/cce.png ..

Re: UC (#136), Nice picture. It sure gives me confidence (limits).

Re: Geoff (#137),

I think it’s ok, given the quality of scientifically published reconstructions.

Pros:

- Proxy screening was done locally, using a published method, without the restrictive AR(1)-noise assumption.

- Calibration results and CIs are obtained using the Brown82 equations.

Cons:

- Local proxy vs. global temperature calibration

- No dynamic model for temperature applied

- Some proxies used might have some problems (search CA)

Re: UC (#138),

Looks to me like, if you draw a line through 0 (and lots of other values), well over 95% of the values are within the 95% envelope. Which means the reconstruction can’t be used to reject the null hypothesis that there has been no temperature rise in the past 2k years.

Re: Dave Dardinger (#139),

Correct. Statements such as in Mann08,

are based on unrealistic CIs. The M&M reply made that clear.

The figure is here now: http://www.climateaudit.info/data/uc/Ebisuzaki_cce.png (cf. http://noconsensus.wordpress.com/2011/07/27/123-2/ )

Calibration should be done locally, but I’m not sure how to combine the resulting non-normally distributed outputs into a global average. The median-tree approach (once you’ve seen one tree …) might actually work well.