Besonen et al 2008 on Hurricane Proxies

There have been a couple of recent mentions of Besonen et al 2008 (including Ray Bradley) which discusses varve sediment thickness in Lower Mystic Lake, New England as a hurricane proxy, reported as a “1,000-year, annually-resolved record of hurricane activity from Boston, Massachusetts”.

Before discussing the article, I checked to see whether any of the data had been archived at WDCP. Bradley had told the House Energy and Commerce Committee:

When I or my students have generated data sets they are generally sent to the WDC-A (World Data Center for Paleoclimatology) once the results have been published. This is the normal procedure followed in my field.

Unfortunately, a search under Besonen did not show any contributions. In fact, a search under Bradley likewise showed no contributions other than ones where he was joint author with Mann (or in one case Jones). I guess WDC-A forgot to include Bradley’s contributions in its index. (See CA discussion from 2005 here.)

The article refers to SI to be located at ftp://ftp.agu.org/apend/gl/2008GL033950; however there is no such directory and ftp://ftp.agu.org/apend/gl/ showed no relevant candidates. The Supplementary Information linked in the HTM version of the article consisted only of radiocarbon dates, useful, but hardly complete.

The graph of sediment thickness shown in the article noticeably resembles our favorite shape as shown below. However, as observed in the article, there is (unsurprisingly) substantial non-climatic disturbance of sedimentation patterns in the Boston area and post-1870 data is removed from the analysis. Accordingly the authors say:

We confined our analysis to the period prior to 1870 given the significant anthropogenic interference and altered sedimentation dynamics as discussed above.

Figure 2 (a) LML varve thickness time series plot and identified extreme events. In the plot, actual varve thicknesses (mm) are plotted by the lower black line. The thickened gray line shows a robust estimate of the time dependent background thickness based on median smoothing with a 17 year window. The upper black line represents the med +3.5 rstd threshold, and varves with thicknesses which fall above this TDV are considered extremes (total of 47 observed). Of the 47 identified extreme events, the 36 which contain a graded bed are marked by filled black circles and listed in the inset table, and the 11 which do not contain a graded bed are marked by open black circles. The dashed vertical line at 1630 indicates the prehistoric/historic boundary for the region.

“Significantly”
Besonen et al report that certain centuries had a “significantly” higher hurricane frequency:

Hurricane frequency, as recorded at LML, has not been constant over the last millennium (Figure 2b); the 12th–16th centuries had a significantly higher level of hurricane activity (up to 8 extreme events occurring per century) compared to the 11th and 17th–19th centuries when only 2–3 per century was the norm.

On the other hand, they note that a nearby study had also observed “significant” changes, but apparently in a different direction:

We note that conclusions about frequency changes reached from the LML record differ from those reached by studies based on lower resolution records from nearby areas. For example, a study from Long Island [Scileppi and Donnelly, 2007] concluded that activity had significantly increased over the last 300 years with reduced activity during the earlier part of the millennium.

Figure 2b, referred to in the above statement, is shown below:

Figure 2b. (b) Frequency of hurricane-related deposits in the LML record grouped by century. The darker central bars represent the number of extreme events identified using a TDV of med +3.5 rstd. The flanking light gray bars represent the number of identified extremes using TDVs of med +2.0 rstd. (left) and med +5.0 rstd. (right). Note that given our analysis range (1011–1870), the first and last columns do not span a full century.

A Question
Although the authors state that the 12th–16th centuries had a “significantly higher” level of hurricane activity, they do not describe how they carried out their significance test, nor do they provide the data with which one could conduct a significance test of one’s own. Perhaps one of the readers who is interested in Poisson calculations for hurricanes would be interested in writing to Bradley and Besonen and (1) inquiring what significance test was used as a basis for this claim; (2) obtaining the underlying data used to make the claim and then carrying out their own significance test to see whether the variations actually show “significantly higher” levels or whether the data could be the result of a Poisson distribution. I have a pdf of the article.

Reference:
Besonen, M. R., R. S. Bradley, M. Mudelsee, M. B. Abbott, and P. Francus (2008), A 1,000-year, annually-resolved record of hurricane activity from Boston, Massachusetts, Geophys. Res. Lett., 35, L14705, doi:10.1029/2008GL033950. url ftp://ftp.agu.org/apend/gl/2008GL033950

This entry was written by Stephen McIntyre, posted on Jul 24, 2008 at 12:58 PM, filed under General.

43 Comments

Barclay E. MacDonald

Posted Jul 24, 2008 at 2:04 PM | Permalink

Could this be an oversight by Bradley? If I made a specific representation to the U.S. Congress in response to a question on a specific issue, I would be very, very careful about that representation and its truthfulness. Perhaps Bradley is busy preparing for the PR Challenge and this is simply a terrible oversight. If so, I’m sure it will be fully corrected very quickly? What’s the chance I am wrong?
Steve McIntyre

Posted Jul 24, 2008 at 2:27 PM | Permalink

It’s pretty amazing that Bradley would say something like that to a congressional committee. What if they’d asked him where the data sets were? He’s lucky that Mann was thrown overboard in the hearings and the committee lost interest.
ad

Posted Jul 24, 2008 at 3:21 PM | Permalink

OT: But can’t resist a bit of nominative determinism when I see it… one of the authors is named Mudelsee!
Earle Williams

Posted Jul 24, 2008 at 3:28 PM | Permalink

As a geologist who never studied varves, I am curious as to the low frequency (ie near linear) increase from circa 1200 to present. Anyone have a professional opinion as to whether that represents climatic changes or perhaps the evolution of a basin?

Steve: they mention the possibility of compression in the article.
Jeff K

Posted Jul 24, 2008 at 4:10 PM | Permalink

Dummy question: what is a ‘varve’ & how can it (they?) be directly correlated to a hurricane (especially that far in the past) and not some other high wind event such as a Nor’Easter, Derecho thunderstorm, etc. Seems bogus to me but nobody here is questioning it so…
UK John

Posted Jul 24, 2008 at 4:15 PM | Permalink

link to AGU not work!

Steve: That’s because they didn’t create the target yet. Not a problem at our end.
jae

Posted Jul 24, 2008 at 4:32 PM | Permalink

I note that the heavy curve shows a robust estimate. Are the other curves not robust?
John A

Posted Jul 24, 2008 at 4:51 PM | Permalink

$10 bet that the distribution of extreme weather events in that proxy is Poissonian.
Larry Sheldon

Posted Jul 24, 2008 at 5:35 PM | Permalink

I have a newly identified problem.

I see an article that has a graph that has the characteristic “hockey stick” shape, and I immediately without reading the words conclude “another crank AGW proponent or another attempt to pound sense into one.”

Dr. McIntyre, is there any hope for me? Will I ever again be able to read articles way over my head without thinking, “oh no, not again!”

Seriously, am I being unfair?
RomanM

Posted Jul 24, 2008 at 5:42 PM | Permalink

OK, I like to do math and stat stuff so I’ll give it a shot. Figure 2(b) makes it pretty easy to do a test without a request for the data. We’ll use the middle bars (corresponding to +3.5 rstd) to illustrate. Archiving my data: the number of events in each century: 2, 5, 8, 4, 5, 4, 2, 3, 3. Total = 36. Now, if there are no rookie calculation errors (after the dinner wine)…

Let’s assume that it is a Poisson process and the null hypothesis is that the intensity of the process is uniform in time. Since we are given 36 occurrences in the time period considered, it is a known fact that under this condition and our assumption, the 36 events occur at times which behave like a collection of 36 points randomly selected uniformly over the entire study time period. If we bin them by century, the number of occurrences in a given century has the same distribution as the frequency of occurrence of a particular face in 36 rolls of a 9-sided die (Binomial, n = 36, p = 1/9 – the simultaneous number of points in each bin has a multinomial distribution, but that just makes the calculation unnecessarily longer and more complicated). I am actually stretching the two end centuries a bit also, but this won’t affect the answer too much, so we will ignore that.

Now, since the authors state that the 12th–16th centuries had a “significantly higher” level of hurricane activity, our test statistic will be the maximum number of occurrences observed in any century = 8. We need to calculate a probability value for the statistic. The probability that a given (chosen beforehand) century has 8 or more occurrences (using R and Minitab – just to double check the results) is .0408. Statistically significant? However, that is if the century has been chosen ahead of time without looking at the results. Using the Bonferroni (or Boole’s) inequality to save calculation, the probability that the maximum for some century is 8 or more is at least .343 (and at most 9 × .0408 = .367). That means if the frequency of occurrence of these events does not change over time, more than one-third of the time we will see 8 or more in the same century when 36 of them occur over a period of 900 years. Hardly significant.

The situation for the bars on the right side gives results very similar to the ones above. The bars on the left give results much larger. There seems to be a tendency for many of the folks in climate science to declare “significance” using the eye-ball test rather than to use tried-and-true stat techniques for interpreting the magnitude of differences observed in their studies. Gotta go keep the significant other in good spirits… Back tomorrow.
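
The figures quoted above can be checked with a short R sketch. It assumes, as the comment does, nine equal-width century bins and the counts read off Figure 2b; the variable names are illustrative and do not come from any script posted in this thread. The upper bound is the first-order Bonferroni (Boole) sum, and the lower bound subtracts the pairwise overlaps, which is presumably how the “at least” figure was obtained.

```r
# Century counts read off Figure 2b (middle bars), as listed in the comment above
counts   <- c(2, 5, 8, 4, 5, 4, 2, 3, 3)
n_events <- sum(counts)      # 36 events in total
n_bins   <- length(counts)   # 9 century bins, treated here as equal width
p_bin    <- 1 / n_bins
k        <- max(counts)      # test statistic: largest century count (8)

# P(a century chosen in advance holds k or more of the events)
p_single <- pbinom(k - 1, n_events, p_bin, lower.tail = FALSE)

# First-order Bonferroni (Boole) upper bound for "some century holds k or more"
upper <- n_bins * p_single

# P(two named centuries both hold k or more), summed over the trinomial
p_pair <- 0
for (i in k:n_events) {
  for (j in k:n_events) {
    if (i + j <= n_events) {
      p_pair <- p_pair +
        dmultinom(c(i, j, n_events - i - j),
                  prob = c(p_bin, p_bin, 1 - 2 * p_bin))
    }
  }
}

# Second-order Bonferroni lower bound: subtract every pairwise overlap
lower <- upper - choose(n_bins, 2) * p_pair

round(c(single = p_single, lower = lower, upper = upper), 4)
```

Both bounds sit a little above one third, so the conclusion does not depend on which of the two is quoted.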
Oriz Johnson

Posted Jul 24, 2008 at 5:47 PM | Permalink

As a lurker old enough to have lived through the record heat of the thirties, and since that time having been intimately exposed to executives and mega performers of virtually every ilk, I’m confident Steve McIntyre is certainly the sharpest player I have ever witnessed. I submit it is for him the word awesome was coined. An asset of the first water.
bender

Posted Jul 24, 2008 at 5:47 PM | Permalink

#9

am I being unfair?

When considering the cause of any trend in time-series data the first line of inquiry has to be technical/methodological. i.e. Does the method of sampling produce systematic biases over time (i.e. false trend), or is it unbiased? If this issue is not addressed in paragraph one of the methods, the paper is suspect. Reviewers that do not demand that this aspect be addressed are negligent.

You are being unfair if you dismiss the paper before even reading the methods. It is ok for alarm bells to go off when you see a HS. It is not ok to act on those alarm bells and dismiss the work prematurely.
Larry Sheldon

Posted Jul 24, 2008 at 10:05 PM | Permalink

#12

You are being unfair if you dismiss the paper before even reading the methods. It is ok for alarm bells to go off when you see a HS. It is not ok to act on those alarm bells and dismiss the work prematurely.

You are right, and I am properly chastened. (I’m not really that bad, really, but sometimes …)

I’ll go back and do better.
John A

Posted Jul 25, 2008 at 4:01 AM | Permalink

Re #10:

It looks like my $10 is safe.
RomanM

Posted Jul 25, 2008 at 5:32 AM | Permalink

#14

Not so fast, JohnA! My analysis does not conclude that the process is Poissonian. Given the data as it appears in figure 2b AND the assumption that the occurrences behave as a Poisson-type process, then the analysis shows (in a technically valid fashion) that the observed variation in the numbers over the nine centuries can very reasonably occur if the process intensity is uniform over the given time period. This contradicts the observation of the authors that the higher numbers in some centuries must be a result of systematic differences and not natural sampling randomness. There have been many papers considered on this web site where data have been presented (usually in a graphical fashion) with a mere visual observation that something is unusual as the sole “analytic” justification of the authors’ point of view. Unless the authors have a great deal of experience with the behaviour of randomness, these subjective observations can easily be spurious. A statistical test adds a proper context for the evaluation and puts the entire situation on a more solid objective basis. Hats off to Steve for noticing that the authors’ conclusion was not justifiable.

To illustrate how important experience in observing randomness is, in this situation: If the events occur as a uniform process, which is a more likely result: all 9 centuries each having 4 occurrences (as uniform as it can get) or one century having 5, another 3 and the remainder having 4 each? It surprises many people that the latter is 57.6 times as likely as the former. The reason is that there is only one way to select which century has how many in the first case (each has 4). In the second case there are 9*8 = 72 ways to pick which century has 5 and which has 3. Although each of the 72 cases has a slightly smaller chance of being the result than the uniform case, there are so many of them that the “expected” uniform (all 4s) is much less likely.
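
The 57.6 figure can be checked directly from the multinomial probabilities. The R sketch below is only an illustration, with variable names chosen here; it assumes 36 events falling independently and with equal probability into 9 century bins.

```r
n_bins <- 9
p <- rep(1 / n_bins, n_bins)   # equal chance of landing in each century

# Probability that every century receives exactly 4 of the 36 events
p_all_fours <- dmultinom(rep(4, n_bins), prob = p)

# Probability of one particular arrangement: 5 in one named century,
# 3 in another named century, and 4 in each of the remaining seven
p_one_arrangement <- dmultinom(c(5, 3, rep(4, n_bins - 2)), prob = p)

# 9 choices for the century with 5, then 8 choices for the century with 3
n_arrangements <- n_bins * (n_bins - 1)

ratio <- n_arrangements * p_one_arrangement / p_all_fours
ratio
```

Each single arrangement is 4!4!/(5!3!) = 0.8 times as likely as the all-fours outcome, and 72 × 0.8 gives the 57.6 quoted above.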

As far as your hypothesis goes, JohnA, properly testing whether the process behaves like a Poisson process cannot really be done without more information about it (e.g., knowing the years in which each of the occurrences took place). Your bet still stands. However, it was my impression that the currency unit on this site is the quatloo. What would $10 be in quatloos?
Steve McIntyre

Posted Jul 25, 2008 at 6:11 AM | Permalink

RomanM, we have information on the years of the events – see the inset to the top figure. A question – is a Poisson process the right way of thinking about whether a hurricane hits Boston? Wouldn’t it be more like a binomial process with odds of 3.6%?

I was in Boston for one of the hurricanes – Hurricane Hazel in 1954, a very famous hurricane which caused huge damage in Toronto and affected how subsequent development was carried out in river valleys.
bender

Posted Jul 25, 2008 at 6:58 AM | Permalink

If you are looking at the aggregate count integrated over time, it’s Poisson. If you are looking at probability of an individual occurrence, it’s binomial. Time-scale makes the difference. You are arguing at cross purposes.
RomanM

Posted Jul 25, 2008 at 7:01 AM | Permalink

#16

Steve, if one can assume that the occurrence of hurricanes in the Atlantic basin is a Poisson process, then it is not unreasonable that the ones that hit Boston also form a Poisson process. A problem in some standard text books on stochastic processes is to show the following:

Suppose that we have a Poisson process with intensity parameter (average number of occurrences per unit time) m. If the probability of observing the occurrence is p (with the usual caveats of independence of observing separate occurrences, etc.), then the sequence of “observed occurrences” also forms a Poisson process but with intensity parameter mp. If only those hurricanes reaching Boston are “observed”, they could still be a Poisson process.
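
A quick simulation illustrates this thinning property. The sketch below is in R; the rate m, the observation probability p, the seed and the number of simulated periods are all hypothetical values picked for illustration and are not estimates from the paper. Equal mean and variance near mp is what a Poisson count would show, so the check is consistent with the result but is not a proof of it.

```r
set.seed(1)            # arbitrary seed, only so the run is repeatable
m <- 10                # hypothetical basin-wide rate per unit time
p <- 0.36              # hypothetical chance that a given event is "observed"
n_periods <- 100000    # number of simulated unit-time periods

# Events per period from the full Poisson process
all_events <- rpois(n_periods, m)

# Thinning: each event is independently observed with probability p
observed <- rbinom(n_periods, size = all_events, prob = p)

# A Poisson count with intensity m * p has mean and variance both equal to m * p
c(mean = mean(observed), variance = var(observed), target = m * p)
```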

I cleverly overlooked the box of dates given in the other figure. A better test for uniform occurrence of the Boston hurricanes is based on the observation I made in my first post: “… the 36 events occur at times which behave like a collection of 36 points randomly selected uniformly over the entire study time period”. Thus, test whether the 36 dates given represent a simple random sample from a uniform distribution on an interval [A, B]. The authors seem to indicate that the interval should be: [1011, 1870]. This bothers me a bit since I don’t know exactly how these dates were chosen (was the fact that there were occurrences in 1011 and 1869 influential in making this choice?). If they were influential, then the test should involve only the 34 intervening occurrences and the interval would be [1011, 1869]. A decent goodness-of-fit test for the uniform distribution should be used.

I too remember Hazel. A day or two after it hit, I went on a school trip from Hamilton to Toronto by train. I was impressed by how much damage was done by it. It was my first real experience with a hurricane.
Glacierman

Posted Jul 25, 2008 at 8:05 AM | Permalink

Am I missing something, or do we not have actual hurricane data for the past 60 years or so? How does this data compare with their graph? Seems silly to use varves to try to show an increase in hurricanes near Boston. Did they take into account compaction as they progress downward in the sequence? Have they done anything to demonstrate the physical process by which more sediment is deposited during a hurricane year – and only by a hurricane? Is this just another example of time series data (noise) resulting in a hockey stick?

Many processes can cause higher sedimentation – floods, etc. Just like a tree ring width is affected by more than one variable.

Varves are seasonally deposited sediment (within a year). They typically are recognizable where lakes are frozen in the winter. The frozen surface allows the finest sediment to settle out because there is little to no agitation of the water. This forms a visible layer of extremely fine material. Without this difference the seasonal variations would not be recognizable – you would just have sedimentation with gradations based on different storm events.

I have not seen anything that would suggest that a thicker varve is a record of a hurricane. That is one example of an event that may explain the differences. Seems like any large flooding/storm event could cause a larger seasonal deposition.

Steve: Have you read the article? They discuss this sort of thing and if you want to differ, please do so based on their analysis.
David Smith

Posted Jul 25, 2008 at 8:43 AM | Permalink

Besonen’s website has kindly made a copy of the article available for academic reading (no copies) here.

I applaud their efforts but wonder about their premise

Thus, we conclude that the graded beds represent deposition related to intense hurricane precipitation combined with wind-driven vegetation disturbance that exposes fresh, loose sediment.

If I read that correctly then they are saying that the storm disturbs (uproots) trees and bushes, exposing soil which is then swept into the lake by intense rain. My experience has been that the storms uproot some trees but any wash from the rootball tends to drain into the hole where the roots used to reside.

Also, trees tend to fall in the second half of the storm, once the soil is saturated and the roots have been rocked to and fro. By then a blanket of wet leaf litter plasters exposed surfaces, creating something of a thick protective blanket against raindrops and consequent erosion.

I just don’t recall seeing muddy runoff during or after a hurricane. I can conceive that the large amount of rotting green leaf litter might somehow subsequently impact the lake sediment but I struggle with the extra soil erosion idea.
SMcE

Posted Jul 25, 2008 at 9:06 AM | Permalink

I sort of have to side with Jeff K on this. I find it difficult to understand how the sediment record would be remarkably different for a tropical cyclone versus a nor’easter or a late season blizzard and subsequent rapid melt/flooding. Then again that may be just me.
bender

Posted Jul 25, 2008 at 9:22 AM | Permalink

Kind of makes you want to read the reviews, doesn’t it?
Sam Urbinto

Posted Jul 25, 2008 at 9:31 AM | Permalink

It’s like the difference between picking a random card out of a freshly shuffled deck and getting a Queen of Hearts 20 times.

If it’s a standard poker deck with Jokers, the chance of pulling a Queen of Hearts on the next pull is still 1 in 54.

But overall, the odds of pulling a Queen of Hearts from such a scenario 21 times in a row is a bit higher…
UK John

Posted Jul 25, 2008 at 10:38 AM | Permalink

Right read the paper. See if I have got the general meaning, I am sure someone will tell me! These science boys don’t make it easy!

The hockey stick shape of sediment from 1870 has nothing to do with hurricanes, and all to do with building of dams and other human interference – am I correct?

The other bit of the graph up to 1870 may be a representation of hurricanes but it also may not?

My question is why did they bother, and did someone really pay good money for these people to do this?
Kenneth Fritsch

Posted Jul 25, 2008 at 11:02 AM | Permalink

RomanM:

Thanks much for your detailed explanations and here’s hoping you will be able to spend more time now doing the same for other papers and threads here at CA.

Your example below is well known (and often explained by the experts as something not necessarily intuitive to everyday players) to most bridge players worth their salt: the 4/2 versus 3/3 split of cards in a suit in your opponents’ hands, with the chances 48% versus 36% in favor of the 4/2 split.

To illustrate how important experience in observing randomness is, in this situation: If the events occur as a uniform process, which is a more likely result: all 9 centuries each having 4 occurrences (as uniform as it can get) or one century having 5, another 3 and the remainder having 4 each? It surprises many people that the latter is 57.6 times as likely as the former. The reason is that there is only one way to select which century has how many in the first case (each has 4). In the second case there are 9*8 = 72 ways to pick which century has 5 and which has 3. Although each of the 72 cases has a slightly smaller chance of being the result than the uniform case, there are so many of them that the “expected” uniform (all 4s) is much less likely.
Kenneth Fritsch

Posted Jul 25, 2008 at 12:03 PM | Permalink

Given that this paper presents results on proxies for historical TC activity that differ from previous studies, does not show statistical significance in its result for extreme events per RomanM’s analysis, and makes a rather tenuous connection between pre-1870 hurricanes and graded bed deposits (excerpted below), I would see this paper as something to be put on the shelf until better analyses are available. The authors seem to concede that the analysis is tenuous, but suggest that since they can make a connection to SST proxies (tenuous also??) going back in time, their analysis gains some significance.

How well would hurricanes be categorized between 1630 and 1870 and how many would be missed? How well are the SSTs in the MDR for hurricanes estimated from proxies as described by the authors? Is not the statement below: “In support of this clear relationship, we interpret that the hurricane mechanism (vs. the other two) is more likely to produce the graded beds even with smaller amounts of precipitation because hurricanes are often accompanied by damaging winds which disturb vegetation and uproot trees to mobilize a supply of fresh, loose sediment.” pretty much conjecture and without an independent source of evidence or even a reasonable theory?

It would appear to me that the authors are expecting readers to be impressed and make sense of it all by their stringing this scenario together. If they looked at only this scenario that might be more impressive, but if one contemplates their looking at many scenarios then not so impressive.

No relationship was noted with the freshets or extratropical systems. However, we did note a very strong correspondence with known historical hurricanes: 10 of the 11 prominent graded beds deposited between 1630 and 1870 fall during years in which hurricanes are known to have struck the Boston area [Ludlum, 1963]. In support of this clear relationship, we interpret that the hurricane mechanism (vs. the other two) is more likely to produce the graded beds even with smaller amounts of precipitation because hurricanes are often accompanied by damaging winds which disturb vegetation and uproot trees to mobilize a supply of fresh, loose sediment. Such disturbance was aptly described by William Bradford of Plymouth Plantation (~55 km SE of Boston) who witnessed the 1635 hurricane, “This year . . . was such a mighty storm of wind and rain as none living in these parts, either English or Indians, ever saw. . . . It blew down many hundred thousands of trees, turning up the stronger by the roots and breaking the higher pine trees off in the middle.”

At this value, nine of the varves were identified as extreme events based on thickness alone. Two of the events (1707 and 1736) are varves which do not contain a graded bed, and are simply thicker than usual, for unknown reasons. However, the remaining seven events (1635, 1706, 1727, 1770, 1804, 1850, and 1869) are all varves that contain a graded bed, and correspond to a year in which a hurricane is known to have struck the Boston area [Ludlum, 1963]. We note that this choice of TDV is conservative as it was large enough to exclude the only varve with a prominent graded bed that does not correspond with a known hurricane year (1649). However, it did so at the expense of excluding three other varves that do include graded beds, and also correspond with hurricane years (1849, 1858, and 1861).

Fortunately, estimates of the Saffir-Simpson scale intensity for many historical New England hurricanes are available. Of the seven hurricane events identified in the LML record between 1630–1870, two of the storms (1635 and 1869 [September]) were estimated to have been of category 3 intensity, three (1727, 1770, and 1804) of category 2 intensity, and one (1850 [July]) of category 1 intensity (1706 was not considered) [Boose et al., 2001]. We thus interpret the varves with a graded bed as the result of both heavy rainfall amounts and landscape disturbance due to hurricanes of category 2–3 intensity. [18] By analogy, the 29 prehistoric extremes identified by the same criteria should serve as proxy evidence for similar category 2–3 intensity hurricanes that struck the Boston area, but during prehistoric times.
RomanM

Posted Jul 25, 2008 at 12:54 PM | Permalink

#25 Kenneth

I played some competitive bridge for a while over 30 years ago and saw other such examples of non-intuitive probabilities. One simple one had to do with the suit distribution of a bridge hand: 4-3-3-3 vs. 4-4-3-2. It would seem that the first should be more likely, but in fact the second, less uniform hand occurs about twice as often.

Just for completeness, cross the i’s and dot the t’s, one last look at this data.

I took the list of years and tested the 34 middle values for possibly having a uniform distribution on the interval from the first year 1011 to the last year 1869. These values were selected to minimize possible end-point bias. The data was entered into R and a Kolmogorov-Smirnov one-sample two-sided test for uniformity was performed using the ks.test function. The probability value for the test was .4571. The interpretation one can make is that data from a Poisson process with uniform intensity will look as uniform or even less uniform than this data over 45% of the time in such a test. Some simple chi-square tests gave comparable results. Bottom line: nothing exceptional in the distribution of these hurricanes – the increased number from the 12th to the 16th century could be easily explained by simple natural random variation.
Steve McIntyre

Posted Jul 25, 2008 at 1:47 PM | Permalink

Roman, would you mind posting up your script? I find that very helpful both in understanding the point and learning something as well.
RomanM

Posted Jul 25, 2008 at 2:49 PM | Permalink

No problem, Steve.

# Years of the 36 extreme events with graded beds (inset table of Figure 2a)
years = c(1011, 1029, 1114, 1147, 1152, 1162, 1176, 1206,
          1224, 1231, 1241, 1269, 1279, 1280, 1294, 1317,
          1343, 1373, 1382, 1403, 1432, 1438, 1450, 1470,
          1501, 1520, 1559, 1595, 1610, 1635, 1706, 1727,
          1770, 1804, 1850, 1869)

# One-sample, two-sided KS test of the 34 interior years against Uniform(1011, 1869)
ks.test(years[2:35], "punif", years[1], years[36], alternative = "two.sided")

I actually transformed the data to a scale of [0, 1] first so that I would get a better sense of what the distribution of the values looked like, but the results are the same. The default parameters for the uniform are 0 and 1:

# Rescale the interior years to [0, 1]
dat = (years[2:35] - years[1]) / (years[36] - years[1])

ks.test(dat, "punif", alternative = "two.sided")
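
The chi-square tests mentioned earlier were not posted, so the following R sketch is only one plausible version and not necessarily the calculation that was actually run. It bins the 36 listed years by century and compares the counts with an equal expected count per bin. It assumes nine equal-width bins, although the first and last bins are really shorter given the 1011–1870 analysis range, and the variable names are illustrative.

```r
years <- c(1011, 1029, 1114, 1147, 1152, 1162, 1176, 1206,
           1224, 1231, 1241, 1269, 1279, 1280, 1294, 1317,
           1343, 1373, 1382, 1403, 1432, 1438, 1450, 1470,
           1501, 1520, 1559, 1595, 1610, 1635, 1706, 1727,
           1770, 1804, 1850, 1869)

# Century index: 1001-1100 -> 11, ..., 1801-1900 -> 19
century  <- (years - 1) %/% 100 + 1
observed <- as.integer(table(factor(century, levels = 11:19)))

# Equal expected count per bin (assumes equal-width bins; see the caveat above)
expected <- rep(length(years) / 9, 9)

# Pearson chi-square statistic and its upper-tail p-value on 9 - 1 = 8 df
stat    <- sum((observed - expected)^2 / expected)
p_value <- pchisq(stat, df = 9 - 1, lower.tail = FALSE)

c(statistic = stat, p_value = p_value)
```

With expected counts of only 4 per bin the chi-square approximation is rough, so the result is best read as a sanity check alongside the KS test. On these counts the statistic is 7 on 8 degrees of freedom, a p-value of roughly 0.54.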
David Smith

Posted Jul 25, 2008 at 9:38 PM | Permalink

I checked the rain record for Blue Hill, which is not far from Mystic. Below is a table of the twenty rainiest days at Blue Hill since 1895.

It’s apparent that systems other than hurricanes cause heavy rain in Boston, so I understand why they didn’t attribute the abnormal sediment to rain alone.

I’m looking for strongest wind data for Boston but have not found such a list. It’ll be interesting to see if the two lists overlap on hurricanes.
Paul Linsay

Posted Jul 26, 2008 at 8:51 AM | Permalink

#30, David,

Hurricane Gloria hit Boston in 1985 and didn’t even make the list.

Without getting into an argument with RomanM about whether the distribution is Poisson, I did the calculation in a slightly simpler fashion and got the same result. The mean number of “hurricanes”/century = 4 from #10. The Poisson probability that there are 7 or fewer hurricanes per century is 0.949. The probability that all 9 centuries would have 7 or fewer hurricanes is 0.949^9 = 0.624. Hence the probability that there is at least one century with 8 or more hurricanes is 1 – 0.624 = 0.376, in agreement with the multinomial calculation.
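
The same arithmetic as a minimal R sketch. It assumes that the nine centuries are independent Poisson counts with a known rate of 4 per century; the variable names are illustrative.

```r
rate <- 36 / 9                     # 4 events per century, taken from the record

p_le7     <- ppois(7, lambda = rate)   # P(a given century has 7 or fewer)
p_all_le7 <- p_le7^9                   # all nine centuries, assuming independence
p_any_ge8 <- 1 - p_all_le7             # at least one century with 8 or more

round(c(p_le7 = p_le7, p_all_le7 = p_all_le7, p_any_ge8 = p_any_ge8), 3)
```

Unlike the conditional calculation on the fixed total of 36, this version treats the rate as known, which is why the two answers are close but not identical.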
RomanM

Posted Jul 26, 2008 at 9:54 AM | Permalink

No argument on Poissonness, Paul, but… 🙂

The precision of your (correctly done) calculation depends on the assumption that the actual underlying population intensity (not necessarily equal to your estimated value) for hurricanes over that period is known to be 4/century. The calculation I did is only conditional on the fact that 36 hurricanes occurred in that time period and does not depend at all on the actual (unknown) intensity parameter value for the population.
David Smith

Posted Jul 26, 2008 at 10:51 AM | Permalink

Re # 31 Another interesting storm, Paul, is the great New England hurricane of 1938. The Blue Hill rainfall for the wettest day of that event was only 1.47 inches, which doesn’t even make the top-500 wettest days at Blue Hill.
dearieme

Posted Jul 27, 2008 at 2:57 PM | Permalink

But, but, but: people here are assuming that “significant” was used in the sense that it would be used by a competent scientist. What if it was used in the sense that it is used by pompous politicians?
Kenneth Fritsch

Posted Jul 27, 2008 at 8:07 PM | Permalink

I have gone back and taken a more detailed look at the Besonen et al. (2008) paper and come up with a skeptical layman’s view, in the hope of provoking further discussion of this paper, as outlined below:

1. The evidence of couplets in the varve sediment in Lower Mystic Lake (LML) for defining annual/seasonal events seems reasonable to me. Once the annual resolution is demonstrated it is reasonable to go on to look for evidence of extreme events, such as hurricanes, on an annual basis.

2. The correlation between hurricane events in the Boston area and varves that have both outlier thickness and a characteristic graded bed seems rather well established even without a calculation of statistical significance. The authors point to these outlier conditions as being distinguishable from heavy rainfall and snowmelt events by way of the varve having a graded bed and an outlier thickness, as determined by the program CLIM-X-DETECT, which compares each annual thickness against a running-median background (17-year window).

The explanation of how the hurricanes produce distinguishable varves by uprooting trees and disturbing other vegetation seems a bit unclear to this layperson and could be helped by some independent evidence for this phenomenon. One has to take the authors’ word that the heavy snowmelt and rainfall events do not correlate with outlier varve appearances. On the face of it the correlation of outlier varves to hurricanes in the period 1630-1870 appears excellent as noted in this paper excerpt:

We confined our analysis to the period prior to 1870 given the significant anthropogenic interference and altered sedimentation dynamics as discussed above. [11] No relationship was noted with the freshets or extratropical systems. However, we did note a very strong correspondence with known historical hurricanes—10 of the 11 prominent graded beds deposited between 1630 and 1870 fall during years in which hurricanes are known to have struck the Boston area [Ludlum, 1963].

It is a bit disconcerting that these 10 of 11 “prominent” graded beds do not correspond to the number of outlier varves for the period under any of the three outlier limits. The lower outlier limit gives 14, the middle gives 8 and the higher gives 7 extreme events.

The following excerpt from Besonen et al. (2008) explains the selection/calibration process in more detail. The 1707 and 1736 years are without hurricanes but have outlier varves that are not graded. Thus the authors show further evidence of the graded character being connected to the hurricane-produced outlier varves. At the same time the authors note a year (1649) without a hurricane but with a graded varve, which was excluded because its thickness was insufficient to meet their outlier criterion. They also comment that 3 hurricane years with graded varves were eliminated because those varves did not meet their outlier thickness criterion.

Also, when one looks at the number of hurricanes in the authors’ source (Early American Hurricanes 1492-1870 by David M. Ludlum), one finds 31 hurricanes listed under the state of MA for the period used above from which to select. I’ll need to look at Ludlum’s details to determine how the authors selected the ones they did, or whether they merely fail to mention the hurricane years that did not meet their varve outlier status or graded condition. At this point one can really only conclude that the authors have produced criteria for extreme events that can be associated rather uniquely with hurricanes and that should be consistent over time in selecting an extreme event, with a limited range of perhaps category 2 and 3 hurricanes.

We examined the LML record using a range of TDVs from med +2.0 rstd up to med +5.0 rstd in increments of 0.5 rstd. Using the 1630–1870 portion of the historical records as a guideline, we determined the best compromise between over-/under-sensitivity was provided by a TDV of med +3.5 rstd. At this value, nine of the varves were identified as extreme events based on thickness alone. Two of the events (1707 and 1736) are varves which do not contain a graded bed, and are simply thicker than usual, for unknown reasons. However, the remaining seven events (1635, 1706, 1727, 1770, 1804, 1850, and 1869) are all varves that contain a graded bed, and correspond to a year in which a hurricane is known to have struck the Boston area [Ludlum, 1963]. We note that this choice of TDV is conservative as it was large enough to exclude the only varve with a prominent graded bed that does not correspond with a known hurricane year (1649). However, it did so at the expense of excluding three other varves that do include graded beds, and also correspond with hurricane years (1849, 1858, and 1861).

In summary, using guidance provided by the historical portion of the record, we recognize hurricane-related events in the LML varve thickness time series based on two conditions: 1.) the varve must reach or exceed a thickness TDV defined by med +3.5 rstd, and 2.) the varve must also contain a graded bed. Using these criteria, 36 hurricane related events (7 historic, 29 prehistoric) were recognized the LML record from 1011–1870.
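The two-condition rule quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLIM-X-DETECT code: I assume “rstd” means the median absolute deviation (MAD) scaled by 1.4826, computed in the same centred 17-year window as the running median, with the window simply truncated at the ends of the series. The function names are hypothetical.

```python
from statistics import median

def running_threshold(thickness, window=17, z=3.5):
    """Return (background, threshold) lists for a varve thickness series.

    background: centred running median over `window` years.
    threshold:  background + z * rstd, where rstd is ASSUMED to be
                1.4826 * MAD computed within the same window.
    """
    half = window // 2
    background, threshold = [], []
    for i in range(len(thickness)):
        # Window is truncated at the series ends (an assumption).
        lo, hi = max(0, i - half), min(len(thickness), i + half + 1)
        segment = thickness[lo:hi]
        med = median(segment)
        mad = median([abs(x - med) for x in segment])
        background.append(med)
        threshold.append(med + z * 1.4826 * mad)
    return background, threshold

def flag_events(years, thickness, graded, window=17, z=3.5):
    """Years meeting both conditions: thickness at or above the
    threshold AND the varve contains a graded bed."""
    _, thr = running_threshold(thickness, window, z)
    return [y for y, t, g, c in zip(years, thickness, graded, thr)
            if t >= c and g]
```

Calling flag_events with z set to each value from 2.0 to 5.0 in steps of 0.5 would reproduce the kind of sensitivity scan the authors describe, although the counts will depend on exactly how rstd and the window edges are handled.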

3. The connection of extreme events to SST is pointed to in the paper in rather general and approximate terms. We can make the case that the distribution of extreme events as tabulated by the authors does not appear to have any statistical significance above what would be expected by chance for the 100 year periods per RomanM’s analyses posted earlier in this thread. That is, however, not to say that a correlation between SST in the NATL TC Main Development Region (MDR) and extreme events does not exist. The extreme events could be affected by the SST changes that are, in turn, occurring on a random basis.

To check for a correlation between the Besonen et al. (2008) extreme events and SST, I took the available SSTs graphed for the Sargasso Sea dO18 reconstruction at 40 to 70W and 25 to 35N here

http://www.climateaudit.org/?p=2202

as a proxy for the SSTs in the NATL TC MDR, which is approximately designated as 20 to 80W and 10 to 20N, and ranked them for the periods listed in the Besonen paper along with a ranking of the extreme events. I then calculated a Spearman rank correlation from these rankings. The correlation, r, was 0.18, and the hypothesis that the correlation was not greater than zero could not be rejected (p = 0.62). The table below gives the Sargasso SSTs estimated from the graph for the various centennial periods along with the extreme events. The O18 Sargasso Sea graph is also shown below.

This correlation test amounts to a back-of-the-envelope calculation and certainly does not account for any differences that might arise between the MDR and Sargasso Sea, or for whether the SSTs from the proxy relate to the months of the TC season in the NATL, or for that matter for the accuracy of the O18 proxy. Since, however, the authors referenced this (and other) proxies I thought it might be fair game. Even more important for hurricanes making it as far north as MA might be the historical SSTs closer to the MA shoreline, but there was no mention of that in the Besonen paper.
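For anyone wanting to repeat this kind of rank test, here is a minimal sketch in plain Python. It is not the calculation actually used above: the function names are hypothetical, ties get average ranks, and the p-value comes from a one-sided random permutation test (how often shuffled rankings give a correlation at least as large as the observed one), which is my assumption about how one might test “not greater than zero”. Both inputs must contain at least two distinct values.

```python
import random

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p(x, y, n_perm=10000, seed=0):
    """One-sided p-value: fraction of shuffles with r >= observed r."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    hits = 0
    for _ in range(n_perm):
        shuffled = ry[:]
        rng.shuffle(shuffled)
        if pearson(rx, shuffled) >= observed - 1e-12:
            hits += 1
    return hits / n_perm
```

With only about nine centennial periods the test has very little power, so a failure to reject says more about the sample size than about the absence of a relationship.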
Kenneth Fritsch

Posted Jul 27, 2008 at 8:11 PM | Permalink

In Post #35 above, I should have noted that year 0 before present in the Sargasso Sea proxy corresponds to 1925. This is noted in Steve M’s introduction to the thread linked in the post above, but I should have made it clearer in my post.
UK John

Posted Jul 28, 2008 at 5:20 AM | Permalink

But I thought there wasn’t a Medieval warm period! Does this paper prove there was!
bernie

Posted Jul 28, 2008 at 5:53 AM | Permalink

John:
Perhaps but this graph starts in 1925. Read Kenneth’s #36 and the axis label!
bernie

Posted Jul 28, 2008 at 7:08 AM | Permalink

UK John, Re #40, see what you started! LOL
UK John

Posted Jul 28, 2008 at 3:52 PM | Permalink

My comment was meant to be an ironic joke on the following point.

It could be argued that this paper found significant evidence of increased hurricane activity in the Medieval Period, so it must have been very warm.
bernie

Posted Jul 28, 2008 at 5:11 PM | Permalink

#40 John
My apologies I was too hasty. I misinterpreted the chart or rather Kenneth’s note. I guess I was stunned at the implied precision of the graph.
Tony

Posted Jul 30, 2008 at 1:15 PM | Permalink

I’ll try to use the coffee grounds in my mug as a proxy for the climate in the southern hemisphere. Just in case this doesn’t yield the right results, I will try to use it as a proxy for the economic development of the southern hemisphere. Something must fit, I can smell it…
Kenneth Fritsch

Posted Aug 6, 2008 at 3:05 PM | Permalink

As a follow-up to my back-of-the-envelope analysis of the Besonen et al. (2008) extreme events as related to the Sargasso Station S (StatS) temperature proxy, I compared the Sargasso Station S annual SST to the NATL TC Main Development Region (MDR) SST for the period 1854-2007.

I also did another estimate of the SSTs from the Keigwin 1996 paper graph shown above for century long periods and used that estimate to rank the SSTs and the Besonen extreme events for calculating a Spearman Rank correlation and the probability of r being zero.

The re-estimation of the SSTs led to the same conclusion as previously reported above: the hypothesis that the Spearman rank r was zero could not be rejected, with p = 0.40.

The correlations of the SST average for the months of August, September and October in the MDR (10 to 20 N and 20 to 80 W) to the annual SSTs for StatS (32 to 36 N and 62 to 66 W) were calculated for 1-, 10- and 20-year averages. I wanted to determine whether the correlation progressively improved by averaging over larger increments of time.

1 year R^2 = 0.39; 10 year R^2 = 0.77 and 20 year R^2 = 0.80.

Since I only had approximately 150 years of data I could not determine directly the correlation for century long averaged SSTs for MDR versus StatS.
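A rough sketch of the averaging comparison, for anyone who wants to repeat it with the gridded SST series. I am assuming non-overlapping block averages here (running means are another possibility and would give different numbers), and the function names are hypothetical.

```python
def block_means(series, width):
    """Average a series over consecutive non-overlapping blocks of
    `width` values; any incomplete block at the end is dropped."""
    n_blocks = len(series) // width
    return [sum(series[k * width:(k + 1) * width]) / width
            for k in range(n_blocks)]

def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def r_squared_by_width(mdr_aso, stats_annual, widths=(1, 10, 20)):
    """R^2 between the two SST series after block averaging."""
    return {w: r_squared(block_means(mdr_aso, w),
                         block_means(stats_annual, w))
            for w in widths}
```

Under the non-overlapping assumption, the 154 years from 1854 to 2007 give only 15 ten-year blocks and 7 twenty-year blocks, so the R^2 values at the longer averaging periods rest on very few points.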

I used StatS on an annual basis according to CA here http://www.climateaudit.org/?p=898 where we have:

“The reference to Deuser 1987 in Keigwin (Science 1996) was as follows”:

For isotope analysis, I chose the planktonic foraminifera G. ruber (white variety, 150 to 230 microns). The white variety of this species lives year-round in the upper 25 m of the northern Sargasso Sea and has a relatively constant annual mass flux and shell flux (18: W.G. Deuser, 1987, J. Foraminiferal Res. 17, 14). Thus, of all planktonic foraminifera in this location, this species is most appropriate for reconstructing annual average SSTs (18).

The SSTs were taken from the link:

http://nomads.ncdc.noaa.gov/#climatencdc

using the Smith-Reynolds Extended Reconstructed SST’s.

The Keigwin core samples had variable time intervals but averaged approximately 60 years for temperature proxy readings from 76 years before present to 3100 years before present. The paper they were taken from was written in 1996.

The data listing core depth, calibrated YrBP and d18O G. ruber are in this link here

I am sure others here could do better renditions of back-of-the-envelope analyses with these data, but in Besonen et al. (2008) it appeared that the authors were trying too hard to indicate an SST to extreme events correlation. I have no idea how much one could count on the validity of the Keigwin d18O as a proxy for temperature or, for that matter, what would increase the tendency for extreme events (TCs migrating up to MA) to occur. Would theory predict that a century with a higher average SST in the MDR for the ASO months would have a greater tendency to form extreme events, or could a century with a lower average SST but a larger standard deviation, and perhaps a few seasons with higher SSTs, create more extreme events? I guess it would depend on how constant SST variability is over time periods of a century.

Anyway I was able to relocate the link for calculating SST time series by entering in the grid locations and time periods so something was accomplished.

One Trackback

By A baseless Mann UVa email – claims by Mann spliced and diced | Watts Up With That? on May 12, 2014 at 1:21 PM

[…] maybe a slight increasing trend. It would likely look a lot like this graph you plotted in the CA discussion of Besonen et al 2008 (which has other issues independent of this discussion, I’m only using it as an example of […]
