On Climate Change and Statistics
December 9, 2009
Posted by Matt in statistics.
Tags: Climate Change, linear regression, math, science, statistics
My job is to foretell the future.
No, I’m not like Miss Cleo. I don’t own a crystal ball or use chicken bones or practice astrology. I don’t receive prophetic visions of impending doom. No, my tools for seeing into the future are found in the realm of mathematics.
I provide much of the statistical background for forecasting at the multi-billion dollar medical device company that employs me. I work heavily with regression analysis to do this form of numerical soothsaying and have developed a fairly good understanding of the behind-the-scenes processes needed to use these methods. My educational background is firmly rooted in this area, with a bachelor’s degree in mathematics and graduate work in statistics, so my life and livelihood are based in the power of numbers.
Thus, I was a bit dismayed to read about the criticism that had been leveled at a few climate scientists over a series of emails that were hacked and stolen. Now, I do not understand all of the intricacies of the science, I don’t claim to know anything about their data collection techniques, and I have not sifted through the piles of data upon which they base their hypotheses. But my ears perked up when talk turned to their statistical methods, especially given the vehemence with which critics made their accusations.
There are two points that immediately come to mind when I think of this case:
1. There is no perfect statistical method. Regardless of the model used or the amount of data collected, there is always room for error. Statisticians can account for this with a probable range of error, but that fact is often overlooked by critics in their zeal for proving an opponent wrong.
2. When forecasting the future, “massaging” the data to fit an obvious trend is often needed. Many times you will find unforecasted anomalies (whether too high or too low) in the data which could skew your statistical model and give an unrealistic forecast. In order to combat this, a statistician may either remove the data point completely or replace it with a more realistic value.
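To make the two points above concrete, here is a minimal sketch in Python using only the standard library. It is not the method used by any climate scientist or by my employer; the data, the function names, and the flagging threshold of two residual standard errors are all made up for illustration. It fits a straight line, replaces an anomalous point with the value suggested by the remaining data, and reports a forecast together with an approximate range of error. The range uses a normal quantile rather than Student's t, so it is somewhat too narrow for a sample this small.

```python
from statistics import NormalDist, mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_std_error(xs, ys, intercept, slope):
    """Typical size of the misses around the fitted line (n - 2 degrees of freedom)."""
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return (sse / (len(xs) - 2)) ** 0.5


def replace_anomalies(xs, ys, threshold=2.0):
    """Flag points far from the fitted line and replace them.

    `threshold` is an illustrative choice, not a standard value. Flagged
    points are replaced by the prediction of a line refit without them.
    """
    intercept, slope = fit_line(xs, ys)
    s = residual_std_error(xs, ys, intercept, slope)
    flagged = [
        i for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - (intercept + slope * x)) > threshold * s
    ]
    if not flagged:
        return list(ys), flagged
    keep_x = [x for i, x in enumerate(xs) if i not in flagged]
    keep_y = [y for i, y in enumerate(ys) if i not in flagged]
    b0, b1 = fit_line(keep_x, keep_y)
    cleaned = [
        b0 + b1 * x if i in flagged else y
        for i, (x, y) in enumerate(zip(xs, ys))
    ]
    return cleaned, flagged


def forecast_with_range(xs, ys, x_new, level=0.95):
    """Point forecast plus an approximate prediction interval (normal quantile)."""
    intercept, slope = fit_line(xs, ys)
    s = residual_std_error(xs, ys, intercept, slope)
    x_bar = mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s * (1 + 1 / len(xs) + (x_new - x_bar) ** 2 / sxx) ** 0.5
    point = intercept + slope * x_new
    return point, point - half_width, point + half_width


# Made-up data: roughly y = 2x + 1, with one anomaly at the sixth period.
xs = list(range(1, 11))
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 40.0, 15.0, 17.1, 18.9, 21.0]

cleaned, flagged = replace_anomalies(xs, ys)
print("flagged positions:", flagged)
print("raw forecast:     ", forecast_with_range(xs, ys, 11))
print("cleaned forecast: ", forecast_with_range(xs, cleaned, 11))
```

One caveat about this sketch: a replaced point sits exactly on the refit line, so the error range computed from the cleaned data is narrower than the raw data alone would justify. Removing the point instead of replacing it avoids that particular effect.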
Climate science is a complicated field and one that I certainly do not know enough about to comment on. The multiple variables involved in building a statistical model for this sort of event make it far more complicated than the regression models I employ. I can imagine that it is an exceedingly difficult field in which to make predictions and one fraught with uncertainty, but certainly not an impossible one.
And I think that if everyone had at least a basic understanding of statistics, this debate would be far different and much more productive.