Reliable measurements or pliable estimates?

The last three posts were mostly about the adjustments to the ocean data made in the Karl 2015 paper, because those adjustments had the biggest impact on the result (that there wasn’t something like a “hiatus”). Kevin Marshall of the excellent blog reminded me in a comment on the previous post that the surface datasets have issues as well.

I can agree with that. In my first year of blogging I wrote a post, Things I took for granted: Global Mean Temperature, that described how my perception of a global mean temperature changed from believer to skeptic, and why I had a hard time believing that the (surface) datasets were accurate enough to capture a 0.8 °C increase in temperature over 160 years.

Reading it back, I was a bit surprised that I had already written this in my first year of blogging. But, in light of the Karl et al. paper, I think two things were missing in this early piece.

First, the data in the surface datasets are not measurements, but estimates derived from the temperature station measurements. In a way, that could be concluded from the uneven spatial coverage, the convenience sampling and other measurement biases like Urban Heat Island, Time of Observation and who knows what more. This means that the homogenized end result will only ever be an estimate of the actual mean temperature.
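
To make the point about uneven coverage concrete, here is a minimal sketch with made-up numbers. The station values, the box labels and the two helper functions are all hypothetical and do not come from any real dataset. It only shows that the same readings can give different "mean temperatures" depending on how they are combined.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical station anomalies (°C), each tagged with the grid box it falls in.
# Four stations crowd into box "A"; box "B" has a single station.
stations = [
    ("A", 1.0), ("A", 1.0), ("A", 1.0), ("A", 1.0),
    ("B", 0.0),
]

def plain_mean(records):
    """Average every station equally, ignoring where it sits."""
    return mean(value for _, value in records)

def gridded_mean(records):
    """Average the stations within each box first, then average the boxes.

    Assumes (for illustration only) that all boxes cover the same area.
    """
    boxes = defaultdict(list)
    for box, value in records:
        boxes[box].append(value)
    return mean(mean(values) for values in boxes.values())

print(plain_mean(stations))    # 0.8
print(gridded_mean(stations))  # 0.5
```

Neither number is "the" measured mean. Both are estimates that depend on a choice of method.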

Okay, not really world-shaking, not really unexpected, but important to know. When I was a believer, I assumed that scientists knew exactly what the mean global temperature had been up to now and that they could compare the current temperature with those of the past. How many people from the public realize that the mean temperature is not a measurement, but an estimate? How many people know that the surface datasets changed so often? And that these adjustments change the conclusion? How many journalists in the mainstream media give some background information when they report headlines like “Warmest year ever”?

Second, and in line with the first point, that estimate is a moving target, as shown clearly in the NOAA and NASA/GISS data. Those changed substantially over time, from a cycle in the last century to almost a straight line now. And now there is the recent attempt at disappearing the “hiatus” in the NOAA data, by ignoring high quality data and adjusting low quality data.

This indicates that the dataset is ever changing, although the measurements stay the same. It also indicates that the data were not of good quality in the first place, and therefore in need of adjustments to correct for their problems. This raises some questions, for example: how do we know whether the current estimate is correct? Apparently the estimates in the past were not correct and had to be adjusted. If the estimates are that pliable and so easily (re)adjusted, then how reliable are the results?


4 thoughts on “Reliable measurements or pliable estimates?”

  1. manicbeancounter

    Thanks for the recommendation for my little blog.
    You have gone much further than most in recognizing the issues with temperature homogenization. However, there is more to understand on the topic. I have been meaning to post more on the issue for some time now, but keep developing my ideas. Some thoughts.
    1. The standard definition of temp homogenisation is a process that removes the measurement biases (e.g. UHI) to leave only variations caused by real climate trends. This assumes that over the homogenisation area there are no real climatic trends. If this assumption is not valid because of real trends in the data at the “local” level (which could be an area bigger than Belgium), you could smooth out real trends.
    2. Homogenisation algorithms make no distinctions between measurement biases and real anomalies in the data.
    3. Temperature stations are unevenly dispersed. Some areas of the world are very under-represented, such as much of Asia, Africa and South America. In these areas homogenization must be done over a much greater area.
    4. Go back prior to 1950 and the data is much sparser than post 1960.
    5. Try
    These are the surface temperature maps. Look especially at the annual anomalies (avoiding 2015) at the 250km smoothing radius. This is post-homogenization data, but it clearly shows huge temperature variations across both land and ocean.

    It indicates to me that the assumption that “over the homogenisation area there are no real climatic trends” does not hold for most of the land surface area, particularly the further back in time you go.
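
Points 1 and 2 above can be illustrated with a toy adjustment. This is a sketch under loud assumptions: `remove_step`, the neighbour series and the break position are all hypothetical, and real homogenisation algorithms are far more elaborate. The routine compares a candidate station with the average of its neighbours and removes any step in the difference. Nothing in it can tell whether the step came from a station move or from a real local change.

```python
from statistics import mean

def remove_step(candidate, neighbours, break_index):
    """Toy adjustment: shift the candidate's earlier segment so that its
    difference from the neighbour average is the same before and after
    the break. The cause of the step plays no role in the calculation.
    """
    reference = [mean(values) for values in zip(*neighbours)]
    diff = [c - r for c, r in zip(candidate, reference)]
    step = mean(diff[break_index:]) - mean(diff[:break_index])
    return [c + step if i < break_index else c for i, c in enumerate(candidate)]

# Hypothetical data: two flat neighbours, and a candidate station that
# shows a 0.5 °C step halfway through the record.
neighbours = [[0.0] * 10, [0.0] * 10]
candidate = [0.0] * 5 + [0.5] * 5

adjusted = remove_step(candidate, neighbours, break_index=5)
print(adjusted)  # ten values of 0.5: the step is gone
```

If the 0.5 °C step was a measurement bias, the adjustment did its job. If it was a real local climatic shift, the same arithmetic smoothed it away.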

    Your second point about continually adjusting the past may also be due to homogenization techniques. I need to study this further.

  2. poitsplace

    I think I mentioned it here before, but the 1997 report by NOAA about it being the hottest year was over 1C higher than the 2015 report about 2014 being the hottest year. I suspect there is far more variability in the datasets than this.

    The surface record is ever changing, and as we’ve seen in the adjustments, it’s highly likely that there is an accumulating error with the processes used. The process absolutely depends on the assumption of an even distribution of warming and cooling break points in the record. To put this into proper perspective, if you had 100 years of records with a station move every 10 years to get the stations out of UHI contaminated areas, the dataset would be full of slow warming trends with a cooling break point.

    The routines used by essentially all temperature datasets would then stitch that broken record into an unbroken record, ramming the past down every time to align the temperatures and adding the UHI that had been avoided by the station move back into the temperature record multiple times. If there were an average of 10 moves per station in the dataset, each with an average breakpoint of 1C caused by the avoidance of UHI, the routines would cool the past by 10C…even though there was no overall warming trend.
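
The arithmetic of this thought experiment can be sketched as follows. Everything here is hypothetical: `sawtooth` and `stitch` are illustrative names, the true climate is assumed flat, and the stitching rule is the simplest one imaginable. Real homogenization routines are considerably more elaborate (they typically compare each station with its neighbours), so this says nothing about how large the effect is in practice.

```python
def sawtooth(n_moves, years_per_site, drift_per_site):
    """Raw record under a flat true climate: each site warms linearly by
    drift_per_site (urban growth), then the station moves to a clean site
    and the reading drops back to zero."""
    return [
        [drift_per_site * year / years_per_site for year in range(years_per_site + 1)]
        for _ in range(n_moves + 1)
    ]

def stitch(segments):
    """Toy breakpoint alignment: lower everything before each break so that
    the last reading at the old site matches the first reading at the new one."""
    stitched = list(segments[0])
    for segment in segments[1:]:
        offset = segment[0] - stitched[-1]
        stitched = [value + offset for value in stitched] + list(segment)
    return stitched

# 10 moves, 10 years per site, 1 °C of urban drift accumulated at each site
segments = sawtooth(n_moves=10, years_per_site=10, drift_per_site=1.0)
raw = [value for segment in segments for value in segment]
stitched = stitch(segments)

print(min(raw), max(raw))  # 0.0 1.0: the raw record never leaves this range
print(stitched[0])         # -10.0: the start of the record was lowered 10 times
```

In this toy setup each of the 10 breaks lowers the earlier record by 1 °C, which reproduces the 10C figure in the comment.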

    Another thing we may be seeing is the twisting of the data by the spatial distribution processing. In the distant past (as the other commenter pointed out) there was sparse station coverage. If these stations were more UHI polluted (although there was far less UHI back then), the routines would smear that heat around. The network (in the US at least) shows substantial increases in numbers of stations for a time, likely introducing more remote, low UHI stations which might overpower the UHI signal…and there is a cooling that shows up in the early adjustments. Finally there is a massive culling of surface stations in recent history…likely removing the more remote stations and exaggerating the warming trend caused by homogenization even more.

    There should probably be some tests with test datasets to see how much wiggle room there really is in the measurements…testing each routine to see what kind of errors might accumulate. And we should also see what happens if we just calculate the trend without the various routines. I’m personally concerned that the possible amount of error for the surface dataset might surpass the entire signal.
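
A test of that kind could look roughly like the sketch below. It is only an outline under stated assumptions: all names (`synthetic_record`, `mean_trend_error`, `no_adjustment`) and all numbers (a true trend of 0.01 °C per year, three 0.5 °C breaks per 100-year record, noise of 0.2 °C) are made up for illustration. The idea is to build records with a known trend, inject step biases, pass them through a routine, and see how far the recovered trend is from the truth.

```python
import random
from statistics import mean

def linear_trend(series):
    """Ordinary least-squares slope of the series, in units per time step."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator

def synthetic_record(rng, true_trend, n_years, n_breaks, step, one_sided):
    """A known linear truth plus noise and randomly placed step biases."""
    break_years = set(rng.sample(range(1, n_years), n_breaks))
    bias, record = 0.0, []
    for year in range(n_years):
        if year in break_years:
            # one_sided=True makes every break a cooling break
            sign = -1.0 if one_sided else rng.choice((-1.0, 1.0))
            bias += sign * step
        record.append(true_trend * year + bias + rng.gauss(0.0, 0.2))
    return record

def mean_trend_error(routine, one_sided, n_trials=200, seed=0):
    """Average of (recovered trend - true trend) after applying `routine`."""
    rng = random.Random(seed)
    true_trend = 0.01  # hypothetical: 0.01 °C per year
    errors = []
    for _ in range(n_trials):
        record = synthetic_record(rng, true_trend, 100, 3, 0.5, one_sided)
        errors.append(linear_trend(routine(record)) - true_trend)
    return mean(errors)

def no_adjustment(record):
    """Baseline 'routine': compute the trend on the untouched record."""
    return record

print(mean_trend_error(no_adjustment, one_sided=False))
print(mean_trend_error(no_adjustment, one_sided=True))
```

Any adjustment routine that takes a record and returns a record can be passed in place of `no_adjustment`. In this setup one would expect the first number to be close to zero and the second to be clearly negative, which shows how much depends on whether the breaks are evenly distributed between warming and cooling.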

    1. trustyetverify Post author

      Indeed, the more one thinks about it, the more issues one finds. Starting from incomplete, uneven, convenience-sampled data with measurement biases, it would seem more like a miracle to arrive at the correct temperature average after processing such data. That also makes me rather concerned about the signal/noise ratio.

