
Author Topic: Are 2007 and 2012 "statistical outliers"?  (Read 1720 times)

Jim Hunt

  • First-year ice
  • Posts: 6514
  • Don't Vote NatC or PopCon, Save Lives!
    • View Profile
    • The Arctic sea ice Great White Con
  • Liked: 1015
  • Likes Given: 92
Are 2007 and 2012 "statistical outliers"?
« on: September 18, 2024, 09:35:32 AM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/

Then Kev Pluck constructed a graph of annual minima omitting two normally conspicuous "cherries":

https://x.com/kevpluck/status/1740326508844273815

Then Gerontocrat constructed a similar graph, also omitting 1996:

https://forum.arctic-sea-ice.net/index.php/topic,4145.msg410488.html#msg410488

Discuss!
« Last Edit: September 18, 2024, 09:42:18 AM by Jim Hunt »
"The most revolutionary thing one can do always is to proclaim loudly what is happening" - Rosa Luxemburg

https://bsky.app/profile/greatwhitecon.info

oren

  • Moderator
  • Multi-year ice
  • Posts: 10079
    • View Profile
  • Liked: 3797
  • Likes Given: 4359
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #1 on: September 18, 2024, 11:08:39 AM »
Thank you Jim.

SteveMDFP

  • Young ice
  • Posts: 2725
    • View Profile
  • Liked: 653
  • Likes Given: 72
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #2 on: September 18, 2024, 01:21:04 PM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/

Then Kev Pluck constructed a graph of annual minima omitting two normally conspicuous "cherries":

https://x.com/kevpluck/status/1740326508844273815

Then Gerontocrat constructed a similar graph, also omitting 1996:

https://forum.arctic-sea-ice.net/index.php/topic,4145.msg410488.html#msg410488

Discuss!

Good point. Excluding extreme outliers is sometimes done in legitimate statistical analyses. As long as the criteria for exclusion are reasonable and even-handed (i.e., not excluding only the high or only the low outliers), this strikes me as defensible, and not "cherry-picking." In some kinds of statistical analysis, this is accomplished by examining medians rather than means. How to apply this kind of even-handed cleaning up of noisy data for Arctic sea ice is beyond me, however.

Richard Rathbone

  • Nilas ice
  • Posts: 1992
    • View Profile
  • Liked: 437
  • Likes Given: 27
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #3 on: September 18, 2024, 01:26:19 PM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/


This was a statistical test of what combination of straight lines best described the data. The reason it's 3 rather than 1 or 2 or 25 is that that's what the statistics said. The reason the breaks happen at the points they do is that that's what the statistics said.

It's the single straight line that's the cherry pick.

cognitivebias2

  • Grease ice
  • Posts: 569
    • View Profile
  • Liked: 111
  • Likes Given: 129
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #4 on: September 18, 2024, 05:47:38 PM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/


This was a statistical test of what combination of straight lines best described the data. The reason it's 3 rather than 1 or 2 or 25 is that that's what the statistics said. The reason the breaks happen at the points they do is that that's what the statistics said.

It's the single straight line that's the cherry pick.

3 segments cannot 'fit' better than 25! If there are 25 data points, 24 segments will fit perfectly. The strongest statement would be that it's the best fit for a 3-segment line. Why not do a best fit with polynomial curves of various degrees to see which one is more satisfying? Apparently, the 1st-degree polynomial (line) fit is unsatisfying to some.



« Last Edit: September 18, 2024, 06:10:47 PM by cognitivebias2 »

Richard Rathbone

  • Nilas ice
  • Posts: 1992
    • View Profile
  • Liked: 437
  • Likes Given: 27
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #5 on: September 18, 2024, 06:25:28 PM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/


This was a statistical test of what combination of straight lines best described the data. The reason it's 3 rather than 1 or 2 or 25 is that that's what the statistics said. The reason the breaks happen at the points they do is that that's what the statistics said.

It's the single straight line that's the cherry pick.

3 segments cannot 'fit' better than 25! If there are 25 data points, 24 segments will fit perfectly. The strongest statement would be that it's the best fit for a 3-segment line. Why not do a best fit with polynomial curves of various degrees to see which one is more satisfying? Apparently, the 1st-degree polynomial (line) fit is unsatisfying to some.

The number of parameters is also part of the quality of the fit. Overfitting is a decrease, not an increase, in quality.
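One way to make that trade-off concrete is an information criterion that rewards a low residual sum of squares (RSS) but charges for every extra parameter. The sketch below is not Tamino's change-point analysis. It is a simplified stand-in that assumes equal-length segments, an independent least-squares line in each segment, a Gaussian-error BIC, and a made-up `extent` series; every name and number in it is illustrative.

```python
import math


def line_rss(xs, ys):
    """Residual sum of squares of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def segmented_rss(xs, ys, n_segments):
    """Total RSS when each equal-length chunk gets its own independent line."""
    size = len(xs) // n_segments
    total = 0.0
    for s in range(n_segments):
        lo = s * size
        hi = len(xs) if s == n_segments - 1 else lo + size
        total += line_rss(xs[lo:hi], ys[lo:hi])
    return total


def bic(rss, n_points, n_params):
    """Gaussian-error BIC, up to an additive constant; lower is better."""
    return n_points * math.log(rss / n_points) + n_params * math.log(n_points)


# Hypothetical data: a straight-line decline plus a deterministic wiggle.
years = list(range(1979, 2024))  # 45 points
extent = [7.5 - 0.05 * i + 0.4 * math.sin(2.3 * i) for i in range(len(years))]

results = {}
for k in (1, 3, 9):  # 45, 15 and 5 points per segment
    rss = segmented_rss(years, extent, k)
    # Two parameters (slope, intercept) per segment; the break positions are
    # fixed in this sketch, so they are not counted.
    results[k] = (rss, bic(rss, len(years), 2 * k))

for k, (rss, score) in results.items():
    print(f"{k} segment(s): RSS={rss:.3f}  BIC={score:.1f}")
```

RSS can only go down as segments are added, which is why RSS alone cannot choose the number of segments; the penalty term in `bic` grows with each added parameter. If segments shrink to two points each, every line fits exactly, RSS reaches zero and the logarithm is undefined, which is the degenerate "24 segments through 25 points" case.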

The Walrus

  • Young ice
  • Posts: 3319
    • View Profile
  • Liked: 201
  • Likes Given: 529
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #6 on: September 18, 2024, 06:37:26 PM »
Here is a polynomial fit that may be more satisfying. The problem is that the sea ice does not follow any mathematical formula. There are a multitude of physical parameters that influence the outcome, and they are changing simultaneously. Hence, some years become "outliers" as several parameters converge at a given time.

Perhaps it would be best just to show a moving average (5-year grey line), which would show the changes over time.  It looks rather similar to the 3-segment linear.  The data is telling us that changes occurred over time, which affected the trend of the sea ice extent.

oren

  • Moderator
  • Multi-year ice
  • Posts: 10079
    • View Profile
  • Liked: 3797
  • Likes Given: 4359
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #7 on: September 18, 2024, 10:17:58 PM »
Here is a polynomial fit that may be more satisfying. The problem is that the sea ice does not follow any mathematical formula. There are a multitude of physical parameters that influence the outcome, and they are changing simultaneously. Hence, some years become "outliers" as several parameters converge at a given time.

Perhaps it would be best just to show a moving average (5-year grey line), which would show the changes over time.  It looks rather similar to the 3-segment linear.  The data is telling us that changes occurred over time, which affected the trend of the sea ice extent.
The dots making up the segments in the last image appear almost cherry-picked. Starting the green line with the 1996 outlier peak gives it an extra slope.

The Walrus

  • Young ice
  • Posts: 3319
    • View Profile
  • Liked: 201
  • Likes Given: 529
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #8 on: September 18, 2024, 10:33:00 PM »
Here is a polynomial fit that may be more satisfying. The problem is that the sea ice does not follow any mathematical formula. There are a multitude of physical parameters that influence the outcome, and they are changing simultaneously. Hence, some years become "outliers" as several parameters converge at a given time.

Perhaps it would be best just to show a moving average (5-year grey line), which would show the changes over time.  It looks rather similar to the 3-segment linear.  The data is telling us that changes occurred over time, which affected the trend of the sea ice extent.
The dots making up the segments in the last image appear almost cherry-picked. Starting the green line with the 1996 outlier peak gives it an extra slope.

Yes, it does. But removing one year at the beginning (1996) and replacing it with one at the end (2012) yields a somewhat similar slope. The 3-segment linear trend still aligns with the moving average.
« Last Edit: September 18, 2024, 10:38:06 PM by The Walrus »

oren

  • Moderator
  • Multi-year ice
  • Posts: 10079
    • View Profile
  • Liked: 3797
  • Likes Given: 4359
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #9 on: September 18, 2024, 10:36:03 PM »
And how about removing 1996 from the green line, adding 1999 to the first line, and not adding 2012 to the green line?
So taking the original 3 segments but moving the first break point one year forward.

El Cid

  • Young ice
  • Posts: 2666
    • View Profile
  • Liked: 1023
  • Likes Given: 241
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #10 on: September 18, 2024, 11:01:25 PM »
Answering the title of the thread: No, 2007 and 2012 are NOT outliers. They are not measurement errors, they are not inexplicable extreme values; they only show an accelerating trend that suddenly stopped or at least seriously slowed down in 2012/13. They are perfectly within the realm of natural variability.

You could argue that the long-term downtrend is linear, or you could argue that there is a 3-segment system (which in all likelihood will start a new downleg one day), but there is no reason to exclude them.

I don't even understand why anyone would even think of that. Excluding them is absolute cherry picking.

Phil.

  • Grease ice
  • Posts: 580
    • View Profile
  • Liked: 87
  • Likes Given: 13
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #11 on: September 19, 2024, 01:30:41 AM »
"Tamino" constructed a "3 segment" graph of annual average extent:

https://tamino.wordpress.com/2021/09/30/one-look-at-a-graph/

Then Kev Pluck constructed a graph of annual minima omitting two normally conspicuous "cherries":

https://x.com/kevpluck/status/1740326508844273815

Then Gerontocrat constructed a similar graph, also omitting 1996:

https://forum.arctic-sea-ice.net/index.php/topic,4145.msg410488.html#msg410488

Discuss!

Good point. Excluding extreme outliers is sometimes done in legitimate statistical analyses. As long as the criteria for exclusion are reasonable and even-handed (i.e., not excluding only the high or only the low outliers), this strikes me as defensible, and not "cherry-picking." In some kinds of statistical analysis, this is accomplished by examining medians rather than means. How to apply this kind of even-handed cleaning up of noisy data for Arctic sea ice is beyond me, however.

A method I've used in my research is Grubbs' test, where a data point can be considered an outlier if its Z value exceeds a critical value (which depends on the total number of points). If you did something like that based on the deviation from the fitted line, you could identify statistical outliers. For 50 points the critical Z value would be 3.13 (based on a 5% probability).
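A minimal sketch of that check, assuming the residuals from the fitted line have already been computed. The function name `grubbs_statistic` and the `residuals` values are made up for illustration, and the critical value is simply the 3.13 quoted above for 50 points rather than something derived in the code.

```python
import statistics


def grubbs_statistic(residuals):
    """Return (index, Z) for the residual farthest from the sample mean.

    Z is the absolute deviation from the mean divided by the sample
    standard deviation, which is the quantity Grubbs' test compares
    against a critical value.
    """
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)
    idx = max(range(len(residuals)), key=lambda i: abs(residuals[i] - mean))
    return idx, abs(residuals[idx] - mean) / sd


# Critical Z for 50 points at the 5% level, as quoted in the post.
CRITICAL_Z_N50 = 3.13

# Hypothetical residuals: 49 small deviations plus one large one.
residuals = [0.5 if i % 2 == 0 else -0.5 for i in range(49)] + [5.0]

idx, z = grubbs_statistic(residuals)
is_outlier = z > CRITICAL_Z_N50
print(f"most extreme point: index {idx}, Z = {z:.2f}, outlier: {is_outlier}")
```

Grubbs' test examines only the single most extreme point and assumes the remaining residuals are roughly normal, so it is a check on one candidate year at a time rather than a way to prune several years at once.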

HapHazard

  • Grease ice
  • Posts: 909
  • Chillin' on Cold Mountain.
    • View Profile
  • Liked: 312
  • Likes Given: 5632
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #12 on: September 19, 2024, 02:09:46 AM »
IDK about them being outliers, but all the charts posted here are too short-term, and in that sense it's cherry picking. You're basically zooming in on one tree & telling me that's the entire forest. Not that we have great long-term data; it is what it is.

Besides, area & volume are more important anyway.

Y'all are arguing over a solution to the 3 body problem lmao
If I call you out but go no further, the reason is Brandolini's law.

El Cid

  • Young ice
  • Posts: 2666
    • View Profile
  • Liked: 1023
  • Likes Given: 241
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #13 on: September 19, 2024, 08:16:44 AM »
To demonstrate how much 2007 and 2012 are NOT outliers, consider the chart below. I looked at annual average sea ice volume (PIOMAS), took the 7-year centered moving average (meaning that the 3 years before and the 3 years after a given year are included in the average) and looked at how far each year is from that trend. Results are similar if you look at only September.
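The calculation described above can be sketched roughly as follows. How the first and last three years were handled in the chart is not stated, so this version assumes they are simply left undefined; the `volume` series is a made-up placeholder, not PIOMAS data.

```python
def centered_moving_average(values, window=7):
    """Centered moving average; positions without a full window are None."""
    half = window // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        trend[i] = sum(values[i - half:i + half + 1]) / window
    return trend


def deviations_from_trend(values, window=7):
    """Each value minus its centered moving average (None near the ends)."""
    trend = centered_moving_average(values, window)
    return [None if t is None else v - t for v, t in zip(values, trend)]


# Hypothetical series: flat apart from one unusual year in the middle.
volume = [1.0] * 9
volume[4] = 8.0

deviations = deviations_from_trend(volume)
print(deviations)
```

Because the unusual year is itself part of its own 7-year window, it pulls the trend towards itself, so this measure somewhat understates how far an extreme year sits from its neighbours.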
« Last Edit: September 19, 2024, 08:21:55 AM by El Cid »

oren

  • Moderator
  • Multi-year ice
  • Posts: 10079
    • View Profile
  • Liked: 3797
  • Likes Given: 4359
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #14 on: September 19, 2024, 09:44:28 AM »
2012 was an especial outlier in Sept extent, so checking for other variables will not necessarily reveal it.

Obviously, it wasn't even a true outlier (as pointed out above). But it had a confluence of factors, and one of them was compaction, which served to push extent much lower than what we saw in 2016 or 2024 even though area was not that far off.

El Cid

  • Young ice
  • Posts: 2666
    • View Profile
  • Liked: 1023
  • Likes Given: 241
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #15 on: September 19, 2024, 10:01:10 AM »
2012 was an especial outlier in Sept extent, so checking for other variables will not necessarily reveal it.

Obviously, it wasn't even a true outlier (as pointed out above). But it had a confluence of factors, and one of them was compaction, which served to push extent much lower than what we saw in 2016 or 2024 even though area was not that far off.

I used volume as that is thought to be the best indicator (although not directly observed/measured).

You can see that September 2012 is as much an outlier as 2014 in the other direction. Also, 2012 was as much an outlier as 1995. The chart above suggests that no years are outliers; all are within natural variation. Of the 43 values, 30 are within +/-1, a further 11 are within +/-2, and there is one value each slightly above 2 and below -2. Looks like a pretty well-behaved distribution to me.
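As a rough cross-check of those counts, and assuming (this is not stated in the post) that the chart's units are standard deviations, the numbers can be compared with what a normal distribution would give for 43 values. The variable names are illustrative.

```python
import math


def normal_fraction_within(z):
    """Fraction of a normal distribution lying within +/- z standard deviations."""
    return math.erf(z / math.sqrt(2))


n_values = 43  # number of values quoted in the post

expected_within_1 = n_values * normal_fraction_within(1)
expected_between_1_and_2 = n_values * (
    normal_fraction_within(2) - normal_fraction_within(1)
)
expected_beyond_2 = n_values * (1 - normal_fraction_within(2))

print(f"within 1:      expected {expected_within_1:.1f}, observed 30")
print(f"between 1 and 2: expected {expected_between_1_and_2:.1f}, observed 11")
print(f"beyond 2:      expected {expected_beyond_2:.1f}, observed 2")
```

Under that assumption the observed split of 30, 11 and 2 sits close to the normal expectation, which is consistent with the description of a well-behaved distribution.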




Jim Hunt

  • First-year ice
  • Posts: 6514
  • Don't Vote NatC or PopCon, Save Lives!
    • View Profile
    • The Arctic sea ice Great White Con
  • Liked: 1015
  • Likes Given: 92
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #16 on: September 19, 2024, 10:25:19 AM »
I used volume as that is thought to be the best indicator (although not directly observed/measured).

However see below, and they're working on doing it in summer too.
"The most revolutionary thing one can do always is to proclaim loudly what is happening" - Rosa Luxemburg

https://bsky.app/profile/greatwhitecon.info

Jim Hunt

  • First-year ice
  • Posts: 6514
  • Don't Vote NatC or PopCon, Save Lives!
    • View Profile
    • The Arctic sea ice Great White Con
  • Liked: 1015
  • Likes Given: 92
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #17 on: September 19, 2024, 10:33:20 AM »
"The most revolutionary thing one can do always is to proclaim loudly what is happening" - Rosa Luxemburg

https://bsky.app/profile/greatwhitecon.info

HapHazard

  • Grease ice
  • Posts: 909
  • Chillin' on Cold Mountain.
    • View Profile
  • Liked: 312
  • Likes Given: 5632
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #18 on: September 19, 2024, 09:46:48 PM »
That's what I'm talking about, Jim, thank you.  ;)
If I call you out but go no further, the reason is Brandolini's law.

kassy

  • First-year ice
  • Posts: 9138
    • View Profile
  • Liked: 2213
  • Likes Given: 2044
Re: Are 2007 and 2012 "statistical outliers"?
« Reply #19 on: September 20, 2024, 09:12:39 PM »
Even if you assume both are outliers, they are still statistical data points. Looking at the fundamentals, they will ultimately fit in fine.
If you just look back at the graphics you miss the newer inputs. Things like stronger atmospheric rivers over time or cloud changes:
https://www.sciencedaily.com/releases/2024/09/240919114912.htm.
This monument is to acknowledge that we know what is happening and what needs to be done. Only you know if we did it.