Author Topic: Mathematics of drawing distribution graphs of sea ice extent  (Read 2829 times)

cesium62

  • Frazil ice
  • Posts: 310
  • Liked: 20
  • Likes Given: 1
Mathematics of drawing distribution graphs of sea ice extent
« on: March 21, 2015, 07:22:02 PM »
Over in the Melting thread, Siffy posts some graphs made by Wipneus of sea ice extent in various Arctic seas.  The graphs display standard deviation bands that assume the sea ice extent values for a given date are normally distributed.  However, the data are fairly clearly not normally distributed.

In hopes of getting at least a short detailed mathematical discussion out of this, I've created this new thread.

Quote
Siffy, your question has been mostly answered by others. The grey bands show the average cover with error bands of 1 and 2 standard deviations.
That ignores a geographical maximum cover in some regions, but also the very skewed behavior of ice cover deviations: negative swings are much larger than positive ones, even where there is no physical maximum.
Note that just cutting the grey bands off is not a sound solution. The possibilities that have been cut off will have to appear somewhere else, that is, in other regions. Before you know it, you get into modelling so complicated that it cannot possibly be useful anymore.

For the sake of simplicity, and because such issues are commonly ignored in the field, I chose not to do anything about it. I am open to suggestions, though.

The data distribution looks like a response-time latency distribution.  An overly quick Google search suggests that modeling this as an "Ex-Gaussian" distribution might work.  That still looks a bit complicated to implement, and it's not immediately obvious to me where you would draw the equivalent of the standard deviation bands.
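One possible answer to "where do the bands go", offered as a rough sketch only: place them at the same cumulative probabilities that 1 and 2 sigma have under a normal curve (about 15.9%/84.1% and 2.3%/97.7%). The snippet below does not fit anything. It simulates a mirrored ex-Gaussian (normal minus exponential, since ice extent skews low while the usual ex-Gaussian skews high) and reads the bands off the samples. The function names and the parameters mu, sigma and tau are invented for illustration.

```python
import random
from statistics import NormalDist


def sample_mirrored_exgauss(mu, sigma, tau, size, seed=0):
    """Draw samples of Normal(mu, sigma) minus Exponential(mean=tau).

    Subtracting the exponential part gives a long tail on the low side.
    All parameter values used with this function are assumptions.
    """
    rng = random.Random(seed)
    return [mu + rng.gauss(0.0, sigma) - rng.expovariate(1.0 / tau)
            for _ in range(size)]


def sigma_equivalent_bands(samples, sigmas=(-2, -1, 0, 1, 2)):
    """Return sample quantiles at the probabilities a normal curve
    assigns to each sigma level (k = 0 gives the median)."""
    ordered = sorted(samples)
    std_normal = NormalDist()
    bands = {}
    for k in sigmas:
        p = std_normal.cdf(k)  # e.g. k = -1 gives roughly 0.159
        idx = min(len(ordered) - 1, int(p * len(ordered)))
        bands[k] = ordered[idx]
    return bands


# Invented parameters, loosely "million km^2"; not fitted to real data.
bands = sigma_equivalent_bands(sample_mirrored_exgauss(10.0, 0.2, 0.5, 20000))
for k in sorted(bands):
    print(f"{k:+d} sigma equivalent: {bands[k]:.2f}")
```

With a low-side tail like this, the lower bands end up further from the median than the upper ones, which is the asymmetry the symmetric grey bands cannot show.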

The other approach is to simply draw in the bands from the data and avoid modeling a distribution for it.  (Well, we will still be modeling some sort of distribution, but...)  E.g., sort the data points.  Find the median.  Find the points that sit one third of the data above and below the median (or choose whatever gradation seems interesting to draw).  Find the points that are 3% from the top and 3% from the bottom or so.

Perhaps there are too few points (we have, what, 20 or 30?) for this exercise to be sufficiently meaningful, but it should be at least as meaningful as trying to force a normal distribution onto the data.
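The sort-and-pick recipe above could look something like this for a single calendar day. It is a minimal sketch: the function names are mine, the linear interpolation between neighbouring points is just one common convention, and "one third above and below the median" is read as the 16.7th and 83.3rd percentiles.

```python
def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in 0..100) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * p / 100.0  # fractional index
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def empirical_bands(values, inner=(100 / 6, 500 / 6), outer=(3.0, 97.0)):
    """Median plus an inner and an outer band, taken straight from the data.

    inner: one third of the data on either side of the median.
    outer: roughly 3% from the bottom and 3% from the top.
    """
    ordered = sorted(values)
    return {
        "median": percentile(ordered, 50.0),
        "inner": tuple(percentile(ordered, p) for p in inner),
        "outer": tuple(percentile(ordered, p) for p in outer),
    }


# Invented extents for one day across a handful of years (not real data).
example = [7.1, 6.9, 7.3, 7.0, 6.2, 7.2, 6.8, 7.1, 5.9, 7.0]
print(empirical_bands(example))
```

Repeating this for every day of the year and joining the results would give band outlines that follow the skew of the data instead of assuming symmetry.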

[Edit: seaicesailor gave the same suggestion in simpler language some 8 hours before me, which I would have seen had I bothered to read ahead before responding...]
« Last Edit: March 21, 2015, 07:27:50 PM by cesium62 »

seaicesailor

  • Guest
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #1 on: March 21, 2015, 07:32:41 PM »
Lol, thx.
But I realized after reading your post that indeed there are too few samples for the 2.5th and 97.5th percentiles . . .

seaicesailor

  • Guest
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #2 on: March 21, 2015, 07:41:22 PM »
Which makes me think, is the series of years too short to draw conclusions beyond the level of confidence of +/- 1 standard deviation?

jdallen

  • Young ice
  • Posts: 3290
  • Liked: 573
  • Likes Given: 214
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #3 on: March 21, 2015, 08:25:17 PM »
Quote
Which makes me think, is the series of years too short to draw conclusions beyond the level of confidence of +/- 1 standard deviation?

No, I don't think so.  I agree, the 97th percentile is a bit of a stretch, but the 92nd is not, and that gives us our +/- 2SD. 

I also agree, the larger our samples, the more skillful our estimates.  Smaller samples are still useful, and have been reasonably predictive.  When we see outliers passing the existing 2SD limit, it is not necessarily indicating the estimation was wrong. 

Keep in mind that the statistical range for any evaluation of a system depends on a consistent context surrounding the dimensions being sampled.  The +/- SD assumes the same underlying context from sample period to sample period.  It establishes a history for the system.  With our Arctic measurements, when we start seeing outliers passing the 2SD range, it more likely indicates that the underlying context is shifting, not that the sample is too small.
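To make the "outliers passing the 2SD range" check concrete, here is a small sketch that measures recent values against a baseline period. The function names and all numbers are invented for illustration, and using the sample standard deviation of the baseline is an assumption, not necessarily how the graphs were made.

```python
from statistics import mean, stdev


def sigma_distance(baseline, value):
    """How many baseline standard deviations `value` lies from the baseline mean."""
    return (value - mean(baseline)) / stdev(baseline)


def count_below_band(baseline, recent, k=2.0):
    """Count recent values lying more than k baseline SDs below the baseline mean."""
    return sum(1 for v in recent if sigma_distance(baseline, v) < -k)


# Invented numbers (million km^2), for illustration only.
baseline = [7.0, 7.2, 6.8, 7.1, 6.9]
recent = [6.5, 6.9, 6.4]
print(count_below_band(baseline, recent))
```

If the context were stable and roughly normal, a value below -2 SD would turn up only about 2% of the time, so several of them close together point to a shift rather than to bad luck.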
This space for Rent.

ghoti

  • Grease ice
  • Posts: 767
  • Liked: 12
  • Likes Given: 15
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #4 on: March 21, 2015, 09:05:06 PM »
What distribution is assumed for the SD calculation? It is unlikely to be a normal distribution even though that's the usual assumption used when calculating SD.

seaicesailor

  • Guest
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #5 on: March 21, 2015, 09:18:41 PM »
The distribution is not normal. But nevertheless, jdallen, you are right. I understand that if the last 15 years fall more than 1 standard deviation below the previous 15, one can conclude that the distribution is shifting, and if the last 5 summers sit around -2 sigma of what used to be the range in the 20th century, one can also conclude that the Arctic has completely changed. That is the idea behind those bands: to show how badly things are going.
I was observing the mathematical fact that there are too few data to say, for instance, 'the September extent or area will be within this or that range, with 95% confidence'. That is why people also need May extent/area, snow cover, the number of melt ponds, etc. to predict the minimum extent in September.
I think I am now stating the obvious and drifting off-topic.

Wipneus

  • Citizen scientist
  • Young ice
  • Posts: 4200
    • Arctische Pinguin
  • Liked: 996
  • Likes Given: 0
Re: Mathematics of drawing distribution graphs of sea ice extent
« Reply #6 on: March 22, 2015, 09:39:34 AM »
Sketching percentiles belongs, IMO, to the simple methods and is worth a try.

It is a standard 30-year meteorological period (1981-2010), so one year counts for 3.3%. That is near the 2.3% and 97.7% levels of the 2-sigma bands. One sigma would be near the 5th and 25th ranking for each day.
Drawing lines through those points should not be too difficult.
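The arithmetic behind those rankings can be checked with a few lines. This is a sketch under one assumption: the rank is taken as simply probability times number of years, which is a crude convention (others shift the result by up to about one position). The function name is made up.

```python
from statistics import NormalDist


def sigma_levels(n_years=30, sigmas=(-2, -1, 1, 2)):
    """For each sigma level, return (sigma, cumulative probability under a
    normal curve, approximate rank among n_years sorted values)."""
    std_normal = NormalDist()
    rows = []
    for k in sigmas:
        p = std_normal.cdf(k)
        rows.append((k, p, p * n_years))  # crude rank: probability * n
    return rows


for k, p, rank in sigma_levels():
    print(f"{k:+d} sigma: {100 * p:5.1f}%  approx. rank {rank:4.1f} of 30")
```

The 1-sigma levels come out at about 15.9% and 84.1%, so ranks a little under 5 and a little over 25, while the 2-sigma levels fall between the lowest (or highest) year and the edge of the sample.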