
Author Topic: Interpretation of time codes in cryosphere data  (Read 5913 times)

plg

  • New ice
  • Posts: 76
Interpretation of time codes in cryosphere data
« on: May 06, 2015, 02:40:46 PM »
The data from Cryosphere (http://arctic.atmos.uiuc.edu/cryosphere/timeseries.anom.1979-2008) uses time codes instead of dates. The time code is expressed as 10000ths of a year.

However, the codes are difficult to interpret: they do not map properly to leap years, and there are duplicate time codes with different values for area.

The purpose of this thread is to discuss the interpretation of these time codes.

[I created this thread in response to a request by RaenorShine in the "Home brew AMSR2 extent & area calculation" thread, to avoid cluttering up Wipneus' calculations.]
If you are not paranoid you just do not have enough information yet.

plg

Re: Interpretation of time codes in cryosphere data
« Reply #1 on: May 06, 2015, 02:43:26 PM »
[This is a repost of my original entry in the "Home brew AMSR2 extent & area calculation" thread]

Corrections for timecodes in cryosphere data

I have been using the cryosphere data (http://arctic.atmos.uiuc.edu/cryosphere/timeseries.anom.1979-2008) for a couple of years for fun and no profit (as well as IJIS, PIOMAS and NSIDC).

At the time I set up an automated import into a database (PostgreSQL) and noted that the timecodes were not unique. I did not bother to look further then, but just turned off the indexing on the timecode.

I have recently had some time on my hands and decided to have a closer look. After a few hours' work I found several inconsistencies and probable typos.

There are 366 data points for all years divisible by 4 and 365 for all others (except 1979, which has only 364 values because 1979.0000 is missing). Note that 2000 is therefore treated as a leap year (which it is under the Gregorian rule, being divisible by 400). I use the term Y4 instead of leap year as a reminder that the rule in the data is simply divisibility by 4, which is not the strict leap-year rule.
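A minimal sketch of how the per-year counts above could be checked, assuming the time codes have already been read from the first column as strings such as '1987.9233'. The function names are my own, and the demo list is made-up data rather than real values from the file. The current, incomplete year will naturally show up as deviating.

```python
from collections import Counter

def points_per_year(codes):
    """Count data points per year from time-code strings like '1987.9233'."""
    return Counter(int(code.split('.')[0]) for code in codes)

def expected_points(year):
    """Rule observed in the data: 366 values if divisible by 4, else 365."""
    return 366 if year % 4 == 0 else 365

def report_odd_years(codes):
    """Return {year: (found, expected)} for years that deviate from the rule."""
    counts = points_per_year(codes)
    return {year: (n, expected_points(year))
            for year, n in sorted(counts.items())
            if n != expected_points(year)}

# Tiny synthetic demo (hypothetical codes, not real data).
demo = ['1979.0027', '1979.0055', '1980.0000']
print(report_odd_years(demo))
```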

Focusing first on the non-Y4 years, I found the following inconsistencies (timecodes whose fractional part is not round(d/365, 4), where d is the day of the year counted from 0), some of which appear to be obvious typos:
  • 1987.9253 -> 1987.9233
  • 2005.1088 -> 2005.1096
  • 2005.8650 -> 2005.8658
  • 2010.0928 -> 2010.0932
  • 2009.4274 -> 2009.4247 (duplicate code, only correct the first occurrence)
In addition, the block of values 2007.9315 - 2007.9808 (19 values) has been shifted 1 step (~0.0027), since the code 2007.9288 is missing and 2007.9808 is a duplicate.

The following will fix this:
  • for codes 2007.9342 - 2007.9780, set the code to the code on the previous line
  • change 2007.9315 -> 2007.9288 (missing time code)
  • change 2007.9808 -> 2007.9780 (first occurrence, duplicate entry).
This will make all non-Y4 years compact (365 values) and monotonically increasing by 1 day (~0.00274).
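A rough sketch of how such inconsistencies can be found for a non-Y4 year, assuming the fractional part of each code is round(d/365, 4) with d = 0..364. The helper names are hypothetical, and the real file may round slightly differently for some days, so treat any hits as candidates to inspect rather than as confirmed errors.

```python
from collections import Counter

def expected_codes(year, ndays=365):
    """Expected code strings for one year on the assumed 1/365 grid."""
    return ['%d.%04d' % (year, int(round(d * 10000 / 365.0)))
            for d in range(ndays)]

def check_year(codes, year):
    """Return (unexpected, missing, duplicated) code strings for one year."""
    expected = expected_codes(year)
    expected_set = set(expected)
    counts = Counter(codes)
    unexpected = sorted(c for c in counts if c not in expected_set)
    missing = [c for c in expected if c not in counts]
    duplicated = sorted(c for c, n in counts.items() if n > 1)
    return unexpected, missing, duplicated

# Synthetic demo: a clean 1987 grid with the typo listed above put back in.
codes = expected_codes(1987)
bad = ['1987.9253' if c == '1987.9233' else c for c in codes]
print(check_year(bad, 1987))
```

A code that is reported as both missing and close to an unexpected one is a likely typo; a code that is only duplicated points to the kind of shifted block seen in 2007.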

For the Y4 years there is more to do. I believe that approximately 3/4 of the year is one day off, but this requires some more explanation, which I address in the following posts.

plg

Re: Interpretation of time codes in cryosphere data
« Reply #2 on: May 06, 2015, 02:49:41 PM »
Simple quick-and-dirty Python code for correcting the data for non-Y4 years as explained in the previous post. The output adds a corrected time code column; the original time codes are left untouched in the first column (for some reason half the text is in italics, due to the forum software):

Code:
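INFILE = 'timeseries.anom.1979-2008'
OUTFILE = 'timeseries.anom.1979-2008_mod'

# Isolated typos: original time code -> corrected time code.
TYPOS = {
    '1987.9253': '1987.9233',
    '2005.1088': '2005.1096',
    '2005.8650': '2005.8658',
    '2010.0928': '2010.0932',
}

def getlines(fn=INFILE):
    """Read the file into rows of whitespace-separated fields (blank lines skipped)."""
    with open(fn) as f:
        return [x.split() for x in f if x.strip()]

def processlines(lns):
    """Append a corrected time code to each row; column 0 keeps the original code."""
    for i, ln in enumerate(lns):
        code = ln[0]
        prev = lns[i - 1][0] if i > 0 else None  # original code of the previous row
        if code in TYPOS:
            ln.append(TYPOS[code])
        elif code == '2009.4274' and code != prev:
            # Duplicated code: only the first occurrence is wrong.
            ln.append('2009.4247')
        elif code == '2007.9315':
            # First row of the shifted block gets the missing code.
            ln.append('2007.9288')
        elif '2007.9342' <= code <= '2007.9780':
            # Shifted block: each row takes the code of the previous row.
            # String comparison is safe because all codes have the same width.
            ln.append(prev)
        elif code == '2007.9808' and code != prev:
            # First occurrence of the duplicate closes the shifted block.
            ln.append(prev)
        else:
            ln.append(code)
    return lns

def putlines(lns, fn=OUTFILE):
    """Write the rows back out with the corrected code as an extra last column."""
    # Use '\n' here: text mode already translates it to the platform line ending,
    # so os.linesep would produce doubled carriage returns on Windows.
    fmt = ' %s%12s%12s%12s%11s\n'
    with open(fn, 'w') as f:
        for ln in lns:
            f.write(fmt % tuple(ln))

if __name__ == '__main__':
    putlines(processlines(getlines()))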

This code works well on a real operating system, and I am confident it will work on Windows as well.
« Last Edit: May 06, 2015, 03:49:14 PM by plg »

RaenorShine

  • Frazil ice
  • Posts: 244
Re: Interpretation of time codes in cryosphere data
« Reply #3 on: May 06, 2015, 03:00:21 PM »
Thanks for moving this across! It won't get lost over here, and I'm sure it will help others analysing the CT data.

plg

Re: Interpretation of time codes in cryosphere data
« Reply #4 on: May 06, 2015, 03:48:41 PM »
Time codes for leap years and 2000

Inspecting the data for Y4 years (years divisible by 4, including 2000) shows uniformly 366 data points, compared to the 365 values for the other years. However, the codes themselves range from 0.0000 to 0.9973 for all years, and the average spacing is 0.00274, which is 1/365. Apart from intervals of 0.0026 to 0.0029 (which is 0.00274 with some round-off jitter), there are only 14 "odd" intervals and 9 intervals of zero (i.e. duplicates).

This shows that the codes are not "stretched" for leap years, but really do represent steps of 1/365 of a year (10000/365, or about 27.4, units of 0.0001 per day). A step of 0.0001 thus corresponds to 3153.6 seconds, or 52 minutes and 33.6 seconds.
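The arithmetic above, plus a small decoder, can be sketched as follows. It assumes a code is year + d/365 with d counted from 0 on 1 January, and the function names are my own. The date conversion only makes sense for non-Y4 years; how to place the 366th value in Y4 years is the open question of this thread.

```python
from datetime import date, timedelta

# One code unit (0.0001 of a 365-day year), in tenths of a second,
# using integer arithmetic to avoid floating-point noise.
UNIT_TENTHS = 365 * 86400 * 10 // 10000
minutes, rest = divmod(UNIT_TENTHS, 600)
print('%.1f s = %d min %.1f s' % (UNIT_TENTHS / 10.0, minutes, rest / 10.0))

def day_index(code):
    """Return (year, d) for a code string like '1987.9233', d counted from 0."""
    year, frac = code.split('.')
    return int(year), int(round(int(frac) * 365 / 10000.0))

def to_date(code):
    """Calendar date for a non-Y4 code: 1 January plus d days."""
    year, d = day_index(code)
    return date(year, 1, 1) + timedelta(days=d)

print(day_index('1987.9233'), to_date('1987.9233'))
```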

After removing the errors for non-Y4 years pointed out in the previous post, the only anomaly that remains is one extra data point between day 59 and day 61 (between day 161 and day 163 for 2008 only); see the embedded image (can't figure out how to make pretty tables or how to gracefully embed images).

So, there is consistently one extra data point on day 60 (day 162 for 2008), i.e. a duplicate or close enough. It cannot be a coincidence that day 60, counting January 1 as day 0, is March 1 in leap years.

So, to normalize the Y4 years:
  • Shift all values from x.1671 onward forward by one day, so the last code becomes x.9973 + 0.0027 = (x+1).0000; since that overflows into the next year, cap it at x.9999.
  • For the two values between x.1616 and (former) x.1671, assign the values x.1644 and x.1671.
  • For 2008 apply the same logic, but with x.4411 and x.4466 as the breakpoints instead of x.1616 and x.1671.
  • For 2000 it is hopeless, as there will be an extra value no matter what.
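The shift described in the list above might look like the sketch below. It works on the fractional parts as integers in units of 0.0001 and assumes exactly two values lie strictly between the breakpoints; the function names are hypothetical and the demo year is synthetic, not taken from the file.

```python
def frac_for_day(d):
    """Fraction (units of 0.0001 year) for day d, counted from 0, capped at 9999."""
    return min(int(round(d * 10000 / 365.0)), 9999)

def day_for_frac(f):
    """Nearest day index (counted from 0) for a fraction on the 1/365 grid."""
    return int(round(f * 365 / 10000.0))

def normalize_y4(fracs, before=1616, after=1671):
    """Shift everything from `after` onward by one day and refill the gap."""
    if sum(1 for f in fracs if before < f < after) != 2:
        raise ValueError('expected exactly two values between the breakpoints')
    first_gap_day = day_for_frac(before) + 1
    out, seen = [], 0
    for f in fracs:
        if f <= before:
            out.append(f)                              # untouched
        elif f < after:
            out.append(frac_for_day(first_gap_day + seen))  # 1644, then 1671
            seen += 1
        else:
            out.append(frac_for_day(day_for_frac(f) + 1))   # one day later
    return out

# Synthetic Y4 year: regular grid plus one near-duplicate around day 60.
raw = ([frac_for_day(d) for d in range(60)] + [1644, 1645] +
       [frac_for_day(d) for d in range(61, 365)])
fixed = normalize_y4(raw)
print(len(fixed), fixed[58:64], fixed[-1])
```

For 2008 the call would be normalize_y4(raw, before=4411, after=4466).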
An alternate solution is as follows:
  • Start with x.0000 and assign it day 1. Count days forward as long as the interval is ~0.0027, then stop.
  • Start with x.9973 and assign it day 366. Count days backward as long as the interval is ~0.0027, then stop.
  • If there is one value left, assign it to the missing date (applies to years with duplicate entries)
  • Done!
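A sketch of this alternate procedure, under the assumption that the year's fractional parts are given in file order as integers in units of 0.0001, starting at 0000 and ending at 9973. The tolerance of 26 to 29 units is taken from the interval range mentioned above; the function name and the demo data are made up.

```python
def assign_days(fracs, last_day=366, lo=26, hi=29):
    """Return a day number (counted from 1) per row, or None if unplaced."""
    def regular(a, b):
        return lo <= b - a <= hi

    n = len(fracs)
    days = [None] * n

    # Forward pass from the first row, which is day 1.
    i = 0
    days[0] = 1
    while i + 1 < n and regular(fracs[i], fracs[i + 1]):
        i += 1
        days[i] = i + 1

    # Backward pass from the last row, which is day `last_day`.
    j = n - 1
    if days[j] is None:
        days[j] = last_day
    while j - 1 > i and regular(fracs[j - 1], fracs[j]):
        j -= 1
        days[j] = last_day - (n - 1 - j)

    # Exactly one row left between the passes: give it the missing day.
    if j - i == 2 and days[j] - days[i] == 2:
        days[i + 1] = days[i] + 1
    return days

def grid(d):
    """Fraction for day d (counted from 0) on the assumed 1/365 grid."""
    return int(round(d * 10000 / 365.0))

# Synthetic Y4 year with one stray value that fits neither neighbour.
demo = ([grid(d) for d in range(60)] + [1644, 1658] +
        [grid(d) for d in range(61, 365)])
days = assign_days(demo)
print(days[58:64], days.count(None))
```

Rows that remain None after all three steps would need manual inspection.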
I have no clue what to do about 2000, nor why 2008 has a different "breakpoint".

Hope this helps.

Update: Modified the table below, the dates were off by one. No other change.
« Last Edit: May 09, 2015, 11:43:42 AM by plg »

plg

Re: Interpretation of time codes in cryosphere data
« Reply #5 on: May 06, 2015, 03:51:43 PM »
Quote from RaenorShine: "Thanks for moving this across! It won't get lost over here, and I'm sure it will help others analysing the CT data."
No problem, thanks for the pointer. Would certainly not want to disturb Wipneus' calculations :)