Excel - Handling Sea Ice Data.The source Excel workbook for this is available on Google Docs
here. It's best not to view it from Google Docs, select File>Download and save or open direct.
There are 6 sheets in the spreadsheet.
- RawData - the point of entry for new data.
- Area - Cryosphere Today area tabulated data.
- Volume - PIOMAS daily volume tabulated data.
- CalcThick - Calculated thickness tabulated data.
- ThickSeries - Calculated thickness time series.
- Monthly - Monthly average values of various indices of sea ice.
In this first post I'll explain the nuts and bolts of how this spreadsheet is constructed. In this series my posts will be headed with a bold title so that they stand out from any discussion.
The first place to start is on the sheet RawData, click on the bottom tab and it will open. Note that between lines 10 and 11 there is a break, scrolling below that line moves the data up and down but keeps the headers of the columns visible, they don't disappear off the top of the screen. This is done using Freeze Panes (View > FreezePanes).
You'll see that there are five columns:
- Area.
- DataTable.
- Volume.
- Extent.
The URLs from which to obtain the source data are given above each column, apart from DataTable, which will be explained. I could use data links to update these source data columns (Data > Get External Data), however I've found them to be awkward to keep working through regular updates, so my update method is to do it manually. I'll be keeping that Spreadsheet up to date when PIOMAS data comes out each month. The way to update is described in a following comment, for now I want to keep on track with how all this works.
The key to the RawData sheet, and to subsequent sheet interactions with RawData is the DataTable table (columns G to L).
The table itself is in blue and the key column is the DateNum column. This obtains the date number index from the Area columns (Columns B to E), specifically from the Date column (column B) which is from the original Cryosphere Today area data index as downloaded from the stated webpage. The Year (G) is worked out as =INT(Bxx), this cuts off the CT date index to leave only the integer part. The day number (day of the year) is a bit more complex, that uses:
=ROUND(365*(Bxx-INT(Bxx)),0)+1
ROUND(X,Y) rounds the number X, to the number of digits Y, for example ROUND(12.347386,2) produces 12.35 as a result. Bxx-INT(Bxx) simply removes the whole number part of a number leaving the part to the right of the decimal point, for example 12.347386 becomes .347386. This is then multiplied by 365 because it's a fraction of one year, and one is added because the first day of the year is '.000'.
The above formulae makes two columns, one of year and one of date. These are combined in column DateNum (I) to make a date index, which is essential to the operation of the whole spreadsheet. The combination is simply done by =Gxx&"\"&Hxx, which concatenates to make the string seen in column I, e.g. 1979/4.
Area is simply copied in cell by cell, to do this in a new table (Insert > Table) one need only put the formula into the first table entry, when the table is resized by clicking on a lower corner and dragging down, the formula copies down that column.
In Extent and Volume (K & L) we see the first use of the key formula behind the whole sheet; the VLOOKUP function. I'll ignore Extent as that's not maintained, and once you understand how volume works you'll be able to start maintaining Extent should you wish to. In a simlar way to the DataTable table, volume is built into a series in a volume table, which is seen in columns U and V. The values in this table are then referenced in column L, the volume column of our main data table.
VLOOKUP works as follows:
VLOOKUP(Reference, Source Data, Column Number, Match Type)
- Reference - the item VLOOKUP will search for in the source data.
- Source Data - the range of the spreadsheet that VLOOKUP will search in, actually it looks down the first column of that range.
- Column Number - the column number in the Source Data range that VLOOKUP returns a result from.
- Match - the type of match, we want to use an exact match so we'll use FALSE for this.
Because we'll be adding new data to the sheet the best was to declare our source data is with a named table. This is why the DataTable table has been built up. VLOOKUP will search down the DateNum column (I) until it finds a match for the Reference it's looking for, then it will use Column Number to tell it how many columns from the first column of the Source Data range it should go (to the right) to find the value to display.
So in L11 we have:
=VLOOKUP(I11,ProcessVol,2,FALSE)
This means - look for the value in I11 in the table process volume, and when you find a match go to the same row as the match, but get the value from column 2. Because FALSE is selected get an exact match, or return an error. In this way the DateNum index of the Volume table (U & V) is searched for the dates listed in the DateNum column of DataTable and the relevant values of volume are copied into the DataTable. Once again; as you resize the table downwards the VLOOKUP formula copies down, so every date gets the correct data.
The result is a table of dates, and for each date are the matched Area (Extent) and Volume, although I'm not using Extent further.
But a list of numbers is not really of use, what we need to do next is get these values into a useable form. I'll go over that in my next comment.