Author Topic: AIdeas  (Read 15329 times)

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
AIdeas
« on: November 26, 2022, 11:14:04 PM »
It seems like uniquorn, one of our resident experts on ice images, has some really cool ideas about which data to use. Coming at this from an AI angle, I'm going to lay out my thinking so that other people can google the concepts or chip in.

This is an image task. Specifically, it is using an AI to create a new image from a previous image. Actually, to be more specific, this is creating a segmentation mask from a raw image (the mask being ice edge and discretised ice concentration, with the raw image(s) being radar data).

Being based on an array in which adjacent pixels correlate (aka a satellite image), this is absolutely a task for a convolutional neural network. Specifically speaking, as it involves creating a mask, this is a task for a U-net.

It looks like the challenge setters have implemented a basic U-net. I can't run it, as my PC is out of action for a few days, but it's definitely a U-net.

So. Our questions are:

-can we beat their model using feature selection?
(Where we limit the data given to the model so it only gets the best data.) We can do feature selection by cleaning the input data further, by removing non-informative modalities of data, or by creating new features from existing data.

-can we beat their model by making a better model?
We can do this by tweaking the U-net (adding, altering or removing layers, including changing the dimensions of hidden layers). There's also the idea of using attention on certain parts of the image, which helps to remove spurious fitting that shouldn't be there. We can also do this by using a different model, i.e. not a U-net.

Finally, can we beat their model by using a combination of feature selection AND model selection/hyperparameter optimization?
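
As a rough illustration of "creating new features from existing data", here is a minimal sketch that derives a vertical-minus-horizontal brightness temperature difference from two of the AMSR2 channels. The channel names come from the dataset; the derived feature, its name and the tiny 2x2 grids are assumptions made purely for illustration, not something the challenge code does.

```python
# Illustrative only: the derived feature and the toy values are assumptions.
def polarisation_difference(v_channel, h_channel):
    """Element-wise V minus H brightness temperature for two equally shaped 2D grids."""
    return [[v - h for v, h in zip(v_row, h_row)]
            for v_row, h_row in zip(v_channel, h_channel)]


# Toy 2x2 scene; real scenes are large arrays read from the .nc files.
scene = {
    "btemp_36_5v": [[228.0, 230.0], [250.0, 251.0]],
    "btemp_36_5h": [[185.0, 190.0], [240.0, 243.0]],
}

# Add the new feature alongside the existing channels (hypothetical name).
scene["btemp_36_5_vh_diff"] = polarisation_difference(
    scene["btemp_36_5v"], scene["btemp_36_5h"]
)
```

Whether a feature like this actually helps the model would have to be checked against a validation score.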

The challenge is designed to save the people who make the charts time.

So we could even be super cool and get the ice chart people to scribble (quickly draw the ice edge) and use their scribbles to perform something called "attention gating", where the model gets hints about where to look. There is a super interesting publication about this concept, but because it requires some form of user input, this would technically be breaking the rules:


https://www.researchgate.net/publication/350505134_Learning_to_Segment_From_Scribbles_Using_Multi-Scale_Adversarial_Attention_Gates


If nothing else, that paper certainly gives us some other unet models to try

The introduction code is here
https://github.com/astokholm/AI4ArcticSeaIceChallenge/blob/main/introduction.ipynb

I had a small problem loading local modules using JupyterLab but other than that, with a default Anaconda install, it's been cut and paste so far. Here is the data from one file.

Quote
nersc_sar_primary; min: -380.123, mean: -14.503, max: 25.228, std: 5.624
nersc_sar_secondary; min: -463.143, mean: -24.702, max: 23.287, std: 4.742
sar_incidenceangle; min: 18.045, mean: 33.998, max: 47.405, std: 8.321
distance_map; min: 0.000, mean: 23.642, max: 41.000, std: 14.999
btemp_6_9h; min: 73.891, mean: 148.777, max: 267.375, std: 61.680
btemp_6_9v; min: 145.383, mean: 203.500, max: 276.789, std: 38.693
btemp_7_3h; min: 74.461, mean: 149.822, max: 261.031, std: 61.706
btemp_7_3v; min: 145.594, mean: 204.181, max: 277.305, std: 38.693
btemp_10_7h; min: 79.352, mean: 153.790, max: 264.844, std: 60.885
btemp_10_7v; min: 151.609, mean: 208.868, max: 275.430, std: 36.261
btemp_18_7h; min: 93.672, mean: 165.524, max: 273.211, std: 52.625
btemp_18_7v; min: 162.398, mean: 218.508, max: 280.367, std: 28.065
btemp_23_8h; min: 107.773, mean: 182.005, max: 279.156, std: 42.902
btemp_23_8v; min: 163.875, mean: 226.286, max: 283.484, std: 22.117
btemp_36_5h; min: 125.289, mean: 185.039, max: 279.562, std: 38.627
btemp_36_5v; min: 149.891, mean: 228.111, max: 285.008, std: 17.640
btemp_89_0h; min: 135.102, mean: 212.373, max: 285.742, std: 22.579
btemp_89_0v; min: 141.055, mean: 241.162, max: 289.000, std: 16.341
u10m_rotated; min: -18.484, mean: 0.675, max: 19.101, std: 4.740
v10m_rotated; min: -22.304, mean: 0.582, max: 19.573, std: 5.154
t2m; min: 230.684, mean: 268.634, max: 292.469, std: 9.492
skt; min: 221.912, mean: 268.912, max: 298.371, std: 10.362
tcwv; min: 0.272, mean: 7.707, max: 38.949, std: 5.205
tclw; min: 0.000, mean: 0.041, max: 1.184, std: 0.071
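
The ranges above differ a lot between channels, so the inputs would normally be standardised before training. A minimal sketch, assuming the line format shown in the quote; the function names are made up for illustration and this is not the challenge's own preprocessing.

```python
def parse_stats_line(line):
    """Parse 'name; min: a, mean: b, max: c, std: d' into (name, stats dict)."""
    name, rest = line.split(";", 1)
    stats = {}
    for part in rest.split(","):
        key, value = part.split(":")
        stats[key.strip()] = float(value)
    return name.strip(), stats


def standardise(value, stats):
    """Z-score: subtract the channel mean and divide by its standard deviation."""
    return (value - stats["mean"]) / stats["std"]


name, stats = parse_stats_line(
    "btemp_36_5h; min: 125.289, mean: 185.039, max: 279.562, std: 38.627"
)
z_at_max = standardise(stats["max"], stats)  # how far the maximum sits above the mean
```

Worth noting: the nersc_sar_primary minimum of -380.123 is about 65 standard deviations below its mean, so it may be a fill value or an outlier (that is a guess) and might need clipping or masking first.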


« Last Edit: November 26, 2022, 11:33:13 PM by uniquorn »

uniquorn

Re: AIdeas
« Reply #1 on: November 28, 2022, 09:26:01 PM »
Some changes to the data visualisation.

uniquorn

Re: AIdeas
« Reply #2 on: November 28, 2022, 10:07:37 PM »
default notebook setup on the eox server.

uniquorn

Re: AIdeas
« Reply #3 on: November 28, 2022, 11:41:18 PM »
Possible misalignment or time difference in the 89GHz AMSR2 data in 20190112T102000_cis_prep.nc
Just looking at a random selection.

ani and static

SimonF92

  • Grease ice
  • Posts: 610
    • View Profile
  • Liked: 221
  • Likes Given: 92
Re: AIdeas
« Reply #4 on: November 29, 2022, 10:22:40 AM »
Did you manage to get their example model to spit out an ice edge mask? That's cool that it's running on the server. How easy was it to set up?
Bunch of small python Arctic Apps:
https://github.com/SimonF92/Arctic

uniquorn

Re: AIdeas
« Reply #5 on: November 29, 2022, 01:01:52 PM »
Haven't got that far, still getting used to using custom Conda Environments as Jupyter Kernels
https://eurodatacube.com/documentation/custom-jupyter-kernels
Currently stuck at connecting to the data.

I'm just looking at the data locally for now to decide on a strategy and test with a few files.

SimonF92

Re: AIdeas
« Reply #6 on: November 29, 2022, 03:08:51 PM »
Curriculum learning sounds really cool. This is where you get the model to train on easy tasks first, which helps it adjust its weights, and then move to harder tasks over time. By the time it gets to the hard tasks, it already has a pretty good idea of what to do.

In our example, the easy tasks would be well-defined ice edges; then, bin by bin, we would move to really diffuse and blurry edges.

Also, we should be thinking about adding noise, data augmentation (this involves flipping and mirroring data) and gaussian blurring.
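
A minimal sketch of the flipping/mirroring and noise idea, using plain nested lists so it runs without any extra libraries. The function names and the noise level are assumptions; a real pipeline would do this on arrays inside the data loader.

```python
import random


def flip_horizontal(grid):
    """Mirror each row left to right."""
    return [row[::-1] for row in grid]


def flip_vertical(grid):
    """Reverse the order of the rows."""
    return [row[:] for row in grid[::-1]]


def add_gaussian_noise(grid, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in grid]


def augment(image, mask, rng, sigma=0.1):
    """Apply the same random flips to image and mask; noise goes on the image only."""
    if rng.random() < 0.5:
        image, mask = flip_horizontal(image), flip_horizontal(mask)
    if rng.random() < 0.5:
        image, mask = flip_vertical(image), flip_vertical(mask)
    return add_gaussian_noise(image, sigma, rng), mask


rng = random.Random(42)  # fixed seed so runs are repeatable
noisy_image, flipped_mask = augment([[1.0, 2.0], [3.0, 4.0]], [[0, 1], [1, 0]], rng)
```

The key point is that the mask must get exactly the same flips as the image, and never the noise. Strictly, mirroring a scene would also call for flipping the sign of one of the wind components; the sketch ignores that.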

Once we start getting further into this, I will probably set up a Slack workspace just so that we can better communicate.
« Last Edit: November 29, 2022, 03:19:21 PM by SimonF92 »

uniquorn

Re: AIdeas
« Reply #7 on: November 29, 2022, 10:14:48 PM »
Some of the ice edges are not so easy.

Was hoping to have a look at the raw data locally but a 12GB download never completes on my connection. That will have to wait till I can use the ESA server.

I'd start curriculum learning using amsr2 36.5h to identify mostly open water, perhaps an average with 36.5v. 89 shows more detail but needs a weather filter.

30 views from 2018. Will post the code later. Private or public? It is a competition after all ;)

SimonF92

Re: AIdeas
« Reply #8 on: November 30, 2022, 10:09:00 AM »
Timepoints where the ice:ocean:land boundaries are sharp and defined would serve as training data for model 1 (bin1).

Model 1 gets loaded in for training on the next 'good but less good' cohort of data, etc etc, and you bin downward until you get to the really rubbishy stuff.
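
The binning described above could be sketched roughly like this. It assumes each scene already has some "edge sharpness" number attached; how to compute that is an open question, and the scene names and values here are invented. `train_one_stage` stands in for whatever training function is actually used.

```python
def make_curriculum(samples, difficulty, n_bins):
    """Sort samples from easy to hard and split them into n_bins roughly equal cohorts."""
    ordered = sorted(samples, key=difficulty)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def train_with_curriculum(bins, train_one_stage, model=None):
    """Carry the same model through each cohort, easiest first."""
    for cohort in bins:
        model = train_one_stage(model, cohort)
    return model


# Hypothetical (scene name, edge sharpness) pairs; sharper edges count as easier.
scenes = [("scene_a", 0.9), ("scene_b", 0.2), ("scene_c", 0.5), ("scene_d", 0.7)]
bins = make_curriculum(scenes, difficulty=lambda s: -s[1], n_bins=2)
```

Here bin 1 would hold the two sharpest scenes and bin 2 the two blurriest ones.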

I'd expect the Dice score (how well it's done) to decrease as we iterate over the cohorts, but the model should learn useful stuff each time.
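
For anyone wanting to google it: the Dice score measures the overlap between a predicted mask and the true mask, from 0 (no overlap) to 1 (identical). A minimal sketch for a single class on small nested-list masks; the function name and the choice to return 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred, truth, label=1):
    """Dice = 2 * overlap / (size of prediction + size of truth) for one class."""
    pred_flat = [p == label for row in pred for p in row]
    truth_flat = [t == label for row in truth for t in row]
    overlap = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Both masks empty for this class: treat as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


predicted = [[1, 1], [0, 0]]
actual = [[1, 0], [0, 0]]
score = dice_score(predicted, actual)  # 2 * 1 / (2 + 1)
```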

People I have spoken to said this is a really great approach for their tasks.

Yes, to some extent I agree that we should maybe take our discussion off public, and we shouldn't share code here (we should be doing that via GitHub anyway :) ). I'll send you a Slack invite tomorrow. I know getting my head around this is going to be quite tough; I'm off Friday, so that's going to be my day to sit down and get to it.

And if anyone else wishes to join us, even as a curious observer, please feel free to send us a request and one of us will add you to the Slack (once it's up).


https://towardsdatascience.com/applying-curriculum-learning-to-medical-images-184275c52350
« Last Edit: November 30, 2022, 10:19:51 AM by SimonF92 »

oren

  • First-year ice
  • Posts: 9993
    • View Profile
  • Liked: 3674
  • Likes Given: 4248
Re: AIdeas
« Reply #9 on: November 30, 2022, 11:11:59 AM »
I am highly interested but cannot follow the technical discussions (though in my past I could). I would appreciate it if you posted a general update on your progress from time to time, with no competitive details.

SimonF92

Re: AIdeas
« Reply #10 on: December 01, 2022, 10:36:48 AM »
Noted oren, will do, if we start to get results we can post them here too

uniquorn

Re: AIdeas
« Reply #11 on: December 02, 2022, 10:52:50 PM »
data shown on this thread is thanks to

Buus-Hinkler, Jørgen; Wulf, Tore; Stokholm, Andreas; Korosov, Anton; Saldo, Roberto; Pedersen, Leif Toudal; Arthurs, David; Solberg, Rune; Longépé, Nicolas; and Kreiner, Matilde Brandt; (2022):
AI4Arctic Sea Ice Challenge Dataset.
Danish Meteorological Institute.
https://doi.org/10.11583/DTU.c.6244065.

SimonF92

Re: AIdeas
« Reply #12 on: December 05, 2022, 11:47:01 AM »
Cool CNN visualisation my friend at work showed me, with explainability.

https://adamharley.com/nn_vis/cnn/3d.html


It's looking like uniquorn and I have our setups ready for AI, which is no small feat. To directly quote a friend:

"it's taken me weeks to do that before: no amount of conda can sort a gpu out"




uniquorn

Re: AIdeas
« Reply #13 on: December 07, 2022, 12:32:31 AM »
took a while

SimonF92

Re: AIdeas
« Reply #14 on: December 07, 2022, 08:45:00 PM »
State of play:

Conceptually we are ready to start tweaking. Uniquorn trained one epoch before it failed, meaning the codebase seems to be in good shape, the data maybe not so much.

The first blocking problem we have found is handling the data. Manually downloading individual files is extremely tedious, but downloading the zip tends to fail (the training zip is 50 GB). There is no obvious FTP option for a FileZilla-type transfer.

I am now thinking about writing a downloader script as part of the codebase. But this is bothering me. It feels like a hacky solution.

Anyone have any suggestions about how to download these files automatically (and with stability), without resorting to a for loop in python?

https://data.dtu.dk/articles/dataset/Ready-To-Train_AI4Arctic_Sea_Ice_Challenge_Dataset/21316608?backTo=/collections/AI4Arctic_Sea_Ice_Challenge_Dataset/6244065
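
One possible way to make a large download survive dropped connections is to resume from the partial file using an HTTP Range request. A minimal standard-library sketch follows; it assumes the server honours Range requests, which has not been checked for this host, and the function names are made up.

```python
import http.client
import os
import time
import urllib.request


def range_header(existing_bytes):
    """HTTP Range header asking for everything after the bytes already on disk."""
    return {"Range": "bytes={}-".format(existing_bytes)} if existing_bytes > 0 else {}


def download_with_resume(url, dest, retries=5, chunk_size=1 << 20, pause=5.0):
    """Stream url to dest, picking up from the partial file after a failure."""
    for attempt in range(retries):
        done = os.path.getsize(dest) if os.path.exists(dest) else 0
        request = urllib.request.Request(url, headers=range_header(done))
        try:
            with urllib.request.urlopen(request, timeout=60) as response:
                # 206 means the server honoured the Range request; otherwise start over.
                mode = "ab" if response.status == 206 else "wb"
                with open(dest, mode) as out:
                    while True:
                        chunk = response.read(chunk_size)
                        if not chunk:
                            return dest
                        out.write(chunk)
        except (OSError, http.client.HTTPException):
            # Wait a little longer after each failure before trying again.
            time.sleep(pause * (attempt + 1))
    raise RuntimeError("gave up on {} after {} attempts".format(url, retries))
```

Two caveats: a file that is already complete will make the server reject the Range request, so finished files should be skipped beforehand, and the pause between retries matters if the host rate-limits.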

oren

Re: AIdeas
« Reply #15 on: December 07, 2022, 11:53:27 PM »
No suggestions, except to say that personally I would have done it with some kind of loop that runs over the days, gets the files and puts them in your desired location/database. I see no harm in that.

uniquorn

Re: AIdeas
« Reply #16 on: December 09, 2022, 12:06:12 AM »
First test running with 20 epochs overnight. Here are the epoch 4 results

Mean training loss: 2.222
10 val, memory allocated =  4.9443359375

Final batch loss: 2.167
Epoch 4 score:
SIC r2_metric: 65.224%
SOD f1_metric: 80.302%
FLOE f1_metric: 67.696%
Combined score: 71.75%
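
The combined score above is consistent with a weighted mean in which SIC and SOD each count twice as much as FLOE. That 2:2:1 weighting is inferred from the posted numbers, not quoted from the challenge rules, so treat this as a sanity check rather than the official formula.

```python
def combined_score(sic_r2, sod_f1, floe_f1, weights=(2, 2, 1)):
    """Weighted mean of the three metrics; the 2:2:1 weights are an inference."""
    w_sic, w_sod, w_floe = weights
    total = w_sic * sic_r2 + w_sod * sod_f1 + w_floe * floe_f1
    return total / (w_sic + w_sod + w_floe)


# Epoch 4 metrics from the log above.
epoch4 = combined_score(65.224, 80.302, 67.696)  # about 71.7496, i.e. the reported 71.75%
```

One practical consequence, if the weighting is right: a point gained on SIC or SOD is worth twice a point gained on floe size.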

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #17 on: December 09, 2022, 04:16:11 PM »
Quote
The challenge is designed to save the people who make the charts time.
(0) This is an excellent project that could bring a lot of credit to forum and competition participants. AI is rapidly becoming the future across all of the sciences; intuiting the Arctic future still has a place but is not the way to go.

I do hope along with Oren that ordinary prose exposition will keep up. Just reading about download sizes involved makes my computer seize. Three questions please:

(1) This looks to me more like another self-driving truck. With all the time saved, the chart people will be able to look for related unemployment elsewhere such as shoveling driveways or salting sidewalks (Rumba will be along shortly).

That said, is the intent to replicate the quality (or lack thereof) of the ice charts currently being made, or to make better ones? If the latter, it seems like the training set provided will prevent that.

They would need to do custom flyovers, diverted satellites, multi-spectral drones, glider studies, ship visits and ice instrumentation to really determine fiducial conditions on certain days as N-ICE2015 did and Polarstern intended.

In other words, should an AI offering be penalized because it differs from manual? It might very well be better. Arctic satellite data is severely under-utilized and indeed scarcely looked at, SAR imagery being the worst -- be a shame to discard a genuine scientific advance.

How much ship traffic is there anyway along Arctic shores; is the goal cheaper coal and minerals to encourage consumption? It's not clear that ice charts are actually used -- many posts here quote sea captains saying ice conditions change too fast.

If this is about privileged tourism and better charts reducing private insurance costs, I would say it can wait until climate change has been addressed if not a lot longer.

(2) SimonF92 rightly describes the product as a mask, what we've been calling a gimp or imageJ layer. In effect, the competition is about adding a fancy menu command or having an imageJ macro act on a stack.

Some of those tools are exceedingly complex already, such as picking out wispy strands of blond hair against a complex natural background with a click.

My question here is should routine/clever image enhancement steps be taken before turning the AI code loose? If not, it seems a lot of it just goes into reinventing early photoshop.

(3) In terms of code portability, will the AI product help with other Arctic ice issues? It seems not because each will need its own training set. Other than that, is it fair to say that AI itself is being automated here, perhaps putting yourselves out on the sidewalk as well?
« Last Edit: December 09, 2022, 04:22:36 PM by A-Team »

uniquorn

Re: AIdeas
« Reply #18 on: December 09, 2022, 06:02:07 PM »
Quote
My question here is should routine/clever image enhancement steps be taken before turning the AI code loose?
My thoughts exactly, though SAR and AMSR2 data already go through a lot of processing before we even see them. We already changed some colours to help decide strategy. The files are huge, but how often do we get to see the complete range of amsr2 output that Kaleschke does his magic with?
Machine learning might well do better on the raw data but I can't even download those files. Perhaps we can use them directly from the ESA computers.

Agreed about shipping etc, for me it's a way to learn about ML and stretch my skills with ice data that I like looking at. Might even get a floe size algo out of it.
--------------

The overnight run stopped at epoch 11 with no error, but I passed the best result through to the test upload and got this back. Pretty basic. Better see what the SARs look like.

SimonF92

Re: AIdeas
« Reply #19 on: December 12, 2022, 11:21:54 AM »
Oren, thank you for your input. I started to write a data downloader and got myself IP banned for not having a sleep timer built in. Should be getting unbanned soon. I'm really not impressed, to be honest. I had another guy look at the data webpage and he (without my input) said that scraping was probably the easiest way. If they want people to be serious about this, then they should be serious about sharing the data and not have people resort to scraping.

Uniquorn, looks like you are doing a great job; you might want to start tweaking? Tweakable things include the batch size, learning rate and window size. More advanced tweaking could include adding or removing layers and changing the regularisation. We don't want to lose sight of the larger modifications that could actually make noticeable/significant improvements, such as curriculum learning or attention gating. But hyperparameter tweaking is a good way for you to start feeling your way around. I am sorry I am still stuck; I travelled for a funeral over the weekend so haven't been much in the mood to write any code.
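
To keep the tweaking organised, the combinations can be listed up front rather than changed by hand. A minimal sketch; the parameter names and every value in it are placeholders for illustration, not the settings either of us is using.

```python
from itertools import product

# Placeholder values only; "patch_size" stands in for the window size.
search_space = {
    "batch_size": [8, 16],
    "learning_rate": [1e-3, 1e-4],
    "patch_size": [256, 512],
}


def grid(space):
    """Yield one config dict per combination of the listed values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(grid(search_space))  # 2 * 2 * 2 = 8 combinations
```

Each config would then be one training run, with its validation score logged next to it.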

A-team, your profound advice is always meaningful to me. I see your point about modifications to input images. I would however stress that part of the usefulness of deep learning is that one allows the system to decide what's important. Over-manipulation of the inputs might remove/reduce the non-semantic usefulness of the images and negatively impact the scores. However, as always in life, I am happy to be shown to be wrong.

In terms of your final point, it's quite sceptical to assume that the model generated here would be of no value to other Arctic tasks. Even the model weights and biases could be used for pre-training in other image tasks. I would certainly not believe that nothing of value could be obtained outwith the challenge presented here. I do, however, enjoy the idea of AI making AI; I would swiftly be out of a job were that to happen :)

SimonF92

Re: AIdeas
« Reply #20 on: December 14, 2022, 09:41:20 AM »
This appears to work quite well if anyone wants to get hold of the data (not linking our github for code protection)



import logging
import os
from time import sleep

import requests
from tqdm import tqdm

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# The individual files appear to sit within this ID range (weirdly, only about
# every third ID is a data file; the others are skipped by the checks below).
file_id_range = range(37832145, 37835670)

# Set this to your data directory (no trailing slash needed).
filepath = 'PATH TO YOUR DATA DIRECTORY'

# Loop over individual file IDs on their server.
for file_id in tqdm(file_id_range):
    url = 'https://data.dtu.dk/ndownloader/files/{}'.format(file_id)
    r = requests.get(url, allow_redirects=True, timeout=60)

    # If it is a usable file, save it to filepath.
    if r.ok and 'Content-Disposition' in r.headers:
        # Take the part after 'filename=' and strip any surrounding quotes.
        filename = r.headers['Content-Disposition'].split('=')[1].strip('"; ')

        if filename.endswith('.nc'):
            with open(os.path.join(filepath, filename), 'wb') as f:
                f.write(r.content)
            logger.info("Successfully accessed and saved %s to %s", filename, filepath)

    # Sleeping means you don't spam the server and potentially get banned.
    sleep(1)

uniquorn

Re: AIdeas
« Reply #21 on: December 20, 2022, 11:11:56 PM »
First test running with 20 epochs overnight. Here are the epoch 4 results

Mean training loss: 2.222
10 val, memory allocated =  4.9443359375

Final batch loss: 2.167
Epoch 4 score:
SIC r2_metric: 65.224%
SOD f1_metric: 80.302%
FLOE f1_metric: 67.696%
Combined score: 71.75%

tweaked a few variables and improved the combined score by 6.56 percentage points (71.75% to 78.31%), maybe it will reach epoch 20 overnight

Mean training loss: 2.051
Epoch 8 score:
SIC r2_metric: 83.552%
SOD f1_metric: 82.246%
FLOE f1_metric: 59.956%
Combined score: 78.31%

--------------------------------------

edit: some more tweaking got us over 80% with a much improved floe size score, need to run that on the test data though.

Epoch 2 score:
SIC r2_metric: 79.693%
SOD f1_metric: 84.74%
FLOE f1_metric: 75.105%
Combined score: 80.794%
« Last Edit: December 21, 2022, 12:35:13 AM by uniquorn »

SimonF92

Re: AIdeas
« Reply #22 on: December 21, 2022, 04:52:24 PM »
Great stuff.

Let's also keep in mind that it is feasible to go too heavy on tweaking, which can result in something called 'overfitting'. One should be checking how good the tweaking is on some kind of 'hold out' set, i.e. something the model doesn't train on. Even that is tricky, because if one looks at the hold-out set too much, one can actually overfit to that too!
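
A minimal sketch of setting scenes aside: split by whole scene file with a fixed seed, tune against the validation part, and look at the hold-out part as rarely as possible. The fractions, the seed and the file names are assumptions for illustration.

```python
import random


def split_scenes(scene_names, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once with a fixed seed, then carve off validation and hold-out scenes."""
    names = sorted(scene_names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    n_test = int(len(names) * test_fraction)
    return {
        "test": names[:n_test],                # hold-out: check only occasionally
        "val": names[n_test:n_test + n_val],   # used while tweaking
        "train": names[n_test + n_val:],
    }


# Hypothetical file names standing in for the real scene files.
scenes = ["scene_{:03d}.nc".format(i) for i in range(20)]
split = split_scenes(scenes)
```

Splitting by scene rather than by patch matters here, since patches cut from the same scene are strongly correlated and would leak information between the sets.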

uniquorn

Re: AIdeas
« Reply #23 on: December 21, 2022, 05:38:03 PM »
maybe geometry based uncertainty would help

uniquorn

Re: AIdeas
« Reply #24 on: December 21, 2022, 05:47:54 PM »
looks like the machine is learning, we have much better resolution on the ice charts.

but it scored less when submitted to the challenge
« Last Edit: December 21, 2022, 06:15:40 PM by uniquorn »

uniquorn

Re: AIdeas
« Reply #25 on: December 21, 2022, 09:42:34 PM »
comparison of SAR, ice charts provided and AI generated ice charts (inf3) for the test data.

some obvious errors, spot the difference ;)

SimonF92

Re: AIdeas
« Reply #26 on: December 21, 2022, 09:43:08 PM »
Got the training loop output comparisons coming out. Model is doing... ok? My hyperparameters are not as good as uniquorn's though, as we need to sync up.

Next tasks:

1) custom data loader so we can remove some modalities on the fly
2) curriculum architecture
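
For task 1, the core of removing modalities on the fly could look roughly like this: treat a scene as a dict of named channels and filter it before it reaches the model. The function names and the toy one-pixel scene are assumptions; the real loader would work on the arrays read from the .nc files.

```python
def channels_with_prefix(scene, prefix):
    """Names of all channels in the scene that start with the given prefix."""
    return [name for name in scene if name.startswith(prefix)]


def select_channels(scene, keep=None, drop=()):
    """Return a copy of the scene restricted to 'keep' (default: all) minus 'drop'."""
    names = list(scene) if keep is None else [name for name in keep if name in scene]
    return {name: scene[name] for name in names if name not in drop}


# Toy scene with one pixel per channel.
scene = {
    "nersc_sar_primary": [[-14.5]],
    "btemp_89_0h": [[212.4]],
    "btemp_89_0v": [[241.2]],
    "t2m": [[268.6]],
}

# Example: train without the 89 GHz channels.
without_89 = select_channels(scene, drop=channels_with_prefix(scene, "btemp_89"))
```

Running the same training config with different `drop` lists would give a quick ablation of which modalities actually help.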


Two examples from model inference. FYI first image is "ground truth" aka correct answer, second is the model output.

SimonF92

Re: AIdeas
« Reply #27 on: December 21, 2022, 09:43:43 PM »
uniquorn beat me by 30 seconds :o

uniquorn

Re: AIdeas
« Reply #28 on: December 22, 2022, 08:49:54 PM »
today's run with another tweak got us over 84%

Mean training loss: 1.796
Epoch 7 score:
SIC r2_metric: 84.589%
SOD f1_metric: 88.739%
FLOE f1_metric: 74.312%
Combined score: 84.194%

Have to wait till tomorrow to submit

sidd

  • First-year ice
  • Posts: 6823
    • View Profile
  • Liked: 1055
  • Likes Given: 0
Re: AIdeas
« Reply #29 on: December 22, 2022, 11:25:57 PM »
Here is a paper that may be of interest to this thread: doi:10.1109/TGRS.2022.3151623

https://www.researchgate.net/publication/361560275_PMDRnet_A_Progressive_Multiscale_Deformable_Residual_Network_for_Multi-Image_Super-Resolution_of_AMSR2_Arctic_Sea_Ice_Images/link/62b962b11010dc02cc606f83/download

I like the techniques. One of the authors, Rongxing Li, has an interesting history.

sidd

uniquorn

Re: AIdeas
« Reply #30 on: December 23, 2022, 05:34:01 PM »
Thanks sidd, that looks more interesting than the challenge, especially now

Quote
    New test dataset: Unfortunately, the ice charts were by mistake included in the Ready-to-train (RTT) test set. To avoid any misuse, a new version of the test set has now been made available, without the ice chart label data.

    Updated leaderboard. As the test set has been replaced, the leaderboard has been reset and previous submissions (a single submission) have been removed.

I thought it was odd that we could see the answers, didn't get around to training on them though.

uniquorn

Re: AIdeas
« Reply #31 on: December 23, 2022, 11:43:05 PM »
That's another 20GB down the drain trying to download the updated dataset zip file

gerontocrat

  • Multi-year ice
  • Posts: 22166
    • View Profile
  • Liked: 5435
  • Likes Given: 70
Re: AIdeas
« Reply #32 on: December 24, 2022, 12:31:49 AM »
I am totally unsure if this post has any relevance to the AIdeas project you lot are engaged in. But here goes anyway.

I've been muttering to myself about chaos theory for some time, triggered by the increasingly erratic behaviour of Antarctic sea ice over the last decade. Add to that major weather events in the last two years that lie far beyond the limits assumed in climate science weather and climate models, and I started to wonder if use of historical data observations might become of less and less relevance when looking at the future.

Today I had a look at one or two hairy papers (maths far beyond me) on chaos and climate modelling. However, I did find this. I attach the first and last few paragraphs from https://history.aip.org/climate/chaos.htm -
Chaos in the Atmosphere
Quote
Before they could understand how climates change, scientists would have to understand the basic principles for how any complicated system can change. Early studies, using highly simplified models, could see nothing but simple and predictable behavior, either stable or cyclical. But in the 1950s, work with slightly more complex physical and computer models turned up hints that even quite simple systems could lurch in unexpected ways. During the 1960s, computer experts working on weather prediction realized that such surprises were common in systems with realistic feedbacks. The climate system in particular might wobble all on its own without any external push, in a "chaotic" fashion that by its very nature was unforeseeable. By the mid 1970s, many experts found it plausible that at some indeterminate point a small push could trigger severe climate change. While the largest effects could be predicted, important details might lie forever beyond calculation. In the following decades a consensus developed that the climate system was unlikely to jump into an altogether different state. The most likely future was one of gradual change, with low odds for an abrupt catastrophe—yet the odds were not zero, and critical details remained beyond calculation............

........On the global scale, however any decent computer model, run with any plausible initial conditions plus a rise of greenhouse gases, predicted warming. As the world's average temperature did in fact climb, it seemed less and less likely that the match with the models was mere accident. However, different models got different results for the future climate in any particular region. And a given model for a given region might come up with a surprising shift of the weather pattern in the middle of a run. Some of these regional fluctuations might be fundamentally chaotic. Occasionally a run of an entire global model would diverge widely for a time, for example if an unusual combination of factors perturbed the delicate balance of ocean circulation. But these divergences were within limits set by the overall long-term average global warming. In fact, it had become a test of a good model that it should show fluctuations and variations, just as the real climate did. For predicting future climates, it became common practice to run a supercomputer model a few times (usually three to five), with slight variations in the initial conditions. The details of the results would differ only modestly, and the modeler would confidently publish an average of the numbers.(36)   

To be sure, the models were built to be stable. When a new model was constructed it tended to run away into implausible climate states, until the modelers adjusted parameters to make it resemble the actual current climate. Meanwhile researchers kept turning up possible triggers for a change beyond anything known in recent centuries. Could freshwater from melting Arctic ice abruptly shut down the circulation of the North Atlantic? (Evidently just that had happened some ten thousand years ago.) Could the warming caused by emissions of methane gas make warming tundra or seabeds emit still more methane in a runaway feedback? (There were signs of something like that during a cataclysmic climate shift 55 million years back.) What about a runaway mechanism nobody had even imagined, as the planet warmed beyond anything seen in millions of years? An analysis of deep-sea records from warm periods in the distant past indicated that small perturbations had sometimes triggered processes, of an unknown nature, that brought extreme heating. Those events, however, had played out over tens of thousands of years..(37) The odds against a sudden catastrophe seemed long, but it was impossible to be certain that the planet was not approaching some fatal "tipping point."   

Until the future actually came, there would be no way to say how well the modelers understood all the essential forces. What was no longer in doubt was the most important insight produced by countless computer experiments. Under some circumstances a small change in conditions, even something so slight as an increase of trace gases that made up a tiny fraction of the atmosphere, could nudge the planet's climate into a seriously different state. The climate looked less like a simple predictable system than like a confused beast, which a dozen different forces were prodding in different directions. It responded sluggishly, but once it began to move it would be hard to stop.   
"Para a Causa do Povo a Luta Continua!"
"And that's all I'm going to say about that". Forrest Gump
"Damn, I wanted to see what happened next" (Epitaph)

A-Team

Re: AIdeas
« Reply #33 on: December 24, 2022, 03:47:04 PM »
Quote
I am totally unsure if this post has any relevance
It does not. The responsiveness of idealized non-linear differential equations to changes in initial conditions is utterly off-topic for a satellite image enhancement forum.

A glance at title and references shows it to be a 2008 update of a 2004 book by physics historian Spencer Weart, 81. It was not intended as Judith-Curryish climate denial FUD though it reads that way today.

Climate change prediction is mainly a boundary value problem so very different from meteorological turbulence. Weather initial conditions are known so sparsely and inaccurately that chaos issues hardly matter even there.

https://andthentheresphysics.wordpress.com/2018/05/29/initial-value-problem-vs-boundary-value-problem/

Meanwhile Sidd posted an excellent link to a Feb 2022 article that is essentially a competition-killer. That is, it will be very difficult to improve on the AMSR2 image sharpening AI pipeline that Rongxing Li and coworkers have already vetted. https://tinyurl.com/3za2xfpw

As follow-up to Uniquorn's discovery of ongoing lazy incompetence on the part of the competition hosts, we might wonder if they troubled themselves with a literature search. This one would be impossible to miss.

More to the point, is it feasible to integrate Li's sharpened AMSR2 imagery into the fantastic sea ice motion animations that seaice.de has been producing at Alfred Wegener institute (AWI)?

It's vastly more cost-effective to pour effort into image enhancement rather than launch another satellite, where considerations of budgets and physics often constrain possible hardware gains.

It's been going on since Sputnik, mostly for RGB and IR. There are still some opportunities on ice with passive radar because of the number and polarization of channels and the complex but informative ice physics behind emissions. Even a slight bump in resolution makes a big difference to AMSR2 end users.

Oren posts these on the melt/freeze season. That's important because while we have been making ice videos since time immemorial, the Arctic research community has not. (Trad journals don't support modern graphics.)

The dramatic AMSR2 sea ice lead videos cannot be ignored since they're coming from a respected researcher at the respectable AWI:

-1- How much longer will Arctic scientists persist in posting preposterous predictions of the Last Sea Ice Area? No question about it, the area north of Greenland exports completely on a rather short term basis. If the Non-Last ice to its north is gone, what replenishes it, an ever-weakening freeze season?

AMSR2, ASCAT and Uniq's buoy trajectories show better prospects for last ice west of Nares to Banks Island and up. Since we often see lift-offs along the whole coast, the question really is what atmospheric feature persistently pushes the ice back up against the Canadian coast.

CAA ice today is not attached to landfast so is free to move far away in summer -- maybe to kill zones -- but it hasn't yet. The pressure gradient pattern could perhaps change in a near-future climate if it is not effectively constrained by the asymmetric geometric configuration of Arctic land relative to the rotation pole and North Pacific.

-2- The research community has gotten itself deep into an Emperor Has No Clothes situation on the Beaufort Gyre. The ice moves around quite a bit trending west off the AK coast but no one in their right mind would characterize it as a circular gyre.

Lately they're kinda admitting this but calling it a decadal pattern of ye Olde Gyre. However the ever-worsening summer trend that eats the gyre in the Chukchi rules out a future periodic return.

So increasingly the conversation has shifted to eddies in the phony fresh water deep below the ice because it's largely unobservable. Another trick is limiting the very latest papers to the 2004-12 era, who cares?

What if someone put up a gigantic brightly lit billboard of 10 yrs of animated AMSR2, ASCAT and buoys facing the offices of Woods Hole Oceanographic Institute, would the facts finally be acknowledged? No.

Fervent defense of erroneous views is not that unusual in science. Max Planck was already complaining about it in 1908. We see it today in "three domains of life" and "not-even-wrong" string theory.

https://forum.arctic-sea-ice.net/index.php/topic,3863.msg353924.html#msg353924
« Last Edit: December 24, 2022, 06:55:40 PM by A-Team »

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #34 on: December 26, 2022, 11:30:02 AM »
Quote
mundane literature search before initiating AI competition?
AMSR2 has been around quite a while. Needless to say(?), the antenna design stage already considered image output for each channel and polarization. And that does not mean raw analog signal but rather post-enhancement image. It has to be fit for purpose to get funded -- enhancement reduces the load on hardware.

However, just like Teslas getting software upgrades long after the hardware hits the road, post-launch opportunities for radar imagery enhancement continue to develop well into the future.

Jaxa posted the graphic below showing gray and black dots indicating what each frequency is good for in the Arctic ice & weather context. We're mostly familiar with the 89 GHz which is good for sea ice concentration and ground resolution but degraded by various forms of water vapor.

With so many channels and dual polarizations, there's a ton of data but far less information. In RGB, the green and blue are highly correlated but differently scattered by the atmosphere; a NASA person explained to us how to tweak Landsat imagery of Jakobshavn Isbrae to get fully naturalistic color.
 
With AMSR2, dimensional reduction is also the top priority -- if channel a can be predicted from higher ground resolution channel b, we don’t need channel a. As R.Li mentions, some weighted linear combination of channels or covariance matrix is conventionally used. Here known and fixed bathymetry might also be included for near-shore.
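
The channel-redundancy idea above can be sketched as follows. This is a minimal, stdlib-only example, assuming each channel has been flattened to a list of brightness temperatures on a common grid. The channel names, the toy values and the 0.95 threshold are all made up for illustration; this is not the pipeline Li and coworkers used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_redundant(channels, priority, threshold=0.95):
    """Keep channels in priority order; skip any channel that is almost a
    linear function of a channel already kept."""
    kept = []
    for name in priority:
        if all(abs(pearson(channels[name], channels[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: three channels sampled at the same six pixels.
channels = {
    "89H": [180.0, 200.0, 220.0, 240.0, 250.0, 255.0],
    "36H": [171.0, 189.0, 212.0, 229.0, 241.0, 244.0],  # nearly tracks 89H
    "18V": [210.0, 190.0, 230.0, 185.0, 240.0, 200.0],  # carries other info
}
# Highest ground resolution first, so it wins when two channels overlap.
kept = drop_redundant(channels, priority=["89H", "36H", "18V"])
print(kept)
```

Ordering the priority list by ground resolution means the sharper channel is the one that survives when two channels carry the same signal.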

There's a big issue in the challenge with pixel footprints split between land and seawater/ice. The challenge doesn't provide visible imagery, but visible coastal boundaries of much higher resolution than the radar could be stubbed in. These boundaries are fixed over the time series regardless of snow or weather, meaning the affected radar pixels after re-gridding can be individually listed and characterized by a land line or segmented arc.

Another priority where lower frequency channels might be helpful is masking weather. This is local across an individual frame, variable even in consecutive frames and strongly correlated with seasonal and daily weather. The contest rules seem to imply use of only the test imagery, but in reality reanalysis weather would improve the AI.

After all this, the bottom line with AMSR2 probably just comes down to the 89 GHz. The final annotation call on ice chart pixel bins should come with a confidence level overlay as there are good and bad days of weather vapor, wind and current movement of ice and more or less problematic pixel locations. AI won't ever be standalone here but could provide a good tool that saves human mappers time.

Quote
'Super-Resolution of AMSR2 Sea Ice Images.' I like the techniques. the lead Rongxing Li, has an interesting history -- sidd
Yes. He was/is a rocket star in satellite image enhancement and coupled geolocation, being appointed full professor of an endowed chair at Ohio State at age 30 and bringing in a $35 million grant from NASA's Mars program to improve remote sensing data (which they really have to make the most of).

Despite being a long-time US citizen, he got swept up in the Deep State's declared 'tilt' away from Middle East wars towards simultaneous new hegemonic conflict with Russia and China.

Like with the daughter of the Huawei CEO or Dr. X Qiu being escorted out of a virus lab by the Royal Canadian Mounted Police (but never accused of anything), Li's wife was majorly harassed at the airport by Homeland Security to humiliate China. The US doesn't conduct surveillance after the fact nor do spies hand-carry precious data across borders in the internet age (no backup!)

No charges or even accusations were ever made. It was right out of Nixon and the pumpkin microfilm, this time mumbling about a double-crossing commie monitoring weapons shipments and inexplicably hacking Raytheon for algorithms he had originated himself and already journal-published with a GitHub open archive.

"The five rolls of 35 mm film known as the "pumpkin papers" had been characterized as highly classified and too sensitive to reveal and were thought until late 1974 to be locked in HUAC files. In 1975, an economist at the University of Michigan sued the U.S. Justice Department under the Freedom of Information Act... released copies of the "pumpkin papers" that had been used to implicate Hiss. One roll of film turned out to be totally blank due to overexposure, two others are faintly legible copies of non-classified Navy Department documents relating to such subjects as life rafts and fire extinguishers, and the remaining two are photographs of State Department trial documents."

Li went back to his alma mater at Tongji and continued his open source work without breaking stride as can be seen from his ResearchGate page and IEEE status.

Any lingering resentment from having lost a nice house and a plum position from extraneous political posturing?  OSU had to replace him with another foreign-educated researcher (Polish) because hardly anyone in the US pursues science anymore.

Our loss, their gain if there are military satellite applications, ditto virus repurposing and hundreds of other science topics. Making enemies out of friends seems short-sighted.
« Last Edit: December 27, 2022, 02:22:20 PM by A-Team »

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #35 on: December 26, 2022, 07:47:55 PM »
A large selection of amsr2 swaths are available as .h5 or processed .nc files from https://gportal.jaxa.jp/gpr/search?tab=0

The default processing is not as good as seaice.de et al but might make an interesting AI project. Maybe train it on sic-leads v110

Can see the ice edge moving near Svalbard on the gif

challenge data download website has bad gateway again
« Last Edit: December 26, 2022, 07:55:30 PM by uniquorn »

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #36 on: January 06, 2023, 01:06:13 PM »
despite the eternal bad gateway error when trying to download the data we haven't quite given up on the ai4eo challenge. I managed to download 106 of the 514 dataset files and 7 of the 21 test data files. 106 is enough to begin training but I don't think we can submit an entry without all the test data files.

Nevertheless the results so far look quite promising.

Quote
Mean training loss: 1.705
Final batch loss: 1.380
Epoch 14 score:
SIC r2_metric: 69.893%
SOD f1_metric: 92.854%
FLOE f1_metric: 78.826%
Combined score: 80.864%

It seems we have to balance the training variables and the input data.
Currently we appear to be mostly training SIC
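
The combined score quoted above is consistent with a weighted mean of the three metrics with weights 2:2:1 for SIC, SOD and FLOE. A minimal sketch follows; the weights are inferred from the numbers in this post rather than taken from the challenge code, so treat them as an assumption.

```python
# Weights are an assumption inferred from the scores quoted above.
WEIGHTS = {"SIC": 2, "SOD": 2, "FLOE": 1}

def combined_score(scores, weights=WEIGHTS):
    """Weighted mean of per-task scores (all in percent)."""
    total = sum(weights[k] * scores[k] for k in weights)
    return total / sum(weights.values())

epoch14 = {"SIC": 69.893, "SOD": 92.854, "FLOE": 78.826}
print(f"{combined_score(epoch14):.3f}")  # prints 80.864
```

Under these weights a one-point gain in SIC or SOD moves the total twice as far as the same gain in FLOE, which is one way to decide where to spend training effort.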

SimonF92

  • Grease ice
  • Posts: 610
    • View Profile
  • Liked: 221
  • Likes Given: 92
Re: AIdeas
« Reply #37 on: January 06, 2023, 05:28:34 PM »
Update from me.

Took a break over xmas but back into it. We have curriculum learning working within our training loop, but only using random splits; the data is not currently semantically batched. On my machine it takes such a long time to train (and often runs out of memory) that I'm just hacking a few epochs and a limited validation set, but it seems to be happy.

Unlike uniquorn, I'm a bit fed up with how difficult they seem to have made it to get the data, so I'm no longer going to make much effort there. I'd say 30% of my time has been spent trying to get the data, which is just ridiculous to be honest. So whatever is on my machine is what I'm going to use from now on.



What we have done:

- explore the data
- forced to write a data scraper to actually get it
- some hyperparameter optimisation
- create a rudimentary way of reducing the channels (ie only selecting certain inputs)
- implement a basic curriculum learning architecture


What we are going to do:

- create and batch the samples into their respective "good", "moderate" and "bad" batches
- get this working within the curriculum learning design



What we haven't done but should do:

- modify the layers of the U-net to see if there's a 'better design'
- consider the "time" element of the data, ie seasonality, perhaps some kind of recurrent network? https://en.wikipedia.org/wiki/Recurrent_neural_network
- consider Gaussian blurring or other method on input images
- implement augmentation to help with training and generalisability
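
For the augmentation item, the main thing to get right in a segmentation task is that the image and its mask receive the same geometric transform. A minimal sketch with plain nested lists, assuming single-channel 2D arrays; the function names are made up for illustration and a real pipeline would use array operations instead.

```python
import random

def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]

def vflip(grid):
    """Mirror a 2D grid top to bottom."""
    return grid[::-1]

def augment_pair(image, mask, rng):
    """Apply the same random flips to an image and its label mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    if rng.random() < 0.5:
        image, mask = vflip(image), vflip(mask)
    return image, mask

image = [[0.1, 0.9], [0.2, 0.8]]   # toy backscatter values
mask = [[0, 1], [0, 1]]            # toy ice/water labels
rng = random.Random(0)             # seeded so runs are repeatable
aug_image, aug_mask = augment_pair(image, mask, rng)
print(aug_image, aug_mask)
```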
Bunch of small python Arctic Apps:
https://github.com/SimonF92/Arctic

oren

  • First-year ice
  • Posts: 9993
    • View Profile
  • Liked: 3674
  • Likes Given: 4248
Re: AIdeas
« Reply #38 on: January 07, 2023, 07:47:07 AM »
It's very frustrating and sad that such a project should hit large hurdles on what should be the most trivial part, that of data download.
Do the files download manually? Is there a way for others here to download and send you the resulting files?

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #39 on: January 07, 2023, 06:58:34 PM »
thanks oren, when we stop getting this error message we'll know something has changed
Quote
This site can’t be reached
The webpage at https://data.dtu.dk/ndownloader/files/38619479 might be temporarily down or it may have moved permanently to a new web address.
ERR_INVALID_RESPONSE
but it's 139MB, there's no need to download it.

Meanwhile I checked through my downloads and it seems I did get the test data zip file on dec26, a mere snippet at 2.35GB. That file is now behind the bad gateway error, but we can take a look at the contents. note that these files don't appear to have the global scene_variables available

An error running the test_upload file prevents me from generating the ice charts so far.
« Last Edit: January 07, 2023, 07:16:37 PM by uniquorn »

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #40 on: January 07, 2023, 10:01:26 PM »
So what happens when you write Jørgen Buus-Hinkler, the person responsible for posting the data? He is getting paid to do this under ESA Contract No. 4000129762/20/I-NB CCN1

He is with the Danmarks Meteorologiske Institut, 2100 Copenhagen, Denmark (e-mail: jbh  at dmi  dk).
 
https://data.dtu.dk/articles/dataset/Ready-To-Train_AI4Arctic_Sea_Ice_Challenge_Dataset/21316608


uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #41 on: January 07, 2023, 10:16:48 PM »
I've posted the problem on the ai4eo forum, along with others.

https://platform.ai4eo.eu/auto-ice/forum/T1PuYZOPHsWhQDhst8ih

and

https://platform.ai4eo.eu/auto-ice/forum/HbJ4odsNmSlwLVlDWygd

they are reaching out...

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #42 on: January 08, 2023, 02:31:39 PM »
Quote
"never attribute to malice that which is adequately explained by stupidity."
Ok but I cannot make up my mind in this instance. Goethe raises other alternatives, misunderstandings and lethargy. The apologies seem insincere and possibly self-serving.

Hosting of files for download is routine (spare computer in broom closet) and scarcely requires an unresponsive outside contractor. It does not take two weeks to fix a bad url.

I boldly predict that the hosts (and judges) of the competition will win the competition, despite some last-minute threats from possibly more competent teams who scarcely had access to the source files.

We apologize for the inconvenience of the update [corrupt files, inclusion of results, incorrect thresholds] but we are certain that it will improve the overall quality of the competition. The AI4Arctic team wishes you all a pleasant holiday.
https://platform.ai4eo.eu/auto-ice/forum/T1PuYZOPHsWhQDhst8ih


The final score on the platform leaderboard will decide the winners of the challenge. However, if any concerns about a solution’s validity arise, the organizers reserve the right to contact the team to resolve such issues. If the issues cannot be resolved, the team will not be eligible to receive prizes.

Any of your intellectual property rights contained in the solution you will develop for this challenge will remain yours. While we ask you to submit the code at the end of this challenge to be eligible for the prizes, we will only use this to validate your approach [make sure we can run your code by ourselves]. Any Organizer or other stakeholder who views your code will be required to sign a [yet to be written] NDA.


Are we not all stakeholders in climate change? Yes indeed and NDAs can be quite permissive. For example, a host-initiated startup or aerospace company might take your code, compile it to binary, wrap it up in some service it sells back to the ESA at astronomic cost and that would not constitute disclosure.

To be eligible for the cash prize, at least one of the winning team must be a national of an ESA Member State, including Canada (as Cooperating State), Slovakia, Slovenia, Latvia, and Lithuania (as Associate Member States).

I'd say ask for the €9,000 upfront. The prizes sound like chores:

-- Six months Polar TEP Machine Learning Environment (Hopsworks) access valued at €9,000 [by whom?]

-- Two vouchers for self-paced online training courses (8 each) from the NVIDIA Deep Learning Institute
« Last Edit: January 08, 2023, 03:10:27 PM by A-Team »

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #43 on: January 08, 2023, 03:49:37 PM »
Tend to agree with that assessment, though not the bold prediction. I enter the competition in the spirit of learning. The prizes, for me, are meaningless, but the data, stokholm's introductory code and working with SimonF92 would appear to be a very good introduction to machine learning, which you are very welcome to join, as are any other interested ASIF members. Your contributions have already improved our scores.
Up to now we have chosen not to post code on their computer, though SimonF92 may benefit from its superior format. Probably some code and variables are included in the submissions; perhaps that is why there are so few, 4 from us which now don't count.

I'm very interested in the amsr2 idea above though, haven't discussed that with SimonF92 yet.

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #44 on: January 09, 2023, 02:21:48 PM »
step by step.

An error running the test_upload file prevents me from generating the ice charts so far.

stokholm posted an update that fixed the error generating the ice charts, now there is an error when submitting the model to the ai4eo website.
Quote
local variable 'score_values' referenced before assignment
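
The message quoted above is Python's UnboundLocalError, which typically appears when a variable is only assigned inside a branch or loop that did not run. The sketch below reproduces the pattern; it is a guess at the kind of bug involved, not the actual submission code, and every name in it except score_values is hypothetical.

```python
def evaluate(predictions):
    """Buggy pattern: score_values is only bound if the loop body runs."""
    for name, value in predictions.items():
        score_values = {name: value}
    return score_values  # fails when predictions is empty

def evaluate_fixed(predictions):
    """Bind the variable before the loop so it always exists."""
    score_values = {}
    for name, value in predictions.items():
        score_values[name] = value
    return score_values

try:
    evaluate({})
except UnboundLocalError as err:
    print("reproduced:", err)

print(evaluate_fixed({}))
```

The exact wording of the message differs between Python versions, but the exception class is the same.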

Here are the SAR and ice chart images for run 05b.
Now to look at curriculum learning
« Last Edit: January 09, 2023, 02:27:36 PM by uniquorn »

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #45 on: January 09, 2023, 05:57:08 PM »
I checked out the publications of the competition hosts. This is a good sized group trying this and that AI for years without getting anywhere, hence the scheme to get outside groups to do it for them. [[see Chap2 Adventures of Tom Sawyer]]

It may prove worthwhile to skim through the papers looking for approaches to avoid such as convolutional neural networks. Also they identify the commercial 'stakeholder' who gets to 'view' your code under the yet-to-be-described NDA.

It would be highly amusing if you can win the contest but refuse to give them the code. Or give them code that instead deletes the 80ºN dmi graphic. To make up for no prize, I can offer a link to an ImageJ tutorial valued at the same €9000.

Interpolation of AMSR2 data for improvement of ice charting
AA Nielsen, R Saldo, Jørgen Buus-Hinkler, MB Kreiner
https://backend.orbit.dtu.dk/ws/portalfiles/portal/197991141/imm7139.pdf
 
Today, ice charts in Greenland waters are produced manually by the Danish Meteorological Institute (DMI) for selected regions depending on season and shipping routes. The project “Automated Downstream Sea Ice Products for Greenland Waters” or shorter “Automated Sea Ice Products” (ASIP) attempts to automate this process by means of fusion of data from instruments with different resolutions and modalities.

As a part of this process data from the Advanced Microwave Scanning Radiometer (AMSR2) will be interpolated to the geometry of the SAR data acquired by Sentinel-1. In a preparatory leave-one-out cross-validation (LOOCV) study, different interpolation methods including ordinary kriging (OK) are compared.

Using bias and root-mean-squared error (RMSE) as measures of precision, OK using 20-30 nearest neighbors outperforms other often used methods such as inverse distance (ID) weighting. This comes at a cost: more work needs to be done by both the operator and the computer.
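
To make the comparison procedure concrete, here is a minimal sketch of leave-one-out cross-validation for inverse distance weighting, reporting bias and RMSE as in the abstract. Ordinary kriging is not implemented here, and the sample points and the power parameter are made up for illustration.

```python
from math import hypot, sqrt

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (xi, yi, vi) samples."""
    num = den = 0.0
    for xi, yi, vi in samples:
        d = hypot(x - xi, y - yi)
        if d == 0.0:
            return vi  # exact hit on a sample point
        w = 1.0 / d ** power
        num += w * vi
        den += w
    return num / den

def loocv(samples, power=2.0):
    """Leave each sample out in turn and predict it from the rest."""
    errors = []
    for i, (x, y, v) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        errors.append(idw(x, y, rest, power) - v)
    bias = sum(errors) / len(errors)
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse

# Hypothetical brightness temperatures (K) at scattered footprint centres (km).
samples = [(0, 0, 200.0), (10, 0, 210.0), (0, 10, 205.0),
           (10, 10, 220.0), (5, 5, 209.0)]
bias, rmse = loocv(samples)
print(f"bias={bias:.2f} K, rmse={rmse:.2f} K")
```

Swapping a different interpolator into loocv and comparing the two (bias, RMSE) pairs is the shape of the study described above.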

A Convolutional Neural Network Architecture for Sentinel-1 and AMSR2 Data Fusion
https://ieeexplore.ieee.org/abstract/document/9133205/   July 2020

With a growing number of different satellite sensors, data fusion offers great potential in many applications. In this work, a convolutional neural network (CNN) architecture is presented for fusing Sentinel-1 synthetic aperture radar (SAR) imagery and the Advanced Microwave Scanning Radiometer 2 (AMSR2) data. The CNN is applied to the prediction of Arctic sea ice for marine navigation and as input to sea ice forecast models. This generic model is specifically well suited for fusing data sources where the ground resolutions of the sensors differ with orders of magnitude, here 35 km × 62 km (for AMSR2, 6.9 GHz) compared with the 93 m × 87 m (for Sentinel-1 IW mode).

In this work, two optimization approaches are compared using the categorical cross-entropy error function in the specific application of CNN training on sea ice charts. In the first approach, concentrations are thresholded to be encoded in a standard binary fashion, and in the second approach, concentrations are used as the target probability directly. The second method leads to a significant improvement in R2 measured on the prediction of ice concentrations evaluated over the test set.

The performance improves both in terms of robustness to noise and alignment with mean concentrations from ice analysts in the validation data, and an R2 value of 0.89 is achieved over the independent test set. It can be concluded that CNNs are suitable for multisensor fusion even with sensors that differ in resolutions by large factors, such as in the case of Sentinel-1 SAR and AMSR2.
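
The difference between the two encodings can be seen in a two-class toy case. This is a minimal sketch, assuming a single pixel with an analyst concentration of 30% and a binary ice/water output; the 0.5 threshold is an assumption for illustration and the paper's actual class binning is not reproduced.

```python
from math import log

def cross_entropy(target, p_ice):
    """Categorical cross-entropy for two classes (ice, water)."""
    return -(target * log(p_ice) + (1.0 - target) * log(1.0 - p_ice))

concentration = 0.30                                # analyst's ice fraction
hard_target = 1.0 if concentration >= 0.5 else 0.0  # thresholded encoding
soft_target = concentration                         # used as probability

grid = [i / 100 for i in range(1, 100)]             # candidate predictions
best_hard = min(grid, key=lambda p: cross_entropy(hard_target, p))
best_soft = min(grid, key=lambda p: cross_entropy(soft_target, p))
print(best_hard, best_soft)
```

With the thresholded target the loss keeps pushing the prediction toward 0, while the soft target is minimised when the predicted probability equals the concentration.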


Fusion of satellite SAR and passive microwave radiometer data for automated sea ice mapping and the expected impact of CIMR observations
Same authors
https://orbit.dtu.dk/en/publications/fusion-of-satellite-sar-and-passive-microwave-radiometer-data-for   2021
https://orbit.dtu.dk/en/publications/automatic-satellite-based-ice-charting-using-ai  2019

Manual ice charting from multi-sensor satellite data analysis has for many years been the primary method at the National Ice Services for producing sea ice information for marine safety. Ice analysts primarily use satellite synthetic aperture radar (SAR) imagery due to the high spatial resolution and the capability to image the surface through clouds and in polar darkness, but also optical imagery in clear sky and daylight conditions, thermal-infrared and microwave radiometer data from e.g. AMSR2.

Ice analysts mention the spatial resolution of microwave radiometers as the primary limitation on using the data. The traditional manual ice charting method is time-consuming and limited in spatial and temporal coverage. Further, it is challenged by an increasing amount of available satellite imagery, along with a growing number of users accessing wider parts of the Arctic due to the thinning of the Arctic sea ice. The automation of the time-consuming and labour-intensive sea ice charting process has potential to provide users with near-real time sea ice products of higher spatial resolution, larger spatial and temporal coverage, and increased consistency.

To automate the generation of sea ice information from satellite imagery we use a Convolutional Neural Network (CNN) designed for prediction of sea ice in Greenland waters. Automating the process on SAR data alone is challenging. SAR images show patterns related to ice formations, but backscatter intensities can be ambiguous, which complicates the discrimination between ice and open water, e.g. at high wind speeds. Our CNN model tackles the challenges by fusing Sentinel-1 active microwave (SAR) data with Microwave Radiometer (MWR) data from AMSR2 to exploit the advantages of both instruments.

While SAR data has ambiguities, it has a very high spatial resolution, whereas MWR data has good contrast between open water and ice. However, the coarse resolution of the AMSR2 MWR observations introduces a new set of obstacles, e.g. land spill-over, which can lead to erroneous sea ice predictions along the coastline adjacent to open water. The CNN model has been trained with a large dataset of 461 ice charts manually produced by the ice analysts in the DMI Greenland Ice Service based on Sentinel-1 imagery.

The dataset also contains the corresponding AMSR2 swath co-located with the ice charts and Sentinel-1 images. The sea ice training dataset  has been co-produced in the ASIP and AI4Arctic (ESA) projects. We will present the results of merging active and passive microwave data from Sentinel-1 and AMSR2 as input to a CNN and show how the input from the passive microwave data has a positive effect on the CNN performance. https://doi.org/10.11583/DTU.13011134.v2

In this work we explore data fusion and image segmentation techniques with Convolutional Neural Networks to produce per pixel predictions from Sentinel 1 (S1) SAR images and AMSR2 microwave radiometer measurements of Ice/water. The work is carried out under the Danish Automated Sea Ice Products (ASIP) project in a collaboration between the Danish Meteorological Institute and the Technical University of Denmark.

For the study a dataset of more than 900 ice charts and corresponding Sentinel-1 SAR imagery has been collected. The core of our algorithm consists of a Convolutional Neural Network that models image features at different scales by the use of dilated convolutional filters. The architecture of the algorithm further allows us to merge S1 images with AMSR2 measurements in a data fusion approach that exploits the best properties of each measurement. While the 40m pixel size in Sentinel-1 data enables extraction of ice information at an unprecedented high resolution, the AMSR2 measurements contribute a high contrast between ice and water independent of wind conditions.

Future studies in the project will investigate the importance of additional meta data in the ice prediction, such as weather information, sensor viewing angles, geographic location, etc.

AI4SeaIce: Toward Solving Ambiguous SAR Textures in Convolutional Neural Networks for Automatic Sea Ice Concentration Charting
https://ieeexplore.ieee.org/abstract/document/9705586  Feb 2022

Automatically producing Arctic sea ice charts from Sentinel-1 synthetic aperture radar (SAR) images is challenging for convolutional neural networks (CNNs) due to ambiguous backscattering signatures. The number of pixels viewed by the CNN model in the input image used to generate an output pixel, or the receptive field, is important to detect large features or physical objects such as sea ice and correctly classify them. In addition, a noise phenomenon is present in the Sentinel-1 ESA Instrument Processing Facility (IPF) v2.9 SAR data, particularly in subswath transitions, visible as long vertical lines and grained particles resembling small sea ice floes.

To overcome these two challenges, we suggest adjusting the receptive field of the popular U-Net CNN architecture used for semantic segmentation. It is achieved by symmetrically adding additional blocks of convolutional, pooling and upsampling layers in the encoder and decoder of the U-Net, constituting an increase in the number of levels. This shows great improvements in the performance and in the homogeneity of predictions.

Second, training models on SAR data noise-corrected with an enhanced technique has demonstrated a significant increase in model performance and enabled better predictions in uncertain regions. An eight-level U-Net trained on the alternative noise-corrected SAR data is presented to be capable of correctly predicting many ambiguous SAR signatures and increased the performance by 8.44% points compared with the regular U-Net trained on the ordinary ESA IPF v2.9 noise-corrected SAR data. This is the first installment of this multi-series installment of articles related to AI applied to sea ice (in short AI4SeaIce).
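
The effect of adding levels on the receptive field can be estimated with the standard recurrence for stacked convolution and pooling layers. A minimal sketch follows, assuming two 3x3 convolutions and one 2x2 max-pool per encoder level plus two convolutions at the bottleneck. That layer layout is a common U-Net convention and an assumption here, not the paper's exact configuration, and "levels" counts pooling stages, which the paper may count differently.

```python
def encoder_receptive_field(levels, convs_per_level=2, kernel=3, pool=2):
    """Receptive field (in input pixels) at the bottom of a U-Net encoder."""
    rf, jump = 1, 1                     # one pixel, unit spacing
    for _ in range(levels):
        for _ in range(convs_per_level):
            rf += (kernel - 1) * jump   # each conv widens the view
        rf += (pool - 1) * jump         # pooling window
        jump *= pool                    # spacing doubles after pooling
    for _ in range(convs_per_level):    # bottleneck convolutions
        rf += (kernel - 1) * jump
    return rf

for levels in (4, 8):
    print(levels, encoder_receptive_field(levels))
```

Under these assumptions the field grows roughly as 9 x 2^levels, so every extra level about doubles how much of the scene informs one output pixel.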


High-Resolution Sea Ice Maps with Convolutional Neural Networks
http://www2.compute.dtu.dk/pubdb/pubs/7133-full.html 2019 conf

Automatically generated high resolution sea ice maps have the potential to increase the use of satellite imagery in arctic applications. Applications include marine navigation, offshore operations, validation of ice models, and climate research. Especially for arctic marine navigation, frequent ice maps in high resolution are requested by most users, as documented by an internal project stakeholder survey.

We present current results from our large-scale study of high resolution ice maps generation with Convolutional Neural Networks (CNNs). Our study is based on dual polarized (HH+HV) Extra Wide swath (EW) SAR data from the Copernicus Sentinel-1 satellite mission and we generate pixel-wise sea ice estimates in 40m x 40m resolution. The presentation will include a model validation against expert annotations of SAR images.

In the near future we will expand our study to include AMSR2 Microwave Radiometer (MWR) data as input. The addition of MWR data can potentially solve the ambiguities in SAR data over open water, due to SAR backscatter variation at different wind conditions. Some CNN estimates are observed to confuse very homogeneous ice surfaces with similar backscatter open water scenarios, but results show a clear potential for this methodology.

Our work is carried out under a Danish research project named Automated downstream Sea Ice Products (ASIP). The project goal is to automate generation of sea ice information from satellite images. ASIP is a collaboration between the Danish Meteorological Institute (DMI), the Technical University of Denmark and Harnvig Arctic and Maritime.

It sets out to automate, partially or fully, the extraction of arctic sea ice information from satellite imagery. Today, ice mapping is mainly done manually by ice-experts at national Ice Centers around the world. The project goal will enable analyzing larger quantities of satellite data, for better utilization of the available Sentinel-1 images and for providing ice maps to users more frequently. As a part of the ASIP project a thorough analysis of the need for ice information was carried out among users by Harnvig Arctic and Maritime. http://harnvig-ice.dk/

This resulted in the ASIP Internal Stakeholder Survey Report, which substantiates the specific needs. One of the conclusions from this report is that 90% of use cases need simple ice/no-ice information for marine route planning purposes in high resolution (< 250m pr. pixel). Meeting this resolution requirement is unfortunately not possible with current MWR data alone, though its properties are otherwise good for ice concentration estimations. Hence, SAR data is the only source with regular coverage as input data.

Iceberg Detection in Dual-Polarized C-Band SAR Imagery by Segmentation and Nonparametric CFAR (SnP-CFAR)
https://ieeexplore.ieee.org/abstract/document/9406184  April 2021

We propose an unsupervised method for iceberg detection over sea ice-free waters. The algorithm is based on the segmentation and nonparametric constant false alarm rate (SnP-CFAR) approach. Unlike in parametric CFAR detection, in our method, there is no need to define target, guard, and background areas explicitly. Instead, we apply the CFAR detection to the pixels within each detected segment and the background is formed of the nearby pixels not included in the target segment.

By using nonparametric background probability density function (PDF) estimates, we also eliminate the need of assuming a specific type of a background PDF. We compared the detection results with the operational Danish Meteorological Institute (DMI) Gamma-CFAR algorithm results. The results were evaluated against icebergs manually identified by the Finnish Meteorological Institute (FMI) Ice analysts.

Our method also exhibits a reduced number of false alarms. We present results of iceberg detection based on the SAR channel-cross-correlation (CCC). CCC was able to distinguish many of the true targets with a low number of false alarms. However, CCC seems to miss some of the true targets and its main use would be in confirming iceberg observations.
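A minimal sketch of the nonparametric thresholding idea in the abstract above, not the paper's implementation: instead of fitting a background PDF, the threshold is read off the empirical distribution of the nearby background pixels, and a segment pixel is flagged when it exceeds that threshold. The function names, the pfa value and the toy numbers are assumptions made for illustration.

```python
import math

def empirical_cfar_threshold(background, pfa):
    """Return a background value t such that at most a fraction pfa
    of the background samples lie strictly above t."""
    if not background:
        raise ValueError("background must not be empty")
    if not 0.0 < pfa < 1.0:
        raise ValueError("pfa must be between 0 and 1")
    ordered = sorted(background)
    n = len(ordered)
    # Order statistic that leaves at most floor(pfa * n) samples above it.
    k = n - math.floor(pfa * n) - 1
    return ordered[k]

def detect(segment_pixels, background, pfa=0.01):
    """Flag each segment pixel that is brighter than the empirical threshold."""
    t = empirical_cfar_threshold(background, pfa)
    return [v > t for v in segment_pixels]

# Toy example: background intensities 0..99, 10% allowed false alarm rate.
background = list(range(100))
flags = detect([50, 89, 90, 200], background, pfa=0.1)
```

With few background samples the achievable false alarm rate is coarse (no finer than 1/n), which is one reason a real detector would want a reasonably large background region.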

Field tracking (GPS) of ten icebergs in eastern Baffin Bay, offshore Upernavik, northwest Greenland
https://tinyurl.com/2zdjsmyf  2017

A field investigation of iceberg drift pattern and drift speed was conducted in September 2011 in Baffin Bay, northwest Greenland. Ten icebergs were equipped with GPS transponders during a field campaign. Above-waterline dimensions (length, width and height) of the icebergs were measured using a GPS/pressure altimeter and geometrically rectified digital photographs taken during the field campaign. Iceberg lengths, masses and drafts ranged from 95 to 450 m, 330 000 to 17 000 000 t and 70 to 260 m, respectively. The drift patterns and speeds were determined on the basis of GPS positions logged continuously at 1 hour intervals.

The drift patterns differed significantly from iceberg to iceberg. The GPS signal was lost on six of the icebergs within the first 23 days of logging. Three transponders were transferring data for more than 5 months until the battery ran out of power. One transponder was sending data until summer 2012. The measured maximum drift speed was 68 cm s−1 (2.4 km h−1), and the mean drift speed for all ten icebergs was 10 cm s−1 (0.4 km h−1). Relations between iceberg size and drift speed were investigated, showing that icebergs with large surface areas moved at the highest speeds, which occurred particularly during strong wind conditions.
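A sketch of how drift speeds like these could be derived from hourly GPS fixes, assuming a spherical Earth and the haversine distance between consecutive positions. The abstract does not say how the authors computed speed, so the function names, the radius and the sample fixes are illustrative only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def drift_speeds_cm_s(fixes, interval_s=3600.0):
    """Speeds in cm/s between consecutive (lat, lon) fixes logged every interval_s seconds."""
    return [100.0 * haversine_m(*a, *b) / interval_s for a, b in zip(fixes, fixes[1:])]

def cm_s_to_km_h(v):
    """1 cm/s = 0.036 km/h."""
    return v * 0.036

# Hypothetical hourly fixes, 0.01 degrees of latitude apart.
speeds = drift_speeds_cm_s([(72.79, -56.0), (72.80, -56.0)])
```

The unit conversion matches the figures quoted above: 68 cm/s is about 2.4 km/h and 10 cm/s is about 0.4 km/h after rounding.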

Jørgen Buus-Hinkler received the Ph.D. degree from the University of Copenhagen, Copenhagen, Denmark, in 2005, focusing on snow-precipitation in Northeast Greenland and its relation to sea-ice distribution gathered from passive microwave imagery. He has been working as a Research Scientist at the Danish Meteorological Institute, Copenhagen, Denmark, in the fields of remote sensing and geospatial analysis. Part of his present work is the development of operational iceberg products based on target detection in SAR imagery. This work is within the Copernicus Marine Environment Monitoring Service (CMEMS).
« Last Edit: January 09, 2023, 06:45:57 PM by A-Team »

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #46 on: January 10, 2023, 01:52:18 PM »
It would be highly amusing if you could win the contest but refuse to give them the code. Or give them code that instead deletes the 80ºN DMI graphic. To make up for no prize, I can offer a link to an ImageJ tutorial valued at the same €9000.

Thanks for the background reading. My understanding is that we would be removed from the leaderboard and never mentioned again. Did you ever look at FlowJ? I think it might work well with ascat with good parameters.
--------------------------------------

A quick look at an extract of the submission data:
Quote
Ai4eotst2/upload_package_tst1a.nc {
  dimensions:
    20180124T194759_dmi_SIC_dim0 = 5003;
    20180124T194759_dmi_SIC_dim1 = 5229;
    20180124T194759_dmi_SOD_dim0 = 5003;
    20180124T194759_dmi_SOD_dim1 = 5229;
    20180124T194759_dmi_FLOE_dim0 = 5003;
    20180124T194759_dmi_FLOE_dim1 = 5229;

  variables:
    ubyte 20180124T194759_dmi_SIC(20180124T194759_dmi_SIC_dim0=5003, 20180124T194759_dmi_SIC_dim1=5229);
      :_ChunkSizes = 2502U, 2615U; // uint

    ubyte 20180124T194759_dmi_SOD(20180124T194759_dmi_SOD_dim0=5003, 20180124T194759_dmi_SOD_dim1=5229);
      :_ChunkSizes = 2502U, 2615U; // uint

    ubyte 20180124T194759_dmi_FLOE(20180124T194759_dmi_FLOE_dim0=5003, 20180124T194759_dmi_FLOE_dim1=5229);
      :_ChunkSizes = 2502U, 2615U; // uint


The filesize dropped from ~21KB for v1 to ~17KB for v2. Both v2 test uploads were exactly the same size though so..
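A rough sanity check on those numbers, assuming the header above describes the whole upload and that the ~17KB refers to this file: three ubyte grids of 5003 x 5229 come to roughly 78 MB uncompressed, so a file that small would have to be very heavily compressed or nearly constant. The chunk sizes in the header also equal each dimension halved and rounded up. Variable names below are illustrative.

```python
import math

rows, cols = 5003, 5229            # dimensions from the header above
variables = ("SIC", "SOD", "FLOE")  # one ubyte grid each
bytes_per_value = 1                 # ubyte

# Total size if nothing were compressed.
raw_bytes = rows * cols * bytes_per_value * len(variables)

# Chunk shape if each axis is split in two and rounded up.
chunk_shape = (math.ceil(rows / 2), math.ceil(cols / 2))

print(f"uncompressed: {raw_bytes / 1e6:.1f} MB")
print(f"chunk shape with each axis halved: {chunk_shape}")
```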

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #47 on: January 10, 2023, 07:57:57 PM »
... schedule a meeting for later this week...

A-Team

  • Young ice
  • Posts: 2977
    • View Profile
  • Liked: 944
  • Likes Given: 35
Re: AIdeas
« Reply #48 on: January 10, 2023, 10:36:05 PM »
Quote
The filesize dropped from ~21KB for v1 to ~17KB for v2. Both v2 test uploads were exactly the same size though so.
meaning?
Quote
Did you ever look at FlowJ? might work well with ascat.
Looked but now thinking buoy isochrons might be better. Buoys are very unusual as data sources in that hourly resolution and even density of buoys is sometimes excessive. Everywhere else we are scratching around for more resolution.

Seems that drift trajectories could be supplemented, in favorable cases of transitive buoy pairings, with lines joining positions on the same dates, maybe weekly or even monthly.



The idea is the isochrons give the sweep. You posted a quadrilateral bit of this earlier on the freeze forum. To help pick viable pairings, perhaps color by date the simplified drift trajectories rather than speed.

This is reminiscent of Delaunay but it stays quadrilateral like a fish net. Often buoys are launched in batches, either nearby or with some delay, so favorable separations may be fairly common.

The two sets of lines are vaguely orthogonal in the sense that the same software draws and colors both sets of lines separately after a matrix transpose.
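One way to read the transpose remark, as a sketch with made-up positions: keep the positions in a grid with one row per buoy and one column per date. The rows are then the drift trajectories, and the rows of the transposed grid are the isochrons, so the same drawing routine can handle both. All names and coordinates here are hypothetical.

```python
# One row per buoy, one column per date; entries are (lat, lon).
positions = [
    [(85.0, 10.0), (84.5, 8.0), (84.0, 5.0)],    # buoy A on dates 1, 2, 3
    [(85.2, 20.0), (84.8, 17.0), (84.3, 13.0)],  # buoy B on dates 1, 2, 3
]

def polylines(grid):
    """Turn each row of the grid into one polyline (a list of points)."""
    return [list(row) for row in grid]

def transpose(grid):
    """Swap rows and columns: buoy-by-date becomes date-by-buoy."""
    return [list(col) for col in zip(*grid)]

trajectories = polylines(positions)           # one line per buoy
isochrons = polylines(transpose(positions))   # one line per date
```

Each cell of the resulting net is bounded by two trajectory segments and two isochron segments, which is what keeps it quadrilateral.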

Gimp provides an Adobe Illustrator-like line tool that could allow point and click editing away of clutter. Another way to go is putting each buoy track into an otherwise transparent layer and adding outward from an especially favorable trajectory.

After verifying that the set of buoys and dates works, the sub-areas swept towards the Fram are then summed by pixel count on interior white. It is then verified that neither more buoys nor a higher resolution on the dates much affects (i.e. improves) the calculated areas. If the ice is not quite exported, call it the replacement ice nominal surface area sources.
« Last Edit: January 10, 2023, 11:58:21 PM by A-Team »

uniquorn

  • First-year ice
  • Posts: 5342
    • View Profile
  • Liked: 2279
  • Likes Given: 393
Re: AIdeas
« Reply #49 on: January 11, 2023, 10:38:27 AM »
Maybe. A shame to revert to pixel counting when we already have the lat/lons. Will keep thinking about it.
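A sketch of measuring a swept quadrilateral straight from the lat/lons rather than by pixel count. It assumes a spherical Earth and a north-polar Lambert azimuthal equal-area projection, with the shoelace formula applied in the projected plane, so edges are straight lines in the projection rather than great circles. Function names, the radius and the sample corners are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def laea_north(lat, lon):
    """Project (lat, lon) in degrees to x, y in km, equal-area, centred on the North Pole."""
    colat = math.radians(90.0 - lat)
    r = 2.0 * EARTH_RADIUS_KM * math.sin(colat / 2.0)
    lam = math.radians(lon)
    return r * math.sin(lam), -r * math.cos(lam)

def polygon_area_km2(corners):
    """Shoelace area of a polygon given as (lat, lon) corners in order around the outline."""
    pts = [laea_north(lat, lon) for lat, lon in corners]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical cell: two buoys on two dates, corners listed around the outline.
cell = [(85.0, 10.0), (85.2, 20.0), (84.8, 17.0), (84.5, 8.0)]
area = polygon_area_km2(cell)
```

The corners have to be listed in order around the outline (buoy A date 1, buoy B date 1, buoy B date 2, buoy A date 2); crossing the order gives a bow-tie and a meaningless area.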

---------------------

Quote
meaning?
I don't think the error is related to file contents any more. It's probably due to the entry count being set to zero while we still have submissions in the list.