This project studied cloud segmentation, as cloud cover is a major nuisance in optical remote sensing applications. The analysis was based on the 95-Cloud dataset, publicly available on Kaggle, a competition platform. The dataset comprises medium-resolution Landsat-8 Level-1 scenes, each with four-channel data (red, green, blue, and near-infrared) and a corresponding ground-truth cloud mask. This report addresses the problem by extracting semantic maps of clouds from these medium-resolution Landsat-8 satellite images. Framing it as a supervised learning problem, I designed, implemented, and experimentally evaluated a comparative set of state-of-the-art deep neural network architectures commonly used for semantic segmentation, namely the vanilla U-Net, U-Net++, EfficientNet, and a U-Net with a ResNet backbone. Furthermore, I investigated the reliability of transfer learning for this problem, considering both pretrained and randomly initialized (untrained) models. VGG16 and ResNet34, which are typically used for classification tasks, were employed here as feature extractors (encoders): a slight modification was applied that removes the classifier (top layer) and attaches a purpose-built decoder.
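For illustration, the sketch below shows how such a pretrained encoder with its classifier removed could be combined with a simple decoder. It is a minimal example, not the exact configuration used in the experiments: it assumes TensorFlow/Keras, a 384x384 patch size, ImageNet weights (which require a three-channel input, so the near-infrared band would have to be dropped or the first convolution adapted), and a plain U-Net-style decoder; the helper name build_vgg16_unet is hypothetical.

import tensorflow as tf
from tensorflow.keras import layers, Model

def build_vgg16_unet(input_shape=(384, 384, 3), pretrained=True):
    """Encoder-decoder built from VGG16 with its classifier head removed.

    Sketch only: ImageNet weights expect 3-channel input; for the full
    4-channel (R, G, B, NIR) patches the encoder would instead be
    initialised randomly (weights=None) or its first conv layer adapted.
    """
    weights = "imagenet" if pretrained else None
    encoder = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape
    )

    # Feature maps reused as skip connections, one per resolution level.
    skip_names = ["block1_conv2", "block2_conv2", "block3_conv3", "block4_conv3"]
    skips = [encoder.get_layer(name).output for name in skip_names]
    x = encoder.get_layer("block5_conv3").output  # bottleneck features

    # Simple decoder: upsample, concatenate the skip, refine with convolutions.
    for skip, filters in zip(reversed(skips), [512, 256, 128, 64]):
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # binary cloud mask
    return Model(encoder.input, outputs)

model = build_vgg16_unet()

Freezing or fine-tuning the VGG16 layers then corresponds to the pretrained variant, while passing pretrained=False gives the untrained baseline used for comparison.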
Contrary to what is commonly accepted within the computer vision research community, transfer learning was found to be of limited benefit for satellite imagery: it was computationally expensive and yielded only a marginal improvement over the untrained versions. Conversely, U-Net++ outperformed the rest and produced very accurate results, suggesting that segmenting remote sensing images could benefit greatly from ensemble learning. To the best of my knowledge, this fairly new network had not previously been tested on satellite images. Based on these results, I found it to be a very powerful and accurate network that should be adopted for further investigations and for other remote sensing applications. The combined Dice and binary cross-entropy (Dice-BCE) loss function appears better suited to this case study than the standard binary cross-entropy (BCE), the Dice loss, or the negative log-Dice loss, as those alternatives were biased toward, and easily dominated by, the majority class.
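As a concrete illustration of the combined loss, a minimal sketch follows. It again assumes TensorFlow/Keras and an equally weighted sum of the two terms, which is an assumption rather than the exact formulation used in the experiments; the function name dice_bce_loss is hypothetical.

import tensorflow as tf

def dice_bce_loss(y_true, y_pred, smooth=1.0):
    """Combined Dice + binary cross-entropy loss for binary cloud masks.

    Sketch with equal weighting (an assumption): the Dice term counters
    class imbalance (clear pixels usually outnumber cloudy ones), while
    BCE keeps per-pixel gradients stable.
    """
    y_true = tf.cast(y_true, y_pred.dtype)
    bce = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true, y_pred))

    intersection = tf.reduce_sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + smooth
    )
    return bce + (1.0 - dice)

A model would then be compiled with this loss in the usual way, e.g. model.compile(optimizer="adam", loss=dice_bce_loss); the negative log-Dice variant mentioned above would replace the (1.0 - dice) term with -log(dice).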