Underwater Image Enhancement using Masked MSE Loss
Keywords: Image Enhancement, Underwater Imaging, Transfer Learning, Masked Image Modelling, Convolutional Neural Networks.
Capturing images in an underwater environment is one of the most challenging tasks in computer vision, as it poses unique difficulties. Undistorted images become harder to acquire with increasing depth: limited light penetration and the properties of the water column degrade the quality of the captured images, which in turn severely impacts feature extraction and object recognition. Underwater images degrade for several reasons. Wavelength-dependent attenuation produces a color cast, mainly blue and green, because red light, with its longer wavelength, is absorbed first, while the shorter blue and green wavelengths travel deeper; this selective attenuation leaves greenish and bluish hues. Scattering by suspended particles produces haze, and marine snow adds further noise.



The goal of this task is for the model to focus on the color chart placed in the images taken in the underwater environment by introducing corresponding masked images. Masks, which indicate the position of the color chart, are applied together with the underwater images to train the network. The color chart serves as a reference standard to estimate quality degradation under varying lighting conditions. The loss function has been modified so that the algorithm acts accordingly, providing a more robust way to enhance the images.
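As a concrete illustration of such a mask, the sketch below builds a binary array marking the color-chart region of an image. The function name and the assumption that the chart location is known as a bounding box per image are illustrative, not taken from the actual project code.

```python
import numpy as np

def chart_mask(img_h, img_w, top, left, h, w):
    """Return a binary mask: 1.0 inside the color-chart bounding box, 0.0 elsewhere.

    Hypothetical helper -- in practice the chart position would come from
    annotations or detection, not hard-coded coordinates.
    """
    mask = np.zeros((img_h, img_w), dtype=np.float32)
    mask[top:top + h, left:left + w] = 1.0
    return mask

# Example: a 32x48 chart region inside a 256x256 image.
mask = chart_mask(256, 256, top=40, left=60, h=32, w=48)
```

Such a mask can then be multiplied element-wise with a per-pixel loss so that only the chart region contributes to the objective.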

This project is based on a pre-built model, Underwater image enhancement via medium transmission-guided multi-color space embedding (Ucolor), created by Li and Anwar (Li et al., 2021).


The aim is to enhance the underwater image by removing the color casts. Continuing the previous work (Li et al., 2021), the loss function has been changed so that it focuses mainly on the color chart introduced into the images. To ensure this, corresponding masked images are introduced. For the masking procedure, binary masks are used: each is a grayscale array whose pixels hold only the values 0 or 1, where a value of 1 indicates that the color chart is present at that pixel position. The masked images are loaded in the same fashion as the other images, and every set of images is then converted into an array of float values. For the training procedure, image patches of shape 128x128 are selected by random cropping; the patches taken from the input images, depth images, and masked images are always picked from the same x and y position. Once this is done, the image patches are passed to the loss function for evaluation. The loss function combines an MSE loss (LMSE), a mean squared error restricted to the masked region, and a VGG loss (Lvgg), a perceptual loss computed on features extracted by a pre-trained VGG network.
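The training-time steps above can be sketched as follows. This is a minimal NumPy illustration with hypothetical names, not the actual Ucolor/UMaskNet code: `aligned_crops` takes 128x128 patches at the same position from all three arrays, `masked_mse` restricts the MSE to the chart region, and `total_loss` combines it with a perceptual term supplied as a callable.

```python
import numpy as np

PATCH = 128  # crop size used for training patches

def aligned_crops(raw, depth, mask, rng):
    """Crop the input, depth, and mask arrays at the same random (y, x) position."""
    h, w = raw.shape[:2]
    y = rng.integers(0, h - PATCH + 1)
    x = rng.integers(0, w - PATCH + 1)
    sl = (slice(y, y + PATCH), slice(x, x + PATCH))
    return raw[sl], depth[sl], mask[sl]

def masked_mse(pred, target, mask, eps=1e-8):
    """MSE computed only where the binary mask is 1 (the color chart)."""
    m = mask.astype(np.float32)
    if m.ndim == pred.ndim - 1:      # add a channel axis so the mask broadcasts
        m = m[..., None]
    m = np.broadcast_to(m, pred.shape)
    return float((m * (pred - target) ** 2).sum() / (m.sum() + eps))

def total_loss(pred, target, mask, vgg_loss, lam=1.0):
    """Combined objective: masked MSE plus a weighted perceptual (VGG) term."""
    return masked_mse(pred, target, mask) + lam * vgg_loss(pred, target)
```

In the real pipeline `vgg_loss` would compare feature maps from a pre-trained VGG network; here it is left as a parameter so the masking logic stays self-contained.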




For more details regarding the project, refer to my dissertation: UMaskNet-MSE
References
Li, C., Anwar, S., Hou, J., Cong, R., Guo, C., & Ren, W. (2021). Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Transactions on Image Processing.