MM-Net: Multi-Modal Segmentation Networks Guided by Multi-view Imagery
Abstract:
Image segmentation architectures bound to a single-modality paradigm struggle to learn complementary features and lack generalization ability because of background clutter, occlusion, and inter- and intra-class similarities. Deep learning-based approaches, owing to their self-learning ability, have emerged as an attractive solution for image segmentation in medical imaging, remote sensing, object detection, video surveillance, and recognition tasks. However, these approaches are quite sensitive to distributional variations, especially those caused by changes in modality and view, and hence suffer from low performance. The main focus of this thesis is to design novel, efficient, robust, and generalized deep learning-based architectures using multiple modalities and multi-view imagery, and to test them on challenging applications such as medical and remote sensing image segmentation. More specifically, we focus on cardiac magnetic resonance imaging (CMRI) and height estimation from satellite imagery, because both areas involve multi-modal and multi-view scenarios.
CMRI segmentation has been performed using two publicly available, 3D single-view CMRI cohorts: (1) the Multi-Centre, Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge (M&Ms-2020) dataset and (2) the Automated Cardiac Diagnosis Challenge (ACDC) dataset. This was a pilot study to benchmark three different state-of-the-art architectures: Isensee_2017, Calisto-2020, and Chen-2020. The Dice Similarity Coefficient (DSC) was 90.03%, 84.93%, and 87.22% for the left ventricle (LV); 86.44%, 85.65%, and 85.97% for the right ventricle (RV); and 83.74%, 79.40%, and 82.43% for the myocardium, for Isensee_2017, Calisto-2020, and Chen-2020, respectively. Secondly, a novel architecture, the SA-LA model, is proposed for holistic segmentation of the RV on multi-view CMRI; it demonstrated state-of-the-art performance for RV segmentation, with Dice scores of 91.00% and 89.63% for short-axis (SA) and long-axis (LA) CMRI views, respectively.
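For readers unfamiliar with the metric reported above, the Dice Similarity Coefficient between a predicted mask A and a reference mask B is commonly defined as DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of a per-class computation, written in plain Python over flattened label maps for clarity. The function name `dice_coefficient`, the integer label values, and the convention of returning 1.0 when a class is absent from both masks are assumptions made for illustration; they are not taken from the evaluation code used in this work.

```python
def dice_coefficient(pred, target, label):
    """Dice Similarity Coefficient for one class in two flattened label maps.

    pred, target: equal-length sequences of integer class labels (one per voxel).
    label: the class whose overlap is measured (hypothetical label values).
    """
    pred = list(pred)
    target = list(target)
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    # Count voxels assigned to `label` in each map, and in both at once.
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)

    denom = pred_count + target_count
    if denom == 0:
        # Assumed convention: a class absent from both maps counts as perfect agreement.
        return 1.0
    return 2.0 * overlap / denom


# Toy example with hypothetical labels: 0 = background, 1 and 2 = two structures.
pred_labels = [0, 1, 1, 0, 2, 2]
true_labels = [0, 1, 0, 0, 2, 2]
scores = {c: dice_coefficient(pred_labels, true_labels, c) for c in (1, 2)}
```

Multiplying each score by 100 gives the percentage form used in the results above. For 3D volumes the same computation would typically be applied to the flattened voxel array of each case and then averaged over cases.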
The next goal is to extend the idea towards multi-modality, with the application of estimating building height from remote sensing imagery. The 3D building layout is paramount for urban development and management planning and is a good indicator of household energy consumption. Moreover, we will incorporate mathematical techniques, e.g., stereo rectification and structure from motion (SfM) from satellite imagery, along with deep learning tools in order to improve the performance of the proposed building height estimation framework.
Proposal Defense Committee
1. Dr Hassan Mohy Ud Din (Advisor)
2. Dr Murtaza Taj (Co-Advisor)
3. Dr Zubair Khalid