Doctoral thesis defense: Advancing Pulmonary Disease Assessment in Medical Imaging: Deep Regression and Data Augmentation Techniques
First published: 2025/07/18
Author: Bouthaina Slika
Title: Advancing Pulmonary Disease Assessment in Medical Imaging: Deep Regression and Data Augmentation Techniques
Supervisors: Fadi Dornaika / Karim Hammoudi
Date: July 22, 2025
Time: 10:30
Venue: Ada Lovelace room (Faculty of Informatics)
Abstract:
Lung infections, including pneumonia and COVID-19, pose significant global health challenges due to their rapid progression, high transmissibility, and detrimental impact on respiratory function. Accurate and efficient assessment of infection severity using chest imaging, particularly chest X-rays (CXRs) and computed tomography (CT) scans, is vital for timely diagnosis, risk stratification, and treatment planning. Traditional approaches often rely on manual scoring systems by radiologists, which are time-consuming, subjective, and prone to inter-observer variability. Furthermore, the increasing patient load during pandemics highlights the urgent need for scalable, automated, and interpretable solutions. In response to these challenges, this thesis introduces a comprehensive set of deep learning frameworks designed to automatically predict the severity of pulmonary infections from CXR and CT data. These frameworks leverage recent advances in Vision Transformers (ViTs), multi-task learning, and attention-based modelling, combined with anatomically informed data augmentation techniques to enhance generalization, accuracy, and robustness.
The proposed models are capable of estimating multiple severity scores simultaneously and demonstrate strong performance across diverse public datasets, thus paving the way for real-world clinical integration and decision support in respiratory disease management. Despite recent progress, existing deep learning methods for severity prediction from medical images face several limitations that hinder their clinical applicability and scalability. Many models depend heavily on large volumes of annotated data, are sensitive to data distribution shifts, or struggle to generalize across different imaging modalities and patient populations. Moreover, conventional data augmentation strategies often fail to capture the complex anatomical variations present in pulmonary infections, leading to suboptimal performance and overfitting.
The issues with previously proposed methods include:
(I) Existing methods typically require heavy architectures with a large number of parameters, increasing the risk of overfitting and reducing their deployability.
(II) Many techniques focus solely on classification, while regression-based severity quantification remains underexplored.
(III) Current approaches often fail to integrate multi-task learning to exploit the complex structure of pulmonary infections.
(IV) Few methods address cross-modality robustness or evaluate generalizability across diverse imaging conditions.
(V) Most augmentation strategies are designed for classification and do not effectively enhance performance for regression tasks in severity scoring.
To address these challenges, this thesis introduces a set of innovative, lightweight, and region-aware deep learning models for robust severity quantification of lung infections.
The overarching goal is to improve predictive performance and generalization while reducing reliance on large-scale annotated data. Specifically, several transformer-based and attention-augmented architectures are proposed in conjunction with anatomically informed augmentation strategies. The main contributions of the thesis are outlined below.
(1) A novel Vision Transformer Regressor (ViTReg-IP) is proposed, leveraging a small-scale transformer with regression heads, trained using a newly derived score-aware data augmentation method that adapts classification-based fusion and mixing for regression.
(2) A multi-task variant of ViTReg-IP is developed to simultaneously predict two infection scores—lung opacity and geographic extent—through a dual-encoder transformer network and an online Combined Lung and Score Replacement augmentation method, enabling more comprehensive assessments.
(3) A parallel gated transformer model (PViTGAtt-IP) is introduced, where the input image is regionally partitioned and fed into multiple transformer encoders, each capturing localized severity patterns. A multi-gate attention fusion mechanism is applied to effectively integrate region-level features.
(4) A Mamba-based attention model is designed by combining gated and spatial attention in a parallel architecture to enhance feature selectivity. To further mitigate data scarcity, a Segmented Lung Replacement Augmentation method is introduced, using anatomical regions for self- and cross-image patch replacements, thereby preserving clinical plausibility and increasing data diversity.
(5) All models are evaluated across multiple public datasets (RALO, Brixia, Per-COVID, among others) and imaging modalities (CXR and CT), achieving state-of-the-art performance in terms of Mean Absolute Error (MAE) and Pearson Correlation (PC). Extensive ablation studies highlight the contributions of each architectural and augmentation component, confirming the models’ robustness and generalizability.
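Contribution (1) mentions adapting classification-style mixing augmentation to regression. The abstract gives no implementation details, but the general idea of label-consistent mixing for a regression target can be sketched as follows; this is a minimal, generic illustration (function name and the Beta-distributed mixing coefficient are assumptions in the style of standard mixup), not the thesis's actual method:

```python
import numpy as np

def mix_images_for_regression(img_a, img_b, score_a, score_b,
                              alpha=0.3, rng=None):
    """Blend two images with a random coefficient and interpolate their
    severity scores with the *same* coefficient, so the augmented label
    remains consistent with the augmented input."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    img = lam * img_a + (1.0 - lam) * img_b
    score = lam * score_a + (1.0 - lam) * score_b
    return img, score
```

The key point the sketch illustrates is that, unlike one-hot label mixing in classification, a continuous severity score can be interpolated directly, keeping the image/label pair physically coherent.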
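Contribution (4) describes a Segmented Lung Replacement Augmentation that uses anatomical regions for self- and cross-image patch replacement. As a rough illustration of the cross-image case only, one could imagine pasting the lung region of one image into another and re-weighting the severity score by the replaced area; everything below (function name, the area-weighted score rule, the assumption that a boolean lung mask is available) is a hypothetical sketch, not the procedure defined in the thesis:

```python
import numpy as np

def cross_image_lung_replacement(img_src, img_dst, lung_mask,
                                 score_src, score_dst):
    """Paste the lung-region pixels of img_src into img_dst at the
    locations given by lung_mask, then re-weight the severity score by
    the fraction of the image that was replaced."""
    out = img_dst.copy()
    out[lung_mask] = img_src[lung_mask]   # swap lung-region pixels
    frac = lung_mask.mean()               # fraction of pixels replaced
    score = (1.0 - frac) * score_dst + frac * score_src
    return out, score
```

Restricting replacements to segmented lung regions is what keeps such augmentations clinically plausible: pathology-bearing pixels move only between anatomically corresponding areas, rather than arbitrary rectangles as in generic cut-and-paste schemes.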
Keywords:
Deep Learning, Vision Transformers, Lung Severity Prediction, Chest Radiographs, CT scans, Multi-task Regression, Attention Mechanisms, Data Augmentation, Mamba