Interpretable Recurrent Survival Network (IRSN) for Predicting Diabetes Risk from Longitudinal Nutritional Data

Monday, November 10, 2025 at 1:30pm to 3:00pm

College of Engineering
Computer & Information Science

Advisor: Dr. Hua (Julia) Fang, Computer & Information Science - University of Massachusetts Dartmouth

Committee members:

  • Dr. Yuchou Chang, Computer & Information Science - University of Massachusetts Dartmouth
  • Dr. Amir Akhavan Masoumi, Computer & Information Science - University of Massachusetts Dartmouth

Abstract:

Predicting the onset of chronic diseases from longitudinal data presents a significant methodological challenge due to high-dimensional features, missing values, and irregularly timed observations. This thesis addresses the gap between the predictive accuracy of deep learning models and clinical interpretability by developing and validating a novel method named the Interpretable Recurrent Survival Network (IRSN). The IRSN unifies three components: (1) a Long Short-Term Memory (LSTM) network to model patients’ longitudinal trajectories, (2) a self-attention mechanism for direct ante-hoc interpretability of visit-level importance, and (3) a Cox partial log-likelihood loss to handle right-censored survival data.
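
For readers unfamiliar with the third component, the following is a minimal sketch of a negative Cox partial log-likelihood, not the thesis implementation. The function name, the argument names, and the Breslow-style treatment of tied event times are assumptions made for illustration. In a network such as the one described, the risk scores would be the model outputs; here they are plain Python lists.

```python
import math

def cox_neg_partial_log_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: predicted log relative hazard per patient (hypothetical model output)
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was right-censored
    """
    total = 0.0
    n_events = 0
    for eta_i, t_i, e_i in zip(risk_scores, times, events):
        if not e_i:
            # Censored patients add no term of their own; they only
            # appear in the risk sets of earlier events.
            continue
        # Risk set: every patient still under observation at time t_i.
        risk_set = [eta_j for eta_j, t_j in zip(risk_scores, times) if t_j >= t_i]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(risk_set)
        log_sum = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += eta_i - log_sum
        n_events += 1
    return -total / max(n_events, 1)


# Scores that rank early events as higher risk yield a lower loss.
good = cox_neg_partial_log_likelihood([2.0, 1.0, 0.0], [1, 2, 3], [1, 1, 1])
bad = cox_neg_partial_log_likelihood([0.0, 1.0, 2.0], [1, 2, 3], [1, 1, 1])
```

Right-censored patients are handled without discarding them: they stay in the denominator of every event that occurs while they are still under observation.
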
The IRSN's performance was evaluated against static (CPH, RSF, DeepSurv, DeepHit) and classical longitudinal (Cox-TVC) baselines on four public datasets, a simulated dataset informed by real-world parameters, and real-world data. On a simulated dataset designed to embed the predictive signal in temporal trends, the IRSN significantly outperformed all baselines, achieving a C-index of 0.9049 compared with 0.60-0.64 for static models that had access only to the last visit. This result provides evidence that a sequence-aware model is necessary and that the baselines failed to capture the non-linear temporal dependencies inherent in dietary trajectories. The finding was reinforced on the National 1 data, where the IRSN achieved a mean C-index of 0.7861, a significant improvement over the best-performing baseline, RSF (C-index = 0.6406), while the other models performed poorly (C-index ≈ 0.47-0.51). This thesis makes three primary contributions: it adapts an interpretable, attention-based recurrent framework for survival analysis; it empirically demonstrates the necessity of non-linear sequence learning in clinical prediction; and it establishes an ante-hoc interpretability mechanism suited for clinical decision-making. The model's ante-hoc interpretability was also empirically validated. On the simulated data, the attention mechanism correctly identified the primary engineered signal (sugar intake) as the most influential feature. On the National 1 data, clinical case studies demonstrated the model's utility in generating plausible, patient-specific insights, highlighting factors such as whole grain intake and irregular visit schedules as key drivers of risk. This thesis therefore advances survival analysis by combining interpretability with predictive strength, offering a transparent and effective approach for modeling disease risk trajectories from complex nutritional data.
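
The C-index values quoted above measure how often a model ranks patients in the right order of risk. The sketch below shows one common definition (Harrell's concordance index) in plain Python. It is an illustration under stated assumptions, not the evaluation code used in the thesis: the function name is hypothetical, and pairs with tied event times are simply skipped.

```python
def concordance_index(risk_scores, times, events):
    """Harrell's C-index for right-censored data (simplified sketch).

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the model gave patient i the higher risk score. Tied scores
    count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Perfect ranking: the earliest event has the highest risk score.
perfect = concordance_index([3.0, 2.0, 1.0], [1, 2, 3], [1, 1, 1])
# Constant scores carry no ranking information.
uninformative = concordance_index([1.0, 1.0, 1.0], [1, 2, 3], [1, 1, 1])
```

Under this definition a value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives a scale for reading the reported scores.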

All CIS students are encouraged to attend, and all interested parties are invited.

For further information, please contact Dr. Hua Fang.

virtual
Hua (Julia) Fang
508-910-6411
hfang2@umassd.edu
https://umassd.zoom.us/j/95683754446?pwd=WW1paGw5Q29mdXBMb0E3N3dkUTZ2Zz09