Multimodal predictive process monitoring and its application to explainable clinical pathways
Maggi F. M.; Malerba D.
2026-01-01
Abstract
In recent years, Predictive Process Monitoring (PPM) has evolved at the intersection of process mining, machine learning, and data science, as organizations seek to anticipate the future course of ongoing processes. Traditional PPM relies mainly on structured event log data, but many real-world scenarios generate richer information, including text, images, audio, and video. Multimodal Predictive Process Monitoring (MM-PPM) addresses this setting by integrating complementary knowledge from heterogeneous modalities through modality-specific representations and information fusion techniques. The growing digitization of healthcare systems, combined with advances in Artificial Intelligence (AI), has accelerated the adoption of AI-based PPM for analyzing sequences of clinical events, supporting decision-making, enabling personalized care, and improving clinical facility management. Clinical pathways are therefore an ideal domain for experimenting with MM-PPM, as they may naturally involve diverse modalities such as structured records, free-text notes, and medical images. To handle the multimodal information available in clinical pathways, we introduce MEDUSA, an MM-PPM approach for outcome prediction that jointly processes medical image information coupled with the storytelling of structured records and text notes collected during a patient's clinical pathway up to the acquisition of the considered image. We evaluate MEDUSA in a COVID-19 case study, both to assess the performance of the proposed approach and to explain how specific information within each modality influences the decisions of the predictive model.


