A Machine Learning Approach for Predicting Sustained Remission in Rheumatoid Arthritis Patients on Biologic Agents

Venerito V.;Fornaro M.;Cacciapaglia F.;Lopalco G.;Iannone F.
2022-01-01

Abstract

Background: Although several studies have identified factors associated with successful treatment outcomes in rheumatoid arthritis (RA), accurate predictive models for sustained remission in patients on biologic agents are lacking. To the best of our knowledge, no machine learning (ML) approach other than logistic regression (LR) has been applied to this class of problems.

Methods: In this longitudinal study, we analyzed patients with RA who started a biological disease-modifying antirheumatic drug (bDMARD) at a tertiary care center. Demographic and clinical characteristics were collected at treatment baseline and at 12-month and 24-month follow-up. A wrapper feature selection algorithm was used to determine an attribute core set. Four ML algorithms (LR, random forest, K-nearest neighbors, and extreme gradient boosting) were then trained and validated with 10-fold cross-validation to predict 24-month sustained DAS28 (Disease Activity Score on 28 joints) remission. Their performance was compared in terms of accuracy, precision, and recall.

Results: Our analysis included 367 patients (female 323/367, 88%) with a mean age ± SD of 53.7 ± 12.5 years at bDMARD baseline. Sustained DAS28 remission was achieved by 175 (47.2%) of 367 patients. The attribute core set used to train the algorithms included acute phase reactant levels, Clinical Disease Activity Index, Health Assessment Questionnaire-Disability Index, and several clinical characteristics. Extreme gradient boosting showed the best performance (accuracy, 72.7%; precision, 73.2%; recall, 68.1%), outperforming random forest (accuracy, 65.9%; precision, 65.6%; recall, 59.3%), LR (accuracy, 64.9%; precision, 62.6%; recall, 61.9%), and K-nearest neighbors (accuracy, 63%; precision, 61.5%; recall, 54.8%).

Conclusions: We showed that ML models can be used to predict sustained remission in RA patients on bDMARDs. Furthermore, our method relies on only a few easy-to-collect patient attributes. Our results are promising but need to be tested in longitudinal cohort studies.
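
As an illustration of the evaluation described in the Methods, the sketch below shows how 10-fold splits and the three reported metrics (accuracy, precision, recall) can be computed. It is a minimal sketch and not the authors' code: the function names `evaluate` and `k_fold_indices`, the `Metrics` class, and the toy labels are all hypothetical, the folds are contiguous and unshuffled, and the abstract does not state whether the original folds were stratified or randomized.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels.

    Assumption: label 1 means sustained remission, label 0 means no remission.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(accuracy, precision, recall)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy labels, invented purely for illustration (not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(evaluate(y_true, y_pred))

# Fold sizes for a cohort of 367 patients split into 10 folds.
print([len(test) for _, test in k_fold_indices(367, k=10)])
```

In practice, each model would be fitted on `train_idx` and scored on `test_idx` for every fold, with the per-fold metrics averaged before comparing algorithms.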

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/414474