The current guidelines recommend the sentinel lymph node biopsy to evaluate the lymph node involvement for breast cancer patients with clinically negative lymph nodes on clinical or radiological examination. Machine learning (ML) models have significantly improved the prediction of lymph nodes status based on clinical features, thus avoiding expensive, time-consuming and invasive procedures. However, the classification of sentinel lymph node status represents a typical example of an unbalanced classification problem. In this work, we developed a ML framework to explore the effects of unbalanced populations on the performance and stability of feature ranking for sentinel lymph node status classification in breast cancer. Our results indicate state-of-the-art AUC (Area under the Receiver Operating Characteristic curve) values on a hold-out set ((Formula presented.)) while providing particularly stable features related to tumor size, histological subtype and estrogen receptor expression, which should therefore be considered as potential biomarkers.
Accurate Evaluation of Feature Contributions for Sentinel Lymph Node Status Classification in Breast Cancer
Lombardi A.;Amoroso N.;Bellantuono L.;Bove S.;Fanizzi A.;La Forgia D.;Monaco A.;Tangaro S.;Zito F. A.;Bellotti R.;Massafra R.
2022-01-01
Abstract
The current guidelines recommend the sentinel lymph node biopsy to evaluate the lymph node involvement for breast cancer patients with clinically negative lymph nodes on clinical or radiological examination. Machine learning (ML) models have significantly improved the prediction of lymph nodes status based on clinical features, thus avoiding expensive, time-consuming and invasive procedures. However, the classification of sentinel lymph node status represents a typical example of an unbalanced classification problem. In this work, we developed a ML framework to explore the effects of unbalanced populations on the performance and stability of feature ranking for sentinel lymph node status classification in breast cancer. Our results indicate state-of-the-art AUC (Area under the Receiver Operating Characteristic curve) values on a hold-out set ((Formula presented.)) while providing particularly stable features related to tumor size, histological subtype and estrogen receptor expression, which should therefore be considered as potential biomarkers.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.