Anomalies Detection in Gene Expression Matrices: Towards a New Approach
Nicoletta Buono;Flavia Esposito;Laura Selicato
2021-01-01
Abstract
One of the main problems in analyzing real data is the presence of anomalies. Anomalous cases may spoil the resulting analysis, yet they may also contain valuable information. In both cases, the ability to detect these occurrences is very important. In the biomedical field in particular, properly identifying outliers makes it possible to develop novel biological hypotheses that would otherwise be overlooked when experimental biological data are analyzed. In this paper, we address the problem of detecting outlier samples in gene expression data. We propose an ensemble approach to anomaly detection in gene expression matrices, based on hierarchical clustering and Robust Principal Component Analysis, which allows us to derive a novel pseudo-mathematical classification of anomalies.
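The abstract names the two ingredients of the ensemble but not how they are combined, so the following is only a minimal illustrative sketch and not the authors' procedure. It assumes a genes-by-samples matrix, flags samples that end up in very small clusters after average-linkage hierarchical clustering, flags samples whose column in the sparse part of a Robust PCA decomposition (Principal Component Pursuit, solved with a basic ADMM loop) has an unusually large norm, and then labels each flagged sample by which detector found it. All function names, the cluster count, the robust z-score cutoff and the three labels are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def soft_threshold(M, tau):
    """Elementwise shrinkage operator used by both proximal steps."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def robust_pca(X, n_iter=300, tol=1e-7):
    """Principal Component Pursuit via ADMM: X = L (low rank) + S (sparse)."""
    m, n = X.shape
    lam = 1.0 / np.sqrt(max(m, n))        # commonly used default weight
    mu = m * n / (4.0 * np.abs(X).sum())  # commonly used default penalty
    S = np.zeros_like(X)
    Y = np.zeros_like(X)
    for _ in range(n_iter):
        # Singular value thresholding updates the low-rank part.
        U, s, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Entrywise shrinkage updates the sparse part.
        S = soft_threshold(X - L + Y / mu, lam / mu)
        R = X - L - S
        Y += mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(X):
            break
    return L, S


def flag_by_clustering(X, n_clusters=2, max_size=1):
    """Flag samples (columns) that fall in clusters of at most max_size."""
    Z = linkage(X.T, method="average", metric="euclidean")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    sizes = np.bincount(labels)
    return np.flatnonzero(sizes[labels] <= max_size)


def flag_by_rpca(X, cutoff=3.5):
    """Flag samples whose sparse-component column norm is a robust outlier."""
    _, S = robust_pca(X)
    scores = np.linalg.norm(S, axis=0)
    med = np.median(scores)
    mad = np.median(np.abs(scores - med))
    return np.flatnonzero(scores > med + cutoff * 1.4826 * mad)


def classify_anomalies(X):
    """Hypothetical labeling: which detector(s) flagged each sample."""
    hc = flag_by_clustering(X)
    rp = flag_by_rpca(X)
    return {
        "both": np.intersect1d(hc, rp),
        "clustering_only": np.setdiff1d(hc, rp),
        "rpca_only": np.setdiff1d(rp, hc),
    }


# Synthetic demo (not real expression data): a rank-one matrix with small
# noise, where sample 5 is perturbed on the first ten genes.
rng = np.random.default_rng(0)
genes, samples = 40, 12
X = np.outer(np.linspace(1, 2, genes), np.linspace(1, 1.5, samples))
X += 0.01 * rng.standard_normal(X.shape)
X[:10, 5] += 5.0

result = classify_anomalies(X)
print({name: idx.tolist() for name, idx in result.items()})
```

Treating a sample as most suspicious when both detectors agree is one simple way to combine them; with real expression data the number of clusters, the linkage, and the cutoff would all need to be tuned, and log-transforming or normalizing the matrix beforehand is likely to matter.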