Dealing with Class Imbalance in Android Malware Detection by Cascading Clustering and Classification
Andresini G.; Appice A.; Malerba D.
2020-01-01
Abstract
The large number of Android devices active around the world makes the platform an appealing target for malware attacks. Malware is shorthand for malicious applications developed by cyber attackers with the intention of gaining access to, or causing damage to, a computing device or network, often while the victim remains unaware that a compromise has occurred. Android security therefore requires machine learning approaches that flag malicious applications quickly and accurately. This paper describes a supervised learning approach for classifying Android applications as genuine or malicious. It uses reverse engineering to look for dangerous capabilities in the application code and structure before the application is executed, and it applies a combination of clustering and classification to deal with the imbalanced data problem and to avoid a detection system that is skewed towards modeling the genuine applications. We use benchmark Android applications to assess whether the presented approach correctly detects malware applications. The significance of the computed detection patterns is evaluated using established machine learning metrics.
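To illustrate why class imbalance matters for evaluation, the following minimal sketch scores a trivial detector that labels every application as genuine. The abstract does not name the metrics used, so accuracy, precision, recall and F1 are assumed here as common choices. The 95/5 class split, the label convention (1 = malicious, 0 = genuine) and the function names are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced sample: 95 genuine (0) and 5 malicious (1) applications.
y_true = [0] * 95 + [1] * 5
# A degenerate detector that always predicts "genuine".
always_genuine = [0] * 100

result = scores(y_true, always_genuine)
print(result)
```

In this constructed sample the degenerate detector is right on every genuine application and wrong on every malicious one, so its accuracy is high while its recall on the malware class is zero. Metrics computed on the minority class expose the skew that overall accuracy hides.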