A Comparative Analysis of Machine Learning Models on Detecting Malware in Android Devices
Alagesh Leroy Chandran
Joshua Samual
Seyedmostafa Safavi
Aitizaz Ali
Published: 2025
Abstract
Given the recent increase in cyberattacks, malware detection remains a critical component of Android device security. Traditional signature-based methods are effective against known types of malware but struggle with new or modified variants, which highlights the need for more advanced techniques such as machine learning. This study provides a detailed comparison of machine learning methods for Android malware detection, focusing on their effectiveness and limitations. We evaluate different models, including logistic regression, support vector machines, random forests, and XGBoost. Through comprehensive experiments, we assess the models using metrics such as accuracy, precision, recall, and false positive rate. The results reveal clear advantages and disadvantages among the algorithms, offering insights into their practical application. The paper underscores the potential of machine learning to enhance Android malware detection while highlighting key areas for further research and improvement. Our findings support the continued development of robust and adaptable cybersecurity solutions for the Android environment, emphasizing the critical role of machine learning in defending against evolving malware threats.
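The four evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: it assumes labels are encoded as 1 for malware and 0 for benign, and the names `ConfusionCounts`, `count_outcomes`, and `evaluate` are hypothetical helpers introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # malware correctly flagged as malware
    fp: int  # benign apps wrongly flagged as malware
    tn: int  # benign apps correctly passed as benign
    fn: int  # malware missed (predicted benign)


def count_outcomes(y_true, y_pred):
    """Tally confusion-matrix cells (assumed encoding: 1 = malware, 0 = benign)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError(f"labels must be 0 or 1, got {truth!r}, {pred!r}")
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Compute accuracy, precision, recall, and false positive rate."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
    }
```

For example, `evaluate(count_outcomes([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1]))` yields an accuracy, precision, and recall of 0.75 and a false positive rate of 0.25. Reporting the false positive rate alongside recall matters in this setting because a detector that flags many benign apps can be impractical even when its accuracy looks high.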
A Comparative Analysis of Machine Learning Models on Detecting Malware in Android Devices is licensed under CC BY 4.0