Journal of Cyber Security and Risk Auditing

ISSN: 3079-5354 (Online)

Publishing model: Open access

Article

A Comparative Analysis of Machine Learning Models on Detecting Malware in Android Devices

by Alagesh Leroy Chandran; Joshua Samual; Seyedmostafa Safavi; Aitizaz Ali

Published: 2025

Abstract

Given the recent increase in cyberattacks, malware detection remains a critical component of Android device security. Traditional signature-based methods are effective against known types of malware, but their limitations highlight the need for more advanced techniques such as machine learning. This study provides a detailed comparison of machine learning methods for Android malware detection, focusing on their effectiveness and limitations. The models evaluated include logistic regression, support vector machines, random forests, and XGBoost. Through comprehensive experiments, we assess the models on metrics such as accuracy, precision, recall, and false positive rate. The results reveal clear advantages and disadvantages among the algorithms and offer insight into their practical applications. This paper underscores the potential of machine learning to enhance malware detection on Android while highlighting key areas for further research and improvement. Our findings support the continued development of robust and adaptable cybersecurity solutions for the Android environment, emphasizing the role of machine learning in defending against evolving malware threats.
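For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed from binary predictions. It is not the authors' evaluation code. The label encoding (1 = malware, 0 = benign), the helper names `count_outcomes` and `evaluate`, and the toy label vectors are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # malware correctly flagged as malware
    fp: int  # benign apps wrongly flagged as malware
    tn: int  # benign apps correctly passed as benign
    fn: int  # malware missed (predicted benign)


def count_outcomes(y_true, y_pred):
    """Tally confusion-matrix cells. Assumed encoding: 1 = malware, 0 = benign."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth not in (0, 1) or pred not in (0, 1):
            raise ValueError("labels must be 0 (benign) or 1 (malware)")
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(counts):
    """Compute accuracy, precision, recall, and false positive rate."""
    total = counts.tp + counts.fp + counts.tn + counts.fn
    return {
        "accuracy": _ratio(counts.tp + counts.tn, total),
        "precision": _ratio(counts.tp, counts.tp + counts.fp),
        "recall": _ratio(counts.tp, counts.tp + counts.fn),
        "false_positive_rate": _ratio(counts.fp, counts.fp + counts.tn),
    }


# Toy labels, invented for illustration; they are not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

for name, value in evaluate(count_outcomes(y_true, y_pred)).items():
    print(f"{name}: {value:.2f}")
```

Reporting the false positive rate alongside recall matters in this setting because a detector that flags many benign apps can be impractical even when it misses little malware.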

Keywords

Android malware detection; Machine learning; Malware detection techniques; Logistic Regression; Support vector machines; Random forests; XGBoost; Signature-based methods; Practical applications of machine learning
