Optimization Accuracy of Software Defect Prediction using Random Forest Machine Learning Algorithm
Abstract
Software defect prediction is a critical task in software engineering that aims to
identify faulty components early in the development lifecycle, thereby reducing costs and
improving software quality. This study explores the effectiveness of the Random Forest
machine learning algorithm for predicting software defects, with a particular focus on
optimizing model accuracy. The Random Forest algorithm, known for its robustness and ability
to handle high-dimensional data, is applied to benchmark software defect datasets. The study
involves extensive preprocessing, including feature selection and normalization, followed by
hyperparameter tuning to enhance prediction performance. Evaluation metrics such as
accuracy, precision, recall, F1-score, and AUC-ROC are used to assess the model's
effectiveness. Experimental results demonstrate that the optimized Random Forest model
achieves high predictive accuracy and outperforms several baseline models. This work
highlights the potential of ensemble learning methods, particularly Random Forest, as a
dependable approach for software defect prediction, helping developers build more reliable and
maintainable software systems.
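
The workflow summarized in the abstract (normalization, feature selection, hyperparameter tuning, and evaluation with accuracy, precision, recall, F1-score, and AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, replaces the benchmark defect datasets with a synthetic dataset, and uses an illustrative parameter grid, a min-max scaler, and a univariate feature selector that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for a defect dataset (assumption, not real data):
# 20 numeric "metrics" per module, about 20% of modules labelled defective.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Preprocessing and model chained in one pipeline, so scaling and feature
# selection are re-fitted on the training part of every cross-validation fold.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),                      # normalization
    ("select", SelectKBest(score_func=f_classif)),  # feature selection
    ("forest", RandomForestClassifier(random_state=0)),
])

# Illustrative grid only; the values are assumptions, not the paper's settings.
param_grid = {
    "select__k": [5, 10],
    "forest__n_estimators": [50, 100],
    "forest__max_depth": [None, 5],
}

# Hyperparameter tuning by 3-fold cross-validated grid search on F1.
search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=3)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
y_pred = search.predict(X_test)
y_score = search.predict_proba(X_test)[:, 1]  # probability of "defective"

results = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auc_roc": roc_auc_score(y_test, y_score),
}
print("best parameters:", search.best_params_)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Tuning on F1 rather than accuracy is a design choice of this sketch: defect datasets are usually imbalanced, so accuracy alone can look high even when few defective modules are found.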

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.