SOFTWARE DEFECT PREDICTION SYSTEM BASED ON WELL-TUNED RANDOM FOREST TECHNIQUE
Rubrics: ARTICLES
Abstract and keywords
Abstract (English):
Software quality is the main criterion for increasing user demand for software. Software companies therefore seek to ensure quality by predicting software defects during the testing phase. An intelligent system capable of predicting software defects greatly reduces the time and effort spent. Despite the strong trend in the last few years toward software defect prediction systems based on Machine Learning (ML) techniques, the accuracy of these systems is still a major challenge. This study therefore presents a three-stage software defect prediction system designed to improve prediction accuracy. In the first stage, data pre-processing is performed, comprising data cleaning, data balancing, data normalization, and feature selection. In the second stage, the hyperparameters of the ML techniques are tuned using the Grid Search technique. In the final stage, the well-tuned ML technique is applied to predict software defects. Performance experiments were carried out on the JM1 dataset, where the proposed system achieved promising results in predicting software defects. Among the ML techniques used, a well-tuned Random Forest (RF) technique outperformed the others, as well as the techniques reported in previous works, with an accuracy of 88.26%. This study shows that selecting important features and tuning the hyperparameters of ML techniques efficiently significantly improve the accuracy of software defect prediction.

Keywords:
machine learning, Random Forest, software defects, feature selection, prediction
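
The three stages described in the abstract (pre-processing, Grid Search tuning, prediction with the tuned model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of JM1, substitutes `class_weight="balanced"` for the unspecified data-balancing step, omits data cleaning, and picks `SelectKBest` with `f_classif` and the parameter grid purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in for a defect dataset (assumption: not JM1).
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: pre-processing (normalization and feature selection).
# Class weighting stands in for the balancing step (assumption).
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("rf", RandomForestClassifier(class_weight="balanced", random_state=0)),
])

# Stage 2: Grid Search over a small, hypothetical hyperparameter grid.
param_grid = {
    "select__k": [8, 12],
    "rf__n_estimators": [50, 100],
    "rf__max_depth": [None, 10],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Stage 3: predict with the best (refit) pipeline on held-out data.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Placing the scaler and feature selector inside the pipeline means they are refit on each cross-validation training fold, so the tuning scores are not inflated by information from the validation folds.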