A Text-Based Regression Approach to Predict Bug-Fix Time
Ardimento P.; Boffoli N.
2020-01-01
Abstract
Predicting bug-fixing time can help project managers allocate adequate resources when assigning bugs. In this work, we tackle the problem of predicting bug-fixing time with a multiple regression analysis that uses the textual information extracted from bug reports as predictor variables. Our model selects exactly the features that are useful for prediction, with the help of statistical procedures such as Principal Component Analysis (PCA). To validate our model, we performed an empirical investigation using the bug reports of four well-known open-source projects whose bugs are stored in Bugzilla installations; Bugzilla is an online open-source Bug Tracking System (BTS). For each project, we built regression models using the M5P model tree, Support Vector Machine (SVM) and Random Forests algorithms. Experimental results show that the model is effective: its results are slightly better than all those known in the literature. In the future, we will use and compare other regression approaches to select the best one for a specific data set.
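The pipeline outlined in the abstract (text features, PCA-based reduction, then a regressor) can be illustrated with a minimal sketch. It assumes scikit-learn, which the record does not mention; the toy bug reports, the fix times, the TF-IDF weighting, the number of components and all hyperparameters are hypothetical and chosen only for illustration. M5P is omitted because scikit-learn does not provide it, so only the SVM and Random Forests regressors are shown.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVR

# Hypothetical toy data: bug report text and the days it took to fix each bug.
reports = [
    "Crash when opening the preferences dialog",
    "Typo in the about dialog text",
    "Memory leak while rendering large images",
    "Toolbar icons are misaligned on startup",
    "Crash on startup with a corrupted profile",
    "Slow rendering of large tables",
    "Wrong tooltip text in the preferences dialog",
    "Memory leak after closing many tabs",
]
days_to_fix = [12.0, 1.0, 30.0, 3.0, 15.0, 20.0, 2.0, 25.0]


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix to a dense array, which PCA needs here."""
    return matrix.toarray()


def build_model(regressor, n_components=3):
    """Text -> TF-IDF features -> PCA projection -> regressor (an assumed design)."""
    return make_pipeline(
        TfidfVectorizer(stop_words="english"),
        FunctionTransformer(to_dense),
        PCA(n_components=n_components, random_state=0),
        regressor,
    )


# One model per algorithm, mirroring the comparison described in the abstract.
models = {
    "random_forest": build_model(
        RandomForestRegressor(n_estimators=50, random_state=0)
    ),
    "svm": build_model(SVR(kernel="rbf", C=10.0)),
}

new_report = ["Crash when rendering the preferences dialog"]
predictions = {}
for name, model in models.items():
    model.fit(reports, days_to_fix)
    predictions[name] = model.predict(new_report)
    print(name, predictions[name])
```

On real data, the models would be compared on held-out bug reports rather than on the training set, and the number of retained components would be chosen from the explained variance.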