Defect Predictor

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies


The experts below are selected from a list of 72 experts worldwide, ranked by the ideXlab platform

Bingwu Fang - One of the best experts on this subject based on the ideXlab platform.

  • Evaluating Data Filter on Cross-Project Defect Prediction: Comparison and Improvements
    IEEE Access, 2017
    Co-Authors: Zhiqiu Huang, Yong Wang, Bingwu Fang
    Abstract:

    Cross-project defect prediction (CPDP) is a field of study in which a software project lacking sufficient local data can use data from other projects to build defect predictors. To support CPDP, the cross-project data must be carefully filtered before being applied locally. Researchers have devised and implemented a plethora of data filters to improve CPDP performance, yet it remains unclear which data filter strategy is most effective, both in general and in specific settings. The objective of this paper is to provide an extensive comparison of well-known data filters and a novel filter devised in this paper. We perform experiments on 44 releases of 14 open-source projects, using Naive Bayes and a support vector machine as the underlying classifiers. The results demonstrate that data filtering significantly improves the performance of cross-project defect prediction, and that the proposed hierarchical select-based filter performs significantly better. Moreover, with an appropriate data filter strategy, a defect predictor built from cross-project data can outperform a predictor learned from within-project data.
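
    The paper's hierarchical select-based filter is not detailed in the abstract, but the general idea of a cross-project data filter can be sketched with a simple nearest-neighbor filter: keep only source-project instances that resemble the target project. The function name and the choice of k below are illustrative assumptions, not the paper's method.

    ```python
    import numpy as np

    def nn_filter(source_X, source_y, target_X, k=10):
        """Keep only cross-project instances near the target data.

        For each target instance, select its k nearest source instances
        (Euclidean distance over the metric space); the union of these
        selections forms the filtered training set.
        """
        selected = set()
        for t in target_X:
            dists = np.linalg.norm(source_X - t, axis=1)
            selected.update(np.argsort(dists)[:k].tolist())
        idx = sorted(selected)
        return source_X[idx], source_y[idx]
    ```

    A classifier such as Naive Bayes would then be trained on the filtered `(source_X, source_y)` instead of the raw cross-project data, discarding instances far from the target project's metric profile.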

Chengzu Bai - One of the best experts on this subject based on the ideXlab platform.

  • A novel Bayes Defect Predictor based on information diffusion function
    Knowledge-Based Systems, 2018
    Co-Authors: Song Huang, Changyou Zheng, Chengzu Bai
    Abstract:

    Software defect prediction plays a significant part in identifying the most defect-prone modules before software testing, and many researchers have worked to improve prediction accuracy. However, the problem of insufficient historical data, whether within-project or cross-project, remains unresolved. Furthermore, it is common practice to use the probability density function of a normal distribution in the Naive Bayes (NB) classifier; yet, after performing a Kolmogorov–Smirnov test, we find that the 21 main software metrics are not normally distributed at the 5% significance level. This paper therefore proposes a new Bayes classifier, which extends the NB classifier with a non-normal information diffusion function, to help address the lack of appropriate training data for new projects. We conduct three experiments on 34 data sets obtained from 10 open-source projects, using only 10%, 6.67%, 5%, 3.33%, and 2% of the total data for training, respectively. Four well-known classification algorithms are included for comparison: Logistic Regression, Naive Bayes, Random Tree, and Support Vector Machine. All experimental results demonstrate the efficiency and practicability of the new classifier.
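
    The core move of the paper — replacing the single fitted normal density in Naive Bayes with a density diffused from each training sample — can be sketched as follows. This is a minimal sketch using a Gaussian kernel density as a stand-in for the paper's information diffusion function; the class name, smoothing rule, and all identifiers are illustrative assumptions.

    ```python
    import numpy as np

    def diffusion_density(x, samples, h=None):
        """Estimate p(x) by diffusing each training sample with a kernel,
        rather than fitting one normal distribution to all samples."""
        samples = np.asarray(samples, dtype=float)
        if h is None:
            # Silverman's rule of thumb for the diffusion width
            h = 1.06 * samples.std() * len(samples) ** -0.2 + 1e-9
        return np.mean(np.exp(-0.5 * ((x - samples) / h) ** 2)
                       / (h * np.sqrt(2 * np.pi)))

    class DiffusionNB:
        """Naive Bayes with a diffused (non-normal) per-feature density."""

        def fit(self, X, y):
            self.classes_ = np.unique(y)
            self.priors_ = {c: np.mean(y == c) for c in self.classes_}
            self.samples_ = {c: X[y == c] for c in self.classes_}
            return self

        def predict(self, X):
            preds = []
            for x in X:
                scores = {}
                for c in self.classes_:
                    # Naive independence assumption: sum per-feature log-densities
                    log_lik = sum(
                        np.log(diffusion_density(xj, self.samples_[c][:, j]) + 1e-12)
                        for j, xj in enumerate(x))
                    scores[c] = np.log(self.priors_[c]) + log_lik
                preds.append(max(scores, key=scores.get))
            return np.array(preds)
    ```

    Because each training sample contributes its own local bump of probability mass, the estimated density can follow skewed or multimodal metric distributions, which is the motivation given for abandoning the normality assumption.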

Zhiqiu Huang - One of the best experts on this subject based on the ideXlab platform.

  • Evaluating Data Filter on Cross-Project Defect Prediction: Comparison and Improvements
    IEEE Access, 2017
    Co-Authors: Zhiqiu Huang, Yong Wang, Bingwu Fang
    Abstract:

    Cross-project defect prediction (CPDP) is a field of study in which a software project lacking sufficient local data can use data from other projects to build defect predictors. To support CPDP, the cross-project data must be carefully filtered before being applied locally. Researchers have devised and implemented a plethora of data filters to improve CPDP performance, yet it remains unclear which data filter strategy is most effective, both in general and in specific settings. The objective of this paper is to provide an extensive comparison of well-known data filters and a novel filter devised in this paper. We perform experiments on 44 releases of 14 open-source projects, using Naive Bayes and a support vector machine as the underlying classifiers. The results demonstrate that data filtering significantly improves the performance of cross-project defect prediction, and that the proposed hierarchical select-based filter performs significantly better. Moreover, with an appropriate data filter strategy, a defect predictor built from cross-project data can outperform a predictor learned from within-project data.

John Yearwood - One of the best experts on this subject based on the ideXlab platform.

  • A Parallel Framework for Software Defect Detection and Metric Selection on Cloud Computing
    Cluster Computing, 2017
    Co-Authors: Mohsin Ali, Shamsul Huda, Jemal H Abawajy, Sultan Alyahya, Hmood Aldossari, John Yearwood
    Abstract:

    With the continued growth of the Internet of Things (IoT) and its convergence with the cloud, numerous interoperable software services are being developed for the cloud, creating a growing demand to maintain software quality for improved service. This is all the more crucial as the cloud environment moves quickly towards a hybrid model that combines public and private clouds. Given the high volume of software available as a service (SaaS) in the cloud, identifying non-standard software and measuring its quality is an urgent issue. Manual testing and quality assessment are very expensive and, to a large extent, infeasible. An automated software defect detection model that can measure the relative quality of software and identify its faulty components can significantly reduce software development effort and improve the cloud service. In this paper, we propose a software defect detection model that identifies faulty components in big software metric data. The novelty of our approach is that it identifies significant metrics using a combination of filter and wrapper techniques. An important contribution is the design and evaluation of a parallel framework for a hybrid software defect predictor, which handles big software metric data in a computationally efficient way for the cloud environment. Two hybrids have been developed, using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN)-based wrapper, in the parallel framework. The evaluations are performed on real defect-prone software datasets for all parallel versions. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster, with higher defect prediction accuracy and fewer software metrics than independent filter or wrapper approaches.
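
    The filter stage of such a hybrid pipeline can be sketched in miniature: a Fisher-score filter ranks each software metric by how well it separates defective from clean modules, and the per-metric scoring is dispatched to a worker pool. The function names, the `top_k` parameter, and the thread-based parallelism below are illustrative assumptions; the paper runs on a computer cluster and adds an ANN-based wrapper search on top of the filtered metric set.

    ```python
    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def fisher_score_column(col, y):
        """Fisher score of one metric: between-class scatter over
        within-class scatter; higher means more discriminative."""
        classes = np.unique(y)
        mu = col.mean()
        num = sum(np.sum(y == c) * (col[y == c].mean() - mu) ** 2 for c in classes)
        den = sum(np.sum(y == c) * col[y == c].var() for c in classes) + 1e-12
        return num / den

    def parallel_fisher_filter(X, y, top_k=5, workers=4):
        """Score every metric in parallel and keep the top_k columns,
        mimicking the filter stage that precedes the wrapper search."""
        with ThreadPoolExecutor(max_workers=workers) as ex:
            scores = list(ex.map(lambda j: fisher_score_column(X[:, j], y),
                                 range(X.shape[1])))
        keep = np.argsort(scores)[::-1][:top_k]
        return np.sort(keep)
    ```

    In the full hybrid design, the surviving metrics would then be refined by a wrapper that trains a classifier (an ANN in the paper) on candidate subsets, with the per-subset evaluations also distributed across workers.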

Song Huang - One of the best experts on this subject based on the ideXlab platform.

  • A novel Bayes Defect Predictor based on information diffusion function
    Knowledge-Based Systems, 2018
    Co-Authors: Song Huang, Changyou Zheng, Chengzu Bai
    Abstract:

    Software defect prediction plays a significant part in identifying the most defect-prone modules before software testing, and many researchers have worked to improve prediction accuracy. However, the problem of insufficient historical data, whether within-project or cross-project, remains unresolved. Furthermore, it is common practice to use the probability density function of a normal distribution in the Naive Bayes (NB) classifier; yet, after performing a Kolmogorov–Smirnov test, we find that the 21 main software metrics are not normally distributed at the 5% significance level. This paper therefore proposes a new Bayes classifier, which extends the NB classifier with a non-normal information diffusion function, to help address the lack of appropriate training data for new projects. We conduct three experiments on 34 data sets obtained from 10 open-source projects, using only 10%, 6.67%, 5%, 3.33%, and 2% of the total data for training, respectively. Four well-known classification algorithms are included for comparison: Logistic Regression, Naive Bayes, Random Tree, and Support Vector Machine. All experimental results demonstrate the efficiency and practicability of the new classifier.