Software Defect

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 360 Experts worldwide ranked by the ideXlab platform

Zhihua Zhou - One of the best experts on this subject based on the ideXlab platform.

  • towards one reusable model for various Software Defect mining tasks
    Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2019
    Co-Authors: Zhihua Zhou
    Abstract:

    Software Defect mining is playing an important role in Software quality assurance. Many deep neural network based models have been proposed for Software Defect mining tasks, and have pushed forward the state-of-the-art mining performance. These deep models usually require a huge amount of task-specific source code for training to capture the code functionality to mine the Defects, but this requirement is often hard to satisfy in practice. On the other hand, lots of free source code and corresponding textual explanations are publicly available in the open source Software repositories and are potentially useful in modeling code functionality. However, no previous study has leveraged these resources to help Defect mining tasks. In this paper, we propose a novel framework that learns one reusable deep model for code functional representation from the huge amount of publicly available task-free source code and its textual explanations, and then reuses it for various Software Defect mining tasks. Experimental results on three major Defect mining tasks with real world datasets indicate that reusing this model in specific tasks outperforms its counterpart that learns deep models from scratch, especially when the training data is insufficient.

  • sample based Software Defect prediction with active and semi supervised learning
    Automated Software Engineering, 2012
    Co-Authors: Hongyu Zhang, Zhihua Zhou
    Abstract:

    Software Defect prediction can help us better understand and control Software quality. Current Defect prediction techniques are mainly based on a sufficient amount of historical project data. However, historical data is often not available for new projects and for many organizations. In this case, effective Defect prediction is difficult to achieve. To address this problem, we propose sample-based methods for Software Defect prediction. For a large Software system, we can select and test a small percentage of modules, and then build a Defect prediction model to predict the Defect-proneness of the rest of the modules. In this paper, we describe three methods for selecting a sample: random sampling with conventional machine learners, random sampling with a semi-supervised learner, and active sampling with an active semi-supervised learner. To facilitate the active sampling, we propose a novel active semi-supervised learning method, ACoForest, which is able to sample the modules that are most helpful for learning a good prediction model. Our experiments on PROMISE datasets show that the proposed methods are effective and have the potential to be applied to industrial practice.

  • Software Defect detection with rocus
    Journal of Computer Science and Technology, 2011
    Co-Authors: Yuan Jiang, Zhihua Zhou
    Abstract:

    Software Defect detection aims to automatically identify Defective Software modules for efficient Software test in order to improve the quality of a Software system. Although many machine learning methods have been successfully applied to the task, most of them fail to consider two practical yet important issues in Software Defect detection. First, it is rather difficult to collect a large amount of labeled training data for learning a well-performing model; second, in a Software system there are usually far fewer Defective modules than Defect-free modules, so learning would have to be conducted over an imbalanced data set. In this paper, we address these two practical issues simultaneously by proposing a novel semi-supervised learning approach named ROCUS. This method exploits the abundant unlabeled examples to improve the detection accuracy, and employs under-sampling to tackle the class-imbalance problem in the learning process. Experimental results of real-world Software Defect detection tasks show that ROCUS is effective for Software Defect detection. Its performance is better than a semi-supervised learning method that ignores the class-imbalance nature of the task and a class-imbalance learning method that does not make effective use of unlabeled data.
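
The ROCUS abstract above names under-sampling as its remedy for class imbalance. The following is a minimal sketch of plain random under-sampling only, not of ROCUS itself (the semi-supervised part is omitted). The function name `undersample`, the `(features, label)` pairs, and the encoding of label 1 as defective are assumptions made for illustration.

```python
import random


def undersample(modules, seed=0):
    """Randomly drop majority-class modules until both classes are the same size.

    `modules` is a list of (features, label) pairs; label 1 marks a defective
    module and label 0 a defect-free one (an assumed encoding).
    """
    rng = random.Random(seed)
    defective = [m for m in modules if m[1] == 1]
    clean = [m for m in modules if m[1] == 0]
    minority, majority = sorted((defective, clean), key=len)
    # Keep every minority example and an equally sized random subset of the majority.
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Toy data: 3 defective modules among 20, with made-up metric vectors.
toy = [((i, i % 7), 1 if i < 3 else 0) for i in range(20)]
balanced = undersample(toy)
```

Discarding majority examples loses information, which is presumably one reason the abstract pairs under-sampling with the use of unlabeled data.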

Shihai Wang - One of the best experts on this subject based on the ideXlab platform.

  • a novel Software Defect prediction based on atomic class association rule mining
    Expert Systems With Applications, 2018
    Co-Authors: Yuanxun Shao, Bin Liu, Shihai Wang
    Abstract:

    To ensure the rational allocation of Software testing resources and reduce costs, Software Defect prediction has drawn notable attention to many “white-box” and “black-box” classification algorithms. Although there have been lots of studies on using Software product metrics to identify Defect-prone modules, Defect prediction algorithms are still worth exploring. For instance, it is not easy to directly implement the Apriori algorithm to classify Defect-prone modules across a skewed dataset. Therefore, we propose a novel supervised approach for Software Defect prediction based on atomic class-association rule mining (ACAR). An atomic rule has only one feature in the antecedent and a unique class label in the consequent; it is a specific kind of association rule that explores the relationship between attributes and categories. Such association patterns can provide meaningful knowledge that can be easily understood by Software engineers. A new Software Defect prediction model infrastructure based on association rules is employed to improve the prediction of Defect-prone modules, which is divided into data preprocessing, rule model building and performance evaluation. Moreover, ACAR can achieve a satisfactory classification performance compared with seven other benchmark learners (the extension of classification based on associations (CBA2), Support Vector Machine, Naive Bayesian, Decision Tree, OneR, K-nearest Neighbors and RIPPER) on NASA MDP and PROMISE datasets. In light of Software Defect associative prediction, a comparative experiment between ACAR and CBA2 is discussed in detail. It is demonstrated that ACAR is better than CBA2 in terms of AUC, G-mean, Balance, and understandability. In addition, the average AUC of ACAR is increased by 2.9% compared with CBA2, and can reach 81.1%.

  • Software Defect prediction using stacked denoising autoencoders and two stage ensemble learning
    Information & Software Technology, 2017
    Co-Authors: Haonan Tong, Bin Liu, Shihai Wang
    Abstract:

    Context: Software Defect prediction (SDP) plays an important role in allocating testing resources reasonably, reducing testing costs, and ensuring Software quality. However, Software metrics used for SDP are almost entirely traditional features compared with deep representations (DPs) from deep learning. Although stacked denoising autoencoders (SDAEs) are powerful for feature learning and have been successfully applied in other fields, to the best of our knowledge, they have not been investigated in the field of SDP. Meanwhile, class-imbalance is still a pressing problem needing to be addressed. Objective: In this paper, we propose a novel SDP approach, SDAEsTSE, which takes advantage of SDAEs and ensemble learning, namely the proposed two-stage ensemble (TSE). Method: Our method mainly includes two phases: the deep learning phase and the two-stage ensemble (TSE) phase. We first use SDAEs to extract the DPs from the traditional Software metrics, and then a novel ensemble learning approach, TSE, is proposed to address the class-imbalance problem. Results: Experiments are performed on 12 NASA datasets to demonstrate the effectiveness of DPs, the proposed TSE, and SDAEsTSE, respectively. The performance is evaluated in terms of F-measure, the area under the curve (AUC), and Matthews correlation coefficient (MCC). Generally, DPs, TSE, and SDAEsTSE contribute to significantly higher performance compared with corresponding traditional metrics, classic ensemble methods, and benchmark SDP models. Conclusions: It can be concluded that (1) deep representations are promising for SDP compared with traditional Software metrics, (2) TSE is more effective for addressing the class-imbalance problem in SDP compared with classic ensemble learning methods, and (3) the proposed SDAEsTSE is significantly effective for SDP.
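
The two abstracts above report results as F-measure, G-mean and MCC, among other measures. As a reading aid, here is a small sketch of how these three are commonly computed from a binary confusion matrix. The function name `defect_metrics` and the example counts are made up, and G-mean is taken in its usual sense as the geometric mean of recall and specificity; neither paper's exact evaluation code is reproduced here.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Return F-measure, G-mean and MCC for a binary confusion matrix.

    tp/fn count defective modules predicted defective/clean;
    fp/tn count defect-free modules predicted defective/clean.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"f_measure": f_measure, "g_mean": g_mean, "mcc": mcc}


# Hypothetical counts for 100 modules, 20 of them defective.
scores = defect_metrics(tp=12, fp=8, tn=72, fn=8)
```

All three measures use both error types, which makes them less misleading than plain accuracy when defective modules are rare.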

Jin Liu - One of the best experts on this subject based on the ideXlab platform.

  • ldfr learning deep feature representation for Software Defect prediction
    Journal of Systems and Software, 2019
    Co-Authors: Jacky Keung, Jin Liu, Xiapu Luo, Yifeng Zhang, Tao Zhang, Yutian Tang
    Abstract:

    Software Defect Prediction (SDP) aims to detect Defective modules to enable the reasonable allocation of testing resources, which is an economically critical activity in Software quality assurance. Learning effective feature representation and addressing class imbalance are two main challenges in SDP. Ideally, the more discriminative the features learned from the modules and the better the imbalance issue is remedied, the more effective the detection of Defective modules should be. In this study, to solve these two challenges, we propose a novel framework named LDFR by Learning Deep Feature Representation from the Defect data for SDP. Specifically, we use a deep neural network with a new hybrid loss function that consists of a triplet loss to learn a more discriminative feature representation of the Defect data and a weighted cross-entropy loss to remedy the imbalance issue. To evaluate the effectiveness of the proposed LDFR framework, we conduct extensive experiments on a benchmark dataset with 27 Defect data sets (each with three types of features), using three traditional and three effort-aware indicators. Overall, the experimental results demonstrate the superiority of our LDFR framework in detecting Defective modules when compared with 27 baseline methods, except in terms of the indicator of Precision.

  • Software Defect prediction based on kernel pca and weighted extreme learning machine
    Information & Software Technology, 2019
    Co-Authors: Jin Liu, Xiapu Luo, Zijiang Yang, Yifeng Zhang, Peipei Yuan, Yutian Tang, Tao Zhang
    Abstract:

    Context: Software Defect prediction strives to detect Defect-prone Software modules by mining the historical data. Effective prediction enables reasonable testing resource allocation, which eventually leads to more reliable Software. Objective: The complex structures and the imbalanced class distribution in Software Defect data make it challenging to obtain suitable data features and learn an effective Defect prediction model. In this paper, we propose a method to address these two challenges. Method: We propose a Defect prediction framework called KPWE that combines two techniques, i.e., Kernel Principal Component Analysis (KPCA) and Weighted Extreme Learning Machine (WELM). Our framework consists of two major stages. In the first stage, KPWE aims to extract representative data features. It leverages the KPCA technique to project the original data into a latent feature space by nonlinear mapping. In the second stage, KPWE aims to alleviate the class imbalance. It exploits the WELM technique to learn an effective Defect prediction model with a weighting-based scheme. Results: We have conducted extensive experiments on 34 projects from the PROMISE dataset and 10 projects from the NASA dataset. The experimental results show that KPWE achieves promising performance compared with 41 baseline methods, including seven basic classifiers with KPCA, five variants of KPWE, eight representative feature selection methods with WELM, and 21 imbalanced learning methods. Conclusion: In this paper, we propose KPWE, a new Software Defect prediction framework that considers the feature extraction and class imbalance issues. The empirical study on 44 Software projects indicates that KPWE is superior to the baseline methods in most cases.

  • dictionary learning based Software Defect prediction
    International Conference on Software Engineering, 2014
    Co-Authors: Xiao-yuan Jing, Zhi-wu Zhang, Shi Ying, Jin Liu
    Abstract:

    In order to improve the quality of a Software system, Software Defect prediction aims to automatically identify Defective Software modules for efficient Software test. To predict Software Defects, classification methods with static code attributes have attracted a great deal of attention. In recent years, machine learning techniques have been applied to Defect prediction. Because different Software modules are similar to one another, one Software module can be approximately represented by a small proportion of other modules, and the representation coefficients over the pre-defined dictionary, which consists of historical Software module data, are generally sparse. In this paper, we propose to use the dictionary learning technique to predict Software Defects. By using the characteristics of the metrics mined from the open source Software, we learn multiple dictionaries (including Defective module and Defect-free module sub-dictionaries and the total dictionary) and sparse representation coefficients. Moreover, we take the misclassification cost issue into account because the misclassification of Defective modules generally incurs much higher risk cost than that of Defect-free ones. We thus propose a cost-sensitive discriminative dictionary learning (CDDL) approach for Software Defect classification and prediction. The widely used datasets from NASA projects are employed as test data to evaluate the performance of all compared methods. Experimental results show that CDDL outperforms several representative state-of-the-art Defect prediction methods.

  • A General Software Defect-Proneness Prediction Framework
    IEEE Transactions on Software Engineering, 2011
    Co-Authors: Qinbao Song, Martin Shepperd, Zihan Jia, Shi Ying, Jin Liu
    Abstract:

    BACKGROUND - Predicting Defect-prone Software components is an economically important activity and so has received a good deal of attention. However, making sense of the many, and sometimes seemingly inconsistent, results is difficult. OBJECTIVE - We propose and evaluate a general framework for Software Defect prediction that supports 1) unbiased and 2) comprehensive comparison between competing prediction systems. METHOD - The framework is comprised of 1) scheme evaluation and 2) Defect prediction components. The scheme evaluation analyzes the prediction performance of competing learning schemes for given historical data sets. The Defect predictor builds models according to the evaluated learning scheme and predicts Software Defects with new data according to the constructed model. In order to demonstrate the performance of the proposed framework, we use both simulation and publicly available Software Defect data sets. RESULTS - The results show that we should choose different learning schemes for different data sets (i.e., no scheme dominates), that small details in how evaluations are conducted can completely reverse findings, and last, that our proposed framework is more effective and less prone to bias than previous approaches. CONCLUSIONS - Failure to properly or fully evaluate a learning scheme can be misleading; however, these problems may be overcome by our proposed framework.
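
The last abstract above describes a scheme evaluation step that compares competing learning schemes on historical data before one is used for prediction. A rough sketch of that idea with k-fold cross-validation follows. The two toy schemes, the interleaved fold layout, the use of plain accuracy and the made-up data are all assumptions for illustration, not the authors' experimental setup.

```python
from statistics import mean


def k_folds(data, k):
    """Yield (train, test) splits; fold i takes every k-th item starting at i."""
    for i in range(k):
        test = data[i::k]
        train = [row for j, row in enumerate(data) if j % k != i]
        yield train, test


def majority_scheme(train):
    """Baseline scheme: always predict the most common training label."""
    labels = [y for _, y in train]
    guess = max(set(labels), key=labels.count)
    return lambda x: guess


def threshold_scheme(train):
    """Toy scheme: flag a module when its metric exceeds the midpoint of the class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    cut = (mean(pos) + mean(neg)) / 2
    return lambda x: 1 if x > cut else 0


def evaluate(scheme, data, k=5):
    """Mean accuracy of `scheme` over k folds; the model is rebuilt for every fold."""
    scores = []
    for train, test in k_folds(data, k):
        model = scheme(train)
        scores.append(mean(1 if model(x) == y else 0 for x, y in test))
    return mean(scores)


# Made-up data set: one size metric per module, larger modules are defective.
data = [(float(x), 1 if x >= 14 else 0) for x in range(20)]
results = {s.__name__: evaluate(s, data) for s in (majority_scheme, threshold_scheme)}
best = max(results, key=results.get)
```

Because each model is rebuilt from the training part of every fold, no test module influences the model that scores it. Running the same comparison on a different data set may pick a different scheme, which matches the abstract's finding that no scheme dominates.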

Istvan Gergely Czibula - One of the best experts on this subject based on the ideXlab platform.

  • A novel approach for Software Defect prediction through hybridizing gradual relational association rules with artificial neural networks
    Information Sciences, 2018
    Co-Authors: Diana-lucia Miholca, Gabriela Czibula, Istvan Gergely Czibula
    Abstract:

    The growing complexity of Software projects requires increasing consideration of their analysis and testing. Identifying Defective Software entities is essential for Software quality assurance and it also improves activities related to Software testing. In this study, we developed a novel supervised classification method called HyGRAR for Software Defect prediction. HyGRAR is a non-linear hybrid model that combines gradual relational association rule mining and artificial neural networks to discriminate between Defective and non-Defective Software entities. Experiments performed based on 10 open-source data sets demonstrated the excellent performance of the HyGRAR classifier. HyGRAR performed better than most of the previously proposed approaches for Software Defect prediction in performance evaluations using the same data sets.

  • Software Defect prediction using relational association rule mining
    Information Sciences, 2014
    Co-Authors: Gabriela Czibula, Zsuzsanna Marian, Istvan Gergely Czibula
    Abstract:

    This paper focuses on the problem of Defect prediction, a problem of major importance during Software maintenance and evolution. It is essential for Software developers to identify Defective Software modules in order to continuously improve the quality of a Software system. As the conditions for a Software module to have Defects are hard to identify, machine learning based classification models are still developed to approach the problem of Defect prediction. We propose a novel classification model based on relational association rules mining. Relational association rules are an extension of ordinal association rules, which are a particular type of association rules that describe numerical orderings between attributes that commonly occur over a dataset. Our classifier is based on the discovery of relational association rules for predicting whether a Software module is Defective or not. An experimental evaluation of the proposed model on the open source NASA datasets, as well as a comparison to similar existing approaches, is provided. The obtained results show that our classifier outperforms, for most of the considered evaluation measures, the existing machine learning based techniques for Defect prediction. This confirms the potential of our proposal.
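
The abstract above describes ordinal association rules as numerical orderings between attributes that commonly occur over a dataset. The sketch below mines such orderings for pairs of attributes only and then counts how many of them a new module breaks. The support threshold, the function names and the data are hypothetical, and the violation count is an illustrative use of the mined rules, not the classifier proposed in the paper.

```python
from itertools import combinations


def mine_orderings(rows, min_support=0.9):
    """Find attribute pairs (i, j) where attribute i < attribute j, or >, in most rows.

    Returns (i, relation, j) triples whose relation holds in at least
    `min_support` of the rows. Only binary rules are mined here.
    """
    rules = []
    n_attrs = len(rows[0])
    for i, j in combinations(range(n_attrs), 2):
        less = sum(r[i] < r[j] for r in rows) / len(rows)
        greater = sum(r[i] > r[j] for r in rows) / len(rows)
        if less >= min_support:
            rules.append((i, "<", j))
        if greater >= min_support:
            rules.append((i, ">", j))
    return rules


def violations(row, rules):
    """Count the mined orderings that a single module breaks."""
    holds = {"<": lambda a, b: a < b, ">": lambda a, b: a > b}
    return sum(not holds[rel](row[i], row[j]) for i, rel, j in rules)


# Made-up metric vectors (attr0, attr1, attr2) for defect-free modules.
clean = [(1, 5, 3), (2, 6, 4), (0, 9, 2), (3, 7, 5)]
rules = mine_orderings(clean)
suspect = (8, 4, 6)
broken = violations(suspect, rules)
```

Rules of this kind read directly as statements about metrics (for example, attribute 0 is smaller than attribute 1), so a flagged module can be explained by listing the orderings it breaks.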

John Yearwood - One of the best experts on this subject based on the ideXlab platform.

  • a framework for Software Defect prediction and metric selection
    IEEE Access, 2018
    Co-Authors: Shamsul Huda, Mohsin Ali, Jemal H Abawajy, Sultan Alyahya, Hmood Aldossari, Shafiq Ahmad, John Yearwood
    Abstract:

    Automated Software Defect prediction is an important and fundamental activity in the domain of Software development. However, modern Software systems are inherently large and complex with numerous correlated metrics that capture different aspects of the Software components. This large number of correlated metrics makes building a Software Defect prediction model very complex. Thus, identifying and selecting a subset of metrics that enhance the Software Defect prediction method’s performance is an important but challenging problem that has received little attention in the literature. The main objective of this paper is to identify significant Software metrics, to build and evaluate an automated Software Defect prediction model. We propose two novel hybrid Software Defect prediction models to identify the significant attributes (metrics) using a combination of wrapper and filter techniques. The novelty of our approach is that it embeds the metric selection and training processes of Software Defect prediction as a single process while reducing the measurement overhead significantly. Different wrapper approaches were combined, including SVM and ANN, with a maximum relevance filter approach to find the significant metrics. A filter score was injected into the wrapper selection process in the proposed approaches to direct the search process efficiently to identify significant metrics. Experimental results with real Defect-prone Software data sets show that the proposed hybrid approaches achieve significantly compact metrics (i.e., selecting the most significant metrics) with high prediction accuracy compared with conventional wrapper or filter approaches. The performance of the proposed framework has also been verified using a statistical multivariate quality control process using multivariate exponentially weighted moving average. The proposed framework demonstrates that the hybrid heuristic can guide the metric selection process in a computationally efficient way by integrating the intrinsic characteristics from the filters into the wrapper and using the advantages of both the filter and wrapper approaches.

  • a parallel framework for Software Defect detection and metric selection on cloud computing
    Cluster Computing, 2017
    Co-Authors: Mohsin Ali, Shamsul Huda, Jemal H Abawajy, Sultan Alyahya, Hmood Aldossari, John Yearwood
    Abstract:

    With the continued growth of Internet of Things (IoT) and its convergence with the cloud, numerous interoperable Software systems are being developed for the cloud. Therefore, there is a growing demand to maintain a better quality of Software in the cloud for improved service. This is more crucial as the cloud environment is growing fast towards a hybrid model: a combination of public and private cloud models. Considering the high volume of the available Software as a service (SaaS) in the cloud, identification of non-standard Software and measuring their quality in the SaaS is an urgent issue. Manual testing and determination of the quality of the Software is very expensive and, to some extent, impossible to accomplish. An automated Software Defect detection model that is capable of measuring the relative quality of Software and identifying their faulty components can both significantly reduce the Software development effort and improve the cloud service. In this paper, we propose a Software Defect detection model that can be used to identify faulty components in big Software metric data. The novelty of our proposed approach is that it can identify significant metrics using a combination of different filters and wrapper techniques. One of the important contributions of the proposed approach is that we designed and evaluated a parallel framework of a hybrid Software Defect predictor in order to deal with big Software metric data in a computationally efficient way for cloud environment. Two different hybrids have been developed using Fisher and Maximum Relevance (MR) filters with an Artificial Neural Network (ANN) based wrapper in the parallel framework. The evaluations are performed with real Defect-prone Software datasets for all parallel versions. Experimental results show that the proposed parallel hybrid framework achieves a significant computational speedup on a computer cluster with a higher Defect prediction accuracy and smaller number of Software metrics compared to the independent filter or wrapper approaches.
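
Both abstracts above combine a filter score with a wrapper search to select Software metrics. The sketch below shows one simple way such a hybrid could work, under loud assumptions: absolute Pearson correlation stands in for the Fisher and maximum relevance filters, a nearest-centroid rule scored on the training data stands in for the SVM and ANN wrappers, and everything runs sequentially rather than in parallel. All names and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def centroid_accuracy(rows, labels, cols):
    """Wrapper score: training accuracy of a nearest-centroid rule on `cols`."""
    def centroid(cls):
        members = [r for r, y in zip(rows, labels) if y == cls]
        return [sum(r[c] for r in members) / len(members) for c in cols]

    def dist(r, centre):
        return sum((r[c] - m) ** 2 for c, m in zip(cols, centre))

    c0, c1 = centroid(0), centroid(1)
    hits = sum((1 if dist(r, c1) < dist(r, c0) else 0) == y
               for r, y in zip(rows, labels))
    return hits / len(rows)


def hybrid_select(rows, labels):
    """Visit metrics in filter order; keep one only if the wrapper score improves."""
    n_cols = len(rows[0])
    relevance = {c: abs(pearson([r[c] for r in rows], labels)) for c in range(n_cols)}
    chosen, best = [], 0.0
    for c in sorted(relevance, key=relevance.get, reverse=True):
        score = centroid_accuracy(rows, labels, chosen + [c])
        if score > best:
            chosen, best = chosen + [c], score
    return chosen, best


# Made-up data: column 0 tracks the label, column 1 is noise, column 2 is constant.
rows = [(1, 4, 7), (2, 9, 7), (3, 2, 7), (8, 5, 7), (9, 3, 7), (10, 8, 7)]
labels = [0, 0, 0, 1, 1, 1]
chosen, best = hybrid_select(rows, labels)
```

Letting the filter decide the visiting order means the wrapper trains one model per metric instead of one per candidate subset, which is the kind of saving the abstracts attribute to injecting the filter score into the wrapper search.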