Attribute Subset


The Experts below are selected from a list of 4248 Experts worldwide, ranked by the ideXlab platform

Hong Zhao - One of the best experts on this subject based on the ideXlab platform.

  • test cost sensitive Attribute reduction on heterogeneous data for adaptive neighborhood model
    Soft Computing, 2016
    Co-Authors: Hong Zhao
    Abstract:

    Test-cost-sensitive Attribute reduction is an important component in data mining applications and plays a key role in cost-sensitive learning. Previous approaches to test-cost-sensitive Attribute reduction focus mainly on homogeneous datasets; when heterogeneous datasets must be handled, they simply convert nominal Attributes to numerical ones. In this paper, we introduce an adaptive neighborhood model for heterogeneous Attributes and use it to address the test-cost-sensitive Attribute reduction problem. In the adaptive neighborhood model, objects with numerical Attributes are handled with the classical covering neighborhood, while objects with nominal Attributes are handled with the overlap metric neighborhood. Compared with previous approaches, the proposed model avoids classifying objects with different nominal Attribute values into the same neighborhood. The number of inconsistent objects in a neighborhood reflects the discriminating capability of an Attribute Subset, so an inconsistent-objects-based heuristic reduction algorithm is constructed on the adaptive neighborhood model, as sketched below. The proposed algorithm is compared with the $$\lambda$$-weighted heuristic reduction algorithm, in which nominal Attributes are normalized. Experimental results demonstrate that the proposed algorithm is more effective and of greater practical significance than the $$\lambda$$-weighted heuristic reduction algorithm.
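
    A minimal sketch of the adaptive neighborhood and the inconsistent-objects count described above, assuming a user-chosen radius delta for the numerical part; the function names and the exact distance form are illustrative, not taken from the paper.

```python
# Sketch of the adaptive neighborhood model: numerical Attributes use a
# classical distance-based covering neighborhood, nominal Attributes use
# the overlap metric (neighbors must match exactly), so objects with
# different nominal values never share a neighborhood.
import numpy as np

def adaptive_neighborhood(X_num, X_nom, i, delta):
    """Indices of the objects in the neighborhood of object i."""
    num_ok = np.all(np.abs(X_num - X_num[i]) <= delta, axis=1)
    nom_ok = np.all(X_nom == X_nom[i], axis=1)
    return np.where(num_ok & nom_ok)[0]

def inconsistency_count(X_num, X_nom, y, delta):
    """Objects whose neighborhood contains a different decision label.

    A smaller count means the current Attribute Subset (the columns of
    X_num and X_nom) discriminates better; this is the quantity the
    heuristic reduction algorithm uses to score candidate Subsets.
    """
    return sum(
        np.any(y[adaptive_neighborhood(X_num, X_nom, i, delta)] != y[i])
        for i in range(len(y))
    )
```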

  • test cost sensitive Attribute reduction of data with normal distribution measurement errors
    Mathematical Problems in Engineering, 2013
    Co-Authors: Hong Zhao
    Abstract:

    Measurement error with normal distribution is universal in applications. Generally, a smaller measurement error requires a better instrument and a higher test cost. In decision making, we therefore select an Attribute Subset with appropriate measurement error so as to minimize the total test cost. Recently, an error-range-based covering rough set with uniformly distributed error was proposed to investigate this issue. In most applications, however, measurement errors follow a normal distribution rather than the comparatively simplistic uniform distribution. In this paper, we introduce normally distributed measurement errors into the covering-based rough set model and address the test-cost-sensitive Attribute reduction problem in this new model. The major contributions of this paper are fourfold. First, we build a new data model based on normally distributed measurement errors. Second, the covering-based rough set model with measurement errors is constructed through the "3-sigma" rule of the normal distribution; with this model, coverings are constructed from the data rather than assigned by users, as illustrated below. Third, the test-cost-sensitive Attribute reduction problem is redefined on this covering-based rough set. Fourth, a heuristic algorithm is proposed to solve it. The experimental results show that the algorithm is more effective and efficient than the existing one. This study suggests new research trends in cost-sensitive learning.
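
    As a rough illustration of how coverings can be built from the data via the "3-sigma" rule, consider the sketch below. The function names, the per-Attribute sigma vector, and the exact neighborhood bound are assumptions for illustration; the paper's confidence-interval construction may differ.

```python
# If each measured value deviates from its true value by at most
# 3*sigma (probability ~99.7%), two measurements of the same true value
# can be up to 6*sigma apart, so objects within that bound on every
# Attribute are treated as indistinguishable (one covering block each).
import numpy as np

def three_sigma_block(X, i, sigmas):
    """Objects indistinguishable from object i on all Attributes."""
    bound = 2.0 * 3.0 * np.asarray(sigmas)
    return np.where(np.all(np.abs(X - X[i]) <= bound, axis=1))[0]

def build_covering(X, sigmas):
    """Covering of the universe induced by the measurement errors,
    constructed from the data rather than assigned by users."""
    return [three_sigma_block(X, i, sigmas) for i in range(len(X))]
```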

  • test cost sensitive Attribute reduction of data with normal distribution measurement errors
    arXiv: Artificial Intelligence, 2012
    Co-Authors: Hong Zhao
    Abstract:

    Measurement error with normal distribution is universal in applications. Generally, a smaller measurement error requires a better instrument and a higher test cost. In decision making based on the Attribute values of objects, we must therefore select an Attribute Subset with appropriate measurement error so as to minimize the total test cost. Recently, an error-range-based covering rough set with uniformly distributed error was proposed to investigate this issue. In most applications, however, measurement errors follow a normal distribution rather than the comparatively simplistic uniform distribution. In this paper, we introduce normally distributed measurement errors into the covering-based rough set model and address the test-cost-sensitive Attribute reduction problem in this new model. The major contributions of this paper are four-fold. First, we build a new data model based on normally distributed measurement errors; under this data model, the error range in a two-dimensional space is an ellipse (see the sketch below). Second, the covering-based rough set with normally distributed measurement errors is constructed through the "3-sigma" rule. Third, the test-cost-sensitive Attribute reduction problem is redefined on this covering-based rough set. Fourth, a heuristic algorithm is proposed to solve it. The algorithm is tested on ten UCI (University of California, Irvine) datasets, and the experimental results show that it is more effective and efficient than the existing one. This study is a step toward realistic applications of cost-sensitive learning.
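
    The elliptical error range mentioned above can be illustrated as follows: with independent normal errors on two Attributes, a fixed-confidence region for the joint error is an ellipse. The function below is a hypothetical illustration; its name, parameters, and the choice c = 3 echoing the 3-sigma rule are ours, not the paper's.

```python
# Hypothetical membership test for a two-dimensional error ellipse.
# With independent errors e1 ~ N(0, sigma1^2) and e2 ~ N(0, sigma2^2),
# a constant-confidence region is (e1/sigma1)^2 + (e2/sigma2)^2 <= c^2.
def in_error_ellipse(e1, e2, sigma1, sigma2, c=3.0):
    return (e1 / sigma1) ** 2 + (e2 / sigma2) ** 2 <= c ** 2

# Example: errors (0.1, 0.2) with sigmas (0.05, 0.1) give 4 + 4 = 8 <= 9,
# so the point lies inside the 3-sigma ellipse.
print(in_error_ellipse(0.1, 0.2, 0.05, 0.1))  # True
```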

Yu Fang - One of the best experts on this subject based on the ideXlab platform.

  • cost sensitive approximate Attribute reduction with three way decisions
    International Journal of Approximate Reasoning, 2019
    Co-Authors: Yu Fang
    Abstract:

    In the research spectrum of rough sets, the task of Attribute reduction is to obtain a minimal Attribute Subset that preserves certain properties of the original data. Cost-sensitive Attribute reduction aims at minimizing various types of costs, while approximate Attribute reduction allows decision makers to balance knowledge discovery against their own preferences. This paper proposes the cost-sensitive approximate Attribute reduction problem under both qualitative and quantitative criteria: the qualitative criterion is indiscernibility, while the quantitative criteria are the approximation parameter ε and the cost. We present a framework based on three-way decisions and the discernibility matrix to handle this new problem. First, a quality function for Attribute Subsets is designed with the interpretation of a hierarchical granular structure. Second, a fitness function is designed for a cost performance index by investigating Attribute significance. Third, three-way decision theory is applied to partition the Attributes into three groups based on the fitness function and a threshold pair (α, β), as sketched below. Finally, deletion-based and addition-based cost-sensitive approximate reduction algorithms are designed under this framework. Experimental results indicate that our algorithms outperform state-of-the-art methods.
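
    The three-way partition step can be sketched as follows. The fitness values, Attribute names, and thresholds are placeholders; in the paper the fitness comes from Attribute significance with respect to the cost performance index.

```python
# Three-way decision over Attributes: accept those with fitness >= alpha,
# reject those with fitness <= beta, and defer the rest for further
# examination by the reduction algorithm.
def three_way_partition(fitness, alpha, beta):
    assert alpha > beta, "the threshold pair must satisfy alpha > beta"
    accept = [a for a, f in fitness.items() if f >= alpha]
    reject = [a for a, f in fitness.items() if f <= beta]
    defer = [a for a, f in fitness.items() if beta < f < alpha]
    return accept, defer, reject

# Example with placeholder scores:
print(three_way_partition({"a1": 0.9, "a2": 0.5, "a3": 0.1},
                          alpha=0.8, beta=0.2))
# (['a1'], ['a2'], ['a3'])
```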

Jinjin Li - One of the best experts on this subject based on the ideXlab platform.

  • Intuitionistic Fuzzy Rough Set-Based Granular Structures and Attribute Subset Selection
    IEEE Transactions on Fuzzy Systems, 2019
    Co-Authors: Wei-zhi Wu, Jinkun Chen, Yuhua Qian, Jiye Liang, Jinjin Li
    Abstract:

    Attribute Subset selection is an important issue in data mining and information processing. However, most automatic methodologies consider only the relevance between samples while ignoring their diversity, so the value of hidden information may not be fully exploited. For this reason, we propose a hybrid model, the intuitionistic fuzzy (IF) rough set, to overcome this limitation. The model combines the technical advantages of rough sets and IF sets and can effectively account for both of these statistical factors. First, fuzzy information granules based on IF relations are defined and used to characterize the hierarchical structures of the lower and upper approximations of the IF rough set within the framework of granular computing. Second, the computation of IF rough approximations and knowledge reduction in IF information systems are investigated (a sketch follows below). Third, based on the approximations of the IF rough set, significance measures are developed to evaluate the approximation quality and classification ability of IF relations. Furthermore, a forward heuristic algorithm for finding one optimal reduct of an IF information system is developed using these measures. Finally, numerical experiments on public datasets examine the effectiveness and efficiency of the proposed algorithm in terms of the number of selected Attributes, computational time, and classification accuracy.
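
    As an illustration of how the approximations might be computed, the sketch below follows the standard min-max definitions of IF rough lower and upper approximations; the paper's exact operators may differ, and all names here are ours.

```python
# IF rough approximations of an IF set A under an IF relation R.
# mu_R, nu_R: n x n membership / non-membership matrices of R;
# mu_A, nu_A: length-n membership / non-membership vectors of A;
# everywhere mu + nu <= 1, as required of intuitionistic fuzzy sets.
import numpy as np

def if_rough_approximations(mu_R, nu_R, mu_A, nu_A):
    # Lower approximation: certainty that each object belongs to A.
    mu_low = np.min(np.maximum(nu_R, mu_A[None, :]), axis=1)
    nu_low = np.max(np.minimum(mu_R, nu_A[None, :]), axis=1)
    # Upper approximation: possibility that each object belongs to A.
    mu_up = np.max(np.minimum(mu_R, mu_A[None, :]), axis=1)
    nu_up = np.min(np.maximum(nu_R, nu_A[None, :]), axis=1)
    return (mu_low, nu_low), (mu_up, nu_up)
```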

Mashaan A. Alshammari - One of the best experts on this subject based on the ideXlab platform.

  • towards scalable rough set based Attribute Subset selection for intrusion detection using parallel genetic algorithm in MapReduce
    Simulation Modelling Practice and Theory, 2016
    Co-Authors: El-Sayed M. El-Alfy, Mashaan A. Alshammari
    Abstract:

    Attribute Subset selection based on rough sets is a crucial preprocessing step in data mining and pattern recognition that reduces modeling complexity. To cope with the new era of big data, new approaches need to be explored to address this problem effectively. In this paper, we review recent work on Attribute Subset selection in decision-theoretic rough set models. We also introduce a scalable implementation of a parallel genetic algorithm in Hadoop MapReduce to approximate the minimum reduct, i.e., the smallest Attribute Subset with the same discernibility power as the original Attribute set in the decision table (a simplified sketch follows below). We then focus on intrusion detection in computer networks and apply the proposed approach to four datasets with varying characteristics. The results show that the proposed model can be a powerful tool for identifying the Attributes of the minimum reduct in large-scale decision systems.
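
    The following toy, single-machine sketch conveys the idea: a genetic algorithm evolves bit-mask Attribute Subsets, and the pairwise discernibility evaluation that dominates the cost is the step the paper distributes across Hadoop MapReduce mappers. Every name and parameter here is an illustrative assumption, not the authors' implementation.

```python
# Toy genetic search for a short Attribute Subset that still discerns
# all decision-different object pairs; in the paper, the fitness
# evaluation below is the part sharded across MapReduce mappers.
import itertools
import random
import numpy as np

def fitness(X, y, mask):
    """Decision-different pairs that `mask` discerns, minus a small
    penalty per selected Attribute so that shorter reducts win."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return float("-inf")
    discerned = sum(
        1
        for i, j in itertools.combinations(range(len(y)), 2)
        if y[i] != y[j] and np.any(X[i, idx] != X[j, idx])
    )
    return discerned - 0.01 * idx.size

def genetic_reduct(X, y, pop=20, gens=30, seed=0):
    rng = random.Random(seed)
    n = X.shape[1]
    masks = [np.array([rng.random() < 0.5 for _ in range(n)])
             for _ in range(pop)]
    for _ in range(gens):
        masks.sort(key=lambda m: fitness(X, y, m), reverse=True)
        elite = masks[: pop // 2]
        children = []
        for _ in range(pop - len(elite)):
            a, b = rng.sample(elite, 2)            # one-point crossover
            cut = rng.randrange(1, n)
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.randrange(n)                # single-bit mutation
            child[flip] = not child[flip]
            children.append(child)
        masks = elite + children
    return max(masks, key=lambda m: fitness(X, y, m))
```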

Francis Quek - One of the best experts on this subject based on the ideXlab platform.

  • Attribute bagging: improving accuracy of classifier ensembles by using random feature Subsets
    Pattern Recognition, 2003
    Co-Authors: Robert Kamil Bryll, Ricardo Gutierrez-Osuna, Francis Quek
    Abstract:

    We present Attribute bagging (AB), a technique for improving the accuracy and stability of classifier ensembles induced using random Subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate Attribute Subset size and then randomly selects Subsets of features of that size, creating projections of the training set on which the ensemble classifiers are built; the induced classifiers then vote on the class label (a sketch follows below). This article compares the performance of our AB method with bagging and other algorithms on a hand-pose recognition dataset. AB gives consistently better results than bagging, in both accuracy and stability. The performance of ensemble voting in bagging and AB is tested and discussed as a function of the Attribute Subset size and the number of voters, for both weighted and unweighted voting. We also demonstrate that ranking the Attribute Subsets by their classification accuracy and voting using only the best Subsets further improves the ensemble's performance.
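
    A compact sketch of the core AB loop under stated assumptions: a fixed Subset size, random projections, one classifier per projection (scikit-learn decision trees are our choice here, not the paper's), and unweighted majority voting. The paper's wrapper additionally tunes the Subset size and can rank and prune Subsets by accuracy, which this sketch omits.

```python
# Attribute bagging: each ensemble member sees only a random projection
# of the Attributes; predictions are combined by majority vote.
from collections import Counter
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class AttributeBagging:
    def __init__(self, n_voters=25, subset_size=5, seed=0):
        self.n_voters, self.subset_size = n_voters, subset_size
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_attrs = X.shape[1]
        self.members = []
        for _ in range(self.n_voters):
            cols = self.rng.choice(n_attrs,
                                   size=min(self.subset_size, n_attrs),
                                   replace=False)
            clf = DecisionTreeClassifier().fit(X[:, cols], y)
            self.members.append((cols, clf))
        return self

    def predict(self, X):
        # Unweighted majority vote over all trained projections.
        votes = [clf.predict(X[:, cols]) for cols, clf in self.members]
        return np.array([Counter(sample).most_common(1)[0][0]
                         for sample in zip(*votes)])
```

    Ranking the trained Subsets by validation accuracy and voting with only the best ones, as the paper proposes, would be a small extension of the predict step above.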