Supervised Discretization

The experts below are selected from a list of 780 experts worldwide, ranked by the ideXlab platform.

B. Duval - One of the best experts on this subject based on the ideXlab platform.

  • A Non-parametric Semi-Supervised Discretization Method
    2008 Eighth IEEE International Conference on Data Mining, 2008
    Co-Authors: Alexis Bondu, Marc Boullé, V. Lemaire, Sacha Loiseau, B. Duval
    Abstract:

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.
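
The MODL approach referenced above optimizes an exact Bayesian criterion over all candidate interval partitions. As a much simpler, hypothetical illustration of the core idea of supervised discretization (cutting a continuous variable so that the resulting intervals are informative about the class), the sketch below finds the single cut point that minimizes the weighted class entropy of the two resulting intervals. Function names are illustrative, not part of MODL.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return the cut point on `values` that minimizes the weighted
    class entropy of the two resulting intervals (single-split sketch)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("inf"), None)
    for i in range(1, n):
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        score = (len(left) * class_entropy(left)
                 + len(right) * class_entropy(right)) / n
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint between neighbours
        best = min(best, (score, cut))
    return best[1]
```

On perfectly separable data such as `best_cut([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])`, the midpoint 6.5 yields two pure intervals and is selected; a full discretizer would apply such splits recursively under a stopping criterion.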

Alexis Bondu - One of the best experts on this subject based on the ideXlab platform.

  • IJCNN - A Supervised approach for change detection in data streams
    The 2011 International Joint Conference on Neural Networks, 2011
    Co-Authors: Alexis Bondu, Marc Boullé
    Abstract:

    In recent years, the amount of data to process has increased in many application areas, such as network monitoring, web clickstream analysis and sensor data analysis. Data stream mining answers the challenge of massive data processing: this paradigm allows pieces of data to be treated on the fly, avoiding exhaustive data storage. Detecting changes in the distribution of a data stream is an important issue with a wide range of applications. In this article, the change detection problem is turned into a supervised learning task. We chose to exploit the supervised discretization method MODL given its interesting properties. Our approach compares favorably with an alternative method on artificial data streams and is applied to real data streams.
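
The reduction described above can be sketched as follows: label points from the reference window 0 and points from the current window 1, then ask whether the value axis can be partitioned so that the two labels separate much better than chance. This minimal, hypothetical sketch uses the information gain of the best single cut as the test statistic (the paper uses the full MODL criterion); the threshold and function names are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def change_detected(reference, current, gain_threshold=0.3):
    """Label reference-window points 0 and current-window points 1, then
    report a change if some single cut on the value axis separates the
    two labels with information gain above `gain_threshold`."""
    values = list(reference) + list(current)
    labels = [0] * len(reference) + [1] * len(current)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy(labels)
    best_gain = 0.0
    for i in range(1, n):
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        best_gain = max(best_gain, base - split)
    return best_gain > gain_threshold
```

Two identical windows produce near-zero gain (no change reported), while a shifted window such as `change_detected([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])` is perfectly separable by one cut and triggers a detection.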

  • A Non-parametric Semi-Supervised Discretization Method
    2008 Eighth IEEE International Conference on Data Mining, 2008
    Co-Authors: Alexis Bondu, Marc Boullé, V. Lemaire, Sacha Loiseau, B. Duval
    Abstract:

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.

Marc Boullé - One of the best experts on this subject based on the ideXlab platform.

  • EGC (best of volume) - Simultaneous Partitioning of Input and Class Variables for Supervised Classification Problems with Many Classes
    Advances in Knowledge Discovery and Management, 2012
    Co-Authors: Marc Boullé
    Abstract:

    In the data preparation phase of data mining, supervised discretization and value grouping methods have numerous applications: interpretation, conditional density estimation, filter selection of input variables, and variable recoding for classification methods. These methods usually assume a small number of classes, typically fewer than ten, and reach their limits when there are too many classes. In this paper, we extend discretization and value grouping methods by partitioning both the input and class variables. The best joint partitioning is sought by maximizing a Bayesian model selection criterion. We show how to exploit this preprocessing method as a preparation step for the naive Bayes classifier. Extensive experiments demonstrate the benefits of the approach in the case of hundreds of classes.
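
One ingredient of joint partitioning can be illustrated as follows: if each class is summarized by its frequency profile over the input intervals, classes with near-identical profiles are natural candidates to merge into a single class group. The paper optimizes a Bayesian criterion over both partitions jointly; the greedy L1-distance step below is only a hypothetical sketch, and all names in it are illustrative.

```python
def class_profiles(values, labels, cut_points):
    """Normalized frequency profile of each class over the input
    intervals defined by `cut_points` (sorted, right-open bins)."""
    def interval(v):
        return sum(v > c for c in cut_points)
    n_bins = len(cut_points) + 1
    counts = {}
    for v, l in zip(values, labels):
        counts.setdefault(l, [0] * n_bins)[interval(v)] += 1
    return {l: [c / sum(h) for c in h] for l, h in counts.items()}

def closest_class_pair(profiles):
    """Greedy step: return the two classes whose interval profiles are
    most similar (smallest L1 distance) -- the first candidates to be
    merged into one class group."""
    labels = sorted(profiles)
    best = (float("inf"), None)
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            d = sum(abs(x - y) for x, y in zip(profiles[a], profiles[b]))
            best = min(best, (d, (a, b)))
    return best[1]
```

For example, with a single cut at 5, classes 'a' and 'b' concentrated below the cut have identical profiles and are paired first, while a class concentrated above the cut stays apart; repeating the merge until a criterion stops improving would yield the class grouping.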

  • IJCNN - A Supervised approach for change detection in data streams
    The 2011 International Joint Conference on Neural Networks, 2011
    Co-Authors: Alexis Bondu, Marc Boullé
    Abstract:

    In recent years, the amount of data to process has increased in many application areas, such as network monitoring, web clickstream analysis and sensor data analysis. Data stream mining answers the challenge of massive data processing: this paradigm allows pieces of data to be treated on the fly, avoiding exhaustive data storage. Detecting changes in the distribution of a data stream is an important issue with a wide range of applications. In this article, the change detection problem is turned into a supervised learning task. We chose to exploit the supervised discretization method MODL given its interesting properties. Our approach compares favorably with an alternative method on artificial data streams and is applied to real data streams.

  • A Non-parametric Semi-Supervised Discretization Method
    2008 Eighth IEEE International Conference on Data Mining, 2008
    Co-Authors: Alexis Bondu, Marc Boullé, V. Lemaire, Sacha Loiseau, B. Duval
    Abstract:

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.

  • Optimal bin number for equal frequency Discretizations in supervized learning
    Intelligent Data Analysis, 2005
    Co-Authors: Marc Boullé
    Abstract:

    While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Although efficient supervised discretization methods are available, the unsupervised equal frequency discretization method is still widely used by statisticians, both for data exploration and data preparation. In this paper, we propose an automatic method, based on a Bayesian approach, to optimize the number of bins for equal frequency discretizations in the context of supervised learning. We introduce a space of equal frequency discretization models and a prior distribution defined on this model space. This results in the definition of a Bayes-optimal evaluation criterion for equal frequency discretizations. We then propose an optimal search algorithm whose run time is super-linear in the sample size. Extensive comparative experiments demonstrate that the method works well in many cases.
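
A simplified sketch of the idea, assuming a BIC-style penalty in place of the paper's exact Bayesian criterion: score each candidate bin count k by the negative log-likelihood of the class labels within k equal-frequency bins, plus a penalty on the number of free class-distribution parameters, and keep the k with the lowest score. All function names and the penalty form are assumptions for illustration.

```python
import math
from collections import Counter

def equal_freq_bins(values, labels, k):
    """Split sorted (value, label) pairs into k roughly equal-frequency
    bins and return the list of class labels in each bin."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    return [[l for _, l in pairs[i * n // k:(i + 1) * n // k]]
            for i in range(k)]

def penalized_score(values, labels, k):
    """Negative log-likelihood of the labels under per-bin class
    frequencies, plus a BIC-style penalty (stand-in for the paper's
    Bayes-optimal criterion)."""
    n = len(values)
    n_classes = len(set(labels))
    nll = 0.0
    for bin_labels in equal_freq_bins(values, labels, k):
        m = len(bin_labels)
        for c in Counter(bin_labels).values():
            nll -= c * math.log(c / m)  # c/m = within-bin class frequency
    penalty = 0.5 * k * (n_classes - 1) * math.log(n)
    return nll + penalty

def best_bin_count(values, labels, max_k=10):
    """Bin count with the lowest penalized score."""
    ks = range(1, min(max_k, len(values)) + 1)
    return min(ks, key=lambda k: penalized_score(values, labels, k))
```

On 20 points whose class flips exactly at the median, two bins already capture the class structure, so larger k only pays extra penalty and k = 2 is selected; on class-independent data the penalty drives the choice back toward a single bin.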

V. Lemaire - One of the best experts on this subject based on the ideXlab platform.

  • A Non-parametric Semi-Supervised Discretization Method
    2008 Eighth IEEE International Conference on Data Mining, 2008
    Co-Authors: Alexis Bondu, Marc Boullé, V. Lemaire, Sacha Loiseau, B. Duval
    Abstract:

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.

Sacha Loiseau - One of the best experts on this subject based on the ideXlab platform.

  • A Non-parametric Semi-Supervised Discretization Method
    2008 Eighth IEEE International Conference on Data Mining, 2008
    Co-Authors: Alexis Bondu, Marc Boullé, V. Lemaire, Sacha Loiseau, B. Duval
    Abstract:

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.
