Chervonenkis

The Experts below are selected from a list of 2946 Experts worldwide, ranked by the ideXlab platform

Manfred K. Warmuth - One of the best experts on this subject based on the ideXlab platform.

  • Sample compression, learnability, and the Vapnik-Chervonenkis dimension
    Machine Learning, 1995
    Co-Authors: Sally Floyd, Manfred K. Warmuth
    Abstract:

    Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class $C \subseteq 2^X$ consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is pac-learnable. Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class is finite. In the second half of this paper we explore the relationship between sample compression schemes and the VC dimension. We define maximum and maximal classes of VC dimension d. For every maximum class of VC dimension d, there is a sample compression scheme of size d, and for sufficiently large maximum classes there is no sample compression scheme of size less than d. We briefly discuss classes of VC dimension d that are maximal but not maximum. It is an open question whether every class of VC dimension d has a sample compression scheme of size O(d).
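
    The paradigm can be illustrated with a deliberately small example. The sketch below (not from the paper; the concept class, function names, and encoding are illustrative assumptions) gives a size-one compression scheme for threshold concepts $c_t(x) = \mathbf{1}\{x \ge t\}$ on the real line, a class of VC dimension 1.

      # Minimal sketch of a size-1 sample compression scheme for threshold
      # concepts c_t(x) = 1 iff x >= t on the real line (VC dimension 1).
      # The concept class and names are illustrative, not taken from the paper.

      def compress(sample):
          """Choose a compression set of at most one labeled example.

          `sample` is a list of (x, y) pairs consistent with some threshold
          concept.  We keep the smallest positive example; the empty set
          encodes the all-negative hypothesis.
          """
          positives = [(x, y) for x, y in sample if y == 1]
          return [min(positives)] if positives else []

      def reconstruct(compression_set):
          """Form a hypothesis on all of X from the compression set."""
          if not compression_set:
              return lambda x: 0                 # no stored positive: always predict 0
          t = compression_set[0][0]              # threshold = stored positive point
          return lambda x: 1 if x >= t else 0

      # The reconstructed hypothesis is consistent with the whole original sample.
      sample = [(-2.0, 0), (-0.5, 0), (1.0, 1), (3.5, 1)]
      h = reconstruct(compress(sample))
      assert all(h(x) == y for x, y in sample)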

Jun Wang - One of the best experts on this subject based on the ideXlab platform.

Sally Floyd - One of the best experts on this subject based on the ideXlab platform.

  • Sample compression, learnability, and the Vapnik-Chervonenkis dimension
    Machine Learning, 1995
    Co-Authors: Sally Floyd, Manfred K. Warmuth
    Abstract:

    Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class $C \subseteq 2^X$ consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is pac-learnable. Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class is finite. In the second half of this paper we explore the relationship between sample compression schemes and the VC dimension. We define maximum and maximal classes of VC dimension d. For every maximum class of VC dimension d, there is a sample compression scheme of size d, and for sufficiently large maximum classes there is no sample compression scheme of size less than d. We briefly discuss classes of VC dimension d that are maximal but not maximum. It is an open question whether every class of VC dimension d has a sample compression scheme of size O(d).
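
    To connect compression size to pac-learnability quantitatively, the following sketch evaluates one standard form of the resulting generalization bound (in the spirit of the counting argument this line of work builds on): if a hypothesis reconstructed from k of the m examples is consistent with all m of them, then with probability at least 1 − δ its error is roughly at most $(k \ln m + \ln(1/\delta))/(m-k)$. The constants here are indicative only, not the paper's statement.

      import math

      def compression_generalization_bound(m, k, delta):
          """Indicative sample-compression bound: with probability >= 1 - delta,
          a hypothesis reconstructed from k of the m examples and consistent
          with all m of them errs with probability at most
          (k * ln m + ln(1/delta)) / (m - k).  Constants are illustrative."""
          assert m > k
          return (k * math.log(m) + math.log(1.0 / delta)) / (m - k)

      # Example: compression size 5 (e.g. a class of VC dimension 5), 10,000 samples.
      print(compression_generalization_bound(m=10_000, k=5, delta=0.05))  # ~0.0049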

Gábor Lugosi - One of the best experts on this subject based on the ideXlab platform.

  • Vapnik-Chervonenkis Theory
    A Probabilistic Theory of Pattern Recognition, 1996
    Co-Authors: Luc Devroye, László Györfi, Gábor Lugosi
    Abstract:

    In this chapter we select a decision rule from a class of rules with the help of training data. Working formally, let C be a class of functions $\phi : \mathbb{R}^d \to \{0,1\}$. One wishes to select a function from C with small error probability. Assume that the training data $D_n = ((X_1, Y_1), \ldots, (X_n, Y_n))$ are given to pick one of the functions from C to be used as a classifier.
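
    A minimal sketch of this selection step, under the simplifying assumption that C is a small finite class of threshold rules on the real line (the class and names are illustrative, not the book's):

      # Select from a class C the rule with the smallest empirical error on D_n.
      # C is, purely for illustration, a finite set of threshold rules on the line.

      def empirical_error(phi, data):
          """Fraction of training pairs (x, y) misclassified by phi."""
          return sum(phi(x) != y for x, y in data) / len(data)

      def select_from_class(C, data):
          """Empirical error minimization over the class C."""
          return min(C, key=lambda phi: empirical_error(phi, data))

      # Toy class of thresholds phi_t(x) = 1 iff x >= t and a toy training set D_n.
      C = [lambda x, t=t: 1 if x >= t else 0 for t in (-1.0, 0.0, 1.0, 2.0)]
      D_n = [(-1.5, 0), (-0.2, 0), (0.4, 1), (1.7, 1)]
      best = select_from_class(C, D_n)
      print(empirical_error(best, D_n))  # 0.0 -- the threshold t = 0.0 fits D_n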

  • A Probabilistic Theory of Pattern Recognition
    1996
    Co-Authors: Luc Devroye, László Györfi, Gábor Lugosi
    Abstract:

    Preface * Introduction * The Bayes Error * Inequalities and alternate distance measures * Linear discrimination * Nearest neighbor rules * Consistency * Slow rates of convergence * Error estimation * The regular histogram rule * Kernel rules * Consistency of the k-nearest neighbor rule * Vapnik-Chervonenkis theory * Combinatorial aspects of Vapnik-Chervonenkis theory * Lower bounds for empirical classifier selection * The maximum likelihood principle * Parametric classification * Generalized linear discrimination * Complexity regularization * Condensed and edited nearest neighbor rules * Tree classifiers * Data-dependent partitioning * Splitting the data * The resubstitution estimate * Deleted estimates of the error probability * Automatic kernel rules * Automatic nearest neighbor rules * Hypercubes and discrete spaces * Epsilon entropy and totally bounded sets * Uniform laws of large numbers * Neural networks * Other error estimates * Feature extraction * Appendix * Notation * References * Index

  • Lower Bounds for Empirical Classifier Selection
    A Probabilistic Theory of Pattern Recognition, 1996
    Co-Authors: Luc Devroye, László Györfi, Gábor Lugosi
    Abstract:

    In Chapter 12 a classifier was selected by minimizing the empirical error over a class of classifiers C. With the help of the Vapnik-Chervonenkis theory we have been able to obtain distribution-free performance guarantees for the selected rule. For example, it was shown that the difference between the expected error probability of the selected rule and the best error probability in the class behaves at least as well as a quantity of the order of $\sqrt{V_C \log n / n}$, where $V_C$ is the Vapnik-Chervonenkis dimension of C and n is the size of the training data $D_n$. (This upper bound is obtained from Theorem 12.5; Corollary 12.5 may be used to replace the $\log n$ term with $\log V_C$.) Two questions arise immediately: Are these upper bounds (at least up to the order of magnitude) tight? Is there a much better way of selecting a classifier than minimizing the empirical error? This chapter attempts to answer these questions. As it turns out, the answer is essentially affirmative for the first question, and negative for the second.
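
    Purely as an order-of-magnitude illustration of the distribution-free guarantee recalled above (all constants dropped; this is not the book's exact statement):

      import math

      def vc_excess_risk_order(v, n):
          """Order of magnitude sqrt(V_C * log n / n) of the excess error
          probability of empirical error minimization over a class of VC
          dimension V_C on n samples (constants omitted, illustration only)."""
          return math.sqrt(v * math.log(n) / n)

      for n in (10**3, 10**4, 10**5):
          print(n, round(vc_excess_risk_order(v=10, n=n), 4))
      # The guarantee shrinks roughly like sqrt(log n / n) as the sample grows.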

  • Combinatorial Aspects of Vapnik-Chervonenkis Theory
    A Probabilistic Theory of Pattern Recognition, 1996
    Co-Authors: Luc Devroye, László Györfi, Gábor Lugosi
    Abstract:

    In this section we list a few interesting properties of the shatter coefficient $s(\mathcal{A}, n)$ and of the VC dimension $V_{\mathcal{A}}$ of a class of sets $\mathcal{A}$. We begin with a property that makes things easier.
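
    A brute-force sketch of both quantities for a small point set, using the (illustrative) class of half-lines $A_t = (-\infty, t]$; it also checks Sauer's bound $s(\mathcal{A}, n) \le \sum_{i=0}^{V_{\mathcal{A}}} \binom{n}{i}$, which for this class of VC dimension 1 holds with equality.

      from math import comb

      def halfline(t):
          """Membership test of the half-line (-inf, t]."""
          return lambda x: x <= t

      def shatter_coefficient(sets, points):
          """s(A, n): number of distinct subsets of `points` picked out by A."""
          return len({tuple(A(x) for x in points) for A in sets})

      points = [0.5, 1.5, 2.5, 3.5]
      # Thresholds placed around and between the points realize every trace.
      A = [halfline(t) for t in (0.0, 1.0, 2.0, 3.0, 4.0)]

      s = shatter_coefficient(A, points)
      vc_dim = 1  # half-lines shatter any single point but never two points
      sauer = sum(comb(len(points), i) for i in range(vc_dim + 1))
      print(s, sauer)  # 5 5 -- Sauer's lemma bound holds with equality here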

  • Improved upper bounds for probabilities of uniform deviations
    Statistics & Probability Letters, 1995
    Co-Authors: Gábor Lugosi
    Abstract:

    We obtain Vapnik-Chervonenkis type upper bounds for the uniform deviation of probabilities from their expectations. The bounds sharpen previously known probability inequalities.
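
    For context, the snippet below evaluates one classical Vapnik-Chervonenkis-type inequality of the kind such results sharpen, taken here in the indicative form $\mathbf{P}\{\sup_{A} |\nu_n(A) - \nu(A)| > \varepsilon\} \le 8\, s(\mathcal{A}, n)\, e^{-n\varepsilon^2/32}$, with the shatter coefficient bounded by $(n+1)^V$ via Sauer's lemma; the constants differ between published versions.

      import math

      def classical_vc_tail_bound(n, vc_dim, eps):
          """Indicative classical VC-type tail bound on the uniform deviation of
          empirical probabilities: 8 * s(A, n) * exp(-n * eps**2 / 32), with the
          shatter coefficient s(A, n) bounded by (n + 1) ** vc_dim (Sauer)."""
          shatter_bound = (n + 1) ** vc_dim
          return min(1.0, 8.0 * shatter_bound * math.exp(-n * eps**2 / 32.0))

      print(classical_vc_tail_bound(n=1_000_000, vc_dim=3, eps=0.05))  # ~1e-15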

Nicolas Vayatis - One of the best experts on this subject based on the ideXlab platform.

  • Exact rates in Vapnik–Chervonenkis bounds
    Annales de l'Institut Henri Poincaré (B) Probability and Statistics, 2003
    Co-Authors: Nicolas Vayatis
    Abstract:

    Vapnik–Chervonenkis bounds on rates of uniform convergence of empirical means to their expectations have been continuously improved over the years since the precursory work in [26]. The result obtained by Talagrand in 1994 [21] seems to provide the final word as far as universal bounds are concerned. However, when there are additional assumptions on the underlying probability distribution, the exponential rate of convergence can be considerably improved. Alexander [1] and Massart [15] have found better exponential rates (similar to those in Bennett–Bernstein inequalities) under the assumption of a control on the variance of the empirical process. In this paper we consider the case of a particular distribution for the empirical process indexed by a family of sets, and we provide the exact exponential rate, based on large deviations theorems, as predicted by Azencott [2].
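
    To make the variance-controlled improvement mentioned above concrete, the snippet below compares a Hoeffding-type tail bound with exponent $2n\varepsilon^2$ against a Bernstein-type bound with exponent $n\varepsilon^2/(2\sigma^2 + 2\varepsilon/3)$ for [0, 1]-valued variables. This is a generic illustration of Bennett–Bernstein behaviour, not the specific statements of [1] or [15].

      import math

      def hoeffding_tail(n, eps):
          """Hoeffding-type bound 2 * exp(-2 * n * eps**2) for the deviation of
          an empirical mean of [0, 1]-valued variables from its expectation."""
          return 2.0 * math.exp(-2.0 * n * eps**2)

      def bernstein_tail(n, eps, var):
          """Bernstein-type bound 2 * exp(-n * eps**2 / (2 * var + 2 * eps / 3))
          for the same deviation when the variance is known to be small."""
          return 2.0 * math.exp(-n * eps**2 / (2.0 * var + 2.0 * eps / 3.0))

      n, eps, var = 10_000, 0.01, 0.001   # small variance: Bernstein is far sharper
      print(hoeffding_tail(n, eps))        # ~0.27
      print(bernstein_tail(n, eps, var))   # ~1.5e-50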

  • Refined exponential rates in Vapnik–Chervonenkis inequalities
    Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, 2001
    Co-Authors: Robert Azencott, Nicolas Vayatis
    Abstract:

    Vapnik–Chervonenkis bounds on rates of convergence of empirical means to their expectations have been continuously improved over the years. The result obtained by M. Talagrand in 1994 [11] seems to provide the final word as far as universal bounds are concerned. However, for fixed families of underlying probability distributions, the exponential rate in the deviation term can be considerably improved by using the more appropriate Cramér transform, as shown by large deviations theorems.
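
    For reference, the Cramér transform invoked here is, in one standard formulation, the Fenchel–Legendre transform of the logarithmic moment generating function; for a Bernoulli(p) variable it reduces to a Kullback–Leibler divergence:

    $$\Lambda^*(x) = \sup_{\lambda \in \mathbb{R}} \big(\lambda x - \log \mathbb{E}\, e^{\lambda X}\big), \qquad \Lambda^*_{\mathrm{Ber}(p)}(p+\varepsilon) = (p+\varepsilon)\log\frac{p+\varepsilon}{p} + (1-p-\varepsilon)\log\frac{1-p-\varepsilon}{1-p}.$$

    By Pinsker's inequality this exponent is always at least $2\varepsilon^2$, the classical distribution-free rate, and it can be much larger when $p$ is small.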

  • EuroCOLT - Distribution-Dependent Vapnik-Chervonenkis Bounds
    Lecture Notes in Computer Science, 1999
    Co-Authors: Nicolas Vayatis, Robert Azencott
    Abstract:

    Vapnik-Chervonenkis (VC) bounds play an important role in statistical learning theory as they are the fundamental results that explain the generalization ability of learning machines. There has been substantial mathematical work on improving VC rates of convergence of empirical means to their expectations over the years. The result obtained by Talagrand in 1994 seems to provide more or less the final word on this issue as far as universal bounds are concerned. For fixed distributions, however, this bound can be outperformed in practice. We show indeed that it is possible to replace the $2\varepsilon^2$ under the exponential of the deviation term by the corresponding Cramér transform, as shown by large deviations theorems. We then formulate rigorous distribution-sensitive VC bounds, and we also explain why these theoretical results on such bounds can lead to practical estimates of the effective VC dimension of learning structures.
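
    A small numerical sketch of that replacement (illustrative only): for the empirical frequency of a set of probability p, the universal exponent $2\varepsilon^2$ is compared with the Bernoulli Cramér transform, which equals the Kullback–Leibler divergence between Bernoulli(p + ε) and Bernoulli(p).

      import math

      def universal_exponent(eps):
          """Classical distribution-free exponent 2 * eps**2 in VC-type bounds."""
          return 2.0 * eps**2

      def bernoulli_cramer_exponent(p, eps):
          """Cramér transform of a Bernoulli(p) variable evaluated at p + eps,
          i.e. KL(Bernoulli(p + eps) || Bernoulli(p)) -- the distribution-
          sensitive exponent suggested by large deviations theory."""
          q = p + eps
          return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

      p, eps = 0.01, 0.02
      print(universal_exponent(eps))            # 0.0008
      print(bernoulli_cramer_exponent(p, eps))  # ~0.013, a much larger exponent

    A larger exponent under the exponential means a much faster decay of the deviation probability, which is exactly the distribution-sensitive sharpening described above.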