The experts below are selected from a list of 360 experts worldwide, ranked by the ideXlab platform.
Nishant A Mehta  One of the best experts on this subject based on the ideXlab platform.

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Algorithmic Learning Theory, 2019. Co-authors: Peter Grunwald, Nishant A Mehta.
Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian complexity), $\mathrm{KL}(\text{posterior} \,\Vert\, \text{prior})$. For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity, via Rademacher complexity, to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case with $L_\infty$ entropy. Together, these results recover optimal bounds for VC classes and large (polynomial-entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.
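For orientation, the two endpoint complexities the abstract interpolates between can be written out explicitly. These are the standard textbook definitions, not quotations from the paper, for a function class $\mathcal{F}$ and sample size $n$:

```latex
% (data-independent) Rademacher complexity:
\mathcal{R}_n(\mathcal{F})
  = \mathbb{E}_{\sigma}\Bigl[\,\sup_{f \in \mathcal{F}}
      \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(Z_i)\Bigr],
\qquad \sigma_1,\dots,\sigma_n \ \text{i.i.d. uniform on } \{-1,+1\}

% (data-dependent) information / PAC-Bayesian complexity of a
% posterior \hat\Pi relative to a prior \Pi:
\mathrm{KL}(\hat\Pi \,\Vert\, \Pi)
  = \mathbb{E}_{f \sim \hat\Pi}\Bigl[\log \frac{d\hat\Pi}{d\Pi}(f)\Bigr]
```

The paper's new complexity sits between these: it is bounded above by the first for (penalized) ERM and by the second for generalized Bayesian estimators.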

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
arXiv: Learning, 2017. Co-authors: Peter Grunwald, Nishant A Mehta. Preprint version of the Algorithmic Learning Theory 2019 paper above; the abstract is identical.
Andreas Maurer

Uniform Concentration and Symmetrization for Weak Interactions
Conference on Learning Theory, 2019. Co-authors: Andreas Maurer, Massimiliano Pontil.
Abstract: The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothed L-statistics, and error functionals of $\ell_2$-regularized algorithms.

A Vector-Contraction Inequality for Rademacher Complexities
Algorithmic Learning Theory, 2016. Co-authors: Andreas Maurer.
Abstract: The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary i.i.d. symmetric and sub-Gaussian variables. Example applications are given for multi-category learning, K-means clustering, and learning-to-learn.
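As a reference point, the inequality in question can be stated roughly as follows. This is a paraphrase of the commonly cited form, not a quotation from the paper: for functions $g_i : \mathbb{R}^K \to \mathbb{R}$ that are $L$-Lipschitz with respect to the Euclidean norm, a class $\mathcal{F}$ of $\mathbb{R}^K$-valued functions $f = (f_1, \dots, f_K)$, and i.i.d. Rademacher variables $\sigma_i$ and $\sigma_{ik}$,

```latex
\mathbb{E}_{\sigma}\,\sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \sigma_i\, g_i\bigl(f(x_i)\bigr)
  \;\le\; \sqrt{2}\, L\;
\mathbb{E}_{\sigma}\,\sup_{f \in \mathcal{F}} \sum_{i=1}^{n}\sum_{k=1}^{K} \sigma_{ik}\, f_k(x_i)
```

That is, a Lipschitz composition over a vector-valued class is controlled by the Rademacher average of the coordinate functions, with an independent sign per coordinate.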

The Rademacher Complexity of Linear Transformation Classes
Conference on Learning Theory, 2006. Co-authors: Andreas Maurer.
Abstract: Bounds are given for the empirical and expected Rademacher complexity of classes of linear transformations from a Hilbert space H to a finite-dimensional space. The results imply generalization guarantees for graph regularization and multi-task subspace learning.
Peter Grunwald

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Algorithmic Learning Theory, 2019. Co-authors: Peter Grunwald, Nishant A Mehta.
Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian complexity), $\mathrm{KL}(\text{posterior} \,\Vert\, \text{prior})$. For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity, via Rademacher complexity, to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case with $L_\infty$ entropy. Together, these results recover optimal bounds for VC classes and large (polynomial-entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
arXiv: Learning, 2017. Co-authors: Peter Grunwald, Nishant A Mehta. Preprint version of the Algorithmic Learning Theory 2019 paper above; the abstract is identical.
Mehryar Mohri

On the Rademacher Complexity of Linear Hypothesis Sets
arXiv: Learning, 2020. Co-authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri.
Abstract: Linear predictors form a rich class of hypotheses used in a variety of learning algorithms. We present a tight analysis of the empirical Rademacher complexity of the family of linear hypothesis classes with weight vectors bounded in $\ell_p$-norm for any $p \geq 1$. This provides a tight analysis of generalization using these hypothesis sets and helps derive sharp data-dependent learning guarantees. We give both upper and lower bounds on the Rademacher complexity of these families and show that our bounds improve upon or match existing bounds, which are known only for $1 \leq p \leq 2$.
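The classical $p = 2$ case admits a closed-form supremum, which makes it easy to check numerically. The following is a minimal sketch (synthetic data and constants of my choosing, not taken from the paper): for $\mathcal{F} = \{x \mapsto \langle w, x\rangle : \|w\|_2 \le W\}$, the empirical Rademacher complexity is $\mathbb{E}_\sigma\,(W/n)\,\|\sum_i \sigma_i x_i\|_2$, which is bounded above by $(W/n)(\sum_i \|x_i\|_2^2)^{1/2}$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, W = 50, 5, 2.0          # sample size, dimension, weight-norm budget (illustrative)
X = rng.normal(size=(n, d))   # synthetic sample

# For the l2 ball {w : ||w||_2 <= W}, the sup over w of (1/n) sum_i s_i <w, x_i>
# has a closed form: (W/n) * || sum_i s_i x_i ||_2 (Cauchy-Schwarz is tight).
def sup_correlation(signs):
    return (W / n) * np.linalg.norm(signs @ X)

# Monte Carlo average over Rademacher sign vectors.
draws = 2000
signs = rng.choice([-1.0, 1.0], size=(draws, n))
emp_rad = np.mean([sup_correlation(s) for s in signs])

# Classical upper bound via Jensen: (W/n) * sqrt(sum_i ||x_i||_2^2).
bound = (W / n) * np.sqrt(np.sum(X ** 2))

print(f"empirical Rademacher complexity ~ {emp_rad:.4f}, bound = {bound:.4f}")
assert emp_rad <= bound
```

The gap between the Monte Carlo estimate and the bound is exactly the Jensen gap between $\mathbb{E}\|\cdot\|$ and $(\mathbb{E}\|\cdot\|^2)^{1/2}$; the paper's contribution concerns the harder regimes $p \neq 2$, which this sketch does not cover.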

Learning Kernels Using Local Rademacher Complexity
Neural Information Processing Systems, 2013. Co-authors: Corinna Cortes, Marius Kloft, Mehryar Mohri.
Abstract: We use the notion of local Rademacher complexity to design new algorithms for learning kernels. Our algorithms thereby benefit from the sharper learning bounds based on that notion, which, under certain general conditions, guarantee a faster convergence rate. We devise two new learning-kernel algorithms: one based on a convex optimization problem for which we give an efficient solution using existing learning-kernel techniques, and another that can be formulated as a DC-programming problem, for which we describe a solution in detail. We also report the results of experiments with both algorithms in binary and multi-class classification tasks.

Rademacher Complexity Bounds for Non-i.i.d. Processes
Neural Information Processing Systems, 2008. Co-authors: Mehryar Mohri, Afshin Rostamizadeh.
Abstract: This paper presents the first Rademacher-complexity-based error bounds for non-i.i.d. settings, a generalization of similar existing bounds derived for the i.i.d. case. Our bounds hold in the scenario of dependent samples generated by a stationary β-mixing process, which is commonly adopted in many previous studies of non-i.i.d. settings. They benefit from the crucial advantages of Rademacher complexity over other measures of the complexity of hypothesis classes. In particular, they are data-dependent and measure the complexity of a class of hypotheses based on the training sample. The empirical Rademacher complexity can be estimated from such finite samples and leads to tighter generalization bounds. We also present the first margin bounds for kernel-based classification in this non-i.i.d. setting and briefly study their convergence.
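The claim that the empirical Rademacher complexity "can be estimated from such finite samples" is easy to illustrate, since the quantity conditions on the observed sample. Below is a minimal sketch on synthetic data (my own toy example, not from the paper, and using a fixed i.i.d.-style sample rather than a β-mixing process): for a small finite class of threshold classifiers, a Monte Carlo estimate over sign vectors is compared against Massart's finite-class bound $\sqrt{2 \ln|\mathcal{F}| / n}$.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 200
x = np.sort(rng.uniform(size=n))            # synthetic 1-d sample (illustrative)
thresholds = np.linspace(0.1, 0.9, 9)       # a small finite class of decision stumps

# Predictions f_t(x) = sign(x - t) in {-1, +1}, shape (|F|, n).
preds = np.where(x[None, :] > thresholds[:, None], 1.0, -1.0)

# Empirical Rademacher complexity: E_sigma sup_f (1/n) sum_i sigma_i f(x_i),
# approximated by averaging the per-draw supremum over the finite class.
draws = 3000
signs = rng.choice([-1.0, 1.0], size=(draws, n))
emp_rad = np.mean(np.max(signs @ preds.T, axis=1)) / n

# Massart's finite-class bound: sqrt(2 ln|F| / n) for {-1,+1}-valued classes.
massart = np.sqrt(2 * np.log(len(thresholds)) / n)

print(f"estimated complexity ~ {emp_rad:.4f}, Massart bound = {massart:.4f}")
assert emp_rad <= massart
```

The estimate sits well below the bound here because the stumps are highly correlated; this data-dependence is exactly what makes empirical Rademacher bounds tighter than worst-case combinatorial ones.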
Alexander Volberg

Rademacher Type and Enflo Type Coincide
Annals of Mathematics, 2020. Co-authors: Paata Ivanisvili, Ramon Van Handel, Alexander Volberg.
Abstract: A nonlinear analogue of the Rademacher type of a Banach space was introduced in classical work of Enflo. The key feature of Enflo type is that its definition uses only the metric structure of the Banach space, while the definition of Rademacher type relies on its linear structure. We prove that Rademacher type and Enflo type coincide, settling a long-standing open problem in Banach space theory. The proof is based on a novel dimension-free analogue of Pisier's inequality on the discrete cube.
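For context, the two notions can be written out as follows. These are the standard definitions, paraphrased rather than quoted from the paper: a Banach space $X$ has Rademacher type $p$ with constant $T$ if the first inequality holds for all $x_1, \dots, x_n \in X$, and Enflo type $p$ if the second holds for every function $f : \{-1,1\}^n \to X$, where $\varepsilon \oplus e_i$ denotes $\varepsilon$ with its $i$-th coordinate flipped:

```latex
% Rademacher type p (uses the linear structure):
\mathbb{E}_{\varepsilon}\Bigl\| \sum_{i=1}^{n} \varepsilon_i x_i \Bigr\|^{p}
  \;\le\; T^{p} \sum_{i=1}^{n} \| x_i \|^{p}

% Enflo type p (uses only the metric structure):
\mathbb{E}_{\varepsilon}\bigl\| f(\varepsilon) - f(-\varepsilon) \bigr\|^{p}
  \;\le\; T^{p} \sum_{i=1}^{n}
    \mathbb{E}_{\varepsilon}\bigl\| f(\varepsilon) - f(\varepsilon \oplus e_i) \bigr\|^{p}
```

Taking $f$ linear, $f(\varepsilon) = \sum_i \varepsilon_i x_i$, the factors of $2^p$ on both sides cancel and the Enflo inequality reduces exactly to the Rademacher one, which is why Enflo type is the nonlinear generalization; the theorem says the converse implication holds as well, up to the constant.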