The experts below are selected from a list of 360 experts worldwide, ranked by the ideXlab platform.
Nishant A Mehta  One of the best experts on this subject based on the ideXlab platform.

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Algorithmic Learning Theory, 2019. Co-authors: Peter Grunwald, Nishant A Mehta.
Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian complexity), $\mathrm{KL}(\text{posterior} \,\Vert\, \text{prior})$. For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity, via Rademacher complexity, to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case with $L_\infty$ entropy. Together, these results recover optimal bounds for VC classes and large (polynomial-entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.
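For orientation, the two endpoint complexities the abstract interpolates between can be written out explicitly. These are the standard textbook definitions, not quotations from the paper, for a function class $\mathcal{F}$ and sample size $n$:

```latex
% (data-independent) Rademacher complexity:
\mathcal{R}_n(\mathcal{F})
  = \mathbb{E}_{\sigma}\Bigl[\,\sup_{f \in \mathcal{F}}
      \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(Z_i)\Bigr],
\qquad \sigma_1,\dots,\sigma_n \ \text{i.i.d. uniform on } \{-1,+1\}

% (data-dependent) information / PAC-Bayesian complexity of a
% posterior \hat\Pi relative to a prior \Pi:
\mathrm{KL}(\hat\Pi \,\Vert\, \Pi)
  = \mathbb{E}_{f \sim \hat\Pi}\Bigl[\log \frac{d\hat\Pi}{d\Pi}(f)\Bigr]
```

The paper's new complexity sits between these: it is bounded above by the first for (penalized) ERM and by the second for generalized Bayesian estimators.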

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
arXiv: Learning, 2017. Co-authors: Peter Grunwald, Nishant A Mehta. Preprint version of the Algorithmic Learning Theory 2019 paper above; the abstract is identical.
Andreas Maurer

Uniform Concentration and Symmetrization for Weak Interactions
Conference on Learning Theory, 2019. Co-authors: Andreas Maurer, Massimiliano Pontil.
Abstract: The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothed L-statistics, and error functionals of $\ell_2$-regularized algorithms.

A Vector-Contraction Inequality for Rademacher Complexities
Algorithmic Learning Theory, 2016. Co-authors: Andreas Maurer.
Abstract: The contraction inequality for Rademacher averages is extended to Lipschitz functions with vector-valued domains, and it is also shown that in the bounding expression the Rademacher variables can be replaced by arbitrary i.i.d. symmetric and sub-Gaussian variables. Example applications are given for multi-category learning, K-means clustering, and learning-to-learn.
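As a reference point, the inequality in question can be stated roughly as follows. This is a paraphrase of the commonly cited form, not a quotation from the paper: for functions $g_i : \mathbb{R}^K \to \mathbb{R}$ that are $L$-Lipschitz with respect to the Euclidean norm, a class $\mathcal{F}$ of $\mathbb{R}^K$-valued functions $f = (f_1, \dots, f_K)$, and i.i.d. Rademacher variables $\sigma_i$ and $\sigma_{ik}$,

```latex
\mathbb{E}_{\sigma}\,\sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \sigma_i\, g_i\bigl(f(x_i)\bigr)
  \;\le\; \sqrt{2}\, L\;
\mathbb{E}_{\sigma}\,\sup_{f \in \mathcal{F}} \sum_{i=1}^{n}\sum_{k=1}^{K} \sigma_{ik}\, f_k(x_i)
```

That is, a Lipschitz composition over a vector-valued class is controlled by the Rademacher average of the coordinate functions, with an independent sign per coordinate.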

The Rademacher Complexity of Linear Transformation Classes
Conference on Learning Theory, 2006. Co-authors: Andreas Maurer.
Abstract: Bounds are given for the empirical and expected Rademacher complexity of classes of linear transformations from a Hilbert space H to a finite-dimensional space. The results imply generalization guarantees for graph regularization and multi-task subspace learning.
Peter Grunwald

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Algorithmic Learning Theory, 2019. Co-authors: Peter Grunwald, Nishant A Mehta.
Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian complexity), $\mathrm{KL}(\text{posterior} \,\Vert\, \text{prior})$. For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity, via Rademacher complexity, to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case with $L_\infty$ entropy. Together, these results recover optimal bounds for VC classes and large (polynomial-entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
arXiv: Learning, 2017. Co-authors: Peter Grunwald, Nishant A Mehta. Preprint version of the Algorithmic Learning Theory 2019 paper above; the abstract is identical.
Mehryar Mohri

On the Rademacher Complexity of Linear Hypothesis Sets
arXiv: Learning, 2020. Co-authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri.
Abstract: Linear predictors form a rich class of hypotheses used in a variety of learning algorithms. We present a tight analysis of the empirical Rademacher complexity of the family of linear hypothesis classes with weight vectors bounded in $\ell_p$-norm for any $p \geq 1$. This provides a tight analysis of generalization using these hypothesis sets and helps derive sharp data-dependent learning guarantees. We give both upper and lower bounds on the Rademacher complexity of these families and show that our bounds improve upon or match existing bounds, which are known only for $1 \leq p \leq 2$.
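The classical $p = 2$ case admits a closed-form supremum, which makes it easy to check numerically. The following is a minimal sketch (synthetic data and constants of my choosing, not taken from the paper): for $\mathcal{F} = \{x \mapsto \langle w, x\rangle : \|w\|_2 \le W\}$, the empirical Rademacher complexity is $\mathbb{E}_\sigma\,(W/n)\,\|\sum_i \sigma_i x_i\|_2$, which is bounded above by $(W/n)(\sum_i \|x_i\|_2^2)^{1/2}$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, W = 50, 5, 2.0          # sample size, dimension, weight-norm budget (illustrative)
X = rng.normal(size=(n, d))   # synthetic sample

# For the l2 ball {w : ||w||_2 <= W}, the sup over w of (1/n) sum_i s_i <w, x_i>
# has a closed form: (W/n) * || sum_i s_i x_i ||_2 (Cauchy-Schwarz is tight).
def sup_correlation(signs):
    return (W / n) * np.linalg.norm(signs @ X)

# Monte Carlo average over Rademacher sign vectors.
draws = 2000
signs = rng.choice([-1.0, 1.0], size=(draws, n))
emp_rad = np.mean([sup_correlation(s) for s in signs])

# Classical upper bound via Jensen: (W/n) * sqrt(sum_i ||x_i||_2^2).
bound = (W / n) * np.sqrt(np.sum(X ** 2))

print(f"empirical Rademacher complexity ~ {emp_rad:.4f}, bound = {bound:.4f}")
assert emp_rad <= bound
```

The gap between the Monte Carlo estimate and the bound is exactly the Jensen gap between $\mathbb{E}\|\cdot\|$ and $(\mathbb{E}\|\cdot\|^2)^{1/2}$; the paper's contribution concerns the harder regimes $p \neq 2$, which this sketch does not cover.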

Learning Kernels Using Local Rademacher Complexity
Neural Information Processing Systems, 2013. Co-authors: Corinna Cortes, Marius Kloft, Mehryar Mohri.
Abstract: We use the notion of local Rademacher complexity to design new algorithms for learning kernels. Our algorithms thereby benefit from the sharper learning bounds based on that notion, which, under certain general conditions, guarantee a faster convergence rate. We devise two new learning-kernel algorithms: one based on a convex optimization problem for which we give an efficient solution using existing learning-kernel techniques, and another that can be formulated as a DC-programming problem, for which we describe a solution in detail. We also report the results of experiments with both algorithms in binary and multi-class classification tasks.

Rademacher Complexity Bounds for Non-i.i.d. Processes
Neural Information Processing Systems, 2008. Co-authors: Mehryar Mohri, Afshin Rostamizadeh.
Abstract: This paper presents the first Rademacher-complexity-based error bounds for non-i.i.d. settings, a generalization of similar existing bounds derived for the i.i.d. case. Our bounds hold in the scenario of dependent samples generated by a stationary β-mixing process, which is commonly adopted in many previous studies of non-i.i.d. settings. They benefit from the crucial advantages of Rademacher complexity over other measures of the complexity of hypothesis classes. In particular, they are data-dependent and measure the complexity of a class of hypotheses based on the training sample. The empirical Rademacher complexity can be estimated from such finite samples and leads to tighter generalization bounds. We also present the first margin bounds for kernel-based classification in this non-i.i.d. setting and briefly study their convergence.
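The claim that the empirical Rademacher complexity "can be estimated from such finite samples" is easy to illustrate, since the quantity conditions on the observed sample. Below is a minimal sketch on synthetic data (my own toy example, not from the paper, and using a fixed i.i.d.-style sample rather than a β-mixing process): for a small finite class of threshold classifiers, a Monte Carlo estimate over sign vectors is compared against Massart's finite-class bound $\sqrt{2 \ln|\mathcal{F}| / n}$.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 200
x = np.sort(rng.uniform(size=n))            # synthetic 1-d sample (illustrative)
thresholds = np.linspace(0.1, 0.9, 9)       # a small finite class of decision stumps

# Predictions f_t(x) = sign(x - t) in {-1, +1}, shape (|F|, n).
preds = np.where(x[None, :] > thresholds[:, None], 1.0, -1.0)

# Empirical Rademacher complexity: E_sigma sup_f (1/n) sum_i sigma_i f(x_i),
# approximated by averaging the per-draw supremum over the finite class.
draws = 3000
signs = rng.choice([-1.0, 1.0], size=(draws, n))
emp_rad = np.mean(np.max(signs @ preds.T, axis=1)) / n

# Massart's finite-class bound: sqrt(2 ln|F| / n) for {-1,+1}-valued classes.
massart = np.sqrt(2 * np.log(len(thresholds)) / n)

print(f"estimated complexity ~ {emp_rad:.4f}, Massart bound = {massart:.4f}")
assert emp_rad <= massart
```

The estimate sits well below the bound here because the stumps are highly correlated; this data-dependence is exactly what makes empirical Rademacher bounds tighter than worst-case combinatorial ones.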
Alexander Volberg

Rademacher Type and Enflo Type Coincide
Annals of Mathematics, 2020. Co-authors: Paata Ivanisvili, Ramon Van Handel, Alexander Volberg.
Abstract: A nonlinear analogue of the Rademacher type of a Banach space was introduced in classical work of Enflo. The key feature of Enflo type is that its definition uses only the metric structure of the Banach space, while the definition of Rademacher type relies on its linear structure. We prove that Rademacher type and Enflo type coincide, settling a long-standing open problem in Banach space theory. The proof is based on a novel dimension-free analogue of Pisier's inequality on the discrete cube.
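For context, the two notions can be written out as follows. These are the standard definitions, paraphrased rather than quoted from the paper: a Banach space $X$ has Rademacher type $p$ with constant $T$ if the first inequality holds for all $x_1, \dots, x_n \in X$, and Enflo type $p$ if the second holds for every function $f : \{-1,1\}^n \to X$, where $\varepsilon \oplus e_i$ denotes $\varepsilon$ with its $i$-th coordinate flipped:

```latex
% Rademacher type p (uses the linear structure):
\mathbb{E}_{\varepsilon}\Bigl\| \sum_{i=1}^{n} \varepsilon_i x_i \Bigr\|^{p}
  \;\le\; T^{p} \sum_{i=1}^{n} \| x_i \|^{p}

% Enflo type p (uses only the metric structure):
\mathbb{E}_{\varepsilon}\bigl\| f(\varepsilon) - f(-\varepsilon) \bigr\|^{p}
  \;\le\; T^{p} \sum_{i=1}^{n}
    \mathbb{E}_{\varepsilon}\bigl\| f(\varepsilon) - f(\varepsilon \oplus e_i) \bigr\|^{p}
```

Taking $f$ linear, $f(\varepsilon) = \sum_i \varepsilon_i x_i$, the factors of $2^p$ on both sides cancel and the Enflo inequality reduces exactly to the Rademacher one, which is why Enflo type is the nonlinear generalization; the theorem says the converse implication holds as well, up to the constant.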