Variance Estimation


The experts below are selected from a list of 95,253 experts worldwide, ranked by the ideXlab platform.

Ning Hao - One of the best experts on this subject based on the ideXlab platform.

  • Variance Estimation using refitted cross-validation in ultrahigh dimensional regression
    Journal of the Royal Statistical Society, Series B (Statistical Methodology), 2012
    Co-Authors: Jianqing Fan, Shaojun Guo, Ning Hao
    Abstract:

    Variance estimation is a fundamental problem in statistical modelling. In ultrahigh dimensional linear regression, where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noises are actually predicted when extra irrelevant variables are selected, leading to a serious underestimate of the noise level. We propose a two-stage refitted procedure via a data-splitting technique, called refitted cross-validation, to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows the mean regression function in advance. Simulation studies lend further support to our theoretical claims. The naive two-stage estimator and the plug-in one-stage estimators using the lasso and the smoothly clipped absolute deviation penalty are also studied and compared; their performance can be improved by the proposed refitted cross-validation method.
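
The two-stage logic of refitted cross-validation is compact enough to sketch in code. The Python sketch below is an illustration only: LassoCV stands in for an arbitrary first-stage selector, and the function name and parameter choices are assumptions rather than the authors' implementation.

```python
# A minimal sketch of refitted cross-validation (RCV) for variance
# estimation in a sparse linear model y = X @ beta + eps.
# LassoCV as the stage-1 selector is an illustrative assumption.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def rcv_variance(X, y, random_state=0):
    n = X.shape[0]
    idx = np.random.default_rng(random_state).permutation(n)
    a, b = idx[: n // 2], idx[n // 2:]

    def half_estimate(select_idx, refit_idx):
        # Stage 1: select variables on one half of the data.
        sel = LassoCV(cv=5).fit(X[select_idx], y[select_idx])
        support = np.flatnonzero(sel.coef_)
        if support.size == 0:
            return np.var(y[refit_idx], ddof=1)
        # Stage 2: refit by OLS on the *other* half, so variables kept
        # only through spurious correlation add no fit there and the
        # residuals are not deflated.
        ols = LinearRegression().fit(X[refit_idx][:, support], y[refit_idx])
        resid = y[refit_idx] - ols.predict(X[refit_idx][:, support])
        dof = len(refit_idx) - support.size - 1  # -1 for the intercept
        return (resid @ resid) / dof

    # Average the two symmetric half-sample estimates.
    return 0.5 * (half_estimate(a, b) + half_estimate(b, a))
```

The essential point is that selection and refitting always use disjoint halves of the data, which is what keeps the noise level from being underestimated.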

  • Variance Estimation using refitted cross-validation in ultrahigh dimensional regression
    arXiv: Methodology, 2010
    Co-Authors: Jianqing Fan, Shaojun Guo, Ning Hao
    Abstract:

    Variance estimation is a fundamental problem in statistical modeling. In ultrahigh dimensional linear regressions, where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regressions make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noises are actually predicted when extra irrelevant variables are selected, leading to a serious underestimate of the noise level. In this paper, we propose a two-stage refitted procedure via a data-splitting technique, called refitted cross-validation (RCV), to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows the mean regression function in advance. Simulation studies lend further support to our theoretical claims. The naive two-stage estimator, which refits on the variables selected in the first stage, and the plug-in one-stage estimators using the LASSO and SCAD are also studied and compared; their performance can be improved by the proposed RCV method.
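
For contrast with RCV above, the naive two-stage estimator mentioned in the abstract selects and refits on the same data; a hedged sketch, under the same illustrative conventions as before:

```python
# A sketch of the naive two-stage estimator: selection and refitting
# use the *same* data, so residuals absorb spurious correlations and
# the noise level is underestimated. Illustrative, not the paper's code.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def naive_two_stage_variance(X, y):
    sel = LassoCV(cv=5).fit(X, y)                    # stage 1: select on full data
    support = np.flatnonzero(sel.coef_)
    if support.size == 0:
        return np.var(y, ddof=1)
    ols = LinearRegression().fit(X[:, support], y)   # stage 2: refit on same data
    resid = y - ols.predict(X[:, support])
    return (resid @ resid) / (len(y) - support.size - 1)
```

Because the first stage has already chased part of the realized noise on the full sample, the refitted residuals come out systematically too small; this is precisely the bias that RCV attenuates.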

Amaury Lendasse - One of the best experts on this subject based on the ideXlab platform.

  • Residual Variance Estimation in machine learning
    Neurocomputing, 2009
    Co-Authors: Elia Liitiainen, Michel Verleysen, Francesco Corona, Amaury Lendasse
    Abstract:

    The problem of residual variance estimation consists of estimating the best possible generalization error obtainable by any model based on a finite sample of data. Even though it is a natural generalization of linear correlation, residual variance estimation in its general form has attracted relatively little attention in machine learning. In this paper, we examine four different residual variance estimators and analyze their properties both theoretically and experimentally to better understand their applicability to machine learning problems. The theoretical treatment differs from previous work, which concentrates on homoscedastic and additive noise, by being based on a general formulation of the problem that also covers heteroscedastic noise. In the second part of the paper, we demonstrate practical applications in input and model structure selection. The experimental results show that using residual variance estimators in these tasks gives good results, often with reduced computational complexity, and that the nearest-neighbour estimators are simple and easy to implement.
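
One simple member of the nearest-neighbour family examined in the paper can be sketched directly. The function below is an illustrative first-nearest-neighbour estimator under an assumed additive-noise model, not the paper's exact formulation:

```python
# A minimal first-nearest-neighbour residual variance estimator.
# If y_i = m(x_i) + eps_i with m smooth and Var(eps_i) = sigma^2,
# then E[(y_i - y_N(i))^2] ~ 2*sigma^2 plus a small smoothness bias.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_residual_variance(X, y):
    # Ask for two neighbours: the first returned is the point itself.
    _, idx = NearestNeighbors(n_neighbors=2).fit(X).kneighbors(X)
    neighbour = idx[:, 1]                 # nearest *other* point
    return 0.5 * np.mean((y - y[neighbour]) ** 2)
```

As the sample grows dense, the bias term m(x_i) - m(x_N(i)) vanishes for a smooth regression function, so the estimate converges to the noise variance without fitting any model.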

  • On nonparametric residual Variance Estimation
    Neural Processing Letters, 2008
    Co-Authors: Elia Liitiainen, Francesco Corona, Amaury Lendasse
    Abstract:

    In this paper, the problem of residual variance estimation is examined. The problem is analyzed in a general setting which covers non-additive heteroscedastic noise under non-i.i.d. sampling. To address the estimation problem, we suggest a method based on nearest-neighbour graphs and discuss its convergence properties under the assumption of a Hölder continuous regression function. The universality of the estimator makes it an ideal tool in problems where little prior knowledge is available.

  • Non-parametric residual Variance Estimation in supervised learning
    International Work-Conference on Artificial and Natural Neural Networks, 2007
    Co-Authors: Elia Liitiainen, Amaury Lendasse, Francesco Corona
    Abstract:

    The residual variance estimation problem is well known in statistics and machine learning, with many applications, for example in nonlinear modelling. In this paper, we show that the problem can be formulated in a general supervised learning context. Emphasis is on two widely used non-parametric techniques, known as the Delta test and the Gamma test. Under some regularity assumptions, a novel proof of convergence of the two estimators is formulated and subsequently verified and compared on two meaningful case studies.
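
Both techniques admit short sketches. The Delta test is, up to conventions, the first-nearest-neighbour estimator sketched earlier; the Gamma test extends it by computing the half mean-squared output difference gamma(k) and the mean-squared input distance delta(k) over the k nearest neighbours and extrapolating to delta = 0. The Python sketch below is illustrative; the number of neighbours K and the plain least-squares extrapolation are assumed choices:

```python
# A minimal sketch of the Gamma test: regress gamma(k) on delta(k)
# over k = 1..K nearest neighbours; the intercept at delta = 0
# estimates the residual variance. K = 10 is an assumed default.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def gamma_test(X, y, K=10):
    dist, idx = NearestNeighbors(n_neighbors=K + 1).fit(X).kneighbors(X)
    # Column 0 is each point itself; columns 1..K are true neighbours.
    delta = np.mean(dist[:, 1:] ** 2, axis=0)                  # delta(1..K)
    gamma = 0.5 * np.mean((y[idx[:, 1:]] - y[:, None]) ** 2, axis=0)
    slope, intercept = np.polyfit(delta, gamma, 1)             # extrapolate
    return intercept                                           # noise variance
```

Extrapolating to zero input distance removes the smoothness bias that the plain first-neighbour estimate retains, at the cost of the extra regularity assumptions the convergence proof makes explicit.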

Jianqing Fan - One of the best experts on this subject based on the ideXlab platform.

  • Error Variance Estimation in Ultrahigh-Dimensional Additive Models
    Journal of the American Statistical Association, 2017
    Co-Authors: Zhao Chen, Jianqing Fan
    Abstract:

    Error variance estimation plays an important role in statistical inference for high-dimensional regression models. This article concerns error variance estimation in high-dimensional sparse additive models. We study the asymptotic behavior of the traditional mean squared error, the naive estimate of the error variance, and show that it may significantly underestimate the error variance due to spurious correlations, which are even more severe in nonparametric models than in linear models. We further propose an accurate estimate of the error variance in ultrahigh-dimensional sparse additive models by effectively integrating sure independence screening and refitted cross-validation techniques. The root-n consistency and the asymptotic normality of the resulting estimate are established. We conduct a Monte Carlo simulation study to examine the finite-sample performance of the newly proposed estimate, and a real data example is used to illustrate the proposed methodology. Supplementary materials for this article are available online.
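
The screening step that the article integrates with refitted cross-validation can be sketched as follows. This is the basic linear-marginal version of sure independence screening and is an illustrative simplification: in the paper's additive-model setting the marginal fits are nonparametric, and the retained-set size d = n / log(n) is a common convention in the SIS literature rather than the article's exact rule.

```python
# A minimal sketch of sure independence screening (SIS): rank the
# predictors by absolute marginal correlation with the response and
# keep the top d, to be followed by RCV on the screened set.
import numpy as np

def sis_screen(X, y, d=None):
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))                  # common convention, assumed here
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc)                   # |marginal correlation|, up to scale
    return np.sort(np.argsort(score)[::-1][:d]) # indices of kept predictors
```

Screening first shrinks the problem to a size where the split-and-refit variance estimate is feasible; refitting on held-out halves then guards against the spurious correlations that screening alone cannot avoid.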
