Winsorization

The experts below are selected from a list of 237 experts worldwide, as ranked by the ideXlab platform.

Rand R Wilcox - One of the best experts on this subject based on the ideXlab platform.

  • Trimming and Winsorization
    Encyclopedia of Biostatistics, 2005
    Co-Authors: Rand R Wilcox
    Abstract:

    Distributions with very long tails or isolated outlying values can disturb the effectiveness of methods based on the normal distribution. Trimming involves the discarding of a proportion of sample observations at the two extremes before calculation of a mean. Winsorization involves the replacement of the discarded values by the most extreme retained values.

    Keywords: robustness; order statistics; simulation; mean

  • Estimating Winsorized correlations in a univariate or bivariate random effects model
    British Journal of Mathematical and Statistical Psychology, 1994
    Co-Authors: Rand R Wilcox
    Abstract:

    A well-known result is that slight departures from normality can have a large effect on the usual correlation coefficient rendering the magnitude of the correlation difficult to interpret and potentially misleading. In the context of a random effects model, which is the focus of attention in this paper, this means that effect size, as measured by the intraclass correlation, might be small due to outliers or heavy-tailed distributions rather than a lack of differences among the groups being compared. Similarly, a large intraclass correlation might be due to trivial shifts away from normality which would become small if an adjustment for non-normality were made. Moreover, this problem has to do with the effects of non-normality on population parameters, not just statistics, so problems can arise even with large sample sizes. This follows almost immediately from results in Tukey (1960), and it is briefly illustrated here. One approach to this problem is to use a Winsorized analogue of the intraclass correlation. This paper suggests three ways the Winsorized intraclass correlation might be estimated and compares them via simulations. A bivariate generalization of the random effects model is also considered, and two methods of estimating the group-level correlation are described and compared. Alternatives to Winsorization are also discussed.
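The abstracts above describe Winsorization as pulling the most extreme observations in to the nearest retained values, and mention a Winsorized analogue of a correlation. The following is a minimal sketch of that idea for an ordinary Pearson correlation, with each variable Winsorized separately. It is not one of the three intraclass estimators the 1994 paper compares; the function names, the default proportion of 0.2, and the sample data are all assumptions made for illustration.

```python
import math


def winsorize(values, proportion=0.2):
    """Return a copy in which the g lowest and g highest values are replaced
    by the nearest retained order statistics, with g = floor(proportion * n)."""
    n = len(values)
    g = int(math.floor(proportion * n))
    if n == 0 or 2 * g >= n:
        raise ValueError("proportion leaves no retained observations")
    ordered = sorted(values)
    low, high = ordered[g], ordered[n - g - 1]
    # Clamp every value into [low, high]; original ordering is preserved.
    return [min(max(v, low), high) for v in values]


def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def winsorized_correlation(x, y, proportion=0.2):
    """Pearson correlation computed after Winsorizing each margin separately."""
    return pearson(winsorize(x, proportion), winsorize(y, proportion))


# Hypothetical data: one wild value in x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(winsorize(x))                    # the two lowest and two highest values are pulled in
print(winsorized_correlation(x, x))    # a variable is perfectly correlated with itself
```

Winsorizing each margin on its own is only one possible design choice; it limits the influence of marginal outliers but does nothing special about points that are unusual only jointly.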

Donald H. McLaughlin - One of the best experts on this subject based on the ideXlab platform.

  • Less Vulnerable Confidence and Significance Procedures for Location Based on a Single Sample: Trimming/Winsorization 1
    2016
    Co-Authors: John W. Tukey, Donald H. McLaughlin
    Abstract:

    Summary. The vulnerability of Student's t, insofar as efficiency and power are concerned, leads to consideration of substitutes. Among the most promising are ratios of trimmed means to square roots of suitable quadratic forms involving the same order statistics. Matching, across underlying distributions, of ratios of average of denominator to variance of numerator leads to selection of the Winsorized sum of squared deviations as the basis for a denominator. The resulting trimmed t should prove more useful when the amount of trimming is made to depend on the individual sample in a suitably prescribed manner. Exact critical values for the resulting tailored t seem to require Monte Carlo computation, but use of a simple modified denominator for trimmed t allows us to use the conventional t tables as a reasonable approximation.
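The trimmed t described in this abstract can be sketched as follows. The abstract does not state the formula, so the form used here is an assumption taken from how the statistic is commonly cited: the trimmed mean minus the hypothesised location, divided by the square root of SSD_w / (h(h - 1)), where SSD_w is the Winsorized sum of squared deviations and h = n - 2g is the number of retained observations, referred to a t distribution with h - 1 degrees of freedom. The function name and arguments are hypothetical.

```python
import math


def trimmed_t(sample, g, mu0=0.0):
    """Return (t, df) for a trimmed t statistic with g values trimmed per tail.

    Assumed form: t = (trimmed mean - mu0) / sqrt(SSD_w / (h * (h - 1))),
    with h = n - 2g and df = h - 1.
    """
    n = len(sample)
    h = n - 2 * g
    if g < 0 or h < 2:
        raise ValueError("need at least two retained observations")
    ordered = sorted(sample)
    kept = ordered[g:n - g]                      # the h retained order statistics
    trimmed_mean = sum(kept) / h
    # Winsorized sample: each trimmed value is replaced by the nearest kept one.
    winsorized = [kept[0]] * g + kept + [kept[-1]] * g
    wins_mean = sum(winsorized) / n
    ssd_w = sum((v - wins_mean) ** 2 for v in winsorized)
    standard_error = math.sqrt(ssd_w / (h * (h - 1)))
    # Note: raises ZeroDivisionError if all retained values are identical.
    return (trimmed_mean - mu0) / standard_error, h - 1


# Hypothetical sample with one wild value; trim two observations per tail.
t_stat, df = trimmed_t([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], g=2, mu0=5.0)
print(t_stat, df)
```

With g = 0 the expression reduces to the usual one-sample Student's t, which is a convenient sanity check on the sketch.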

John W. Tukey - One of the best experts on this subject based on the ideXlab platform.

  • Less Vulnerable Confidence and Significance Procedures for Location Based on a Single Sample: Trimming/Winsorization 1
    2016
    Co-Authors: John W. Tukey, Donald H. McLaughlin
    Abstract:

    Summary. The vulnerability of Student's t, insofar as efficiency and power are concerned, leads to consideration of substitutes. Among the most promising are ratios of trimmed means to square roots of suitable quadratic forms involving the same order statistics. Matching, across underlying distributions, of ratios of average of denominator to variance of numerator leads to selection of the Winsorized sum of squared deviations as the basis for a denominator. The resulting trimmed t should prove more useful when the amount of trimming is made to depend on the individual sample in a suitably prescribed manner. Exact critical values for the resulting tailored t seem to require Monte Carlo computation, but use of a simple modified denominator for trimmed t allows us to use the conventional t tables as a reasonable approximation.

Karen Kafadar - One of the best experts on this subject based on the ideXlab platform.

  • Trimming and Winsorization
    Encyclopedia of Environmetrics, 2006
    Co-Authors: Paul S Horn, Karen Kafadar
    Abstract:

    Frequently, a sample may contain observations that differ substantially from the other data values. There are various ways to deal with the possibility of such values so that the resulting analysis will not be adversely affected if they are present. All of these methods assign weights to observations according to some well-conceived algorithm. In this article we concentrate on two forms of assigning weights to observations. The first is trimming, which corresponds to assigning zero weight to a prespecified fraction of the extreme observations and equal weights to the rest. The second is Winsorizing, which also assigns zero weight to the extreme observations, as in trimming, but assigns extra weight to the most extreme observations in the retained sample. In this article we will define these concepts for a single sample of observations, and briefly discuss the applications to more complex problems like regression.
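The weighting view in this abstract can be made concrete with a short sketch. It assumes symmetric trimming of g observations from each tail of a sample of size n, so that the trimmed mean and the Winsorized mean are both weighted averages of the order statistics. The function names and the sample are illustrative, not taken from the article.

```python
def trimming_weights(n, g):
    """Weights on the order statistics for a g-per-tail trimmed mean:
    zero for the g lowest and g highest, equal weight 1/h for the h = n - 2g kept."""
    h = n - 2 * g
    if g < 0 or h < 1:
        raise ValueError("need at least one retained observation")
    return [0.0] * g + [1.0 / h] * h + [0.0] * g


def winsorizing_weights(n, g):
    """Weights for a g-per-tail Winsorized mean: zero for the extremes,
    1/n for each kept value, plus an extra g/n on the two outermost kept values."""
    h = n - 2 * g
    if g < 0 or h < 1:
        raise ValueError("need at least one retained observation")
    weights = [0.0] * g + [1.0 / n] * h + [0.0] * g
    weights[g] += g / n            # lowest retained order statistic
    weights[n - g - 1] += g / n    # highest retained order statistic
    return weights


def weighted_mean_of_order_statistics(sample, weights):
    """Apply the weights to the sorted sample."""
    return sum(w * v for w, v in zip(weights, sorted(sample)))


# Hypothetical sample with one wild value; n = 10, g = 2.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(winsorizing_weights(10, 2))
print(weighted_mean_of_order_statistics(sample, trimming_weights(10, 2)))
print(weighted_mean_of_order_statistics(sample, winsorizing_weights(10, 2)))
```

Both weight vectors sum to one; the only difference is where the weight removed from the extremes is placed, spread evenly over the kept values for trimming or piled onto the two outermost kept values for Winsorizing.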

Christopher I. Amos - One of the best experts on this subject based on the ideXlab platform.

  • Effect of Winsorization on Power and Type 1 Error of Variance Components and Related Methods of QTL Detection
    Behavior Genetics, 2004
    Co-Authors: Sanjay Shete, T. Mark Beasley, Carol J. Etzel, Jose R. Fernandez, Jianfang Chen, David B. Allison, Christopher I. Amos
    Abstract:

    Variance components analysis provides an efficient method for performing linkage analysis for quantitative traits. However, power and type 1 error of variance components–based likelihood ratio testing may be affected when phenotypic data are nonnormally distributed (especially with high values of kurtosis) and there is moderate to high correlation among the siblings. Winsorization can reduce the effect of outliers on statistical analyses. Here, we considered the effect of Winsorization on variance components–based tests. We considered the likelihood ratio test (LRT), the Wald test, and some robust variance components tests. We compared these tests with Haseman-Elston least squares–based tests. We found that power to detect linkage is significantly increased after Winsorization of the nonnormal phenotypes. Winsorization does not greatly diminish the type 1 error for the variance components–based tests for markedly nonnormal data. A robust version of the LRT that adjusts for sample kurtosis showed the best power for nonnormal data. Finally, phenotype Winsorization of nonnormal data reduces the bias in estimation of the major gene variance component.

  • Improving the Power of Sib Pair Quantitative Trait Loci Detection by Phenotype Winsorization
    Human Heredity, 2002
    Co-Authors: Jose R. Fernandez, Sanjay Shete, Carol J. Etzel, Christopher I. Amos, T. Mark Beasley, David B. Allison
    Abstract:

    Objectives: In sib pair studies, quantitative trait loci (QTL) identification may be adversely affected by non-normality in the phenotypic distribution, particularly when subjects f