Underlying Population Distribution

The experts below are selected from a list of 20,652 experts worldwide, ranked by the ideXlab platform.

Jonathan Rose - One of the best experts on this subject based on the ideXlab platform.

  • Risky Business: Factor Analysis of Survey Data – Assessing the Probability of Incorrect Dimensionalisation
    PLOS ONE, 2015
    Co-Authors: Cees Van Der Eijk, Jonathan Rose
    Abstract:

    This paper undertakes a systematic assessment of the extent to which factor analysis identifies the correct number of latent dimensions (factors) when applied to ordered-categorical survey items (so-called Likert items). We simulate 2400 data sets of uni-dimensional Likert items that vary systematically over a range of conditions such as the Underlying Population Distribution, the number of items, the level of random error, and characteristics of items and item-sets. Each of these datasets is factor analysed in a variety of ways that are frequently used in the extant literature, or that are recommended in current methodological texts. These include exploratory factor retention heuristics such as Kaiser’s criterion, Parallel Analysis and a non-graphical scree test, and (for exploratory and confirmatory analyses) evaluations of model fit. These analyses are conducted on the basis of Pearson and polychoric correlations. We find that, irrespective of the particular mode of analysis, factor analysis applied to ordered-categorical survey data very often leads to over-dimensionalisation. The magnitude of this risk depends on the specific way in which factor analysis is conducted, the number of items, the properties of the set of items, and the Underlying Population Distribution. The paper concludes with a discussion of the consequences of over-dimensionalisation, and a brief mention of alternative modes of analysis that are much less prone to such problems.
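The simulation design described above can be illustrated with a small, self-contained sketch. The Python snippet below is not the authors' simulation code: the number of items, the loadings, the category thresholds, and the sample size are all illustrative assumptions. It generates uni-dimensional Likert items by discretising a single latent normal trait and then applies one of the retention heuristics named in the abstract, Kaiser's eigenvalue-greater-than-one criterion, to the Pearson correlation matrix.

```python
# Illustrative sketch (not the paper's simulation design): generate
# uni-dimensional Likert items by discretising a latent normal trait,
# then apply Kaiser's criterion (eigenvalues of the Pearson correlation
# matrix greater than 1). All settings below are assumptions.
import numpy as np

rng = np.random.default_rng(42)
n_respondents, n_items = 500, 6
loadings = np.full(n_items, 0.7)          # one common factor drives every item

# Latent trait plus item-specific random error (unique variance).
latent = rng.standard_normal(n_respondents)
errors = rng.standard_normal((n_respondents, n_items)) * np.sqrt(1 - loadings**2)
continuous = latent[:, None] * loadings + errors

# Discretise into 5 ordered categories; symmetric thresholds are an
# illustrative choice (a skewed underlying distribution would shift them).
thresholds = np.array([-1.5, -0.5, 0.5, 1.5])
likert = np.digitize(continuous, thresholds) + 1   # categories 1..5

# Kaiser's criterion on the Pearson correlation matrix of the ordinal items.
eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(likert, rowvar=False)))[::-1]
print("eigenvalues:", np.round(eigenvalues, 2))
print("factors retained by Kaiser's criterion:", int(np.sum(eigenvalues > 1.0)))
```

With six symmetrically distributed, well-behaved items this heuristic usually retains a single factor; as the abstract notes, the magnitude of the over-dimensionalisation risk depends on how the analysis is conducted, the number and properties of the items, and the Underlying Population Distribution.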

Prathiba Natesan - One of the best experts on this subject based on the ideXlab platform.

  • Evaluation of threshold selection methods for adaptive kernel density estimation in disease mapping
    International Journal of Health Geographics, 2018
    Co-Authors: Warangkana Ruckthongsook, Chetan Tiwari, Joseph R. Oppong, Prathiba Natesan
    Abstract:

    Background: Maps of disease rates produced without careful consideration of the Underlying Population Distribution may be unreliable due to the well-known small numbers problem. Smoothing methods such as Kernel Density Estimation (KDE) are employed to control the Population basis of spatial support used to calculate each disease rate. The degree of smoothing is controlled by a user-defined parameter (bandwidth or threshold) which influences the resolution of the disease map and the reliability of the computed rates. Methods for automatically selecting a smoothing parameter, such as the normal scale, plug-in, and smoothed cross-validation bandwidth selectors, have been proposed for use with non-spatial data, but their relative utilities remain unknown. This study assesses the relative performance of these methods in terms of resolution and reliability for disease mapping.

    Results: Using a simulated dataset of heart disease mortality among males aged 35 years and older in Texas, we assess methods for automatically selecting a smoothing parameter. Our results show that while all parameter choices accurately estimate the overall state rates, they vary in terms of the degree of spatial resolution. Further, parameter choices resulting in desirable characteristics for one subgroup of the Population (e.g., a specific age group) may not necessarily be appropriate for other groups.

    Conclusion: We show that the appropriate threshold value depends on the characteristics of the data, and that bandwidth selector algorithms can be used to guide such decisions about mapping parameters. An unguided choice may produce maps that distort the balance of resolution and statistical reliability.
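The bandwidth trade-off described above can be seen even in a toy, non-spatial example. The sketch below uses scipy's ordinary fixed-bandwidth gaussian_kde rather than the adaptive, population-controlled KDE evaluated in the paper, and the case locations are invented; it only shows how rule-of-thumb selectors ("scott", "silverman") and a manually chosen bandwidth change the resulting density estimate.

```python
# Minimal sketch of how the smoothing parameter changes a kernel density
# estimate. This uses ordinary fixed-bandwidth KDE from scipy, not the
# adaptive, population-controlled KDE assessed in the paper; the 1-D case
# locations are invented for illustration.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical locations of disease cases along a 40 km transect:
# a dense cluster near 10 km and a sparser one near 25 km.
cases = np.concatenate([rng.normal(10, 1.0, 80), rng.normal(25, 3.0, 40)])

grid = np.linspace(0, 40, 400)
for bw in ("scott", "silverman", 0.2):   # rule-of-thumb selectors vs. a narrow manual choice
    density = gaussian_kde(cases, bw_method=bw)(grid)
    peak = grid[np.argmax(density)]
    print(f"bandwidth={bw!s:<10} peak={peak:5.1f} km  max density={density.max():.4f}")
```

Narrower bandwidths sharpen apparent hotspots (higher resolution) but make estimates in sparsely populated areas statistically unreliable, which is the balance the bandwidth selectors discussed in the paper are meant to guide.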

Ke-hai Yuan - One of the best experts on this subject based on the ideXlab platform.

  • Fit indices for mean structures with growth curve models.
    Psychological methods, 2018
    Co-Authors: Ke-hai Yuan, Zhiyong Zhang, Lifang Deng
    Abstract:

    Motivated by the need to effectively evaluate the quality of the mean structure in growth curve modeling (GCM), this article proposes to separately evaluate the goodness of fit of the mean structure from that of the covariance structure. Several fit indices are defined, and rationales are discussed. Particular considerations are given for polynomial and piecewise polynomial models because fit indices for them are valid regardless of the Underlying Population Distribution of the data. Examples indicate that the newly defined fit indices remove the confounding issues with indices jointly evaluating mean and covariance structure models and provide much more reliable evaluation of the mean structure in GCM. Examples also show that pseudo R-squares and concordance correlations are unable to reflect the goodness of mean structures in GCM. Proper use of the fit indices for the purpose of model diagnostics is discussed. A web-based program, WebSEM, is also introduced for easily computing these fit indices by applied researchers.
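A rough sense of what "evaluating the mean structure separately" means can be conveyed with a small sketch. The index computed below, a root-mean-square residual between observed and model-implied occasion means under a linear growth curve, is an illustrative stand-in rather than one of the fit indices defined in the article, and the data are simulated.

```python
# Illustrative sketch of evaluating a mean structure separately from the
# covariance structure: fit a linear growth curve to the occasion means and
# report a simple root-mean-square residual for the means. This index and
# the simulated data are assumptions, not the article's definitions.
import numpy as np

rng = np.random.default_rng(1)
n = 300
occasions = np.arange(5)                               # 5 equally spaced waves

# Simulate quadratic growth so a linear mean structure is misspecified.
true_means = 2.0 + 1.5 * occasions + 0.4 * occasions**2
y = true_means + rng.normal(0.0, 1.0, size=(n, occasions.size))
obs_means = y.mean(axis=0)

# Model-implied means under a linear growth curve: least-squares fit of
# intercept + slope * time to the observed occasion means.
X = np.column_stack([np.ones_like(occasions, dtype=float), occasions])
beta, *_ = np.linalg.lstsq(X, obs_means, rcond=None)
implied_means = X @ beta

rmr_mean = np.sqrt(np.mean((obs_means - implied_means) ** 2))
print("observed means:    ", np.round(obs_means, 2))
print("implied means:     ", np.round(implied_means, 2))
print("mean-structure RMR:", round(rmr_mean, 3))
```

Because the simulated growth is quadratic, the linear mean structure leaves a clear residual in the occasion means even though a jointly evaluated model might still look acceptable once the covariance structure fits well; that confounding is what separately defined mean-structure indices are meant to remove.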

  • Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances
    Psychometrika, 2018
    Co-Authors: Ke-hai Yuan, Mortaza Jamshidian, Yutaka Kano
    Abstract:

    Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the Population counterparts of the sample means and covariances of a given pattern of the observed data depend on the Underlying structure that generates the data, and the normal-Distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the Underlying Population Distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the Population Distribution is multivariate normal.
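The quantities being compared in such homogeneity tests can be made concrete with a short sketch. The setup below is an illustration of pattern-wise statistics in the spirit of Little's (1988) test, not the article's derivation: it simulates two variables, makes one of them missing at random (MAR) as a function of the other, groups rows by missingness pattern, and prints the means of the observed variables within each pattern.

```python
# Illustrative sketch of the pattern-wise statistics behind homogeneity-of-
# means tests of MCAR (e.g., Little, 1988): group rows by missingness
# pattern and compare the means of the observed variables across patterns.
# The data-generating choices below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(7)
n = 2_000
x1 = rng.normal(0.0, 1.0, n)
x2 = 0.8 * x1 + rng.normal(0.0, 0.6, n)

# MAR mechanism: x2 is more likely to be missing when x1 is large.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x1))
x2_observed = np.where(rng.random(n) < p_missing, np.nan, x2)

data = np.column_stack([x1, x2_observed])
observed = ~np.isnan(data)                       # True where a value is observed

for pattern in np.unique(observed, axis=0):
    rows = (observed == pattern).all(axis=1)     # cases sharing this pattern
    cols = np.where(pattern)[0]                  # variables observed in it
    means = data[np.ix_(rows, cols)].mean(axis=0)
    print(f"pattern {pattern.astype(int)}  n={rows.sum():4d}  "
          f"means of observed variables: {np.round(means, 3)}")
```

In this MAR setup the x1 means differ across the two patterns, so a homogeneity test would typically reject MCAR; the article's caution runs in the other direction: these pattern-wise means and covariances can also coincide under MAR or MNAR mechanisms, so observed homogeneity is not safe evidence that data are MCAR.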

Cees Van Der Eijk - One of the best experts on this subject based on the ideXlab platform.

  • Risky Business: Factor Analysis of Survey Data – Assessing the Probability of Incorrect Dimensionalisation
    PLOS ONE, 2015
    Co-Authors: Cees Van Der Eijk, Jonathan Rose
    Abstract:

    This paper undertakes a systematic assessment of the extent to which factor analysis identifies the correct number of latent dimensions (factors) when applied to ordered-categorical survey items (so-called Likert items). We simulate 2400 data sets of uni-dimensional Likert items that vary systematically over a range of conditions such as the Underlying Population Distribution, the number of items, the level of random error, and characteristics of items and item-sets. Each of these datasets is factor analysed in a variety of ways that are frequently used in the extant literature, or that are recommended in current methodological texts. These include exploratory factor retention heuristics such as Kaiser’s criterion, Parallel Analysis and a non-graphical scree test, and (for exploratory and confirmatory analyses) evaluations of model fit. These analyses are conducted on the basis of Pearson and polychoric correlations. We find that, irrespective of the particular mode of analysis, factor analysis applied to ordered-categorical survey data very often leads to over-dimensionalisation. The magnitude of this risk depends on the specific way in which factor analysis is conducted, the number of items, the properties of the set of items, and the Underlying Population Distribution. The paper concludes with a discussion of the consequences of over-dimensionalisation, and a brief mention of alternative modes of analysis that are much less prone to such problems.
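Of the retention heuristics listed in the abstract, Parallel Analysis is the easiest to sketch from scratch. The snippet below is a bare-bones version of Horn's procedure under assumed settings (it compares observed eigenvalues against the mean eigenvalues from random data of the same dimensions, using plain Pearson correlations); it is not the authors' analysis code.

```python
# Bare-bones Parallel Analysis (Horn, 1965) under illustrative settings:
# retain factors whose observed eigenvalues exceed the mean eigenvalues of
# random data with the same number of rows, columns, and categories.
import numpy as np

rng = np.random.default_rng(3)
n_respondents, n_items, n_reference_sets = 400, 8, 200

def eigenvalues_of(data):
    """Descending eigenvalues of the Pearson correlation matrix."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

# Random 5-point responses with no common factor, so few factors should survive.
observed = eigenvalues_of(rng.integers(1, 6, size=(n_respondents, n_items)))

reference = np.mean(
    [eigenvalues_of(rng.integers(1, 6, size=(n_respondents, n_items)))
     for _ in range(n_reference_sets)],
    axis=0,
)

exceeds = observed > reference
retained = int(exceeds.argmin()) if not exceeds.all() else n_items  # leading run only
print("observed eigenvalues: ", np.round(observed, 2))
print("reference eigenvalues:", np.round(reference, 2))
print("factors retained by Parallel Analysis:", retained)
```

Some implementations compare against the 95th percentile of the reference eigenvalues rather than the mean, and the paper also examines polychoric rather than Pearson correlations; both variations are omitted from this sketch.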
