Fit Statistic

The experts below are selected from a list of 12,216 experts worldwide ranked by the ideXlab platform.

Clement A. Stone - One of the best experts on this subject based on the ideXlab platform.

  • Assessing Goodness of Fit of Item Response Theory Models: A Comparison of Traditional and Alternative Procedures
    Journal of Educational Measurement, 2003
    Co-Authors: Clement A. Stone, Bo Zhang
    Abstract:

    Testing the goodness of fit of item response theory (IRT) models is relevant to validating IRT models, and new procedures have been proposed. These alternatives compare observed and expected response frequencies conditional on observed total scores, and use posterior probabilities for responses across θ levels rather than cross-classifying examinees using point estimates of θ and score responses. This research compared these alternatives with regard to their methods, properties (Type I error rates and empirical power), available research, and practical issues (computational demands, treatment of missing data, effects of sample size and sparse data, and available computer programs). Different advantages and disadvantages related to these characteristics are discussed. A simulation study provided additional information about empirical power and Type I error rates.

    Item response theory (IRT) involves a class of mathematical models that are used to predict examinee performance using item and person characteristics. The properties of these models offer many well-known advantages in testing applications. However, the extent to which these properties are attained depends on the degree to which the IRT model itself is appropriate. To validate the use of an IRT model, goodness of fit for a model, or correspondence between model predictions and observed data, is often examined (see Hambleton & Swaminathan, 1985, for a discussion of other useful validation studies). When a model is not appropriate or does not fit the data, use of the estimated parameters may be compromised. Several alternative approaches to assessing IRT model-data fit have emerged in response to using traditional methods in different testing applications. Orlando and Thissen (2000) described a fit statistic that does not use ability estimates but provides a comparison of observed and expected frequencies for score responses across total score levels. Stone, Mislevy, and Mazzeo (1994) argued that uncertainty in ability estimation was responsible for deviations in the approximation of the goodness-of-fit statistics to the null distribution, and discussed a fit statistic based on posterior expectations that addressed the issue. A scaling correction for the chi-squared fit statistic based on resampling methods was subsequently described (Stone, 2000a). Donoghue and Hombo (1999, 2001a) also used the fit statistic based on posterior probabilities to evaluate goodness of fit for National Assessment of Educational Progress (NAEP) items, but derived a distribution for the fit statistic (labeled QDH) that could be used for hypothesis testing.
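As an illustration of the comparison these alternatives make, the sketch below contrasts observed and model-expected proportions correct conditional on total score for a simulated 2PL test. It is a minimal sketch, not any author's implementation: the exact recursive computation of expected frequencies behind Orlando and Thissen's statistic is replaced here by a large model-based simulation, and all item parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, a, b):
    """2PL item response function: P(correct), examinees x items."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

# Hypothetical 10-item 2PL test (parameters invented for illustration)
a = rng.uniform(0.8, 2.0, 10)
b = rng.normal(0.0, 1.0, 10)

# "Observed" data simulated from the model itself, so fit should be good
theta = rng.normal(size=5000)
P = p_correct(theta, a, b)
X = (rng.random(P.shape) < P).astype(int)
total = X.sum(axis=1)

# Expected proportion correct on item 0 conditional on total score,
# approximated by a large model-based simulation (a stand-in for the
# exact recursive computation used by the published statistic)
theta_sim = rng.normal(size=200_000)
Ps = p_correct(theta_sim, a, b)
Xs = (rng.random(Ps.shape) < Ps).astype(int)
ts = Xs.sum(axis=1)

chi2 = 0.0
for s in range(1, 10):                 # interior total-score groups only
    n = int((total == s).sum())
    if n == 0:
        continue
    obs = X[total == s, 0].mean()      # observed proportion correct, item 0
    exp = Xs[ts == s, 0].mean()        # model-expected proportion correct
    chi2 += n * (obs - exp) ** 2 / (exp * (1 - exp))

print(round(chi2, 2))
```

Because the responses were generated from the same model being tested, the resulting statistic should be unremarkable; a gross model violation would inflate it.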

  • Empirical Power and Type I Error Rates for an IRT Fit Statistic that Considers the Precision of Ability Estimates
    Educational and Psychological Measurement, 2003
    Co-Authors: Clement A. Stone
    Abstract:

    Model-data fit of item response theory (IRT) models is generally assessed by comparing observed performance by examinees on individual items with performance that is predicted under the chosen IRT model. However, use of traditional chi-square methods to evaluate goodness of fit of IRT models is not appropriate when the underlying trait/ability is estimated imprecisely (e.g., on shorter assessments). This article describes a goodness-of-fit statistic that directly considers the uncertainty with which ability is estimated, as well as a resampling-based hypothesis testing procedure. A simulation study was conducted to evaluate the empirical power and Type I error rates of the proposed procedure. Results of the study indicated that the procedure should be useful for evaluating goodness of fit of IRT models in most testing applications where uncertainty in ability estimation is an issue.
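A resampling-based hypothesis test of the kind described can be sketched as a parametric bootstrap: compute a fit statistic on the data, resimulate responses under the model many times, and take the p-value from the simulated null distribution. The sketch below uses a simple Pearson-type statistic over ability deciles with *known* abilities, a deliberate simplification (the abstract's point is precisely that estimated abilities distort the usual null distribution); the item parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def irf(theta, a, b):
    """2PL item response function for a single item."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fit_stat(x, theta, a, b):
    """Pearson-type item fit statistic over ten ability groups (illustrative)."""
    p = irf(theta, a, b)
    cuts = np.quantile(theta, np.linspace(0.1, 0.9, 9))
    g = np.digitize(theta, cuts)            # decile group index 0..9
    stat = 0.0
    for k in range(10):
        m = g == k
        n = int(m.sum())
        obs, exp = x[m].mean(), p[m].mean()
        stat += n * (obs - exp) ** 2 / (exp * (1 - exp))
    return stat

# Hypothetical single item; abilities treated as known for simplicity
a_i, b_i = 1.2, 0.3
theta = rng.normal(size=2000)
x = (rng.random(2000) < irf(theta, a_i, b_i)).astype(int)
observed = fit_stat(x, theta, a_i, b_i)

# Monte Carlo null distribution: resimulate responses under the model
null = np.array([
    fit_stat((rng.random(2000) < irf(theta, a_i, b_i)).astype(int),
             theta, a_i, b_i)
    for _ in range(500)
])
p_value = (np.sum(null >= observed) + 1) / (null.size + 1)
print(round(p_value, 3))
```

Adding one to the numerator and denominator keeps the Monte Carlo p-value strictly positive, a standard convention for resampling tests.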

  • Monte Carlo Based Null Distribution for an Alternative Goodness‐of‐Fit Test Statistic in IRT Models
    Journal of Educational Measurement, 2000
    Co-Authors: Clement A. Stone
    Abstract:

    Assessing the correspondence between model predictions and observed data is a recommended procedure for justifying the application of an IRT model. However, with shorter tests, current goodness-of-fit procedures that assume precise point estimates of ability are inappropriate. The present paper describes a goodness-of-fit statistic that considers the imprecision with which ability is estimated and involves constructing item fit tables based on each examinee's posterior distribution of ability, given the likelihood of their response pattern and an assumed marginal ability distribution. However, the posterior expectations that are computed are dependent, and the distribution of the goodness-of-fit statistic is unknown. The present paper also describes a Monte Carlo resampling procedure that can be used to assess the significance of the fit statistic and compares this method with a previously used method. The results indicate that the method described herein is an effective and reasonably simple procedure for assessing the validity of applying IRT models when ability estimates are imprecise.
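The posterior distribution of ability described here can be computed by quadrature: evaluate the likelihood of an examinee's response pattern on a grid of θ values, weight by the assumed N(0, 1) marginal, and normalize. Below is a minimal sketch for a hypothetical 5-item 2PL test (all item parameters invented for illustration); the posterior-expected probability of a correct response is the building block of the fit tables the abstract describes.

```python
import numpy as np

# Quadrature grid for the assumed N(0, 1) marginal ability distribution
nodes = np.linspace(-4.0, 4.0, 61)
prior = np.exp(-0.5 * nodes**2)
prior /= prior.sum()

def irf(theta, a, b):
    """2PL item response functions: grid points x items."""
    return 1.0 / (1.0 + np.exp(-a * (np.asarray(theta)[..., None] - b)))

# Hypothetical short 5-item test -- short tests are exactly where point
# estimates of ability are least trustworthy
a = np.array([1.0, 1.4, 0.9, 1.2, 1.1])
b = np.array([-1.0, -0.3, 0.0, 0.5, 1.2])

def posterior(x):
    """Posterior over the grid given response pattern x and the N(0,1) prior."""
    P = irf(nodes, a, b)                          # shape (61, 5)
    like = np.prod(np.where(x == 1, P, 1 - P), axis=1)
    post = like * prior
    return post / post.sum()

x = np.array([1, 1, 0, 1, 0])                     # one response pattern
post = posterior(x)
eap = float(np.dot(post, nodes))                  # posterior mean (EAP) ability

# Posterior-expected probability of a correct response to item 2,
# accumulated over the grid rather than at a single point estimate
exp_correct_item2 = float(np.dot(post, irf(nodes, a, b)[:, 2]))
print(round(eap, 3), round(exp_correct_item2, 3))
```

Summing such posterior-weighted expectations over examinees, instead of classifying each examinee at a point estimate of θ, is what lets the fit tables reflect the imprecision of ability estimation.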

Aurea Grané - One of the best experts on this subject based on the ideXlab platform.

Philseok Lee - One of the best experts on this subject based on the ideXlab platform.

Josep Fortiana - One of the best experts on this subject based on the ideXlab platform.

David W Hosmer - One of the best experts on this subject based on the ideXlab platform.

  • A Smoothed Residual Based Goodness-of-Fit Statistic for Logistic Hierarchical Regression Models
    Computational Statistics & Data Analysis, 2007
    Co-Authors: Rodney X. Sturdivant, David W Hosmer
    Abstract:

    We develop a goodness-of-fit measure with desirable properties for use in the hierarchical logistic regression setting. The statistic is an unweighted sum of squares (USS) of the kernel-smoothed model residuals. We develop expressions for the moments of this statistic and create a standardized statistic with a hypothesized asymptotic standard normal distribution under the null hypothesis that the model is correctly specified. Extensive simulation studies demonstrate satisfactory adherence to Type I error rates of the kernel-smoothed USS statistic in a variety of likely data settings. Finally, we discuss issues of bandwidth selection for using our proposed statistic in practice and illustrate its use in an example.
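A minimal numerical sketch of the kernel-smoothed USS idea, assuming an ordinary (non-hierarchical) logistic model for brevity and an arbitrary Gaussian-kernel bandwidth; the paper's moment-based standardization is omitted, so this shows only how the raw statistic is formed.

```python
import numpy as np

rng = np.random.default_rng(3)

# Data simulated from a correctly specified logistic model
n = 400
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
y = (rng.random(n) < p_true).astype(float)

# Newton-Raphson fit of logistic regression (intercept + slope)
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)                       # score vector
    hess = X.T @ (X * (p * (1 - p))[:, None])  # observed information
    beta += np.linalg.solve(hess, grad)

p_hat = 1.0 / (1.0 + np.exp(-X @ beta))
resid = y - p_hat                              # raw model residuals

# Gaussian-kernel smoothing of residuals over the covariate; the
# bandwidth h is an arbitrary illustrative choice -- bandwidth
# selection is discussed in the paper itself
h = 0.5
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
K /= K.sum(axis=1, keepdims=True)              # row-normalized weights
smoothed = K @ resid

uss = float(smoothed @ smoothed)               # unweighted sum of squares
print(round(uss, 4))
```

Under a correct model the smoothed residuals hover near zero, so the USS stays small; systematic lack of fit leaves structure in the residuals that smoothing preserves, inflating the statistic.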