Small Sample Size

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 169,563 Experts worldwide ranked by the ideXlab platform

Dominique Lord - One of the best experts on this subject based on the ideXlab platform.

  • effects of low Sample mean values and Small Sample Size on the estimation of the fixed dispersion parameter of poisson gamma models for modeling motor vehicle crashes a bayesian perspective
    Safety Science, 2008
    Co-Authors: Dominique Lord, Luis F Mirandamoreno
    Abstract:

There has been considerable research conducted on the development of statistical models for predicting motor vehicle crashes on highway facilities. Over the last few years, there has been a significant increase in the application of hierarchical Bayes methods for modeling motor vehicle crash data. Whether the inferences are estimated using classical or Bayesian methods, the most common probabilistic structure used for modeling this type of data remains the traditional Poisson-gamma (or Negative Binomial) model. Crash data collected for highway safety studies often have the unusual attributes of being characterized by low Sample mean values and, due to the prohibitive costs of collecting data, Small Sample Sizes. Previous studies have shown that the dispersion parameter of Poisson-gamma models can be seriously mis-estimated when the models are estimated using the maximum likelihood estimation (MLE) method under these extreme conditions. Despite important work done on this topic for the MLE, nobody has so far examined how low Sample mean values and Small Sample Sizes affect the posterior mean of the dispersion parameter of Poisson-gamma models estimated using the hierarchical Bayes method. The inverse dispersion parameter plays an important role in various types of highway safety studies. It is therefore vital to determine the conditions in which the inverse dispersion parameter may be mis-estimated for this category of models. To accomplish the objectives of this study, a simulation framework is developed to generate data from Poisson-gamma distributions using different values describing the mean, the dispersion parameter, the Sample Size, and the prior specification. Vague and non-vague prior specifications are tested to determine the magnitude of the biases introduced by low Sample mean values and Small Sample Sizes. A series of datasets is also simulated from Poisson-lognormal distributions, in light of recent work done by statisticians on this mixed distribution. The study shows that a dataset characterized by a low Sample mean combined with a Small Sample Size can seriously affect the estimation of the posterior mean of the dispersion parameter when a vague prior specification is used to characterize the gamma hyper-parameter. The risk of a mis-estimated posterior mean can be greatly minimized when an appropriate non-vague prior distribution is used. Finally, the study shows that Poisson-lognormal models are recommended over Poisson-gamma models when vague priors are assumed and whenever crash data characterized by low Sample mean values are used for developing crash prediction models.
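The low-mean, Small-Sample-Size instability discussed in this abstract can be reproduced with a short simulation. The sketch below (Python with `numpy`; function names are my own, not from the paper) draws counts from a Poisson-gamma model with mean `mu` and inverse dispersion `phi`, then applies a simple method-of-moments estimator of the dispersion parameter alpha = 1/phi. This is a deliberately simplified stand-in for the paper's hierarchical Bayes estimation, intended only to show how unstable the estimate becomes when the mean and Sample Size are small.

```python
import numpy as np

def simulate_poisson_gamma(mu, phi, n, rng):
    """Draw n counts from a Poisson-gamma (NB2) model:
    lambda_i ~ Gamma(shape=phi, scale=mu/phi), y_i ~ Poisson(lambda_i),
    so E[y] = mu and Var[y] = mu + mu**2 / phi."""
    lam = rng.gamma(shape=phi, scale=mu / phi, size=n)
    return rng.poisson(lam)

def mom_dispersion(y):
    """Method-of-moments estimate of the dispersion alpha = 1/phi:
    Var = mu + alpha * mu**2  =>  alpha_hat = (s2 - ybar) / ybar**2.
    Returns NaN when the Sample looks under-dispersed (s2 <= ybar),
    which happens often for low means and Small Sample Sizes."""
    ybar, s2 = y.mean(), y.var(ddof=1)
    if ybar == 0 or s2 <= ybar:
        return float("nan")
    return (s2 - ybar) / ybar**2

rng = np.random.default_rng(42)
true_alpha = 0.5  # phi = 2
for mu, n in [(0.5, 50), (10.0, 1000)]:
    est = np.array([mom_dispersion(simulate_poisson_gamma(mu, 2.0, n, rng))
                    for _ in range(500)])
    usable = est[~np.isnan(est)]
    print(f"mu={mu:4.1f} n={n:4d}: usable fraction={len(usable)/len(est):.2f}, "
          f"mean alpha_hat={usable.mean():.3f} (true alpha={true_alpha})")
```

Run as-is, the loop contrasts the two regimes: the low-mean, small-n row typically discards a share of Samples outright and scatters widely around the true alpha, while the healthy-mean, large-n row settles close to it.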

  • modeling motor vehicle crashes using poisson gamma models examining the effects of low Sample mean values and Small Sample Size on the estimation of the fixed dispersion parameter
    Accident Analysis & Prevention, 2006
    Co-Authors: Dominique Lord
    Abstract:

There has been considerable research conducted on the development of statistical models for predicting crashes on highway facilities. Despite numerous advancements made for improving the estimation tools of statistical models, the most common probabilistic structure used for modeling motor vehicle crashes remains the traditional Poisson and Poisson-gamma (or Negative Binomial) distribution; when crash data exhibit over-dispersion, the Poisson-gamma model is usually the model most favored by transportation safety modelers. Crash data collected for safety studies often have the unusual attribute of being characterized by low Sample mean values. Studies have shown that the goodness-of-fit of statistical models produced from such datasets can be significantly affected. This issue has been defined as the "low mean problem" (LMP). Despite recent developments on methods to circumvent the LMP and test the goodness-of-fit of models developed using such datasets, no work has so far examined how the LMP affects the fixed dispersion parameter of Poisson-gamma models used for modeling motor vehicle crashes. The dispersion parameter plays an important role in many types of safety studies and should, therefore, be reliably estimated. The primary objective of this research project was to verify whether the LMP affects the estimation of the dispersion parameter and, if so, to determine the magnitude of the problem. The secondary objective consisted of determining the effects of an unreliably estimated dispersion parameter on common analyses performed in highway safety studies. To accomplish the objectives of the study, a series of Poisson-gamma distributions were simulated using different values describing the mean, the dispersion parameter, and the Sample Size. Three estimators commonly used by transportation safety modelers for estimating the dispersion parameter of Poisson-gamma models were evaluated: the method of moments, weighted regression, and the maximum likelihood method. To complement the outcome of the simulation study, Poisson-gamma models were fitted to crash data collected in Toronto, Ont., characterized by a low Sample mean and Small Sample Size. The study shows that a low Sample mean combined with a Small Sample Size can seriously affect the estimation of the dispersion parameter, no matter which estimator is used within the estimation process. The probability that the dispersion parameter is unreliably estimated increases significantly as the Sample mean and Sample Size decrease. Consequently, the results show that an unreliably estimated dispersion parameter can significantly undermine empirical Bayes (EB) estimates as well as the estimation of confidence intervals for the gamma mean and predicted response. The paper ends with recommendations for minimizing the likelihood of producing Poisson-gamma models with an unreliable dispersion parameter for modeling motor vehicle crashes.
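Of the three estimators named in this abstract, the maximum likelihood method can be sketched directly from the Poisson-gamma (NB2) likelihood. The hypothetical helper below (Python with `numpy`; not the paper's code) profiles the likelihood over a grid of `phi` values with the mean fixed at the Sample mean; an estimate pinned at the top of the grid is the kind of unreliable boundary solution the abstract warns about for low-mean, Small-Sample data.

```python
import math
import numpy as np

def nb_loglik(phi, y, mu):
    """Poisson-gamma (NB2) log-likelihood with the mean held at mu and
    inverse dispersion (gamma shape) parameter phi:
    P(y) = G(y+phi)/(G(phi) y!) * (phi/(phi+mu))**phi * (mu/(phi+mu))**y."""
    terms = sum(math.lgamma(yi + phi) - math.lgamma(phi) - math.lgamma(yi + 1)
                for yi in y)
    return (terms + len(y) * phi * math.log(phi / (phi + mu))
            + sum(y) * math.log(mu / (phi + mu)))

def mle_phi(y, grid=None):
    """Profile-likelihood grid search for phi with mu fixed at the Sample
    mean. A phi_hat stuck at the largest grid value signals a near-
    equidispersion Sample where the MLE effectively diverges."""
    y = [int(v) for v in y]
    mu = np.mean(y)
    if mu <= 0:
        return float("nan")  # likelihood undefined for an all-zero Sample
    if grid is None:
        grid = np.geomspace(0.01, 1000.0, 400)
    ll = [nb_loglik(p, y, mu) for p in grid]
    return float(grid[int(np.argmax(ll))])
```

With a large, well-behaved Sample this recovers phi reasonably well; on a handful of low-mean counts the maximizer frequently lands on the grid boundary, i.e., the dispersion parameter is not identifiable from the data.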

A N Venetsanopoulos - One of the best experts on this subject based on the ideXlab platform.

  • kernel quadratic discriminant analysis for Small Sample Size problem
    Pattern Recognition, 2008
    Co-Authors: Jie Wang, Konstantinos N Plataniotis, Juwei Lu, A N Venetsanopoulos
    Abstract:

It is generally believed that quadratic discriminant analysis (QDA) can fit the data in practical pattern recognition applications better than the linear discriminant analysis (LDA) method. This is because QDA relaxes the assumption, made by LDA-based methods, that the covariance matrix of each class is identical. However, it still assumes that the class-conditional distribution is Gaussian, which is usually not the case in many real-world applications. In this paper, a novel kernel-based QDA method is proposed to further relax the Gaussian assumption by using the kernel machine technique. The proposed method solves the complex pattern recognition problem by combining the QDA solution with the kernel machine technique and, at the same time, tackles the so-called Small Sample Size problem through a regularized estimation of the covariance matrix. Extensive experimental results indicate that the proposed method outperforms many traditional kernel-based learning algorithms.
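Only the regularized-covariance half of such a method is compact enough to sketch here; the kernel mapping itself is omitted. The snippet below (Python with `numpy`; names are hypothetical, and this is plain regularized QDA, not the authors' kernel QDA) shrinks each class covariance toward a scaled identity so it stays invertible even when a class has fewer Samples than dimensions.

```python
import numpy as np

def regularized_qda_fit(X, y, gamma=0.1):
    """Per-class Gaussian fit with a shrunk covariance:
    S_k <- (1 - gamma) * S_k + gamma * (tr(S_k)/d) * I,
    which keeps S_k invertible when n_k < d (the SSS problem)."""
    d = X.shape[1]
    model = {}
    for k in np.unique(y):
        Xk = X[y == k]
        mu = Xk.mean(axis=0)
        S = np.atleast_2d(np.cov(Xk, rowvar=False, bias=True))
        S_reg = (1 - gamma) * S + gamma * (np.trace(S) / d) * np.eye(d)
        model[k] = (mu, np.linalg.inv(S_reg),
                    np.linalg.slogdet(S_reg)[1], np.log(len(Xk) / len(X)))
    return model

def regularized_qda_predict(model, X):
    """Assign each row to the class maximizing the quadratic
    discriminant score -0.5*(mahalanobis + logdet) + log prior."""
    keys = sorted(model.keys())
    scores = []
    for k in keys:
        mu, Sinv, logdet, logprior = model[k]
        diff = X - mu
        maha = np.einsum('ij,jk,ik->i', diff, Sinv, diff)
        scores.append(-0.5 * (maha + logdet) + logprior)
    return np.array([keys[i] for i in np.argmax(np.vstack(scores), axis=0)])
```

The shrinkage weight `gamma` trades bias against invertibility: `gamma=0` is the ordinary (possibly singular) sample covariance, `gamma=1` a spherical one.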

  • regularization studies of linear discriminant analysis in Small Sample Size scenarios with application to face recognition
    Pattern Recognition Letters, 2005
    Co-Authors: Juwei Lu, Konstantinos N Plataniotis, A N Venetsanopoulos
    Abstract:

It is well known that the applicability of linear discriminant analysis (LDA) to high-dimensional pattern classification tasks such as face recognition often suffers from the so-called "Small Sample Size" (SSS) problem, arising from the Small number of available training Samples compared to the dimensionality of the Sample space. In this paper, we propose a new LDA method that attempts to address the SSS problem using a regularized Fisher's separability criterion. In addition, a scheme for expanding the representational capacity of the face database is introduced to overcome the limitation that LDA-based algorithms require at least two Samples per class for learning. Extensive experiments performed on the FERET database indicate that the proposed methodology outperforms traditional methods such as Eigenfaces and some recently introduced LDA variants in a number of SSS scenarios.
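The core of a regularized Fisher criterion can be sketched for the two-class case. The helper below (Python with `numpy`; an illustrative sketch under my own naming, not the paper's algorithm) adds a small ridge to the within-class scatter matrix before inverting it, the standard device for keeping the Fisher direction well defined when Samples are scarcer than dimensions.

```python
import numpy as np

def regularized_fisher_direction(X0, X1, eta=1e-3):
    """Two-class Fisher discriminant with regularized within-class
    scatter: w = (S_w + eta * I)^{-1} (mu1 - mu0). Without the eta * I
    ridge, S_w is singular whenever n0 + n1 - 2 < d (SSS conditions)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter = sum of per-class (unnormalized) scatters.
    Sw = ((X0 - mu0).T @ (X0 - mu0)) + ((X1 - mu1).T @ (X1 - mu1))
    d = X0.shape[1]
    w = np.linalg.solve(Sw + eta * np.eye(d), mu1 - mu0)
    return w / np.linalg.norm(w)
```

Projecting Samples onto `w` and thresholding at the midpoint of the projected class means gives a minimal SSS-tolerant classifier.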

  • regularized discriminant analysis for the Small Sample Size problem in face recognition
    Pattern Recognition Letters, 2003
    Co-Authors: Konstantinos N Plataniotis, A N Venetsanopoulos
    Abstract:

It is well known that the applicability of both linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) to high-dimensional pattern classification tasks such as face recognition (FR) often suffers from the so-called "Small Sample Size" (SSS) problem, arising from the Small number of available training Samples compared to the dimensionality of the Sample space. In this paper, we propose a new QDA-like method that effectively addresses the SSS problem using a regularization technique. Extensive experimentation performed on the FERET database indicates that the proposed methodology outperforms traditional methods such as Eigenfaces, direct QDA, and direct LDA in a number of SSS scenarios.

Woochul Lim - One of the best experts on this subject based on the ideXlab platform.

  • Reliability analysis using bootstrap information criterion for Small Sample Size response functions
    Structural and Multidisciplinary Optimization, 2020
    Co-Authors: Eshan Amalnerkar, Tae Hee Lee, Woochul Lim
    Abstract:

Statistical model selection and evaluation methods such as the Akaike information criterion (AIC) and Monte Carlo simulation (MCS) have often proven efficient for reliability analysis with large Sample Sizes. An information criterion can provide better model selection and evaluation in Small Sample Size settings when combined with the well-known technique of bootstrap resampling. Our purpose is to exploit bootstrap resampling within an information criterion to assess the uncertainty arising from model selection, as well as from the statistics of interest, in Small Sample Size reliability analysis. In this study, therefore, a unique and efficient simulation scheme is proposed that combines the best model selection, devised from efficient bootstrap simulation or a variance-reduced bootstrap information criterion, with reliability analysis. It is beneficial to compute a spread of reliability values, rather than solitary fixed values, together with the desired statistics of interest for uncertainty analysis. The proposed simulation scheme is verified on a number of Sample-Size-focused response functions under a repetition-centred approach, with AIC-based reliability analysis for comparison and MCS for accuracy. The results show that the proposed simulation scheme reduces the spread, and hence the uncertainty, of the statistics of interest in Sample-Size-based reliability analysis when compared with conventional methods.
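The bootstrap-selection-frequency idea behind such a scheme can be illustrated on a toy model-selection problem. The sketch below (Python with `numpy`; the two candidate distributions and all names are my own choices, not the paper's response functions) refits two closed-form candidates on bootstrap resamples and records how often each wins AIC, which is the kind of spread-of-selections quantity a bootstrap information criterion builds on.

```python
import numpy as np

def aic_normal(y):
    """AIC of a normal fit (closed-form MLE: mean and variance; k = 2)."""
    n, s2 = len(y), np.var(y)
    if s2 <= 0:
        return float("inf")  # degenerate resample: all values identical
    ll = -0.5 * n * (np.log(2 * np.pi * s2) + 1)
    return 2 * 2 - 2 * ll

def aic_lognormal(y):
    """AIC of a lognormal fit: a normal fit on log(y) plus the Jacobian
    term -sum(log y); requires strictly positive data."""
    logy = np.log(y)
    n, s2 = len(y), np.var(logy)
    if s2 <= 0:
        return float("inf")
    ll = -0.5 * n * (np.log(2 * np.pi * s2) + 1) - logy.sum()
    return 2 * 2 - 2 * ll

def bootstrap_selection_freq(y, B=500, rng=None):
    """Refit both candidates on B bootstrap resamples and count how often
    each wins AIC; the selection frequencies expose the model-selection
    uncertainty a single AIC comparison on a Small Sample hides."""
    rng = rng or np.random.default_rng()
    wins = {"normal": 0, "lognormal": 0}
    for _ in range(B):
        yb = rng.choice(y, size=len(y), replace=True)
        winner = "normal" if aic_normal(yb) <= aic_lognormal(yb) else "lognormal"
        wins[winner] += 1
    return {k: v / B for k, v in wins.items()}

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=15)  # a Small positive Sample
print(bootstrap_selection_freq(y, B=300, rng=rng))
```

A selection frequency near 50/50 warns that a reliability result conditioned on the single AIC-best model understates the uncertainty; a frequency near 100/0 supports trusting that model.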

  • bootstrap guided information criterion for reliability analysis using Small Sample Size information
    World Congress of Structural and Multidisciplinary Optimisation, 2017
    Co-Authors: Eshan Amalnerkar, Tae Hee Lee, Woochul Lim
    Abstract:

Several methods for reliability analysis have been established and applied in engineering fields, with uncertainty as a major contributing factor. Small Sample Size reliability analysis can be very beneficial when the rising uncertainty in statistics of interest, such as the mean and standard deviation, is considered. Model selection and evaluation methods such as the Akaike Information Criterion (AIC) have demonstrated efficient output for reliability analysis. However, an information criterion based on maximum likelihood can provide better model selection and evaluation in Small Sample Size scenarios by incorporating the well-known technique of bootstrap resampling to curtail uncertainty. Our purpose is to utilize bootstrap resampling in information-criterion-based reliability analysis to assess the uncertainty arising from statistics of interest in Small Sample Size problems. In this study, therefore, a unique and efficient simulation scheme is proposed that combines the best-model selection frequency, devised from the information criterion, with reliability analysis. It is also beneficial to compute a spread of reliability values, rather than solitary fixed values, with the desired statistics of interest under a replication-based approach. The proposed simulation scheme is verified on a number of Small and moderate Sample Size mathematical examples, with AIC-based reliability analysis for comparison and Monte Carlo simulation (MCS) for accuracy. The results show that the proposed scheme reduces the spread, and hence the uncertainty, in Small Sample Size reliability analysis compared with conventional methods, whereas moderate Sample Size reliability analysis showed no considerable benefit.

Lifen Chen - One of the best experts on this subject based on the ideXlab platform.

  • a new lda based face recognition system which can solve the Small Sample Size problem
    Pattern Recognition, 2000
Co-Authors: Lifen Chen, Hongyuan Mark Liao, Mingtat Ko, Jachen Lin, Gwojong Yu
    Abstract:

A new LDA-based face recognition system is presented in this paper. Linear discriminant analysis (LDA) is one of the most popular linear projection techniques for feature extraction. The major drawback of applying LDA is that it may encounter the Small Sample Size problem. In this paper, we propose a new LDA-based technique which can solve the Small Sample Size problem. We also prove that the most expressive vectors derived in the null space of the within-class scatter matrix using principal component analysis (PCA) are equal to the optimal discriminant vectors derived in the original space using LDA. The experimental results show that the new LDA process improves the performance of a face recognition system significantly.
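The null-space construction this abstract refers to can be sketched directly. The helper below (Python with `numpy`; an illustrative reconstruction under my own naming, not the authors' code) extracts a basis for the null space of the within-class scatter S_w and then keeps the directions in that subspace with the largest between-class scatter. On the training data, projections of each class collapse onto a single point, which is what makes the criterion attractive under the Small Sample Size problem.

```python
import numpy as np

def null_space_lda(X, y, tol=1e-8):
    """Discriminant directions from the null space of the within-class
    scatter S_w: project the between-class scatter S_b onto null(S_w),
    then take its leading eigenvectors there (where the Fisher ratio
    is effectively unbounded because within-class scatter vanishes)."""
    classes = np.unique(y)
    d = X.shape[1]
    mu = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for k in classes:
        Xk = X[y == k]
        mk = Xk.mean(axis=0)
        Sw += (Xk - mk).T @ (Xk - mk)
        Sb += len(Xk) * np.outer(mk - mu, mk - mu)
    # Null-space basis of S_w via eigendecomposition (S_w symmetric PSD).
    vals, vecs = np.linalg.eigh(Sw)
    N = vecs[:, vals < tol * max(vals.max(), 1.0)]
    # PCA of the between-class scatter restricted to that null space.
    vals_b, vecs_b = np.linalg.eigh(N.T @ Sb @ N)
    order = np.argsort(vals_b)[::-1]
    return N @ vecs_b[:, order[: len(classes) - 1]]
```

The null space is non-trivial exactly in the SSS regime, i.e., when the total number of Samples minus the number of classes is smaller than the dimensionality.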

Konstantinos N Plataniotis - One of the best experts on this subject based on the ideXlab platform.

  • kernel quadratic discriminant analysis for Small Sample Size problem
    Pattern Recognition, 2008
    Co-Authors: Jie Wang, Konstantinos N Plataniotis, Juwei Lu, A N Venetsanopoulos
    Abstract:

It is generally believed that quadratic discriminant analysis (QDA) can fit the data in practical pattern recognition applications better than the linear discriminant analysis (LDA) method. This is because QDA relaxes the assumption, made by LDA-based methods, that the covariance matrix of each class is identical. However, it still assumes that the class-conditional distribution is Gaussian, which is usually not the case in many real-world applications. In this paper, a novel kernel-based QDA method is proposed to further relax the Gaussian assumption by using the kernel machine technique. The proposed method solves the complex pattern recognition problem by combining the QDA solution with the kernel machine technique and, at the same time, tackles the so-called Small Sample Size problem through a regularized estimation of the covariance matrix. Extensive experimental results indicate that the proposed method outperforms many traditional kernel-based learning algorithms.

  • regularization studies of linear discriminant analysis in Small Sample Size scenarios with application to face recognition
    Pattern Recognition Letters, 2005
    Co-Authors: Juwei Lu, Konstantinos N Plataniotis, A N Venetsanopoulos
    Abstract:

It is well known that the applicability of linear discriminant analysis (LDA) to high-dimensional pattern classification tasks such as face recognition often suffers from the so-called "Small Sample Size" (SSS) problem, arising from the Small number of available training Samples compared to the dimensionality of the Sample space. In this paper, we propose a new LDA method that attempts to address the SSS problem using a regularized Fisher's separability criterion. In addition, a scheme for expanding the representational capacity of the face database is introduced to overcome the limitation that LDA-based algorithms require at least two Samples per class for learning. Extensive experiments performed on the FERET database indicate that the proposed methodology outperforms traditional methods such as Eigenfaces and some recently introduced LDA variants in a number of SSS scenarios.

  • regularized discriminant analysis for the Small Sample Size problem in face recognition
    Pattern Recognition Letters, 2003
    Co-Authors: Konstantinos N Plataniotis, A N Venetsanopoulos
    Abstract:

It is well known that the applicability of both linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) to high-dimensional pattern classification tasks such as face recognition (FR) often suffers from the so-called "Small Sample Size" (SSS) problem, arising from the Small number of available training Samples compared to the dimensionality of the Sample space. In this paper, we propose a new QDA-like method that effectively addresses the SSS problem using a regularization technique. Extensive experimentation performed on the FERET database indicates that the proposed methodology outperforms traditional methods such as Eigenfaces, direct QDA, and direct LDA in a number of SSS scenarios.