Selection Operator

The Experts below are selected from a list of 27846 Experts worldwide ranked by ideXlab platform

Sijian Wang - One of the best experts on this subject based on the ideXlab platform.

  • Variable selection for multiply imputed data with application to dioxin exposure study
    Statistics in Medicine, 2013
    Co-Authors: Qixuan Chen, Sijian Wang
    Abstract:

    Multiple imputation (MI) is a commonly used technique for handling missing data in large-scale medical and public health studies. However, variable selection on multiply-imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed dataset separately, it may select different variables for different imputed datasets, which makes it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation-least absolute shrinkage and selection operator (MI-LASSO) variable selection method as an extension of the least absolute shrinkage and selection operator (LASSO) method to multiply-imputed data. The MI-LASSO method treats the estimated regression coefficients of the same variable across all imputed datasets as a group and applies the group LASSO penalty to yield a consistent variable selection across multiple-imputed datasets. We use a simulation study to demonstrate the advantage of the MI-LASSO method compared with the alternatives. We also apply the MI-LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors that are associated with human serum dioxin concentration in Midland, Michigan. Copyright © 2013 John Wiley & Sons, Ltd.
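
The grouping idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the solver below is a plain proximal-gradient loop swapped in for illustration, the function name `mi_group_lasso` is hypothetical, the data are simulated, and the "imputed" datasets are only noisy copies of one design matrix. The point it shows is that a group penalty on each variable's coefficients across datasets keeps or drops that variable in all datasets at once.

```python
import numpy as np

def mi_group_lasso(Xs, ys, lam, n_iter=500):
    """Proximal-gradient sketch of a group-lasso fit across imputed datasets.

    Xs, ys: lists of D design matrices (n x p) and responses (n,), one per
    imputed dataset, assumed centered so that no intercept is needed.
    Returns B with shape (D, p); column j holds variable j's coefficients
    in every imputed dataset and is shrunk toward zero as one group.
    """
    D = len(Xs)
    n, p = Xs[0].shape
    B = np.zeros((D, p))
    # Step size from the largest Lipschitz constant of the per-dataset losses.
    step = 1.0 / max(np.linalg.norm(X, 2) ** 2 / n for X in Xs)
    for _ in range(n_iter):
        # Gradient of (1/2n) * ||y_d - X_d b_d||^2 for each dataset d.
        G = np.stack([X.T @ (X @ b - y) / n for X, y, b in zip(Xs, ys, B)])
        Z = B - step * G
        # Group soft-thresholding: every column is scaled as a single unit.
        norms = np.linalg.norm(Z, axis=0)
        scale = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        B = Z * scale
    return B

# Toy demonstration on simulated data (all values here are hypothetical).
rng = np.random.default_rng(0)
n, p, D = 200, 6, 3
X_true = rng.normal(size=(n, p))
beta = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X_true @ beta + 0.1 * rng.normal(size=n)
y = y - y.mean()
Xs = []
for _ in range(D):
    X_imp = X_true + 0.05 * rng.normal(size=(n, p))  # stand-in for imputation noise
    Xs.append(X_imp - X_imp.mean(axis=0))
ys = [y] * D

B = mi_group_lasso(Xs, ys, lam=0.5)
selected = np.flatnonzero(np.linalg.norm(B, axis=0) > 0)
print("variables selected in every imputed dataset:", selected)
```

Because the threshold acts on whole columns of `B`, a variable cannot be selected in one imputed dataset and dropped in another, which is the consistency property the abstract describes.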

Jenhong Lan - One of the best experts on this subject based on the ideXlab platform.

  • Using multivariate regression model with least absolute shrinkage and selection operator (LASSO) to predict the incidence of xerostomia after intensity-modulated radiotherapy for head and neck cancer
    PLOS ONE, 2014
    Co-Authors: Tsairfwu Lee, Peiju Chao, Huimin Ting, Liyun Chang, Yujie Huang, Hungyu Wang, Mongfong Horng, Chunming Chang, Jenhong Lan
    Abstract:

    Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT.

    Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC.

    Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values.

    Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.
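
As a rough illustration of "LASSO with bootstrapping" for a logistic model, the sketch below refits an L1-penalized logistic regression on bootstrap resamples and records how often each predictor is selected. It is a minimal sketch under stated assumptions, not the study's procedure: the solver is a simple proximal-gradient loop without an intercept, the names `l1_logistic` and `bootstrap_selection_frequency` are hypothetical, the penalty value is arbitrary, and the data are simulated rather than clinical.

```python
import numpy as np

def l1_logistic(X, y, lam, n_iter=300):
    """L1-penalized logistic regression by proximal gradient (no intercept)."""
    n, p = X.shape
    b = np.zeros(p)
    # The mean logistic loss has Lipschitz constant ||X||_2^2 / (4n).
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ b)))
        z = b - step * (X.T @ (prob - y) / n)
        # Soft-thresholding sets small coefficients exactly to zero.
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b

def bootstrap_selection_frequency(X, y, lam, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each predictor is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        counts += l1_logistic(X[idx], y[idx], lam) != 0
    return counts / n_boot

# Simulated example: two informative predictors out of six (hypothetical data).
rng = np.random.default_rng(1)
n, p = 300, 6
X = rng.normal(size=(n, p))
beta = np.array([2.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ beta)))).astype(float)

freq = bootstrap_selection_frequency(X, y, lam=0.08)
print("selection frequency per predictor:", freq)
```

Ranking predictors by selection frequency gives one plausible way to choose a smaller, more stable set of prognostic factors; how the study itself moved from the optimal to the suboptimal set is described only at the level of the abstract.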

Qixuan Chen - One of the best experts on this subject based on the ideXlab platform.

  • Variable selection for multiply imputed data with application to dioxin exposure study
    Statistics in Medicine, 2013
    Co-Authors: Qixuan Chen, Sijian Wang
    Abstract:

    Multiple imputation (MI) is a commonly used technique for handling missing data in large-scale medical and public health studies. However, variable selection on multiply-imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed dataset separately, it may select different variables for different imputed datasets, which makes it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation-least absolute shrinkage and selection operator (MI-LASSO) variable selection method as an extension of the least absolute shrinkage and selection operator (LASSO) method to multiply-imputed data. The MI-LASSO method treats the estimated regression coefficients of the same variable across all imputed datasets as a group and applies the group LASSO penalty to yield a consistent variable selection across multiple-imputed datasets. We use a simulation study to demonstrate the advantage of the MI-LASSO method compared with the alternatives. We also apply the MI-LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors that are associated with human serum dioxin concentration in Midland, Michigan. Copyright © 2013 John Wiley & Sons, Ltd.

Huimin Ting - One of the best experts on this subject based on the ideXlab platform.

  • Using multivariate regression model with least absolute shrinkage and selection operator (LASSO) to predict the incidence of xerostomia after intensity-modulated radiotherapy for head and neck cancer
    PLOS ONE, 2014
    Co-Authors: Peiju Chao, Huimin Ting, Liyun Chang, Yujie Huang, Jiaming Wu, Hungyu Wang, Mongfong Horng, Chunming Chang, Yayu Huang
    Abstract:

    Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT.

    Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC.

    Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values.

    Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.

Heming Yao - One of the best experts on this subject based on the ideXlab platform.

  • A variable informative criterion based on weighted voting strategy combined with LASSO for variable selection in multivariate calibration
    Chemometrics and Intelligent Laboratory Systems, 2019
    Co-Authors: Ruoqiu Zhang, Feiyu Zhang, Wanchao Chen, Qin Xiong, Zengkai Chen, Heming Yao
    Abstract:

    High-throughput spectral data with a large number of variables (wavelengths) can make the predictions of a multivariate calibration model unreliable; in such cases, sparse statistical methods such as the least absolute shrinkage and selection operator (LASSO) are increasingly valued by researchers. In this study, a novel variable informative criterion based on a weighted voting strategy combined with the least absolute shrinkage and selection operator (WV-LASSO) is proposed. Monte Carlo sampling (MCS) is used to generate a large number of sub-models. In each Monte Carlo run, the regression coefficients and variable selection information of the LASSO model are recorded. A weighted voting strategy that combines the regression coefficient information with the selection frequency of each variable over all sub-models is then used to evaluate the importance of each variable. Unlike a specific selection method, a variable informative (importance) criterion is more general and flexible for algorithm design. An approach called the exponentially decreasing function (EDF) is then applied to build a variable selection method on top of WV-LASSO. The performance of this method was evaluated on three near-infrared (NIR) datasets. Compared with several efficient variable selection methods based on different informative criteria, including variable importance projection (VIP), Monte Carlo uninformative variable elimination (MC-UVE), randomization test (RT), competitive adaptive reweighted sampling (CARS), stability competitive adaptive reweighted sampling (SCARS), variable iterative space shrinkage approach (VISSA), interval variable iterative space shrinkage approach (iVISSA), LASSO coupled with sampling error profile analysis (SEPA-LASSO), and variable combination population analysis (VCPA), the variable selection method proposed in this paper shows better prediction and interpretation ability and has potential for constructing various variable selection methods by combining other selection strategies.
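
The abstract does not give the voting formula, so the following is only a sketch of how such a criterion might look. Assumptions, all hypothetical: importance is taken as selection frequency multiplied by the normalized mean absolute coefficient; the LASSO sub-models are fitted with a simple proximal-gradient solver; and the exponentially decreasing function uses the constants known from CARS, shrinking from all p variables to 2. Function names and data are illustrative only.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=300):
    """Plain LASSO by proximal gradient; a stand-in for any LASSO solver."""
    n, p = X.shape
    b = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y) / n)
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b

def weighted_vote_importance(X, y, lam, n_models=100, frac=0.8, seed=0):
    """Monte Carlo sub-models -> one importance score per variable."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    coefs = np.zeros((n_models, p))
    for m in range(n_models):
        # Monte Carlo sampling: a random subset of the calibration samples.
        idx = rng.choice(n, size=int(frac * n), replace=False)
        coefs[m] = lasso_ista(X[idx], y[idx], lam)
    frequency = (coefs != 0).mean(axis=0)   # how often each variable is picked
    weight = np.abs(coefs).mean(axis=0)     # how large its coefficient tends to be
    total = weight.sum()
    if total > 0:
        weight = weight / total
    return frequency * weight               # assumed form of the weighted vote

def edf_schedule(p, n_steps):
    """Variables kept at steps 1..n_steps, decaying from p to 2 (CARS-style EDF)."""
    k = np.log(p / 2.0) / (n_steps - 1)
    a = (p / 2.0) ** (1.0 / (n_steps - 1))
    i = np.arange(1, n_steps + 1)
    return np.maximum(2, np.round(p * a * np.exp(-k * i))).astype(int)

# Simulated example: three informative variables out of ten (hypothetical data).
rng = np.random.default_rng(2)
n, p = 120, 10
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.2 * rng.normal(size=n)
X = X - X.mean(axis=0)
y = y - y.mean()

importance = weighted_vote_importance(X, y, lam=0.1)
ranking = np.argsort(-importance)
schedule = edf_schedule(p, n_steps=5)
print("variables ranked by importance:", ranking)
print("variables kept per EDF step:", schedule)
```

In a full selection loop, each EDF step would keep the top-ranked variables according to `schedule` and recompute the importance on the reduced set; that outer loop is omitted here.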

  • A new strategy of least absolute shrinkage and selection operator coupled with sampling error profile analysis for wavelength selection
    Chemometrics and Intelligent Laboratory Systems, 2018
    Co-Authors: Ruoqiu Zhang, Feiyu Zhang, Wanchao Chen, Heming Yao
    Abstract:

    A new strategy based on sampling error profile analysis (SEPA) combined with the least absolute shrinkage and selection operator (SEPA-LASSO) is proposed. LASSO has been proven effective for multivariate calibration with automatic variable selection for high-dimensional data. However, in previous research, the critical step of multivariate calibration by LASSO was the optimization of the 1-norm tuning parameter for a fixed sample set, without considering how variable selection behaves on different subsets of samples. In the present work, Monte Carlo sampling (MCS), the core of the SEPA framework, is used to investigate various sub-models. Least angle regression (LAR) is used to solve LASSO, so that LAR iterations containing a given number of variables can be obtained instead of choosing numerical values of the 1-norm tuning parameter. The SEPA-LASSO algorithm consists of many loops. Under the SEPA framework and the LAR algorithm, a number of LASSO sub-models with the same dimension are built by MCS in each loop, and a vote rule is used to determine the importance of the variables and select them to build variable subsets. After the loops have run, several subsets of variables are obtained, and their error profile is used to choose the optimal subset. The performance of SEPA-LASSO was evaluated on three near-infrared (NIR) datasets. The results show that the model built by SEPA-LASSO has excellent predictability and interpretability compared with some commonly used multivariate calibration methods, such as principal component regression (PCR) and partial least squares (PLS), as well as some wavelength selection methods, including LASSO, moving window partial least squares regression (MWPLSR), Monte Carlo uninformative variable elimination (MC-UVE), ordered homogeneity pursuit lasso (OHPL), and stability competitive adaptive reweighted sampling (SCARS).
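
For readers unfamiliar with the "1-norm tuning parameter", the standard penalized form of LASSO may help; the notation below (X, y, β, λ, k) is introduced here for illustration and is not taken from the paper. Because the LASSO solution path computed by LAR is piecewise linear in λ, with variables entering or leaving only at breakpoints, a sub-model can be indexed by its number of nonzero coefficients k rather than by a numerical value of λ, which appears to be the idea the abstract describes.

```latex
\hat{\beta}(\lambda)
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^{2}
    + \lambda \,\lVert \beta \rVert_1 ,
\qquad
\hat{\beta}_{k} = \hat{\beta}(\lambda_{k})
  \ \text{ with } \
  \lVert \hat{\beta}_{k} \rVert_0 = k .
```

Here λ ≥ 0 is the 1-norm tuning parameter and λ_k denotes a point on the path at which exactly k coefficients are nonzero.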