Recursive Partitioning

The experts below are selected from a list of 360 experts worldwide ranked by the ideXlab platform.

John W Davies - One of the best experts on this subject based on the ideXlab platform.

  • Enrichment of high-throughput screening data with increasing levels of noise using support vector machines, recursive partitioning, and Laplacian-modified naive Bayesian classifiers
    Journal of Chemical Information and Modeling, 2006
    Co-Authors: Meir Glick, Jeremy L Jenkins, James H Nettles, Hamilton Hitchings, John W Davies
    Abstract:

    High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of the identification of novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore, the selection of the optimal data mining method is important for the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers with increasing stochastic noise in the form of false positives and false negatives. All three of the data mining methods at hand tolerated increasing levels of false positives even when the ratio of misclassified compounds to true active compounds was 5:1 in the training set. False negatives in the ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichments among the four data sets. This study demonstrates that data mining methods can add a true value to the screen even when the data is contaminated with a high level of stochastic noise.
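
The noise experiment described in this abstract can be pictured with a small sketch. The abstract does not say how the 5:1 and 1:1 ratios were computed, so the parameterisation below is an assumption: `fp_per_active` is the number of inactives relabelled as active per true active, and `fn_fraction` is the share of true actives relabelled as inactive. The function name and both parameters are hypothetical and are not taken from the paper.

```python
import random


def inject_label_noise(labels, fp_per_active=0.0, fn_fraction=0.0, seed=0):
    """Return a noisy copy of binary labels (1 = active, 0 = inactive).

    fp_per_active: inactives flipped to active, per true active (assumed meaning).
    fn_fraction:   fraction of true actives flipped to inactive (assumed meaning).
    """
    rng = random.Random(seed)
    actives = [i for i, y in enumerate(labels) if y == 1]
    inactives = [i for i, y in enumerate(labels) if y == 0]

    # Never request more flips than there are compounds to flip.
    n_fp = min(len(inactives), round(fp_per_active * len(actives)))
    n_fn = min(len(actives), round(fn_fraction * len(actives)))

    noisy = list(labels)  # leave the caller's labels untouched
    for i in rng.sample(inactives, n_fp):
        noisy[i] = 1  # false positive
    for i in rng.sample(actives, n_fn):
        noisy[i] = 0  # false negative
    return noisy


# Toy screen: 10 true actives among 1000 compounds.
clean = [1] * 10 + [0] * 990
noisy = inject_label_noise(clean, fp_per_active=5.0, fn_fraction=0.5, seed=1)
```

A classifier would then be trained on `noisy` and evaluated against `clean`, which is the general shape of a retrospective noise-tolerance study.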

Armin Curt - One of the best experts on this subject based on the ideXlab platform.

  • Identifying homogeneous subgroups in neurological disorders: unbiased recursive partitioning in cervical complete spinal cord injury
    Neurorehabilitation and Neural Repair, 2014
    Co-Authors: Lorenzo G Tanadini, Torsten Hothorn, John D Steeves, Rainer Abel, Doris Maier, Martin Schubert, Norbert Weidner, Rudiger Rupp, Armin Curt
    Abstract:

    Background. The reliable stratification of homogeneous subgroups and the prediction of future clinical outcomes within heterogeneous neurological disorders is a particularly challenging task. Nonetheless, it is essential for the implementation of targeted care and effective therapeutic interventions. Objective. This study was designed to assess the value of a recently developed regression tool from the family of unbiased recursive partitioning methods in comparison to established statistical approaches (eg, linear and logistic regression) for predicting clinical endpoints and for prospective patients' stratification for clinical trials. Methods. A retrospective, longitudinal analysis of prospectively collected neurological data from the European Multicenter study about Spinal Cord Injury (EMSCI) network was undertaken on C4-C6 cervical sensorimotor complete subjects. Predictors were based on a broad set of early (<2 weeks) clinical assessments. Endpoints were based on later clinical examinations of upper extremity motor scores and recovery of motor levels, at 6 and 12 months, respectively. Prediction accuracy for each statistical analysis was quantified by resampling techniques. Results. For all settings, overlapping confidence intervals indicated similar prediction accuracy of unbiased recursive partitioning to established statistical approaches. In addition, unbiased recursive partitioning provided a direct way of identification of more homogeneous subgroups. The partitioning is carried out in a data-driven manner, independently from a priori decisions or predefined thresholds. Conclusion. Unbiased recursive partitioning techniques may improve prediction of future clinical endpoints and the planning of future SCI clinical trials by providing easily implementable, data-driven rationales for early patient stratification based on simple decision rules and clinical read-outs.
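
The abstract mentions resampling and overlapping confidence intervals without naming the exact scheme. As one possible illustration, and only as an assumption about what such a comparison could look like, the sketch below computes a percentile bootstrap interval for the mean absolute prediction error of each method and checks whether two intervals overlap. The function names and the error values are hypothetical.

```python
import random


def bootstrap_ci(errors, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-subject errors."""
    rng = random.Random(seed)
    n = len(errors)
    # Resample subjects with replacement and record the mean error each time.
    means = sorted(sum(rng.choices(errors, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up absolute errors (in motor score points) for two methods.
tree_errors = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 1.5, 2.0]
linear_errors = [2.5, 3.0, 1.5, 3.5, 2.0, 3.5, 1.0, 2.5]

ci_tree = bootstrap_ci(tree_errors, seed=1)
ci_linear = bootstrap_ci(linear_errors, seed=2)
similar = intervals_overlap(ci_tree, ci_linear)
```

Overlapping intervals are a rough indication of similar accuracy rather than a formal equivalence test.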

Meir Glick - One of the best experts on this subject based on the ideXlab platform.

  • Enrichment of high-throughput screening data with increasing levels of noise using support vector machines, recursive partitioning, and Laplacian-modified naive Bayesian classifiers
    Journal of Chemical Information and Modeling, 2006
    Co-Authors: Meir Glick, Jeremy L Jenkins, James H Nettles, Hamilton Hitchings, John W Davies
    Abstract:

    High-throughput screening (HTS) plays a pivotal role in lead discovery for the pharmaceutical industry. In tandem, cheminformatics approaches are employed to increase the probability of the identification of novel biologically active compounds by mining the HTS data. HTS data is notoriously noisy, and therefore, the selection of the optimal data mining method is important for the success of such an analysis. Here, we describe a retrospective analysis of four HTS data sets using three mining approaches: Laplacian-modified naive Bayes, recursive partitioning, and support vector machine (SVM) classifiers with increasing stochastic noise in the form of false positives and false negatives. All three of the data mining methods at hand tolerated increasing levels of false positives even when the ratio of misclassified compounds to true active compounds was 5:1 in the training set. False negatives in the ratio of 1:1 were tolerated as well. SVM outperformed the other two methods in capturing active compounds and scaffolds in the top 1%. A Murcko scaffold analysis could explain the differences in enrichments among the four data sets. This study demonstrates that data mining methods can add a true value to the screen even when the data is contaminated with a high level of stochastic noise.
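
Capturing actives "in the top 1%" is usually summarised as an enrichment factor. The abstract does not give a formula, so the sketch below uses a common definition as an assumption: the hit rate among the top-ranked fraction divided by the hit rate in the whole set. The function name `enrichment_factor` and the toy data are hypothetical.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Hit rate in the top-scored fraction divided by the overall hit rate.

    scores: model scores, higher means "more likely active".
    labels: true binary labels (1 = active, 0 = inactive).
    """
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")

    n_top = max(1, round(n * top_fraction))
    # Indices of compounds ordered from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(labels[i] for i in ranked[:n_top])

    return (hits_top / n_top) / (total_actives / n)


# Toy ranking: 1000 compounds, 10 actives, 4 of them ranked in the top 10.
scores = list(range(1000))          # compound 999 is ranked first
labels = [0] * 1000
for i in (0, 1, 2, 3, 4, 5, 996, 997, 998, 999):
    labels[i] = 1
ef = enrichment_factor(scores, labels, top_fraction=0.01)
```

An enrichment factor of 1 corresponds to random selection; larger values mean the ranking concentrates actives at the top of the list.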

Torsten Hothorn - One of the best experts on this subject based on the ideXlab platform.

  • Model-based recursive partitioning for subgroup analyses
    arXiv: Methodology, 2016
    Co-Authors: Heidi Seibold, Achim Zeileis, Torsten Hothorn
    Abstract:

    The identification of patient subgroups with differential treatment effects is the first step towards individualised treatments. A current draft guideline by the EMA discusses potentials and problems in subgroup analyses and formulates challenges to the development of appropriate statistical procedures for the data-driven identification of patient subgroups. We introduce model-based recursive partitioning as a procedure for the automated detection of patient subgroups that are identifiable by predictive factors. The method starts with a model for the overall treatment effect as defined for the primary analysis in the study protocol and uses measures for detecting parameter instabilities in this treatment effect. The procedure produces a segmented model with differential treatment parameters corresponding to each patient subgroup. The subgroups are linked to predictive factors by means of a decision tree. The method is applied to the search for subgroups of patients suffering from amyotrophic lateral sclerosis that differ with respect to the treatment effect of Riluzole, the only currently approved drug for this disease.
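
To make "differential treatment parameters" concrete, the sketch below estimates a treatment effect as a difference in mean outcome, first overall and then within two subgroups defined by one candidate predictive factor. This is only an illustration of the idea of a segmented treatment effect, not the procedure from the paper, which uses formal parameter instability measures. The field names (`treated`, `y`, `age`), the cut point and all numbers are made up.

```python
def treatment_effect(rows):
    """Difference in mean outcome between treated and control subjects."""
    treated = [r["y"] for r in rows if r["treated"]]
    control = [r["y"] for r in rows if not r["treated"]]
    if not treated or not control:
        raise ValueError("each group needs treated and control subjects")
    return sum(treated) / len(treated) - sum(control) / len(control)


def effects_by_subgroup(rows, factor, cut):
    """Treatment effect below/at the cut and above the cut of one factor."""
    low = [r for r in rows if r[factor] <= cut]
    high = [r for r in rows if r[factor] > cut]
    return treatment_effect(low), treatment_effect(high)


# Made-up data: the treatment helps younger subjects only.
patients = [
    {"age": 40, "treated": True, "y": 10.0},
    {"age": 45, "treated": True, "y": 12.0},
    {"age": 42, "treated": False, "y": 5.0},
    {"age": 48, "treated": False, "y": 7.0},
    {"age": 60, "treated": True, "y": 6.0},
    {"age": 65, "treated": True, "y": 8.0},
    {"age": 62, "treated": False, "y": 6.0},
    {"age": 68, "treated": False, "y": 8.0},
]
overall = treatment_effect(patients)                      # one global effect
young, old = effects_by_subgroup(patients, "age", cut=50)  # segmented effects
```

In this toy data the single overall effect hides the fact that the benefit is concentrated in one subgroup, which is the situation a subgroup analysis is meant to detect.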

  • Identifying homogeneous subgroups in neurological disorders: unbiased recursive partitioning in cervical complete spinal cord injury
    Neurorehabilitation and Neural Repair, 2014
    Co-Authors: Lorenzo G Tanadini, Torsten Hothorn, John D Steeves, Rainer Abel, Doris Maier, Martin Schubert, Norbert Weidner, Rudiger Rupp, Armin Curt
    Abstract:

    Background. The reliable stratification of homogeneous subgroups and the prediction of future clinical outcomes within heterogeneous neurological disorders is a particularly challenging task. Nonetheless, it is essential for the implementation of targeted care and effective therapeutic interventions. Objective. This study was designed to assess the value of a recently developed regression tool from the family of unbiased recursive partitioning methods in comparison to established statistical approaches (eg, linear and logistic regression) for predicting clinical endpoints and for prospective patients' stratification for clinical trials. Methods. A retrospective, longitudinal analysis of prospectively collected neurological data from the European Multicenter study about Spinal Cord Injury (EMSCI) network was undertaken on C4-C6 cervical sensorimotor complete subjects. Predictors were based on a broad set of early (<2 weeks) clinical assessments. Endpoints were based on later clinical examinations of upper extremity motor scores and recovery of motor levels, at 6 and 12 months, respectively. Prediction accuracy for each statistical analysis was quantified by resampling techniques. Results. For all settings, overlapping confidence intervals indicated similar prediction accuracy of unbiased recursive partitioning to established statistical approaches. In addition, unbiased recursive partitioning provided a direct way of identification of more homogeneous subgroups. The partitioning is carried out in a data-driven manner, independently from a priori decisions or predefined thresholds. Conclusion. Unbiased recursive partitioning techniques may improve prediction of future clinical endpoints and the planning of future SCI clinical trials by providing easily implementable, data-driven rationales for early patient stratification based on simple decision rules and clinical read-outs.

  • Recursive partitioning on incomplete data using surrogate decisions and multiple imputation
    Computational Statistics & Data Analysis, 2012
    Co-Authors: Alexander Hapfelmeier, Torsten Hothorn
    Abstract:

    The occurrence of missing data is a major problem in statistical data analysis. All scientific fields and data of all kinds and sizes are touched by this problem. There are a number of ad hoc solutions which unfortunately lead to a loss of power, biased inference, underestimation of variability and distorted relationships between variables. A more promising approach of rising popularity is multiple imputation by chained equations (MICE), also known as imputation by full conditional specification (FCS). Alternatives to imputation are given by methods with built-in procedures. These include recursive partitioning by classification and regression trees as well as corresponding random forests. However, there is little literature comparing the two approaches. Existing evaluations often lack generalizability due to restrictions on data structure and simulation schemes. The application of both methods to several kinds of data and different simulation settings is meant to improve and extend the comparative analyses. Classification and regression studies are examined. Recursive partitioning is executed by two popular tree implementations and one random forest implementation. Findings show that multiple imputation produces ambiguous performance results for both simulated and real-life data. Using surrogates instead is a fast and simple way to achieve performance which is only negligibly worse and in many cases even superior.
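
For readers unfamiliar with surrogate decisions, the idea can be sketched as follows: when the variable used by a split is missing for an observation, a second variable whose split best reproduces the primary decision is used instead. The code is a simplified illustration under that assumption, not the implementation evaluated in the paper. All names (`find_surrogate`, `route`) and the example data are hypothetical.

```python
def goes_left(value, cut):
    """Primary and surrogate splits both send values <= cut to the left."""
    return value <= cut


def find_surrogate(rows, primary, primary_cut, candidates):
    """Return (agreement, variable, cut, flipped) for the best surrogate.

    Agreement is the share of rows, observed on both variables, that the
    surrogate split sends to the same side as the primary split.
    """
    best = None
    for var in candidates:
        both = [r for r in rows if r[primary] is not None and r[var] is not None]
        if not both:
            continue
        for cut in sorted({r[var] for r in both}):
            agree = sum(
                goes_left(r[primary], primary_cut) == goes_left(r[var], cut)
                for r in both
            )
            # A surrogate may also work with its direction reversed.
            for flipped, score in ((False, agree), (True, len(both) - agree)):
                rate = score / len(both)
                if best is None or rate > best[0]:
                    best = (rate, var, cut, flipped)
    return best


def route(row, primary, primary_cut, surrogate, default_left=True):
    """Send a row left (True) or right (False), falling back when data is missing."""
    if row[primary] is not None:
        return goes_left(row[primary], primary_cut)
    if surrogate is not None:
        _, var, cut, flipped = surrogate
        if row[var] is not None:
            return goes_left(row[var], cut) != flipped
    return default_left  # e.g. the majority direction at this node


# Made-up data: weight tracks age closely, height does not.
rows = [
    {"age": 20, "weight": 50, "height": 180},
    {"age": 25, "weight": 55, "height": 160},
    {"age": 30, "weight": 60, "height": 175},
    {"age": 50, "weight": 80, "height": 165},
    {"age": 55, "weight": 85, "height": 182},
    {"age": 60, "weight": 90, "height": 158},
    {"age": None, "weight": 88, "height": 170},  # primary variable missing
]
surrogate = find_surrogate(rows, "age", 40, ["height", "weight"])
side = route(rows[-1], "age", 40, surrogate)  # decided by the surrogate
```

No values are imputed here: the incomplete row is routed directly, which is why surrogates are cheap compared with multiple imputation.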

  • Model-based recursive partitioning
    Journal of Computational and Graphical Statistics, 2008
    Co-Authors: Achim Zeileis, Torsten Hothorn, Kurt Hornik
    Abstract:

    Recursive partitioning is embedded into the general and well-established class of parametric models that can be fitted using M-type estimators (including maximum likelihood). An algorithm for model-based recursive partitioning is suggested for which the basic steps are: (1) fit a parametric model to a dataset; (2) test for parameter instability over a set of partitioning variables; (3) if there is some overall parameter instability, split the model with respect to the variable associated with the highest instability; (4) repeat the procedure in each of the daughter nodes. The algorithm yields a partitioned (or segmented) parametric model that can be effectively visualized and that subject-matter scientists are used to analyzing and interpreting.
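
The four steps listed in this abstract can be sketched in a few lines. The sketch makes two loud simplifications that are assumptions, not the authors' method: the parametric model is a one-regressor least-squares line, and the formal parameter instability test of step (2) is replaced by a crude heuristic, namely the relative drop in residual sum of squares when separate lines are fitted on either side of a median split. All names, thresholds and data are hypothetical.

```python
def fit_line(rows):
    """Step 1: least-squares fit of y = a + b*x; returns (a, b, rss)."""
    n = len(rows)
    mx = sum(r["x"] for r in rows) / n
    my = sum(r["y"] for r in rows) / n
    sxx = sum((r["x"] - mx) ** 2 for r in rows)
    sxy = sum((r["x"] - mx) * (r["y"] - my) for r in rows)
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    rss = sum((r["y"] - a - b * r["x"]) ** 2 for r in rows)
    return a, b, rss


def grow(rows, part_vars, min_size=10, min_gain=0.2, tol=1e-12):
    """Recursively partition rows, fitting one line per node."""
    a, b, rss = fit_line(rows)
    node = {"n": len(rows), "intercept": a, "slope": b}

    # Step 2 (heuristic stand-in for an instability test): how much does
    # the fit improve if the parameters may differ across a median split?
    best = None
    for var in part_vars:
        values = sorted(r[var] for r in rows)
        cut = values[(len(values) - 1) // 2]
        left = [r for r in rows if r[var] <= cut]
        right = [r for r in rows if r[var] > cut]
        if len(left) < min_size or len(right) < min_size:
            continue
        split_rss = fit_line(left)[2] + fit_line(right)[2]
        gain = (rss - split_rss) / rss if rss > tol else 0.0
        if best is None or gain > best[0]:
            best = (gain, var, cut, left, right)

    # Step 3: split on the variable with the largest "instability".
    if best is not None and best[0] >= min_gain:
        _, var, cut, left, right = best
        node["split_var"] = var
        node["cut"] = cut
        # Step 4: repeat in each daughter node.
        node["left"] = grow(left, part_vars, min_size, min_gain, tol)
        node["right"] = grow(right, part_vars, min_size, min_gain, tol)
    return node


# Made-up data: the x-y relationship changes with z1 but not with z2.
data = [
    {"x": i % 10, "z1": i, "z2": (i * 7) % 40,
     "y": (1 + 2 * (i % 10)) if i < 20 else (5 - (i % 10))}
    for i in range(40)
]
tree = grow(data, ["z2", "z1"])
```

On this toy data the root splits on `z1`, and each daughter node holds its own intercept and slope, which is the "segmented parametric model" the abstract describes.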

Lorenzo G Tanadini - One of the best experts on this subject based on the ideXlab platform.

  • Identifying homogeneous subgroups in neurological disorders: unbiased recursive partitioning in cervical complete spinal cord injury
    Neurorehabilitation and Neural Repair, 2014
    Co-Authors: Lorenzo G Tanadini, Torsten Hothorn, John D Steeves, Rainer Abel, Doris Maier, Martin Schubert, Norbert Weidner, Rudiger Rupp, Armin Curt
    Abstract:

    Background. The reliable stratification of homogeneous subgroups and the prediction of future clinical outcomes within heterogeneous neurological disorders is a particularly challenging task. Nonetheless, it is essential for the implementation of targeted care and effective therapeutic interventions. Objective. This study was designed to assess the value of a recently developed regression tool from the family of unbiased recursive partitioning methods in comparison to established statistical approaches (eg, linear and logistic regression) for predicting clinical endpoints and for prospective patients' stratification for clinical trials. Methods. A retrospective, longitudinal analysis of prospectively collected neurological data from the European Multicenter study about Spinal Cord Injury (EMSCI) network was undertaken on C4-C6 cervical sensorimotor complete subjects. Predictors were based on a broad set of early (<2 weeks) clinical assessments. Endpoints were based on later clinical examinations of upper extremity motor scores and recovery of motor levels, at 6 and 12 months, respectively. Prediction accuracy for each statistical analysis was quantified by resampling techniques. Results. For all settings, overlapping confidence intervals indicated similar prediction accuracy of unbiased recursive partitioning to established statistical approaches. In addition, unbiased recursive partitioning provided a direct way of identification of more homogeneous subgroups. The partitioning is carried out in a data-driven manner, independently from a priori decisions or predefined thresholds. Conclusion. Unbiased recursive partitioning techniques may improve prediction of future clinical endpoints and the planning of future SCI clinical trials by providing easily implementable, data-driven rationales for early patient stratification based on simple decision rules and clinical read-outs.
