Kernel Machine

The Experts below are selected from a list of 37917 Experts worldwide ranked by the ideXlab platform

Yair Goldberg - One of the best experts on this subject based on the ideXlab platform.

  • Kernel Machines for current status data
    Machine Learning, 2020
    Co-Authors: Yael Travis-lumer, Yair Goldberg
    Abstract:

    In survival analysis, estimating the failure time distribution is an important and difficult task, since the data are usually subject to censoring. In this paper we consider current status data, a type of data in which the failure time cannot be directly observed: all that is known is whether or not the failure time exceeds a random monitoring time. We propose a flexible Kernel Machine approach for estimating the failure time expectation as a function of the covariates from current status data. To obtain the Kernel Machine decision function, we minimize a regularized version of the empirical risk with respect to a new loss function. Using finite sample bounds and novel oracle inequalities, we prove that the obtained estimator converges to the true conditional expectation for a large family of probability measures. Finally, we present a simulation study and an analysis of real-world data that compare the performance of the proposed approach with existing methods. We show empirically that our approach is comparable to the current state of the art, and in some cases better.

  • Kernel Machines With Missing Responses
    arXiv: Machine Learning, 2018
    Co-Authors: Tiantian Liu, Yair Goldberg
    Abstract:

    Missing responses is a missing data format in which outcomes are not always observed. In this work we develop Kernel Machines that can handle missing responses. First, we propose a Kernel Machine family that uses mainly the complete cases. For the quadratic loss, we then propose a family of doubly-robust Kernel Machines. The proposed Kernel-Machine estimators can be applied to both regression and classification problems. We prove oracle inequalities for the finite-sample differences between the Kernel Machine risk and Bayes risk. We use these oracle inequalities to prove consistency and to calculate convergence rates. We demonstrate the performance of the two proposed Kernel Machine families using both a simulation study and a real-world data analysis.

  • Kernel Machines for Current Status Data.
    arXiv: Statistics Theory, 2015
    Co-Authors: Yael Travis-lumer, Yair Goldberg
    Abstract:

    In survival analysis, estimating the failure time distribution is an important and difficult task, since the data are usually subject to censoring. In this paper we consider current status data, a type of data in which all of the observations are censored: all that is known is whether or not the failure time exceeds a random monitoring time. We propose a flexible Kernel Machine approach for estimating the failure time expectation as a function of the covariates from current status data. To obtain the Kernel Machine decision function, we minimize a regularized version of the empirical risk with respect to a new loss function. Using finite sample bounds and novel oracle inequalities, we prove that the obtained estimator converges to the true conditional expectation for a large family of probability measures. Finally, we present a simulation study and an analysis of real-world data that compare the performance of the proposed approach with existing methods. We show empirically that our approach is comparable to the current state of the art, and in some cases better.
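The abstracts above describe Kernel Machines as minimizers of a regularized empirical risk, and the missing-responses work starts from the complete cases. The sketch below illustrates that general idea only. It assumes a squared loss, a Gaussian kernel, and an unweighted complete-case fit, none of which is taken from the papers; the new loss function for current status data and the doubly-robust estimators are not reproduced. The names `gaussian_kernel`, `solve`, and `fit_complete_case` are hypothetical.

```python
import math

def gaussian_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel on scalars; the width is an assumed default."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_complete_case(xs, ys, lam=0.1):
    """Kernel ridge fit using only the pairs whose response is observed.

    Minimizes (1/n) * sum (y_i - f(x_i))^2 + lam * ||f||^2 over the RKHS,
    whose solution has coefficients alpha = (K + n * lam * I)^{-1} y.
    """
    cc = [(x, y) for x, y in zip(xs, ys) if y is not None]
    pts = [x for x, _ in cc]
    n = len(pts)
    gram = [[gaussian_kernel(a, b) + (lam * n if i == j else 0.0)
             for j, b in enumerate(pts)] for i, a in enumerate(pts)]
    alpha = solve(gram, [y for _, y in cc])
    return lambda x: sum(a * gaussian_kernel(p, x) for a, p in zip(alpha, pts))

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.0, None, 1.0, None, 4.0]   # None marks a missing response
f = fit_complete_case(xs, ys)
```

Dropping incomplete cases without reweighting is the simplest possible choice and can be biased when missingness depends on the covariates, which is presumably part of what the proposed estimators address.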

Raghu G. Raj - One of the best experts on this subject based on the ideXlab platform.

  • An Online Stochastic Kernel Machine for Robust Signal Classification
    arXiv: Machine Learning, 2019
    Co-Authors: Raghu G. Raj
    Abstract:

    We present a novel variation of online Kernel Machines in which we exploit a consensus based optimization mechanism to guide the evolution of decision functions drawn from a reproducing Kernel Hilbert space, which efficiently models the observed stationary process.

  • An Online Stochastic Kernel Machine for Robust Signal Classification
    2019 53rd Asilomar Conference on Signals Systems and Computers, 2019
    Co-Authors: Raghu G. Raj
    Abstract:

    We present a novel variation of online Kernel Machines in which we exploit a consensus based optimization mechanism to guide the evolution of decision functions drawn from a reproducing Kernel Hilbert space (RKHS) such that the entire observed stationary process can be efficiently modeled. We derive an efficient classification algorithm based on these principles; it reduces to traditional online Kernel Machines in the special case in which the consensus based optimization mechanism is switched off. We illustrate the inherent resistance of our algorithm to label and input noise for the case of online classification, and derive relevant mistake bounds. The resulting algorithm has numerous potential applications, for example in Automatic Target Recognition (ATR) by remote sensing platforms, where the target being classified typically persists over the observation interval.
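For orientation, the "traditional online Kernel Machine" that the abstract mentions as a special case can be sketched with a kernel perceptron. This is a generic textbook baseline chosen for illustration, not the authors' algorithm: the consensus based optimization mechanism is not reproduced, and the class name, the RBF kernel, and the toy stream are all assumptions.

```python
import math

def rbf(x, z, gamma=1.0):
    """RBF kernel on equal-length tuples; gamma is an assumed default."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

class KernelPerceptron:
    """Online kernel classifier that stores an example only after a mistake."""

    def __init__(self, kernel=rbf):
        self.kernel = kernel
        self.support = []   # (label, point) pairs collected on mistakes

    def score(self, x):
        # Decision function in the RKHS: sum of y_i * k(x_i, x).
        return sum(y * self.kernel(p, x) for y, p in self.support)

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def update(self, x, y):
        """Process one labelled example; return True if it was misclassified."""
        mistake = self.predict(x) != y
        if mistake:
            self.support.append((y, x))
        return mistake

# Hypothetical two-class stream with labels in {-1, +1}.
stream = [((0.0, 0.0), -1), ((1.0, 1.0), 1), ((0.1, 0.0), -1), ((0.9, 1.1), 1)]
model = KernelPerceptron()
mistakes = sum(model.update(x, y) for x, y in stream)
```

The running count in `mistakes` is the quantity that mistake bounds of the kind mentioned in the abstract control.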

Songcan Chen - One of the best experts on this subject based on the ideXlab platform.

  • Discriminality-driven regularization framework for indefinite Kernel Machine
    Neurocomputing, 2014
    Co-Authors: Hui Xue, Songcan Chen
    Abstract:

    Indefinite Kernel Machines have attracted increasing interest in Machine learning because, in many applications, they achieve better empirical classification performance than the common positive definite Kernel Machines. A key to implementing an effective Kernel Machine is to use prior knowledge as fully as possible to guide the construction of appropriate Kernels. However, most existing indefinite Kernel Machines make insufficient use of the knowledge contained in the data, such as discriminative and structural information, and thus construct the indefinite Kernels empirically. Discriminatively regularized least-squares classification (DRLSC) is a recently proposed supervised classification method that provides a new discriminality-driven regularizer to encourage the discriminality of the classifier rather than the common smoothness. In this paper, we rigorously show that the discriminative regularizer coincides naturally with the definition of the inner product in a Reproducing Kernel Kreĭn Space (RKKS). As a result, we present a new discriminality-driven regularization framework for indefinite Kernel Machines based on the discriminative regularizer. Within this framework, we first reintroduce the original DRLSC from the viewpoint of proper indefinite Kernelization rather than the empirical Kernel mapping. We then propose a novel semi-supervised algorithm based on a different definition of the regularizer. Experiments on both toy and real-world datasets demonstrate the superiority of the two algorithms.

  • Multi-view Kernel Machine on single-view data
    Neurocomputing, 2008
    Co-Authors: Zhe Wang, Songcan Chen
    Abstract:

    Existing multi-view learning focuses on how to learn from data represented by multiple independent sets of attributes (termed multi-view data) and has been shown to deliver excellent performance. In general, however, only a single set of attributes (termed single-view data) is available. The goal of this paper is to apply the multi-view viewpoint to develop a multi-view Kernel Machine for such single-view data. The key is to associate each learning Machine with one Kernel and treat it as one view, thus forming a set of learning Machines from their corresponding Kernels; a multi-view Kernel Machine is then obtained by synthesizing them into a single learning framework. Further, in the two-view (two-Kernel) case, we explore the relationship between the generalization ability of the proposed Kernel Machine and its associated Kernels. Using Kernel alignment (KA) as a correlation measure between Kernels, we find that the superior performance of the proposed Machine results from a weaker correlation between the constituent Kernels. To the best of our knowledge, neither multi-view learning on single-view data nor the KA measure used here has appeared previously in the literature. In practice, we take the Kernel modified Ho-Kashyap with squared approximation of the misclassification errors (KMHKS) as the learning Machine and develop a multi-view KMHKS (MultiV-KMHKS) on single-view data.
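The multi-view abstract above uses Kernel alignment as a correlation measure between Kernels. A minimal sketch follows, assuming the commonly used empirical alignment (the Frobenius inner product of two Gram matrices divided by the product of their Frobenius norms). Whether the paper uses exactly this form is not stated in the abstract, and the function names and toy kernels are hypothetical.

```python
import math

def frobenius(a, b):
    """Frobenius inner product of two equally sized square matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def kernel_alignment(k1, k2):
    """Empirical alignment <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return frobenius(k1, k2) / math.sqrt(frobenius(k1, k1) * frobenius(k2, k2))

def gram(points, kernel):
    """Gram matrix of a kernel evaluated on a list of points."""
    return [[kernel(p, q) for q in points] for p in points]

# Two hypothetical "views" of the same scalar data: a linear and an RBF kernel.
points = [0.0, 1.0, 2.0]
linear = gram(points, lambda p, q: p * q)
rbf = gram(points, lambda p, q: math.exp(-(p - q) ** 2))

alignment = kernel_alignment(linear, rbf)
```

An alignment near 1 means the two Kernels induce nearly the same similarity structure; the abstract reports that weaker correlation between the constituent Kernels went with better performance.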

Tianxi Cai - One of the best experts on this subject based on the ideXlab platform.

  • Kernel Machine testing for risk prediction with stratified case cohort studies
    Biometrics, 2016
    Co-Authors: Rebecca Payne, Matey Neykov, Majken K Jensen, Tianxi Cai
    Abstract:

    Large assembled cohorts with banked biospecimens offer valuable opportunities to identify novel markers for risk prediction. When the outcome of interest is rare, an effective strategy to conserve limited biological resources while maintaining reasonable statistical power is the case cohort (CCH) sampling design, in which expensive markers are measured on a subset of cases and controls. However, the CCH design introduces significant analytical complexity due to outcome-dependent, finite-population sampling. Current methods for analyzing CCH studies focus primarily on the estimation of simple survival models with linear effects; testing and estimation procedures that can efficiently capture complex non-linear marker effects for CCH data remain elusive. In this article, we propose inverse probability weighted (IPW) variance component type tests for identifying important marker sets through a Cox proportional hazards Kernel Machine (CoxKM) regression framework previously considered for full cohort studies (Cai et al., 2011). The optimal choice of Kernel, while vitally important to attain high power, is typically unknown for a given dataset. Thus, we also develop robust testing procedures that adaptively combine information from multiple Kernels. The proposed IPW test statistics have complex null distributions that cannot easily be approximated explicitly. Furthermore, due to the correlation induced by CCH sampling, standard resampling methods such as the bootstrap fail to approximate the distribution correctly. We, therefore, propose a novel perturbation resampling scheme that can effectively recover the induced correlation structure. Results from extensive simulation studies suggest that the proposed IPW CoxKM testing procedures work well in finite samples. The proposed methods are further illustrated by application to a Danish CCH study of Apolipoprotein C-III markers on the risk of coronary heart disease.

  • Risk classification with an adaptive naive Bayes Kernel Machine model
    Journal of the American Statistical Association, 2015
    Co-Authors: Jessica Minnier, Ming Yuan, Jun S Liu, Tianxi Cai
    Abstract:

    Genetic studies of complex traits have uncovered only a small number of risk markers explaining a small fraction of heritability and adding little improvement to disease risk prediction. Standard single marker methods may lack power in selecting informative markers or estimating effects. Most existing methods also typically do not account for nonlinearity. Identifying markers with weak signals and estimating their joint effects among many noninformative markers remains challenging. One potential approach is to group markers based on biological knowledge such as gene structure. If markers in a group tend to have similar effects, proper usage of the group structure could improve power and efficiency in estimation. We propose a two-stage method relating markers to disease risk by taking advantage of known gene-set structures. Imposing a naive Bayes Kernel Machine (KM) model, we estimate gene-set specific risk models that relate each gene-set to the outcome in stage I. The KM framework efficiently models pote...

  • Sparse Kernel Machine regression for ordinal outcomes
    Biometrics, 2015
    Co-Authors: Yuanyuan Shen, Katherine P Liao, Tianxi Cai
    Abstract:

    Ordinal outcomes arise frequently in clinical studies when each subject is assigned to a category and the categories have a natural order. Classification rules for ordinal outcomes may be developed with commonly used regression models such as the full continuation ratio (CR) model (fCR), which allows the covariate effects to differ across all continuation ratios, and the CR model with a proportional odds structure (pCR), which assumes the covariate effects to be constant across all continuation ratios. For settings where the covariate effects differ between some continuation ratios but not all, fitting either fCR or pCR may lead to suboptimal prediction performance. In addition, these standard models do not allow for non-linear covariate effects. In this paper, we propose a sparse CR Kernel Machine (KM) regression method for ordinal outcomes where we use the KM framework to incorporate non-linearity and impose sparsity on the overall differences between the covariate effects of continuation ratios to control for overfitting. In addition, we provide a data-driven rule to select an optimal Kernel to maximize the prediction accuracy. Simulation results show that our proposed procedures perform well under both linear and non-linear settings, especially when the true underlying model is in between the fCR and pCR models. We apply our procedures to develop a prediction model for levels of anti-CCP among rheumatoid arthritis patients and demonstrate the advantage of our method over other commonly used methods.

  • Kernel Machine SNP-set analysis for censored survival outcomes in genome-wide association studies
    Genetic Epidemiology, 2011
    Co-Authors: Xinyi Lin, David C. Christiani, Tianxi Cai, Qian M Zhou, Geoffrey Liu, Xihong Lin
    Abstract:

    In this article, we develop a powerful test for identifying single nucleotide polymorphism (SNP)-sets that are predictive of survival with data from genome-wide association studies. We first group typed SNPs into SNP-sets based on genomic features and then apply a score test to assess the overall effect of each SNP-set on the survival outcome through a Kernel Machine Cox regression framework. This approach uses genetic information from all SNPs in the SNP-set simultaneously and accounts for linkage disequilibrium (LD), leading to a powerful test with reduced degrees of freedom when the typed SNPs are in LD with each other. This type of test also has the advantage of capturing the potentially nonlinear effects of the SNPs, SNP-SNP interactions (epistasis), and the joint effects of multiple causal variants. By simulating SNP data based on the LD structure of real genes from the HapMap project, we demonstrate that our proposed test is more powerful than the standard single SNP minimum P-value-based test for association studies with censored survival outcomes. We illustrate the proposed test with a real data application.
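The ordinal-outcomes abstract above builds on continuation ratios. As a small worked example, the sketch below converts continuation ratios into category probabilities, assuming the forward definition P(Y = k | Y >= k); the abstract does not say which convention the paper uses, and the function name and numbers are hypothetical. No Kernel Machine or sparsity penalty is involved here.

```python
def category_probs(cont_ratios):
    """Convert P(Y = k | Y >= k), k = 1..K-1, into P(Y = k), k = 1..K."""
    probs, remaining = [], 1.0
    for c in cont_ratios:
        probs.append(remaining * c)   # mass that stops at this category
        remaining *= 1.0 - c          # mass that continues to later categories
    probs.append(remaining)           # last category takes what is left
    return probs

# Hypothetical four-category outcome with three continuation ratios.
probs = category_probs([0.5, 0.5, 0.2])
```

In a CR regression each continuation ratio would itself be modelled as a function of the covariates, with effects that differ across ratios (fCR) or are shared (pCR).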

Matey Neykov - One of the best experts on this subject based on the ideXlab platform.

  • Kernel Machine score test for pathway analysis in the presence of semi-competing risks
    Statistical Methods in Medical Research, 2018
    Co-Authors: Matey Neykov, Boris P. Hejblum, Jennifer A Sinnott
    Abstract:

    In cancer studies, patients often experience two different types of events: a non-terminal event such as recurrence or metastasis, and a terminal event such as cancer-specific death. Identifying pathways and networks of genes associated with one or both of these events is an important step in understanding disease development and targeting new biological processes for potential intervention. These correlated outcomes are commonly dealt with by modeling progression-free survival, where the event time is the minimum between the times of recurrence and death. However, identifying pathways only associated with progression-free survival may miss out on pathways that affect time to recurrence but not death, or vice versa. We propose a combined testing procedure for a pathway’s association with both the cause-specific hazard of recurrence and the marginal hazard of death. The dependency between the two outcomes is accounted for through perturbation resampling to approximate the test’s null distribution, without ...

  • Kernel Machine score test for pathway analysis in the presence of semi-competing risks
    Statistical Methods in Medical Research, 2018
    Co-Authors: Matey Neykov, Boris P. Hejblum, Jennifer Sinnott
    Abstract:

    In cancer studies, patients often experience two different types of events: a non-terminal event such as recurrence or metastasis, and a terminal event such as cancer-specific death. Identifying pathways and networks of genes associated with one or both of these events is an important step in understanding disease development and targeting new biological processes for potential intervention. These correlated outcomes are commonly dealt with by modeling progression-free survival, where the event time is the minimum between the times of recurrence and death. However, identifying pathways only associated with progression-free survival may miss out on pathways that affect time to recurrence but not death, or vice versa. We propose a combined testing procedure for a pathway’s association with both the cause-specific hazard of recurrence and the marginal hazard of death. The dependency between the two outcomes is accounted for through perturbation resampling to approximate the test’s null distribution, without any further assumption on the nature of the dependency. Even complex non-linear relationships between pathways and disease progression or death can be uncovered thanks to a flexible Kernel Machine framework. The superior statistical power of our approach is demonstrated in numerical studies and in a gene expression study of breast cancer.

  • Kernel Machine testing for risk prediction with stratified case cohort studies
    Biometrics, 2016
    Co-Authors: Rebecca Payne, Matey Neykov, Majken K Jensen, Tianxi Cai
    Abstract:

    Large assembled cohorts with banked biospecimens offer valuable opportunities to identify novel markers for risk prediction. When the outcome of interest is rare, an effective strategy to conserve limited biological resources while maintaining reasonable statistical power is the case cohort (CCH) sampling design, in which expensive markers are measured on a subset of cases and controls. However, the CCH design introduces significant analytical complexity due to outcome-dependent, finite-population sampling. Current methods for analyzing CCH studies focus primarily on the estimation of simple survival models with linear effects; testing and estimation procedures that can efficiently capture complex non-linear marker effects for CCH data remain elusive. In this article, we propose inverse probability weighted (IPW) variance component type tests for identifying important marker sets through a Cox proportional hazards Kernel Machine (CoxKM) regression framework previously considered for full cohort studies (Cai et al., 2011). The optimal choice of Kernel, while vitally important to attain high power, is typically unknown for a given dataset. Thus, we also develop robust testing procedures that adaptively combine information from multiple Kernels. The proposed IPW test statistics have complex null distributions that cannot easily be approximated explicitly. Furthermore, due to the correlation induced by CCH sampling, standard resampling methods such as the bootstrap fail to approximate the distribution correctly. We, therefore, propose a novel perturbation resampling scheme that can effectively recover the induced correlation structure. Results from extensive simulation studies suggest that the proposed IPW CoxKM testing procedures work well in finite samples. The proposed methods are further illustrated by application to a Danish CCH study of Apolipoprotein C-III markers on the risk of coronary heart disease.
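The case cohort abstract above relies on inverse probability weighting (IPW) to correct for outcome-dependent sampling. The sketch below shows only the weighting idea on a toy mean, not the proposed variance component tests or the CoxKM framework. The cohort, the sampling probabilities, and the function name `ipw_mean` are hypothetical.

```python
def ipw_mean(values, sampled, probs):
    """Inverse probability weighted mean over the sampled subjects.

    Each sampled subject is weighted by 1 / (its sampling probability),
    and the weights are normalized by their sum (a Hajek-type estimator).
    """
    num = sum(v / p for v, s, p in zip(values, sampled, probs) if s)
    den = sum(1.0 / p for s, p in zip(sampled, probs) if s)
    return num / den

# Hypothetical cohort of six subjects: two cases (always measured) and
# four controls, each measured with probability 0.5.
marker = [2.0, 4.0, 1.0, None, 1.0, None]   # None: marker not measured
sampled = [True, True, True, False, True, False]
probs = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]

weighted = ipw_mean(marker, sampled, probs)
naive = sum(v for v in marker if v is not None) / sum(sampled)
```

In this toy cohort the unweighted mean over measured subjects overrepresents the cases, while the weighted mean counts each measured control twice to stand in for the controls that were not measured.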