Dimensional Feature


The experts below are selected from a list of 117,714 experts worldwide, ranked by the ideXlab platform.

Masashi Sugiyama - One of the best experts on this subject based on the ideXlab platform.

  • High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
    Neural Computation, 2014
    Co-Authors: Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama
    Abstract:

    The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this letter, we consider a feature-wise kernelized Lasso for capturing nonlinear input-output dependency. We first show that with particular choices of kernel functions, nonredundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures such as the Hilbert-Schmidt independence criterion. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments for classification and regression with thousands of features.
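The kernel-based dependence measure underlying the method can be illustrated directly. Below is a minimal numpy sketch of the (biased) empirical HSIC with Gaussian kernels: a relevant feature with a nonlinear relation to the output scores higher than an irrelevant one. The full HSIC Lasso of the paper solves a non-negative lasso over such per-feature kernel terms, which this sketch does not attempt; all names here are illustrative, not from the authors' code.

```python
import numpy as np

def gaussian_gram(x, sigma=None):
    """Gaussian-kernel Gram matrix of a 1-D sample."""
    d = x[:, None] - x[None, :]
    if sigma is None:
        sigma = np.median(np.abs(d)) + 1e-12  # median heuristic
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def hsic(x, y):
    """Biased empirical Hilbert-Schmidt independence criterion."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K = H @ gaussian_gram(x) @ H
    L = H @ gaussian_gram(y) @ H
    return np.trace(K @ L) / (n - 1) ** 2

rng = np.random.default_rng(0)
n = 100
x_rel = rng.normal(size=n)                     # relevant feature
y = np.sin(x_rel) + 0.1 * rng.normal(size=n)   # nonlinear dependence on x_rel
x_irr = rng.normal(size=n)                     # irrelevant feature

# The relevant feature receives a markedly larger dependence score.
print(hsic(x_rel, y), hsic(x_irr, y))
```

A linear correlation test would miss the sine relationship; HSIC detects it because the Gaussian kernels capture nonlinear dependence.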

  • High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
    Neural Computation, 2013
    Co-Authors: Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama
    Abstract:

    The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this paper, we consider a feature-wise kernelized Lasso for capturing non-linear input-output dependency. We first show that, with particular choices of kernel functions, non-redundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments with thousands of features.

Li Wang - One of the best experts on this subject based on the ideXlab platform.

  • Towards Ultrahigh Dimensional Feature Selection for Big Data
    Journal of Machine Learning Research, 2014
    Co-Authors: Mingkui Tan, Ivor W Tsang, Li Wang
    Abstract:

    In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Owing to this optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of millions of data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency.

  • Towards Ultrahigh Dimensional Feature Selection for Big Data
    arXiv: Learning, 2012
    Co-Authors: Mingkui Tan, Ivor W Tsang, Li Wang
    Abstract:

    In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data. To solve this problem effectively, we first reformulate it as a convex semi-infinite programming (SIP) problem and then propose an efficient feature generating paradigm. In contrast with traditional gradient-based approaches that conduct optimization on all input features, the proposed method iteratively activates a group of features and solves a sequence of multiple kernel learning (MKL) subproblems of much reduced scale. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Owing to this optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm guarantees that the solution converges globally under mild conditions and achieves lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world datasets containing tens of millions of data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency.
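A toy version of the active-set idea can be sketched as follows: activate the features most violating optimality, solve a small subproblem restricted to the active set, and repeat. Ordinary least squares stands in here for the MKL subproblems actually solved in the paper; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def feature_generating_ls(X, y, B=2, iters=5):
    """Toy active-set loop in the spirit of a feature generating
    paradigm: each round activates the B features with the largest
    gradient magnitude, then refits on the active set.  Least squares
    replaces the MKL subproblem of the paper (an illustrative stand-in)."""
    n, d = X.shape
    active = []
    w = np.zeros(d)
    for _ in range(iters):
        r = y - X @ w                        # current residual
        score = np.abs(X.T @ r) / n          # gradient magnitude per feature
        if active:
            score[active] = -np.inf          # skip already-active features
        new = list(np.argsort(score)[-B:])   # activate the top-B features
        active = sorted(set(active) | set(new))
        w = np.zeros(d)
        # refit the small subproblem on active features only
        w[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
    return w, active

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[[3, 17]] = [2.0, -3.0]                # only two truly relevant features
y = X @ true_w + 0.1 * rng.normal(size=200)
w, active = feature_generating_ls(X, y)
print(active)                                # should contain 3 and 17
```

The point of the pattern is that each subproblem involves only the small active set, never all d features at once, which is what makes the full method scale to ultrahigh dimensions.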

Baoping Tang - One of the best experts on this subject based on the ideXlab platform.

  • Multi-Fault Diagnosis for Rotating Machinery Based on Orthogonal Supervised Linear Local Tangent Space Alignment and Least Square Support Vector Machine
    Neurocomputing, 2015
    Co-Authors: Zuqiang Su, Baoping Tang
    Abstract:

    In order to improve the accuracy of fault diagnosis, this article proposes a multi-fault diagnosis method for rotating machinery based on orthogonal supervised linear local tangent space alignment (OSLLTSA) and least square support vector machine (LS-SVM). First, the collected vibration signals are decomposed by empirical mode decomposition (EMD), and a high-dimensional feature set is constructed by extracting statistical features, autoregressive (AR) coefficients and instantaneous amplitude Shannon entropy from those intrinsic mode functions (IMFs) that contain the most fault information. Then, the high-dimensional feature set is fed into OSLLTSA for dimension reduction to yield more sensitive low-dimensional fault features, which not only combines the intrinsic structure information and class label information of the dataset but also improves the discriminability of the low-dimensional fault features. Further, the low-dimensional fault features are fed to an LS-SVM to recognize machinery faults, with the LS-SVM parameters selected by enhanced particle swarm optimization (EPSO). Finally, the performance of the proposed method is verified by a fault diagnosis case on a rolling element bearing, and the results confirm the improved accuracy of fault diagnosis.
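The per-component feature extraction step described above (statistical features, AR coefficients, envelope Shannon entropy) can be sketched with numpy. EMD itself is omitted; the sketch computes features for a single signal component, and the exact feature list is an assumption for illustration, not the authors' exact set.

```python
import numpy as np

def envelope(x):
    """Instantaneous amplitude via the analytic signal (FFT-based Hilbert)."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def ar_coeffs(x, order=4):
    """Least-squares AR(order) coefficients of a 1-D signal."""
    X = np.column_stack([x[i:len(x) - order + i] for i in range(order)])
    return np.linalg.lstsq(X, x[order:], rcond=None)[0]

def imf_features(x):
    """Statistical features, AR coefficients, and envelope Shannon
    entropy for one signal component (e.g. one IMF)."""
    m, s = x.mean(), x.std()
    rms = np.sqrt(np.mean(x ** 2))
    stats = [rms,
             ((x - m) ** 4).mean() / s ** 4,   # kurtosis
             np.abs(x).max() / rms]            # crest factor
    a = envelope(x)
    p = a / a.sum()
    entropy = -np.sum(p * np.log(p + 1e-12))   # Shannon entropy of amplitude
    return np.concatenate([stats, ar_coeffs(x), [entropy]])

t = np.linspace(0, 1, 1024, endpoint=False)
x = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.default_rng(5).normal(size=1024)
f = imf_features(x)
print(f.shape)  # 3 stats + 4 AR coefficients + 1 entropy = 8 features
```

Stacking such vectors across IMFs is what produces the high-dimensional feature set that OSLLTSA then compresses.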

  • Fault Diagnosis for a Wind Turbine Transmission System Based on Manifold Learning and Shannon Wavelet Support Vector Machine
    Renewable Energy, 2014
    Co-Authors: Baoping Tang, Tao Song, Feng Li, Lei Deng
    Abstract:

    Fault diagnosis for wind turbine transmission systems is an important task for reducing their maintenance cost. However, the non-stationary dynamic operating conditions of wind turbines pose a challenge to fault diagnosis for these systems. In this paper, a novel fault diagnosis method based on manifold learning and the Shannon wavelet support vector machine is proposed for wind turbine transmission systems. First, mixed-domain features are extracted to construct a high-dimensional feature set characterizing the properties of the non-stationary vibration signals. Then, an effective manifold learning algorithm with non-linear dimensionality reduction capability, orthogonal neighborhood preserving embedding (ONPE), is applied to compress the high-dimensional feature set into low-dimensional eigenvectors. Finally, the low-dimensional eigenvectors are fed into a Shannon wavelet support vector machine (SWSVM) to recognize faults. The performance of the proposed method was demonstrated by a successful fault diagnosis application in a wind turbine's gearbox, and the results indicated that the method improved the accuracy of fault diagnosis.
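The overall pipeline — compress a high-dimensional feature set into low-dimensional coordinates, then classify — can be illustrated with stand-ins. ONPE is not available in common libraries, so the sketch below substitutes plain PCA as the reducer and a nearest-centroid rule for the SWSVM; both substitutions are assumptions labeled in the code, not the paper's method.

```python
import numpy as np

def pca_reduce(X, k):
    """Linear dimensionality reduction via PCA -- a simple stand-in
    for ONPE, which has no implementation in common libraries."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(2)
# Two simulated fault classes embedded in a 50-dimensional feature set
A = rng.normal(loc=0.0, size=(40, 50))
B = rng.normal(loc=1.5, size=(40, 50))
X = np.vstack([A, B])
labels = np.array([0] * 40 + [1] * 40)

Z = pca_reduce(X, 3)                  # low-dimensional coordinates

# Nearest-centroid classification in the reduced space
# (stands in for the SWSVM classifier of the paper)
c0, c1 = Z[labels == 0].mean(0), Z[labels == 1].mean(0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
acc = (pred == labels).mean()
print(acc)
```

The design point the paper makes is that the reduction step should preserve neighborhood structure (hence ONPE rather than PCA); the sketch only shows where that step sits in the pipeline.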

Jianqing Fan - One of the best experts on this subject based on the ideXlab platform.

  • A Selective Overview of Variable Selection in High-Dimensional Feature Space
    Statistica Sinica, 2010
    Co-Authors: Jianqing Fan
    Abstract:

    High-dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high-dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high-dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role penalty functions play, and what the resulting statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high-dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high-dimensional variable selection, with emphasis on independence screening and two-scale methods.
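The contrast between the convex lasso penalty and the non-concave SCAD penalty emphasized above shows up already in their univariate thresholding rules (Fan and Li's SCAD rule, here with the conventional a = 3.7): the lasso shrinks every coefficient toward zero, while SCAD leaves large coefficients unshrunk and is therefore nearly unbiased.

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso (L1) thresholding rule: shrinks every coefficient by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule: agrees with the lasso near zero,
    interpolates in the middle, and leaves |z| > a*lam untouched."""
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) <= 2 * lam,
                    soft_threshold(z, lam),
                    np.where(np.abs(z) <= a * lam,
                             ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),
                             z))

# A large coefficient: the lasso shrinks it, SCAD does not.
print(float(soft_threshold(5.0, 1.0)))  # 4.0
print(float(scad_threshold(5.0, 1.0)))  # 5.0  (|z| > a*lam)
```

This bias difference is exactly why the overview stresses non-concave penalties: the lasso's constant shrinkage of large effects costs estimation accuracy that SCAD recovers.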

  • A Selective Overview of Variable Selection in High-Dimensional Feature Space (Invited Review Article)
    arXiv: Statistics Theory, 2009
    Co-Authors: Jianqing Fan
    Abstract:

    High-dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high-dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high-dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role penalty functions play, and what the resulting statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high-dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high-dimensional variable selection, with emphasis on independence screening and two-scale methods.

  • Sure Independence Screening for Ultrahigh-Dimensional Feature Space
    Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2008
    Co-Authors: Jianqing Fan
    Abstract:

    Variable selection plays an important role in high-dimensional statistical modelling, which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, accuracy of estimation and computational cost are two top concerns. Recently, Candes and Tao have proposed the Dantzig selector using L1-regularization and showed that it achieves the ideal risk up to a logarithmic factor log(p). Their innovative procedure and remarkable result are challenged when the dimensionality is ultrahigh, as the factor log(p) can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method that is based on correlation learning, called sure independence screening, to reduce dimensionality from high to a moderate scale that is below the sample size. In a fairly general asymptotic framework, correlation learning is shown to have the sure screening property even for exponentially growing dimensionality. As a methodological extension, iterative sure independence screening is also proposed to enhance its finite sample performance. With dimension reduced accurately from high to below the sample size, variable selection can be improved in both speed and accuracy, and can then be accomplished by a well-developed method such as smoothly clipped absolute deviation, the Dantzig selector, lasso or adaptive lasso. The connections between these penalized least squares methods are also elucidated.
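Sure independence screening itself is simple to sketch: rank features by absolute marginal correlation with the response and keep the top d, with d chosen below the sample size. The toy data below (p much larger than n) is illustrative.

```python
import numpy as np

def sis(X, y, d):
    """Sure independence screening: keep the d features with the
    largest absolute marginal correlation with y (d below the
    sample size)."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    corr = np.abs(Xs.T @ ys) / len(y)      # marginal correlations
    return np.argsort(corr)[::-1][:d]

rng = np.random.default_rng(3)
n, p = 100, 5000                            # ultrahigh-dimensional: p >> n
X = rng.normal(size=(n, p))
y = 3 * X[:, 7] - 2 * X[:, 42] + 0.5 * rng.normal(size=n)

kept = sis(X, y, d=20)                      # screen down to below sample size
print(sorted(kept))                         # should include 7 and 42
```

After screening, a penalized method (SCAD, Dantzig selector, lasso) runs on only 20 variables instead of 5,000 — the two-scale strategy the abstract describes.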

  • Variable Screening in High-Dimensional Feature Space
    2007
    Co-Authors: Jianqing Fan
    Abstract:

    Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Fan and Lv [8] introduced the concept of sure screening to reduce the dimensionality. This article first reviews part of their ideas and results and then extends them to likelihood-based models. The techniques are then applied to disease classification in computational biology and portfolio selection in finance.

Weifang Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Bearing Fault Feature Selection Method Based on Weighted Multi-Dimensional Feature Fusion
    IEEE Access, 2020
    Co-Authors: Wei Dai, Weifang Zhang
    Abstract:

    The rolling bearing is one of the most critical components in rotating machinery, so in order to efficiently select features, reduce feature dimensions and improve the correctness of fault diagnosis, a feature selection and fusion method based on weighted multi-dimensional feature fusion is proposed. First, features are extracted from different domains to constitute the original high-dimensional feature set. Considering the large number of invalid and redundant features contained in such an original feature set, a feature selection process that combines support vector machine (SVM) single-feature evaluation, correlation analysis and principal component analysis-weighted load evaluation (PCA-WLE) is put forward in this paper for selecting sensitive features. The selected features are weighted and fused according to their sensitivity so as to further weaken the interference of features of low importance. Finally, this process is applied to the data provided by the Case Western Reserve University Bearing Data Center and the Xi'an Jiaotong University School of Mechanical Engineering, respectively, and the fault is diagnosed using a particle swarm optimization-support vector machine (PSO-SVM). The results show that this method can accurately identify different fault categories and degrees of bearing damage, and is more accurate and practical than single-domain fault diagnosis, with higher recognition ability.
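The single-feature evaluation and weighting steps can be sketched with simple stand-ins: here a best-threshold split scores each feature in place of the paper's single-feature SVM evaluation, and features are scaled by their normalized scores before fusion. All names and the toy data are illustrative assumptions.

```python
import numpy as np

def single_feature_scores(X, y):
    """Score each feature by the accuracy of the best median-threshold
    split on it -- a simple stand-in for the paper's single-feature
    SVM evaluation."""
    n, d = X.shape
    scores = np.empty(d)
    for j in range(d):
        pred = (X[:, j] > np.median(X[:, j])).astype(int)
        # take the better of the two label orientations
        scores[j] = max((pred == y).mean(), (pred != y).mean())
    return scores

def weighted_fusion(X, scores):
    """Scale each feature by its normalized sensitivity score,
    down-weighting less informative features before fusion."""
    return X * (scores / scores.sum())

rng = np.random.default_rng(4)
y = np.array([0] * 50 + [1] * 50)
informative = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])
noise = rng.normal(size=100)                 # carries no class information
X = np.column_stack([informative, noise])

s = single_feature_scores(X, y)
print(s)                                     # informative feature scores higher
Xw = weighted_fusion(X, s)                   # weighted feature set for the classifier
```

The weighted set `Xw` would then be passed to the downstream classifier (PSO-SVM in the paper), so that weakly informative features contribute proportionally less to the decision.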