Distance Metric

The Experts below are selected from a list of 69546 Experts worldwide ranked by the ideXlab platform

Bernard De Baets - One of the best experts on this subject based on the ideXlab platform.

  • Kernel-Based Distance Metric Learning for Supervised $k$-Means Clustering
    IEEE Transactions on Neural Networks and Learning Systems, 2019
    Co-Authors: Bac Nguyen, Bernard De Baets
    Abstract:

    Finding an appropriate Distance Metric that accurately reflects the (dis)similarity between examples is a key to the success of k-means clustering. While it is not always an easy task to specify a good Distance Metric, we can try to learn one based on prior knowledge from some available clustered data sets, an approach that is referred to as supervised clustering. In this paper, a kernel-based Distance Metric learning method is developed to improve the practical use of k-means clustering. Given the corresponding optimization problem, we derive a meaningful Lagrange dual formulation and introduce an efficient algorithm in order to reduce the training complexity. Our formulation is simple to implement, allowing a large-scale Distance Metric learning problem to be solved in a computationally tractable way. Experimental results show that the proposed method yields more robust and better performances on synthetic as well as real-world data sets compared to other state-of-the-art Distance Metric learning methods.
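As a rough illustration of why the choice of Distance Metric matters for k-means, the sketch below assigns a point to its nearest centroid under a squared Mahalanobis-style distance (x - c)^T M (x - c). The matrix `M`, the centroids, and the function names are hypothetical and hand-picked; this is not the kernel-based method or the dual algorithm of the paper, only a generic sketch of how a different metric can change cluster assignments.

```python
def mahalanobis_sq(x, c, M):
    """Squared distance (x - c)^T M (x - c) for a symmetric PSD matrix M."""
    d = [a - b for a, b in zip(x, c)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def assign(points, centroids, M):
    """k-means assignment step: index of the nearest centroid under M."""
    return [
        min(range(len(centroids)), key=lambda k: mahalanobis_sq(p, centroids[k], M))
        for p in points
    ]


# Hypothetical toy data: two centroids and one point to assign.
centroids = [[0.0, 0.0], [1.0, 3.0]]
point = [3.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]        # plain Euclidean distance
downweight_y = [[1.0, 0.0], [0.0, 0.01]]   # hand-picked metric that nearly ignores the 2nd feature

print(assign([point], centroids, identity))      # [0]
print(assign([point], centroids, downweight_y))  # [1]
```

With the identity matrix the point goes to the first centroid; once the second feature is down-weighted, the same point is assigned to the second centroid.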

  • Supervised Distance Metric learning through maximization of the Jeffrey divergence
    Pattern Recognition, 2017
    Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
    Abstract:

    Over the past decades, Distance Metric learning has attracted a lot of interest in machine learning and related fields. In this work, we propose an optimization framework for Distance Metric learning via linear transformations by maximizing the Jeffrey divergence between two multivariate Gaussian distributions derived from local pairwise constraints. In our method, the Distance Metric is trained on positive and negative difference spaces, which are built from the neighborhood of each training instance, so that the local discriminative information is preserved. We show how to solve this problem with a closed-form solution rather than using tedious optimization procedures. The solution is easy to implement, and tractable for large-scale problems. Experimental results are presented for both a linear and a kernelized version of the proposed method for k-nearest neighbors classification. We obtain classification accuracies superior to the state-of-the-art Distance Metric learning methods in several cases while being competitive in others.

    Highlights: We propose a novel Distance Metric learning method (DMLMJ) for classification. DMLMJ is simple to implement and it can be solved analytically. We extend DMLMJ into a kernelized version to tackle non-linear problems. Experiments on several data sets show the effectiveness of the proposed method.
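For readers unfamiliar with the Jeffrey divergence mentioned in this abstract: it is the symmetrized Kullback-Leibler divergence, J(P, Q) = KL(P || Q) + KL(Q || P). The sketch below evaluates it for two one-dimensional Gaussians, which is a simplifying assumption made here for illustration; the paper works with multivariate Gaussians and a learned linear transformation, neither of which is reproduced. Function names are hypothetical.

```python
import math


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for one-dimensional Gaussians P = N(mu_p, var_p), Q = N(mu_q, var_q)."""
    return 0.5 * (var_p / var_q + (mu_q - mu_p) ** 2 / var_q - 1.0 + math.log(var_q / var_p))


def jeffrey(mu_p, var_p, mu_q, var_q):
    """Jeffrey divergence: the sum of the two directed KL divergences."""
    return kl_gauss(mu_p, var_p, mu_q, var_q) + kl_gauss(mu_q, var_q, mu_p, var_p)


def jeffrey_closed_form(mu_p, var_p, mu_q, var_q):
    """Same quantity with the log terms cancelled analytically."""
    ratio_term = 0.5 * (var_p / var_q + var_q / var_p) - 1.0
    mean_term = 0.5 * (mu_p - mu_q) ** 2 * (1.0 / var_p + 1.0 / var_q)
    return ratio_term + mean_term


print(jeffrey(0.0, 1.0, 1.0, 1.0))              # 1.0
print(jeffrey_closed_form(0.0, 1.0, 0.0, 4.0))  # 1.125
```

The log-determinant terms of the two KL divergences cancel, so the divergence depends only on variance ratios and the mean difference.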

  • Large-scale Distance Metric learning for k-nearest neighbors regression
    Neurocomputing, 2016
    Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
    Abstract:

    This paper presents a Distance Metric learning method for k-nearest neighbors regression. We define the constraints based on triplets, which are built from the neighborhood of each training instance, to learn the Distance Metric. The resulting optimization problem can be formulated as a convex quadratic program. Quadratic programming has a disadvantage that it does not scale well in large-scale settings. To reduce the time complexity of training, we propose a novel dual coordinate descent method for this type of problem. Experimental results on several regression data sets show that our method obtains a competitive performance when compared with the state-of-the-art Distance Metric learning methods, while being an order of magnitude faster.
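The triplet constraints described in this abstract can be pictured with a small sketch. The construction rule used below is an assumption made here, not taken from the paper: within the k nearest neighbors of instance i, a triplet (i, j, l) is formed when the target of j is closer to the target of i than the target of l is, and the constraint asks that j also be closer than l in feature space by a margin. The hinge loss, the margin value, and all names are hypothetical; the paper's quadratic program and dual coordinate descent solver are not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def build_triplets(X, y, k):
    """Triplets (i, j, l) from each instance's k nearest neighbors (assumed rule)."""
    triplets = []
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        neighbors = sorted(others, key=lambda j: sq_dist(X[i], X[j]))[:k]
        for j in neighbors:
            for l in neighbors:
                # j should rank ahead of l if its target is closer to y[i].
                if abs(y[i] - y[j]) < abs(y[i] - y[l]):
                    triplets.append((i, j, l))
    return triplets


def hinge_loss(X, triplets, margin=1.0):
    """Total violation of d(i, l) >= d(i, j) + margin over all triplets."""
    return sum(
        max(0.0, margin + sq_dist(X[i], X[j]) - sq_dist(X[i], X[l]))
        for i, j, l in triplets
    )


# Hypothetical one-feature regression data.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [0.0, 1.0, 2.0, 10.0]
triplets = build_triplets(X, y, k=2)
print(triplets)
print(hinge_loss(X, triplets))
```

A metric learner would adjust the distance function to drive this loss down; here the distance is fixed and only the constraint bookkeeping is illustrated.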

Bac Nguyen - One of the best experts on this subject based on the ideXlab platform.

  • Kernel-Based Distance Metric Learning for Supervised $k$-Means Clustering
    IEEE Transactions on Neural Networks and Learning Systems, 2019
    Co-Authors: Bac Nguyen, Bernard De Baets

  • Supervised Distance Metric learning through maximization of the Jeffrey divergence
    Pattern Recognition, 2017
    Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets

  • Large-scale Distance Metric learning for k-nearest neighbors regression
    Neurocomputing, 2016
    Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets

Xuan Zeng - One of the best experts on this subject based on the ideXlab platform.

  • Improved Tangent Space-Based Distance Metric for Lithographic Hotspot Classification
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017
    Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng, Dian Zhou
    Abstract:

    A Distance Metric of patterns is crucial to hotspot cluster analysis and classification. In this paper, we propose an improved tangent space (ITS)-based Distance Metric for hotspot cluster analysis and classification. The proposed Distance Metric is an important extension of the well-developed tangent space method in computer vision. It can handle patterns containing multiple polygons, while the traditional tangent space method can only deal with patterns with a single polygon. It inherits most of the advantages of the traditional tangent space method, e.g., it is easy to compute and is tolerant of small variations or shifts of the shapes. The ITS-based Distance Metric is a more reliable and accurate Metric for hotspot cluster analysis and classification. We also propose a hierarchical density-based clustering method for hotspot clustering. It is more suitable for arbitrarily shaped clusters.

  • Improved tangent space based Distance Metric for accurate lithographic hotspot classification
    DAC Design Automation Conference 2012, 2012
    Co-Authors: Fan Yang, Charles Chiang, Subarna Sinha, Xuan Zeng
    Abstract:

    A Distance Metric of patterns is crucial to hotspot cluster analysis and classification. In this paper, we propose an improved tangent space based Metric for pattern matching based hotspot cluster analysis and classification. The proposed Distance Metric is an important extension of the well-developed tangent space method in computer vision. It can handle patterns containing multiple polygons, while the traditional tangent space method can only deal with patterns with a single polygon. It inherits most of the advantages of the traditional tangent space method, e.g., it is easy to compute and is tolerant of small variations or shifts of the shapes. Compared with the existing Distance Metric based on XOR of hotspot patterns, the improved tangent space based Distance Metric can achieve up to 37.5% accuracy improvement with at most 4.3× computational cost in the context of cluster analysis. The improved tangent space based Distance Metric is a more reliable and accurate Metric for hotspot cluster analysis and classification. It is more suitable for industry applications.

  • DAC - Improved tangent space based Distance Metric for accurate lithographic hotspot classification
    Proceedings of the 49th Annual Design Automation Conference on - DAC '12, 2012
    Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng

Eric P Xing - One of the best experts on this subject based on the ideXlab platform.

  • Multi-modal Distance Metric learning
    International Joint Conference on Artificial Intelligence, 2013
    Co-Authors: Pengtao Xie, Eric P Xing
    Abstract:

    Multi-modal data is dramatically increasing with the fast growth of social media. Learning a good Distance measure for data with multiple modalities is of vital importance for many applications, including retrieval, clustering, classification and recommendation. In this paper, we propose an effective and scalable multi-modal Distance Metric learning framework. Based on the multi-wing harmonium model, our method provides a principled way to embed data of arbitrary modalities into a single latent space, in which an optimal Distance Metric can be learned under proper supervision, i.e., by minimizing the Distance between similar pairs while maximizing the Distance between dissimilar pairs. The parameters are learned by jointly optimizing the data likelihood under the latent space model and the loss induced by Distance supervision; our method thereby seeks a balance between explaining the data and providing an effective Distance Metric, which naturally avoids overfitting. We apply our general framework to text/image data and present empirical results on retrieval and classification to demonstrate the effectiveness and scalability.
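The Distance supervision described in this abstract (pull similar pairs together, push dissimilar pairs apart) can be sketched with a simple pairwise loss over latent vectors. The hinge with a margin on dissimilar pairs is an assumption chosen here to keep the "maximize" term bounded; the abstract does not state the exact loss, and the multi-wing harmonium likelihood term is omitted entirely. All names and values are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pairwise_loss(latent, similar, dissimilar, margin=1.0):
    """Penalize distant similar pairs and dissimilar pairs closer than the margin."""
    pull = sum(sq_dist(latent[i], latent[j]) for i, j in similar)
    push = sum(max(0.0, margin - sq_dist(latent[i], latent[j])) for i, j in dissimilar)
    return pull + push


# Hypothetical latent embeddings of three items.
latent = [[0.0, 0.0], [0.0, 0.1], [3.0, 0.0]]

# Items 0 and 1 labeled similar, items 0 and 2 dissimilar: low loss.
print(pairwise_loss(latent, similar=[(0, 1)], dissimilar=[(0, 2)]))

# The opposite labeling is badly served by the same embedding: high loss.
print(pairwise_loss(latent, similar=[(0, 2)], dissimilar=[(0, 1)]))
```

In a joint objective of the kind the abstract describes, a term like this would be balanced against the data likelihood rather than minimized on its own.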

  • IJCAI - Multi-modal Distance Metric learning
    2013
    Co-Authors: Pengtao Xie, Eric P Xing

Fan Yang - One of the best experts on this subject based on the ideXlab platform.

  • Improved Tangent Space-Based Distance Metric for Lithographic Hotspot Classification
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017
    Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng, Dian Zhou

  • Distance Metric Learning-Based Conformal Predictor
    2012
    Co-Authors: Fan Yang, Zhigang Chen, Guifang Shao, Huazhen Wang
    Abstract:

    In order to improve the computational efficiency of the conformal predictor, Distance Metric learning methods were used in the algorithm. The process of learning was divided into two stages: offline learning and online learning. First, part of the training data was used in Distance Metric learning to get a space transformation matrix in the offline learning stage. Second, standard CP-KNN was conducted on the remaining training data with a nonconformity measure function defined by the K nearest neighbors classifier in the transformed space. Experimental results on three UCI datasets demonstrate the efficiency of the new algorithm.
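The online stage described in this abstract can be sketched as a small conformal predictor. The nonconformity score used below, the sum of distances to the k nearest same-class examples divided by the sum of distances to the k nearest other-class examples, is the ratio commonly used with nearest-neighbor conformal prediction and is assumed here to be what CP-KNN refers to. The transformation matrix `L` stands in for the offline-learned matrix and is simply the identity in this example; all names are hypothetical.

```python
def transform(L, x):
    """Apply the (assumed, offline-learned) linear map L to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in L]


def dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def nonconformity(z, label, others, k):
    """Ratio of k-NN distances: same class over other classes.

    Assumes at least one other-class example at non-zero distance.
    """
    same = sorted(dist(z, v) for v, c in others if c == label)[:k]
    diff = sorted(dist(z, v) for v, c in others if c != label)[:k]
    return sum(same) / sum(diff)


def p_value(train, x_new, label, L, k=1):
    """Fraction of examples at least as nonconforming as x_new with a tentative label."""
    bag = [(transform(L, x), c) for x, c in train] + [(transform(L, x_new), label)]
    scores = [
        nonconformity(z, c, bag[:i] + bag[i + 1:], k) for i, (z, c) in enumerate(bag)
    ]
    return sum(s >= scores[-1] for s in scores) / len(scores)


# Hypothetical two-class data; L is the identity, i.e. no learned transformation.
train = [([0.0, 0.0], "a"), ([0.0, 1.0], "a"), ([5.0, 5.0], "b"), ([5.0, 6.0], "b")]
L = [[1.0, 0.0], [0.0, 1.0]]
x_new = [0.0, 0.5]

print(p_value(train, x_new, "a", L))  # 0.8
print(p_value(train, x_new, "b", L))  # 0.2
```

The label with the larger p-value is the more plausible one; a prediction region at significance level ε would keep every label whose p-value exceeds ε.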

  • Improved tangent space based Distance Metric for accurate lithographic hotspot classification
    DAC Design Automation Conference 2012, 2012
    Co-Authors: Fan Yang, Charles Chiang, Subarna Sinha, Xuan Zeng

  • DAC - Improved tangent space based Distance Metric for accurate lithographic hotspot classification
    Proceedings of the 49th Annual Design Automation Conference on - DAC '12, 2012
    Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng