The Experts below are selected from a list of 69546 Experts worldwide ranked by ideXlab platform
Bernard De Baets - One of the best experts on this subject based on the ideXlab platform.
-
Kernel-Based Distance Metric Learning for Supervised $k$-Means Clustering
IEEE Transactions on Neural Networks and Learning Systems, 2019
Co-Authors: Bac Nguyen, Bernard De Baets
Abstract: Finding an appropriate distance metric that accurately reflects the (dis)similarity between examples is a key to the success of k-means clustering. While it is not always an easy task to specify a good distance metric, we can try to learn one based on prior knowledge from some available clustered data sets, an approach that is referred to as supervised clustering. In this paper, a kernel-based distance metric learning method is developed to improve the practical use of k-means clustering. Given the corresponding optimization problem, we derive a meaningful Lagrange dual formulation and introduce an efficient algorithm in order to reduce the training complexity. Our formulation is simple to implement, allowing a large-scale distance metric learning problem to be solved in a computationally tractable way. Experimental results show that the proposed method yields more robust and better performance on synthetic as well as real-world data sets compared to other state-of-the-art distance metric learning methods.
-
Supervised distance metric learning through maximization of the Jeffrey divergence
Pattern Recognition, 2017
Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
Abstract: Over the past decades, distance metric learning has attracted a lot of interest in machine learning and related fields. In this work, we propose an optimization framework for distance metric learning via linear transformations by maximizing the Jeffrey divergence between two multivariate Gaussian distributions derived from local pairwise constraints. In our method, the distance metric is trained on positive and negative difference spaces, which are built from the neighborhood of each training instance, so that the local discriminative information is preserved. We show how to solve this problem with a closed-form solution rather than using tedious optimization procedures. The solution is easy to implement, and tractable for large-scale problems. Experimental results are presented for both a linear and a kernelized version of the proposed method for k-nearest neighbors classification. We obtain classification accuracies superior to the state-of-the-art distance metric learning methods in several cases while being competitive in others.
Highlights: We propose a novel distance metric learning method (DMLMJ) for classification. DMLMJ is simple to implement and it can be solved analytically. We extend DMLMJ into a kernelized version to tackle non-linear problems. Experiments on several data sets show the effectiveness of the proposed method.
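To make "distance metric learning via linear transformations" concrete, the sketch below shows only what such a metric looks like once a transformation is available. It is not the DMLMJ closed-form solution, which the abstract does not spell out. The matrix `L` and the function names are hypothetical and chosen for illustration: the distance is the Euclidean distance after mapping both points through `L`, which is the same as a Mahalanobis-type distance with matrix `M = L^T L`.

```python
import numpy as np

def transformed_distance(L, x, y):
    """Euclidean distance between x and y after applying the linear map L."""
    diff = L @ (np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sqrt(diff @ diff))

def mahalanobis_form(L, x, y):
    """Same distance written as sqrt((x - y)^T M (x - y)) with M = L^T L."""
    M = L.T @ L
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ M @ d))

# Hypothetical transformation: stretch the first feature, shrink the second.
L = np.array([[2.0, 0.0],
              [0.0, 0.5]])
x = np.array([1.0, 2.0])
y = np.array([0.0, 0.0])

d_transformed = transformed_distance(L, x, y)
d_mahalanobis = mahalanobis_form(L, x, y)
```

A k-nearest neighbors classifier would then rank neighbors by `transformed_distance` instead of the plain Euclidean distance.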
-
Large-scale distance metric learning for k-nearest neighbors regression
Neurocomputing, 2016
Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
Abstract: This paper presents a distance metric learning method for k-nearest neighbors regression. We define the constraints based on triplets, which are built from the neighborhood of each training instance, to learn the distance metric. The resulting optimization problem can be formulated as a convex quadratic program. Quadratic programming has the disadvantage that it does not scale well to large problems. To reduce the time complexity of training, we propose a novel dual coordinate descent method for this type of problem. Experimental results on several regression data sets show that our method obtains a competitive performance when compared with the state-of-the-art distance metric learning methods, while being an order of magnitude faster.
Bac Nguyen - One of the best experts on this subject based on the ideXlab platform.
-
Kernel-Based Distance Metric Learning for Supervised $k$-Means Clustering
IEEE Transactions on Neural Networks and Learning Systems, 2019
Co-Authors: Bac Nguyen, Bernard De Baets
-
Supervised distance metric learning through maximization of the Jeffrey divergence
Pattern Recognition, 2017
Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
-
Large-scale distance metric learning for k-nearest neighbors regression
Neurocomputing, 2016
Co-Authors: Bac Nguyen, Carlos Morell, Bernard De Baets
Xuan Zeng - One of the best experts on this subject based on the ideXlab platform.
-
Improved Tangent Space-Based Distance Metric for Lithographic Hotspot Classification
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017
Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng, Dian Zhou
Abstract: A distance metric of patterns is crucial to hotspot cluster analysis and classification. In this paper, we propose an improved tangent space (ITS)-based distance metric for hotspot cluster analysis and classification. The proposed distance metric is an important extension of the well-developed tangent space method in computer vision. It can handle patterns containing multiple polygons, while the traditional tangent space method can only deal with patterns with a single polygon. It inherits most of the advantages of the traditional tangent space method, e.g., it is easy to compute and is tolerant of small variations or shifts of the shapes. The ITS-based distance metric is a more reliable and accurate metric for hotspot cluster analysis and classification. We also propose a hierarchical density-based clustering method for hotspot clustering. It is more suitable for arbitrarily shaped clusters.
-
Improved tangent space based distance metric for accurate lithographic hotspot classification
DAC Design Automation Conference 2012, 2012
Co-Authors: Fan Yang, Charles Chiang, Subarna Sinha, Xuan Zeng
Abstract: A distance metric of patterns is crucial to hotspot cluster analysis and classification. In this paper, we propose an improved tangent space based metric for pattern matching based hotspot cluster analysis and classification. The proposed distance metric is an important extension of the well-developed tangent space method in computer vision. It can handle patterns containing multiple polygons, while the traditional tangent space method can only deal with patterns with a single polygon. It inherits most of the advantages of the traditional tangent space method, e.g., it is easy to compute and is tolerant of small variations or shifts of the shapes. Compared with the existing distance metric based on XOR of hotspot patterns, the improved tangent space based distance metric can achieve up to 37.5% accuracy improvement with at most 4.3× computational cost in the context of cluster analysis. The improved tangent space based distance metric is a more reliable and accurate metric for hotspot cluster analysis and classification. It is more suitable for industry applications.
Eric P Xing - One of the best experts on this subject based on the ideXlab platform.
-
Multi-modal distance metric learning
International Joint Conference on Artificial Intelligence, 2013
Co-Authors: Pengtao Xie, Eric P Xing
Abstract: Multi-modal data is dramatically increasing with the fast growth of social media. Learning a good distance measure for data with multiple modalities is of vital importance for many applications, including retrieval, clustering, classification and recommendation. In this paper, we propose an effective and scalable multi-modal distance metric learning framework. Based on the multi-wing harmonium model, our method provides a principled way to embed data of arbitrary modalities into a single latent space, in which an optimal distance metric can be learned under proper supervision, i.e., by minimizing the distance between similar pairs while maximizing the distance between dissimilar pairs. The parameters are learned by jointly optimizing the data likelihood under the latent space model and the loss induced by distance supervision, so our method seeks a balance between explaining the data and providing an effective distance metric, which naturally avoids overfitting. We apply our general framework to text/image data and present empirical results on retrieval and classification to demonstrate the effectiveness and scalability.
Fan Yang - One of the best experts on this subject based on the ideXlab platform.
-
Improved Tangent Space-Based Distance Metric for Lithographic Hotspot Classification
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017
Co-Authors: Fan Yang, Subarna Sinha, Charles C Chiang, Xuan Zeng, Dian Zhou
-
Distance Metric Learning-Based Conformal Predictor
2012
Co-Authors: Fan Yang, Zhigang Chen, Guifang Shao, Huazhen Wang
Abstract: In order to improve the computational efficiency of the conformal predictor, distance metric learning methods were used in the algorithm. The learning process was divided into two stages: offline learning and online learning. First, in the offline learning stage, part of the training data was used for distance metric learning to obtain a space transformation matrix. Second, standard CP-KNN was conducted on the remaining training data, with a nonconformity measure function defined by the K nearest neighbors classifier in the transformed space. Experimental results on three UCI datasets demonstrate the efficiency of the new algorithm.
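The second stage described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the paper's algorithm: the transformation matrix `A` is taken as already learned, the nonconformity score is the commonly used ratio of same-label to other-label nearest-neighbor distances (the abstract does not give the exact function), and the scores of the existing examples are not recomputed with the new point included, as a fully transductive conformal predictor would do. All names are hypothetical.

```python
import numpy as np

def knn_nonconformity(X, y, x, label, k, exclude=None):
    """Summed distance to the k nearest same-label points divided by the
    summed distance to the k nearest other-label points (larger = stranger)."""
    d = np.linalg.norm(X - x, axis=1)
    mask = np.ones(len(X), dtype=bool)
    if exclude is not None:
        mask[exclude] = False  # leave the point itself out of its own neighbors
    same = np.sort(d[mask & (y == label)])[:k].sum()
    other = np.sort(d[mask & (y != label)])[:k].sum()
    return same / other if other > 0 else np.inf

def conformal_p_values(X, y, x_new, A, k=1):
    """p-value of each candidate label for x_new, computed in the space
    obtained by applying the (assumed pre-learned) matrix A."""
    Xt = X @ A.T
    xt = A @ x_new
    # Leave-one-out nonconformity scores of the available examples.
    scores = np.array([
        knn_nonconformity(Xt, y, Xt[i], y[i], k, exclude=i)
        for i in range(len(Xt))
    ])
    p = {}
    for label in np.unique(y):
        a_new = knn_nonconformity(Xt, y, xt, label, k)
        # Fraction of scores at least as large as the new one (new point counted).
        p[label.item()] = (np.sum(scores >= a_new) + 1) / (len(scores) + 1)
    return p

# Toy data: two well-separated classes; A is the identity for simplicity.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
p = conformal_p_values(X, y, np.array([0.0, 0.5]), np.eye(2), k=1)
```

On this toy data the new point lies between the two class-0 examples, so label 0 receives the larger p-value.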
-
Improved tangent space based distance metric for accurate lithographic hotspot classification
DAC Design Automation Conference 2012, 2012
Co-Authors: Fan Yang, Charles Chiang, Subarna Sinha, Xuan Zeng