Spectral clustering


The experts below are selected from a list of 26,778 experts worldwide, ranked by the ideXlab platform.

Maurizio Filippone - One of the best experts on this subject based on the ideXlab platform.

  • IJCNN - Mini-batch spectral clustering
    2017 International Joint Conference on Neural Networks (IJCNN), 2017
    Co-Authors: Yufei Han, Maurizio Filippone
    Abstract:

    The cost of computing the spectrum of Laplacian matrices hinders the application of spectral clustering to large data sets. While approximations recover computational tractability, they can potentially affect clustering performance. This paper proposes a practical approach to learn spectral clustering, where the spectrum of the Laplacian is recovered by solving a constrained optimization problem using adaptive mini-batch-based stochastic gradient optimization on Stiefel manifolds. Crucially, the proposed approach is formulated so that the memory footprint of the algorithm is low, the cost of each iteration is linear in the number of samples, and convergence to critical points of the objective function is guaranteed. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.
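
To make the optimization concrete, here is a minimal sketch of the kind of mini-batch Riemannian procedure the abstract describes, assuming a dense symmetric affinity matrix A whose dominant eigenvectors are sought. The function name, step size, batch estimator, and QR retraction are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def minibatch_spectral_embedding(A, k, batch_size=128, lr=1e-2, n_iters=500, seed=0):
    """Approximate the top-k eigenvectors of a symmetric affinity matrix A by
    stochastic Riemannian gradient ascent on trace(U^T A U) s.t. U^T U = I."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random point on the Stiefel manifold
    for _ in range(n_iters):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Mini-batch estimate of the Euclidean gradient 2*A @ U: only the
        # sampled rows contribute, rescaled so the estimate stays unbiased.
        G = np.zeros_like(U)
        G[idx] = (n / batch_size) * 2.0 * A[idx] @ U
        # Project the gradient onto the tangent space of the Stiefel manifold at U.
        sym = (U.T @ G + G.T @ U) / 2.0
        G_riem = G - U @ sym
        # Ascent step followed by a QR retraction back onto the manifold.
        Q, R = np.linalg.qr(U + lr * G_riem)
        U = Q * np.sign(np.diag(R))  # fix QR sign ambiguity
    return U
```

Each iteration touches only batch_size rows of A, which is where a per-iteration cost linear in n comes from; the rows of the returned U can then be clustered with k-means as in standard spectral clustering.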

  • Mini-Batch spectral clustering
    arXiv: Machine Learning, 2016
    Co-Authors: Yufei Han, Maurizio Filippone
    Abstract:

    The cost of computing the spectrum of Laplacian matrices hinders the application of spectral clustering to large data sets. While approximations recover computational tractability, they can potentially affect clustering performance. This paper proposes a practical approach to learn spectral clustering based on adaptive stochastic gradient optimization. Crucially, the proposed approach recovers the exact spectrum of Laplacian matrices in the limit of the iterations, and the cost of each iteration is linear in the number of samples. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.

Guocan Feng - One of the best experts on this subject based on the ideXlab platform.

  • Spectral clustering: A semi-supervised approach
    Neurocomputing, 2012
    Co-Authors: Weifu Chen, Guocan Feng
    Abstract:

    Graph-based spectral clustering algorithms have developed rapidly in recent years. They are typically posed as discrete combinatorial optimization problems and solved approximately by relaxing them into tractable eigenvalue decomposition problems. In this paper, we first review existing spectral clustering algorithms within a unified framework and give a straightforward explanation of spectral clustering. We also present a novel model that generalizes unsupervised spectral clustering to semi-supervised spectral clustering. Under this model, prior information given by instance-level constraints can be generalized to space-level constraints. We find that an (undirected) graph built on the enlarged prior information is more meaningful, and hence the cluster boundaries are more accurate. Experimental results on toy data, real-world data and image segmentation demonstrate the advantages of the proposed model.
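
As a point of reference, the sketch below shows the simplest way instance-level constraints enter a spectral pipeline: must-link and cannot-link pairs directly edit the affinity matrix before the usual normalized-Laplacian embedding. This is a standard constrained-affinity baseline, not the paper's space-level propagation model; the function name and the Gaussian bandwidth sigma are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def constrained_spectral(X, k, must_link=(), cannot_link=(), sigma=1.0, seed=0):
    """Spectral clustering with pairwise constraints injected into the graph."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))                  # Gaussian affinity
    for i, j in must_link:                            # force high similarity
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:                          # force zero similarity
        W[i, j] = W[j, i] = 0.0
    d = np.maximum(W.sum(1), 1e-12)
    L_sym = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))  # normalized Laplacian
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]                                   # k smallest eigenvalues
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12  # row-normalize
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(U)
```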

  • Spectral clustering with discriminant cuts
    Knowledge-Based Systems, 2012
    Co-Authors: Weifu Chen, Guocan Feng
    Abstract:

    Recently, many k-way spectral clustering algorithms have been proposed that satisfy one or both of the following requirements: between-cluster similarities are minimized and within-cluster similarities are maximized. In this paper, a novel graph-based spectral clustering algorithm called discriminant cut (Dcut) is proposed, which first builds the affinity matrix of a weighted graph and normalizes it with the corresponding regularized Laplacian matrix, and then partitions the vertices into k parts. Dcut has several advantages. First, it is derived from graph partitioning and has a straightforward geometric interpretation. Second, it addresses both requirements simultaneously. Finally, it is computationally feasible, because the NP-hard graph-cut problem can be relaxed into a mild eigenvalue decomposition problem. Experimental results on toy and real data show that Dcut compares favorably with other spectral clustering methods.
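
The pipeline the abstract describes (affinity matrix, normalization by a regularized Laplacian, k-way partition) can be sketched as follows. The regularizer sigma and the normalization (D + sigma*I)^{-1/2} W (D + sigma*I)^{-1/2} are assumptions made for illustration, not necessarily the paper's precise definition of Dcut.

```python
import numpy as np
from sklearn.cluster import KMeans

def dcut_like(W, k, sigma=1.0, seed=0):
    """k-way partition from a regularized normalization of the affinity W."""
    d = W.sum(axis=1)
    # Regularized degree normalization; sigma = 0 recovers the usual
    # normalized-cut matrix D^{-1/2} W D^{-1/2}.
    inv_sqrt = 1.0 / np.sqrt(d + sigma)
    M = W * np.outer(inv_sqrt, inv_sqrt)
    _, vecs = np.linalg.eigh(M)
    U = vecs[:, -k:]                    # eigenvectors of the k largest eigenvalues
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(U)
```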

Michael I. Jordan - One of the best experts on this subject based on the ideXlab platform.

  • AISTATS - Dimensionality Reduction for spectral clustering
    2011
    Co-Authors: Donglin Niu, Michael I. Jordan
    Abstract:

    Spectral clustering is a flexible clustering methodology that is applicable to a variety of data types and has the particular virtue that it makes few assumptions on cluster shapes. It has become popular in a variety of application areas, particularly in computational vision and bioinformatics. The approach appears, however, to be particularly sensitive to irrelevant and noisy dimensions in the data. We thus introduce an approach that automatically learns the relevant dimensions and spectral clustering simultaneously. We pursue an augmented form of spectral clustering in which an explicit projection operator is incorporated in the relaxed optimization functional. We optimize this functional over both the projection and the spectral embedding. Experiments on simulated and real data show that this approach yields significant improvements in the performance of spectral clustering.
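
A toy rendition of the joint idea follows: learn an orthonormal projection V together with the spectral embedding by improving the relaxed objective trace(U^T M(V) U), which for fixed V equals the sum of the top-k eigenvalues of the normalized affinity of the projected data. The paper optimizes this functional with proper gradient-based updates; the random search over the Stiefel manifold below is only a stand-in to keep the sketch short, and the function names and sigma are assumptions.

```python
import numpy as np

def embedding_objective(X, V, k, sigma=1.0):
    """Relaxed spectral objective of the data projected by V (orthonormal columns)."""
    Z = X @ V
    sq = ((Z[:, None] - Z[None, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    d = np.maximum(W.sum(1), 1e-12)
    M = W / np.sqrt(np.outer(d, d))          # normalized affinity
    return np.linalg.eigvalsh(M)[-k:].sum()  # value of the spectral relaxation

def learn_projection(X, k, q, n_trials=200, step=0.1, seed=0):
    """Greedy search for a q-dimensional projection that improves the embedding."""
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((X.shape[1], q)))
    best = embedding_objective(X, V, k)
    for _ in range(n_trials):
        cand, _ = np.linalg.qr(V + step * rng.standard_normal(V.shape))
        val = embedding_objective(X, cand, k)
        if val > best:
            V, best = cand, val
    return V
```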

  • Fast approximate spectral clustering
    Knowledge Discovery and Data Mining, 2009
    Co-Authors: Donghui Yan, Ling Huang, Michael I. Jordan
    Abstract:

    Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-scale problems due to its computational complexity of O(n³) in general, with n the number of data points. We extend the range of spectral clustering by developing a general framework for fast approximate spectral clustering in which a distortion-minimizing local transformation is first applied to the data. This framework is based on a theoretical analysis that provides a statistical characterization of the effect of local distortion on the mis-clustering rate. We develop two concrete instances of our general framework, one based on local k-means clustering (KASP) and one based on random projection trees (RASP). Extensive experiments show that these algorithms can achieve significant speedups with little degradation in clustering accuracy. Specifically, our algorithms outperform k-means by a large margin in terms of accuracy, and run several times faster than approximate spectral clustering based on the Nyström method, with comparable accuracy and a significantly smaller memory footprint. Remarkably, our algorithms make it possible for a single machine to spectrally cluster data sets with a million observations within several minutes.
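
The KASP instance is simple enough to sketch directly: compress the n points to m representative centroids with k-means (the distortion-minimizing local transformation), run exact spectral clustering on the centroids, and let every original point inherit its centroid's label. The sklearn-based pipeline below is a minimal rendition under default kernel settings, not the authors' implementation; m is a free parameter trading speed against fidelity.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

def kasp(X, k, m=500, seed=0):
    """KASP-style fast approximate spectral clustering."""
    # Step 1: distortion-minimizing preprocessing: m local k-means centroids.
    km = KMeans(n_clusters=m, n_init=3, random_state=seed).fit(X)
    # Step 2: exact spectral clustering on the small centroid set (O(m^3)).
    sc = SpectralClustering(n_clusters=k, affinity="rbf", random_state=seed)
    centroid_labels = sc.fit_predict(km.cluster_centers_)
    # Step 3: each original point inherits its centroid's cluster label.
    return centroid_labels[km.labels_]
```

RASP follows the same template with step 1 replaced by a random projection tree.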

  • KDD - Fast approximate spectral clustering
    Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09, 2009
    Co-Authors: Donghui Yan, Ling Huang, Michael I. Jordan
    Abstract:

    Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-scale problems due to its computational complexity of O(n³) in general, with n the number of data points. We extend the range of spectral clustering by developing a general framework for fast approximate spectral clustering in which a distortion-minimizing local transformation is first applied to the data. This framework is based on a theoretical analysis that provides a statistical characterization of the effect of local distortion on the mis-clustering rate. We develop two concrete instances of our general framework, one based on local k-means clustering (KASP) and one based on random projection trees (RASP). Extensive experiments show that these algorithms can achieve significant speedups with little degradation in clustering accuracy. Specifically, our algorithms outperform k-means by a large margin in terms of accuracy, and run several times faster than approximate spectral clustering based on the Nyström method, with comparable accuracy and a significantly smaller memory footprint. Remarkably, our algorithms make it possible for a single machine to spectrally cluster data sets with a million observations within several minutes.

  • Spectral clustering with perturbed data
    Neural Information Processing Systems, 2008
    Co-Authors: Ling Huang, Donghui Yan, Nina Taft, Michael I. Jordan
    Abstract:

    Spectral clustering is useful for a wide-ranging set of applications in areas such as biological data analysis, image processing and data mining. However, the computational and/or communication resources required by the method in processing large-scale data are often prohibitively high, and practitioners are often required to perturb the original data in various ways (quantization, downsampling, etc.) before invoking a spectral algorithm. In this paper, we use stochastic perturbation theory to study the effects of data perturbation on the performance of spectral clustering. We show that the error under perturbation of spectral clustering is closely related to the perturbation of the eigenvectors of the Laplacian matrix. From this result we derive approximate upper bounds on the clustering error. We show that this bound is empirically tight across a wide range of problems, suggesting that it can be used in practical settings to determine the amount of data reduction allowed in order to meet a specification of permitted loss in clustering performance.
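
The quantity driving the paper's bound, namely how far the bottom-k Laplacian eigenvectors move under a data perturbation, is easy to measure empirically. The sketch below quantizes the data onto a coarse grid and reports the sine of the largest principal angle between the two k-dimensional eigenspaces; the bound itself is not coded here, and the quantizer and bandwidth are illustrative choices.

```python
import numpy as np

def laplacian_eigvecs(X, k, sigma=1.0):
    """Bottom-k eigenvectors of the normalized Laplacian of a Gaussian graph."""
    sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    d = np.maximum(W.sum(1), 1e-12)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))
    return np.linalg.eigh(L)[1][:, :k]

def eigenspace_shift(X, k, n_levels=8):
    """sin of the largest principal angle between original and quantized embeddings."""
    lo, hi = X.min(0), X.max(0)
    Xq = lo + np.round((X - lo) / (hi - lo + 1e-12) * n_levels) / n_levels * (hi - lo)
    U, Uq = laplacian_eigvecs(X, k), laplacian_eigvecs(Xq, k)
    s = np.linalg.svd(U.T @ Uq, compute_uv=False)   # cosines of principal angles
    return np.sqrt(1.0 - np.clip(s.min(), 0.0, 1.0) ** 2)
```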

  • Multiway Spectral Clustering: A Margin-Based Perspective
    Statistical Science, 2008
    Co-Authors: Zhihua Zhang, Michael I. Jordan
    Abstract:

    Spectral clustering is a broad class of clustering procedures in which an intractable combinatorial optimization formulation of clustering is “relaxed” into a tractable eigenvector problem, and in which the relaxed solution is subsequently “rounded” into an approximate discrete solution to the original problem. In this paper we present a novel margin-based perspective on multiway spectral clustering. We show that the margin-based perspective illuminates both the relaxation and rounding aspects of spectral clustering, providing a unified analysis of existing algorithms and guiding the design of new algorithms. We also present connections between spectral clustering and several other topics in statistics, specifically minimum-variance clustering, Procrustes analysis and Gaussian intrinsic autoregression.
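
The "rounding" half of the relax-then-round scheme can be sketched with the standard alternating discretization that the abstract's Procrustes connection suggests: rotate the relaxed eigenvector solution U, assign each row to its largest coordinate, then re-fit the rotation by orthogonal Procrustes. This is a common rounding procedure, not necessarily the paper's margin-based rule.

```python
import numpy as np

def round_embedding(U, n_rounds=30, seed=0):
    """Round a relaxed spectral solution U (n x k) to a discrete partition."""
    rng = np.random.default_rng(seed)
    n, k = U.shape
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    R = np.linalg.qr(rng.standard_normal((k, k)))[0]   # initial rotation
    labels = np.zeros(n, dtype=int)
    for _ in range(n_rounds):
        labels = np.argmax(U @ R, axis=1)              # discrete assignment
        Y = np.zeros((n, k))
        Y[np.arange(n), labels] = 1.0                  # indicator matrix
        # Orthogonal Procrustes: rotation best aligning U with Y.
        A, _, Bt = np.linalg.svd(U.T @ Y)
        R = A @ Bt
    return labels
```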

Yufei Han - One of the best experts on this subject based on the ideXlab platform.

  • IJCNN - Mini-batch spectral clustering
    2017 International Joint Conference on Neural Networks (IJCNN), 2017
    Co-Authors: Yufei Han, Maurizio Filippone
    Abstract:

    The cost of computing the spectrum of Laplacian matrices hinders the application of spectral clustering to large data sets. While approximations recover computational tractability, they can potentially affect clustering performance. This paper proposes a practical approach to learn spectral clustering, where the spectrum of the Laplacian is recovered by solving a constrained optimization problem using adaptive mini-batch-based stochastic gradient optimization on Stiefel manifolds. Crucially, the proposed approach is formulated so that the memory footprint of the algorithm is low, the cost of each iteration is linear in the number of samples, and convergence to critical points of the objective function is guaranteed. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.

  • Mini-Batch spectral clustering
    arXiv: Machine Learning, 2016
    Co-Authors: Yufei Han, Maurizio Filippone
    Abstract:

    The cost of computing the spectrum of Laplacian matrices hinders the application of spectral clustering to large data sets. While approximations recover computational tractability, they can potentially affect clustering performance. This paper proposes a practical approach to learn spectral clustering based on adaptive stochastic gradient optimization. Crucially, the proposed approach recovers the exact spectrum of Laplacian matrices in the limit of the iterations, and the cost of each iteration is linear in the number of samples. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.

Francis R. Bach - One of the best experts on this subject based on the ideXlab platform.

  • Learning spectral clustering
    Advances in Neural Information Processing Systems 16 (NIPS), 2004
    Co-Authors: Francis R. Bach, Michael I. Jordan
    Abstract:

    Spectral clustering refers to a class of techniques that rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive a new cost function for spectral clustering based on a measure of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing this cost function with respect to the partition leads to a new spectral clustering algorithm. Minimizing with respect to the similarity matrix leads to an algorithm for learning the similarity matrix. We develop a tractable approximation of our cost function that is based on the power method of computing eigenvectors.
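
The power-method ingredient mentioned at the end is the classical subspace (orthogonal) iteration; a minimal sketch is below, assuming a symmetric affinity matrix W. The differentiable cost the paper builds on top of this approximation is not reproduced here, and the iteration count is a free parameter.

```python
import numpy as np

def power_method_embedding(W, k, n_iters=100, seed=0):
    """Approximate the dominant k eigenvectors of D^{-1/2} W D^{-1/2}."""
    rng = np.random.default_rng(seed)
    d = np.maximum(W.sum(1), 1e-12)
    M = W / np.sqrt(np.outer(d, d))       # normalized affinity
    U, _ = np.linalg.qr(rng.standard_normal((W.shape[0], k)))
    for _ in range(n_iters):
        U, _ = np.linalg.qr(M @ U)        # one step of orthogonal iteration
    return U
```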