Ith Cluster - Explore the Science & Experts

The Experts below are selected from a list of 6 Experts worldwide ranked by ideXlab platform

Francisco Azuaje - One of the best experts on this subject based on the ideXlab platform.

A cluster validity framework for genome expression data

Bioinformatics, 2002

Co-Authors: Francisco Azuaje

Abstract:

Summary: This paper presents a method for assessing the validity of expression clusters.

Availability: Executable programs are available on request from the author.

Contact: Francisco.Azuaje@cs.tcd.ie

Supplementary information: http://www.cs.tcd.ie/ Francisco.Azuaje/Cval.html

Clustering is a useful approach to analyzing genome expression data. It aims to partition samples or genes into groups characterized by similar expression patterns. A number of clustering algorithms have been proposed (such as hierarchical clustering and neural networks), but fewer solutions have been presented for systematically evaluating the quality of the resulting clusters. Once a clustering process has been performed, researchers may face some of the following questions: Is this a relevant partition? Should we analyze these clusters? Is there a better partition? The framework presented here aims to help researchers address these questions.

It has been shown that determining the 'right' number of clusters in experimental data is a complex and time-consuming process. An effective strategy may be to first obtain a good estimate of the correct number of clusters. Our system predicts the optimal number of expression clusters, which may represent the best results to consider for interpretation purposes. The system implements Dunn's validity index, which has been suggested as an effective estimator for different types of clustering applications (Bezdek and Pal, 1998). This index is based on the idea of identifying sets of clusters that are compact and well separated. For any partition $U \leftrightarrow X: X_1 \cup \cdots \cup X_i \cup \cdots \cup X_c$, where $X_i$ represents the $i$th cluster of such a partition, Dunn's validation index, $V$, is defined as:
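As a rough illustration of the "compact and well separated" idea, the sketch below computes the commonly cited form of Dunn's index: the smallest distance between any two clusters divided by the largest cluster diameter. It is not taken from the author's programs. The Euclidean metric, the nearest-pair definition of inter-cluster distance, the function names, and the toy 2-D points are all assumptions made for this example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def diameter(cluster):
    """Largest pairwise distance inside one cluster (0.0 for a singleton)."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)


def set_distance(cluster_a, cluster_b):
    """Smallest distance between a point of cluster_a and a point of cluster_b."""
    return min(dist(a, b) for a in cluster_a for b in cluster_b)


def dunn_index(clusters):
    """Assumed form: min inter-cluster distance / max cluster diameter."""
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    max_diameter = max(diameter(c) for c in clusters)
    if max_diameter == 0.0:
        raise ValueError("every cluster is a singleton; the ratio is undefined")
    min_separation = min(set_distance(a, b) for a, b in combinations(clusters, 2))
    return min_separation / max_diameter


# Hypothetical toy data: the same four points grouped in two different ways.
well_separated = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
poorly_separated = [[(0.0, 0.0), (0.0, 1.0), (5.0, 0.0)], [(5.0, 1.0)]]

scores = {
    "well_separated": dunn_index(well_separated),
    "poorly_separated": dunn_index(poorly_separated),
}
```

Under these assumptions, a larger value indicates tighter and better separated clusters, so comparing the score across candidate partitions (for example, across different numbers of clusters) gives one way to pick among them.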


Wenzhuo Yang - One of the best experts on this subject based on the ideXlab platform.

A Divide and Conquer Framework for Distributed Graph Clustering

2016

Co-Authors: Wenzhuo Yang

Abstract:

Recall that the graph we analyzed contains $n$ nodes and $r$ clusters, and is generated according to the generalized stochastic blockmodel. We let $K_i$ be the size of the $i$th cluster, $K$ be the minimum cluster size, i.e., $K = \min_i K_i$, and K be the size of the smallest cluster that contains at least one ordinary node. Therefore, edge $(i, j)$ is present in the graph with probability $p_{ij} \ge p$ for every pair of nodes $i, j$ that belong to the same cluster, and with probability $q_{ij} \le q$ for every pair of nodes $i, j$ that are in different clusters. Note that the outliers in the graph do not belong to any cluster.

Let $UU^\top$ be the singular value decomposition of $Y$, let $P_T(M) = UU^\top M + MUU^\top - UU^\top M UU^\top$ be the projection of $M$ onto the row and column spaces of $Y$, and let $P_{T^\perp}(M) = M - P_T(M)$. Let $R$ be the support of $Y$, i.e., $R = \{(i, j) : Y_{ij} = 1\}$, $C$ be the set of the edges connecting to the high confidence nodes, i.e., $C = \{(i, j) : i \text{ or } j \text{ is a high confidence node}\}$, and $\mathcal{A}$ be the support of $A$, i.e., $\mathcal{A} = \{(i, j) : A_{ij} = 1\}$. For a set of matrix indices $\Omega$, we let $P_\Omega(M)$ be the matrix whose $(i, j)$th entry equals $M_{ij}$ if $(i, j) \in \Omega$ or 0 otherwise. We let $E$ be the matrix whose entries are all ones.

2. Proof of Theorem 1

For clarity, we let c0√ maxfn s;Kg log n; cA = 1
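To make the notation concrete, here is a minimal sketch that samples a graph from a simple special case of the blockmodel described above and builds the support set and the index-set projection. It is an illustration only, not the authors' code. It assumes every within-cluster pair uses exactly the probability p and every other pair exactly q, and it treats pairs involving an outlier like cross-cluster pairs. The function names `sample_blockmodel`, `support`, and `project_onto` are hypothetical.

```python
import random


def sample_blockmodel(cluster_sizes, n_outliers, p, q, seed=None):
    """Sample a symmetric 0/1 adjacency matrix; outliers get the label None."""
    rng = random.Random(seed)
    labels = []
    for idx, size in enumerate(cluster_sizes):
        labels.extend([idx] * size)
    labels.extend([None] * n_outliers)  # outliers belong to no cluster
    n = len(labels)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same_cluster = labels[i] is not None and labels[i] == labels[j]
            # Assumption: outlier pairs use q, like cross-cluster pairs.
            if rng.random() < (p if same_cluster else q):
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency, labels


def support(matrix):
    """Index set {(i, j) : matrix[i][j] == 1}."""
    return {(i, j) for i, row in enumerate(matrix)
            for j, value in enumerate(row) if value == 1}


def project_onto(matrix, omega):
    """Keep entries whose index is in omega and zero out the rest."""
    return [[value if (i, j) in omega else 0 for j, value in enumerate(row)]
            for i, row in enumerate(matrix)]


cluster_sizes = [3, 2]            # K_1 = 3, K_2 = 2 (toy values)
min_cluster_size = min(cluster_sizes)  # K = min_i K_i
adjacency, labels = sample_blockmodel(cluster_sizes, n_outliers=1,
                                      p=1.0, q=0.0, seed=0)
all_ones = [[1] * len(labels) for _ in labels]  # the matrix E
masked = project_onto(all_ones, support(adjacency))
```

With p = 1 and q = 0 the sample is deterministic, so projecting the all-ones matrix onto the support of the adjacency matrix reproduces the adjacency matrix itself, which is a quick sanity check on both helpers.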

