Agglomerative Clustering

The Experts below are selected from a list of 10617 Experts worldwide ranked by ideXlab platform

Roland Wittler - One of the best experts on this subject based on the ideXlab platform.

  • Correction: Unraveling overlapping deletions by Agglomerative Clustering
    BMC Genomics, 2013
    Co-Authors: Roland Wittler
    Abstract:

    Wittler R. Correction: Unraveling overlapping deletions by Agglomerative Clustering. BMC Genomics. 2013;14(Suppl 1): S16

  • Unraveling overlapping deletions by Agglomerative Clustering
    BMC Genomics, 2013
    Co-Authors: Roland Wittler
    Abstract:

    Wittler R. Unraveling overlapping deletions by Agglomerative Clustering. BMC Genomics. 2013;14(Suppl 1): S12.

    Background: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-Generation Sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping allow data from several samples to be analyzed simultaneously in order to, e.g., distinguish tumor-specific from patient-specific variations. However, it has been shown that, especially in this setting, there is a need to explicitly take overlapping deletions into consideration. Existing tools have only limited capabilities to call overlapping deletions and are unable to unravel complex signals to obtain consistent predictions.

    Results: We present a first approach specifically designed to cluster short-read paired-end data into possibly overlapping deletion predictions. The method does not make any assumptions on the composition of the data, such as the number of samples, heterogeneity, polyploidy, etc. Taking paired ends mapped to a reference genome as input, it iteratively merges mappings into clusters based on a similarity score that takes both the putative location and size of a deletion into account.

    Conclusion: We demonstrate that Agglomerative Clustering is suitable to predict deletions. Analyzing real data from three samples of a cancer patient, we found putatively overlapping deletions and observed that, as a side-effect, erroneous mappings are mostly identified as singleton clusters. An evaluation on simulated data shows, compared to other methods which can output overlapping clusters, high accuracy in separating overlapping from single deletions.
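
The iterative merging described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not give the similarity score, so the sketch assumes reciprocal overlap of two `(start, end)` intervals as a stand-in that depends on both location and size. The function names, the average-linkage rule, and the `threshold` parameter are hypothetical.

```python
from itertools import combinations


def similarity(a, b):
    """Reciprocal overlap of two putative deletions given as (start, end).

    Assumed stand-in score: it is high only when the intervals share
    most of their positions AND have similar lengths.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return overlap / max(a[1] - a[0], b[1] - b[0])


def cluster_similarity(c1, c2):
    """Average pairwise similarity between two clusters (average linkage)."""
    return sum(similarity(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def agglomerate(deletions, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    clusters = [[d] for d in deletions]  # start with one cluster per mapping
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Two short and two long candidates that overlap each other, plus one outlier.
deletions = [(100, 200), (105, 205), (150, 400), (155, 395), (1000, 1001)]
for cluster in agglomerate(deletions):
    print(cluster)
```

With these toy intervals the short and the long candidates end up in separate clusters even though they overlap on the genome, and the isolated mapping remains a singleton, which mirrors the behaviour the abstract reports for erroneous mappings.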

Shenmin Song - One of the best experts on this subject based on the ideXlab platform.

  • MOEA/D with the online Agglomerative Clustering based self-adaptive mating restriction strategy
    Neurocomputing, 2019
    Co-Authors: Hu Zhang, Shenmin Song
    Abstract:

    In MOEA/D-DE, the appropriate value of the mating restriction probability varies over the course of the evolutionary process. Furthermore, different subproblems are solved to different degrees during the evolution, so they have distinct requirements for exploitation and exploration. Additionally, MOEA/D-DE defines the neighborhood according to the distance between the weight vectors. However, the individuals corresponding to neighboring subproblems may lie far apart in the decision space, which degrades exploitation. Accordingly, this paper proposes MOEA/D with an online Agglomerative Clustering based self-adaptive mating restriction strategy (MOEA/D-OMR). MOEA/D-OMR uses the online Agglomerative Clustering algorithm to extract neighborhood information in the decision space. The mating pool is then constructed from either the neighbor population or the whole population, according to the mating restriction probability. Moreover, a separate mating restriction probability is assigned to each subproblem. The mating restriction probability is updated at each generation using the survival length, which is the number of generations that the solution has survived over a certain recent period. Experimental results show that MOEA/D-OMR performs better than the comparison algorithms.
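
The mating pool construction described in this abstract can be sketched as follows. This is only an illustration under assumptions: `mating_pool`, `labels`, and `delta` are hypothetical names, the cluster labels are taken as given rather than computed, and the survival-length update of the probabilities is left out because the abstract does not state the rule.

```python
import random


def mating_pool(index, labels, delta, rng):
    """Return indices of candidate mates for the solution at `index`.

    labels[k]: cluster label of solution k in decision space (assumed to
               come from an agglomerative clustering step, not shown here).
    delta[k]:  mating restriction probability of subproblem k.
    """
    everyone = list(range(len(labels)))
    if rng.random() < delta[index]:
        # Restricted mating: only solutions in the same decision-space cluster
        # (the solution itself is included).
        return [k for k in everyone if labels[k] == labels[index]]
    # Unrestricted mating: the whole population.
    return everyone


rng = random.Random(0)
labels = [0, 0, 1, 1, 1]            # hypothetical clustering of five solutions
delta = [0.9, 0.9, 0.5, 0.5, 0.1]   # one probability per subproblem
print(mating_pool(2, labels, delta, rng))
```

Keeping one probability per subproblem, as the abstract describes, lets some subproblems mate mostly within their cluster while others draw from the whole population.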

Hu Zhang - One of the best experts on this subject based on the ideXlab platform.

  • MOEA/D with the online Agglomerative Clustering based self-adaptive mating restriction strategy
    Neurocomputing, 2019
    Co-Authors: Hu Zhang, Shenmin Song
    Abstract:

    In MOEA/D-DE, the appropriate value of the mating restriction probability varies over the course of the evolutionary process. Furthermore, different subproblems are solved to different degrees during the evolution, so they have distinct requirements for exploitation and exploration. Additionally, MOEA/D-DE defines the neighborhood according to the distance between the weight vectors. However, the individuals corresponding to neighboring subproblems may lie far apart in the decision space, which degrades exploitation. Accordingly, this paper proposes MOEA/D with an online Agglomerative Clustering based self-adaptive mating restriction strategy (MOEA/D-OMR). MOEA/D-OMR uses the online Agglomerative Clustering algorithm to extract neighborhood information in the decision space. The mating pool is then constructed from either the neighbor population or the whole population, according to the mating restriction probability. Moreover, a separate mating restriction probability is assigned to each subproblem. The mating restriction probability is updated at each generation using the survival length, which is the number of generations that the solution has survived over a certain recent period. Experimental results show that MOEA/D-OMR performs better than the comparison algorithms.

Xiaodan Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Semantic smoothing of document models for Agglomerative Clustering
    International Joint Conference on Artificial Intelligence (IJCAI), 2007
    Co-Authors: Xiaohua Zhou, Xiaodan Zhang
    Abstract:

    In this paper, we argue that Agglomerative Clustering with the vector cosine similarity measure performs poorly for two reasons. First, the nearest neighbors of a document belong to different classes in many cases, since any pair of documents shares many "general" words. Second, the sparsity of class-specific "core" words leads to documents with the same class label being grouped into different clusters. Both problems can be resolved by suitably smoothing the document models and using the Kullback-Leibler divergence between two smoothed models as the pairwise document distance. Inspired by recent work in information retrieval, we propose a novel context-sensitive semantic smoothing method that automatically identifies multiword phrases in a document and then statistically maps phrases to individual document terms. We evaluate the new model-based similarity measure on three datasets using the complete linkage criterion for Agglomerative Clustering and find that it significantly improves the Clustering quality over the traditional vector cosine measure.
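
The combination of smoothed document models, Kullback-Leibler divergence, and complete linkage can be sketched in Python as below. Two assumptions are made loudly: the paper's context-sensitive semantic smoothing is replaced by plain interpolation with a corpus-wide background model, and the divergence is symmetrised by averaging both directions, which the abstract does not specify. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def smoothed_model(doc, background, vocab, lam=0.5):
    """Unigram model interpolated with a background model.

    Assumed stand-in for the paper's semantic smoothing; it guarantees
    that every vocabulary word has non-zero probability.
    """
    counts = Counter(doc)
    return {w: (1 - lam) * counts[w] / len(doc) + lam * background[w] for w in vocab}


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def distance(p, q):
    """Symmetrised KL divergence (the averaging is an assumption)."""
    return 0.5 * (kl(p, q) + kl(q, p))


def complete_linkage(models, k):
    """Agglomerate document indices until k clusters remain."""
    clusters = [[i] for i in range(len(models))]
    while len(clusters) > k:
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: max(distance(models[i], models[j])
                               for i in clusters[ab[0]] for j in clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


docs = [
    ["gene", "protein", "the", "of"],
    ["protein", "gene", "the", "cell"],
    ["market", "stock", "the", "of"],
    ["stock", "price", "the", "market"],
]
all_tokens = [w for d in docs for w in d]
vocab = sorted(set(all_tokens))
background = {w: c / len(all_tokens) for w, c in Counter(all_tokens).items()}
models = [smoothed_model(d, background, vocab) for d in docs]
print(complete_linkage(models, k=2))
```

In this toy corpus the "general" words `the` and `of` are shared across topics, yet the two biology documents and the two finance documents are grouped together because their smoothed models differ mainly on the topic-specific words.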

Joel B. Harley - One of the best experts on this subject based on the ideXlab platform.

  • Segmentation of Hidden Delaminations with Pitch–Catch Ultrasonic Testing and Agglomerative Clustering
    Journal of Nondestructive Evaluation, 2020
    Co-Authors: Alexander C. S. Douglass, Daniel Sparkman, Joel B. Harley
    Abstract:

    This paper studies the detection of hidden polymer matrix composite delaminations with a pitch–catch ultrasonic testing system and an Agglomerative Clustering algorithm. Existing ultrasonic testing methods characterize damage through normal-incidence pulse-echo measurements. Yet, these pulse-echo methods are ineffective at detecting delaminations underneath other delaminations. The ultrasonic waves only reflect off the top delamination. As a result, no information about the lower delamination is transmitted to the receiver. To address this problem, we investigate an oblique-angle pitch–catch ultrasonic testing method to transmit ultrasonic waves underneath the delaminations. The ultrasonic waves interact with the lower delaminations and carry that information to the receiver. We describe and discuss our experiments, which use two polytetrafluoroethylene (PTFE) inserts to simulate delaminations. We show that applying Agglomerative Clustering to the experimental data can successfully distinguish three regions: regions with two PTFE inserts, regions with only an upper PTFE insert, and regions with only the lower PTFE insert. Normal-incidence measurements observe only two regions.