Evolutionary Distance

The Experts below are selected from a list of 44277 Experts worldwide ranked by ideXlab platform

Kenichi Fukui - One of the best experts on this subject based on the ideXlab platform.

  • Reinforcement learning based metric filtering for Evolutionary Distance metric learning
    Intelligent Data Analysis, 2020
    Co-Authors: Bassel Ali, Wasin Kalintha, Koichi Moriyama, Masayuki Numao, Kenichi Fukui
    Abstract:

    Data collection plays an important role in business agility; data can prove valuable and provide insights for important features. However, conventional data collection methods can be costly and time-consuming. This paper proposes a hybrid system, R-EDML, that combines sequential feature selection performed by Reinforcement Learning (RL) with the Evolutionary feature prioritization of Evolutionary Distance Metric Learning (EDML) in a clustering process. The goal is to reduce the number of features while maintaining or increasing accuracy, which lowers time complexity and reduces the time and cost of future data collection. In this method, features represented by the diagonal elements of EDML matrices are prioritized using a differential evolution algorithm. Further, a selection control strategy using RL is learned by sequentially inserting and evaluating the prioritized elements. The outcome is the R-EDML matrix with the best accuracy and the fewest elements. Diagonal R-EDML, which focuses on the diagonal elements, is compared with EDML and conventional feature selection. Full Matrix R-EDML, which focuses on both the diagonal and non-diagonal elements, is tested and compared with Information-Theoretic Metric Learning. Moreover, the R-EDML policy is tested for each EDML generation and across all generations. Results show a significant decrease in the number of features while maintaining or increasing accuracy.

  • Reinforcement learning for Evolutionary Distance metric learning systems improvement
    Genetic and Evolutionary Computation Conference, 2018
    Co-Authors: Bassel Ali, Wasin Kalintha, Koichi Moriyama, Masayuki Numao, Kenichi Fukui
    Abstract:

    This paper introduces a hybrid system called R-EDML, which combines the sequential decision making of Reinforcement Learning (RL) with the Evolutionary feature prioritizing process of Evolutionary Distance Metric Learning (EDML) in clustering. The aim is to optimize the input space by reducing the number of selected features while maintaining the clustering performance. In the proposed method, features represented by the elements of EDML Distance transformation matrices are prioritized. Then a selection control strategy is learned using Reinforcement Learning. R-EDML was compared to normal EDML and conventional feature selection. Results show a decrease in the number of features while maintaining a similar accuracy level.

  • GECCO (Companion) - Reinforcement learning for Evolutionary Distance metric learning systems improvement
    Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2018
    Co-Authors: Bassel Ali, Wasin Kalintha, Koichi Moriyama, Masayuki Numao, Kenichi Fukui
    Abstract:

    This paper introduces a hybrid system called R-EDML, which combines the sequential decision making of Reinforcement Learning (RL) with the Evolutionary feature prioritizing process of Evolutionary Distance Metric Learning (EDML) in clustering. The aim is to optimize the input space by reducing the number of selected features while maintaining the clustering performance. In the proposed method, features represented by the elements of EDML Distance transformation matrices are prioritized. Then a selection control strategy is learned using Reinforcement Learning. R-EDML was compared to normal EDML and conventional feature selection. Results show a decrease in the number of features while maintaining a similar accuracy level.

  • Integrating Class Information and Features in Cluster Analysis Based on Evolutionary Distance Metric Learning
    Proceedings in Adaptation Learning and Optimization, 2016
    Co-Authors: Wasin Kalintha, Masayuki Numao, Satoshi Ono, Kenichi Fukui
    Abstract:

    Most current applications of clustering focus only on a technological domain, e.g., numerical similarity, while overlooking the human domain, and therefore yield results that are unnatural and incomprehensible from a human point of view. Unsupervised clustering constructs clusters based on the similarities of numerical features. This study narrows the gap between disciplines concerned with computational artifacts and with human understanding, in order to construct a more understandable cluster structure by considering available class information as well as data features in the clustering. Hence, we applied Evolutionary Distance Metric Learning (EDML) in cluster analysis in order to analyze class labels and features simultaneously. This method is applied to real-world facial image and food recipe data. The analysis provided promising insights into the relation between class information and features of the data, the overall cluster structure distribution, neighbor cluster relations, and the viewpoint of the cluster analysis. Finally, cluster analysis using the EDML method can obtain a more intelligible cluster structure with neighbor relations and discover interesting insights, and a particular cluster structure can be obtained according to the purpose of the analysis. Notably, these results cannot be achieved by unsupervised clustering.

  • Cluster analysis of face images and literature data by Evolutionary Distance metric learning
    International Conference on Innovative Techniques and Applications of Artificial Intelligence, 2015
    Co-Authors: Wasin Kalintha, Kenichi Fukui, Taishi Megano, Satoshi Ono, Masayuki Numao
    Abstract:

    Evolutionary Distance metric learning (EDML) is an efficient technique for solving clustering problems with some background knowledge. However, EDML has never been applied to real world applications. Thus, we demonstrate EDML for cluster analysis and visualization of two applications, i.e., a face recognition image dataset and a literature dataset. In the facial image clustering, we demonstrate improvement of the cluster validity index and also analyze the distributions of classes (ages) visualized by a self-organizing map and a K-means clustering with K-nearest neighbor centroids graph. For the literature dataset, we have analyzed the topics (i.e., a cluster of articles) that are the most likely to win the best paper award. Application of EDML to these datasets yielded qualitatively promising visualization results that demonstrate the practicability and effectiveness of EDML.
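
The diagonal-metric idea in the R-EDML abstracts above (features weighted by the diagonal elements of a metric matrix, then inserted one at a time in priority order and evaluated) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: the weights are taken as already learned, the differential evolution step is omitted, the learned RL policy is replaced by a plain scan over prefixes of the priority order, and the nearest-centroid score and all function names are hypothetical.

```python
import math


def diagonal_metric_distance(x, y, weights):
    """Distance under a diagonal metric: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def nearest_centroid_accuracy(points, labels, weights):
    """Hypothetical evaluation score: fraction of points whose nearest
    class centroid (under the weighted distance) matches their label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    centroids = {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }
    correct = 0
    for point, label in zip(points, labels):
        predicted = min(
            centroids,
            key=lambda c: diagonal_metric_distance(point, centroids[c], weights),
        )
        correct += predicted == label
    return correct / len(points)


def select_by_sequential_insertion(points, labels, weights):
    """Insert features from highest to lowest weight and keep the smallest
    prefix that reaches the best score (a stand-in for the RL policy)."""
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    active = [0.0] * len(weights)
    best_score, best_weights = -1.0, list(active)
    for index in order:
        active[index] = weights[index]
        score = nearest_centroid_accuracy(points, labels, active)
        if score > best_score:  # strict comparison prefers fewer features on ties
            best_score, best_weights = score, list(active)
    return best_weights, best_score
```

Setting a weight to zero removes that feature from the distance entirely, which is why a diagonal metric doubles as a feature selector in this sketch.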

Salvatore Gaglio - One of the best experts on this subject based on the ideXlab platform.

  • Normalised compression Distance and Evolutionary Distance of genomic sequences: comparison of clustering results
    International Journal of Knowledge Engineering and Soft Data Paradigms, 2009
    Co-Authors: Massimo La Rosa, Salvatore Gaglio, Riccardo Rizzo, Alfonso Urso
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a time-consuming procedure, and the resulting dissimilarity is not a metric. Recently, the normalised compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper, the clustering and the non-linear mapping obtained using the Evolutionary Distance and the compression Distance are compared, in order to understand whether the two sets of Distances are similar.

  • Comparison of genomic sequences clustering using normalized compression Distance and Evolutionary Distance
    International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, 2008
    Co-Authors: Massimo La Rosa, Riccardo Rizzo, Alfonso Urso, Salvatore Gaglio
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a lengthy procedure, and the resulting dissimilarity is not a metric. Recently the normalized compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper the clustering and the mapping obtained, using a SOM, with the traditional Evolutionary Distance and with the compression Distance are compared in order to understand whether the two sets of Distances are similar. The first results indicate that the two Distances capture different aspects of the genomic sequences and that further investigations are needed to obtain a definitive result.

  • KES (3) - Comparison of Genomic Sequences Clustering Using Normalized Compression Distance and Evolutionary Distance
    Lecture Notes in Computer Science, 2008
    Co-Authors: Massimo La Rosa, Riccardo Rizzo, Alfonso Urso, Salvatore Gaglio
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a lengthy procedure, and the resulting dissimilarity is not a metric. Recently the normalized compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper the clustering and the mapping obtained, using a SOM, with the traditional Evolutionary Distance and with the compression Distance are compared in order to understand whether the two sets of Distances are similar. The first results indicate that the two Distances capture different aspects of the genomic sequences and that further investigations are needed to obtain a definitive result.
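
The normalized compression Distance referred to in the abstracts above is commonly written as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed length of its argument. The sketch below is one way to compute it without any alignment. The choice of zlib as the compressor is an assumption made here for illustration (the abstracts do not name one), and the two demo sequences are synthetic.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic demo data: a highly repetitive sequence and a pseudo-random one.
repetitive = b"ACGT" * 200
rng = random.Random(0)
shuffled = bytes(rng.choice(b"ACGT") for _ in range(800))

print(ncd(repetitive, repetitive))  # small: the second copy adds little new information
print(ncd(repetitive, shuffled))    # larger: the two strings share little structure
```

With a real compressor the value is only approximately bounded by 0 and 1, and a general-purpose compressor such as zlib may miss biological regularities, so results depend on the compressor chosen.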

Fumio Tajima - One of the best experts on this subject based on the ideXlab platform.

  • Estimation of Evolutionary Distance for reconstructing molecular phylogenetic trees.
    Molecular biology and evolution, 1994
    Co-Authors: Fumio Tajima, Naoko Takezaki
    Abstract:

    The most commonly used measure of Evolutionary Distance in molecular phylogenetics is the number of nucleotide substitutions per site. However, this number is not necessarily the most efficient for reconstructing a phylogenetic tree. In order to evaluate the accuracy of Evolutionary Distance, D(t), for obtaining the correct tree topology, an accuracy index, A(t), was proposed. This index is defined as $A(t) = D'(t)/\sqrt{V[D(t)]}$, where D'(t) is the first derivative of D(t) with respect to Evolutionary time and V[D(t)] is the sampling variance of Evolutionary Distance. Using A(t), namely, finding the condition under which A(t) gives the maximum value, we can obtain an Evolutionary Distance which is efficient for obtaining the correct topology. Under the assumption that the transversional changes do not occur as frequently as the transitional changes, we obtained the Evolutionary Distances which are expected to give the correct topology more often than are the other Distances.

  • Unbiased estimation of Evolutionary Distance between nucleotide sequences.
    Molecular biology and evolution, 1993
    Co-Authors: Fumio Tajima
    Abstract:

    A new algorithm for estimating the number of nucleotide substitutions per site (i.e., the Evolutionary Distance) between two nucleotide sequences is presented. This algorithm can be applied to many estimation methods, such as Jukes and Cantor's method, Kimura's transition/transversion method, and Tajima and Nei's method. Unlike ordinary methods, this algorithm is always applicable. Numerical computations and computer simulations indicate that this algorithm gives an almost unbiased estimate of the Evolutionary Distance, unless the Evolutionary Distance is very large. This algorithm should be useful especially when we analyze short nucleotide sequences. It can also be applied to amino acid sequences, for estimating the number of amino acid replacements.
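
For context, the two classical estimators named in the abstract above can be sketched as follows: the Jukes and Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites, and Kimura's two-parameter distance d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. This is not the unbiased algorithm the paper proposes; it only illustrates the ordinary formulas and the cases where their logarithms are undefined. Function names are hypothetical, and returning None for inapplicable inputs is a choice made for this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (sites, transitions, transversions) over aligned sites
    where both sequences carry one of A, C, G, T."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


def jukes_cantor(p):
    """Jukes-Cantor distance, or None when the formula is inapplicable (p >= 3/4)."""
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P, Q):
    """Kimura two-parameter distance, or None when a log argument is not positive."""
    a, b = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        return None
    return -0.5 * math.log(a) - 0.25 * math.log(b)


n, ts, tv = count_differences("ACGTACGT", "GCGTACTT")
print(jukes_cantor((ts + tv) / n), kimura_2p(ts / n, tv / n))
```

Short sequences make the observed proportions noisy, so the inapplicable cases are most likely to arise there, which matches the abstract's remark that the algorithm should be especially useful for short sequences.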

Sudhir Kumar - One of the best experts on this subject based on the ideXlab platform.

  • MEGA5: Molecular Evolutionary genetics analysis using maximum likelihood, Evolutionary Distance, and maximum parsimony methods
    Molecular Biology and Evolution, 2011
    Co-Authors: Koichiro Tamura, Masatoshi Nei, Nicholas Peterson, G Stecher, Daniel Peterson, Sudhir Kumar
    Abstract:

    Comparative analysis of molecular sequence data is essential for reconstructing the Evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of Evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring Evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating Evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, to make it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

  • Evolutionary Distance estimation under heterogeneous substitution pattern among lineages.
    Molecular biology and evolution, 2002
    Co-Authors: Koichiro Tamura, Sudhir Kumar
    Abstract:

    Most of the sophisticated methods to estimate Evolutionary divergence between DNA sequences assume that the two sequences have evolved with the same pattern of nucleotide substitution after their divergence from their most recent common ancestor (homogeneity assumption). If this assumption is violated, the Evolutionary Distance estimated will be biased, which may result in biased estimates of divergence times and substitution rates, and may lead to erroneous branching patterns in the inferred phylogenies. Here we present a simple modification for existing Distance estimation methods to relax the assumption of the substitution pattern homogeneity among lineages when analyzing DNA and protein sequences. Results from computer simulations and empirical data analyses for human and mouse genes are presented to demonstrate that the proposed modification reduces the estimation bias considerably and that the modified method performs much better than the LogDet methods, which do not require the homogeneity assumption in estimating the number of substitutions per site. We also discuss the relationship of the substitution and mutation rate estimates when the substitution pattern is not the same in the lineages leading to the two sequences compared.

Massimo La Rosa - One of the best experts on this subject based on the ideXlab platform.

  • Normalised compression Distance and Evolutionary Distance of genomic sequences: comparison of clustering results
    International Journal of Knowledge Engineering and Soft Data Paradigms, 2009
    Co-Authors: Massimo La Rosa, Salvatore Gaglio, Riccardo Rizzo, Alfonso Urso
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a time-consuming procedure, and the resulting dissimilarity is not a metric. Recently, the normalised compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper, the clustering and the non-linear mapping obtained using the Evolutionary Distance and the compression Distance are compared, in order to understand whether the two sets of Distances are similar.

  • Comparison of genomic sequences clustering using normalized compression Distance and Evolutionary Distance
    International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, 2008
    Co-Authors: Massimo La Rosa, Riccardo Rizzo, Alfonso Urso, Salvatore Gaglio
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a lengthy procedure, and the resulting dissimilarity is not a metric. Recently the normalized compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper the clustering and the mapping obtained, using a SOM, with the traditional Evolutionary Distance and with the compression Distance are compared in order to understand whether the two sets of Distances are similar. The first results indicate that the two Distances capture different aspects of the genomic sequences and that further investigations are needed to obtain a definitive result.

  • KES (3) - Comparison of Genomic Sequences Clustering Using Normalized Compression Distance and Evolutionary Distance
    Lecture Notes in Computer Science, 2008
    Co-Authors: Massimo La Rosa, Riccardo Rizzo, Alfonso Urso, Salvatore Gaglio
    Abstract:

    Genomic sequences are usually compared using Evolutionary Distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a lengthy procedure, and the resulting dissimilarity is not a metric. Recently the normalized compression Distance was introduced as a method to calculate the Distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper the clustering and the mapping obtained, using a SOM, with the traditional Evolutionary Distance and with the compression Distance are compared in order to understand whether the two sets of Distances are similar. The first results indicate that the two Distances capture different aspects of the genomic sequences and that further investigations are needed to obtain a definitive result.