Similarity Search

The Experts below are selected from a list of 61104 Experts worldwide ranked by ideXlab platform

Luciano Garciabanuelos - One of the best experts on this subject based on the ideXlab platform.

  • graph matching algorithms for business process model Similarity Search
    Business Process Management, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.
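
    The ranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each process model is reduced to a list of node labels, ignores edges entirely, and uses a hypothetical `cutoff` of 0.5. The names `label_sim`, `greedy_match_score` and `average_precision` are illustrative only.

```python
from difflib import SequenceMatcher


def label_sim(a: str, b: str) -> float:
    """String similarity in [0, 1] between two node labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_match_score(query: list[str], model: list[str], cutoff: float = 0.5) -> float:
    """Greedily pair the most similar unmatched labels and return a score in [0, 1]."""
    # All candidate pairs, best first.
    pairs = sorted(
        ((label_sim(q, m), i, j) for i, q in enumerate(query) for j, m in enumerate(model)),
        reverse=True,
    )
    used_q, used_m, total = set(), set(), 0.0
    for sim, i, j in pairs:
        if sim < cutoff:
            break  # every remaining pair is even less similar
        if i not in used_q and j not in used_m:
            used_q.add(i)
            used_m.add(j)
            total += sim
    # Unmatched nodes on either side lower the score.
    return total / max(len(query), len(model), 1)


def average_precision(ranking: list[str], relevant: set[str]) -> float:
    """Average precision of one ranked result list; the mean over queries gives MAP."""
    hits, score = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0


# Hypothetical toy repository: model name -> node labels.
repository = {
    "m1": ["receive order", "check stock", "ship goods"],
    "m2": ["receive invoice", "pay invoice"],
}
query = ["receive order", "ship goods"]
ranking = sorted(
    repository,
    key=lambda name: greedy_match_score(query, repository[name]),
    reverse=True,
)
```

    A greedy pass like this never revisits a pairing, which is what makes it fast; a more exhaustive matcher would also explore alternative pairings and take graph structure into account.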

  • Similarity Search of business process models
    IEEE Data(base) Engineering Bulletin, 2009
    Co-Authors: Marlon Dumas, Luciano Garciabanuelos, Remco M Dijkman
    Abstract:

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the objects in question are business process models. The goal is to identify process models in a repository that most closely resemble a given process model or a fragment thereof.

  • graph matching algorithms for business process model Similarity Search
    Lecture Notes in Computer Science, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.

Remco M Dijkman - One of the best experts on this subject based on the ideXlab platform.

  • graph matching algorithms for business process model Similarity Search
    Business Process Management, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.

  • Similarity Search of business process models
    IEEE Data(base) Engineering Bulletin, 2009
    Co-Authors: Marlon Dumas, Luciano Garciabanuelos, Remco M Dijkman
    Abstract:

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the objects in question are business process models. The goal is to identify process models in a repository that most closely resemble a given process model or a fragment thereof.

  • graph matching algorithms for business process model Similarity Search
    Lecture Notes in Computer Science, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.

Chunxiao Xing - One of the best experts on this subject based on the ideXlab platform.

  • a transformation based framework for knn set Similarity Search extended abstract
    International Conference on Data Engineering, 2020
    Co-Authors: Yong Zhang, Jin Wang, Chunxiao Xing
    Abstract:

    Set similarity search is a fundamental operation in a variety of applications [3], [5], [2]. There is a long stream of research on the problem of set similarity search. Given a collection of set records, a query and a similarity function, the algorithm returns all the set records that are similar to the query. There are many metrics to measure the similarity between two sets, such as Overlap, Jaccard, Cosine and Dice. In this paper we use the widely applied Jaccard to quantify the similarity between two sets, but our proposed techniques can be easily extended to other set-based similarity functions. Previous approaches require users to specify a similarity threshold. However, in many scenarios it is rather difficult to specify such a threshold. For example, when users type keywords into a search engine, they pay more attention to the top-ranked results, say the top five. In this case, if we use threshold-based search instead of KNN similarity search, it is difficult to find the results that are more attractive to users.
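
    The contrast between threshold-based and KNN search can be sketched with a brute-force scan. This is only an illustration of the two query types under Jaccard similarity, not the method of the paper; the function names and the toy records are assumptions.

```python
import heapq


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def threshold_search(records: list[set], query: set, tau: float) -> list[int]:
    """Indices of all records whose similarity to the query is at least tau."""
    return [i for i, r in enumerate(records) if jaccard(r, query) >= tau]


def knn_search(records: list[set], query: set, k: int) -> list[int]:
    """Indices of the k most similar records, best first; no threshold needed."""
    return heapq.nlargest(k, range(len(records)), key=lambda i: jaccard(records[i], query))


# Hypothetical toy data.
records = [{"a", "b", "c"}, {"a", "b"}, {"x", "y"}, {"a"}]
query = {"a", "b"}

print(threshold_search(records, query, 0.9))  # → [1]
print(threshold_search(records, query, 0.6))  # → [0, 1]
print(knn_search(records, query, 2))          # → [1, 0]
```

    The size of the threshold result depends heavily on the chosen `tau`, whereas the KNN query always returns exactly `k` results (when at least `k` records exist), which is the motivation given in the abstract.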

  • a transformation based framework for knn set Similarity Search
    IEEE Transactions on Knowledge and Data Engineering, 2020
    Co-Authors: Yong Zhang, Jin Wang, Chunxiao Xing
    Abstract:

    Set similarity search is a fundamental operation in a variety of applications. While many previous studies focus on threshold-based set similarity search and join, little effort has been devoted to KNN set similarity search. In this paper, we propose a transformation-based framework to solve the problem of KNN set similarity search, which, given a collection of set records and a query set, returns the $k$ results with the largest similarity to the query. We devise an effective transformation mechanism to transform sets of various lengths into fixed-length vectors, which maps similar sets closer to each other. Then, we index such vectors with a tiny tree structure. Next, we propose efficient search algorithms and pruning strategies to perform exact KNN set similarity search. We also design an estimation technique by leveraging the data distribution to support approximate KNN search, which can speed up the search while retaining high recall. Experimental results on real-world datasets show that our framework significantly outperforms state-of-the-art methods in both memory-based and disk-based settings.
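
    One way to picture the "fixed-length vector plus pruning" idea is the sketch below. It is an assumption-laden illustration, not the framework from the paper: tokens are assumed to be integers, they are assigned to `m` groups by `token % m` (a hypothetical grouping), and a linear scan stands in for the tree index. The per-group counts give an upper bound on the overlap, hence on the Jaccard similarity, so a record can be skipped without computing its exact similarity.

```python
import heapq


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity; two empty sets count as identical."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def to_vector(record: set[int], m: int) -> list[int]:
    """Fixed-length vector: number of tokens falling into each of m groups."""
    vec = [0] * m
    for token in record:
        vec[token % m] += 1
    return vec


def jaccard_upper_bound(u: list[int], v: list[int]) -> float:
    """Upper bound on Jaccard: per-group overlap cannot exceed min(u[i], v[i])."""
    overlap_ub = sum(min(a, b) for a, b in zip(u, v))
    union_lb = sum(u) + sum(v) - overlap_ub
    return overlap_ub / union_lb if union_lb else 1.0


def knn_with_pruning(records: list[set[int]], query: set[int], k: int, m: int = 4):
    """Exact KNN by linear scan; returns (best-first (sim, index) list, #verified)."""
    qvec = to_vector(query, m)
    heap: list[tuple[float, int]] = []  # min-heap holding the current top k
    verified = 0
    for i, rec in enumerate(records):
        # Skip the record if even its upper bound cannot beat the k-th best so far.
        if len(heap) == k and jaccard_upper_bound(to_vector(rec, m), qvec) <= heap[0][0]:
            continue
        verified += 1
        sim = jaccard(rec, query)
        if len(heap) < k:
            heapq.heappush(heap, (sim, i))
        elif sim > heap[0][0]:
            heapq.heapreplace(heap, (sim, i))
    return sorted(heap, reverse=True), verified


# Hypothetical toy data.
records = [{1, 2, 3, 4}, {1, 2, 3}, {10, 11, 12}, {5, 6, 7, 8}]
top, verified = knn_with_pruning(records, {1, 2, 3, 4}, k=1)
print(top, verified)  # → [(1.0, 0)] 1
```

    Because the bound never underestimates the true similarity, skipping a record is safe and the result stays exact (up to ties); a tighter grouping of tokens would prune more.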

Dongyan Zhao - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Graph Similarity Search over Large Graph Databases
    IEEE Transactions on Knowledge and Data Engineering, 2015
    Co-Authors: Weiguo Zheng, Xiang Lian, Lei Zou, Dong Wang, Dongyan Zhao
    Abstract:

    Since graph data are often noisy and incomplete in real applications, it has become increasingly important to retrieve graphs g in the graph database D that approximately match the query graph q, rather than requiring exact graph matching. In this paper, we study the problem of graph similarity search, which retrieves graphs that are similar to a given query graph under the constraint of graph edit distance. We propose a systematic method for the edit-distance-based similarity search problem. Specifically, we derive two lower bounds, i.e., partition-based and branch-based bounds, from different perspectives. More importantly, a hybrid lower bound incorporating the ideas of both lower bounds is proposed, which is theoretically proved to have higher (at least not lower) pruning power than using the two lower bounds together. We also present a uniform index structure, namely u-tree, to facilitate effective pruning and efficient query processing. Extensive experiments confirm that our proposed approach significantly outperforms the existing approaches, in terms of both pruning power and query response time.
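
    The role of a lower bound in pruning can be shown with a much simpler bound than the partition-based, branch-based or hybrid bounds of the paper. The sketch below swaps in a plain label-counting bound and assumes unit-cost edit operations, labelled vertices and unlabelled edges; the graph representation and function names are hypothetical.

```python
from collections import Counter

# A graph is (vertex_labels, edges): labels indexed by vertex id, edges as id pairs.
Graph = tuple[list[str], list[tuple[int, int]]]


def ged_lower_bound(g1: Graph, g2: Graph) -> int:
    """Label-counting lower bound on the unit-cost graph edit distance."""
    labels1, edges1 = g1
    labels2, edges2 = g2
    # Vertices whose labels can be paired up for free.
    common = sum((Counter(labels1) & Counter(labels2)).values())
    # Every other vertex needs at least one relabel, insert or delete.
    vertex_ops = max(len(labels1), len(labels2)) - common
    # A difference in edge counts must be covered by edge inserts or deletes.
    edge_ops = abs(len(edges1) - len(edges2))
    return vertex_ops + edge_ops


def filter_candidates(database: list[Graph], query: Graph, tau: int) -> list[int]:
    """Keep only graphs whose lower bound does not already exceed the threshold tau."""
    return [i for i, g in enumerate(database) if ged_lower_bound(g, query) <= tau]


# Hypothetical toy data.
query = (["C", "C", "O"], [(0, 1), (1, 2)])
database = [
    (["C", "C", "N"], [(0, 1), (1, 2)]),  # lower bound 1: one relabel
    (["N", "N", "N", "N"], [(0, 1)]),     # lower bound 5: four vertices, one edge
]
print(filter_candidates(database, query, tau=1))  # → [0]
```

    Only the surviving candidates need the expensive exact edit-distance computation; a tighter bound removes more graphs at this filtering stage, which is what "pruning power" refers to.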

  • Graph Similarity Search with edit distance constraint in large graph databases
    Proceedings of the 22nd ACM International Conference on Information & Knowledge Management - CIKM '13, 2013
    Co-Authors: Weiguo Zheng, Xiang Lian, Lei Zou, Dong Wang, Dongyan Zhao
    Abstract:

    Due to the many real applications of graph databases, it has become increasingly important to retrieve graphs g (in graph database D) that approximately match the query graph q, rather than exact subgraph matches. In this paper, we study the problem of graph similarity search, which retrieves graphs that are similar to a given query graph under the constraint of the minimum edit distance. Specifically, we derive a lower bound, the branch-based bound, which can greatly reduce the search space of the graph similarity search. We also propose a tree index structure, namely b-tree, to facilitate effective pruning and efficient query processing. Extensive experiments confirm that our proposed approach outperforms the existing approaches by orders of magnitude, in terms of both pruning power and query response time.

Marlon Dumas - One of the best experts on this subject based on the ideXlab platform.

  • graph matching algorithms for business process model Similarity Search
    Business Process Management, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.

  • Similarity Search of business process models
    IEEE Data(base) Engineering Bulletin, 2009
    Co-Authors: Marlon Dumas, Luciano Garciabanuelos, Remco M Dijkman
    Abstract:

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the objects in question are business process models. The goal is to identify process models in a repository that most closely resemble a given process model or a fragment thereof.

  • graph matching algorithms for business process model Similarity Search
    Lecture Notes in Computer Science, 2009
    Co-Authors: Remco M Dijkman, Marlon Dumas, Luciano Garciabanuelos
    Abstract:

    We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.