Pastern

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 324 Experts worldwide ranked by ideXlab platform

Jian Pei - One of the best experts on this subject based on the ideXlab platform.

  • Constraint-based sequential pattern mining: The pattern-growth methods
    Journal of Intelligent Information Systems, 2007
    Co-Authors: Jian Pei, Jia Wei Han, Wei Wang
    Abstract:

    Constraints are essential for many sequential pattern mining applications. However, there is no systematic study on constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well.

  • Sequential Pattern Mining by Pattern-Growth: Principles and Extensions
    Foundations and Advances in Data Mining, 2005
    Co-Authors: Jia Wei Han, Jian Pei, X. Yan
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP [30], a horizontal format-based sequential pattern mining method, and (ii) SPADE [36], a vertical format-based method; and (2) a sequential pattern growth method, represented by PrefixSpan [26] and its further extensions, such as CloSpan for mining closed sequential patterns [35].

  • From sequential pattern mining to structured pattern mining: A pattern-growth approach
    Journal of Computer Science and Technology, 2004
    Co-Authors: Jia Wei Han, Jian Pei, Xi Feng Yan
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we perform a systematic introduction and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.

  • Pattern-growth Methods for Frequent Pattern Mining
    School of Computing Science, 2002
    Co-Authors: Jian Pei
    Abstract:

    Mining frequent patterns from large databases plays an essential role in many data mining tasks and has broad applications. Most of the previously proposed methods adopt Apriori-like candidate-generation-and-test approaches. However, those methods may encounter serious challenges when mining datasets with prolific patterns and/or long patterns. In this work, we develop a class of novel and efficient pattern-growth methods for mining various frequent patterns from large databases. Pattern-growth methods adopt a divide-and-conquer approach to decompose both the mining tasks and the databases. Then, they use a pattern fragment growth method to avoid the costly candidate-generation-and-test processing completely. Moreover, effective data structures are proposed to compress crucial information about frequent patterns and avoid expensive, repeated database scans. A comprehensive performance study shows that pattern-growth methods, FP-growth and H-mine, are efficient and scalable. They are faster than some recently reported new frequent pattern mining methods. Interestingly, pattern-growth methods are not only efficient, but also effective. With pattern-growth methods, many interesting patterns can also be mined efficiently, such as patterns with some tough non-anti-monotonic constraints and sequential patterns. These techniques have strong implications for many other data mining tasks.

  • FreeSpan: frequent pattern-projected sequential pattern mining
    KDD '00 Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000
    Co-Authors: Jia Wei Han, B Mortazavi-asl, Jian Pei, Qiming Chen
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. It is also a difficult problem since one may need to examine a combinatorially explosive number of possible subsequence patterns. Most of the previously developed sequential pattern mining methods follow the methodology of Apriori since the Apriori-based method may substantially reduce the number of combinations to be examined. However, Apriori still encounters problems when a sequence database is large and/or when sequential patterns to be mined are numerous and/or long. In this paper, we re-examine the sequential pattern mining problem and propose a novel, efficient sequential pattern mining method, called FreeSpan (i.e., Frequent pattern-projected Sequential pattern mining). The general idea of the method is to integrate the mining of frequent sequences with that of frequent patterns and use projected sequence databases to confine the search and the growth of subsequence fragments. FreeSpan mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation. Our performance study shows that FreeSpan examines a substantially smaller number of combinations of subsequences and runs considerably faster than the Apriori-based GSP algorithm.
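
Several abstracts above contrast pattern growth with the candidate generation-and-test approach. As a rough illustration of that baseline, here is a minimal level-wise sketch in Python. It is not GSP itself: it assumes every sequence is a list of single items (no itemsets, no time constraints), and the function names `is_subsequence`, `support`, and `generate_and_test` are hypothetical names chosen for this example.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so matches must appear left to right.
    return all(item in it for item in pattern)


def support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def generate_and_test(sequences, min_support):
    """Simplified level-wise miner: generate candidates, then test each one.

    Assumption: sequences hold single items, and min_support is at least 1
    (this also guarantees termination, since over-long candidates have support 0).
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    items = sorted({x for s in sequences for x in s})
    frequent = {}
    level = [(i,) for i in items]          # length-1 candidates
    while level:
        survivors = []
        for cand in level:
            sup = support(cand, sequences)  # one full database scan per candidate
            if sup >= min_support:
                frequent[cand] = sup
                survivors.append(cand)
        # Generate the next level: extend each surviving pattern by one frequent item.
        freq_items = [p[0] for p in frequent if len(p) == 1]
        level = [p + (i,) for p in survivors for i in freq_items]
    return frequent
```

Even on a toy database, most generated candidates turn out to be infrequent, which is the cost that the pattern-growth methods listed here aim to reduce.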

Jia Wei Han - One of the best experts on this subject based on the ideXlab platform.

  • Constraint-based sequential pattern mining: The pattern-growth methods
    Journal of Intelligent Information Systems, 2007
    Co-Authors: Jian Pei, Jia Wei Han, Wei Wang
    Abstract:

    Constraints are essential for many sequential pattern mining applications. However, there is no systematic study on constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well.

  • Sequential Pattern Mining by Pattern-Growth: Principles and Extensions
    Foundations and Advances in Data Mining, 2005
    Co-Authors: Jia Wei Han, Jian Pei, X. Yan
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP [30], a horizontal format-based sequential pattern mining method, and (ii) SPADE [36], a vertical format-based method; and (2) a sequential pattern growth method, represented by PrefixSpan [26] and its further extensions, such as CloSpan for mining closed sequential patterns [35].

  • From sequential pattern mining to structured pattern mining: A pattern-growth approach
    Journal of Computer Science and Technology, 2004
    Co-Authors: Jia Wei Han, Jian Pei, Xi Feng Yan
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we perform a systematic introduction and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.

  • FreeSpan: frequent pattern-projected sequential pattern mining
    KDD '00 Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000
    Co-Authors: Jia Wei Han, B Mortazavi-asl, Jian Pei, Qiming Chen
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. It is also a difficult problem since one may need to examine a combinatorially explosive number of possible subsequence patterns. Most of the previously developed sequential pattern mining methods follow the methodology of Apriori since the Apriori-based method may substantially reduce the number of combinations to be examined. However, Apriori still encounters problems when a sequence database is large and/or when sequential patterns to be mined are numerous and/or long. In this paper, we re-examine the sequential pattern mining problem and propose a novel, efficient sequential pattern mining method, called FreeSpan (i.e., Frequent pattern-projected Sequential pattern mining). The general idea of the method is to integrate the mining of frequent sequences with that of frequent patterns and use projected sequence databases to confine the search and the growth of subsequence fragments. FreeSpan mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation. Our performance study shows that FreeSpan examines a substantially smaller number of combinations of subsequences and runs considerably faster than the Apriori-based GSP algorithm.
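
The abstracts above describe pattern growth as extending frequent prefixes while confining the search to projected sequence databases. The sketch below illustrates that idea in the style of prefix projection. It is a simplified illustration, not the published PrefixSpan or FreeSpan algorithm: it assumes each sequence is a list of single items, and the function name `prefix_growth` and its parameters are hypothetical.

```python
from collections import defaultdict


def prefix_growth(sequences, min_support):
    """Return {pattern tuple: support} by growing frequent prefixes.

    Assumption: each sequence is a list of single items (no itemsets).
    A projected database is stored as (sequence index, start offset) pairs,
    so suffixes are never copied.
    """
    results = {}

    def grow(prefix, projected):
        # Count, once per sequence, every item that appears in a projected suffix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, sup in sorted(counts.items()):
            if sup < min_support:
                continue  # infrequent extensions are never generated further
            pattern = prefix + (item,)
            results[pattern] = sup
            # Project: keep only the part after the first occurrence of `item`.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(item, start)
                except ValueError:
                    continue  # this suffix does not contain `item`
                new_projected.append((seq_idx, pos + 1))
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results
```

For the toy database `[["a", "b", "c"], ["a", "c"], ["b", "c"]]` with `min_support=2`, this returns the patterns `("a",)`, `("b",)`, `("c",)`, `("a", "c")`, and `("b", "c")`. Only extensions that are frequent inside the current projection are ever explored, so no separate candidate list is built.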

Wei Wang - One of the best experts on this subject based on the ideXlab platform.

Qiming Chen - One of the best experts on this subject based on the ideXlab platform.

  • FreeSpan: frequent pattern-projected sequential pattern mining
    KDD '00 Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000
    Co-Authors: Jia Wei Han, B Mortazavi-asl, Jian Pei, Qiming Chen
    Abstract:

    Sequential pattern mining is an important data mining problem with broad applications. It is also a difficult problem since one may need to examine a combinatorially explosive number of possible subsequence patterns. Most of the previously developed sequential pattern mining methods follow the methodology of Apriori since the Apriori-based method may substantially reduce the number of combinations to be examined. However, Apriori still encounters problems when a sequence database is large and/or when sequential patterns to be mined are numerous and/or long. In this paper, we re-examine the sequential pattern mining problem and propose a novel, efficient sequential pattern mining method, called FreeSpan (i.e., Frequent pattern-projected Sequential pattern mining). The general idea of the method is to integrate the mining of frequent sequences with that of frequent patterns and use projected sequence databases to confine the search and the growth of subsequence fragments. FreeSpan mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation. Our performance study shows that FreeSpan examines a substantially smaller number of combinations of subsequences and runs considerably faster than the Apriori-based GSP algorithm.

Jan Dul - One of the best experts on this subject based on the ideXlab platform.

  • Pattern Matching
    Erasmus Research Institute of Management, 2009
    Co-Authors: Tony Hak, Jan Dul
    Abstract:

    Pattern matching is comparing two patterns in order to determine whether they match (i.e., that they are the same) or do not match (i.e., that they differ). Pattern matching is the core procedure of theory-testing with cases. Testing consists of matching an “observed pattern” (a pattern of measured values) with an “expected pattern” (a hypothesis), and deciding whether these patterns match (resulting in a confirmation of the hypothesis) or do not match (resulting in a disconfirmation). Essential to pattern matching (as opposed to pattern recognition, which is a procedure by which theory is built) is that the expected pattern is precisely specified before the matching takes place.
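
The matching procedure described above can be illustrated with a small Python sketch. It assumes that a pattern is a mapping from variable names to numeric values and that the expected pattern is fixed before any observation is made. The function name `match_pattern` and the `tolerance` parameter are hypothetical choices for this example and do not come from the source.

```python
def match_pattern(expected, observed, tolerance=0.0):
    """Compare an observed pattern with an expected pattern specified in advance.

    Both patterns are dicts of variable name -> numeric value (an assumption
    made for this sketch). Returns (confirmed, mismatches), where mismatches
    maps each differing variable to its (expected, observed) pair.
    """
    missing = [name for name in expected if name not in observed]
    if missing:
        raise ValueError(f"observed pattern lacks variables: {missing}")
    mismatches = {
        name: (expected[name], observed[name])
        for name in expected
        if abs(expected[name] - observed[name]) > tolerance
    }
    # The hypothesis is confirmed only if every expected value is matched.
    return (not mismatches), mismatches
```

A call such as `match_pattern({"x": 1, "y": 0}, {"x": 1, "y": 1})` reports a disconfirmation together with the variable that differs, while identical patterns yield a confirmation.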