Pattern Mining - Explore the Science & Experts

The Experts below are selected from a list of 42081 Experts worldwide ranked by ideXlab platform

Jia Wei Han - One of the best experts on this subject based on the ideXlab platform.

Advanced Pattern Mining

Data Mining, 2012

Co-Authors: Jia Wei Han, Micheline Kamber, Jian Pei

Abstract:

This chapter discusses the advanced methods of frequent Pattern Mining, which mines more complex forms of frequent Patterns and considers user preferences or constraints to speed up the Mining process. Frequent Pattern Mining has reached far beyond the basics due to substantial research, numerous extensions of the problem scope, and broad application studies. An in-depth coverage of methods for Mining many kinds of Patterns is included elaborating on: multilevel Patterns, multidimensional Patterns, Patterns in continuous data, rare Patterns, negative Patterns, constrained frequent Patterns, frequent Patterns in high-dimensional data, colossal Patterns, and compressed and approximate Patterns. Other Pattern Mining themes, including Mining sequential and structured Patterns and Mining Patterns from spatiotemporal, multimedia, and stream data, are considered more advanced. Pattern Mining is a more general term than frequent Pattern Mining since the former covers rare and negative Patterns as well. However, when there is no ambiguity, the two terms are used interchangeably. In addition to Mining for basic frequent itemsets and associations, advanced forms of Patterns can be mined such as multilevel associations and multidimensional associations, quantitative association rules, rare Patterns, and negative Patterns. Users can also mine high-dimensional Patterns and compressed or approximate Patterns. Frequent Pattern Mining has many diverse applications, ranging from Pattern-based data cleaning to Pattern-based classification, clustering, and outlier or exception analysis.

Frequent Pattern Mining: Current status and future directions

Data Mining and Knowledge Discovery, 2007

Co-Authors: Jia Wei Han, Dong Xin, Hong Cheng, Xifeng Yan

Abstract:

Frequent Pattern Mining has been a focused theme in data Mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset Mining in transaction databases to numerous research frontiers, such as sequential Pattern Mining, structured Pattern Mining, correlation Mining, associative classification, and frequent Pattern-based clustering, as well as their broad applications. In this article, we provide a brief overview of the current status of frequent Pattern Mining and discuss a few promising research directions. We believe that frequent Pattern Mining research has substantially broadened the scope of data analysis and will have deep impact on data Mining methodologies and applications in the long run. However, there are still some challenging research issues that need to be solved before frequent Pattern Mining can claim a cornerstone approach in data Mining applications.

Constraint-based sequential Pattern Mining: The Pattern-growth methods

Journal of Intelligent Information Systems, 2007

Co-Authors: Jian Pei, Jia Wei Han, Wei Wang

Abstract:

Constraints are essential for many sequential Pattern Mining applications. However, there is no systematic study on constraint-based sequential Pattern Mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-Pattern Mining does not fit our mission well. An extended framework is developed based on a sequential Pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential Pattern Mining under this new framework. Moreover, this framework can be extended to constraint-based structured Pattern Mining as well.

From sequential Pattern Mining to structured Pattern Mining: A Pattern-growth approach

Journal of Computer Science and Technology, 2004

Co-Authors: Jia Wei Han, Jian Pei, Xifeng Yan

Abstract:

Sequential Pattern Mining is an important data Mining problem with broad applications. However, it is also a challenging problem since the Mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential Pattern Mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential Pattern Mining method, and (ii) SPADE, a vertical format-based method; and (2) a Pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for Mining structured Patterns. In this study, we perform a systematic introduction and presentation of the Pattern-growth methodology and study its principles and extensions. We first introduce two interesting Pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential Pattern Mining. Then we introduce gSpan for Mining structured Patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including Mining multi-level, multi-dimensional Patterns and Mining constraint-based Patterns.
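
The pattern-growth idea named in this abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it is a simplified, PrefixSpan-style miner that assumes each sequence is a plain list of single items (no itemset elements), and the names `prefixspan`, `grow`, and `min_count` are hypothetical. It grows each frequent prefix by one item at a time and recurses only into the projected suffixes, so no candidate subsequences are generated and tested separately.

```python
def prefixspan(sequences, min_count):
    """Mine frequent subsequences of single-item sequences by pattern growth.

    Returns a list of (pattern, count) pairs, where count is the number of
    input sequences that contain the pattern as a subsequence.
    """
    results = []

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pointers to suffixes.
        counts = {}
        for sid, start in projected:
            seen = set()
            for item in sequences[sid][start:]:
                if item not in seen:  # count each item once per sequence
                    seen.add(item)
                    counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + [item]
            results.append((pattern, count))
            # Project: keep the suffix after the first occurrence of item.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            grow(pattern, new_projected)

    grow([], [(i, 0) for i in range(len(sequences))])
    return results


# Toy sequence database (made up for illustration).
toy_sequences = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
toy_patterns = prefixspan(toy_sequences, min_count=2)
```

Storing offsets instead of copying suffixes is one common way to keep projection cheap; it is a design choice of this sketch rather than something stated in the abstract.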

gSpan: Graph-Based Substructure Pattern Mining

International Conference on Data Mining, 2002

Co-Authors: Xifeng Yan, Jia Wei Han

Abstract:

We investigate new approaches for frequent graph-based Pattern Mining in graph datasets and propose a novel algorithm called gSpan (graph-based substructure Pattern Mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.


Charu C. Aggarwal - One of the best experts on this subject based on the ideXlab platform.

Association Pattern Mining

Data Mining, 2015

Co-Authors: Charu C. Aggarwal

Abstract:

The classical problem of association Pattern Mining is defined in the context of supermarket data containing sets of items bought by customers, which are referred to as transactions. The goal is to determine associations between groups of items bought by customers, which can intuitively be viewed as k-way correlations between items. The most popular model for association Pattern Mining uses the frequencies of sets of items as the quantification of the level of association.
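
As a minimal illustration of the frequency-based model mentioned above, the sketch below counts how often each set of k items occurs together across transactions and reports that as a relative frequency (support). The function name `itemset_support` and the toy baskets are assumptions made for this example and do not come from the chapter.

```python
from collections import Counter
from itertools import combinations


def itemset_support(transactions, k):
    """Return the relative frequency (support) of every k-item set."""
    counts = Counter()
    for basket in transactions:
        # Sort so each itemset has a single canonical tuple form.
        for itemset in combinations(sorted(set(basket)), k):
            counts[itemset] += 1
    n = len(transactions)
    return {itemset: count / n for itemset, count in counts.items()}


# Toy supermarket baskets (made up for illustration).
toy_baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
pair_support = itemset_support(toy_baskets, 2)
```

Enumerating every k-subset of every basket is only practical for small k and short transactions; the algorithms surveyed elsewhere on this page exist precisely to avoid that blow-up.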

Frequent Pattern Mining - Frequent Pattern Mining Algorithms: A Survey

Frequent Pattern Mining, 2014

Co-Authors: Charu C. Aggarwal, Mansurul Bhuiyan, Mohammad Al Hasan

Abstract:

This chapter will provide a detailed survey of frequent Pattern Mining algorithms. A wide variety of algorithms will be covered, starting from Apriori. Many algorithms such as Eclat, TreeProjection, and FP-growth will be discussed. In addition, a discussion of several maximal and closed frequent Pattern Mining algorithms will be provided. Thus, this chapter will provide one of the most detailed surveys of frequent Pattern Mining algorithms available in the literature.
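
For readers unfamiliar with Apriori, the starting point of this survey, the following is a rough sketch of its level-wise join, prune, and count loop. It is written for clarity rather than speed, is not taken from the chapter, and uses hypothetical names (`apriori`, `min_count`); support here is an absolute transaction count.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining; returns {frozenset: count}."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


# Toy baskets (made up for illustration).
toy_baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
toy_frequent = apriori(toy_baskets, min_count=2)
```

The prune step relies on the downward-closure property: an itemset can only be frequent if all of its subsets are frequent.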

Frequent Pattern Mining - Applications of Frequent Pattern Mining

Frequent Pattern Mining, 2014

Co-Authors: Charu C. Aggarwal

Abstract:

Frequent Pattern Mining has broad applications which encompass clustering, classification, software bug detection, recommendations, and a wide variety of other problems. In fact, the greatest utility of frequent Pattern Mining (unlike other major data Mining problems such as outlier analysis and classification), is as an intermediate tool to provide Pattern-centered insights for a variety of problems. In this chapter, we will study a wide variety of applications of frequent Pattern Mining. The purpose of this chapter is not to provide a detailed description of every possible application, but to provide the reader an overview of what is possible with the use of methods such as frequent Pattern Mining.

Frequent Pattern Mining - An Introduction to Frequent Pattern Mining

Frequent Pattern Mining, 2014

Co-Authors: Charu C. Aggarwal

Abstract:

The problem of frequent Pattern Mining has been widely studied in the literature because of its numerous applications to a variety of data Mining problems such as clustering and classification. In addition, frequent Pattern Mining also has numerous applications in diverse domains such as spatiotemporal data, software bug detection, and biological data. The algorithmic aspects of frequent Pattern Mining have been explored very widely. This chapter provides an overview of these methods, as it relates to the organization of this book.

On dense Pattern Mining in graph streams

Proceedings of the VLDB Endowment, 2010

Co-Authors: Charu C. Aggarwal, Philip S Yu, Yao Li, Ruoming Jin

Abstract:

Many massive web and communication network applications create data which can be represented as a massive sequential stream of edges. For example, conversations in a telecommunication network or messages in a social network can be represented as a massive stream of edges. Such streams are typically very large, because of the large amount of underlying activity in such networks. An important application in these domains is to determine frequently occurring dense structures in the underlying graph stream. In general, we would like to determine frequent and dense Patterns in the underlying interactions. We introduce a model for dense Pattern Mining and propose probabilistic algorithms for determining such structural Patterns effectively and efficiently. The purpose of the probabilistic approach is to create a summarization of the graph stream, which can be used for further Pattern Mining. We show that this summarization approach leads to effective and efficient results for stream Pattern Mining over a number of real and synthetic data sets.


Jian Pei - One of the best experts on this subject based on the ideXlab platform.

Advanced Pattern Mining

Data Mining, 2012

Co-Authors: Jia Wei Han, Micheline Kamber, Jian Pei

Abstract:

This chapter discusses the advanced methods of frequent Pattern Mining, which mines more complex forms of frequent Patterns and considers user preferences or constraints to speed up the Mining process. Frequent Pattern Mining has reached far beyond the basics due to substantial research, numerous extensions of the problem scope, and broad application studies. An in-depth coverage of methods for Mining many kinds of Patterns is included elaborating on: multilevel Patterns, multidimensional Patterns, Patterns in continuous data, rare Patterns, negative Patterns, constrained frequent Patterns, frequent Patterns in high-dimensional data, colossal Patterns, and compressed and approximate Patterns. Other Pattern Mining themes, including Mining sequential and structured Patterns and Mining Patterns from spatiotemporal, multimedia, and stream data, are considered more advanced. Pattern Mining is a more general term than frequent Pattern Mining since the former covers rare and negative Patterns as well. However, when there is no ambiguity, the two terms are used interchangeably. In addition to Mining for basic frequent itemsets and associations, advanced forms of Patterns can be mined such as multilevel associations and multidimensional associations, quantitative association rules, rare Patterns, and negative Patterns. Users can also mine high-dimensional Patterns and compressed or approximate Patterns. Frequent Pattern Mining has many diverse applications, ranging from Pattern-based data cleaning to Pattern-based classification, clustering, and outlier or exception analysis.

Constraint-based sequential Pattern Mining: The Pattern-growth methods

Journal of Intelligent Information Systems, 2007

Co-Authors: Jian Pei, Jia Wei Han, Wei Wang

Abstract:

Constraints are essential for many sequential Pattern Mining applications. However, there is no systematic study on constraint-based sequential Pattern Mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-Pattern Mining does not fit our mission well. An extended framework is developed based on a sequential Pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential Pattern Mining under this new framework. Moreover, this framework can be extended to constraint-based structured Pattern Mining as well.

Preference-Based Frequent Pattern Mining

International Journal of Data Warehousing and Mining, 2005

Co-Authors: Moonjung Cho, Jian Pei, Haixun Wang, Wei Wang

Abstract:

Frequent Pattern Mining is an important data-Mining problem with broad applications. Although there are many in-depth studies on efficient frequent Pattern Mining algorithms and constraint pushing techniques, the effectiveness of frequent Pattern Mining remains a serious concern: It is non-trivial and often tricky to specify appropriate support thresholds and proper constraints. In this paper, we propose a novel theme of preference-based frequent Pattern Mining. A user simply can specify a preference instead of setting detailed parameters in constraints. We identify the problem of preference-based frequent Pattern Mining and formulate the preferences for Mining. We develop an efficient framework to mine frequent Patterns with preferences. Interestingly, many preferences can be pushed deep into the Mining by properly employing the existing efficient frequent Pattern Mining techniques. We conduct an extensive performance study to examine our method. The results indicate that preference-based frequent Pattern Mining is effective and efficient. Furthermore, we extend our discussion from Pattern-based frequent Pattern Mining to preference-based data Mining in principle and draw a general framework.

From sequential Pattern Mining to structured Pattern Mining: A Pattern-growth approach

Journal of Computer Science and Technology, 2004

Co-Authors: Jia Wei Han, Jian Pei, Xifeng Yan

Abstract:

Sequential Pattern Mining is an important data Mining problem with broad applications. However, it is also a challenging problem since the Mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential Pattern Mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential Pattern Mining method, and (ii) SPADE, a vertical format-based method; and (2) a Pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for Mining structured Patterns. In this study, we perform a systematic introduction and presentation of the Pattern-growth methodology and study its principles and extensions. We first introduce two interesting Pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential Pattern Mining. Then we introduce gSpan for Mining structured Patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including Mining multi-level, multi-dimensional Patterns and Mining constraint-based Patterns.

Constrained frequent Pattern Mining: a Pattern-growth view

ACM SIGKDD Explorations Newsletter, 2002

Co-Authors: Jian Pei, Jia Wei Han

Abstract:

It has been well recognized that frequent Pattern Mining plays an essential role in many important data Mining tasks. However, frequent Pattern Mining often generates a very large number of Patterns and rules, which reduces not only the efficiency but also the effectiveness of Mining. Recent work has highlighted the importance of the constraint-based Mining paradigm in the context of Mining frequent itemsets, associations, correlations, sequential Patterns, and many other interesting Patterns in large databases. Recently, we developed efficient Pattern-growth methods for frequent Pattern Mining. Interestingly, Pattern-growth methods are not only efficient but also effective in Mining with various constraints. Many tough constraints which cannot be handled by previous methods can be pushed deep into the Pattern-growth Mining process. In this paper, we overview the principles of Pattern-growth methods for constrained frequent Pattern Mining and sequential Pattern Mining. Moreover, we explore the power of Pattern-growth methods towards Mining with tough constraints and highlight some interesting open problems.


Xifeng Yan - One of the best experts on this subject based on the ideXlab platform.

Frequent Pattern Mining: Current status and future directions

Data Mining and Knowledge Discovery, 2007

Co-Authors: Jia Wei Han, Dong Xin, Hong Cheng, Xifeng Yan

Abstract:

Frequent Pattern Mining has been a focused theme in data Mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset Mining in transaction databases to numerous research frontiers, such as sequential Pattern Mining, structured Pattern Mining, correlation Mining, associative classification, and frequent Pattern-based clustering, as well as their broad applications. In this article, we provide a brief overview of the current status of frequent Pattern Mining and discuss a few promising research directions. We believe that frequent Pattern Mining research has substantially broadened the scope of data analysis and will have deep impact on data Mining methodologies and applications in the long run. However, there are still some challenging research issues that need to be solved before frequent Pattern Mining can claim a cornerstone approach in data Mining applications.

From sequential Pattern Mining to structured Pattern Mining: A Pattern-growth approach

Journal of Computer Science and Technology, 2004

Co-Authors: Jia Wei Han, Jian Pei, Xifeng Yan

Abstract:

Sequential Pattern Mining is an important data Mining problem with broad applications. However, it is also a challenging problem since the Mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential Pattern Mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential Pattern Mining method, and (ii) SPADE, a vertical format-based method; and (2) a Pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for Mining structured Patterns. In this study, we perform a systematic introduction and presentation of the Pattern-growth methodology and study its principles and extensions. We first introduce two interesting Pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential Pattern Mining. Then we introduce gSpan for Mining structured Patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including Mining multi-level, multi-dimensional Patterns and Mining constraint-based Patterns.

gSpan: Graph-Based Substructure Pattern Mining

International Conference on Data Mining, 2002

Co-Authors: Xifeng Yan, Jia Wei Han

Abstract:

We investigate new approaches for frequent graph-based Pattern Mining in graph datasets and propose a novel algorithm called gSpan (graph-based substructure Pattern Mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.


G Raju - One of the best experts on this subject based on the ideXlab platform.

Web access Pattern Mining – a new method

International Journal of Web Science, 2014

Co-Authors: Achuthan Nair Rajimol, G Raju

Abstract:

An efficient web access Pattern Mining algorithm, FOL-mine, is presented in this paper. The FOL-mine algorithm is based on the projected database of each frequent event and eliminates the need to construct a Pattern tree. A data structure, the first occurrence list (FOL), is introduced in the proposed algorithm for efficient handling of suffix building. Rebuilding of projection databases is completely eliminated in the new method. Experimental analysis of the algorithm reveals a significant performance gain over other web access Pattern Mining algorithms.

A Novel Weighted Support Method for Access Pattern Mining

2014

Co-Authors: G Raju, Achuthan Nair Rajimol

Abstract:

Sequential Pattern Mining is an important data Mining technique that finds all frequent sequential Patterns in a sequence database. Applications in a wide range of important domains make sequential Pattern Mining an interesting area of research. The conventional approach to sequential Pattern Mining treats every item in the sequence with equal importance and thus fails to reflect the individual significance of items. Weighted sequential Pattern Mining is an approach that treats different items in the sequences with different weights so as to reflect the importance of each item. Thus, the weighted method models real-life sequence databases better and is more efficient than conventional sequential Pattern Mining. Weighted sequential Pattern Mining can be used to mine web access Patterns more efficiently from web log data. This paper proposes a new weighted access Pattern Mining algorithm to mine weighted access Patterns in a web log database. The proposed method uses the frequency of user visits to give weights to web pages during the Mining process. Extensive experimental evaluation shows the algorithm to be promising.

ICDEM - Web access Pattern Mining --- a survey

Lecture Notes in Computer Science, 2010

Co-Authors: Achuthan Nair Rajimol, G Raju

Abstract:

This article provides a survey of different Web Access Pattern Tree (WAP-tree) based methods for Web Access Pattern Mining. Web Access Pattern Mining mines the complete set of Patterns that satisfy the given support threshold from a given Web Access Sequence Database. A brief discussion of the basic theory and terminology related to web access Pattern Mining is presented. A comparison of the different methods is also given.

