Data Mining Query

The experts below are selected from a list of 11,274 experts worldwide ranked by the ideXlab platform.

Maciej Zakrzewicz - One of the best experts on this subject based on the ideXlab platform.

On Multiple Query Optimization in Data Mining

Knowledge Discovery and Data Mining, 2005

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

Traditional multiple-query optimization methods focus on identifying common subexpressions in sets of relational queries and on constructing their global execution plans. In this paper we consider the problem of optimizing sets of data mining queries submitted to a Knowledge Discovery Management System. We describe the problem of data mining query scheduling and introduce a new algorithm, CCAgglomerative, to schedule data mining queries for frequent itemset discovery.

PAKDD - On Multiple Query Optimization in Data Mining

Advances in Knowledge Discovery and Data Mining, 2005

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

Traditional multiple-query optimization methods focus on identifying common subexpressions in sets of relational queries and on constructing their global execution plans. In this paper we consider the problem of optimizing sets of data mining queries submitted to a Knowledge Discovery Management System. We describe the problem of data mining query scheduling and introduce a new algorithm, CCAgglomerative, to schedule data mining queries for frequent itemset discovery.

A Study on Answering a Data Mining Query Using a Materialized View

International Symposium on Computer and Information Sciences, 2004

Co-Authors: Maciej Zakrzewicz, Mikolaj Morzy, Marek Wojciechowski

Abstract:

One of the classic data mining problems is the discovery of frequent itemsets. This problem particularly attracts the database community because it resembles traditional database querying. In this paper we consider a data mining system that supports storing previous query results in the form of materialized data mining views. While numerous works have shown that reusing the results of previous frequent itemset queries can significantly improve the performance of data mining query processing, a thorough study of the possible differences between the current query and a materialized view has not yet been presented. In this paper we classify the possible differences into six classes, provide an I/O cost analysis for all the classes, and experimentally evaluate the most promising one.
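
As a rough illustration of why reuse can pay off, consider the simplest situation one might imagine: the materialized view was computed over the same data as the current query but with a lower (or equal) minimum support threshold. In that case the answer can be obtained by filtering the stored itemsets, with no database scan at all. The sketch below assumes this scenario; the function `answer_from_view`, the dictionary layout, and the toy supports are hypothetical and are not taken from the paper, and this case is not claimed to correspond to any particular one of its six classes.

```python
from typing import Dict, FrozenSet

Itemset = FrozenSet[str]


def answer_from_view(
    view: Dict[Itemset, float],
    view_minsup: float,
    query_minsup: float,
) -> Dict[Itemset, float]:
    """Answer a frequent-itemset query from a stored result (hypothetical helper).

    Assumes the view was mined over the same data as the query. Reuse by pure
    filtering is only valid when the view's threshold is not higher than the
    query's, because otherwise the view may be missing qualifying itemsets.
    """
    if query_minsup < view_minsup:
        raise ValueError("view is too selective; the data would have to be rescanned")
    # Keep only the stored itemsets that also satisfy the stricter threshold.
    return {items: sup for items, sup in view.items() if sup >= query_minsup}


# Toy materialized view: itemset -> relative support, mined with minsup = 0.3.
view = {
    frozenset({"a"}): 0.6,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 0.3,
}

result = answer_from_view(view, view_minsup=0.3, query_minsup=0.5)
```

When the thresholds or the data selections differ in other ways, filtering alone is not enough and some rescanning is needed, which is where an I/O cost analysis becomes relevant.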

Data Mining Query Scheduling for Apriori Common Counting

2004

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

In this paper we consider concurrent execution of multiple data mining queries. If such queries operate on similar parts of the database, their overall I/O cost can be reduced by integrating their data retrieval operations. The integration requires that many data mining queries be present in memory at the same time. If the memory is not large enough to hold all the queries, they must be scheduled into multiple phases of loading and processing. We discuss the problem of data mining query scheduling and propose a heuristic algorithm to efficiently schedule the queries into phases.
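
The idea of integrating data retrieval can be sketched as a single shared scan that updates the counters of every query whose data selection covers the current transaction. This is a simplified illustration, not the authors' implementation: each query is assumed to select a contiguous range of transaction ids, all k-subsets are counted instead of Apriori-generated candidates, and the names `common_count`, `transactions`, and `queries` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def common_count(
    transactions: Iterable[Tuple[int, Set[str]]],
    queries: Dict[str, Tuple[int, int]],
    k: int,
) -> Dict[str, Counter]:
    """Count k-itemsets for several queries in one pass over the data.

    `queries` maps a query name to an inclusive (lo, hi) range of transaction
    ids. Every query keeps its own counter in memory at the same time, which
    is exactly why memory can become the limiting factor.
    """
    counts: Dict[str, Counter] = {name: Counter() for name in queries}
    for tid, items in transactions:
        subsets = None  # enumerated lazily, at most once per transaction
        for name, (lo, hi) in queries.items():
            if lo <= tid <= hi:
                if subsets is None:
                    subsets = [frozenset(c) for c in combinations(sorted(items), k)]
                counts[name].update(subsets)
    return counts


transactions = [
    (1, {"a", "b"}),
    (2, {"a", "b", "c"}),
    (3, {"b", "c"}),
]
queries = {"q1": (1, 2), "q2": (2, 3)}  # the two selections overlap on tid 2

counts = common_count(transactions, queries, k=2)
```

Transaction 2 is read once but contributes to both queries. If the counters of all queries do not fit in memory together, the queries have to be split into groups that are scanned separately, which is the scheduling problem the abstract refers to.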

Evaluation of the Mine Merge Method for Data Mining Query Processing

ADBIS (Local Proceedings), 2004

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

In this paper we consider concurrent execution of multiple data mining queries in the context of frequent itemset discovery. If such queries operate on similar parts of the database, their overall I/O cost can be reduced by transforming the set of data mining queries into another set of non-overlapping queries, whose results can be used to efficiently answer the original queries. We discuss the problem of multiple data mining query optimization and experimentally evaluate the Mine Merge algorithm for efficiently executing sets of data mining queries.
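
The transformation into non-overlapping queries can be pictured as cutting the overlapping data selections at every boundary point. The sketch below covers only that partitioning step, under the assumption that each query selects a half-open range `[lo, hi)` over a single ordered attribute; the function name `elementary_intervals` and the sample ranges are hypothetical, and how the partial results are later combined into answers for the original queries is not shown.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]


def elementary_intervals(queries: Dict[str, Range]) -> List[Tuple[Range, List[str]]]:
    """Split overlapping half-open ranges into disjoint pieces.

    Returns a list of ((lo, hi), [query names]) pairs, where each piece is
    covered entirely by the listed queries and by no others.
    """
    # Every range boundary is a potential cut point.
    points = sorted({p for lo, hi in queries.values() for p in (lo, hi)})
    parts: List[Tuple[Range, List[str]]] = []
    for lo, hi in zip(points, points[1:]):
        covering = sorted(
            name for name, (qlo, qhi) in queries.items() if qlo <= lo and hi <= qhi
        )
        if covering:  # skip gaps that no query selects
            parts.append(((lo, hi), covering))
    return parts


queries = {"q1": (0, 10), "q2": (5, 15)}
parts = elementary_intervals(queries)
```

Here the shared region `[5, 10)` becomes a single intermediate query whose result serves both `q1` and `q2`, so that part of the data is processed once rather than twice.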

Céline Robardet - One of the best experts on this subject based on the ideXlab platform.

An Inductive Database System Based on Virtual Mining Views

Data Mining and Knowledge Discovery, 2012

Co-Authors: Hendrik Blockeel, Toon Calders, Elisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet

Abstract:

Inductive databases integrate database querying with database mining. In this article, we present an inductive database system that relies not on a new data mining query language but on plain SQL. We propose an intuitive and elegant framework based on virtual mining views, which are relational tables that virtually contain the complete output of data mining algorithms executed over a given data table. We show that several types of patterns and models that are implicitly present in the data, such as itemsets, association rules, and decision trees, can be represented and queried with SQL using a unifying framework. As a proof of concept, we illustrate a complete data mining scenario with SQL queries over the mining views, which is executed in our system.
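
To give a feel for querying patterns with plain SQL, the sketch below stores a handful of itemsets and their supports in an ordinary SQLite table and filters them with a `SELECT`. It is only an analogy: the table is materialized eagerly rather than being virtual, and the single-table schema `itemsets(items, support)`, the comma-joined encoding, and the toy transactions are assumptions made for this example, not the schema of the system described in the article.

```python
import sqlite3
from collections import Counter
from itertools import combinations

# Toy transactional data (hypothetical).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
]

# Count every itemset of size 1 or 2; items are sorted so that the
# comma-joined string is a canonical key for the set.
counts = Counter()
for t in transactions:
    for k in (1, 2):
        for combo in combinations(sorted(t), k):
            counts[",".join(combo)] += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE itemsets (items TEXT PRIMARY KEY, support INTEGER)")
conn.executemany("INSERT INTO itemsets VALUES (?, ?)", counts.items())

# The mining constraint (minimum support of 2) is expressed as a WHERE clause.
rows = conn.execute(
    "SELECT items, support FROM itemsets WHERE support >= 2 ORDER BY items"
).fetchall()
conn.close()
```

The point of the analogy is that, once patterns live in relational tables, constraints such as a minimum support become ordinary SQL predicates and can be combined with joins and aggregates over the original data.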

A Practical Comparative Study of Data Mining Query Languages

2010

Co-Authors: Hendrik Blockeel, Toon Calders, Elisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet

Abstract:

An important motivation for the development of inductive databases and query languages for data mining is that such an approach will increase the flexibility with which data mining can be performed. By integrating data mining more closely into a database querying framework, separate steps such as data preprocessing, data mining, and postprocessing of the results can all be handled using one query language. In this chapter, we compare six existing data mining query languages, all extensions of the standard relational query language SQL, from this point of view: how flexible are they with respect to the tasks they can be used for, and how easily can those tasks be performed? We verify whether and how these languages can be used to perform four prototypical data mining tasks in the domain of itemset and association rule mining, and summarize their stronger and weaker points. Besides offering a comparative evaluation of different data mining query languages, this chapter also provides a motivation for the next chapter, where a deeper integration of data mining into databases is proposed, one that does not rely on the development of a new query language, but where the structure of the database itself is extended.

Inductive Querying with Virtual Mining Views

2010

Co-Authors: Hendrik Blockeel, Toon Calders, Elisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet

Abstract:

An important motivation for the development of inductive databases and query languages for data mining is that such an approach will increase the flexibility with which data mining can be performed. By integrating data mining more closely into a database querying framework, separate steps such as data preprocessing, data mining, and postprocessing of the results can all be handled using one query language. In this chapter, we compare six existing data mining query languages, all extensions of the standard relational query language SQL, from this point of view: how flexible are they with respect to the tasks they can be used for, and how easily can those tasks be performed? We verify whether and how these languages can be used to perform four prototypical data mining tasks in the domain of itemset and association rule mining, and summarize their stronger and weaker points. Besides offering a comparative evaluation of different data mining query languages, this chapter also provides a motivation for the next chapter, where a deeper integration of data mining into databases is proposed, one that does not rely on the development of a new query language, but where the structure of the database itself is extended.

Inductive Databases and Constraint-Based Data Mining - A Practical Comparative Study of Data Mining Query Languages

Inductive Databases and Constraint-Based Data Mining, 2010

Co-Authors: Hendrik Blockeel, Toon Calders, Elisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet

Abstract:

An important motivation for the development of inductive databases and query languages for data mining is that such an approach will increase the flexibility with which data mining can be performed. By integrating data mining more closely into a database querying framework, separate steps such as data preprocessing, data mining, and postprocessing of the results can all be handled using one query language. In this chapter, we compare six existing data mining query languages, all extensions of the standard relational query language SQL, from this point of view: how flexible are they with respect to the tasks they can be used for, and how easily can those tasks be performed? We verify whether and how these languages can be used to perform four prototypical data mining tasks in the domain of itemset and association rule mining, and summarize their stronger and weaker points. Besides offering a comparative evaluation of different data mining query languages, this chapter also provides a motivation for a following chapter, where a deeper integration of data mining into databases is proposed, one that does not rely on the development of a new query language, but where the structure of the database itself is extended.

A Practical Comparative Study of Data Mining Query Languages

Inductive Databases and Constraint-Based Data Mining, Džeroski, Sašo, et al. (eds.), 2010

Co-Authors: Hendrik Blockeel, Toon Calders, Elisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet

Abstract:

An important motivation for the development of inductive databases and query languages for data mining is that such an approach will increase the flexibility with which data mining can be performed. By integrating data mining more closely into a database querying framework, separate steps such as data preprocessing, data mining, and postprocessing of the results can all be handled using one query language. In this chapter, we compare six existing data mining query languages, all extensions of the standard relational query language SQL, from this point of view: how flexible are they with respect to the tasks they can be used for, and how easily can those tasks be performed? We verify whether and how these languages can be used to perform four prototypical data mining tasks in the domain of itemset and association rule mining, and summarize their stronger and weaker points. Besides offering a comparative evaluation of different data mining query languages, this chapter also provides a motivation for a following chapter, where a deeper integration of data mining into databases is proposed, one that does not rely on the development of a new query language, but where the structure of the database itself is extended.

Marek Wojciechowski - One of the best experts on this subject based on the ideXlab platform.

On Multiple Query Optimization in Data Mining

Knowledge Discovery and Data Mining, 2005

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

Traditional multiple-query optimization methods focus on identifying common subexpressions in sets of relational queries and on constructing their global execution plans. In this paper we consider the problem of optimizing sets of data mining queries submitted to a Knowledge Discovery Management System. We describe the problem of data mining query scheduling and introduce a new algorithm, CCAgglomerative, to schedule data mining queries for frequent itemset discovery.

PAKDD - On Multiple Query Optimization in Data Mining

Advances in Knowledge Discovery and Data Mining, 2005

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

Traditional multiple-query optimization methods focus on identifying common subexpressions in sets of relational queries and on constructing their global execution plans. In this paper we consider the problem of optimizing sets of data mining queries submitted to a Knowledge Discovery Management System. We describe the problem of data mining query scheduling and introduce a new algorithm, CCAgglomerative, to schedule data mining queries for frequent itemset discovery.

A Study on Answering a Data Mining Query Using a Materialized View

International Symposium on Computer and Information Sciences, 2004

Co-Authors: Maciej Zakrzewicz, Mikolaj Morzy, Marek Wojciechowski

Abstract:

One of the classic data mining problems is the discovery of frequent itemsets. This problem particularly attracts the database community because it resembles traditional database querying. In this paper we consider a data mining system that supports storing previous query results in the form of materialized data mining views. While numerous works have shown that reusing the results of previous frequent itemset queries can significantly improve the performance of data mining query processing, a thorough study of the possible differences between the current query and a materialized view has not yet been presented. In this paper we classify the possible differences into six classes, provide an I/O cost analysis for all the classes, and experimentally evaluate the most promising one.

Data Mining Query Scheduling for Apriori Common Counting

2004

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

In this paper we consider concurrent execution of multiple data mining queries. If such queries operate on similar parts of the database, their overall I/O cost can be reduced by integrating their data retrieval operations. The integration requires that many data mining queries be present in memory at the same time. If the memory is not large enough to hold all the queries, they must be scheduled into multiple phases of loading and processing. We discuss the problem of data mining query scheduling and propose a heuristic algorithm to efficiently schedule the queries into phases.

Evaluation of the Mine Merge Method for Data Mining Query Processing

ADBIS (Local Proceedings), 2004

Co-Authors: Marek Wojciechowski, Maciej Zakrzewicz

Abstract:

In this paper we consider concurrent execution of multiple data mining queries in the context of frequent itemset discovery. If such queries operate on similar parts of the database, their overall I/O cost can be reduced by transforming the set of data mining queries into another set of non-overlapping queries, whose results can be used to efficiently answer the original queries. We discuss the problem of multiple data mining query optimization and experimentally evaluate the Mine Merge algorithm for efficiently executing sets of data mining queries.

Cyrille Masson - One of the best experts on this subject based on the ideXlab platform.

The Data Mining and Knowledge Discovery Handbook - Data Mining Query Languages

Data Mining and Knowledge Discovery Handbook, 2009

Co-Authors: Jeanfrancois Boulicaut, Cyrille Masson

Abstract:

Many data mining algorithms make it possible to extract different types of patterns from data (e.g., local patterns such as itemsets and association rules, and models such as classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. From this database perspective, knowledge discovery processes become querying processes for which query languages have to be designed. In the prolific field of association rule mining, different query languages have been proposed to support the more or less declarative specification of both data and pattern manipulations. In this chapter, we survey some of these proposals. The survey makes it possible to identify current shortcomings and to point out some promising directions of research in this area.

Data Mining Query Languages

The Data Mining and Knowledge Discovery Handbook, 2005

Co-Authors: Jeanfrancois Boulicaut, Cyrille Masson

Abstract:

Many data mining algorithms make it possible to extract different types of patterns from data (e.g., local patterns such as itemsets and association rules, and models such as classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. From this database perspective, knowledge discovery processes become querying processes for which query languages have to be designed. In the prolific field of association rule mining, different query languages have been proposed to support the more or less declarative specification of both data and pattern manipulations. In this chapter, we survey some of these proposals. The survey makes it possible to identify current shortcomings and to point out some promising directions of research in this area.

Optimizing Subset Queries: A Step Towards SQL-Based Inductive Databases for Itemsets

2004

Co-Authors: Cyrille Masson, Céline Robardet, Jeanfrancois Boulicaut

Abstract:

Storing sets and querying them (e.g., subset queries that return all supersets of a given set) is known to be difficult within relational databases. We consider that being able to efficiently query both transactional data and materialized collections of sets by means of a standard query language is an important step towards practical inductive databases. Indeed, data mining query languages such as MINE RULE extract collections of association rules, whose components are sets, into relational tables. Post-processing phases often make extensive use of subset queries, which cannot be processed efficiently by SQL servers. In this paper, we propose a new way to handle sets in relational databases. It is based on a data structure that partially encodes the inclusion relationship between sets. It is an extension of the hash group bitmap key proposed by Morzy et al. [8]. Our experiments show an interesting improvement for these useful subset queries.
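
One common way to encode set inclusion partially is a hashed bitmap signature: each item sets one bit, and a stored set can contain the query set only if its signature contains every bit of the query's signature. The sketch below shows this generic signature filter followed by an exact check. It is offered in the spirit of a bitmap key rather than as the structure proposed in the paper; the functions `signature` and `supersets_of`, the 32-bit width, and the CRC32 hash are all assumptions made for the example.

```python
import zlib
from typing import FrozenSet, Iterable, List, Tuple


def signature(items: Iterable[str], nbits: int = 32) -> int:
    """Fold a set of items into an nbits-wide bitmap (one bit per hashed item)."""
    sig = 0
    for item in items:
        # CRC32 is used only because it is deterministic across runs.
        sig |= 1 << (zlib.crc32(item.encode("utf-8")) % nbits)
    return sig


def supersets_of(
    query: Iterable[str],
    stored: List[Tuple[FrozenSet[str], int]],
    nbits: int = 32,
) -> List[FrozenSet[str]]:
    """Return the stored sets that contain every item of `query`."""
    q = frozenset(query)
    qsig = signature(q, nbits)
    hits = []
    for items, sig in stored:
        # Cheap necessary condition: all query bits must be set in the stored key.
        if (sig & qsig) == qsig:
            # Hash collisions can cause false positives, so verify exactly.
            if q <= items:
                hits.append(items)
    return hits


sets = [frozenset({"a", "b", "c"}), frozenset({"a", "c"}), frozenset({"b", "d"})]
stored = [(s, signature(s)) for s in sets]  # keys are computed once, at load time

found = supersets_of({"a", "c"}, stored)
```

The bitwise test never rejects a true superset, so the filter is safe; its benefit depends on how many non-matching sets it rules out before the more expensive exact comparison.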

SAC - Optimizing Subset Queries: A Step Towards SQL-Based Inductive Databases for Itemsets

Proceedings of the 2004 ACM symposium on Applied computing - SAC '04, 2004

Co-Authors: Cyrille Masson, Céline Robardet, Jeanfrancois Boulicaut

Abstract:

Storing sets and querying them (e.g., subset queries that return all supersets of a given set) is known to be difficult within relational databases. We consider that being able to efficiently query both transactional data and materialized collections of sets by means of a standard query language is an important step towards practical inductive databases. Indeed, data mining query languages such as MINE RULE extract collections of association rules, whose components are sets, into relational tables. Post-processing phases often make extensive use of subset queries, which cannot be processed efficiently by SQL servers. In this paper, we propose a new way to handle sets in relational databases. It is based on a data structure that partially encodes the inclusion relationship between sets. It is an extension of the hash group bitmap key proposed by Morzy et al. [8]. Our experiments show an interesting improvement for these useful subset queries.

Jeanfrancois Boulicaut - One of the best experts on this subject based on the ideXlab platform.

The Data Mining and Knowledge Discovery Handbook - Data Mining Query Languages

Data Mining and Knowledge Discovery Handbook, 2009

Co-Authors: Jeanfrancois Boulicaut, Cyrille Masson

Abstract:

Many data mining algorithms make it possible to extract different types of patterns from data (e.g., local patterns such as itemsets and association rules, and models such as classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. From this database perspective, knowledge discovery processes become querying processes for which query languages have to be designed. In the prolific field of association rule mining, different query languages have been proposed to support the more or less declarative specification of both data and pattern manipulations. In this chapter, we survey some of these proposals. The survey makes it possible to identify current shortcomings and to point out some promising directions of research in this area.

Data Mining Query Languages

The Data Mining and Knowledge Discovery Handbook, 2005

Co-Authors: Jeanfrancois Boulicaut, Cyrille Masson

Abstract:

Many data mining algorithms make it possible to extract different types of patterns from data (e.g., local patterns such as itemsets and association rules, and models such as classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. From this database perspective, knowledge discovery processes become querying processes for which query languages have to be designed. In the prolific field of association rule mining, different query languages have been proposed to support the more or less declarative specification of both data and pattern manipulations. In this chapter, we survey some of these proposals. The survey makes it possible to identify current shortcomings and to point out some promising directions of research in this area.

Optimizing Subset Queries: A Step Towards SQL-Based Inductive Databases for Itemsets

2004

Co-Authors: Cyrille Masson, Céline Robardet, Jeanfrancois Boulicaut

Abstract:

Storing sets and querying them (e.g., subset queries that return all supersets of a given set) is known to be difficult within relational databases. We consider that being able to efficiently query both transactional data and materialized collections of sets by means of a standard query language is an important step towards practical inductive databases. Indeed, data mining query languages such as MINE RULE extract collections of association rules, whose components are sets, into relational tables. Post-processing phases often make extensive use of subset queries, which cannot be processed efficiently by SQL servers. In this paper, we propose a new way to handle sets in relational databases. It is based on a data structure that partially encodes the inclusion relationship between sets. It is an extension of the hash group bitmap key proposed by Morzy et al. [8]. Our experiments show an interesting improvement for these useful subset queries.

SAC - Optimizing Subset Queries: A Step Towards SQL-Based Inductive Databases for Itemsets

Proceedings of the 2004 ACM symposium on Applied computing - SAC '04, 2004

Co-Authors: Cyrille Masson, Céline Robardet, Jeanfrancois Boulicaut

Abstract:

Storing sets and querying them (e.g., subset queries that return all supersets of a given set) is known to be difficult within relational databases. We consider that being able to efficiently query both transactional data and materialized collections of sets by means of a standard query language is an important step towards practical inductive databases. Indeed, data mining query languages such as MINE RULE extract collections of association rules, whose components are sets, into relational tables. Post-processing phases often make extensive use of subset queries, which cannot be processed efficiently by SQL servers. In this paper, we propose a new way to handle sets in relational databases. It is based on a data structure that partially encodes the inclusion relationship between sets. It is an extension of the hash group bitmap key proposed by Morzy et al. [8]. Our experiments show an interesting improvement for these useful subset queries.
