Decision Tree Learner

14,000,000 Leading Edge Experts on the ideXlab platform

The Experts below are selected from a list of 324 Experts worldwide, ranked by the ideXlab platform.

Hendrik Blockeel - One of the best experts on this subject based on the ideXlab platform.

  • Multi-relational data mining in Microsoft SQL Server 2005
    Data Mining VII: Data Text and Web Mining and their Business Applications, 2006
    Co-Authors: Claudio Curotto, N. F. F. Ebecken, Hendrik Blockeel
    Abstract:

    Most real-life data are relational by nature, and integrating data mining into databases is an essential goal. Microsoft SQL Server (MSSQL) seems to provide an interesting and promising environment for developing aggregated multi-relational data mining algorithms using nested tables and the plug-in algorithm approach. However, it is currently unclear how data mining algorithms can best use these nested tables. In this paper we look at how Microsoft Decision Trees (MSDT) handles multi-relational data, and we compare it with the multi-relational Decision Tree Learner TILDE. In our experiments, MSDT achieves predictive accuracy as good as TILDE's, but the trees it produces either ignore the relational information or use it in a way that yields uninterpretable trees. As such, its explanatory power is reduced compared to a multi-relational Decision Tree Learner. We conclude that it may be worthwhile to integrate a multi-relational Decision Tree Learner into MSSQL.

  • ECML - A comparison of approaches for learning probability Trees
    Machine Learning: ECML 2005, 2005
    Co-Authors: Daan Fierens, Hendrik Blockeel, Jan Ramon, Maurice Bruynooghe
    Abstract:

    Probability trees (or probability estimation trees, PETs) are Decision Trees with probability distributions in their leaves. Several alternative approaches for learning probability trees have been proposed, but no thorough comparison of these approaches exists. In this paper we experimentally compare the main approaches using the relational Decision Tree Learner TILDE (on both non-relational and relational datasets). Besides the main existing approaches, we also consider a novel variant of an existing approach based on the Bayesian Information Criterion (BIC). Our main conclusion is that, overall, trees built using the C4.5 approach or the C4.4 approach (C4.5 without post-pruning) have the best predictive performance. If the number of classes is low, however, BIC performs equally well. An additional advantage of BIC is that its trees are considerably smaller than those of the C4.5 and C4.4 approaches.
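    The core idea of a probability tree — leaves that hold class probability distributions rather than a single label — comes down to estimating smoothed class frequencies at each leaf. A minimal sketch in plain Python (illustrative only, not the TILDE implementation; the Laplace-style smoothing shown is the correction C4.4-style probability trees commonly apply to raw leaf frequencies):

```python
from collections import Counter

def leaf_distribution(labels, classes, smoothing=1.0):
    """Class-probability estimate at a single leaf, with Laplace-style
    smoothing applied to the raw class frequencies."""
    counts = Counter(labels)
    denom = len(labels) + smoothing * len(classes)
    return {c: (counts[c] + smoothing) / denom for c in classes}

# A leaf that received 3 positive and 1 negative training example:
dist = leaf_distribution(["pos", "pos", "pos", "neg"], ["pos", "neg"])
# dist["pos"] == 4/6 and dist["neg"] == 2/6, rather than the raw 3/4 and 1/4
```

    The smoothing keeps small leaves from issuing overconfident 0/1 probability estimates, which matters more for PETs than for plain classification trees.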

  • Speeding Up Relational Reinforcement Learning through the Use of an Incremental First Order Decision Tree Learner
    European conference on Machine Learning, 2001
    Co-Authors: Kurt Driessens, Jan Ramon, Hendrik Blockeel
    Abstract:

    Relational reinforcement learning (RRL) is a learning technique that combines standard reinforcement learning with inductive logic programming, enabling the learning system to exploit structural knowledge about the application domain. This paper discusses an improvement to the original RRL: we introduce TG, a fully incremental first-order Decision Tree learning algorithm, and integrate it into the RRL system to form RRL-TG. We demonstrate the performance gain on experiments similar to those used to demonstrate the behaviour of the original RRL system.

Andrea Passerini - One of the best experts on this subject based on the ideXlab platform.

  • Relational information gain
    Machine Learning 83:219–239, 2011 (DOI 10.1007/s10994-010-5194-7)
    Co-Authors: Marco Lippi, Manfred Jaeger, Paolo Frasconi, Andrea Passerini
    Abstract:

    We introduce relational information gain, a refinement scoring function measuring the informativeness of newly introduced variables. The gain can be interpreted as a conditional entropy in a well-defined sense and can be efficiently approximated. In conjunction with simple greedy general-to-specific search algorithms such as FOIL, it yields an efficient algorithm that is competitive in terms of predictive accuracy and compactness of the learned theory. In conjunction with the Decision Tree Learner TILDE, it offers a beneficial alternative to lookahead, achieving similar performance while significantly reducing the number of evaluated literals.
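    The classical attribute-value information gain that relational information gain refines can be sketched in a few lines of plain Python (an illustrative baseline only, not the paper's relational scoring function, which additionally accounts for the variables a new literal introduces):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(labels, partition):
    """Entropy reduction from splitting `labels` into `partition`:
    H(labels) minus the size-weighted entropy of the parts."""
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part) for part in partition)
    return entropy(labels) - remainder

labels = ["+", "+", "-", "-"]
perfect = information_gain(labels, [["+", "+"], ["-", "-"]])   # 1.0 bit
useless = information_gain(labels, [["+", "-"], ["+", "-"]])   # 0.0 bits
```

    A split that separates the classes perfectly recovers the full entropy of the label set; a split whose parts mirror the original class mixture gains nothing.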

Dae-ki Kang - One of the best experts on this subject based on the ideXlab platform.

  • Learning Decision Trees with taxonomy of propositionalized attributes
    Pattern Recognition, 2009
    Co-Authors: Dae-ki Kang, Kiwook Sohn
    Abstract:

    We consider the problem of exploiting a taxonomy of propositionalized attributes in order to learn compact and robust classifiers. We introduce the propositionalized attribute taxonomy guided Decision Tree Learner (PAT-DTL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact Decision Trees. Since taxonomies are unavailable in most domains, we also introduce the propositionalized attribute taxonomy Learner (PAT-Learner), which automatically constructs a taxonomy from data. PAT-DTL uses top-down and bottom-up search to find a locally optimal cut that corresponds to the literals of decision rules derived from the data and the propositionalized attribute taxonomy. PAT-Learner propositionalizes attributes and hierarchically clusters them, based on the distribution of class labels that co-occur with them, to generate a taxonomy. Our experimental results on UCI repository datasets show that the proposed algorithms can generate Decision Trees that are generally more compact than, and sometimes comparable in accuracy to, those produced by standard Decision Tree Learners.
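    As an illustration of clustering attributes by the class-label distributions they co-occur with, one plausible distance between two attributes is the Jensen-Shannon divergence of their class distributions (a hypothetical choice for this sketch; the paper's exact similarity measure may differ):

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two class
    distributions, given as equal-length probability lists."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return (kl(p, m) + kl(q, m)) / 2

# Attributes co-occurring with near-identical class distributions would
# be merged early by an agglomerative clustering; contrasting ones late.
d_same = js_divergence([0.9, 0.1], [0.9, 0.1])
d_diff = js_divergence([0.9, 0.1], [0.1, 0.9])
```

    A standard agglomerative pass over such pairwise distances yields the attribute taxonomy that PAT-DTL can then cut.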

  • PAKDD - RNBL-MN: a recursive naive bayes Learner for sequence classification
    Advances in Knowledge Discovery and Data Mining, 2006
    Co-Authors: Dae-ki Kang, Adrian Silvescu, Vasant Honavar
    Abstract:

    The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 Decision Tree Learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies comparable to those of support vector machines (SVMs) using similar information.
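    The per-node building block of RNBL-MN, a multinomial-event-model Naive Bayes classifier over token sequences, can be sketched in plain Python (an illustrative fit/predict pair with Laplace smoothing; the recursive tree construction and the paper's stopping criterion are omitted):

```python
import math
from collections import Counter

def fit_multinomial_nb(docs, labels, vocab, smoothing=1.0):
    """Fit a multinomial-event-model Naive Bayes classifier and return
    a predict function: argmax over log P(class) plus the summed
    per-token log P(token | class), with Laplace smoothing."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        # Token counts pooled over all training sequences of class c
        counts = Counter(t for d, y in zip(docs, labels) if y == c for t in d)
        denom = sum(counts.values()) + smoothing * len(vocab)
        loglik[c] = {t: math.log((counts[t] + smoothing) / denom)
                     for t in vocab}

    def predict(doc):
        return max(classes,
                   key=lambda c: prior[c] + sum(loglik[c][t] for t in doc))
    return predict

predict = fit_multinomial_nb(docs=[["a", "a", "b"], ["b", "b", "c"]],
                             labels=["x", "y"],
                             vocab=["a", "b", "c"])
```

    RNBL-MN would fit such a classifier at the root, partition the training sequences by its predictions, and recurse on each partition while its scoring criterion keeps improving.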

Vasant Honavar - One of the best experts on this subject based on the ideXlab platform.

  • PAKDD - RNBL-MN: a recursive naive bayes Learner for sequence classification
    Advances in Knowledge Discovery and Data Mining, 2006
    Co-Authors: Dae-ki Kang, Adrian Silvescu, Vasant Honavar
    Abstract:

    The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 Decision Tree Learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies comparable to those of support vector machines (SVMs) using similar information.

Jing Yang - One of the best experts on this subject based on the ideXlab platform.

  • VAET: A Visual Analytics Approach for E-Transactions Time-Series
    IEEE transactions on visualization and computer graphics, 2014
    Co-Authors: Cong Xie, Wei Chen, Xinxin Huang, Scott Barlowe, Jing Yang
    Abstract:

    Previous studies of E-transaction time-series have mainly focused on finding temporal trends in transaction behavior. Interesting transactions that are time-stamped and situation-relevant may easily be obscured in a large amount of information. This paper proposes a visual analytics system, Visual Analysis of E-transaction Time-Series (VAET), that allows analysts to interactively explore large transaction datasets for insights about time-varying transactions. With a set of analyst-determined training samples, VAET automatically estimates the saliency of each transaction in a large time-series using a probabilistic Decision Tree Learner. It provides an effective time-of-saliency (TOS) map in which analysts can explore a large number of transactions at different time granularities. Interesting transactions are further encoded with KnotLines, a compact visual representation that captures both the temporal variations and the contextual connections of transactions. Analysts can thus explore, select, and investigate knotlines of interest. A case study and a user study with a real E-transaction dataset (26 million records) demonstrate the effectiveness of VAET.