Data Mining Practitioner

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 525 Experts worldwide, ranked by the ideXlab platform.

George Forman - One of the best experts on this subject based on the ideXlab platform.

  • An Extensive Empirical Study of Feature Selection Metrics for Text Classification
    Journal of Machine Learning Research, 2003
    Co-Authors: George Forman
    Abstract:

    Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison of twelve feature selection methods (e.g., Information Gain) evaluated on a benchmark of 229 text classification problem instances gathered from Reuters, TREC, OHSUMED, etc. The results are analyzed from multiple goal perspectives (accuracy, F-measure, precision, and recall), since each is appropriate in different situations. The results reveal that a new feature selection metric we call 'Bi-Normal Separation' (BNS) outperformed the others by a substantial margin in most situations. This margin widened in tasks with high class skew, which is rampant in text classification problems and is particularly challenging for induction algorithms. A new evaluation methodology is offered that focuses on the needs of the Data Mining Practitioner faced with a single dataset who seeks to choose one (or a pair of) metrics that are most likely to yield the best performance. From this perspective, BNS was the top single choice for all goals except precision, for which Information Gain yielded the best result most often. This analysis also revealed, for example, that Information Gain and Chi-Squared have correlated failures, and so they work poorly together. When choosing optimal pairs of metrics for each of the four performance goals, BNS is consistently a member of the pair; e.g., for greatest recall, the pair BNS + F1-measure yielded the best performance on the greatest number of tasks by a considerable margin.

  • Choose Your Words Carefully: An Empirical Study Of Feature Selection Metrics for Text Classification
    2002
    Co-Authors: George Forman
    Abstract:

    Good feature selection is essential for text classification to make it tractable for machine learning, and to improve classification performance. This study benchmarks the performance of twelve feature selection metrics across 229 text classification problems drawn from Reuters, OHSUMED, TREC, etc., using Support Vector Machines. The results are analyzed for various objectives. For best accuracy, F-measure or recall, the findings reveal an outstanding new feature selection metric, "Bi-Normal Separation" (BNS). For precision alone, however, Information Gain (IG) was superior. A new evaluation methodology is offered that focuses on the needs of the Data Mining Practitioner who seeks to choose one or two metrics to try that are most likely to have the best performance for the single dataset at hand. This analysis determined, for example, that IG and Chi-Squared have correlated failures for precision, and that IG paired with BNS is a better choice.
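
The two abstracts above name Bi-Normal Separation (BNS) without defining it. As a rough sketch, assuming the definition given in Forman's JMLR paper, BNS scores a word by |F^-1(tpr) - F^-1(fpr)|, where F^-1 is the inverse cumulative distribution function of the standard normal, and tpr and fpr are the fractions of positive and negative documents that contain the word. The function name, the argument names, and the clipping bound eps = 0.0005 are illustrative assumptions (rates are clipped away from 0 and 1 so the inverse CDF stays finite); this is not code from the study.

```python
from statistics import NormalDist

# Standard normal distribution; inv_cdf is its inverse CDF (quantile function).
_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def bns_score(tp, fp, n_pos, n_neg, eps=0.0005):
    """Bi-Normal Separation for one feature (hypothetical helper).

    tp:    number of positive documents containing the word
    fp:    number of negative documents containing the word
    n_pos: total number of positive documents
    n_neg: total number of negative documents
    eps:   assumed clipping bound that keeps the inverse CDF finite
    """
    tpr = min(max(tp / n_pos, eps), 1.0 - eps)
    fpr = min(max(fp / n_neg, eps), 1.0 - eps)
    return abs(_STD_NORMAL.inv_cdf(tpr) - _STD_NORMAL.inv_cdf(fpr))


if __name__ == "__main__":
    # Skewed toy problem: 100 positive and 1000 negative documents.
    print(round(bns_score(50, 50, 100, 1000), 3))   # word favours the positive class
    print(round(bns_score(10, 100, 100, 1000), 3))  # same rate in both classes
    print(round(bns_score(0, 500, 100, 1000), 3))   # word favours the negative class
```

Under this definition, a word that occurs at the same rate in both classes scores 0, while a word strongly associated with either class scores high, so negatively correlated words can also be selected.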

Hidalgo México - One of the best experts on this subject based on the ideXlab platform.

  • Applying Data Mining Techniques to e-Learning Problems: a Survey and State of the Art: Evolution of Technology and Pedagogy
    Springer-Verlag, 2015
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modelling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviours; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems (www.springerlink.com, © Springer-Verlag Berlin Heidelberg 2007)
    2014
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviors; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

Felix Castro - One of the best experts on this subject based on the ideXlab platform.

  • Applying Data Mining Techniques to e-Learning Problems: a Survey and State of the Art: Evolution of Technology and Pedagogy
    Springer-Verlag, 2015
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modelling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviours; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems (www.springerlink.com, © Springer-Verlag Berlin Heidelberg 2007)
    2014
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviors; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems
    2007
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc.

Francisco Mugica - One of the best experts on this subject based on the ideXlab platform.

  • Applying Data Mining Techniques to e-Learning Problems: a Survey and State of the Art: Evolution of Technology and Pedagogy
    Springer-Verlag, 2015
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modelling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviours; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems (www.springerlink.com, © Springer-Verlag Berlin Heidelberg 2007)
    2014
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviors; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems
    2007
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc.

Angela Nebot - One of the best experts on this subject based on the ideXlab platform.

  • Applying Data Mining Techniques to e-Learning Problems: a Survey and State of the Art: Evolution of Technology and Pedagogy
    Springer-Verlag, 2015
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modelling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviours; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems (www.springerlink.com, © Springer-Verlag Berlin Heidelberg 2007)
    2014
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica, Hidalgo México
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc. Finally, from the standpoint of the e-learning Practitioner, we provide a taxonomy of e-learning problems to which Data Mining techniques have been applied, including, for instance: students' classification based on their learning performance; detection of irregular learning behaviors; e-learning system navigation and interaction optimization; clustering according to similar e-learning system usage; and systems' adaptability to students' requirements and capacities.

  • Applying Data Mining Techniques to e-Learning Problems
    2007
    Co-Authors: Felix Castro, Alfredo Vellido, Angela Nebot, Francisco Mugica
    Abstract:

    This chapter aims to provide an up-to-date snapshot of the current state of research and applications of Data Mining methods in e-learning. The cross-fertilization of both areas is still in its infancy, and even academic references are scarce on the ground, although some leading education-related publications are already beginning to pay attention to this new field. In order to offer a reasonable organization of the available bibliographic information according to different criteria, firstly, and from the Data Mining Practitioner point of view, references are organized according to the type of modeling techniques used, which include: Neural Networks, Genetic Algorithms, Clustering and Visualization Methods, Fuzzy Logic, Intelligent agents, and Inductive Reasoning, amongst others. From the same point of view, the information is organized according to the type of Data Mining problem dealt with: clustering, classification, prediction, etc.