Knowledge Discovery

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

Jeremy J Yang - One of the best experts on this subject based on the ideXlab platform.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    BMC Bioinformatics, 2019
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    arXiv: Information Retrieval, 2018
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Qi Yu, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.
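
The edge2vec abstracts above mention a random-walk style model driven by an edge-type transition matrix. The following is a minimal sketch of that one idea only, not the authors' implementation: it assumes the matrix is already given (the Expectation-Maximization training and the stochastic gradient descent embedding step are omitted), and the toy graph, the edge-type labels, and the names `GRAPH` and `edge_type_walk` are hypothetical.

```python
import random

# Hypothetical toy heterogeneous graph: node -> list of (neighbor, edge_type).
# Edge types: 0 = "gene-gene", 1 = "drug-gene" (labels are illustrative only).
GRAPH = {
    "g1": [("g2", 0), ("d1", 1)],
    "g2": [("g1", 0), ("d1", 1)],
    "d1": [("g1", 1), ("g2", 1)],
}


def edge_type_walk(graph, start, length, trans, rng):
    """Random walk where each step is weighted by trans[prev_type][next_type].

    `trans` is assumed to be a square matrix (list of lists) indexed by edge
    type; here it is supplied by the caller rather than learned.
    """
    walk = [start]
    prev_type = None  # no incoming edge type before the first step
    while len(walk) < length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        if prev_type is None:
            # First step: choose uniformly among outgoing edges.
            weights = [1.0] * len(neighbors)
        else:
            # Later steps: weight each candidate edge by the transition
            # score from the previous edge type to the candidate's type.
            weights = [trans[prev_type][etype] for _, etype in neighbors]
        if sum(weights) <= 0:
            break  # every continuation has zero weight
        node, prev_type = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(node)
    return walk


if __name__ == "__main__":
    # Identity matrix: a walk keeps following the edge type it started on.
    same_type_only = [[1.0, 0.0], [0.0, 1.0]]
    print(edge_type_walk(GRAPH, "g1", 8, same_type_only, random.Random(0)))
```

Walks produced this way could be fed to any skip-gram style embedding trainer; with a uniform matrix the walk reduces to an ordinary unbiased random walk, which is one way to see what the edge-type weighting contributes.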

Lukasz Kurgan - One of the best experts on this subject based on the ideXlab platform.

  • data mining a Knowledge Discovery approach
    2007
    Co-Authors: Krzysztof J Cios, Witold Pedrycz, Roman W Swiniarski, Lukasz Kurgan
    Abstract:

    This comprehensive textbook on data mining details the unique steps of the Knowledge Discovery process that prescribes the sequence in which data mining projects should be performed, from problem and data understanding through data preprocessing to deployment of the results. This Knowledge Discovery approach is what distinguishes Data Mining from other texts in this area. The book provides a suite of exercises and includes links to instructional presentations. Furthermore, it contains appendices of relevant mathematical material.

  • a survey of Knowledge Discovery and data mining process models
    Knowledge Engineering Review, 2006
    Co-Authors: Lukasz Kurgan, Petr Musilek
    Abstract:

    Knowledge Discovery and Data Mining is a very dynamic research and development area that is reaching maturity. As such, it requires stable and well-defined foundations, which are well understood and popularized throughout the community. This survey presents a historical overview, description and future directions concerning a standard for a Knowledge Discovery and Data Mining process model. It presents a motivation for use and a comprehensive comparison of several leading process models, and discusses their applications to both academic and industrial problems. The main goal of this review is the consolidation of the research in this area. The survey also proposes to enhance existing models by embedding other current standards to enable automation and interoperability of the entire process.

  • Knowledge Discovery approach to automated cardiac spect diagnosis
    Artificial Intelligence in Medicine, 2001
    Co-Authors: Lukasz Kurgan, Krzysztof J Cios, Ryszard Tadeusiewicz, Marek R Ogiela, Lucy S Goodenday
    Abstract:

    The paper describes a computerized process of myocardial perfusion diagnosis from cardiac single photon emission computed tomography (SPECT) images using a data mining and Knowledge Discovery approach. We use a six-step Knowledge Discovery process. A database consisting of 267 cleaned patient SPECT images (about 3000 2D images), accompanied by clinical information and physician interpretation, was created first. Then, a new user-friendly algorithm for computerizing the diagnostic process was designed and implemented. SPECT images were processed to extract a set of features, and then explicit rules were generated, using inductive machine learning and heuristic approaches to mimic a cardiologist's diagnosis. The system is able to provide a set of computer diagnoses for cardiac SPECT studies, and can be used as a diagnostic tool by a cardiologist. The achieved results are encouraging because of the high correctness of diagnoses.

Ying Ding - One of the best experts on this subject based on the ideXlab platform.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    BMC Bioinformatics, 2019
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    arXiv: Information Retrieval, 2018
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Qi Yu, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.

Covadonga Fernandez - One of the best experts on this subject based on the ideXlab platform.

  • a survey of data mining and Knowledge Discovery process models and methodologies
    Knowledge Engineering Review, 2010
    Co-Authors: Gonzalo Mariscal, Oscar Marban, Covadonga Fernandez
    Abstract:

    Up to now, many data mining and Knowledge Discovery methodologies and process models have been developed, with varying degrees of success. In this paper, we describe the most used (in industrial and academic projects) and cited (in scientific literature) data mining and Knowledge Discovery methodologies and process models, providing an overview of their evolution along data mining and Knowledge Discovery history and setting down the state of the art in this topic. For every approach, we have provided a brief description of the proposed Knowledge Discovery in databases (KDD) process, discussing the special features and the outstanding advantages and disadvantages of every approach. Apart from that, a global comparison of all presented data mining approaches is provided, focusing on the different steps and tasks in which every approach interprets the whole KDD process. As a result of the comparison, we propose a new data mining and Knowledge Discovery process named refined data mining process for developing any kind of data mining and Knowledge Discovery project. The refined data mining process is built on specific steps taken from analyzed approaches.

Gang Fu - One of the best experts on this subject based on the ideXlab platform.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    BMC Bioinformatics, 2019
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.

  • edge2vec representation learning using edge semantics for biomedical Knowledge Discovery
    arXiv: Information Retrieval, 2018
    Co-Authors: Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Jeremy J Yang, Christopher Gessner, Brian Foote, David J Wild, Qi Yu, Ying Ding
    Abstract:

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining Knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and Knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and Knowledge Discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical Knowledge Discovery applicability.