Protein Function Prediction

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

David T Jones - One of the best experts on this subject based on the ideXlab platform.

  • Protein Function Prediction is improved by creating synthetic feature samples with generative adversarial networks
    Nature Machine Intelligence, 2020
    Co-Authors: Cen Wan, David T Jones
    Abstract:

    Protein Function Prediction is a challenging but important task in bioinformatics. Many Prediction methods have been developed, but are still limited by the bottleneck on training sample quantity. Therefore, it is valuable to develop a data augmentation method that can generate high-quality synthetic samples to further improve the accuracy of Prediction methods. In this work, we propose a novel generative adversarial networks-based method, FFPred-GAN, to accurately learn the high-dimensional distributions of Protein sequence-based biophysical features and also generate high-quality synthetic Protein feature samples. The experimental results suggest that the synthetic Protein feature samples are successful in improving the Prediction accuracy for all three domains of Gene Ontology through augmentation of the original training Protein feature samples. Training machine learning models to predict the Function of Proteins is limited by the availability of only a small amount of labelled training data. Training can be improved by employing generative adversarial networks to generate additional synthetic Protein samples.

  • Improving Protein Function Prediction with synthetic feature samples created by generative adversarial networks
    bioRxiv, 2019
    Co-Authors: Cen Wan, David T Jones
    Abstract:

    Protein Function Prediction is a challenging but important task in bioinformatics. Many Prediction methods have been developed, but are still limited by the bottleneck on training sample quantity. Therefore, it is valuable to develop a data augmentation method that can generate high-quality synthetic samples to further improve the accuracy of Prediction methods. In this work, we propose a novel generative adversarial networks-based method, namely FFPred-GAN, to accurately learn the high-dimensional distributions of Protein sequence-based biophysical features and also generate high-quality synthetic Protein feature samples. The experimental results suggest that the synthetic Protein feature samples are successful in improving the Prediction accuracy for all three domains of the Gene Ontology through augmentation of the original training Protein feature samples.

  • Using deep maxout neural networks to improve the accuracy of Function Prediction from Protein interaction networks
    PLOS ONE, 2019
    Co-Authors: Cen Wan, Domenico Cozzetto, David T Jones
    Abstract:

    Protein-Protein interaction network data provides valuable information that infers direct links between genes and their biological roles. This information brings a fundamental hypothesis for Protein Function Prediction that interacting Proteins tend to have similar Functions. With the help of recently-developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract Functional representations that encode direct links between Protein-Protein interactions information and Protein Function. Our novel method, STRING2GO, successfully adopts deep maxout neural networks to learn Functional representations simultaneously encoding both Protein-Protein interactions and Functional predictive information. The experimental results show that STRING2GO outperforms other Protein-Protein interaction network-based Prediction methods and one benchmark method adopted in a recent large scale Protein Function Prediction competition.

  • Using deep maxout neural networks to improve the accuracy of Function Prediction from Protein interaction networks
    bioRxiv, 2018
    Co-Authors: Cen Wan, Domenico Cozzetto, David T Jones
    Abstract:

    Protein-Protein interaction network data provides valuable information that infers direct links between genes and their biological roles. This information brings a fundamental hypothesis for Protein Function Prediction that interacting Proteins tend to have similar Functions. With the help of recently-developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract Functional representations that encode direct links between Protein-Protein interactions information and Protein Function. Our novel method, STRING2GO, successfully adopts deep maxout neural networks to learn Functional representations simultaneously encoding both Protein-Protein interactions and Functional predictive information. The experimental results show that STRING2GO outperforms other network embedding-based Prediction methods and one benchmark method adopted in a recent large scale Protein Function Prediction competition.

  • Protein Function Prediction by massive integration of evolutionary analyses and multiple data sources
    BMC Bioinformatics, 2013
    Co-Authors: Domenico Cozzetto, Daniel W A Buchan, Kevin Bryson, David T Jones
    Abstract:

    Background: Accurate Protein Function annotation is a severe bottleneck when utilizing the deluge of high-throughput, next-generation sequencing data. Keeping database annotations up-to-date has become a major scientific challenge that requires the development of reliable automatic predictors of Protein Function. The CAFA experiment provided a unique opportunity to undertake comprehensive 'blind testing' of many diverse approaches for automated Function Prediction. We report on the methodology we used for this challenge and on the lessons we learnt.
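
The maxout units named in the STRING2GO abstracts above can be illustrated with a minimal sketch. A maxout unit outputs the element-wise maximum over k affine transformations of its input. The function names, toy weights, and layer sizes below are illustrative assumptions and are not taken from STRING2GO itself.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Return W x + b for one linear piece (one row of weights per output unit)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def maxout_layer(x: Vector, pieces: List[Matrix], biases: List[Vector]) -> Vector:
    """Maxout activation: element-wise maximum over k affine pieces."""
    outputs = [affine(x, w, b) for w, b in zip(pieces, biases)]
    return [max(values) for values in zip(*outputs)]


# Toy example (assumed values): 2 inputs, 2 output units, k = 2 pieces per unit.
x = [1.0, -2.0]
pieces = [
    [[1.0, 0.0], [0.0, 1.0]],    # piece 1: identity
    [[-1.0, 0.0], [0.0, -1.0]],  # piece 2: negation
]
biases = [[0.0, 0.0], [0.0, 0.0]]

# With these two pieces the layer computes the element-wise absolute value.
print(maxout_layer(x, pieces, biases))
```

Because the maximum is taken over learned linear pieces, a maxout unit can represent a piecewise-linear activation rather than relying on a fixed one; the identity/negation pair here is only the simplest hand-picked case.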

Xiaodi Huang - One of the best experts on this subject based on the ideXlab platform.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    Nucleic Acids Research, 2019
    Co-Authors: Xiaodi Huang, Ronghui You, Fengzhu Sun, Shuwei Yao, Yi Xiong, Hiroshi Mamitsuka
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. AFP can be regarded as a large-scale multi-label classification problem in which a Protein can be associated with multiple Gene Ontology terms as its labels. Building on GOLabeler, our state-of-the-art method in the third Critical Assessment of Functional Annotation (CAFA3), we propose NetGO, a web server that further improves the performance of large-scale AFP by incorporating massive Protein-Protein network information. Specifically, the advantages of NetGO in using network information are threefold: (i) NetGO relies on a powerful learning to rank framework from machine learning to effectively integrate both sequence and network information of Proteins; (ii) NetGO uses the massive network information of all species (>2000) in STRING (rather than only some specific species); and (iii) NetGO can still use network information to annotate a Protein by homology transfer, even if it is not contained in STRING. Separating training and testing data with the same time-delayed settings as CAFA, we comprehensively examined the performance of NetGO. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler and other competing methods. The NetGO web server is freely available at http://issubmission.sjtu.edu.cn/netgo/.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    bioRxiv, 2018
    Co-Authors: Ronghui You, Xiaodi Huang, Fengzhu Sun, Shuwei Yao, Hiroshi Mamitsuka, Shanfeng Zhu
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. In essence, AFP is a large-scale multi-label classification over pairs of Proteins and GO terms. Existing AFP approaches, however, have limitations on both the Protein side and the GO term side. Using various sequence information and the robust learning to rank (LTR) framework, we have developed GOLabeler, a state-of-the-art approach in CAFA3, which overcomes limitations on the GO term side, such as imbalanced GO terms. On the Protein side, however, the abundant Protein information available beyond sequences has not been used effectively for large-scale AFP in CAFA. We propose NetGO, which improves large-scale AFP with massive network information. The novelties of NetGO in using network information are threefold: 1) the powerful LTR framework of NetGO efficiently and effectively integrates both sequence and network information, which makes large-scale AFP straightforward; 2) NetGO can use the whole, massive network information of all species (>2000) in STRING (rather than only high-confidence links and/or some specific species); and 3) NetGO can still use network information to annotate a Protein by homology transfer even if it is not covered in STRING. We examined the performance of NetGO under numerous experimental settings, such as general performance comparison, species-specific Prediction, and Prediction on difficult Proteins, using training and test data separated by the time-delayed settings of CAFA. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler, DeepGO, and other baseline methods. In addition, several interesting findings from our experiments on NetGO should be useful for future AFP research.

  • DeepText2GO: Improving large-scale Protein Function Prediction with deep semantic text representation
    Methods, 2018
    Co-Authors: Xiaodi Huang
    Abstract:

    As of April 2018, UniProtKB has collected more than 115 million Protein sequences. Less than 0.15% of these Proteins, however, have been associated with experimental GO annotations. As such, the use of automatic Protein Function Prediction (AFP) to reduce this huge gap becomes increasingly important. The previous studies conclude that sequence homology based methods are highly effective in AFP. In addition, mining motif, domain, and Functional information from Protein sequences has been found very helpful for AFP. Other than sequences, alternative information sources such as text, however, may be useful for AFP as well. Instead of using BOW (bag of words) representation in traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available Protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority.
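
The DeepText2GO abstract above says that text-based and sequence-based predictions are merged "by means of a consensus approach" but does not say how. As a rough sketch of one possible consensus, the code below takes a weighted average of per-GO-term scores from two predictors. The weighted-average rule, the function name `consensus_scores`, the `text_weight` parameter, and all scores are assumptions for illustration, not the published method.

```python
from typing import Dict


def consensus_scores(
    text_scores: Dict[str, float],
    seq_scores: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of two per-GO-term score dictionaries.

    A term missing from one predictor is treated as scored 0.0 by it.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    terms = set(text_scores) | set(seq_scores)
    return {
        term: text_weight * text_scores.get(term, 0.0)
        + (1.0 - text_weight) * seq_scores.get(term, 0.0)
        for term in terms
    }


# Hypothetical scores for a single protein; the GO identifiers are example keys only.
text_scores = {"GO:0003677": 0.8, "GO:0005634": 0.4}
seq_scores = {"GO:0003677": 0.6, "GO:0006355": 0.5}

combined = consensus_scores(text_scores, seq_scores)
for term, score in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 3))
```

A term supported by both predictors ends up ranked above a term supported by only one, which is the basic behaviour a consensus step is meant to provide.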

Ronghui You - One of the best experts on this subject based on the ideXlab platform.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    Nucleic Acids Research, 2019
    Co-Authors: Xiaodi Huang, Ronghui You, Fengzhu Sun, Shuwei Yao, Yi Xiong, Hiroshi Mamitsuka
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. AFP can be regarded as a large-scale multi-label classification problem in which a Protein can be associated with multiple Gene Ontology terms as its labels. Building on GOLabeler, our state-of-the-art method in the third Critical Assessment of Functional Annotation (CAFA3), we propose NetGO, a web server that further improves the performance of large-scale AFP by incorporating massive Protein-Protein network information. Specifically, the advantages of NetGO in using network information are threefold: (i) NetGO relies on a powerful learning to rank framework from machine learning to effectively integrate both sequence and network information of Proteins; (ii) NetGO uses the massive network information of all species (>2000) in STRING (rather than only some specific species); and (iii) NetGO can still use network information to annotate a Protein by homology transfer, even if it is not contained in STRING. Separating training and testing data with the same time-delayed settings as CAFA, we comprehensively examined the performance of NetGO. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler and other competing methods. The NetGO web server is freely available at http://issubmission.sjtu.edu.cn/netgo/.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    bioRxiv, 2018
    Co-Authors: Ronghui You, Xiaodi Huang, Fengzhu Sun, Shuwei Yao, Hiroshi Mamitsuka, Shanfeng Zhu
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. In essence, AFP is a large-scale multi-label classification over pairs of Proteins and GO terms. Existing AFP approaches, however, have limitations on both the Protein side and the GO term side. Using various sequence information and the robust learning to rank (LTR) framework, we have developed GOLabeler, a state-of-the-art approach in CAFA3, which overcomes limitations on the GO term side, such as imbalanced GO terms. On the Protein side, however, the abundant Protein information available beyond sequences has not been used effectively for large-scale AFP in CAFA. We propose NetGO, which improves large-scale AFP with massive network information. The novelties of NetGO in using network information are threefold: 1) the powerful LTR framework of NetGO efficiently and effectively integrates both sequence and network information, which makes large-scale AFP straightforward; 2) NetGO can use the whole, massive network information of all species (>2000) in STRING (rather than only high-confidence links and/or some specific species); and 3) NetGO can still use network information to annotate a Protein by homology transfer even if it is not covered in STRING. We examined the performance of NetGO under numerous experimental settings, such as general performance comparison, species-specific Prediction, and Prediction on difficult Proteins, using training and test data separated by the time-delayed settings of CAFA. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler, DeepGO, and other baseline methods. In addition, several interesting findings from our experiments on NetGO should be useful for future AFP research.

  • GOLabeler: improving sequence-based large-scale Protein Function Prediction by learning to rank
    Bioinformatics, 2018
    Co-Authors: Ronghui You, Fengzhu Sun, Yi Xiong, Hiroshi Mamitsuka, Zihan Zhang, Shanfeng Zhu
    Abstract:

    Motivation: Gene Ontology (GO) has been widely used to annotate Functions of Proteins and understand their biological roles. Currently <1% of more than 70 million Proteins in UniProtKB have experimental GO annotations, implying a strong need for automated Function Prediction (AFP) of Proteins, where AFP is a hard multilabel classification problem because one Protein can have a diverse number of GO terms. Most of these Proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, but they do not necessarily work well for so-called difficult Proteins, which have <60% sequence identity to Proteins that already have annotations. Thus, the vital and challenging problem now is how to develop a method for SAFP, particularly for difficult Proteins. Methods: The key to this method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and integrate them into a predictor in a manner that is both effective and efficient. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, in the framework of learning to rank (LTR), a paradigm of machine learning that is especially powerful for multilabel classification. Results: The empirical results obtained by examining GOLabeler extensively and thoroughly on large-scale datasets revealed numerous favorable aspects of GOLabeler, including a significant performance advantage over state-of-the-art AFP methods. Availability and implementation: http://datamining-iip.fudan.edu.cn/golabeler. Supplementary information: Supplementary data are available at Bioinformatics online.

  • DeepText2GO: improving large-scale Protein Function Prediction with deep semantic text representation
    Bioinformatics and Biomedicine, 2017
    Co-Authors: Ronghui You, Shanfeng Zhu
    Abstract:

    UniProtKB had collected more than 88 million Protein sequences by July 2017. Less than 0.2% of these Proteins, however, have experimental GO annotations. To reduce this huge gap, automatic Protein Function Prediction (AFP) is becoming increasingly important. Results on the CAFA (Critical Assessment of Protein Function Annotation algorithms) benchmark demonstrate that sequence homology-based methods are highly competitive in AFP. One pressing issue is how to incorporate information sources other than sequence into AFP. In contrast to the BOW (bag of words) representation used in traditional text-based AFP, we propose a new method called DeepText2GO that improves large-scale AFP by using deep semantic text representation. Furthermore, DeepText2GO integrates both text-based and sequence homology-based methods through a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence homology-based methods, validating its superiority.

  • GOLabeler: improving sequence-based large-scale Protein Function Prediction by learning to rank
    bioRxiv, 2017
    Co-Authors: Ronghui You, Fengzhu Sun, Yi Xiong, Hiroshi Mamitsuka, Zihan Zhang, Shanfeng Zhu
    Abstract:

    Motivation: Gene Ontology (GO) has been widely used to annotate Functions of Proteins and understand their biological roles. Currently only <1% of more than 70 million Proteins in UniProtKB have experimental GO annotations, implying the strong necessity of automated Function Prediction (AFP) of Proteins, where AFP is a hard multi-label classification problem due to one Protein with a diverse number of GO terms. Most of these Proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, while they do not necessarily work well for so-called difficult Proteins, which have <60% sequence identity to Proteins with annotations already. Thus, the vital and challenging problem now is to develop a method for SAFP, particularly for difficult Proteins. Methods: The key of this method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and integrate them into a predictor in an efficient and also effective manner. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, etc., in the framework of learning to rank (LTR), a new paradigm of machine learning, especially powerful for multi-label classification. Results: The empirical results obtained by examining GOLabeler extensively and thoroughly by using large-scale datasets revealed numerous favorable aspects of GOLabeler, including significant performance advantage over state-of-the-art AFP methods.
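
The GOLabeler abstracts above list "GO term frequency" among the features behind the five component classifiers. A minimal sketch of a frequency-based scorer follows, assuming it simply scores each GO term by the fraction of training proteins annotated with it and gives every query protein the same ranking. That reading, the function name `go_term_frequencies`, and the toy annotations are assumptions for illustration; the abstracts do not spell out the component's details.

```python
from collections import Counter
from typing import Dict, Set


def go_term_frequencies(annotations: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of training proteins annotated with each GO term."""
    if not annotations:
        return {}
    counts = Counter(term for terms in annotations.values() for term in terms)
    n_proteins = len(annotations)
    return {term: count / n_proteins for term, count in counts.items()}


# Hypothetical training annotations: protein identifier -> set of GO terms.
train = {
    "P1": {"GO:0003677", "GO:0005634"},
    "P2": {"GO:0003677"},
    "P3": {"GO:0005634", "GO:0006355"},
    "P4": {"GO:0003677"},
}

freq = go_term_frequencies(train)

# Rank terms by descending frequency, breaking ties by identifier.
ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

Such a scorer ignores the query sequence entirely, so on its own it is only a baseline; its value in an ensemble is as a prior over GO terms that other, sequence-aware components can refine.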

Hagit Shatkay - One of the best experts on this subject based on the ideXlab platform.

  • Protein Function Prediction using text-based features extracted from the biomedical literature: the CAFA challenge
    BMC Bioinformatics, 2013
    Co-Authors: Andrew Wong, Hagit Shatkay
    Abstract:

    Background: Advances in sequencing technology over the past decade have resulted in an abundance of sequenced Proteins whose Function is yet unknown. As such, computational systems that can automatically predict and annotate Protein Function are in demand. Most computational systems use features derived from Protein sequence or Protein structure to predict Function. In an earlier work, we demonstrated the utility of biomedical literature as a source of text features for predicting Protein subcellular location. We have also shown that the combination of text-based and sequence-based Prediction improves the performance of location predictors. Following up on this work, for the Critical Assessment of Function Annotations (CAFA) Challenge, we developed a text-based system that aims to predict molecular Function and biological process (using Gene Ontology terms) for unannotated Proteins. In this paper, we present the preliminary work and evaluation that we performed for our system, as part of the CAFA challenge.

Fengzhu Sun - One of the best experts on this subject based on the ideXlab platform.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    Nucleic Acids Research, 2019
    Co-Authors: Xiaodi Huang, Ronghui You, Fengzhu Sun, Shuwei Yao, Yi Xiong, Hiroshi Mamitsuka
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. AFP can be regarded as a large-scale multi-label classification problem in which a Protein can be associated with multiple Gene Ontology terms as its labels. Building on GOLabeler, our state-of-the-art method in the third Critical Assessment of Functional Annotation (CAFA3), we propose NetGO, a web server that further improves the performance of large-scale AFP by incorporating massive Protein-Protein network information. Specifically, the advantages of NetGO in using network information are threefold: (i) NetGO relies on a powerful learning to rank framework from machine learning to effectively integrate both sequence and network information of Proteins; (ii) NetGO uses the massive network information of all species (>2000) in STRING (rather than only some specific species); and (iii) NetGO can still use network information to annotate a Protein by homology transfer, even if it is not contained in STRING. Separating training and testing data with the same time-delayed settings as CAFA, we comprehensively examined the performance of NetGO. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler and other competing methods. The NetGO web server is freely available at http://issubmission.sjtu.edu.cn/netgo/.

  • NetGO: improving large-scale Protein Function Prediction with massive network information
    bioRxiv, 2018
    Co-Authors: Ronghui You, Xiaodi Huang, Fengzhu Sun, Shuwei Yao, Hiroshi Mamitsuka, Shanfeng Zhu
    Abstract:

    Automated Function Prediction (AFP) of Proteins is of great significance in biology. In essence, AFP is a large-scale multi-label classification over pairs of Proteins and GO terms. Existing AFP approaches, however, have limitations on both the Protein side and the GO term side. Using various sequence information and the robust learning to rank (LTR) framework, we have developed GOLabeler, a state-of-the-art approach in CAFA3, which overcomes limitations on the GO term side, such as imbalanced GO terms. On the Protein side, however, the abundant Protein information available beyond sequences has not been used effectively for large-scale AFP in CAFA. We propose NetGO, which improves large-scale AFP with massive network information. The novelties of NetGO in using network information are threefold: 1) the powerful LTR framework of NetGO efficiently and effectively integrates both sequence and network information, which makes large-scale AFP straightforward; 2) NetGO can use the whole, massive network information of all species (>2000) in STRING (rather than only high-confidence links and/or some specific species); and 3) NetGO can still use network information to annotate a Protein by homology transfer even if it is not covered in STRING. We examined the performance of NetGO under numerous experimental settings, such as general performance comparison, species-specific Prediction, and Prediction on difficult Proteins, using training and test data separated by the time-delayed settings of CAFA. Experimental results clearly demonstrate that NetGO significantly outperforms GOLabeler, DeepGO, and other baseline methods. In addition, several interesting findings from our experiments on NetGO should be useful for future AFP research.

  • GOLabeler: improving sequence-based large-scale Protein Function Prediction by learning to rank
    Bioinformatics, 2018
    Co-Authors: Ronghui You, Fengzhu Sun, Yi Xiong, Hiroshi Mamitsuka, Zihan Zhang, Shanfeng Zhu
    Abstract:

    Motivation: Gene Ontology (GO) has been widely used to annotate Functions of Proteins and understand their biological roles. Currently <1% of more than 70 million Proteins in UniProtKB have experimental GO annotations, implying a strong need for automated Function Prediction (AFP) of Proteins, where AFP is a hard multilabel classification problem because one Protein can have a diverse number of GO terms. Most of these Proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, but they do not necessarily work well for so-called difficult Proteins, which have <60% sequence identity to Proteins that already have annotations. Thus, the vital and challenging problem now is how to develop a method for SAFP, particularly for difficult Proteins. Methods: The key to this method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and integrate them into a predictor in a manner that is both effective and efficient. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, in the framework of learning to rank (LTR), a paradigm of machine learning that is especially powerful for multilabel classification. Results: The empirical results obtained by examining GOLabeler extensively and thoroughly on large-scale datasets revealed numerous favorable aspects of GOLabeler, including a significant performance advantage over state-of-the-art AFP methods. Availability and implementation: http://datamining-iip.fudan.edu.cn/golabeler. Supplementary information: Supplementary data are available at Bioinformatics online.

  • GOLabeler: improving sequence-based large-scale Protein Function Prediction by learning to rank
    bioRxiv, 2017
    Co-Authors: Ronghui You, Fengzhu Sun, Yi Xiong, Hiroshi Mamitsuka, Zihan Zhang, Shanfeng Zhu
    Abstract:

    Motivation: Gene Ontology (GO) has been widely used to annotate Functions of Proteins and understand their biological roles. Currently only <1% of more than 70 million Proteins in UniProtKB have experimental GO annotations, implying the strong necessity of automated Function Prediction (AFP) of Proteins, where AFP is a hard multi-label classification problem due to one Protein with a diverse number of GO terms. Most of these Proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, while they do not necessarily work well for so-called difficult Proteins, which have <60% sequence identity to Proteins with annotations already. Thus, the vital and challenging problem now is to develop a method for SAFP, particularly for difficult Proteins. Methods: The key of this method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and integrate them into a predictor in an efficient and also effective manner. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, etc., in the framework of learning to rank (LTR), a new paradigm of machine learning, especially powerful for multi-label classification. Results: The empirical results obtained by examining GOLabeler extensively and thoroughly by using large-scale datasets revealed numerous favorable aspects of GOLabeler, including significant performance advantage over state-of-the-art AFP methods.

  • Diffusion kernel-based logistic regression models for Protein Function Prediction
    OMICS: A Journal of Integrative Biology, 2006
    Co-Authors: Hyunju Lee, Minghua Deng, Fengzhu Sun, Ting Chen
    Abstract:

    Assigning Functions to unknown Proteins is one of the most important problems in proteomics. Several approaches have used Protein-Protein interaction data to predict Protein Functions. We previously developed a Markov random field (MRF) based method to infer a Protein's Functions using Protein-Protein interaction data and the Functional annotations of its Protein interaction partners. In the original model, only direct interactions were considered and each Function was considered separately. In this study, we develop a new model which extends direct interactions to all neighboring Proteins, and one Function to multiple Functions. The goal is to understand a Protein's Function based on information on all the neighboring Proteins in the interaction network. We first developed a novel kernel logistic regression (KLR) method based on diffusion kernels for Protein interaction networks. The diffusion kernels provide means to incorporate all neighbors of Proteins in the network. Second, we identified a set of Functions that are highly correlated with the Function of interest, referred to as the correlated Functions, using the chi-square test. Third, the correlated Functions were incorporated into our new KLR model. Fourth, we extended our model by incorporating multiple biological data sources such as Protein domains, Protein complexes, and gene expressions by converting them into networks. We showed that the KLR approach of incorporating all Protein neighbors significantly improved the accuracy of Protein Function Predictions over the MRF model. The incorporation of multiple data sets also improved Prediction accuracy. The Prediction accuracy is comparable to another Protein Function classifier based on the support vector machine (SVM), using a diffusion kernel. The advantages of the KLR model include its simplicity as well as its ability to explore the contribution of neighbors to the Functions of Proteins of interest.
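
The diffusion kernel mentioned in the abstract above can be sketched on a toy interaction network. The code assumes the standard graph diffusion kernel K = exp(-tau * L), where L = D - A is the graph Laplacian of the adjacency matrix A, and approximates the matrix exponential with a truncated Taylor series. The parameter name `tau`, the truncation length, and the three-protein path network are illustrative assumptions; the kernel logistic regression model itself is not reproduced here.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small dense matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def diffusion_kernel(adjacency: Matrix, tau: float = 1.0, terms: int = 40) -> Matrix:
    """Approximate K = exp(-tau * L), with L = D - A, by a truncated Taylor series."""
    n = len(adjacency)
    # m = -tau * L = tau * (A - D), where D holds the node degrees on its diagonal.
    m = [
        [tau * (adjacency[i][j] - (sum(adjacency[i]) if i == j else 0.0)) for j in range(n)]
        for i in range(n)
    ]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]  # holds m^k / k!, starting at k = 0
    for k in range(1, terms):
        power = [[v / k for v in row] for row in matmul(power, m)]
        result = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(result, power)]
    return result


# Toy network (assumed): three proteins on a path, 0 - 1 - 2.
adjacency = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
k_path = diffusion_kernel(adjacency, tau=1.0)
for row in k_path:
    print([round(v, 4) for v in row])
```

In this toy network proteins 0 and 2 do not interact directly, yet their kernel entry is positive, which is how a diffusion kernel brings indirect neighbors into the model; the direct neighbor pair (0, 1) still receives the larger similarity.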