The Experts below are selected from a list of 2271 Experts worldwide ranked by ideXlab platform
Kuochen Chou - One of the best experts on this subject based on the ideXlab platform.
-
iSulfoTyr-PseAAC: Identify Tyrosine Sulfation Sites by Incorporating Statistical Moments via Chou's 5-steps Rule and Pseudo Components
Current Genomics, 2019. Co-Authors: Omar M Barukab, Yaser Daanial Khan, Sher Afzal Khan, Kuochen Chou. Abstract: Background: Amino acid residues in proteins undergo post-translational modification (PTM) during protein synthesis, a chemical and physical change to an amino acid that in turn alters the behavioral properties of the protein. Tyrosine sulfation is a ubiquitous post-translational modification known to be associated with the regulation of various biological functions and pathological processes, so identifying it is necessary to understand its mechanism. Experimental determination through site-directed mutagenesis and high-throughput mass spectrometry is costly and time-consuming; a reliable computational model is therefore required for identifying sulfotyrosine sites. Methodology: In this paper, we present a computational model for predicting sulfotyrosine sites, named iSulfoTyr-PseAAC, in which feature vectors are constructed from statistical moments of protein amino acid sequences and various position- and composition-relative features. These features are incorporated into PseAAC. The model is validated by the jackknife test, cross-validation, self-consistency testing, and independent testing. Results: The accuracy determined through validation was 93.93% for the jackknife test, 95.16% for cross-validation, 94.3% for self-consistency, and 94.3% for independent testing. Conclusion: The proposed model performs better than the existing predictors; however, the accuracy can be improved further in the future as the number of known sulfotyrosine sites in proteins increases.
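The jackknife test named in this abstract is leave-one-out validation: each sample is held out once, the predictor is built from the remaining samples, and the held-out sample is scored. The sketch below illustrates only that procedure. The 1-nearest-neighbour classifier, the function names, and the toy data are assumptions made for illustration and are not the predictor or the features used in the paper.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Toy 1-nearest-neighbour classifier (squared Euclidean distance).

    Stand-in model only; the paper's actual predictor is not reproduced here.
    """
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Hold out each sample once, train on the rest, score the held-out sample."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical two-class toy data: two tight clusters.
toy_features = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
toy_labels = [0, 0, 1, 1]
print(jackknife_accuracy(toy_features, toy_labels))
```

Because every sample is tested exactly once and no random split is involved, the jackknife gives a single, reproducible accuracy for a given dataset, at the cost of retraining once per sample.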
-
iRNA-AI: Identifying the Adenosine to Inosine Editing Sites in RNA Sequences
Oncotarget, 2017. Co-Authors: Wei Chen, Pengmian Feng, Hui Ding, Hao Lin, Hui Yang, Kuochen Chou. Abstract: Catalyzed by adenosine deaminase (ADAR), adenosine-to-inosine (A-to-I) editing in RNA is not only involved in various important biological processes but also closely associated with a series of major diseases. Therefore, knowledge about the A-to-I editing sites in RNA is crucially important for both basic research and drug development. Given an uncharacterized RNA sequence that contains many adenosine (A) residues, can we identify which of them can undergo A-to-I editing and which cannot? Unfortunately, so far no computational method has been developed to address this important problem based on RNA sequence information alone. To fill this gap, we have proposed a predictor called iRNA-AI, which incorporates the chemical properties of nucleotides and their sliding occurrence density distribution along an RNA sequence into the general form of pseudo nucleotide composition (PseKNC). The rigorous jackknife test and independent dataset test show that the performance of the proposed predictor is quite promising. For the convenience of most experimental scientists, a user-friendly web server for iRNA-AI has been established at http://lin.uestc.edu.cn/server/iRNA-AI/, by which users can easily get their desired results without needing to go through the mathematical details.
-
iRNA-PseU: Identifying RNA Pseudouridine Sites
Molecular Therapy. Nucleic Acids, 2016. Co-Authors: Wei Chen, Hua Tang, Hao Lin, Kuochen Chou. Abstract: As the most abundant RNA modification, pseudouridine plays important roles in many biological processes. Occurring at the uridine site and catalyzed by pseudouridine synthase, the modification has been observed in nearly all kinds of RNA, including transfer RNA, messenger RNA, small nuclear or nucleolar RNA, and ribosomal RNA. Accordingly, its importance to basic research and drug development is self-evident. Although some experimental technologies have been developed to detect pseudouridine sites, they are both time-consuming and expensive. Facing the explosive growth of RNA sequences in the postgenomic age, we are challenged to address the problem by computational approaches: for an uncharacterized RNA sequence, can we predict which of its uridine sites can be modified to pseudouridine and which cannot? Here a predictor called "iRNA-PseU" is proposed, which incorporates the chemical properties of nucleotides and their occurrence frequency density distributions into the general form of pseudo nucleotide composition (PseKNC). The rigorous jackknife test, independent dataset test, and practical genome-wide analysis demonstrate that the proposed predictor remarkably outperforms its counterpart. For the convenience of most experimental scientists, a web server for iRNA-PseU was established at http://lin.uestc.edu.cn/server/iRNA-PseU, by which users can easily get their desired results without needing to go through the mathematical details.
-
Identification of DNA-Binding Proteins by Incorporating Evolutionary Information into Pseudo Amino Acid Composition via the Top-n-gram Approach
Journal of Biomolecular Structure & Dynamics, 2015. Co-Authors: Ruifeng Xu, Yulan He, Jiyun Zhou, Xiaolong Wang, Kuochen Chou. Abstract: DNA-binding proteins are crucial for various cellular processes and hence have become an important target for both basic research and drug development. With the avalanche of protein sequences generated in the postgenomic age, it is highly desirable to establish an automated method for rapidly and accurately identifying DNA-binding proteins based on their sequence information alone. Because all biological species have developed from a very limited number of ancestral species, it is important to take evolutionary information into account in developing such a high-throughput tool. In view of this, a new predictor was proposed by incorporating evolutionary information into the general form of pseudo amino acid composition via the top-n-gram approach. Comparison with existing methods via both the jackknife test and an independent dataset test showed that the new predictor outperformed its counterparts. It is anticipated that the new predictor may become a useful vehicle for identifying DNA-binding proteins. It has not escaped our notice that the novel approach of extracting evolutionary information into the formulation of statistical samples can be used to identify many other protein attributes as well.
-
iMethyl-PseAAC: Identification of Protein Methylation Sites via a Pseudo Amino Acid Composition Approach
BioMed Research International, 2014. Co-Authors: Wangren Qiu, Xuan Xiao, Weizhong Lin, Kuochen Chou. Abstract: Before becoming native proteins during biosynthesis, the polypeptide chains created by the ribosome's translation of mRNA undergo a series of "product-forming" steps, such as cutting, folding, and posttranslational modification (PTM). Knowledge of PTMs in proteins is crucial for dynamic proteome analysis of various human diseases and epigenetic inheritance. One of the most important PTMs is Arg- or Lys-methylation, which occurs on arginine or lysine, respectively. Given a protein, which of its Arg (or Lys) sites can be methylated, and which cannot? This is the first important problem for understanding the methylation mechanism and drug development in depth. With the avalanche of protein sequences generated in the postgenomic age, its urgency has become self-evident. To address this problem, we proposed a new predictor, called iMethyl-PseAAC. In the prediction system, a peptide sample is formulated as a 346-dimensional vector, formed by incorporating its physicochemical, sequence evolution, biochemical, and structural disorder information into the general form of pseudo amino acid composition. The rigorous jackknife test and independent dataset test showed that iMethyl-PseAAC was superior to any of the existing predictors in this area.
Yudong Cai
-
A Hybrid Method for Prediction and Repositioning of Drug Anatomical Therapeutic Chemical Classes
Molecular BioSystems, 2014. Co-Authors: Lei Chen, Tao Huang, Ning Zhang, Yudong Cai. Abstract: In the Anatomical Therapeutic Chemical (ATC) classification system, therapeutic drugs are divided into 14 main classes according to the organ or system on which they act and their chemical, pharmacological and therapeutic properties. This system, recommended by the World Health Organization (WHO), provides a global standard for classifying medical substances and serves as a tool for international drug utilization research to improve the quality of drug use. In view of this, it is necessary to develop effective computational prediction methods to identify the ATC class of a given drug, which could thereby facilitate further analysis of this system. In this study, we attempted to develop a prediction method, and to gain insights from it, by utilizing ontology information of drug compounds. Since only about one-fourth of the drugs in the ATC classification system have ontology information, a hybrid prediction method combining the ontology information, chemical interaction information and chemical structure information of drug compounds was proposed for the prediction of drug ATC classes. Using the jackknife test, the first-order prediction accuracies for identifying the 14 main ATC classes in the training dataset, the internal validation dataset and the external validation dataset were 75.90%, 75.70% and 66.36%, respectively. Analysis of some samples with false-positive predictions in the internal and external validation datasets indicated that some of them may in fact have a relationship with the false-positive predicted ATC class, suggesting novel uses of these drugs. It is conceivable that the proposed method could be used as an efficient tool to identify the ATC classes of novel drugs or to discover novel uses of known drugs.
-
A New Hybrid Approach to Predict Subcellular Localization of Proteins by Incorporating Gene Ontology
Biochemical and Biophysical Research Communications, 2003. Co-Authors: Kuochen Chou, Yudong Cai. Abstract: Based on recent developments in the gene ontology and functional domain databases, a new hybridization approach is developed for predicting protein subcellular location by combining the gene product, functional domain, and quasi-sequence-order effects. As a showcase, the same prokaryotic and eukaryotic datasets that were studied by many previous investigators are used for demonstration. The overall success rate by the jackknife test is 94.7% for the prokaryotic set and 92.9% for the eukaryotic set. These are so far the highest success rates achieved for the two datasets by following a rigorous cross-validation test procedure, suggesting that such a hybrid approach may become a very useful high-throughput tool in bioinformatics, proteomics, and molecular cell biology. The very high success rates also reflect the fact that the subcellular localization of a protein is closely correlated with (1) the biological objective to which the gene or gene product contributes, (2) the biochemical activity of a gene product, and (3) the place in the cell where a gene product is active.
-
Using Functional Domain Composition and Support Vector Machines for Prediction of Protein Subcellular Location
Journal of Biological Chemistry, 2002. Co-Authors: Kuochen Chou, Yudong Cai. Abstract: Proteins are generally classified into the following 12 subcellular locations: 1) chloroplast, 2) cytoplasm, 3) cytoskeleton, 4) endoplasmic reticulum, 5) extracellular, 6) Golgi apparatus, 7) lysosome, 8) mitochondria, 9) nucleus, 10) peroxisome, 11) plasma membrane, and 12) vacuole. Because the function of a protein is closely correlated with its subcellular location, and because new protein sequences are entering databanks at a rapidly increasing rate, it is vitally important for both basic research and the pharmaceutical industry to establish a high-throughput tool for predicting protein subcellular location. In this paper, a new concept, the so-called "functional domain composition", is introduced. Based on this concept, a protein can be represented as a vector in a high-dimensional space, where each of the clustered functional domains derived from the protein universe serves as a vector base. With this representation, the support vector machine (SVM) algorithm is introduced for predicting protein subcellular location. High success rates are obtained by the self-consistency test, jackknife test, and independent dataset test, respectively. The current approach not only can play an important complementary role to the powerful covariant discriminant algorithm based on the pseudo amino acid composition representation (Chou, K. C. (2001) Proteins Struct. Funct. Genet. 43, 246-255; Correction (2001) Proteins Struct. Funct. Genet. 44, 60), but also may greatly stimulate the development of this area.
Jijun Tang
-
Improved Detection of DNA-Binding Proteins via Compression Technology on PSSM Information
PLOS ONE, 2017. Co-Authors: Yubo Wang, Yijie Ding, Jijun Tang. Abstract: Since the importance of DNA-binding proteins in multiple biomolecular functions has been recognized, an increasing number of researchers are attempting to identify DNA-binding proteins. In recent years, machine learning methods have become increasingly compelling as the volume of protein sequence data soars, because of their favorable speed and accuracy. In this paper, we extract three features from the protein sequence, namely NMBAC (Normalized Moreau-Broto Autocorrelation), PSSM-DWT (Position-Specific Scoring Matrix, Discrete Wavelet Transform), and PSSM-DCT (Position-Specific Scoring Matrix, Discrete Cosine Transform). We also apply a feature selection algorithm to these feature vectors. These features are then fed into an SVM (support vector machine) model, which is trained as the classifier to predict DNA-binding proteins. We evaluate our approach on three datasets, namely PDB1075, PDB594 and PDB186. The PDB1075 and PDB594 datasets are used for the jackknife test and the PDB186 dataset is used for the independent test. Our method achieves the best accuracy in the jackknife test, from 79.20% to 86.23% and from 80.5% to 86.20% on the PDB1075 and PDB594 datasets, respectively. In the independent test, the accuracy of our method reaches 76.3%. The independent test performance also shows that our method can be used effectively for DNA-binding protein prediction. The data and source code are at https://doi.org/10.6084/m9.figshare.5104084.
-
Local-DPP: An Improved DNA-Binding Protein Prediction Method by Exploring Local Evolutionary Information
Information Sciences, 2017. Co-Authors: Jijun Tang. Abstract: Increased knowledge of DNA-binding proteins would enhance our understanding of protein functions in cellular biological processes. To handle the explosive growth of protein sequence data, researchers have developed machine learning-based methods that quickly and accurately predict DNA-binding proteins. In recent years, the predictive accuracy of machine learning-based predictors has advanced significantly, but the predictive performance remains unsatisfactory. In this paper, we establish a novel predictor named Local-DPP, which combines local Pse-PSSM (Pseudo Position-Specific Scoring Matrix) features with the random forest classifier. The proposed features can efficiently capture local conservation information, together with sequence-order information, from the evolutionary profiles (PSSMs). We evaluate and compare the Local-DPP predictor with state-of-the-art predictors on two stringent benchmark datasets (one for the jackknife test, the other for an independent test). The proposed Local-DPP significantly improved on the accuracy of the existing predictors, from 77.3% to 79.2% and from 76.9% to 79.0% in the jackknife and independent tests, respectively. This demonstrates the effectiveness of Local-DPP in predicting DNA-binding proteins. Local-DPP is now freely accessible to the public through the user-friendly web server http://server.malab.cn/Local-DPP/Index.html.
-
The performance of different features on the PDB1075 dataset (jackknife test evaluation).
2017. Co-Authors: Yubo Wang, Yijie Ding, Fei Guo, Leyi Wei, Jijun Tang.
-
The accuracy of features of different dimensions on the PDB1075 dataset (jackknife test evaluation).
2017. Co-Authors: Yubo Wang, Yijie Ding, Fei Guo, Leyi Wei, Jijun Tang.
-
The computational time of feature extraction and jackknife test evaluation on PDB1075.
2017. Co-Authors: Yubo Wang, Yijie Ding, Fei Guo, Leyi Wei, Jijun Tang.
Wei Chen
-
iMRM: a platform for simultaneously identifying multiple kinds of RNA modifications.
Bioinformatics (Oxford, England), 2020. Co-Authors: Kewei Liu, Wei Chen. Abstract: Motivation: RNA modifications play critical roles in a series of cellular and developmental processes. Knowledge about the distributions of RNA modifications in transcriptomes will provide clues to their functions. Since experimental methods for detecting RNA modifications are time-consuming and laborious, computational methods have been proposed for this purpose over the past five years. However, both experimental and computational methods have drawbacks in simultaneously identifying modifications occurring on different nucleotides. Results: To address this challenge, we developed a new predictor called iMRM, which is able to simultaneously identify m6A, m5C, m1A, ψ and A-to-I modifications in Homo sapiens, Mus musculus and Saccharomyces cerevisiae. In iMRM, a feature selection technique was used to pick out the optimal features. The results from both 10-fold cross-validation and the jackknife test demonstrated that the performance of iMRM is superior to existing methods for identifying RNA modifications. Availability and implementation: A user-friendly web server for iMRM was established at http://www.bioml.cn/XG_iRNA/home. The off-line command-line version is available at https://github.com/liukeweiaway/iMRM. Contact: greatchen@ncst.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
-
Detecting N6-Methyladenosine Sites from RNA Transcriptomes Using Ensemble Support Vector Machines
Scientific Reports, 2017. Co-Authors: Wei Chen, Pengwei Xing, Quan Zou. Abstract: As one of the most abundant RNA post-transcriptional modifications, N6-methyladenosine (m6A) is involved in a broad spectrum of biological and physiological processes ranging from mRNA splicing and stability to cell differentiation and reprogramming. However, experimental identification of m6A sites is expensive and laborious. Therefore, computational methods for reliable prediction of m6A sites from primary RNA sequences are urgently needed. In the current study, a new method called RAM-ESVM was developed for detecting m6A sites in the Saccharomyces cerevisiae transcriptome, employing ensemble support vector machine classifiers and novel sequence features. The jackknife test results show that RAM-ESVM outperforms single support vector machine classifiers and other existing methods, indicating that it would be a useful computational tool for detecting m6A sites in S. cerevisiae. Furthermore, a web server named RAM-ESVM was constructed and is freely accessible at http://server.malab.cn/RAM-ESVM/.
-
Identifying N6-Methyladenosine Sites in the Arabidopsis thaliana Transcriptome
Molecular Genetics and Genomics, 2016. Co-Authors: Wei Chen, Pengmian Feng, Hui Ding, Hao Lin. Abstract: N6-Methyladenosine (m6A) plays important roles in many biological processes. Knowledge of the distribution of m6A is helpful for understanding its regulatory roles. Although experimental methods have been proposed to detect m6A, their resolution is still unsatisfactory, especially for Arabidopsis thaliana. Benefitting from the experimental data, in the current work a support vector machine-based method was proposed to identify m6A sites in the A. thaliana transcriptome. The proposed method was validated on a benchmark dataset using the jackknife test and was also validated by identifying strain-specific m6A sites in A. thaliana. The predictive results indicate that the proposed method is quite promising. For the convenience of experimental biologists, an online web server for the proposed method was built, which is freely available at http://lin.uestc.edu.cn/server/M6ATH. These results indicate that the proposed method has the potential to become an elegant tool for identifying m6A sites in A. thaliana.
-
Identification of Secretory Proteins in Mycobacterium tuberculosis Using Pseudo Amino Acid Composition
BioMed Research International, 2016. Co-Authors: Huan Yang, Hui Ding, Wei Chen, Hua Tang, Xinxin Chen, Changjian Zhang, Panpan Zhu, Hao Lin. Abstract: Tuberculosis kills millions of people every year and ranks among the most appalling public health problems. Recent findings suggest that secretory proteins of Mycobacterium tuberculosis may be useful for developing specific vaccines and drugs because of their antigenicity. In response to this global infectious disease, we focused on the identification of secretory proteins in Mycobacterium tuberculosis. A novel method called MycoSec was designed by incorporating g-gap dipeptide compositions into pseudo amino acid composition. An analysis of variance-based technique was applied for feature selection, and a total of 374 optimal features were obtained and used to construct the final prediction model. In the jackknife test, MycoSec performed well, with an area under the receiver operating characteristic curve of 0.93, demonstrating that the proposed system is powerful and robust. For users' convenience, the web server MycoSec was established, and a manual on how to use it was provided to help users avoid unnecessary trouble.
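The g-gap dipeptide composition mentioned in this abstract can be read as the normalized frequency of residue pairs separated by g intervening positions, giving a 400-dimensional vector for the 20 standard amino acids. The following is a minimal sketch under that reading. The function name, the alphabet ordering, and the handling of non-standard residues are assumptions, and it does not reproduce MycoSec's feature selection or its final 374 features.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard residues


def g_gap_dipeptide_composition(sequence, gap):
    """Return the 400 normalized frequencies of residue pairs with `gap` residues between them.

    gap=0 gives the ordinary adjacent-dipeptide composition.
    Pairs containing non-standard residues are skipped (an assumption).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - gap - 1):
        pair = sequence[i] + sequence[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


# Hypothetical peptide; with gap=1 the pairs are (A, A) and (C, C).
vector = g_gap_dipeptide_composition("ACAC", gap=1)
print(len(vector), vector[0])
```

Varying the gap lets the representation capture correlations between residues that are close in space but not adjacent in sequence, which is the usual motivation given for this feature family.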
Hao Lin
-
Identification and Analysis of the N6-Methyladenosine in the Saccharomyces cerevisiae Transcriptome
Scientific Reports, 2015. Co-Authors: Wei Chen, Hao Lin, Hong Tran, Zhiyong Liang, Liqing Zhang. Abstract: Knowledge of the distribution of N6-methyladenosine (m6A) is invaluable for understanding RNA biological functions. However, limitations in experimental methods impede progress towards the identification of m6A sites. As a complement to experimental methods, a support vector machine-based method is proposed to identify m6A sites in the Saccharomyces cerevisiae genome. In this model, RNA sequences are encoded by their nucleotide chemical properties and accumulated nucleotide frequency information. In the jackknife test, the proposed model achieved an accuracy of 78.15% in identifying m6A sites. For the convenience of experimental scientists, a web server for the proposed model is provided at http://lin.uestc.edu.cn/server/m6Apred.php.
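The encoding described in this abstract, nucleotide chemical property plus accumulated nucleotide frequency, can be sketched as a four-number code per position. The abstract does not give the mapping, so the three binary properties below (ring structure, functional group, hydrogen-bond strength) and the running-density definition follow a commonly used convention and should be treated as assumptions, as should the function name.

```python
# Assumed binary chemical-property code: (purine?, amino group?, weak hydrogen bond?).
CHEMICAL_PROPERTY = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_rna(sequence):
    """Encode each nucleotide as its 3 chemical-property bits plus its accumulated frequency.

    The accumulated frequency at position i (1-based) is the number of times the
    current nucleotide has occurred in positions 1..i, divided by i.
    """
    seen = {}
    encoded = []
    for position, nucleotide in enumerate(sequence, start=1):
        seen[nucleotide] = seen.get(nucleotide, 0) + 1
        density = seen[nucleotide] / position
        encoded.append(CHEMICAL_PROPERTY[nucleotide] + (density,))
    return encoded


# Hypothetical three-nucleotide fragment.
for row in encode_rna("AUA"):
    print(row)
```

Flattening the per-position tuples of a fixed-length sequence window yields a numeric vector that could be passed to a classifier such as a support vector machine.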