Protein Similarity

The Experts below are selected from a list of 112287 Experts worldwide ranked by ideXlab platform

Oded Beja - One of the best experts on this subject based on the ideXlab platform.

  • Microbial community genomics in eastern Mediterranean Sea surface waters
    ISME Journal, 2010
    Co-Authors: Roi Feingersch, Michael Shmoish, Itai Sharon, Gazalah Sabehi, Marcelino T Suzuki, Frederic Partensky, Oded Beja
    Abstract:

    Offshore waters of the eastern Mediterranean Sea are among the most oligotrophic regions on Earth, and their primary productivity is phosphorus limited. To study the unexplored function and physiology of microbes inhabiting this system, we analyzed a genomic library from eastern Mediterranean Sea surface waters by sequencing both termini of nearly 5000 clones. Genome recruitment strategies showed that the majority of high-scoring pairs corresponded to genomes from the Alphaproteobacteria (SAR11-like and Rhodobacterales), Cyanobacteria (Synechococcus and high-light-adapted Prochlorococcus) and diverse uncultured Gammaproteobacteria. The community structure observed, as evaluated by both protein similarity scores and metabolic potential, was similar to that found in the euphotic zone of the ALOHA station off Hawaii but very different from that of deep aphotic zones in both the Mediterranean Sea and the Pacific Ocean. In addition, a strong enrichment toward phosphate and phosphonate uptake and utilization metabolism was observed.
    The ISME Journal (2010) 4, 78-87; doi:10.1038/ismej.2009.92; published online 20 August 2009

Roi Feingersch - One of the best experts on this subject based on the ideXlab platform.

  • Microbial community genomics in eastern Mediterranean Sea surface waters
    ISME Journal, 2010
    Co-Authors: Roi Feingersch, Michael Shmoish, Itai Sharon, Gazalah Sabehi, Marcelino T Suzuki, Frederic Partensky, Oded Beja
    Abstract:

    Offshore waters of the eastern Mediterranean Sea are among the most oligotrophic regions on Earth, and their primary productivity is phosphorus limited. To study the unexplored function and physiology of microbes inhabiting this system, we analyzed a genomic library from eastern Mediterranean Sea surface waters by sequencing both termini of nearly 5000 clones. Genome recruitment strategies showed that the majority of high-scoring pairs corresponded to genomes from the Alphaproteobacteria (SAR11-like and Rhodobacterales), Cyanobacteria (Synechococcus and high-light-adapted Prochlorococcus) and diverse uncultured Gammaproteobacteria. The community structure observed, as evaluated by both protein similarity scores and metabolic potential, was similar to that found in the euphotic zone of the ALOHA station off Hawaii but very different from that of deep aphotic zones in both the Mediterranean Sea and the Pacific Ocean. In addition, a strong enrichment toward phosphate and phosphonate uptake and utilization metabolism was observed.
    The ISME Journal (2010) 4, 78-87; doi:10.1038/ismej.2009.92; published online 20 August 2009

Stephen F Altschul - One of the best experts on this subject based on the ideXlab platform.

  • The effectiveness of position- and composition-specific gap costs for Protein Similarity searches
    Bioinformatics (Oxford, England), 2008 (ISMB proceedings)
    Co-Authors: Aleksandar Stojmirović, E. Michael Gertz, Stephen F Altschul
    Abstract:

    Motivation: The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments.
    Results: We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program.
    Availability: The scripts for performing evaluations are available upon request from the authors.
    Contact: yyu@ncbi.nlm.nih.gov
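
The difference between uniform and position-specific gap costs can be illustrated with a small dynamic-programming sketch. This is not the scoring scheme of PSI-BLAST or of any HMM package: it assumes a simplified global alignment with linear (not affine) gap costs, and the function name `align_with_position_gaps`, the `match` matrix and the `gap` vector are hypothetical names introduced only for this illustration.

```python
def align_with_position_gaps(match, gap):
    """Best global alignment score of a profile against a target sequence.

    match[i][j]: score for aligning profile column i with target residue j.
    gap[i]:      cost of one gap step at profile position i, where gap[0]
                 applies before the first column (so len(gap) == columns + 1).
    Simplifying assumption: linear gap costs, no separate open/extend terms.
    """
    n = len(match)
    m = len(match[0]) if n else 0
    if len(gap) != n + 1:
        raise ValueError("gap needs one entry per profile column plus one")

    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Target residues inserted before the first profile column.
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap[0]
    for i in range(1, n + 1):
        # Profile column i left unmatched at the start of the target.
        score[i][0] = score[i - 1][0] - gap[i]
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + match[i - 1][j - 1],  # align column with residue
                score[i - 1][j] - gap[i],                   # skip profile column i
                score[i][j - 1] - gap[i],                   # insert residue after column i
            )
    return score[n][m]


# Toy profile with three columns and a two-residue target: the middle column
# has no good partner, so the best alignment has to skip it.
match = [[5, -1], [-1, -1], [-1, 5]]
uniform = align_with_position_gaps(match, [4, 4, 4, 4])
specific = align_with_position_gaps(match, [4, 4, 1, 4])
print(uniform, specific)
```

With a uniform cost, skipping the poorly conserved middle column is as expensive as a gap anywhere else; lowering the cost at that one position raises the score of the same alignment, which is the kind of flexibility the abstract attributes to position-specific gap penalties.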

William Stafford Noble - One of the best experts on this subject based on the ideXlab platform.

  • Rankprop: a web server for Protein remote homology detection
    Bioinformatics, 2009
    Co-Authors: Iain Melvin, Jason Weston, Christina S. Leslie, William Stafford Noble
    Abstract:

    Summary: We present a large-scale implementation of the Rankprop protein homology ranking algorithm in the form of an openly accessible web server. We use the NRDB40 PSI-BLAST all-versus-all protein similarity network of 1.1 million proteins to construct the graph for the Rankprop algorithm, whereas previously, results were only reported for a database of 108 000 proteins. We also describe two algorithmic improvements to the original algorithm, including propagation from multiple homologs of the query and better normalization of ranking scores, that lead to higher accuracy and to scores with a probabilistic interpretation.
    Availability: The Rankprop web server and source code are available

  • Identifying remote Protein homologs by network propagation
    The FEBS journal, 2005
    Co-Authors: William Stafford Noble, Rui Kuang, Christina S. Leslie, Jason Weston
    Abstract:

    Perhaps the most widely used applications of bioinformatics are tools such as PSI-BLAST for searching sequence databases. We describe a recently developed protein database search algorithm called Rankprop. Rankprop relies upon a precomputed network of pairwise protein similarities. The algorithm performs a diffusion operation from a specified query protein across the protein similarity network. The resulting activation scores, assigned to each database protein, encode information about the global structure of the protein similarity network. This type of algorithm has a rich history in associationist psychology, artificial intelligence and web search. We describe the Rankprop algorithm and its relatives, and we provide evidence that the algorithm successfully improves upon the rankings produced by PSI-BLAST.

  • Protein ranking: from local to global structure in the Protein Similarity network.
    Proceedings of the National Academy of Sciences of the United States of America, 2004
    Co-Authors: Jason Weston, Christina S. Leslie, André Elisseeff, Dengyong Zhou, William Stafford Noble
    Abstract:

    Biologists regularly search databases of DNA or protein sequences for evolutionary or functional relationships to a given query sequence. We describe a ranking algorithm that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a precomputed, weighted network. The resulting ranking algorithm, evaluated by using a human-curated database of protein structures, is efficient and provides significantly better rankings than a local network search algorithm such as PSI-BLAST.
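
The diffusion idea described in the abstracts above can be sketched in a few lines. This is a simplified illustration in the spirit of that description, not the published Rankprop algorithm: the row normalisation, the damping factor `alpha`, the fixed iteration count and the function name `diffuse_scores` are all assumptions made for this example.

```python
def diffuse_scores(weights, query, alpha=0.5, iterations=20):
    """Spread activation from a query node over a weighted similarity network.

    weights[i][j]: non-negative similarity between sequences i and j
                   (for example, derived from precomputed pairwise searches).
    Returns one activation score per sequence; higher means ranked closer
    to the query once indirect paths are taken into account.
    """
    n = len(weights)
    # Row-normalise so that each node passes on a bounded share of its activation.
    norm = []
    for row in weights:
        total = sum(row)
        norm.append([w / total if total else 0.0 for w in row])

    scores = [0.0] * n
    scores[query] = 1.0
    for _ in range(iterations):
        updated = [0.0] * n
        for j in range(n):
            incoming = sum(norm[i][j] * scores[i] for i in range(n))
            # Direct similarity to the query plus damped activation from neighbours.
            updated[j] = norm[query][j] + alpha * incoming
        updated[query] = 1.0  # the query stays clamped as the source
        scores = updated
    return scores


# Toy network: 0 (query) - 1 - 2 form a chain, 3 is unconnected.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
scores = diffuse_scores(weights, query=0)
ranking = sorted(range(1, len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In the toy network, sequence 2 has no direct similarity to the query, yet it receives a positive score through its shared neighbour, while the unconnected sequence 3 stays at zero. That is the sense in which the scores reflect global rather than purely local network structure.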

Denis Kaznadzey - One of the best experts on this subject based on the ideXlab platform.

  • PSimScan: algorithm and utility for fast Protein Similarity search.
    PloS one, 2013
    Co-Authors: Anna Kaznadzey, Natalia Alexandrova, Vladimir Novichkov, Denis Kaznadzey
    Abstract:

    In the era of metagenomics and diagnostics sequencing, the importance of high-performance protein comparison methods cannot be overstated. Here we present PSimScan (Protein Similarity Scanner), a flexible open source protein similarity search tool which provides a significant gain in speed compared to BLASTP at the price of controlled sensitivity loss. The PSimScan algorithm introduces a number of novel performance optimization methods that can be further used by the community to improve the speed and lower hardware requirements of bioinformatics software. The optimization starts at the lookup table construction, then the initial lookup table-based hits are passed through a pipeline of filtering and aggregation routines of increasing computational complexity. The first step in this pipeline is a novel algorithm that builds and selects ‘similarity zones’ aggregated from neighboring matches on small arrays of adjacent diagonals. PSimScan performs 5 to 100 times faster than the standard NCBI BLASTP, depending on chosen parameters, and runs on commodity hardware. Its sensitivity and selectivity at the slowest settings are comparable to the NCBI BLASTP’s and decrease with the increase of speed, yet stay at the levels reasonable for many tasks. PSimScan is most advantageous when used on large collections of query sequences. Comparing the entire proteome of Streptococcus pneumoniae (2,042 proteins) to the NCBI’s non-redundant protein database of 16,971,855 records takes 6.5 hours on a moderately powerful PC, while the same task with the NCBI BLASTP takes over 66 hours. We describe innovations in the PSimScan algorithm in considerable detail to encourage bioinformaticians to improve on the tool and to use the innovations in their own software development.
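
The general pattern of lookup-table hits aggregated over adjacent diagonals can be sketched as follows. This is not the PSimScan implementation: it assumes exact k-mer matches, a plain dictionary as the lookup table and a fixed band of neighbouring diagonals, and the names `build_lookup`, `diagonal_hits` and `best_zone` are hypothetical.

```python
from collections import defaultdict


def build_lookup(subject, k):
    """Map every k-mer of the subject sequence to its start positions."""
    table = defaultdict(list)
    for pos in range(len(subject) - k + 1):
        table[subject[pos:pos + k]].append(pos)
    return table


def diagonal_hits(query, table, k):
    """Count exact k-mer matches per diagonal (subject position minus query position)."""
    hits = defaultdict(int)
    for qpos in range(len(query) - k + 1):
        for spos in table.get(query[qpos:qpos + k], ()):
            hits[spos - qpos] += 1
    return hits


def best_zone(hits, band=1):
    """Sum hits over each diagonal and its neighbours within `band`.

    Returns (diagonal, total) for the densest band, or None without hits.
    """
    best = None
    for diagonal in hits:
        total = sum(hits.get(diagonal + off, 0) for off in range(-band, band + 1))
        if best is None or total > best[1]:
            best = (diagonal, total)
    return best


# Toy sequences: the query is an exact fragment of the subject, offset by 3.
subject = "MKTAYIAKQRQISFVKSHFSRQ"
query = "AYIAKQRQ"
table = build_lookup(subject, k=3)
hits = diagonal_hits(query, table, k=3)
zone = best_zone(hits, band=1)
print(dict(hits), zone)
```

Cheap steps like these only nominate candidate regions; in a pipeline of the kind the abstract describes, the more expensive filtering and alignment routines would then run on the selected zones alone. For scale, the timings quoted in the abstract (6.5 hours versus over 66 hours) correspond to a speedup of roughly tenfold on that task.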