Sequenced Genomes

The Experts below are selected from a list of 70713 Experts worldwide ranked by ideXlab platform

Eugene V Koonin - One of the best experts on this subject based on the ideXlab platform.

  • COGs: an evolutionary classification of genes and proteins from sequenced genomes
    Encyclopedia of Genetics Genomics Proteomics and Bioinformatics, 2005
    Co-Authors: Eugene V Koonin
    Abstract:

    A comprehensive classification of genes from sequenced genomes, based on evolutionary principles, is a must for the success of comparative and functional genomics. One such classification system is the database of Clusters of Orthologous Groups of proteins (COGs), which was constructed by clustering the results of an all-against-all comparison of the protein sequences encoded in prokaryotic and eukaryotic genomes. Each COG includes genes or sets of genes from three or more genomes, which are orthologous to each other, that is, evolved from a single ancestral gene in the common ancestor of the analyzed organisms. Between 50 and 85% of the genes from the sequenced genomes belong to COGs, indicating notable evolutionary conservation. The COG system is a natural framework for comparative genomics and has the potential of facilitating both functional annotation of genomes and large-scale evolutionary studies.

    Keywords: comparative genomics; genome evolution; orthologs; paralogs; functional annotation; phyletic patterns; lineage-specific gene loss; horizontal gene transfer

  • Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes
    BMC Evolutionary Biology, 2003
    Co-Authors: Boris Mirkin, Michael Y. Galperin, Trevor Fenner, Eugene V Koonin
    Abstract:

    Background: Comparative analysis of sequenced genomes reveals numerous instances of apparent horizontal gene transfer (HGT), at least in prokaryotes, and indicates that lineage-specific gene loss might have been even more common in evolution. This complicates the notion of a species tree, which needs to be re-interpreted as a prevailing evolutionary trend, rather than the full depiction of evolution, and makes reconstruction of ancestral genomes a non-trivial task.

  • The COG database: a tool for genome-scale analysis of protein functions and evolution
    Nucleic Acids Research, 2000
    Co-Authors: Roman L. Tatusov, D A Natale, Michael Y. Galperin, Eugene V Koonin
    Abstract:

    Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56–83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program, which is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.

  • Sources of systematic error in functional annotation of genomes: domain rearrangement, non-orthologous gene displacement and operon disruption
    In Silico Biology, 1998
    Co-Authors: Michael Y. Galperin, Eugene V Koonin
    Abstract:

    Functional annotation of proteins encoded in newly sequenced genomes can be expected to meet two conflicting objectives: (i) provide as much information as possible, and (ii) avoid erroneous functional assignments and over-predictions. The continuing exponential growth of the number of sequenced genomes makes the quality of sequence annotation a critical factor in the efforts to utilize this new information. When dubious functional assignments are used as a basis for subsequent predictions, they tend to proliferate, leading to "database explosion". It is therefore important to identify the common factors that hamper functional annotation. As a first step towards that goal, we have compared the annotations of the Mycoplasma genitalium and Methanococcus jannaschii genomes produced in several independent studies. The most common causes of questionable predictions appear to be: (i) non-critical use of annotations from existing database entries; (ii) taking into account only the annotation of the best database hit; (iii) insufficient masking of low-complexity regions (e.g. non-globular domains) in protein sequences, resulting in spurious database hits obscuring relevant ones; (iv) ignoring multi-domain organization of the query proteins and/or the database hits; (v) non-critical functional inferences on the basis of the functions of neighboring genes in an operon; (vi) non-orthologous gene displacement, i.e. involvement of structurally unrelated proteins in the same function. These observations suggest that case-by-case validation of functional annotation by expert biologists remains crucial for productive genome analysis.
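
The COG abstracts above mention "consistency of genome-specific best hits" and clusters spanning three or more genomes without spelling out the procedure. The following is a minimal sketch of one way such a criterion can be applied, not the authors' implementation: it assumes proteins are `(genome, name)` tuples, that a hypothetical `best_hit` table maps `(protein, target_genome)` to the best-scoring protein in that genome, and, as a simplification, that every side of a triangle must be a mutual best hit. Triangles that share a side are merged into one cluster.

```python
from collections import defaultdict
from itertools import combinations


def is_mutual(best_hit, a, b):
    """True if a and b (proteins from different genomes) are each other's best hit."""
    return best_hit.get((a, b[0])) == b and best_hit.get((b, a[0])) == a


def find_triangles(best_hit):
    """Return all triples of proteins from three distinct genomes linked by mutual best hits."""
    proteins = sorted({p for p, _ in best_hit} | set(best_hit.values()))
    triangles = []
    for a, b, c in combinations(proteins, 3):
        if len({a[0], b[0], c[0]}) < 3:
            continue  # a triangle must span three different genomes
        if all(is_mutual(best_hit, x, y) for x, y in ((a, b), (b, c), (a, c))):
            triangles.append(frozenset((a, b, c)))
    return triangles


def merge_triangles(triangles):
    """Merge triangles that share a side (two proteins) into clusters, using union-find."""
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edge_owner = {}
    for i, tri in enumerate(triangles):
        for edge in combinations(sorted(tri), 2):
            if edge in edge_owner:
                parent[find(i)] = find(edge_owner[edge])
            else:
                edge_owner[edge] = i

    clusters = defaultdict(set)
    for i, tri in enumerate(triangles):
        clusters[find(i)] |= tri
    return sorted(sorted(c) for c in clusters.values())


def mutual_table(pairs):
    """Build a toy best-hit table in which every listed pair is a mutual best hit."""
    table = {}
    for a, b in pairs:
        table[(a, b[0])] = b
        table[(b, a[0])] = a
    return table


# Toy data (entirely made up): four genomes A-D plus an unrelated pair a2/b2.
a1, b1, c1, d1 = ("A", "a1"), ("B", "b1"), ("C", "c1"), ("D", "d1")
a2, b2 = ("A", "a2"), ("B", "b2")
best_hit = mutual_table([(a1, b1), (a1, c1), (b1, c1), (a1, d1), (b1, d1), (a2, b2)])

cogs = merge_triangles(find_triangles(best_hit))
print(cogs)
```

In this toy input the triangles a1-b1-c1 and a1-b1-d1 share the side a1-b1 and end up in a single four-genome cluster, while the pair a2/b2 forms no triangle and is left unclustered. The exhaustive triple enumeration is cubic in the number of proteins and is only meant for illustration.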

Ian T Paulsen - One of the best experts on this subject based on the ideXlab platform.

  • TransportDB 2.0: a database for exploring membrane transporters in sequenced genomes from all domains of life
    Nucleic Acids Research, 2017
    Co-Authors: Liam D H Elbourne, Sasha G Tetu, Karl A Hassan, Ian T Paulsen
    Abstract:

    All cellular life contains an extensive array of membrane transport proteins. The vast majority of these transporters have not been experimentally characterized. We have developed a bioinformatic pipeline to identify and annotate complete sets of transporters in any sequenced genome. This pipeline is now fully automated, enabling it to better keep pace with the accelerating rate of genome sequencing. This manuscript describes TransportDB 2.0 (http://www.membranetransport.org/transportDB2/), a completely updated version of TransportDB, which provides access to the large volumes of data generated by our automated transporter annotation pipeline. The TransportDB 2.0 web portal has been rebuilt to utilize contemporary JavaScript libraries, providing a highly interactive interface to the annotation information, and incorporates analysis tools that enable users to query the database on a number of levels. For example, TransportDB 2.0 includes tools that allow users to select annotated genomes of interest from the thousands of species held in the database and compare their complete transporter complements.

  • Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae
    Proceedings of the National Academy of Sciences of the United States of America, 2002
    Co-Authors: Herve Tettelin, Ian T Paulsen, Vega Masignani, Michael J Cieslewicz, Jonathan A Eisen, Scott N Peterson, M Wessels
    Abstract:

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the sequenced serotype V strain 2603 V/R and 19 S. agalactiae strains from several serotypes using whole-genome microarrays, revealed the genetic heterogeneity among S. agalactiae strains, even of the same serotype, and provided insights into the evolution of virulence mechanisms.

Darina Čejková - One of the best experts on this subject based on the ideXlab platform.

  • Directly Sequenced Genomes of Contemporary Strains of Syphilis Reveal Recombination-Driven Diversity in Genes Encoding Predicted Surface-Exposed Antigens
    Frontiers in Microbiology, 2019
    Co-Authors: Linda Grillova, Jan Oppelt, Lenka Mikalová, Markéta Nováková, Lorenzo Giacani, Anežka Niesnerová, Angel Noda, Ariel Mechaly, Petra Pospíšilová, Darina Čejková
    Abstract:

    Syphilis, caused by Treponema pallidum subsp. pallidum (TPA), remains an important public health problem with an increasing worldwide prevalence. Despite recent advances in in vitro cultivation, genetic variability of this pathogen during infection is poorly understood. Here, we present contemporary and geographically diverse complete treponemal genome sequences isolated directly from patients using a methyl-directed enrichment prior to sequencing. This approach reveals that approximately 50% of the genetic diversity found in TPA is driven by inter- and/or intra-strain recombination events, particularly in strains belonging to one of the defined genetic groups of syphilis treponemes: Nichols-like strains. Recombinant loci were found to encode putative outer-membrane proteins and the recombination variability was almost exclusively found in regions predicted to be at the host-pathogen interface. Genetic recombination has been considered to be a rare event in treponemes, yet our study unexpectedly showed that it occurs at a significant level and may have important impacts in the biology of this pathogen, especially as these events occur primarily in the outer membrane proteins. This study reveals the existence of strains with different repertoires of surface-exposed antigens circulating in the current human population, which should be taken into account during syphilis vaccine development.

Arcady Mushegian - One of the best experts on this subject based on the ideXlab platform.

Michael Y. Galperin - One of the best experts on this subject based on the ideXlab platform.

  • Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes
    BMC Evolutionary Biology, 2003
    Co-Authors: Boris Mirkin, Michael Y. Galperin, Trevor Fenner, Eugene V Koonin
    Abstract:

    Background: Comparative analysis of sequenced genomes reveals numerous instances of apparent horizontal gene transfer (HGT), at least in prokaryotes, and indicates that lineage-specific gene loss might have been even more common in evolution. This complicates the notion of a species tree, which needs to be re-interpreted as a prevailing evolutionary trend, rather than the full depiction of evolution, and makes reconstruction of ancestral genomes a non-trivial task.

  • The COG database: a tool for genome-scale analysis of protein functions and evolution.
    Nucleic Acids Research, 2000
    Co-Authors: Roman L. Tatusov, Michael Y. Galperin, D A Natale
    Abstract:

    Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56–83% of the gene products from each of the complete bacterial and archaeal genomes and approximately 35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program, which is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.

  • Sources of systematic error in functional annotation of genomes: domain rearrangement, non-orthologous gene displacement and operon disruption
    In Silico Biology, 1998
    Co-Authors: Michael Y. Galperin, Eugene V Koonin
    Abstract:

    Functional annotation of proteins encoded in newly sequenced genomes can be expected to meet two conflicting objectives: (i) provide as much information as possible, and (ii) avoid erroneous functional assignments and over-predictions. The continuing exponential growth of the number of sequenced genomes makes the quality of sequence annotation a critical factor in the efforts to utilize this new information. When dubious functional assignments are used as a basis for subsequent predictions, they tend to proliferate, leading to "database explosion". It is therefore important to identify the common factors that hamper functional annotation. As a first step towards that goal, we have compared the annotations of the Mycoplasma genitalium and Methanococcus jannaschii genomes produced in several independent studies. The most common causes of questionable predictions appear to be: (i) non-critical use of annotations from existing database entries; (ii) taking into account only the annotation of the best database hit; (iii) insufficient masking of low-complexity regions (e.g. non-globular domains) in protein sequences, resulting in spurious database hits obscuring relevant ones; (iv) ignoring multi-domain organization of the query proteins and/or the database hits; (v) non-critical functional inferences on the basis of the functions of neighboring genes in an operon; (vi) non-orthologous gene displacement, i.e. involvement of structurally unrelated proteins in the same function. These observations suggest that case-by-case validation of functional annotation by expert biologists remains crucial for productive genome analysis.
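
The parsimony abstract listed above for this author describes weighing horizontal gene transfer against lineage-specific gene loss on a species tree, but the listing truncates it before any method is given. As an illustration only, and not the algorithm from that paper, the sketch below uses standard weighted (Sankoff-style) parsimony for a single gene family: given a species tree and the set of genomes that contain the gene, it finds the cheapest scenario when each loss costs `loss_penalty` and each gain (gene birth or HGT) costs `gain_penalty`. The tree encoding, function name, and penalty values are assumptions made for this example.

```python
INF = float("inf")


def scenario_cost(tree, present, gain_penalty=1.0, loss_penalty=1.0):
    """Cheapest gain/loss scenario for one gene on a species tree.

    tree    -- nested tuples; leaves are genome names (strings)
    present -- set of genome names in which the gene is found
    Returns (total_cost, present_at_root).
    """

    def costs(node):
        # Returns (cost if gene is absent at node, cost if gene is present at node).
        if isinstance(node, str):
            return (INF, 0.0) if node in present else (0.0, INF)
        absent_total = present_total = 0.0
        for child in node:
            child_absent, child_present = costs(child)
            # Parent lacks the gene: child either also lacks it or gains it.
            absent_total += min(child_absent, child_present + gain_penalty)
            # Parent has the gene: child either keeps it or loses it.
            present_total += min(child_present, child_absent + loss_penalty)
        return absent_total, present_total

    root_absent, root_present = costs(tree)
    # Presence at the root is charged one gain, standing in for the gene's origin.
    with_root_gene = root_present + gain_penalty
    if with_root_gene < root_absent:
        return with_root_gene, True
    return root_absent, False


# Toy species tree and phyletic pattern (made up for illustration).
tree = (("A", "B"), ("C", "D"))
pattern = {"A", "C"}

cheap_gain = scenario_cost(tree, pattern, gain_penalty=1.0)
costly_gain = scenario_cost(tree, pattern, gain_penalty=3.0)
print(cheap_gain, costly_gain)
```

With a gain penalty of 1, the cheapest explanation of the patchy pattern is two independent gains and no gene in the root ancestor; raising the penalty to 3 makes one origin at the root plus two losses cheaper, so the gene is assigned to the ancestor. The inferred ancestral gene set therefore depends directly on the assumed ratio of gain to loss costs.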