Proteogenomics

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The experts below are selected from a list of 1542 experts worldwide, ranked by the ideXlab platform

Jean Armengaud - One of the best experts on this subject based on the ideXlab platform.

  • Proteogenomics‐Guided Evaluation of RNA‐Seq Assembly and Protein Database Construction for Emergent Model Organisms
    Proteomics, 2020
    Co-Authors: Yannick Cogne, Olivier Pible, Olivier Geffard, Arnaud Chaumot, Duarte Gouveia, Davide Degli-Esposti, Christine Almunia, Jean Armengaud
    Abstract:

    Proteogenomics is gaining momentum as, today, genomics, transcriptomics, and proteomics can be readily performed on any new species. This approach allows key alterations to molecular pathways to be identified when comparing conditions. For animals and plants, RNA‐seq‐informed proteomics is the most popular means of interpreting tandem mass spectrometry spectra acquired for species for which the genome has not yet been sequenced. It relies on high‐performance de novo RNA‐seq assembly and optimized translation strategies. Here, several pre‐treatments for Illumina RNA‐seq reads before assembly are explored to translate the resulting contigs into useful polypeptide sequences. Experimental transcriptomics and proteomics datasets acquired for individual Gammarus fossarum freshwater crustaceans are used, and the most relevant procedure is defined by the ratio of MS/MS spectra assigned to peptide sequences. Removing reads with a mean quality score of less than 17 (which represents a single probable nucleotide error on 150‐bp reads) prior to assembly increases the proteomics outcome. The best translation using TransDecoder is achieved with a minimal open reading frame (ORF) length of 50 amino acids and systematic selection of ORFs longer than 900 nucleotides. Using these parameters, transcriptome assembly and translation informed by proteomics pave the way to further improvements in proteogenomics.

  • Proteogenomic insights into uranium tolerance of a Chernobyl's Microbacterium bacterial isolate
    Journal of Proteomics, 2017
    Co-Authors: Nicolas Gallois, Jean Armengaud, Catherine Berthomieu, Beatrice Alpha-Bazin, Philippe Ortet, Mohamed Barakat, Laurie Piette, Justine Long, Virginie Chapon
    Abstract:

    Microbacterium oleivorans A9 is a uranium-tolerant actinobacterium isolated from trench T22, located near the Chernobyl nuclear power plant. This site is contaminated with different radionuclides, including uranium. To observe the molecular changes occurring at the proteome level in this strain upon uranyl exposure and to understand the molecular mechanisms explaining its uranium tolerance, we established its draft genome and used this raw information to perform an in-depth proteogenomics study. High-throughput proteomics was performed on cells exposed or not to 10 μM uranyl nitrate, sampled at three previously identified phases of uranyl tolerance. We experimentally detected and annotated 1532 proteins and highlighted a total of 591 proteins whose abundances differed significantly between conditions. Notably, proteins involved in phosphate and iron metabolism show high dynamics. A large proportion of the proteins that are more abundant upon uranyl stress are distant from functionally annotated known proteins, highlighting the lack of fundamental knowledge regarding numerous key molecular players from soil bacteria.

    Biological significance: Microbacterium oleivorans A9 is an interesting environmental model for understanding the biological processes engaged in tolerance to radionuclides. Using an innovative proteogenomics approach, we explored the molecular mechanisms involved in its uranium tolerance. We sequenced its genome, interpreted high-throughput proteomic data against a six-reading-frame ORF database deduced from the draft genome, annotated the identified proteins, and, after a cascade search, compared protein abundances between cells exposed or not to uranyl stress. These data show that a complex cellular response to uranium occurs in Microbacterium oleivorans A9, in which one third of the experimental proteome is modified. In particular, the uranyl stress perturbed the phosphate and iron metabolic pathways. Furthermore, several transporters were identified as specifically associated with uranyl stress, paving the way to the development of biotechnological tools for uranium decontamination.

  • Proteogenomic insights into the intestinal parasite Blastocystis sp. subtype 4 isolate WR1
    Proteomics, 2017
    Co-Authors: Jean Armengaud, Kevin S W Tan, Amandine Cian, Magali Chabe, Olivier Pible, Jean-Charles Gaillard, Nausicaa Gantois, Eric Viscogliosi
    Abstract:

    Blastocystis sp. has been known for years as a highly prevalent anaerobic eukaryotic parasite of humans and animals. Several monophyletic clades have been delineated based on molecular data, and the occurrence of each subtype (ST) in humans and/or animal hosts has been documented. The genomes of several representatives have been sequenced, revealing specific traits such as an intriguing 3’-end processing of primary transcripts. Here, a first high-throughput proteomics dataset acquired on this difficult-to-cultivate parasite is presented for the zoonotic ST4 isolate WR1. Amongst the 2,766 detected proteins, we highlighted the role of a small ADP ribosylation factor (Arf) GTP-binding protein involved in intracellular traffic as a major regulator of vesicle biogenesis and a voltage-dependent anion-selective channel protein, because both were unexpectedly highly abundant. We show how these data may be used for gaining proteogenomic insights into molecular mechanisms specific to Blastocystis sp. We provide the first proteogenomic evidence of a functional termination codon derived from transcript polyadenylation for seven different key cellular components.

  • Proteogenomics of Gammarus fossarum to document the reproductive system of amphipods
    Molecular & Cellular Proteomics, 2014
    Co-Authors: Judith Trapp, Jean-Charles Gaillard, Olivier Geffard, Arnaud Chaumot, Gilles Imbert, Anne-Helene Davin, Jean Armengaud
    Abstract:

    Because of their ecological importance, amphipod crustacea are employed worldwide as test species in environmental risk assessment. Although proteomics allows new insights into the molecular mechanisms related to the stress response, such investigations are rare for these organisms because of the lack of comprehensive protein sequence databases. Here, we propose a proteogenomic approach for identifying specific proteins of the freshwater amphipod Gammarus fossarum, a keystone species in European freshwater ecosystems. After deep RNA sequencing, we created a comprehensive ORF database. We identified and annotated the most relevant proteins detected through a shotgun tandem mass spectrometry analysis carried out on the proteomes from three major tissues involved in the organism's reproductive function: the male and female reproductive systems, and the cephalon, where different neuroendocrine glands are present. The 1,873 mass-spectrometry-certified proteins represent the largest crustacean proteomic resource to date, with 218 proteins being lineage specific. Comparative proteomics between the male and female reproductive systems indicated key proteins with strong sexual dimorphism. Protein expression profiles during spermatogenesis at seven different stages highlighted the major gammarid proteins involved in the different facets of reproduction.

  • Non-model organisms, a species endangered by proteogenomics
    Journal of Proteomics, 2014
    Co-Authors: Jean Armengaud, Olivier Pible, Judith Trapp, Olivier Geffard, Arnaud Chaumot, Erica M Hartmann
    Abstract:

    Previously, large-scale proteomics was possible only for organisms whose genomes were sequenced, meaning the most common model organisms. The use of next-generation sequencers is now changing the game. With “proteogenomics”, the use of experimental proteomics data to refine genome annotations, a higher integration of omics data is gaining ground. By extension, combining genomic and proteomic data is becoming routine in many research projects. “Proteogenomic”-flavored approaches are currently expanding, enabling molecular studies of non-model organisms at an unprecedented depth. Today, draft genomes can be obtained using next-generation sequencers in a rather straightforward way and at a reasonable cost for any organism. Unfinished genome sequences can be used to interpret tandem mass spectrometry proteomics data without the need for time-consuming genome annotation, and the use of RNA-seq to establish nucleotide sequences that are directly translated into protein sequences appears promising. There are, however, certain drawbacks that deserve further attention for RNA-seq to become more efficient. Here, we discuss the opportunities of working with non-model organisms, the proteomic methods that have been used until now, and the dramatic improvements proffered by proteogenomics. These put the distinction between model and non-model organisms in great danger, at least in terms of proteomics!

    Biological significance: Model organisms have been crucial for in-depth analysis of the cellular and molecular processes of life. Focusing the efforts of thousands of researchers on the Escherichia coli bacterium, the Saccharomyces cerevisiae yeast, the Arabidopsis thaliana plant, the Danio rerio fish and other models for which genetic manipulation was possible was certainly worthwhile in terms of fundamental and invaluable biological insights. Until recently, proteomics of non-model organisms was limited to tedious, homology-based techniques, but today draft genomes or RNA-seq data can be straightforwardly obtained using next-generation sequencers, allowing the establishment of a draft protein database for any organism. Thus, proteogenomics opens new perspectives for molecular studies of non-model organisms, although they remain difficult experimental organisms. This article is part of a Special Issue entitled: Proteomics of non-model organisms.
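
The Microbacterium oleivorans abstract above mentions interpreting proteomic data against a six-reading-frame ORF database deduced from a draft genome. As a rough illustration of that idea, the Python sketch below translates a DNA sequence in all six frames and keeps stop-to-stop segments above a minimal length. The function names, the stop-to-stop ORF definition, and the default `min_len` are assumptions made for this example; the abstract does not state how the authors delimited ORFs or which length cutoff they applied.

```python
from itertools import product
from typing import List, Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
# (NCBI table 11, commonly used for bacteria, shares these codon assignments.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_orfs(dna: str, min_len: int = 30) -> List[Tuple[str, int, str]]:
    """Return (strand, frame, peptide) for every stop-to-stop segment
    of at least `min_len` residues (hypothetical cutoff)."""
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) >= min_len:
                    orfs.append((strand, frame, segment))
    return orfs


if __name__ == "__main__":
    for strand, frame, peptide in six_frame_orfs("ATGGCCAAGTAA", min_len=3):
        print(strand, frame, peptide)
```

Splitting on stop codons rather than requiring a start codon keeps N-terminally truncated ORFs from fragmented contigs, at the cost of a larger search database; this is a design choice of the sketch, not a documented property of the published workflow.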

Olivier Pible - One of the best experts on this subject based on the ideXlab platform.

  • Proteogenomics‐Guided Evaluation of RNA‐Seq Assembly and Protein Database Construction for Emergent Model Organisms
    Proteomics, 2020
    Co-Authors: Yannick Cogne, Olivier Pible, Olivier Geffard, Arnaud Chaumot, Duarte Gouveia, Davide Degli-Esposti, Christine Almunia, Jean Armengaud
    Abstract:

    Proteogenomics is gaining momentum as, today, genomics, transcriptomics, and proteomics can be readily performed on any new species. This approach allows key alterations to molecular pathways to be identified when comparing conditions. For animals and plants, RNA‐seq‐informed proteomics is the most popular means of interpreting tandem mass spectrometry spectra acquired for species for which the genome has not yet been sequenced. It relies on high‐performance de novo RNA‐seq assembly and optimized translation strategies. Here, several pre‐treatments for Illumina RNA‐seq reads before assembly are explored to translate the resulting contigs into useful polypeptide sequences. Experimental transcriptomics and proteomics datasets acquired for individual Gammarus fossarum freshwater crustaceans are used, and the most relevant procedure is defined by the ratio of MS/MS spectra assigned to peptide sequences. Removing reads with a mean quality score of less than 17 (which represents a single probable nucleotide error on 150‐bp reads) prior to assembly increases the proteomics outcome. The best translation using TransDecoder is achieved with a minimal open reading frame (ORF) length of 50 amino acids and systematic selection of ORFs longer than 900 nucleotides. Using these parameters, transcriptome assembly and translation informed by proteomics pave the way to further improvements in proteogenomics.

  • Proteogenomic insights into the intestinal parasite Blastocystis sp. subtype 4 isolate WR1
    Proteomics, 2017
    Co-Authors: Jean Armengaud, Kevin S W Tan, Amandine Cian, Magali Chabe, Olivier Pible, Jean-Charles Gaillard, Nausicaa Gantois, Eric Viscogliosi
    Abstract:

    Blastocystis sp. has been known for years as a highly prevalent anaerobic eukaryotic parasite of humans and animals. Several monophyletic clades have been delineated based on molecular data, and the occurrence of each subtype (ST) in humans and/or animal hosts has been documented. The genomes of several representatives have been sequenced, revealing specific traits such as an intriguing 3’-end processing of primary transcripts. Here, a first high-throughput proteomics dataset acquired on this difficult-to-cultivate parasite is presented for the zoonotic ST4 isolate WR1. Amongst the 2,766 detected proteins, we highlighted the role of a small ADP ribosylation factor (Arf) GTP-binding protein involved in intracellular traffic as a major regulator of vesicle biogenesis and a voltage-dependent anion-selective channel protein, because both were unexpectedly highly abundant. We show how these data may be used for gaining proteogenomic insights into molecular mechanisms specific to Blastocystis sp. We provide the first proteogenomic evidence of a functional termination codon derived from transcript polyadenylation for seven different key cellular components.

  • Non-model organisms, a species endangered by proteogenomics
    Journal of Proteomics, 2014
    Co-Authors: Jean Armengaud, Olivier Pible, Judith Trapp, Olivier Geffard, Arnaud Chaumot, Erica M Hartmann
    Abstract:

    Previously, large-scale proteomics was possible only for organisms whose genomes were sequenced, meaning the most common model organisms. The use of next-generation sequencers is now changing the game. With “proteogenomics”, the use of experimental proteomics data to refine genome annotations, a higher integration of omics data is gaining ground. By extension, combining genomic and proteomic data is becoming routine in many research projects. “Proteogenomic”-flavored approaches are currently expanding, enabling molecular studies of non-model organisms at an unprecedented depth. Today, draft genomes can be obtained using next-generation sequencers in a rather straightforward way and at a reasonable cost for any organism. Unfinished genome sequences can be used to interpret tandem mass spectrometry proteomics data without the need for time-consuming genome annotation, and the use of RNA-seq to establish nucleotide sequences that are directly translated into protein sequences appears promising. There are, however, certain drawbacks that deserve further attention for RNA-seq to become more efficient. Here, we discuss the opportunities of working with non-model organisms, the proteomic methods that have been used until now, and the dramatic improvements proffered by proteogenomics. These put the distinction between model and non-model organisms in great danger, at least in terms of proteomics!

    Biological significance: Model organisms have been crucial for in-depth analysis of the cellular and molecular processes of life. Focusing the efforts of thousands of researchers on the Escherichia coli bacterium, the Saccharomyces cerevisiae yeast, the Arabidopsis thaliana plant, the Danio rerio fish and other models for which genetic manipulation was possible was certainly worthwhile in terms of fundamental and invaluable biological insights. Until recently, proteomics of non-model organisms was limited to tedious, homology-based techniques, but today draft genomes or RNA-seq data can be straightforwardly obtained using next-generation sequencers, allowing the establishment of a draft protein database for any organism. Thus, proteogenomics opens new perspectives for molecular studies of non-model organisms, although they remain difficult experimental organisms. This article is part of a Special Issue entitled: Proteomics of non-model organisms.
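
The RNA-seq assembly abstract listed above reports that discarding reads with a mean quality score below 17 before assembly improved the proteomics outcome. A minimal Python sketch of such a filter follows. It assumes Phred+33 encoded quality strings and an arithmetic mean of the per-base scores; neither detail is given in the abstract, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# A read is represented here as (header, sequence, quality string).
Read = Tuple[str, str, str]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Arithmetic mean of the Phred scores decoded from a quality string.

    Assumes Phred+33 encoding (the common Illumina/Sanger convention)."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


def filter_reads(reads: Iterable[Read],
                 min_mean_q: float = 17.0) -> Iterator[Read]:
    """Yield only reads whose mean quality is at least `min_mean_q`,
    i.e. drop reads with a mean quality score of less than 17."""
    for header, sequence, quality in reads:
        if mean_phred(quality) >= min_mean_q:
            yield header, sequence, quality


if __name__ == "__main__":
    example = [
        ("read_1", "ACGT", "IIII"),  # every base Q40
        ("read_2", "ACGT", "1111"),  # every base Q16, below the threshold
        ("read_3", "ACGT", "2222"),  # every base Q17, exactly at the threshold
    ]
    for header, _, _ in filter_reads(example):
        print(header)
```

In practice a dedicated read-trimming tool would normally perform this step on FASTQ files; the sketch only makes the thresholding rule explicit.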

Sudhakaran Prabakaran - One of the best experts on this subject based on the ideXlab platform.

  • In silico identification of novel open reading frames in Plasmodium falciparum oocyte and salivary gland sporozoites using proteogenomics framework
    Malaria Journal, 2021
    Co-Authors: Sophie Gunnarsson, Sudhakaran Prabakaran
    Abstract:

    Background: Plasmodium falciparum causes the deadliest form of malaria, which remains one of the most prevalent infectious diseases. Unfortunately, the only licensed vaccine showed limited protection and resistance to anti-malarial drugs is increasing, which can be largely attributed to the biological complexity of the parasite's life cycle. The progression from one developmental stage to another in P. falciparum involves drastic changes in gene expression, and its infectivity to human hosts varies greatly depending on the stage. Approaches to identify candidate genes that are responsible for the development of infectivity to human hosts typically involve differential gene expression analysis between stages. However, the detection may be limited to annotated proteins and open reading frames (ORFs) predicted using restrictive criteria.

    Methods: The above problem is particularly relevant for P. falciparum, whose genome annotation is relatively incomplete given its clinical significance. In this work, a systems proteogenomics approach was used to address this challenge, as it allows computational detection of unannotated, novel open reading frames (nORFs), which are neglected by conventional analyses. Two transcriptome/proteome pairs were obtained from a previous study: one collected in the mosquito-infectious oocyst sporozoite stage, and the other in the salivary gland sporozoite stage with human infectivity. They were then re-analysed using the proteogenomics framework to identify nORFs in each stage.

    Results: Translational products of nORFs that map to antisense, intergenic, intronic, 3' UTR and 5' UTR regions, as well as alternative reading frames of canonical proteins, were detected. Some of these nORFs also showed differential expression between the two life cycle stages studied. Their regulatory roles were explored through further bioinformatics analyses, including the expression regulation on the parent reference genes, in silico structure prediction, and gene ontology term enrichment analysis.

    Conclusion: The identification of nORFs in P. falciparum sporozoites highlights the biological complexity of the parasite. Although the analyses are solely computational, these results provide a starting point for further experimental validation of the existence and functional roles of these nORFs.

Eric Viscogliosi - One of the best experts on this subject based on the ideXlab platform.

  • Proteogenomic insights into the intestinal parasite Blastocystis sp. subtype 4 isolate WR1
    Proteomics, 2017
    Co-Authors: Jean Armengaud, Kevin S W Tan, Amandine Cian, Magali Chabe, Olivier Pible, Jean-Charles Gaillard, Nausicaa Gantois, Eric Viscogliosi
    Abstract:

    Blastocystis sp. has been known for years as a highly prevalent anaerobic eukaryotic parasite of humans and animals. Several monophyletic clades have been delineated based on molecular data, and the occurrence of each subtype (ST) in humans and/or animal hosts has been documented. The genomes of several representatives have been sequenced, revealing specific traits such as an intriguing 3’-end processing of primary transcripts. Here, a first high-throughput proteomics dataset acquired on this difficult-to-cultivate parasite is presented for the zoonotic ST4 isolate WR1. Amongst the 2,766 detected proteins, we highlighted the role of a small ADP ribosylation factor (Arf) GTP-binding protein involved in intracellular traffic as a major regulator of vesicle biogenesis and a voltage-dependent anion-selective channel protein, because both were unexpectedly highly abundant. We show how these data may be used for gaining proteogenomic insights into molecular mechanisms specific to Blastocystis sp. We provide the first proteogenomic evidence of a functional termination codon derived from transcript polyadenylation for seven different key cellular components.

Juha Kere - One of the best experts on this subject based on the ideXlab platform.

  • Pool-seq driven proteogenomic database for Group G Streptococcus
    Journal of Proteomics, 2019
    Co-Authors: Rigbe G Weldatsadik, Neeta Datta, Carolin A Kolmeder, Jaana Vuopio, Juha Kere, S V Wilkman, Justin W Flatt
    Abstract:

    Proteogenomic databases use genomic and transcriptomic information for improved identification of peptides and proteins from mass spectrometry analyses. One application of such databases is the discovery of variants/mutations. In this study, we created a proteogenomic database that contained sequences with variants derived from pooled sequencing experiments (137 Group G Streptococcus strains sequenced in 3 pools) and used tandem mass spectrometry (MS/MS) to analyse eight protein samples from randomly selected strains sequenced in the pools. Using the proteogenomic variant database, we identified 385 variant peptides from the eight samples, none of which could be identified from the single-genome conventional database utilized, while 71.2% and 93.5% of them were identified from the databases that contained 4 complete genomes and 26 assemblies, respectively. The proteogenomic variant databases exhibited the same properties as the conventional databases in terms of the Andromeda score distributions and the posterior error probability (PEP) values of the identified peptides.

    Significance: For bacterial populations with substantial intra-species diversity, such as Group G Streptococcus (GGS), simultaneously sequencing large numbers of strains and generating proteogenomic databases from them helps improve the discovery of peptides in mass spectrometric analyses. Therefore, generating proteogenomic variant protein databases from pooled sequencing experiments can be a cost-effective way to complement conventional databases and discover subtle strain-wise differences.
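
The variant-peptide idea in the Group G Streptococcus abstract above can be sketched in a few lines of Python: digest both the reference and the variant-containing protein sequences in silico, then keep the peptides that occur only in the variant set. The sketch assumes trypsin-like cleavage (after K or R, but not before P) and a hypothetical minimal peptide length; the abstract names neither the protease nor any length cutoff, and the actual spectrum matching in the study was scored with Andromeda, which this example does not attempt to reproduce.

```python
import re
from typing import Iterable, Set


def tryptic_peptides(protein: str, min_len: int = 6) -> Set[str]:
    """In silico digestion: cleave after K or R unless the next residue is P.

    `min_len` is a hypothetical cutoff; no missed cleavages are generated."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {piece for piece in pieces if len(piece) >= min_len}


def variant_specific_peptides(variant_proteins: Iterable[str],
                              reference_proteins: Iterable[str],
                              min_len: int = 6) -> Set[str]:
    """Peptides produced by the variant sequences but absent from the reference."""
    reference: Set[str] = set()
    for protein in reference_proteins:
        reference |= tryptic_peptides(protein, min_len)
    variants: Set[str] = set()
    for protein in variant_proteins:
        variants |= tryptic_peptides(protein, min_len)
    return variants - reference


if __name__ == "__main__":
    # Toy sequences invented for illustration; they differ by one residue (E -> Q).
    reference = ["MAAAKGGSDEFRLLLPK"]
    variant = ["MAAAKGGSDQFRLLLPK"]
    print(variant_specific_peptides(variant, reference, min_len=5))
```

Only peptides spanning a variant position can distinguish strains, which is why a single-genome database cannot, by construction, identify them.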