Tandem Mass Spectrum

14,000,000 Leading Edge Experts on the ideXlab platform

The Experts below are selected from a list of 5199 Experts worldwide ranked by ideXlab platform

William Stafford Noble - One of the best experts on this subject based on the ideXlab platform.

  • large scale Tandem Mass Spectrum clustering using fast nearest neighbor searching
    Rapid Communications in Mass Spectrometry, 2021
    Co-Authors: Wout Bittremieux, William Stafford Noble, Kris Laukens, Pieter C Dorrestein
    Abstract:

    RATIONALE: Advanced algorithmic solutions are necessary to process the ever-increasing amounts of Mass spectrometry data that are being generated. In this study, we describe the falcon Spectrum clustering tool for efficient clustering of millions of MS/MS spectra. METHODS: falcon succeeds in efficiently clustering large amounts of Mass spectral data using advanced techniques for fast Spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Next, the Spectrum vectors are used to construct nearest neighbor indexes for fast similarity searching. The nearest neighbor indexes are used to efficiently compute a sparse pairwise distance matrix without having to exhaustively perform all pairwise Spectrum comparisons within the relevant precursor Mass tolerance. Finally, density-based clustering is performed to group similar spectra into clusters. RESULTS: Several state-of-the-art Spectrum clustering tools were evaluated using a large draft human proteome data set consisting of 25 million spectra, indicating that alternative tools produce clustering results with different characteristics. Notably, falcon generates larger highly pure clusters than alternative tools, leading to a larger reduction in data volume without the loss of relevant information for more efficient downstream processing. CONCLUSIONS: falcon is a highly efficient Spectrum clustering tool, which is publicly available as open source under the permissive BSD license at https://github.com/bittremieux/falcon.
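
    The binning and feature hashing step described in this abstract can be illustrated with a minimal sketch. It is not falcon's implementation: the bin width, the vector dimension, and the use of MD5 as the hash function are all assumptions chosen for readability, and the function names are hypothetical.

```python
import hashlib
import math


def hash_spectrum(peaks, bin_width=0.05, dim=64):
    """Bin peaks by m/z, then fold the bin indices into `dim` slots by hashing.

    `peaks` is a list of (mz, intensity) pairs. `bin_width` and `dim` are
    illustrative assumptions, not values taken from the paper.
    """
    vec = [0.0] * dim
    for mz, intensity in peaks:
        bin_idx = int(mz / bin_width)  # high-resolution bin index
        digest = hashlib.md5(str(bin_idx).encode()).digest()
        slot = int.from_bytes(digest[:4], "little") % dim  # hashed slot
        vec[slot] += intensity
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))


spectrum_a = [(100.01, 5.0), (250.30, 2.0), (371.12, 9.0)]
spectrum_b = [(100.02, 5.0), (250.30, 2.0), (371.12, 9.0)]  # 100.02 shares a bin with 100.01
print(cosine_distance(hash_spectrum(spectrum_a), hash_spectrum(spectrum_b)))
```

    Because the vectors have a fixed, small dimension regardless of the m/z resolution, they are suitable inputs for a nearest neighbor index.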

  • large scale Tandem Mass Spectrum clustering using fast nearest neighbor searching
    bioRxiv, 2021
    Co-Authors: William Stafford Noble, Wout Bittremieux, Kris Laukens, Pieter C Dorrestein
    Abstract:

    Rationale: Advanced algorithmic solutions are necessary to process the ever-increasing amounts of Mass spectrometry data that are being generated. Here we describe the falcon Spectrum clustering tool for efficient clustering of millions of MS/MS spectra. Methods: falcon succeeds in efficiently clustering large amounts of Mass spectral data using advanced techniques for fast Spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Next, the Spectrum vectors are used to construct nearest neighbor indexes for fast similarity searching. The nearest neighbor indexes are used to efficiently compute a sparse pairwise distance matrix without having to exhaustively perform all pairwise Spectrum comparisons within the relevant precursor Mass tolerance. Finally, density-based clustering is performed to group similar spectra into clusters. Results: Several state-of-the-art Spectrum clustering tools were evaluated using a large draft human proteome dataset consisting of 25 million spectra, indicating that alternative tools produce clustering results with different characteristics. Notably, falcon generates larger highly pure clusters than alternative tools, leading to a larger reduction in data volume without the loss of relevant information for more efficient downstream processing. Conclusions: falcon is a highly efficient Spectrum clustering tool. It is publicly available as open source under the permissive BSD license at https://github.com/bittremieux/falcon.
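
    The last two steps (a sparse distance structure restricted by precursor Mass tolerance, followed by density-based clustering) might look roughly like the sketch below. Note the swap: where the abstract uses nearest neighbor indexes to avoid exhaustive comparisons, this sketch simply compares every pair inside the precursor window, which is only practical for small inputs. The DBSCAN-style routine, the tolerance, and the `eps` and `min_samples` values are assumptions.

```python
def sparse_neighbors(spectra, precursor_tol=0.5, eps=0.1):
    """Map each spectrum index to the set of indices within cosine distance `eps`.

    `spectra` is a list of (precursor_mz, unit_vector). Only pairs whose
    precursor m/z values differ by at most `precursor_tol` are compared.
    """
    order = sorted(range(len(spectra)), key=lambda i: spectra[i][0])
    neighbors = {i: set() for i in range(len(spectra))}
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            if spectra[j][0] - spectra[i][0] > precursor_tol:
                break  # sorted by precursor m/z, so no later spectrum can match
            dist = 1.0 - sum(x * y for x, y in zip(spectra[i][1], spectra[j][1]))
            if dist <= eps:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def dbscan(n, neighbors, min_samples=2):
    """Minimal DBSCAN over a precomputed neighbor map; label -1 marks noise."""
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) + 1 < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) + 1 >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
    return labels


toy = [(500.0, [1.0, 0.0]), (500.2, [1.0, 0.0]), (500.1, [0.0, 1.0]), (900.0, [1.0, 0.0])]
print(dbscan(len(toy), sparse_neighbors(toy)))
```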

  • dynamic bayesian network for accurate detection of peptides from Tandem Mass spectra
    Journal of Proteome Research, 2016
    Co-Authors: John T Halloran, Jeff A Bilmes, William Stafford Noble
    Abstract:

    A central problem in Mass spectrometry analysis involves identifying, for each observed Tandem Mass Spectrum, the corresponding generating peptide. We present a dynamic Bayesian network (DBN) toolkit that addresses this problem by using a machine learning approach. At the heart of this toolkit is a DBN for Rapid Identification (DRIP), which can be trained from collections of high-confidence peptide-Spectrum matches (PSMs). DRIP’s score function considers fragment ion matches using Gaussians rather than fixed fragment-ion tolerances and also finds the optimal alignment between the theoretical and observed Spectrum by considering all possible alignments, up to a threshold that is controlled using a beam-pruning algorithm. This function not only yields state-of-the art database search accuracy but also can be used to generate features that significantly boost the performance of the Percolator postprocessor. The DRIP software is built upon a general purpose DBN toolkit (GMTK), thereby allowing a wide variety ...
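
    To make the contrast between Gaussian scoring and a fixed fragment-ion tolerance concrete, here is a small sketch. It is not the DRIP model (which is a dynamic Bayesian network that also searches over alignments); it only shows how a Gaussian weight rewards closer matches, whereas a fixed tolerance treats every peak inside the window equally. The `sigma` and `tol` values and the function names are assumptions.

```python
import math


def fixed_tolerance_score(theoretical, observed, tol=0.5):
    """Count theoretical fragments with at least one observed peak within +/- tol."""
    return sum(any(abs(t - mz) <= tol for mz, _ in observed) for t in theoretical)


def gaussian_score(theoretical, observed, sigma=0.2):
    """Sum, over theoretical fragments, the best intensity-weighted Gaussian match."""
    total = 0.0
    for t in theoretical:
        total += max(
            (inten * math.exp(-0.5 * ((mz - t) / sigma) ** 2) for mz, inten in observed),
            default=0.0,
        )
    return total


theoretical = [100.0]
near = [(100.05, 1.0)]  # small m/z error
far = [(100.40, 1.0)]   # larger error, still inside the fixed window

print(fixed_tolerance_score(theoretical, near), fixed_tolerance_score(theoretical, far))
print(gaussian_score(theoretical, near), gaussian_score(theoretical, far))
```

    Both peaks count as one match under the fixed tolerance, while the Gaussian score separates them by their m/z error.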

  • an alignment free metapeptide strategy for metaproteomic characterization of microbiome samples using shotgun metagenomic sequencing
    Journal of Proteome Research, 2016
    Co-Authors: Damon May, Emma Timmins-Schiffman, Molly P Mikan, Rodger H Harvey, Elhanan Borenstein, Brook L Nunn, William Stafford Noble
    Abstract:

    In principle, Tandem Mass spectrometry can be used to detect and quantify the peptides present in a microbiome sample, enabling functional and taxonomic insight into microbiome metabolic activity. However, the phylogenetic diversity constituting a particular microbiome is often unknown, and many of the organisms present may not have assembled genomes. In ocean microbiome samples, with particularly diverse and uncultured bacterial communities, it is difficult to construct protein databases that contain the bulk of the peptides in the sample without losing detection sensitivity due to the overwhelming number of candidate peptides for each Tandem Mass Spectrum. We describe a method for deriving “metapeptides” (short amino acid sequences that may be represented in multiple organisms) from shotgun metagenomic sequencing of microbiome samples. In two ocean microbiome samples, we constructed site-specific metapeptide databases to detect more than one and a half times as many peptides as by searching against pred...

  • Tandem Mass Spectrum identification via cascaded search
    Journal of Proteome Research, 2015
    Co-Authors: Attila Kertesz-Farkas, Uri Keich, William Stafford Noble
    Abstract:

    Accurate assignment of peptide sequences to observed fragmentation spectra is hindered by the large number of hypotheses that must be considered for each observed Spectrum. A high score assigned to a particular peptide–Spectrum match (PSM) may not end up being statistically significant after multiple testing correction. Researchers can mitigate this problem by controlling the hypothesis space in various ways: considering only peptides resulting from enzymatic cleavages, ignoring possible post-translational modifications or single nucleotide variants, etc. However, these strategies sacrifice identifications of spectra generated by rarer types of peptides. In this work, we introduce a statistical testing framework, cascade search, that directly addresses this problem. The method requires that the user specify a priori a statistical confidence threshold as well as a series of peptide databases. For instance, such a cascade of databases could include fully tryptic, semitryptic, and nonenzymatic peptides or pe...
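
    The overall control flow of a cascade over several databases can be sketched as follows, under stated assumptions: the `search` callback, its return shape, and the use of a per-match q-value compared against the threshold are hypothetical stand-ins. The statistical procedure the paper uses to decide which matches pass at each stage is not reproduced here.

```python
def cascade_search(spectra, databases, search, threshold=0.01):
    """Search each database in order; pass only unresolved spectra to the next one.

    `databases` is an ordered list of (name, database) pairs.
    `search(spectrum, database)` is a hypothetical callback returning
    (peptide, q_value) or None when there is no candidate.
    """
    remaining = list(spectra)
    accepted = {}
    for name, database in databases:
        unresolved = []
        for spectrum in remaining:
            hit = search(spectrum, database)
            if hit is not None and hit[1] <= threshold:
                accepted[spectrum] = (hit[0], name)  # confident: leaves the cascade
            else:
                unresolved.append(spectrum)
        remaining = unresolved
    return accepted, remaining


# Toy data: each "database" maps a spectrum id to its best (peptide, q_value).
tryptic = {"s1": ("PEPTIDEK", 0.001), "s2": ("LOWCONFK", 0.2)}
semitryptic = {"s2": ("EPTIDEK", 0.005)}

accepted, remaining = cascade_search(
    ["s1", "s2", "s3"],
    [("tryptic", tryptic), ("semitryptic", semitryptic)],
    lambda spectrum, database: database.get(spectrum),
)
print(accepted, remaining)
```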

Feng Zhang - One of the best experts on this subject based on the ideXlab platform.

  • simultaneous determination of formononetin biochanin a and their active metabolites in human breast milk saliva and urine using salting out assisted liquid liquid extraction and ultra high performance liquid chromatography electrospray ionization tan
    Journal of Chromatography A, 2020
    Co-Authors: Weie Zhou, Qian Wang, Xuefeng Zhou, Yuan Zhang, Yuyang Wang, Zhiqin Ren, Yun Ling, Feng Zhang
    Abstract:

    Isoflavonoid phytoestrogens, referred to as “dietary estrogens”, are widely distributed in the plant kingdom. Formononetin, biochanin A and their active metabolites daidzein and genistein are known to be the most potent among the isoflavonoid phytoestrogens. Thus, there is a growing need to accurately determine their concentrations in different biological fluids. In the present work, a sensitive analytical method was developed for the quantitative determination of these compounds in human breast milk, saliva and urine. The glycoside conjugates of these compounds were enzymatically hydrolyzed prior to salting-out assisted liquid-liquid extraction. Quantitative analysis was done by ultra high performance liquid chromatography coupled with electrospray ionization Tandem Mass spectrometry. The obtained results showed high correlation coefficients (r² > 0.998) for the linear range established for formononetin, biochanin A, daidzein and genistein. The limits of detection (LODs) and lower limits of quantitation (LLOQs) were in the ranges of 0.05–1.0 ng/mL and 1.0–4.0 ng/mL for all analytes in human biological fluids, respectively. The average recoveries ranged from 83.29% to 115.24% for the analytes with relative standard deviation (n = 5) values from 1.84% to 9.75% in samples. Both intra-day and inter-day precisions and accuracy were found to be within 12.53% and ±12.92%, respectively. Under different conditions of stability, the concentrations for the four isoflavonoid phytoestrogens deviated within ±12.87% of nominal values. The developed method was successfully validated and applied to human breast milk, saliva and urine. The average concentrations of daidzein and genistein found in breast milk, saliva and urine samples ranged from 0 to 104.2 µg/kg, 18.17 to 786.0 µg/kg, 0 to 10974 µg/kg, respectively. Their presence in breast milk samples shows exposure of breast-fed babies to isoflavones. It also allows for the rapid screening of human biological fluids when testing for formononetin, biochanin A, daidzein and genistein production status in humans.
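
    The recovery and relative standard deviation (RSD) figures quoted in this abstract are standard validation statistics. As a minimal sketch, they can be computed from replicate measurements of a spiked sample as shown below. The replicate values and spike level are invented for illustration and are not data from the paper.

```python
import statistics


def recovery_percent(measured_mean, spiked_amount):
    """Recovery: mean measured concentration as a percentage of the spiked amount."""
    return 100.0 * measured_mean / spiked_amount


def rsd_percent(values):
    """Relative standard deviation: sample standard deviation over the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical n = 5 replicates (ng/mL) of a sample spiked at 10 ng/mL.
replicates = [9.5, 10.2, 9.8, 10.1, 9.9]
print(recovery_percent(statistics.mean(replicates), 10.0))
print(rsd_percent(replicates))
```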

  • determination of gardenia yellow colorants in soft drink pastry instant noodles with ultrasound assisted extraction by high performance liquid chromatography electrospray ionization Tandem Mass Spectrum
    Journal of Chromatography A, 2016
    Co-Authors: Weie Zhou, Yuan Zhang, Zhiqin Ren, Yun Ling, Shoujun Jiang, Zhiqiang Huang, Feng Zhang
    Abstract:

    A novel, rapid and simple analytical method was developed for the quantitative determination of crocin, crocetin and geniposide in soft drink, pastry and instant noodles. The solid samples were relatively homogenized into powders and fragments. The gardenia yellow colorants were successively extracted with methanol using ultrasound-assisted extraction. The analytes were quantitatively measured in the extracts by liquid chromatography coupled with electrospray ionization Tandem Mass spectrometry. High correlation coefficients (r² > 0.995) were obtained for crocin, crocetin and geniposide within their respective linear ranges (50–1000 ng/mL, 50–1000 ng/mL, 15–240 ng/mL) by the external standard method. The limits of detection (LODs) were 0.02 μg/g for crocin, 0.01 μg/g for crocetin and 0.002 μg/g for geniposide. The limits of quantitation (LOQs) were in the ranges of 0.05–0.45 μg/g for crocin, 0.042–0.32 μg/g for crocetin, and 0.02–0.15 μg/g for geniposide in soft drink, pastry and instant noodles samples. The average recoveries of crocin, crocetin and geniposide ranged from 81.3% to 117.6% in soft drink, pastry and instant noodles. The intra- and inter-day precisions were in the ranges of 1.3–4.8% and 1.7–11.8%, respectively, in soft drink, pastry and instant noodles. The developed method was successfully validated and applied to soft drink, pastry, and instant noodles collected from local markets in Beijing, China. Crocin, crocetin and geniposide were detected in the collected samples. The average concentrations ranged from 0.84 to 4.20 mg/g for crocin, from 0.62 to 3.11 mg/g for crocetin, and from 0.18 to 0.79 mg/g for gardenia in various food samples. The method can provide evidence for the government to determine gardenia yellow pigments and geniposide in food.
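
    External standard quantitation of the kind reported here rests on a linear calibration curve. The sketch below fits one by ordinary least squares, reports r², and inverts the curve to quantify an unknown. The calibration points are synthetic (an exact line), chosen only so the arithmetic is easy to follow; they are not data from the paper.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept, r squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def quantify(signal, slope, intercept):
    """Invert the calibration line to get a concentration from a detector signal."""
    return (signal - intercept) / slope


# Synthetic standards spanning a 50-1000 ng/mL range, with response = 2 * conc + 1.
concentrations = [50.0, 100.0, 250.0, 500.0, 1000.0]
responses = [2.0 * c + 1.0 for c in concentrations]

slope, intercept, r_squared = linear_fit(concentrations, responses)
print(slope, intercept, r_squared)
print(quantify(401.0, slope, intercept))
```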

John R Yates - One of the best experts on this subject based on the ideXlab platform.

  • automated approach for quantitative analysis of complex peptide mixtures from Tandem Mass spectra
    Nature Methods, 2004
    Co-Authors: John D Venable, Mengqiu Dong, James A Wohlschlegel, Andrew Dillin, John R Yates
    Abstract:

    To take advantage of the potential quantitative benefits offered by Tandem Mass spectrometry, we have modified the method in which Tandem Mass Spectrum data are acquired in 'shotgun' proteomic analyses. The proposed method is not data dependent and is based on the sequential isolation and fragmentation of precursor windows (of 10 m/z) within the ion trap until a desired Mass range has been covered. We compared the quantitative figures of merit for this method to those for existing strategies by performing an analysis of the soluble fraction of whole-cell lysates from yeast metabolically labeled in vivo with 15N. To automate this analysis, we modified software (RelEx) previously written in the Yates lab to generate chromatograms directly from Tandem Mass spectra. These chromatograms showed improvements in signal-to-noise ratio of approximately three- to fivefold over corresponding chromatograms generated from Mass spectrometry scans. In addition, to demonstrate the utility of the data-independent acquisition strategy coupled with chromatogram reconstruction from Tandem Mass spectra, we measured protein expression levels in two developmental stages of Caenorhabditis elegans.
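
    The acquisition scheme described here steps through consecutive precursor windows of 10 m/z until the desired range is covered. A minimal sketch of generating such a window list follows. The 400 to 450 m/z range in the example is an assumption, since the abstract gives only the window width.

```python
def isolation_windows(start_mz, end_mz, width=10.0):
    """Return consecutive (low, high) precursor isolation windows covering the range."""
    windows = []
    low = start_mz
    while low < end_mz:
        high = min(low + width, end_mz)  # the final window may be narrower
        windows.append((low, high))
        low = high
    return windows


def window_for(mz, windows):
    """Return the window containing a precursor m/z, or None if out of range."""
    for low, high in windows:
        if low <= mz < high:
            return (low, high)
    return None


windows = isolation_windows(400.0, 450.0)
print(windows)
print(window_for(423.7, windows))
```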

  • a hypergeometric probability model for protein identification and validation using Tandem Mass spectral data and protein sequence databases
    Analytical Chemistry, 2003
    Co-Authors: Rovshan G Sadygov, John R Yates
    Abstract:

    We present a new probability-based method for protein identification using Tandem Mass spectra and protein databases. The method employs a hypergeometric distribution to model frequencies of matches between fragment ions predicted for peptide sequences with a specific (M + H)+ value (at some Mass tolerance) in a protein sequence database and an experimental Tandem Mass Spectrum. The hypergeometric distribution constitutes the null hypothesis: all peptide matches to a Tandem Mass Spectrum are random. It is used to generate a score characterizing the randomness of a database sequence match to an experimental Tandem Mass Spectrum and to determine the level of significance of the null hypothesis. For each Tandem Mass Spectrum and database search, a peptide is identified that has the least probability of being a random match to the Spectrum and the corresponding level of significance of the null hypothesis is determined. To check the validity of the hypergeometric model in describing fragment ion matches, we used χ2...
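
    As a worked illustration of the hypergeometric null model, the sketch below computes the probability of observing at least k fragment matches by chance. The mapping of parameters is an assumption made for this example: N is the total number of predicted fragment ions across all candidate peptides, K is how many of those match the Spectrum, and n is the number of fragments predicted for one peptide.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k matches among n draws from N ions, K of which match."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def hypergeom_tail(k, N, K, n):
    """P(X >= k): chance of at least k matches under the random-match null."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


# Hypothetical numbers: 1000 predicted ions overall, 100 of them matching,
# and one peptide with 20 predicted fragments.
for matches in (2, 5, 10):
    print(matches, hypergeom_tail(matches, N=1000, K=100, n=20))
```

    A peptide with a smaller tail probability is less likely to be a random match, which is the sense in which the abstract ranks candidates.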

  • mining genomes correlating Tandem Mass spectra of modified and unmodified peptides to sequences in nucleotide databases
    Analytical Chemistry, 1995
    Co-Authors: John R Yates, Jimmy K Eng, Ashley L McCormack
    Abstract:

    The correlation of uninterpreted Tandem Mass spectra of modified and unmodified peptides, produced under low-energy (10-50 eV) collision conditions, with nucleotide sequences is demonstrated. In this method nucleotide databases are translated in six reading frames, and the resulting amino acid sequences are searched "on the fly" to identify and fit linear sequences to the fragmentation patterns observed in the Tandem Mass spectra of peptides. A cross-correlation function is then used to provide a measurement of similarity between the Mass-to-charge ratios for the fragment ions predicted by amino acid sequences translated from the nucleotide database and the fragment ions observed in the Tandem Mass Spectrum. In general, a difference greater than 0.1 between the normalized cross-correlation functions for the first- and second-ranked search results indicates a successful match between sequence and Spectrum. Measurements of the deviation from maximum similarity employing the spectral reconstruction method are made. The search method employing nucleotide databases is also demonstrated on the spectra of phosphorylated peptides. Specific sites of modification are identified even though no specific information relevant to sites of modification is contained in the character-based sequence information of nucleotide databases.
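
    The six reading frame translation mentioned in this abstract can be sketched in a few lines. The example uses the standard genetic code and a made-up nine-base sequence. It illustrates the translation step only, not the database search or the cross-correlation scoring.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from the start of `dna`; unknown codons become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return translations of the three forward and three reverse-complement frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:]) for strand in (dna, reverse) for offset in range(3)]


print(six_frame_translation("ATGGCCTAA"))
```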

  • method to correlate Tandem Mass spectra of modified peptides to amino acid sequences in the protein database
    Analytical Chemistry, 1995
    Co-Authors: John R Yates, Ashley L McCormack, David Schieltz
    Abstract:

    A method to correlate uninterpreted Tandem Mass spectra of modified peptides, produced under low-energy (10-50 eV) collision conditions, with amino acid sequences in a protein database has been developed. The fragmentation patterns observed in the Tandem Mass spectra of peptides containing covalent modifications are used to directly search and fit linear amino acid sequences in the database. Specific information relevant to sites of modification is not contained in the character-based sequence information of the databases. The search method considers each putative modification site as both modified and unmodified in one pass through the database and simultaneously considers up to three different sites of modification. The search method will identify the correct sequence if the Tandem Mass Spectrum did not represent a modified peptide. This approach is demonstrated with peptides containing modifications such as S-carboxymethylated cysteine, oxidized methionine, phosphoserine, phosphothreonine, or phosphotyrosine. In addition, a scanning approach is used in which neutral loss scans are used to initiate the acquisition of product ion MS/MS spectra of doubly charged phosphorylated peptides during a single chromatographic run for data analysis with the database-searching algorithm. The approach described in this paper provides a convenient method to match the nascent Tandem Mass spectra of modified peptides to sequences in a protein database and thereby identify previously unknown sites of modification.
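
    Treating each putative site as both modified and unmodified amounts to enumerating modification states for a candidate sequence. The sketch below does this for phosphorylation on S, T and Y. It reads "up to three sites" as at most three modified positions per peptide, which is an interpretation rather than something the abstract spells out, and the function names are hypothetical. The mass shift of 79.96633 Da is the monoisotopic mass of HPO3.

```python
from itertools import combinations

PHOSPHO_SHIFT = 79.96633  # monoisotopic mass added by one phosphorylation (HPO3), in Da


def modification_states(peptide, residues="STY", max_sites=3):
    """Yield tuples of modified positions, from none up to `max_sites` positions."""
    sites = [i for i, aa in enumerate(peptide) if aa in residues]
    for count in range(min(max_sites, len(sites)) + 1):
        yield from combinations(sites, count)


def mass_shifts(peptide, residues="STY", max_sites=3):
    """Map each modification state to the total mass added to the unmodified peptide."""
    return {
        state: len(state) * PHOSPHO_SHIFT
        for state in modification_states(peptide, residues, max_sites)
    }


for state, shift in mass_shifts("ASTYK").items():
    print(state, round(shift, 5))
```

    The empty tuple is the unmodified peptide, so a Spectrum from an unmodified peptide is still covered by the same pass.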

Weie Zhou - One of the best experts on this subject based on the ideXlab platform.

  • simultaneous determination of formononetin biochanin a and their active metabolites in human breast milk saliva and urine using salting out assisted liquid liquid extraction and ultra high performance liquid chromatography electrospray ionization tan
    Journal of Chromatography A, 2020
    Co-Authors: Weie Zhou, Qian Wang, Xuefeng Zhou, Yuan Zhang, Yuyang Wang, Zhiqin Ren, Yun Ling, Feng Zhang
    Abstract:

    Isoflavonoid phytoestrogens, referred to as “dietary estrogens”, are widely distributed in the plant kingdom. Formononetin, biochanin A and their active metabolites daidzein and genistein are known to be the most potent among the isoflavonoid phytoestrogens. Thus, there is a growing need to accurately determine their concentrations in different biological fluids. In the present work, a sensitive analytical method was developed for the quantitative determination of these compounds in human breast milk, saliva and urine. The glycoside conjugates of these compounds were enzymatically hydrolyzed prior to salting-out assisted liquid-liquid extraction. Quantitative analysis was done by ultra high performance liquid chromatography coupled with electrospray ionization Tandem Mass spectrometry. The obtained results showed high correlation coefficients (r² > 0.998) for the linear range established for formononetin, biochanin A, daidzein and genistein. The limits of detection (LODs) and lower limits of quantitation (LLOQs) were in the ranges of 0.05–1.0 ng/mL and 1.0–4.0 ng/mL for all analytes in human biological fluids, respectively. The average recoveries ranged from 83.29% to 115.24% for the analytes with relative standard deviation (n = 5) values from 1.84% to 9.75% in samples. Both intra-day and inter-day precisions and accuracy were found to be within 12.53% and ±12.92%, respectively. Under different conditions of stability, the concentrations for the four isoflavonoid phytoestrogens deviated within ±12.87% of nominal values. The developed method was successfully validated and applied to human breast milk, saliva and urine. The average concentrations of daidzein and genistein found in breast milk, saliva and urine samples ranged from 0 to 104.2 µg/kg, 18.17 to 786.0 µg/kg, 0 to 10974 µg/kg, respectively. Their presence in breast milk samples shows exposure of breast-fed babies to isoflavones. It also allows for the rapid screening of human biological fluids when testing for formononetin, biochanin A, daidzein and genistein production status in humans.

  • determination of gardenia yellow colorants in soft drink pastry instant noodles with ultrasound assisted extraction by high performance liquid chromatography electrospray ionization Tandem Mass Spectrum
    Journal of Chromatography A, 2016
    Co-Authors: Weie Zhou, Yuan Zhang, Zhiqin Ren, Yun Ling, Shoujun Jiang, Zhiqiang Huang, Feng Zhang
    Abstract:

    A novel, rapid and simple analytical method was developed for the quantitative determination of crocin, crocetin and geniposide in soft drink, pastry and instant noodles. The solid samples were relatively homogenized into powders and fragments. The gardenia yellow colorants were successively extracted with methanol using ultrasound-assisted extraction. The analytes were quantitatively measured in the extracts by liquid chromatography coupled with electrospray ionization Tandem Mass spectrometry. High correlation coefficients (r² > 0.995) were obtained for crocin, crocetin and geniposide within their respective linear ranges (50–1000 ng/mL, 50–1000 ng/mL, 15–240 ng/mL) by the external standard method. The limits of detection (LODs) were 0.02 μg/g for crocin, 0.01 μg/g for crocetin and 0.002 μg/g for geniposide. The limits of quantitation (LOQs) were in the ranges of 0.05–0.45 μg/g for crocin, 0.042–0.32 μg/g for crocetin, and 0.02–0.15 μg/g for geniposide in soft drink, pastry and instant noodles samples. The average recoveries of crocin, crocetin and geniposide ranged from 81.3% to 117.6% in soft drink, pastry and instant noodles. The intra- and inter-day precisions were in the ranges of 1.3–4.8% and 1.7–11.8%, respectively, in soft drink, pastry and instant noodles. The developed method was successfully validated and applied to soft drink, pastry, and instant noodles collected from local markets in Beijing, China. Crocin, crocetin and geniposide were detected in the collected samples. The average concentrations ranged from 0.84 to 4.20 mg/g for crocin, from 0.62 to 3.11 mg/g for crocetin, and from 0.18 to 0.79 mg/g for gardenia in various food samples. The method can provide evidence for the government to determine gardenia yellow pigments and geniposide in food.

Pieter C Dorrestein - One of the best experts on this subject based on the ideXlab platform.

  • large scale Tandem Mass Spectrum clustering using fast nearest neighbor searching
    Rapid Communications in Mass Spectrometry, 2021
    Co-Authors: Wout Bittremieux, William Stafford Noble, Kris Laukens, Pieter C Dorrestein
    Abstract:

    RATIONALE: Advanced algorithmic solutions are necessary to process the ever-increasing amounts of Mass spectrometry data that are being generated. In this study, we describe the falcon Spectrum clustering tool for efficient clustering of millions of MS/MS spectra. METHODS: falcon succeeds in efficiently clustering large amounts of Mass spectral data using advanced techniques for fast Spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Next, the Spectrum vectors are used to construct nearest neighbor indexes for fast similarity searching. The nearest neighbor indexes are used to efficiently compute a sparse pairwise distance matrix without having to exhaustively perform all pairwise Spectrum comparisons within the relevant precursor Mass tolerance. Finally, density-based clustering is performed to group similar spectra into clusters. RESULTS: Several state-of-the-art Spectrum clustering tools were evaluated using a large draft human proteome data set consisting of 25 million spectra, indicating that alternative tools produce clustering results with different characteristics. Notably, falcon generates larger highly pure clusters than alternative tools, leading to a larger reduction in data volume without the loss of relevant information for more efficient downstream processing. CONCLUSIONS: falcon is a highly efficient Spectrum clustering tool, which is publicly available as open source under the permissive BSD license at https://github.com/bittremieux/falcon.

  • large scale Tandem Mass Spectrum clustering using fast nearest neighbor searching
    bioRxiv, 2021
    Co-Authors: William Stafford Noble, Wout Bittremieux, Kris Laukens, Pieter C Dorrestein
    Abstract:

    Rationale: Advanced algorithmic solutions are necessary to process the ever-increasing amounts of Mass spectrometry data that are being generated. Here we describe the falcon Spectrum clustering tool for efficient clustering of millions of MS/MS spectra. Methods: falcon succeeds in efficiently clustering large amounts of Mass spectral data using advanced techniques for fast Spectrum similarity searching. First, high-resolution spectra are binned and converted to low-dimensional vectors using feature hashing. Next, the Spectrum vectors are used to construct nearest neighbor indexes for fast similarity searching. The nearest neighbor indexes are used to efficiently compute a sparse pairwise distance matrix without having to exhaustively perform all pairwise Spectrum comparisons within the relevant precursor Mass tolerance. Finally, density-based clustering is performed to group similar spectra into clusters. Results: Several state-of-the-art Spectrum clustering tools were evaluated using a large draft human proteome dataset consisting of 25 million spectra, indicating that alternative tools produce clustering results with different characteristics. Notably, falcon generates larger highly pure clusters than alternative tools, leading to a larger reduction in data volume without the loss of relevant information for more efficient downstream processing. Conclusions: falcon is a highly efficient Spectrum clustering tool. It is publicly available as open source under the permissive BSD license at https://github.com/bittremieux/falcon.