Sequence Analysis

The Experts below are selected from a list of 936003 Experts worldwide ranked by ideXlab platform

Jacques Van Helden - One of the best experts on this subject based on the ideXlab platform.

  • RSAT 2011: regulatory sequence analysis tools
    Nucleic Acids Research, 2011
    Co-Authors: Morgane Thomas-Chollier, Olivier Sand, Matthieu Defrance, Alejandra Medina-Rivera, Carl Herrmann, Denis Thieffry, Jacques Van Helden
    Abstract:

    RSAT (Regulatory Sequence Analysis Tools) comprises a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. Thirteen new programs have been added to the 30 described in the 2008 NAR Web Software Issue, including an automated sequence retrieval from EnsEMBL (retrieve-ensembl-seq), two novel motif discovery algorithms (oligo-diff and info-gibbs), a 100-times faster version of matrix-scan enabling the scanning of genome-scale sequence sets, and a series of facilities for random model generation and statistical evaluation (random-genome-fragments, random-motifs, random-sites, implant-sites, sequence-probability, permute-matrix). Our most recent work also focused on motif comparison (compare-matrices) and evaluation of motif quality (matrix-quality) by combining theoretical and empirical measures to assess the predictive capability of position-specific scoring matrices. To process large collections of peak sequences obtained from ChIP-seq or related technologies, RSAT provides a new program (peak-motifs) that combines several efficient motif discovery algorithms to predict transcription factor binding motifs, match them against motif databases and predict their binding sites. Availability (web site, stand-alone programs and SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services): http://rsat.ulb.ac.be/rsat/.

  • RSAT: regulatory sequence analysis tools
    Nucleic Acids Research, 2008
    Co-Authors: Morgane Thomas-Chollier, Olivier Sand, Rekins Janky, Matthieu Defrance, Eric Vervisch, Sylvain Brohée, Jean Valéry Turatsinze, Jacques Van Helden
    Abstract:

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
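
The Markov background models mentioned in the abstract above can be illustrated with a minimal sketch. This is not RSAT code: the function names `train_markov1` and `log_prob`, the pseudocount value, and the uniform choice for the first letter are all assumptions made for illustration. It estimates first-order transition probabilities from example sequences and uses them to compute the log-probability of a sequence.

```python
import math

def train_markov1(sequences, alphabet="ACGT", pseudo=1.0):
    """Estimate first-order transition probabilities P(b | a).

    A pseudocount is added to every transition so that unseen
    transitions keep a non-zero probability (illustrative choice).
    """
    counts = {a: {b: pseudo for b in alphabet} for a in alphabet}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            # Skip transitions involving letters outside the alphabet (e.g. N).
            if a in counts and b in counts[a]:
                counts[a][b] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in alphabet}
    return probs

def log_prob(seq, probs, alphabet="ACGT"):
    """Natural-log probability of seq under the model.

    Assumes seq only contains letters of the alphabet and that the
    first letter is drawn uniformly (a simplifying assumption).
    """
    lp = math.log(1.0 / len(alphabet))
    for a, b in zip(seq, seq[1:]):
        lp += math.log(probs[a][b])
    return lp

if __name__ == "__main__":
    model = train_markov1(["ACGTACGT"])
    print(model["A"])
    print(log_prob("ACGT", model))
```

A Bernoulli model is the special case in which each letter is drawn independently of the previous one, so the four rows of the transition table would be identical.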

  • Regulatory sequence analysis tools
    Nucleic Acids Research, 2003
    Co-Authors: Jacques Van Helden
    Abstract:

    The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.
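
To make the two motif representations named above concrete (strings versus position-specific scoring matrices), here is a minimal sketch, assuming a handful of made-up aligned sites, a uniform background and a pseudocount of 1. The helper names `count_matrix`, `log_odds` and `scan` are hypothetical and do not correspond to RSAT programs. It builds a count matrix from the site strings, converts it to log-odds weights, and scores every window of a sequence.

```python
import math

def count_matrix(sites, alphabet="ACGT"):
    """Per-position letter counts from equal-length aligned site strings."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    return [{b: sum(s[i] == b for s in sites) for b in alphabet}
            for i in range(width)]

def log_odds(counts, background=None, pseudo=1.0):
    """Convert counts to log-odds weights ln(P(letter | motif) / P(letter | background)).

    The pseudocount is distributed according to the background
    frequencies (uniform by default); both are illustrative choices.
    """
    alphabet = list(counts[0])
    if background is None:
        background = {b: 1.0 / len(alphabet) for b in alphabet}
    weights = []
    for col in counts:
        total = sum(col.values()) + pseudo
        weights.append({
            b: math.log((col[b] + pseudo * background[b]) / total / background[b])
            for b in alphabet
        })
    return weights

def scan(seq, weights):
    """Return (position, score) for every window of seq; seq must use the matrix alphabet."""
    w = len(weights)
    return [(i, sum(weights[j][seq[i + j]] for j in range(w)))
            for i in range(len(seq) - w + 1)]

if __name__ == "__main__":
    sites = ["TGACTC", "TGACTA", "TGAGTC"]  # made-up example sites
    pssm = log_odds(count_matrix(sites))
    best = max(scan("AAATGACTCAAA", pssm), key=lambda hit: hit[1])
    print(best)
```

The string form is compact but treats every match as equal, whereas the matrix form lets partially matching windows receive graded scores.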

Sean R. Eddy - One of the best experts on this subject based on the ideXlab platform.

  • A model of the statistical power of comparative genome sequence analysis
    PLOS Biology, 2005
    Co-Authors: Sean R. Eddy
    Abstract:

    Comparative genome sequence analysis is powerful, but sequencing genomes is expensive. It is desirable to be able to predict how many genomes are needed for comparative genomics, and at what evolutionary distances. Here I describe a simple mathematical model for the common problem of identifying conserved sequences. The model leads to some useful rules of thumb. For a given evolutionary distance, the number of comparative genomes needed for a constant level of statistical stringency in identifying conserved regions scales inversely with the size of the conserved feature to be detected. At short evolutionary distances, the number of comparative genomes required also scales inversely with distance. These scaling behaviors provide some intuition for future comparative genome sequencing needs, such as the proposed use of “phylogenetic shadowing” methods using closely related comparative genomes, and the feasibility of high-resolution detection of small conserved features.
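
The two scaling rules in this abstract can be reproduced with a deliberately simplified calculation, which is an assumption made here for illustration and not the model from the paper. Suppose each of N comparative genomes sits at the same distance D (expected substitutions per neutral site) from the reference, evolves independently, and a neutral site shows no substitution in one genome with probability exp(-D). A neutral window of L sites then looks perfectly conserved in all N genomes with probability exp(-N·D·L). Requiring this false-positive probability to stay below a threshold alpha gives N ≥ ln(1/alpha) / (D·L), which is inversely proportional to both L and D. The function name `genomes_needed` and the default alpha are hypothetical.

```python
import math

def genomes_needed(feature_len, distance, alpha=1e-3):
    """Smallest N such that exp(-N * distance * feature_len) <= alpha.

    Simplified star-phylogeny sketch: independent genomes, equal
    distances, and conservation defined as zero substitutions.
    """
    if feature_len <= 0 or distance <= 0 or not 0 < alpha < 1:
        raise ValueError("feature_len and distance must be positive, 0 < alpha < 1")
    return math.ceil(math.log(1.0 / alpha) / (distance * feature_len))

if __name__ == "__main__":
    for length in (8, 16, 32):
        print(length, genomes_needed(length, distance=0.1))
```

Under these assumptions, doubling the feature length or doubling the distance roughly halves the number of genomes required; the approximation ignores back-mutation and saturation, so it is only meaningful at short distances.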

  • Biological sequence analysis: probabilistic models of proteins and nucleic acids
    1998
    Co-Authors: Richard Durbin, Anders Krogh, Sean R. Eddy, Graeme Mitchison
    Abstract:

    Probabilistic models are becoming increasingly important in analyzing the huge amount of data being produced by large-scale DNA-sequencing efforts such as the Human Genome Project. For example, hidden Markov models are used for analyzing biological sequences, linguistic-grammar-based probabilistic models for identifying RNA secondary structure, and probabilistic evolutionary models for inferring phylogenies of sequences from different organisms. This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally of probabilistic methods of sequence analysis. Written by an interdisciplinary team of authors, it is accessible to molecular biologists, computer scientists, and mathematicians with no formal knowledge of the other fields, and at the same time presents the state of the art in this new and important field.

  • RNA sequence analysis using covariance models
    Nucleic Acids Research, 1994
    Co-Authors: Sean R. Eddy, Richard Durbin
    Abstract:

    We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences.

Sebastien Jaeger - One of the best experts on this subject based on the ideXlab platform.

  • RSAT 2015: regulatory sequence analysis tools
    Nucleic Acids Research, 2015
    Co-Authors: Alejandra Medina-Rivera, Olivier Sand, Matthieu Defrance, Carl Herrmann, Jaime A. Castro-Mondragon, Sebastien Jaeger, Jeremy Delerce
    Abstract:

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

Carl Herrmann - One of the best experts on this subject based on the ideXlab platform.

  • RSAT 2015: regulatory sequence analysis tools
    Nucleic Acids Research, 2015
    Co-Authors: Alejandra Medina-Rivera, Olivier Sand, Matthieu Defrance, Carl Herrmann, Jaime A. Castro-Mondragon, Sebastien Jaeger, Jeremy Delerce

  • RSAT 2011: regulatory sequence analysis tools
    Nucleic Acids Research, 2011
    Co-Authors: Morgane Thomas-Chollier, Olivier Sand, Matthieu Defrance, Alejandra Medina-Rivera, Carl Herrmann, Denis Thieffry, Jacques Van Helden

Alfred Pühler - One of the best experts on this subject based on the ideXlab platform.

  • The Sequence Analysis and Management System SAMS-2.0: data management and sequence analysis adapted to changing requirements from traditional Sanger sequencing to ultrafast sequencing technologies
    Journal of Biotechnology, 2009
    Co-Authors: Thomas Bekel, Kolja Henckel, Helge Küster, Folker Meyer, Virginie Mittard-Runte, Heiko Neuweger, Daniel Paarmann, Oliver Rupp, Martha Zakrzewski, Alfred Pühler
    Abstract:

    DNA sequencing plays a more and more important role in various fields of genetics. This includes sequencing of whole genomes, libraries of cDNA clones and probes of metagenome communities. The applied sequencing technologies evolve permanently. With the emergence of ultrafast sequencing technologies, a new era of DNA sequencing has recently started. Concurrently, the needs for adapted bioinformatics tools arise. Since the ability to process current datasets efficiently is essential for modern genetics, a modular bioinformatics platform providing extensive sequence analysis methods, is designated to achieve well the constantly growing requirements. The Sequence Analysis and Management System (SAMS) is a bioinformatics software platform with a database backend designed to support the computational analysis of (1) whole genome shotgun (WGS) bacterial genome sequencing, (2) cDNA sequencing by reading expressed sequence tags (ESTs) as well as (3) sequence data obtained by ultrafast sequencing. It provides extensive bioinformatics analysis of sequenced single reads, sequencing libraries and fragments of arbitrary DNA sequences such as assembled contigs of metagenome reads for instance. The system has been implemented to cope with several thousands of sequences, efficiently processing them and storing the results for further analysis. With the project setup, SAMS automatically recognizes the data type.
