Sequence Alignment

The Experts below are selected from a list of 204795 Experts worldwide ranked by the ideXlab platform.

John Kececioglu - One of the best experts on this subject based on the ideXlab platform.

  • Inverse Sequence Alignment from partial examples
    Workshop on Algorithms in Bioinformatics, 2007
    Co-Authors: John Kececioglu
    Abstract:

    When aligning biological Sequences, the choice of parameter values for the Alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different Alignments. A rigorous way to compute parameter values that are appropriate for biological Sequences is inverse parametric Sequence Alignment. Given a collection of examples of biologically correct Alignments, this is the problem of finding parameter values that make the example Alignments score close to optimal. We extend prior work on inverse Alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological Alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple Sequence Alignment by up to 25%.
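
The idea of making example Alignments "score close to optimal" can be illustrated with a small sketch. It is not the paper's method: it assumes a hypothetical three-parameter scoring function (match score, mismatch penalty, linear gap penalty), measures error as the plain difference between the optimal score and the example's score, and picks a gap penalty from a candidate list by brute force. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

# Hypothetical parametrization: (match score, mismatch penalty, gap penalty).
Params = Tuple[float, float, float]


def alignment_score(row_a: str, row_b: str, params: Params) -> float:
    """Score a given pairwise alignment (two equal-length rows, '-' marks a gap)."""
    match, mismatch, gap = params
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    total = 0.0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total -= gap
        elif x == y:
            total += match
        else:
            total -= mismatch
    return total


def optimal_score(a: str, b: str, params: Params) -> float:
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty)."""
    match, mismatch, gap = params
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-gap * i]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else -mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] - gap, curr[j - 1] - gap))
        prev = curr
    return prev[-1]


def average_error(examples: List[Tuple[str, str]], params: Params) -> float:
    """Mean of (optimal score - example score); zero means every example is optimal."""
    if not examples:
        raise ValueError("need at least one example alignment")
    errors = []
    for row_a, row_b in examples:
        a, b = row_a.replace("-", ""), row_b.replace("-", "")
        errors.append(optimal_score(a, b, params) - alignment_score(row_a, row_b, params))
    return sum(errors) / len(errors)


def best_gap_penalty(examples: List[Tuple[str, str]], candidates: Sequence[float],
                     match: float = 1.0, mismatch: float = 1.0) -> float:
    """Brute-force choice of the candidate gap penalty with the smallest average error."""
    return min(candidates, key=lambda g: average_error(examples, (match, mismatch, g)))
```

For instance, the example alignment `AC-GT` / `A-GGT` is optimal under a gap penalty of 0.25 but not under 1.0, so `best_gap_penalty([("AC-GT", "A-GGT")], [1.0, 0.25, 2.0])` selects 0.25. A grid search like this only scales to a few parameters; it is meant to show the objective, not an efficient solver.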

  • WABI - Inverse Sequence Alignment from partial examples
    Lecture Notes in Computer Science, 2007
    Co-Authors: John Kececioglu

  • A polyhedral approach to Sequence Alignment problems
    Discrete Applied Mathematics, 2000
    Co-Authors: John Kececioglu, Knut Reinert, Hans-Peter Lenhof, Kurt Mehlhorn, Petra Mutzel, Martin Vingron
    Abstract:

    We study two new problems in Sequence Alignment from both a practical and a theoretical point of view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The generalized maximum trace formulation captures several forms of multiple Sequence Alignment problems in a common framework, among them the original formulation of maximum trace. The RNA Sequence Alignment problem captures the comparison of RNA molecules on the basis of their primary Sequence and their secondary structure. Both problems have a characterization in terms of graphs which we reformulate in terms of integer linear programming. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial-time algorithm for pairwise Sequence Alignment that is not based on dynamic programming. Moreover, for multiple Sequences the branch-and-cut algorithms for both Sequence Alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.

David W. Mount - One of the best experts on this subject based on the ideXlab platform.

  • Using multiple Sequence Alignment editors and formatters.
    Cold Spring Harbor Protocols, 2009
    Co-Authors: David W. Mount
    Abstract:

    Sequence Alignment editors enable the user to manually edit a multiple Sequence Alignment (MSA) in order to obtain a more reasonable or expected Alignment. Editors allow Sequences to be reordered and/or modified using the computer's cut and paste commands. They are designed to accept various MSA formats and to provide the output file in a suitable user-designated format. Sequence formatters provide various output formatting options, such as color and shading schemes to enhance visualization of residue Alignments. The formatters can output files in PostScript, EPS, RTF, and other widely recognized formats, while accepting the standard input formats, such as MSF, ALN, and FASTA. This article introduces a number of Sequence Alignment editors and formatters, and provides links to sites where they can be found.
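
As a rough illustration of what accepting one of these input formats involves, the sketch below reads an Alignment stored as FASTA text (one `>` header per Sequence, gaps written as `-`) and exposes its columns. It is a minimal reader written for this example, not the parser of any particular editor or formatter, and the function names are hypothetical.

```python
from typing import Dict, List


def read_aligned_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: aligned row}; a row may span several lines."""
    rows: Dict[str, str] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            rows[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            rows[name] += line
    # In an alignment every row must have the same number of columns.
    if len({len(row) for row in rows.values()}) > 1:
        raise ValueError("aligned rows must all have the same length")
    return rows


def alignment_columns(rows: Dict[str, str]) -> List[str]:
    """Return the alignment column by column, e.g. for coloring or shading."""
    return ["".join(column) for column in zip(*rows.values())]
```

A formatter would typically work on the column view, since shading schemes are applied per column of residues.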

David Fernández-Baca - One of the best experts on this subject based on the ideXlab platform.

  • COCOON - Inverse Parametric Sequence Alignment
    Lecture Notes in Computer Science, 2002
    Co-Authors: Fangting Sun, David Fernández-Baca
    Abstract:

    We consider the inverse parametric Sequence Alignment problem, where a Sequence Alignment is given and the task is to determine parameter values such that the given Alignment is optimal at that parameter setting. We describe an O(mn log n)-time algorithm for inverse global Alignment without gap penalty and an O(mn log m)-time algorithm for global Alignment with gap penalty, where m and n (n ≤ m) are the lengths of the input strings. We then discuss algorithms for local Alignment.
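
The problem statement can be written down compactly. The formulation below is a sketch that assumes a commonly used parametric scoring function; the abstract does not spell out the scoring model, so the symbols w, x, y, z and the parameters α, β, γ are assumptions made for illustration.

```latex
% Assumed notation: for an alignment A of two strings,
% w(A), x(A), y(A), z(A) = numbers of matches, mismatches, indels (spaces), and gaps.
\[
  \mathrm{score}_{\alpha,\beta,\gamma}(A) \;=\; w(A) - \alpha\,x(A) - \beta\,y(A) - \gamma\,z(A)
\]
% Inverse problem: given an alignment A^*, find parameter values at which it is optimal.
\[
  \text{find } (\alpha,\beta,\gamma) \text{ such that }\;
  \mathrm{score}_{\alpha,\beta,\gamma}(A^{*}) \;\ge\; \mathrm{score}_{\alpha,\beta,\gamma}(A)
  \quad \text{for every alignment } A \text{ of the same two strings.}
\]
```

Under this reading, each competing alignment contributes one linear inequality in the parameters, so the set of feasible parameter values is an intersection of half-spaces and therefore convex. The case "without gap penalty" would correspond to fixing γ = 0.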

Jaap Heringa - One of the best experts on this subject based on the ideXlab platform.

  • Contact-based Sequence Alignment
    Nucleic Acids Research, 2004
    Co-Authors: Jens Kleinjung, John W. Romein, Kuang Lin, Jaap Heringa
    Abstract:

    This paper introduces the novel method of contact-based protein Sequence Alignment, where structural information in the form of contact mutation probabilities is incorporated into an Alignment routine using contact-mutation matrices (CAO: Contact Accepted mutatiOn). The contact-based Alignment routine optimizes the score of matched contacts, which involves four (two per contact) instead of two residues per match in pairwise Alignments. The first contact refers to a real side-chain contact in a template Sequence with known structure, and the second contact is the equivalent putative contact of a homologous query Sequence with unknown structure. An algorithm has been devised to perform a pairwise Sequence Alignment based on contact information. The contact scores were combined with PAM-type (Point Accepted Mutation) substitution scores after parameterization of gap penalties and score weights by means of a genetic algorithm. We show that owing to the structural information contained in the CAO matrices, significantly improved Alignments of distantly related Sequences can be obtained. This has allowed us to annotate eight putative Drosophila IGF Sequences. Contact-based Sequence Alignment should therefore prove useful in comparative modelling and fold recognition.

  • An overview of multiple Sequence Alignment.
    Current Protocols in Bioinformatics, 2003
    Co-Authors: Victor Simossis, Jens Kleinjung, Jaap Heringa
    Abstract:

    Multiple Sequence Alignment is perhaps the most commonly applied bioinformatics technique. It often leads to fundamental biological insight into Sequence-structure-function relationships of nucleotide or protein Sequence families. In this unit, an overview of multiple Sequence Alignment techniques is presented, covering a history of nearly 30 years from the early pioneering methods to the current state-of-the-art techniques. Methodological and biological issues and end-user considerations, as well as Alignment evaluation issues, are discussed.

Erik Pitzer - One of the best experts on this subject based on the ideXlab platform.

  • EUROCAST - Parallel progressive multiple Sequence Alignment
    Lecture Notes in Computer Science, 2005
    Co-Authors: Erik Pitzer
    Abstract:

    Multiple Sequence Alignment is an essential tool in the analysis and comparison of biological Sequences. Unfortunately, the complexity of this problem is exponential. Currently feasible methods are, therefore, only approximations. The progressive multiple Sequence Alignment algorithms are the most widespread among these approximations. Still, the computation time for typical problems is often unsatisfactory. Hence, the well-known progressive Alignment scheme of ClustalW has been parallelized to further accelerate the computation. In the course of this work, a unique scheme to parallelize Sequence Alignment in particular, and dynamic programming in general, was discovered, which yields an average of n / 2 parallel calculations for problem size n. The scalability of O(n) tasks for problem size n can even be maintained for slower networks.
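
The abstract does not describe the parallelization scheme itself. The sketch below shows a standard way to expose parallelism in Alignment dynamic programming, the anti-diagonal (wavefront) order, which may or may not match the author's scheme. It relies on the fact that each cell of the pairwise Alignment matrix depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal can be computed independently. The scoring values and function names are assumptions for illustration, and the inner loop is written sequentially where a parallel map would go.

```python
from typing import List, Tuple


def wavefront_order(rows: int, cols: int) -> List[List[Tuple[int, int]]]:
    """Group matrix cells (i, j) by anti-diagonal i + j.

    Cells within one group do not depend on each other.
    """
    waves = []
    for d in range(rows + cols - 1):
        lo = max(0, d - cols + 1)
        hi = min(rows - 1, d)
        waves.append([(i, d - i) for i in range(lo, hi + 1)])
    return waves


def nw_score_wavefront(a: str, b: str, match: int = 1,
                       mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score, filling the matrix one anti-diagonal at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for wave in wavefront_order(rows, cols):
        # Every cell in this wave reads only cells from earlier waves,
        # so this loop could be handed to parallel workers.
        for i, j in wave:
            if i == 0:
                score[i][j] = j * gap
            elif j == 0:
                score[i][j] = i * gap
            else:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                score[i][j] = max(score[i - 1][j - 1] + sub,
                                  score[i - 1][j] + gap,
                                  score[i][j - 1] + gap)
    return score[-1][-1]
```

For a square n × n matrix this order has 2n − 1 waves covering n² cells, so the average wave holds n² / (2n − 1) ≈ n / 2 cells, which is consistent with the figure quoted in the abstract.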