Structural Motif

The Experts below are selected from a list of 54003 Experts worldwide ranked by ideXlab platform

Hassan Mathkour - One of the best experts on this subject based on the ideXlab platform.

  • IncMD: incremental trie-based Structural Motif discovery algorithm.
    Journal of bioinformatics and computational biology, 2014
    Co-Authors: Ghada Badr, Isra Al-turaiki, Marcel Turcotte, Hassan Mathkour
    Abstract:

    The discovery of common RNA secondary structure Motifs is an important problem in bioinformatics. The presence of such Motifs is usually associated with key biological functions. However, the identification of Structural Motifs is far from easy. Unlike Motifs in sequences, which have conserved bases, Structural Motifs have common structure arrangements even if the underlying sequences are different. Over the past few years, hundreds of algorithms have been published for the discovery of sequential Motifs, while less work has been done for the Structural Motif case. Current Structural Motif discovery algorithms are limited in terms of accuracy and scalability. In this paper, we present an incremental and scalable algorithm for discovering RNA secondary structure Motifs, namely IncMD. We treat Structural Motif discovery as a frequent pattern mining problem and tackle it using a modified Apriori algorithm. IncMD uses trie-based data structures, linked lists of prefixes (LLP), to accelerate the search and retrieval of patterns, support counting, and candidate generation. We modify the candidate generation step in order to adapt it to the RNA secondary structure representation. IncMD constructs the frequent patterns incrementally from RNA secondary structure basic elements, using nesting and joining operations. The notion of a Motif group is introduced in order to simulate an alignment of Motifs that differ only in the number of unpaired bases. In addition, we use a cluster beam approach to select the Motifs that will survive to the next iterations of the search. Results indicate that IncMD can perform better than some of the available Structural Motif discovery algorithms in terms of sensitivity (Sn), positive predictive value (PPV), and specificity (Sp). The empirical results also show that the algorithm is scalable and runs faster than all of the compared algorithms.
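
The level-wise idea behind Apriori-style mining (count support, keep the frequent patterns, join them into longer candidates) can be sketched on dot-bracket strings. This is a minimal illustration and not IncMD itself: it mines contiguous substrings only, and it omits the LLP tries, the nesting and joining operations, the Motif groups, and the cluster beam selection described in the abstract. The function name `frequent_patterns` and the three example structures are made up for this sketch.

```python
def frequent_patterns(structures, min_support):
    """Level-wise mining of contiguous patterns found in >= min_support structures."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    frequent = {}
    # Level 1 candidates: every symbol that occurs in the input.
    level = sorted(set("".join(structures)))
    while level:
        kept = []
        for pattern in level:
            # Support counting: number of structures that contain the candidate.
            support = sum(pattern in s for s in structures)
            if support >= min_support:
                frequent[pattern] = support
                kept.append(pattern)
        # Candidate generation: join two frequent patterns of length k that
        # overlap in k-1 symbols, giving a candidate of length k+1.
        level = sorted({p + q[-1] for p in kept for q in kept if p[1:] == q[:-1]})
    return frequent


# Hypothetical toy input: three secondary structures in dot-bracket notation.
structures = ["((..))", "(..)", "..((..)).."]
result = frequent_patterns(structures, min_support=3)
longest = max(result, key=len)
```

Because a pattern can only be frequent if its shorter sub-patterns are frequent, each level prunes most of the search space before longer candidates are generated.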

  • Classification and assessment tools for Structural Motif discovery algorithms.
    BMC bioinformatics, 2013
    Co-Authors: Ghada Badr, Isra Al-turaiki, Hassan Mathkour
    Abstract:

    Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be Structural (e.g. when discovering RNA Motifs). Finding common Structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA Motifs, which are sequentially conserved, RNA Motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential Motif discovery problem, while less work has been done for the Structural case. In this paper, we survey, classify, and compare different algorithms that solve the Structural Motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different Motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for Structural Motif discovery. Results show that the accuracy of discovered Motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. We have classified and evaluated the performance of available Structural Motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.
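
The abstract does not say which measures the proposed measurement tool reports. The IncMD abstract above names sensitivity (Sn), positive predictive value (PPV), and specificity (Sp), so the sketch below computes those three at the nucleotide-position level. Scoring by position, the function name `motif_metrics`, and the example positions are assumptions made for illustration.

```python
def motif_metrics(predicted, reference, length):
    """Position-level Sn, PPV and Sp for one sequence of the given length."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)   # motif positions that were found
    fp = len(predicted - reference)   # predicted positions outside the motif
    fn = len(reference - predicted)   # motif positions that were missed
    tn = length - tp - fp - fn        # positions correctly left unmarked

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "Sn": ratio(tp, tp + fn),
        "PPV": ratio(tp, tp + fp),
        "Sp": ratio(tn, tn + fp),
    }


# Hypothetical example: a 20-nt sequence, reference motif at positions 4-9,
# predicted motif at positions 2-7.
scores = motif_metrics(predicted=range(2, 8), reference=range(4, 10), length=20)
```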

E. N. Lyukmanova - One of the best experts on this subject based on the ideXlab platform.

  • Three-finger proteins from the Ly6/uPAR family: Functional diversity within one Structural Motif
    Biochemistry (Moscow), 2017
    Co-Authors: N. A. Vasilyeva, E. V. Loktyushov, M. L. Bychkov, Z. O. Shenkarev, E. N. Lyukmanova
    Abstract:

    The discovery in higher animals of proteins from the Ly6/uPAR family, which have Structural homology with snake “three-finger” neurotoxins, has generated great interest in these molecules and their role in the functioning of the organism. These proteins have been found in the nervous, immune, endocrine, and reproductive systems of mammals. There are two types of Ly6/uPAR proteins: those associated with the cell membrane by a GPI anchor and secreted ones. For some of them (Lynx1, SLURP-1, SLURP-2, Lypd6), as well as for snake α-neurotoxins, the target of action is nicotinic acetylcholine receptors, which are widely represented in the central and peripheral nervous systems, and in many other tissues, including epithelial cells and the immune system. However, the targets of most proteins from the Ly6/uPAR family and the mechanism of their action remain unknown. This review presents data on the Structural and functional properties of the Ly6/uPAR proteins, which reveal a variety of functions within a single Structural Motif.

Alexander S. Rose - One of the best experts on this subject based on the ideXlab platform.

  • Real-time Structural Motif searching in proteins using an inverted index strategy
    PLoS computational biology, 2020
    Co-Authors: Sebastian Bittrich, Stephen K. Burley, Alexander S. Rose
    Abstract:

    Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain, and, typically, Structural Motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site that may be remote from one another in amino acid sequence. Detection of such Structural Motifs can provide valuable insights into the function(s) of previously uncharacterized proteins. Technically, this remains an extremely challenging problem because of the size of the Protein Data Bank (PDB) archive. Existing methods depend on a clustering by sequence similarity and can be computationally slow. We have developed a new approach that uses an inverted index strategy capable of analyzing >170,000 PDB structures with unmatched speed. The efficiency of the inverted index method depends critically on identifying the small number of structures containing the query Motif and ignoring most of the structures that are irrelevant. Our approach (implemented at Motif.rcsb.org) enables real-time retrieval and superposition of Structural Motifs, either extracted from a reference structure or uploaded by the user. Herein, we describe the method and present five case studies that exemplify its efficacy and speed for analyzing 3D structures of both proteins and nucleic acids.
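
The filtering step the abstract stresses, finding the few structures that could contain the query while ignoring the rest, can be illustrated with a toy inverted index. This is a sketch under simplifying assumptions, not the published method: each residue pair is reduced to its two residue labels plus a binned distance, there is no distance tolerance or angle descriptor, and no superposition step follows. All function names, structure ids, and coordinates below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def pair_keys(residues, bin_width=1.0):
    """Describe every residue pair by its sorted labels and a binned distance."""
    keys = set()
    for (label_a, pos_a), (label_b, pos_b) in combinations(residues, 2):
        first, second = sorted((label_a, label_b))
        keys.add((first, second, int(dist(pos_a, pos_b) // bin_width)))
    return keys


def build_index(structures):
    """Map each pair descriptor to the set of structure ids that contain it."""
    index = defaultdict(set)
    for structure_id, residues in structures.items():
        for key in pair_keys(residues):
            index[key].add(structure_id)
    return index


def candidates(index, motif):
    """Intersect posting lists: only structures with every query pair survive."""
    postings = [index.get(key, set()) for key in pair_keys(motif)]
    return set.intersection(*postings) if postings else set()


# Made-up coordinates (in angstroms) for a His-Asp-Ser query and three structures.
motif = [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.2, 0.0, 0.0)), ("SER", (0.0, 4.5, 0.0))]
structures = {
    "s1": motif + [("GLY", (10.0, 10.0, 10.0))],
    "s2": [("HIS", (0.0, 0.0, 0.0)), ("ASP", (8.0, 0.0, 0.0)), ("SER", (0.0, 4.5, 0.0))],
    "s3": [("ALA", (0.0, 0.0, 0.0)), ("GLY", (1.5, 0.0, 0.0))],
}
index = build_index(structures)
hits = candidates(index, motif)
```

The index is built once, so a query touches only the posting lists for its own descriptors rather than every structure. Hard bin edges can miss near matches, which is one reason a real system needs tolerances and a geometric verification pass.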

  • Real-time Structural Motif searching in proteins using an inverted index strategy
    2020
    Co-Authors: Sebastian Bittrich, Stephen K. Burley, Alexander S. Rose
    Abstract:

    Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain, and, typically, Structural Motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site. Detection of such Structural Motifs can provide valuable insights into the function(s) of previously uncharacterized proteins. Technically, this remains an extremely challenging problem because of the size of the Protein Data Bank (PDB) archive. Existing methods depend on a clustering by sequence similarity and can be computationally slow. We have developed a new approach that uses an inverted index strategy capable of analyzing >160,000 PDB structures with unmatched speed. The efficiency of the inverted index method depends critically on identifying the small number of structures containing the query Motif and ignoring most of the structures that are irrelevant. Our approach (implemented at Motif.rcsb.org) enables real-time retrieval and superposition of Structural Motifs, either extracted from a reference structure or uploaded by the user. Herein, we describe the method and present five case studies that exemplify its efficacy and speed for analyzing 3D structures of both proteins and nucleic acids.

    Author summary: The Protein Data Bank (PDB) provides open access to more than 160,000 three-dimensional structures of proteins, nucleic acids, and biological complexes. Similarities between PDB structures give valuable functional and evolutionary insights but such resemblance may not be evident at sequence or global structure level. Throughout the database, there are recurring Structural Motifs – groups of modest numbers of residues in proximity that, for example, support catalytic activity. Identification of common Structural Motifs can unveil subtle similarities between proteins and serve as fingerprints for configurations such as the His-Asp-Ser catalytic triad found in serine proteases or the zinc coordination site found in Zinc Finger DNA-binding domains. We present a highly efficient yet flexible strategy that allows users for the first time to search for arbitrary Structural Motifs across the entire PDB archive in real-time. Our approach scales favorably with the increasing number and complexity of deposited structures, and, also, has the potential to be adapted for other applications in a macromolecular context.

Ghada Badr - One of the best experts on this subject based on the ideXlab platform.

  • IncMD: incremental trie-based Structural Motif discovery algorithm.
    Journal of bioinformatics and computational biology, 2014
    Co-Authors: Ghada Badr, Isra Al-turaiki, Marcel Turcotte, Hassan Mathkour

  • Classification and assessment tools for Structural Motif discovery algorithms.
    BMC bioinformatics, 2013
    Co-Authors: Ghada Badr, Isra Al-turaiki, Hassan Mathkour

N. A. Vasilyeva - One of the best experts on this subject based on the ideXlab platform.

  • Three-finger proteins from the Ly6/uPAR family: Functional diversity within one Structural Motif
    Biochemistry (Moscow), 2017
    Co-Authors: N. A. Vasilyeva, E. V. Loktyushov, M. L. Bychkov, Z. O. Shenkarev, E. N. Lyukmanova