Pseudogene

The Experts below are selected from a list of 318 Experts worldwide ranked by the ideXlab platform

Kevin V. Morris - One of the best experts on this subject based on the ideXlab platform.

  • The emerging role of Pseudogene-expressed non-coding RNAs in cellular functions
    The International Journal of Biochemistry & Cell Biology, 2014
    Co-Authors: Jessica N Groen, Kevin V. Morris, David Capraro
    Abstract:

    A paradigm shift is sweeping modern day molecular biology following the realisation that large amounts of “junk DNA”, thought initially to be evolutionary remnants, may actually be functional. Several recent studies support a functional role for Pseudogene-expressed non-coding RNAs in regulating their protein-coding counterparts. Several hundred Pseudogenes have been reported as transcribed into RNA in a large variety of tissues and tumours. Most studies have focused on Pseudogenes expressed in the sense direction, but some reports suggest that Pseudogenes can also be transcribed as antisense RNAs (asRNAs). A few examples of key regulatory genes, such as PTEN and OCT4, have in fact been reported to be under the regulation of Pseudogene-expressed asRNAs. Here, we review what is known about Pseudogene-expressed non-coding RNA-mediated gene regulation and its role in the control of epigenetic states. This article is part of a Directed Issue entitled: The Non-coding RNA Revolution.

  • Not so pseudo anymore: Pseudogenes as therapeutic targets.
    Pharmacogenomics, 2013
    Co-Authors: Thomas C. Roberts, Kevin V. Morris
    Abstract:

    Pseudogenes are junk DNA gene remnants generated by inactivating mutations or the loss of regulatory sequences, often following gene duplication or retrotransposition events. These Pseudogenes have previously been considered to be molecular fossils derived from once-coding genes. In many cases, Pseudogenes confer no observable selective advantage to the host organism and may be on a path towards removal from the genome. However, Pseudogenes can also serve as raw material for the exaptation of novel functions, particularly in relation to the regulation of gene expression. Many Pseudogenes are resurrected as noncoding RNA genes, which function in RNA-based gene regulatory circuits. As such, functional Pseudogenes might simply be considered as ‘genes’. Here, we discuss the role of these Pseudogene-derived RNAs as regulators of gene expression in the context of human disease. In particular, we consider the manipulation of Pseudogene transcripts through the use of antisense oligonucleotides, siRNAs, aptamers o...

Yan Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Pseudogene-gene functional networks are prognostic of patient survival in breast cancer
    BMC Medical Genomics, 2020
    Co-Authors: Travis S Johnson, Kun Huang, Sasha Smerekanych, Yan Zhang
    Abstract:

    BACKGROUND: Given the vast range of molecular mechanisms giving rise to breast cancer, it is unlikely that universal cures exist. However, by providing a more precise prognosis for breast cancer patients through integrative models, treatments can become more individualized, resulting in more successful outcomes. Specifically, we combine gene expression, Pseudogene expression, miRNA expression, clinical factors, and Pseudogene-gene functional networks to generate these models for breast cancer prognostics. Establishing a LASSO-generated molecular gene signature revealed that increased expression of the genes STXBP5, GALP and LOC387646 indicates a poor prognosis for a breast cancer patient. We also found that increased CTSLP8 and RPS10P20 and decreased HLA-K Pseudogene expression indicate poor prognosis for a patient. Perhaps most importantly, we identified a Pseudogene-gene interaction, GPS2-GPS2P1 (improved prognosis), that is prognostic where neither the gene nor the Pseudogene alone is prognostic of survival. In addition, miR-3923 was predicted to target GPS2 using miRanda, PicTar, and TargetScan, which implies modules of gene-Pseudogene-miRNAs that are potentially functionally related to patient survival.

    RESULTS: In our LASSO-based model, we take into account features including Pseudogenes, genes and candidate Pseudogene-gene interactions. Key biomarkers were identified from the features. The identification of key biomarkers in combination with significant clinical factors (such as stage and radiation therapy status) should be considered as well, enabling a specific prognostic prediction and future treatment plan for an individual patient. Here we used our PseudoFuN web application to identify the candidate Pseudogene-gene interactions as candidate features in our integrative models. We further identified potential miRNAs targeting those features in our models using PseudoFuN as well. From this study, we present an interpretable survival model based on LASSO and decision trees. We also provide a novel feature set which includes Pseudogene-gene interaction terms that have been ignored by previous prognostic models. We find that some interaction terms for Pseudogenes and genes are significantly prognostic of survival. These interactions are cross-over interactions, where the impact of the gene expression on survival changes with Pseudogene expression and vice versa. These may imply more complicated regulation mechanisms than previously understood.

    CONCLUSIONS: We recommend these novel feature sets be considered when training other types of prognostic models as well, which may provide more comprehensive insights into personalized treatment decisions.

  • Network analysis of Pseudogene-gene relationships: from Pseudogene evolution to their functional potentials
    Pacific Symposium on Biocomputing, 2018
    Co-Authors: Travis S Johnson, Sihong Li, Kun Huang, Yan Zhang
    Abstract:

    Pseudogenes are fossil relatives of genes. Pseudogenes have long been thought of as “junk DNAs”, since they do not code proteins in normal tissues. Although most of the human Pseudogenes do not have noticeable functions, ~20% of them exhibit transcriptional activity. There has been evidence showing that some Pseudogenes adopted functions as lncRNAs and work as regulators of gene expression. Furthermore, Pseudogenes can even be “reactivated” in some conditions, such as cancer initiation. Some Pseudogenes are transcribed in specific cancer types, and some are even translated into proteins as observed in several cancer cell lines. All the above have shown that Pseudogenes could have functional roles or potentials in the genome. Evaluating the relationships between Pseudogenes and their gene counterparts could help us reveal the evolutionary path of Pseudogenes and associate Pseudogenes with functional potentials. It also provides an insight into the regulatory networks involving Pseudogenes with transcriptional and even translational activities. In this study, we develop a novel approach integrating graph analysis, sequence alignment and functional analysis to evaluate Pseudogene-gene relationships, and apply it to human gene homologs and Pseudogenes. We generated a comprehensive set of 445 Pseudogene-gene (PGG) families from the original 3,281 gene families (13.56%). Of these 438 (98.4% PGG, 13.3% total) were non-trivial (containing more than one Pseudogene). Each PGG family contains multiple genes and Pseudogenes with high sequence similarity. For each family, we generate a sequence alignment network and phylogenetic trees recapitulating the evolutionary paths. We find evidence supporting the evolution history of olfactory family (both genes and Pseudogenes) in human, which also supports the validity of our analysis method. 
    Next, we evaluate these networks with respect to the gene ontology, from which we identify functions enriched in these Pseudogene-gene families and infer the functional impact of Pseudogenes involved in the networks. This demonstrates the application of our PGG network database in the study of Pseudogene function in disease context.

  • Comparative analysis of Pseudogenes across three phyla
    Proceedings of the National Academy of Sciences of the United States of America, 2014
    Co-Authors: Cristina Sisu, Yan Zhang, Adam Frankish, Suganthi Balasubramanian, Jing Leng, Rachel A. Harte, Daifeng Wang, Michael Rutenberg-Schoenberg, Wyatt T. Clark
    Abstract:

    Pseudogenes are degraded fossil copies of genes. Here, we report a comparison of Pseudogenes spanning three phyla, leveraging the completed annotations of the human, worm, and fly genomes, which we make available as an online resource. We find that Pseudogenes are lineage specific, much more so than protein-coding genes, reflecting the different remodeling processes marking each organism’s genome evolution. The majority of human Pseudogenes are processed, resulting from a retrotranspositional burst at the dawn of the primate lineage. This burst can be seen in the largely uniform distribution of Pseudogenes across the genome, their preservation in areas with low recombination rates, and their preponderance in highly expressed gene families. In contrast, worm and fly Pseudogenes tell a story of numerous duplication events. In worm, these duplications have been preserved through selective sweeps, so we see a large number of Pseudogenes associated with highly duplicated families such as chemoreceptors. However, in fly, the large effective population size and high deletion rate resulted in a depletion of the Pseudogene complement. Despite large variations between these species, we also find notable similarities. Overall, we identify a broad spectrum of biochemical activity for Pseudogenes, with the majority in each organism exhibiting varying degrees of partial activity. In particular, we identify a consistent amount of transcription (∼15%) across all species, suggesting a uniform degradation process. Also, we see a uniform decay of Pseudogene promoter activity relative to their coding counterparts and identify a number of Pseudogenes with conserved upstream sequences and activity, hinting at potential regulatory roles.
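
    The breast cancer abstract listed above describes "cross-over" interactions, in which the effect of gene expression on survival changes with Pseudogene expression. The following is a minimal Python sketch of that idea only. The coefficients and function names are hypothetical, chosen for illustration; they are not taken from the paper, and this is not the authors' LASSO model.

```python
# Hypothetical coefficients, chosen only to illustrate a cross-over
# interaction; they are NOT estimates from the paper.
BETA_GENE = 0.5
BETA_PSEUDOGENE = 0.5
BETA_INTERACTION = -1.0


def risk_score(gene: float, pseudogene: float) -> float:
    """Linear predictor with a gene x pseudogene interaction term."""
    return (BETA_GENE * gene
            + BETA_PSEUDOGENE * pseudogene
            + BETA_INTERACTION * gene * pseudogene)


def gene_effect(pseudogene: float) -> float:
    """Change in risk score per unit increase in gene expression,
    holding pseudogene expression fixed."""
    return risk_score(1.0, pseudogene) - risk_score(0.0, pseudogene)


# With low pseudogene expression the gene raises the score;
# with high pseudogene expression the same gene lowers it.
effect_low = gene_effect(0.0)
effect_high = gene_effect(1.0)
print(effect_low, effect_high)
```

    Under these assumed coefficients the sign of the gene's effect flips between the two Pseudogene expression levels, which is what makes the interaction term informative even when neither main effect is.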

Richard Benton - One of the best experts on this subject based on the ideXlab platform.

  • Olfactory receptor pseudo-Pseudogenes
    Nature, 2016
    Co-Authors: Lucia L. Prieto-Godino, Raphael Rytz, Benoîte Bargeton, Liliane Abuin, Matteo Dal Peraro, J. R. Argüello, Richard Benton
    Abstract:

    Drosophila sechellia, a species closely related to the model species Drosophila melanogaster, bypasses a premature stop codon in neuronal cells to express a functional olfactory receptor protein from an assumed Pseudogene template. Pseudogenes (genes that have accumulated premature termination codons, PTCs) are considered 'junk' DNA. They may produce regulatory RNAs or small polypeptidic fragments but no functional protein, or so it was thought. Here Richard Benton and colleagues report that the Ir75a Pseudogene in Drosophila sechellia encodes a functional olfactory receptor as a consequence of efficient translational read-through of its PTC, exclusively in neurons. The authors go on to identify several other such 'pseudo-Pseudogenes' that act as functional genes despite PTCs, among different olfactory receptor families and various species, which suggests that genome annotation should be reconsidered, especially with respect to PTC-containing disease genes.

    Pseudogenes are generally considered to be non-functional DNA sequences that arise through nonsense or frame-shift mutations of protein-coding genes [1]. Although certain Pseudogene-derived RNAs have regulatory roles [2], and some Pseudogene fragments are translated [3], no clear functions for Pseudogene-derived proteins are known. Olfactory receptor families contain many Pseudogenes, which reflect low selection pressures on loci no longer relevant to the fitness of a species [4]. Here we report the characterization of a Pseudogene in the chemosensory variant ionotropic glutamate receptor repertoire [5,6] of Drosophila sechellia, an insect endemic to the Seychelles that feeds almost exclusively on the ripe fruit of Morinda citrifolia [7]. This locus, D. sechellia Ir75a, bears a premature termination codon (PTC) that appears to be fixed in the population. However, D. sechellia Ir75a encodes a functional receptor, owing to efficient translational read-through of the PTC. Read-through is detected only in neurons and is independent of the type of termination codon, but depends on the sequence downstream of the PTC. Furthermore, although the intact Drosophila melanogaster Ir75a orthologue detects acetic acid (a chemical cue important for locating fermenting food [8,9] that is found only at trace levels in Morinda fruit [10]), D. sechellia Ir75a has evolved distinct odour-tuning properties through amino-acid changes in its ligand-binding domain. We identify functional PTC-containing loci within different olfactory receptor repertoires and species, suggesting that such ‘pseudo-Pseudogenes’ could represent a widespread phenomenon.

Deyou Zheng - One of the best experts on this subject based on the ideXlab platform.

  • Characterization of human Pseudogene-derived non-coding RNAs for functional potential
    PLOS ONE, 2014
    Co-Authors: Shira Rockowitz, Herbert M Lachman, Deyou Zheng
    Abstract:

    Thousands of Pseudogenes exist in the human genome and many are transcribed, but their functional potential remains elusive and understudied. To explore these issues systematically, we first developed a computational pipeline to identify transcribed Pseudogenes from RNA-Seq data. Applying the pipeline to datasets from 16 distinct normal human tissues identified ∼3,000 Pseudogenes that could produce non-coding RNAs in a manner of low abundance but high tissue specificity under normal physiological conditions. Cross-tissue comparison revealed that the transcriptional profiles of Pseudogenes and their parent genes showed mostly positive correlations, suggesting that Pseudogene transcription could have a positive effect on the expression of their parent genes, perhaps by functioning as competing endogenous RNAs (ceRNAs), as previously suggested and demonstrated with the PTEN Pseudogene, PTENP1. Our analysis of the ENCODE project data also found many transcriptionally active Pseudogenes in the GM12878 and K562 cell lines; moreover, it showed that many human Pseudogenes produced small RNAs (sRNAs) and some Pseudogene-derived sRNAs, especially those from antisense strands, exhibited evidence of interfering with gene expression. Further integrated analysis of transcriptomics and epigenomics data, however, demonstrated that trimethylation of histone 3 at lysine 9 (H3K9me3), a posttranslational modification typically associated with gene repression and heterochromatin, was enriched at many transcribed Pseudogenes in a transcription-level dependent manner in the two cell lines. The H3K9me3 enrichment was more prominent in Pseudogenes that produced sRNAs at Pseudogene loci and their adjacent regions, an observation further supported by the co-enrichment of SETDB1 (a H3K9 methyltransferase), suggesting that Pseudogene sRNAs may have a role in regional chromatin repression. 
    Taken together, our comprehensive and systematic characterization of Pseudogene transcription uncovers a complex picture of how Pseudogene ncRNAs could influence gene and Pseudogene expression, at both epigenetic and post-transcriptional levels.

  • Pseudogene Evolution in the Human Genome
    eLS, 2014
    Co-Authors: Zhaolei Zhang, Deyou Zheng
    Abstract:

    Pseudogenes are those regions in a genome that have sequence similarity to functional genes but have decayed and have no obvious functions. It is estimated that the human genome contains more than 10 000 easily recognisable Pseudogenes and many more fragmented sequences, which arose mainly through one of the following three mechanisms: duplication, retrotransposition and spontaneous loss of function. The majority of the human retrotransposed (i.e. processed) Pseudogenes are primate specific, arising from a burst of retrotransposition activities approximately 45 Ma. Although most of the human Pseudogenes are most likely too degenerated to perform a biological function, ∼20% of them exhibit evidence of transcriptional activity based on data from multiple genomic studies. Furthermore, a handful of Pseudogene transcripts have been demonstrated experimentally to gain novel functions as noncoding ribonucleic acids (RNAs), indicating that Pseudogenes could be a reservoir for evolutionary innovation.

    Key Concepts: Pseudogenes are prevalent in the human genome and other mammalian genomes. Most human Pseudogenes are from past retrotranspositions occurring before the split of primate from other lineages. Pseudogenes are a good source of DNA sequences for studying genome evolution. Most human Pseudogenes are most likely ‘dead’ but many of them can be transcribed. Some human Pseudogenes have adopted functions as noncoding RNAs.

    Keywords: Pseudogene; human genome; retrotransposition; evolution; noncoding RNAs

  • Pseudogene.org: a comprehensive database and comparison platform for Pseudogene annotation
    Nucleic Acids Research, 2007
    Co-Authors: John E Karro, Deyou Zheng, Philip Cayting, Nicholas Carriero, Zhaolei Zhang, Paul Harrison, Mark Gerstein
    Abstract:

    The Pseudogene.org knowledgebase serves as a comprehensive repository for Pseudogene annotation. The definition of a Pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of Pseudogenes in detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of Pseudogenes that are of interest to the research community. Tools are provided for the comparison of sets and the creation of layered set unions, enabling researchers to derive a current ‘consensus’ set of Pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with a minimal knowledge of programming. At the present time, the database contains more than 100 000 Pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.

  • PseudoPipe: an automated Pseudogene identification pipeline
    Bioinformatics, 2006
    Co-Authors: Zhaolei Zhang, Deyou Zheng, Nicholas Carriero, Paul M Harrison, John E Karro, Mark Gerstein
    Abstract:

    Motivation: Mammalian genomes contain many 'genomic fossils', i.e. Pseudogenes. These are disabled copies of functional genes that have been retained in the genome by gene duplication or retrotransposition events. Pseudogenes are important resources in understanding the evolutionary history of genes and genomes.

    Results: We have developed a homology-based computational pipeline ('PseudoPipe') that can search a mammalian genome and identify Pseudogene sequences in a comprehensive and consistent manner. The key steps in the pipeline involve using BLAST to rapidly cross-reference potential "parent" proteins against the intergenic regions of the genome and then processing the resulting "raw hits", i.e. eliminating redundant ones, clustering together neighbors, and associating and aligning clusters with a unique parent. Finally, Pseudogenes are classified based on a combination of criteria including homology, intron-exon structure, and existence of stop codons and frameshifts.

    Availability: The PseudoPipe program is implemented in Python and can be downloaded at http://Pseudogene.org/

    Contact: Mark.Gerstein@yale.edu or zhaolei.zhang@utoronto.ca

  • Integrated Pseudogene annotation for human chromosome 22: evidence for transcription
    Journal of Molecular Biology, 2005
    Co-Authors: Deyou Zheng, Paul M Harrison, Zhaolei Zhang, John E Karro, Nick Carriero, Mark Gerstein
    Abstract:

    Pseudogenes are inheritable genetic elements formally defined by two properties: their similarity to functioning genes and their presumed lack of activity. However, their precise characterization, particularly with respect to the latter quality, has proven elusive. An opportunity to explore this issue arises from the recent emergence of tiling-microarray data showing that intergenic regions (containing Pseudogenes) are transcribed to a great degree. Here we focus on the transcriptional activity of Pseudogenes on human chromosome 22. First, we integrated several sets of annotation to define a unified list of 525 Pseudogenes on the chromosome. To characterize these further, we developed a comprehensive list of genomic features based on conservation in related organisms, expression evidence, and the presence of upstream regulatory sites. Of the 525 unified Pseudogenes we could confidently classify 154 as processed and 49 as duplicated. Using data from tiling microarrays, especially from recent high-resolution oligonucleotide arrays, we found some evidence that up to a fifth of the 525 Pseudogenes are potentially transcribed. Expressed sequence tag (EST) comparison further validated a number of these, and overall we found 17 Pseudogenes with strong support for transcription. In particular, one of the Pseudogenes with both EST and microarray evidence for transcription turned out to be a duplicated Pseudogene in the cat eye syndrome critical region. Although we could not identify a meaningful number of transcription factor-binding sites (based on chromatin immunoprecipitation-chip data) near Pseudogenes, we did find that ∼12% of the Pseudogenes had upstream CpG islands. Finally, analysis of corresponding syntenic regions in the mouse, rat and chimp genomes indicates, as previously suggested, that Pseudogenes are less conserved than genes, but more preserved than the intergenic background (all annotation is available from http://www.Pseudogene.org).
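
    The PseudoPipe abstract listed above describes post-processing of raw BLAST hits by eliminating redundant ones and clustering together neighbors. The following is a minimal Python sketch of that kind of interval merging. It is not PseudoPipe's actual code: the `Hit` record, the `cluster_hits` function, and the `max_gap` threshold of 60 bp are all hypothetical, and grouping hits by parent before merging is a simplification of the pipeline's parent-assignment step.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One raw alignment hit (hypothetical record layout)."""
    chrom: str
    start: int
    end: int
    parent: str  # ID of the query ("parent") protein


def cluster_hits(hits: List[Hit], max_gap: int = 60) -> List[Hit]:
    """Merge overlapping or neighbouring hits that share a chromosome
    and a parent protein. max_gap is an assumed threshold in bp."""
    clusters: List[Hit] = []
    for hit in sorted(hits, key=lambda h: (h.chrom, h.parent, h.start)):
        last = clusters[-1] if clusters else None
        if (last is not None
                and last.chrom == hit.chrom
                and last.parent == hit.parent
                and hit.start - last.end <= max_gap):
            # Redundant (overlapping) or neighbouring hit: extend the cluster.
            clusters[-1] = last._replace(end=max(last.end, hit.end))
        else:
            clusters.append(hit)
    return clusters


raw_hits = [
    Hit("chr22", 100, 200, "P1"),
    Hit("chr22", 150, 260, "P1"),    # overlaps the first hit
    Hit("chr22", 300, 400, "P1"),    # 40 bp downstream: a neighbour
    Hit("chr22", 5000, 5100, "P1"),  # far away: its own cluster
    Hit("chr22", 120, 180, "P2"),    # different parent protein
]
clusters = cluster_hits(raw_hits)
for cluster in clusters:
    print(cluster)
```

    Sorting first and then sweeping once keeps the merge linear after the sort. A fuller implementation would also need to choose a single parent when hits from several proteins overlap the same locus, which this sketch does not attempt.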

Travis S Johnson - One of the best experts on this subject based on the ideXlab platform.

  • Pseudogene-gene functional networks are prognostic of patient survival in breast cancer
    BMC Medical Genomics, 2020
    Co-Authors: Travis S Johnson, Kun Huang, Sasha Smerekanych, Yan Zhang

  • Network analysis of Pseudogene-gene relationships: from Pseudogene evolution to their functional potentials
    Pacific Symposium on Biocomputing, 2018
    Co-Authors: Travis S Johnson, Sihong Li, Kun Huang, Yan Zhang