Real Sequence

The Experts below are selected from a list of 154350 Experts worldwide ranked by ideXlab platform

John Blangero - One of the best experts on this subject based on the ideXlab platform.

  • Omics-squared: human genomic, transcriptomic and phenotypic data for Genetic Analysis Workshop 19
    BMC Proceedings, 2016
    Co-Authors: John Blangero, Tanya M. Teslovich, Marcio A. Almeida, Thomas D. Dyer, Matthew Johnson, Juan M. Peralta, Alisa Manning, Andrew R. Wood, Christian Fuchsberger, Jack W. Kent
    Abstract:

    Background: The Genetic Analysis Workshops (GAW) are a forum for development, testing, and comparison of statistical genetic methods and software. Each contribution to the workshop includes an application to a specified data set. Here we describe the data distributed for GAW19, which focused on analysis of human genomic and transcriptomic data. Methods: GAW19 data were donated by the T2D-GENES Consortium and the San Antonio Family Heart Study and included whole genome and exome sequences for odd-numbered autosomes, measures of gene expression, systolic and diastolic blood pressures, and related covariates in two Mexican American samples. These two samples were a collection of 20 large families with whole genome sequence and transcriptomic data and a set of 1943 unrelated individuals with exome sequence. For each sample, simulated phenotypes were constructed based on the real sequence data. ‘Functional’ genes and variants for the simulations were chosen based on observed correlations between gene expression and blood pressure. The simulations focused primarily on additive genetic models but also included a genotype-by-medication interaction. A total of 245 genes were designated as ‘functional’ in the simulations with a few genes of large effect and most genes explaining

  • Genetic Analysis Workshop 17 mini-exome simulation
    BMC Proceedings, 2011
    Co-Authors: Laura Almasy, Thomas D. Dyer, Juan M. Peralta, Jack W. Kent, Jac Charlesworth, Joanne E Curran, John Blangero
    Abstract:

    The data set simulated for Genetic Analysis Workshop 17 was designed to mimic a subset of data that might be produced in a full exome screen for a complex disorder and related risk factors in order to permit workshop participants to investigate issues of study design and statistical genetic analysis. Real sequence data from the 1000 Genomes Project formed the basis for simulating a common disease trait with a prevalence of 30% and three related quantitative risk factors in a sample of 697 unrelated individuals and a second sample of 697 individuals in large, extended pedigrees. Called genotypes for 24,487 autosomal markers assigned to 3,205 genes and simulated affection status, quantitative traits, age, sex, pedigree relationships, and cigarette smoking were provided to workshop participants. The simulating model included both common and rare variants with minor allele frequencies ranging from 0.07% to 25.8% and a wide range of effect sizes for these variants. Genotype-smoking interaction effects were included for variants in one gene. Functional variants were concentrated in genes selected from specific biological pathways and were selected on the basis of the predicted deleteriousness of the coding change. For each sample, unrelated individuals and family, 200 replicates of the phenotypes were simulated.

Jack W. Kent - One of the best experts on this subject based on the ideXlab platform.

  • Omics-squared: human genomic, transcriptomic and phenotypic data for Genetic Analysis Workshop 19
    BMC Proceedings, 2016
    Co-Authors: John Blangero, Tanya M. Teslovich, Marcio A. Almeida, Thomas D. Dyer, Matthew Johnson, Juan M. Peralta, Alisa Manning, Andrew R. Wood, Christian Fuchsberger, Jack W. Kent

  • Genetic Analysis Workshop 17 mini-exome simulation
    BMC Proceedings, 2011
    Co-Authors: Laura Almasy, Thomas D. Dyer, Juan M. Peralta, Jack W. Kent, Jac Charlesworth, Joanne E Curran, John Blangero

Sunder S Kidambi - One of the best experts on this subject based on the ideXlab platform.

Ziheng Yang - One of the best experts on this subject based on the ideXlab platform.

  • Comparison of models for nucleotide substitution used in maximum likelihood phylogenetic estimation
    Molecular Biology and Evolution, 1994
    Co-Authors: Ziheng Yang, Nick Goldman, A Friday
    Abstract:

    Using real sequence data, we evaluate the adequacy of assumptions made in evolutionary models of nucleotide substitution and the effects that these assumptions have on estimation of evolutionary trees. Two aspects of the assumptions are evaluated. The first concerns the pattern of nucleotide substitution, including equilibrium base frequencies and the transition/transversion-rate ratio. The second concerns the variation of substitution rates over sites. The maximum-likelihood estimate of tree topology appears quite robust to both these aspects of the assumptions of the models, but evaluation of the reliability of the estimated tree by using simpler, less realistic models can be misleading. Branch lengths are underestimated when simpler models of substitution are used, but the underestimation caused by ignoring rate variation over nucleotide sites is much more serious. The goodness of fit of a model is reduced by ignoring spatial rate variation, but unrealistic assumptions about the pattern of nucleotide substitution can lead to an extraordinary reduction in the likelihood. It seems that evolutionary biologists can obtain accurate estimates of certain evolutionary parameters even with an incorrect phylogeny, while systematists cannot get the right tree with confidence even when a realistic, and more complex, model of evolution is assumed.
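
The two substitution-pattern quantities named in this abstract, base frequencies and the transition/transversion split, can be illustrated with raw counts from a pairwise alignment. The following is only a minimal sketch: the function names are hypothetical, and observed counts are not the same thing as the model's rate-ratio parameter, which the paper estimates by maximum likelihood.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def base_frequencies(seq):
    """Empirical frequencies of A, C, G, T, ignoring any other symbols."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    return {b: counts[b] / total for b in "ACGT"}


def count_ts_tv(seq1, seq2):
    """Count observed transitions and transversions between aligned sequences.

    Columns with gaps or ambiguous symbols are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv
```

For example, `count_ts_tv("AACGT", "GATGA")` sees two transitions (A/G and C/T) and one transversion (T/A).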

  • Estimation of evolutionary distances between protein sequences
    Yi chuan xue bao = Acta genetica Sinica, 1994
    Co-Authors: Ziheng Yang
    Abstract:

    Several estimates of the evolutionary distance between two homologous protein sequences were deduced, taking into account the variation of replacement rate over amino acid sites. A maximum likelihood estimator was also presented with consideration of different probabilities of replacement among amino acids. Suggestions were made as to the application of these distance estimates to real sequence analysis.
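
To make the idea of a rate-variation-aware protein distance concrete, the sketch below uses the standard textbook Poisson correction and its gamma-rate counterpart, d = a[(1 - p)^(-1/a) - 1]. These are assumed here for illustration and are not necessarily the estimators derived in the paper; the function names and the shape parameter `alpha` are likewise hypothetical.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned protein sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(p):
    """Distance assuming the same replacement rate at every site."""
    return -math.log(1.0 - p)


def gamma_distance(p, alpha):
    """Distance assuming gamma-distributed rates over sites (shape alpha)."""
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)
```

With p = 0.25, the Poisson distance is about 0.288, while the gamma distance with `alpha = 1.0` is about 0.333: allowing rates to vary over sites implies more hidden replacements for the same observed difference.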

  • Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites
    Molecular Biology and Evolution, 1993
    Co-Authors: Ziheng Yang
    Abstract:

    Felsenstein's maximum-likelihood approach for inferring phylogeny from DNA sequences assumes that the rate of nucleotide substitution is constant over different nucleotide sites. This assumption is sometimes unrealistic, as has been revealed by analysis of real sequence data. In the present paper Felsenstein's method is extended to the case where substitution rates over sites are described by the gamma distribution. A numerical example is presented to show that the method fits the data better than do previous models.
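
The effect of gamma-distributed rates can be seen in the simplest possible setting: two sequences under the Jukes-Cantor (JC69) model. This is a reduced illustration chosen for this listing, not the tree-based method of the paper. The sketch assumes site rates drawn from a gamma distribution with shape `alpha` and mean 1, and compares the closed-form probability that a site differs with a Monte Carlo average over sampled rates. All function names are hypothetical.

```python
import math
import random


def jc69_p_diff(d, rate=1.0):
    """Probability that a site differs under JC69 at distance d and a given site rate."""
    return 0.75 * (1.0 - math.exp(-4.0 * d * rate / 3.0))


def jc69_gamma_p_diff(d, alpha):
    """Same probability averaged over gamma(shape=alpha, mean=1) site rates."""
    return 0.75 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))


def monte_carlo_p_diff(d, alpha, n=100_000, seed=0):
    """Check the closed form by sampling site rates and averaging."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale 1/alpha gives mean 1.
    total = sum(
        jc69_p_diff(d, rng.gammavariate(alpha, 1.0 / alpha)) for _ in range(n)
    )
    return total / n
```

At the same distance, the gamma-averaged probability is lower than the constant-rate one, so a constant-rate model applied to rate-heterogeneous data tends to underestimate distances and branch lengths.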

Ralf Metzler - One of the best experts on this subject based on the ideXlab platform.

  • Corrigendum: Real sequence effects on the search dynamics of transcription factors on DNA
    Scientific Reports, 2015
    Co-Authors: Maximilian Bauer, Emil S. Rasmussen, Michael A. Lomholt, Ralf Metzler
    Abstract:

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF-search times self-consistently depend heavily on whether or not an auxiliary operator (an accessible sequence similar to the main operator) is present in the genome section. Importantly, within our model the extent to which the interconversion rates between search and recognition states depend on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning several orders of magnitude. Auxiliary operators are shown to act as funnels facilitating target detection by TFs.

  • Real sequence effects on the search dynamics of transcription factors on DNA
    Scientific Reports, 2015
    Co-Authors: Maximilian Bauer, Emil S. Rasmussen, Michael A. Lomholt, Ralf Metzler
    Abstract:

    Recent experiments show that transcription factors (TFs) indeed use the facilitated diffusion mechanism to locate their target sequences on DNA in living bacteria cells: TFs alternate between sliding motion along DNA and relocation events through the cytoplasm. From simulations and theoretical analysis we study the TF-sliding motion for a large section of the DNA-sequence of a common E. coli strain, based on the two-state TF-model with a fast-sliding search state and a recognition state enabling target detection. For the probability to detect the target before dissociating from DNA the TF-search times self-consistently depend heavily on whether or not an auxiliary operator (an accessible sequence similar to the main operator) is present in the genome section. Importantly, within our model the extent to which the interconversion rates between search and recognition states depend on the underlying nucleotide sequence is varied. A moderate dependence maximises the capability to distinguish between the main operator and similar sequences. Moreover, these auxiliary operators serve as starting points for DNA looping with the main operator, yielding a spectrum of target detection times spanning several orders of magnitude. Auxiliary operators are shown to act as funnels facilitating target detection by TFs.